From your comment I was replying to: > "UTF-16" on Windows usually means UCS-2, ...

ChrisSD · on June 11, 2019

I'm sorry if I was unclear but my point was that when you receive a string from the Windows API you cannot make any assumptions about it being valid UTF-16. Therefore converting it to UTF-8 is potentially lossy. So if you then convert it back from UTF-8 to UTF-16 and feed it to the WinAPI you'll get unexpected results. Which is why I feel converting back and forth all the time is risky.

This is one reason why the WTF-8[0] encoding was created as a UTF-8 like encoding that supports invalid unicode.

[0] https://simonsapin.github.io/wtf-8/

ygra · on June 11, 2019

> I would be very surprised if I ask an OS kernel to create a file, and it silently changed the name doing some Unicode normalization.

Doesn't OS X do that? AFAIK files names are in NFD there.

loeg · on June 11, 2019

Yes, Mac normalizes and decomposes. It's weird.