Notepad's New UTF-8 Rules since May 2019
- Notepad in the old days was (and browsers then and now are) fine with the printable Microsoft ANSI codes 128-255 (code 127 is the DEL control character, not a printable one). If the text contained a Unicode character outside the ANSI set, Notepad used to insist that the file be saved as UTF-8 (or one of the other Unicode encodings) and placed an invisible BOM (Byte Order Mark, three bytes long in UTF-8) at the start of the file.
- Notepad today defaults to UTF-8. If you have Notepad open and ask it to "auto detect" a file that should be opened specifically as ANSI, the detection fails: the characters in the file in the range 128-255 are shown as meaningless � characters (the Unicode replacement character, U+FFFD) and stay that way when the file is next saved.
- On the other hand, if you click on a text file that has no BOM and contains old "single-byte" characters in the range 128-255, Notepad is launched and opens the file correctly as ANSI. The file can then be saved correctly as UTF-8, with the single-byte characters in the range 128-255 converted automatically to their multi-byte UTF-8 forms.
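
The byte-level effects described in the points above can be reproduced outside Notepad. The following is a minimal Python sketch, not a description of how Notepad detects encodings internally. It assumes that "ANSI" means the Windows-1252 code page (Python's "cp1252"), and the sample word and the helper has_utf8_bom are made up for illustration.

```python
import codecs

# The UTF-8 byte order mark is the three bytes EF BB BF.
UTF8_BOM = codecs.BOM_UTF8


def has_utf8_bom(data: bytes) -> bool:
    """Hypothetical helper: True if the data starts with the UTF-8 BOM."""
    return data.startswith(UTF8_BOM)


# Assumption: "ANSI" is Windows-1252, where e-acute is the single byte 0xE9.
ansi_bytes = "caf\u00e9".encode("cp1252")

# Reading ANSI bytes as if they were UTF-8: the lone 0xE9 is not a valid
# UTF-8 sequence, so it is shown as the replacement character U+FFFD.
misread = ansi_bytes.decode("utf-8", errors="replace")

# Saving the misread text makes the damage permanent: the file now holds
# the bytes for U+FFFD, and the original 0xE9 cannot be recovered.
saved_wrong = misread.encode("utf-8")

# Reading the same bytes as ANSI first, then saving as UTF-8, converts the
# single byte into the correct two-byte UTF-8 sequence.
saved_right = ansi_bytes.decode("cp1252").encode("utf-8")

# Python's "utf-8-sig" codec writes UTF-8 with the BOM at the start.
saved_with_bom = "caf\u00e9".encode("utf-8-sig")

print(ansi_bytes.hex(" "))      # bytes of the ANSI file
print(saved_wrong.hex(" "))     # bytes after the wrong open-and-save
print(saved_right.hex(" "))     # bytes after the correct conversion
print(has_utf8_bom(saved_with_bom), has_utf8_bom(saved_right))
```

The point of the sketch is that the conversion is only safe when the bytes are interpreted as ANSI before being re-saved. Once a byte in the range 128-255 has been replaced by U+FFFD and saved, nothing in the file records what it used to be.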