Sometimes an application (in our case the Nuxeo server) reads the content of a file and uses its octets for various jobs, most commonly to display information. Imagine, for example, an import in which the document title is set to the first x words.
The problem appears when the OS locale is set to something primitive, like "POSIX", whose character set is 7-bit ASCII. This character set does not allow Unicode characters; all octets are read as ASCII. As a result, characters beyond the ASCII range (French, Spanish, German characters, etc.) are replaced by undisplayable characters, and the display is flooded with weird characters.
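The effect can be reproduced in a few lines. The sketch below is Python, used only for illustration; it is not the code the server runs, and the sample title is made up. It assumes the file stores its text as UTF-8 and shows what happens when those octets are decoded as ASCII.

```python
# A title containing accented characters, stored in the file as UTF-8 octets.
# (The title itself is an invented example.)
title = "Résumé de l'été"
octets = title.encode("utf-8")

# Decoding as ASCII, which is what a POSIX locale implies: every octet
# above 0x7F is invalid and is replaced by U+FFFD, the replacement character.
as_ascii = octets.decode("ascii", errors="replace")

# Decoding as UTF-8 restores the original text.
as_utf8 = octets.decode("utf-8")

print(as_ascii)
print(as_utf8)
```

Each accented letter takes two octets in UTF-8, so it turns into two replacement characters in the ASCII reading, which is why the display looks flooded.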
In this case, the only solution is to set the right locale on the server (the machine running the application).
For Ubuntu / Debian users, a simple and fast way to do it is to:
- remove and reinstall the locales package
- install the locale charsets, most probably choosing en_US.UTF-8
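Once the locale has been changed, it can be worth checking what the server process will actually see. The following is a minimal sketch, again in Python for illustration only; the helper names `effective_ctype` and `uses_utf8` are hypothetical. It assumes the standard POSIX precedence of the environment variables (LC_ALL over LC_CTYPE over LANG) and falls back to "POSIX" when none is set.

```python
import os


def effective_ctype(env):
    """Return the locale name that governs character handling.

    Precedence: LC_ALL overrides LC_CTYPE, which overrides LANG.
    Unset or empty variables are skipped; with none set, this sketch
    assumes the "POSIX" default.
    """
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:
            return value
    return "POSIX"


def uses_utf8(locale_name):
    """Tell whether a locale name such as en_US.UTF-8 declares a UTF-8 codeset."""
    # Locale names follow language[_territory][.codeset][@modifier].
    codeset = locale_name.partition(".")[2].partition("@")[0]
    return codeset.replace("-", "").lower() == "utf8"


current = effective_ctype(os.environ)
print(current, uses_utf8(current))
```

The check only reflects the environment of the process that runs it, so it should be run as the same user and from the same startup context as the server.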