Non-latin1 characters in a MySQL version 4 database
If you used a MySQL version 4 database and entered non-latin1 characters into it, then you have a problem.
Description: Unless it is configured otherwise, MySQL version 4 does not use an explicitly specified character set and defaults to latin1.
However, scripts such as Cyrillic, Greek, Hebrew, Arabic, and those of almost all Asian languages cannot be represented in the latin1 charset.
When you entered such characters, perhaps through a WYSIWYG interface to a CMS, your browser was probably set to UTF-8 encoding. The WYSIWYG editor and the CMS accepted your non-Latin characters and stored them in the database as UTF-8 encoded data. However, your tables are probably declared with latin1 encoding and collation.
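To make the mismatch concrete, here is a minimal Python sketch of what happens to a Cyrillic word when its UTF-8 bytes are treated as latin1. The sample word and variable names are illustrative only, and Python's "latin-1" codec is used as a stand-in for MySQL's latin1, which is an assumption (MySQL's latin1 is closer to cp1252, so a few byte values behave differently).

```python
# Illustrative sample text; any non-latin1 string would do.
original = "Привет"

# What a browser set to UTF-8 sends to the CMS: two bytes per Cyrillic letter.
stored_bytes = original.encode("utf-8")

# What a table declared as latin1 believes it holds: one character per byte.
# Python's "latin-1" codec stands in for MySQL's latin1 here (assumption).
as_latin1 = stored_bytes.decode("latin-1")

print(len(original))      # number of letters typed
print(len(stored_bytes))  # number of bytes stored
print(len(as_latin1))     # number of "characters" a latin1 table sees

# Nothing is lost as long as the bytes are never converted:
round_trip = as_latin1.encode("latin-1").decode("utf-8")
print(round_trip == original)
```

As long as the bytes travel in and out unchanged, the text survives. The damage appears only when something converts the supposed latin1 text to another encoding.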
You do not see this as a problem right now, because the bytes are returned to your pages exactly as they were stored, so you see correct characters on your site. However, running the default mysqldump command, without specifying the --skip-set-charset and --default-character-set= options, will tell you whether your data is double-encoded (in other words, whether the character set declared for your tables fails to correspond to the actual data inside).
Open that dump file in an editor with a forced UTF-8 mode (I use OpenOffice or Microsoft Office, although the latter should only be used in an emergency). See if your non-Latin characters are readable.
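If you would rather check a suspicious string from the dump in code than by eye, the following Python sketch is one rough heuristic. The function name looks_double_encoded is hypothetical, and the use of Python's "latin-1" codec to approximate MySQL's latin1 is an assumption; a real dump may contain characters that this simple check does not handle.

```python
def looks_double_encoded(text):
    """Heuristic: does `text` look like UTF-8 bytes that were read as
    latin1 and then converted to UTF-8 a second time?

    Hypothetical helper for illustration; uses Python's "latin-1" codec
    as an approximation of MySQL's latin1 (assumption).
    """
    try:
        # Undo the supposed extra conversion: back to raw bytes, then
        # interpret those bytes as the UTF-8 they originally were.
        repaired = text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The text does not fit the double-encoding pattern.
        return False
    # Plain ASCII survives unchanged and is not evidence of a problem.
    return repaired != text


# Simulate a double-encoded value as it might appear in a dump.
original = "Привет"
garbled = original.encode("utf-8").decode("latin-1")

print(looks_double_encoded(garbled))   # garbled sample
print(looks_double_encoded(original))  # correctly encoded sample
print(looks_double_encoded("hello"))   # plain ASCII sample
```

The check reports a match only when reversing the conversion yields valid UTF-8 that differs from the input, so correctly encoded text and plain ASCII are left alone.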
Alternatively, view your database through phpMyAdmin with the browser set to UTF-8 encoding. If you see strange characters in your database, then your database contains UTF-8 encoded characters in tables with latin1 encoding and collation specified.
If your characters show up as strange characters, then you must follow the instructions in this article to fix the double encoding: Getting out of MySQL Character Set Hell
Page last modified 31-Dec-12 22:02:30 EST