How to convert the database encoding

There are ONLY TWO methods to convert the database charset encoding and collation:

1) EXPORT FOLLOWED BY IMPORT

2) In-place conversion

The first method is safer, and is the only one I will cover.

 

After you obtain your proper mysqldump file you must check that non-Latin characters were exported properly. If your database contained non-Latin characters (Cyrillic, etc), then search for a string of text in those characters in a UTF-8 capable editor set in UTF-8 mode. If those characters appear as gibberish, then you have not specified your mysql dump parameters properly. This also likely means that your database contains UTF-8 encoded non-Latin characters in a table which claims to be Latin1 collation.

Read this article for more information: Non-latin1 characters in a MySQL version 4 database.

You have two options of proceeding:

1) At this time, skip converting your database to UTF-8 encoding, and stick with Latin1. To do this, follow my instructions of creating a readable dump file, but do not search & replace latin1 with utf8. Choose this option if you are on a shared hosting environment, and running of mysql command

show variables like 'char%';

Results in "latin1" specification.

Shared hosting does not allow editing of my.cnf. You will have to wait until your hosting company changes this setting (likely for MySQL v5.5+). Or, talk to your hosting provider tech support.

2) Go thru the additional work of converting the double-encoding present in your tables. You must follow directions of someone else's article here: Getting out of MySQL Character Set Hell.

 

 

Then, examine the table creation statements in that file.

For a version 4 MySQL source, you might only see:

CREATE TABLE 'tablename' (
blahblahblah
) ENGINE=MyISAM;

because MySQL version 4 defaulted to Latin1 without explicitly saying so.

 

For a version 5 MySQL source, you might see:

CREATE TABLE 'tablename' (
blahblahblah
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_german2_ci;

You might now see the COLLATE statement specified, and it may also be other latin1 variants (swedish, etc).

 

Now, in a UTF-8 capable editor (NOT Notepad, Microsoft Word, or Wordpad!), Search & Replace so that the CREATE TABLE is specified thus:

CREATE TABLE 'tablename' (
blahblahblah
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci COLLATE=utf8_general_ci;

 

When you save this modified file, do not overwrite the original mysqldump; create another file.


 

For further reading:

http://alexking.org/blog/2008/03/06/mysql-latin1-utf8-conversion

http://en.gentoo-wiki.com/wiki/TIP_Convert_latin1_to_UTF-8_in_MySQL

http://docs.moodle.org/20/en/Converting_your_MySQL_database_to_UTF8

http://itworkarounds.blogspot.com/2011/07/mysql-database-migration-and-character.html

 

Procedures to follow if your database has a mix of encodings, or is double-encoded:

http://www.mindfiresolutions.com/convertmysqlcharsettoutf.html

http://www.bluebox.net/about/blog/2009/07/mysql_encoding/

http://codex.wordpress.org/Converting_Database_Character_Sets

 

 

I have prepared a Website Management Commands Master Reference which can aid you in this process.



˅˅˅ Additional valuable information is available at one of the links below: ˅˅˅

 

Did you like the article? Let Google Search know by clicking this button: . Please link to content that you find useful on this website on your own website, forum or blog! You can also comment on this page below, or contact me to ask a question or suggest a topic for me to research. There is a user-editable Wiki available on my website, as well as a Forum that you can contribute to. Site Map.

Page last modified 31-Dec-12 22:00:37 EST
Comments on this page:


Add a comment to this page
Comment Title:
Your Name:
Your Email Address:
Notify me of new comments to this item:
Additional Comments:
This is a captcha-picture. It is used to prevent mass-access by robots. (see: www.captcha.net)