How to convert the database encoding

There are ONLY TWO methods to convert the database charset encoding and collation:

1) EXPORT FOLLOWED BY IMPORT

2) In-place conversion

The first method is safer, and is the only one I will cover.

 

After you obtain your proper mysqldump file you must check that non-Latin characters were exported properly. If your database contained non-Latin characters (Cyrillic, etc), then search for a string of text in those characters in a UTF-8 capable editor set in UTF-8 mode. If those characters appear as gibberish, then you have not specified your mysql dump parameters properly. This also likely means that your database contains UTF-8 encoded non-Latin characters in a table which claims to be Latin1 collation.

Read this article for more information: Non-latin1 characters in a MySQL version 4 database.

You have two options of proceeding:

1) At this time, skip converting your database to UTF-8 encoding, and stick with Latin1. To do this, follow my instructions of creating a readable dump file, but do not search & replace latin1 with utf8. Choose this option if you are on a shared hosting environment, and running of mysql command

show variables like 'char%';

Results in "latin1" specification.

Shared hosting does not allow editing of my.cnf. You will have to wait until your hosting company changes this setting (likely for MySQL v5.5+). Or, talk to your hosting provider tech support.

2) Go thru the additional work of converting the double-encoding present in your tables. You must follow directions of someone else's article here: Getting out of MySQL Character Set Hell.

 

 

Then, examine the table creation statements in that file.

For a version 4 MySQL source, you might only see:

CREATE TABLE 'tablename' (
blahblahblah
) ENGINE=MyISAM;

because MySQL version 4 defaulted to Latin1 without explicitly saying so.

 

For a version 5 MySQL source, you might see:

CREATE TABLE 'tablename' (
blahblahblah
) ENGINE=MyISAM DEFAULT CHARSET=latin1 COLLATE=latin1_german2_ci;

You might now see the COLLATE statement specified, and it may also be other latin1 variants (swedish, etc).

 

Now, in a UTF-8 capable editor (NOT Notepad, Microsoft Word, or Wordpad!), Search & Replace so that the CREATE TABLE is specified thus:

CREATE TABLE 'tablename' (
blahblahblah
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci COLLATE=utf8_general_ci;

 

When you save this modified file, do not overwrite the original mysqldump; create another file.


 

For further reading:

http://alexking.org/blog/2008/03/06/mysql-latin1-utf8-conversion

http://en.gentoo-wiki.com/wiki/TIP_Convert_latin1_to_UTF-8_in_MySQL

http://docs.moodle.org/20/en/Converting_your_MySQL_database_to_UTF8

http://itworkarounds.blogspot.com/2011/07/mysql-database-migration-and-character.html

 

Procedures to follow if your database has a mix of encodings, or is double-encoded:

http://www.mindfiresolutions.com/convertmysqlcharsettoutf.html

http://www.bluebox.net/about/blog/2009/07/mysql_encoding/

http://codex.wordpress.org/Converting_Database_Character_Sets

 

 

I have prepared a Website Management Commands Master Reference which can aid you in this process.



˅˅˅ Additional valuable information is available at one of the links below: ˅˅˅

 

Did you like the article? Let Google Search know by clicking this button: . Please link to content that you find useful on this website on your own website, forum or blog! You can also comment on this page below, or to ask a question or suggest a topic for me to research. There is a user-editable Wiki available on my website, as well as a Forum that you can contribute to. Site Map.

Page last modified 31-Dec-12 22:00:37 EST
Comments on this page:


Add a comment to this page
Comment Title:
Your Name:
Your Email Address:
Notify me of new comments to this item:
Additional Comments:
This is a captcha-picture. It is used to prevent mass-access by robots. (see: www.captcha.net)