Unfortunately, it is very common to leave the default character set of a MySQL server unchanged and, since the default is latin1 (in MySQL versions before 8.0), problems arise as soon as someone needs to store Cyrillic or Chinese characters.
The first step is to fix the MySQL installation so that it can store internationalized data. Locate your my.cnf configuration file on Linux, or my.ini on Windows.
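If it is not obvious where the configuration file lives, the server can list the option files it reads. The following is a minimal sketch, assuming mysqld is on your PATH (on some Linux systems it lives in /usr/sbin); the function name list_mysql_option_files is only illustrative.

```shell
# Print the option files that mysqld reads, in the order it reads them.
# Assumption: mysqld is on PATH; adjust the path if it is not.
list_mysql_option_files() {
    # The help text contains a "Default options are read from..." line,
    # followed by one line with the candidate file paths.
    mysqld --verbose --help 2>/dev/null | grep -A 1 'Default options'
}
```

Calling list_mysql_option_files should print the header line and the list of candidate paths; the files that actually exist are the ones worth opening.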
In the configuration file, find the [mysqld] section, which holds the configuration of the MySQL server.
Insert the following lines there, removing any existing options with the same names.
[mysqld]
character-set-server=utf8
collation-server=utf8_unicode_ci
character-set-server=utf8 tells the server that, unless otherwise specified, the character set of newly created databases, tables, and columns will be utf8. utf8 columns are able to store Cyrillic or Simplified Chinese characters, to give just two examples.
The collation defines how alphabetical ordering works: in short, the order of the letters that we expect from ORDER BY columnName clauses.
The _ci suffix means that ordering and comparison are case insensitive, which is the common behavior in databases.
Be very careful here, because programming languages usually compare strings case sensitively (e.g. the .equals() method of the String class in Java), so mistakes caused by this mismatch are quite common.
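To see the effect of a _ci collation directly, the same comparison can be run under a case-insensitive and a binary collation. This is only a sketch: compare_collations is a made-up helper name, the account is passed as an argument, and it assumes a server where utf8, utf8_unicode_ci, and utf8_bin are available.

```shell
# Compare 'alice' with 'ALICE' under two collations.
# The _utf8 introducer fixes the character set of the literals, so the
# result does not depend on the connection settings.
compare_collations() {
    mysql -u "$1" -p -e "
        SELECT _utf8'alice' = _utf8'ALICE' COLLATE utf8_unicode_ci AS ci_equal,
               _utf8'alice' = _utf8'ALICE' COLLATE utf8_bin AS bin_equal;"
}
```

Running compare_collations username should report 1 for ci_equal and 0 for bin_equal. The second column is the behavior that a case-sensitive comparison in application code corresponds to.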
Then look for the
[client] section of your configuration file, and write this line below it.
This is very important because it defines the character set used by the MySQL command-line client, which is the tool that will be used to migrate the data.
Now that everything is set up, restart MySQL to make sure it is using the updated configuration, and shut down any application that uses the database to be migrated.
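After the restart it is worth checking that the server and the client really picked up the new settings. A small sketch, where show_charset_settings is an illustrative name and the account is whatever user you normally connect with:

```shell
# List the character set and collation variables seen by this connection.
show_charset_settings() {
    mysql -u "$1" -p -e "SHOW VARIABLES LIKE 'character_set_%'; SHOW VARIABLES LIKE 'collation_%';"
}
```

With the configuration described above, show_charset_settings username should list utf8 for character_set_server and utf8_unicode_ci for collation_server. If latin1 still shows up there, the server is probably reading a different configuration file.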
mysqldump will create a .sql file containing all the data:
mysqldump --skip-set-charset --no-create-db --no-create-info -h hostname --protocol=TCP -P 3306 -u username -p old_database > dump.sql
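--skip-set-charset keeps the SET NAMES statement, and with it any reference to the old (and wrong) character set, out of the dump file. The options --no-create-db and --no-create-info are used because the new database name will be defined later. Note that --no-create-info also omits the CREATE TABLE statements, so the tables must already exist in the new database when the data is imported.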
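Before importing, a quick scan of the dump can confirm that no character set declarations slipped through. This is a rough sketch: find_charset_refs is an illustrative name, and the pattern can also match ordinary row data that happens to contain these words, so treat hits as something to inspect rather than as proof of a problem.

```shell
# Print numbered lines of a dump file that mention a character set.
# Returns a non-zero status when nothing matches (grep's usual behavior).
find_charset_refs() {
    grep -n -i -E 'SET NAMES|CHARSET=|CHARACTER SET' "$1"
}
```

For example, find_charset_refs dump.sql || echo "no charset references found" prints either the suspicious lines or the confirmation message.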
Now create the new database. Start the client:
mysql -u username -p
and execute the following SQL at its prompt:
create schema new_database;
quit
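The schema can also be created without an interactive session, and with the character set spelled out so that it does not depend on the server default. A sketch under those assumptions; create_utf8_schema is a hypothetical helper, not part of MySQL:

```shell
# Create a schema with an explicit utf8 character set and collation.
# $1 is the MySQL account, $2 the schema name (quoted with backticks).
create_utf8_schema() {
    mysql -u "$1" -p -e "CREATE SCHEMA \`$2\` CHARACTER SET utf8 COLLATE utf8_unicode_ci;"
}
```

Usage would be create_utf8_schema username new_database. Stating the character set explicitly is a design choice: the schema keeps its utf8 default even if the server configuration is changed later.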
The last step is to populate the brand new database with the dumped data:
mysql -u username -p new_database < dump.sql
In this way, all the previous data from old_database is now stored in utf8 format in the new database.
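To double-check the result, the collation of every table in the new schema can be read from information_schema. A minimal sketch, with list_table_collations as an illustrative name:

```shell
# Show the collation of each table in a schema.
# $1 is the MySQL account, $2 the schema name.
list_table_collations() {
    mysql -u "$1" -p -e "SELECT table_name, table_collation FROM information_schema.tables WHERE table_schema = '$2';"
}
```

After list_table_collations username new_database, every row should show a utf8 collation; a table still reporting a latin1 collation was created with the old settings and needs a closer look.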
I hope this tutorial is useful. Please ask any questions or leave your feedback.