
This section describes how to configure the server to use different character sets. It also discusses how to set the server's time zone and enable per-connection time zone support.
latin1 (cp1252 West European) character set and the latin1_swedish_ci collation that sorts according to Swedish/Finnish rules. These defaults are suitable for the United States and most of Western Europe.
All MySQL binary distributions are compiled with --with-extra-charsets=complex. This adds code to all standard programs that enables them to handle latin1 and all multi-byte character sets within the binary. Other character sets are loaded from a character-set definition file when needed.
The character set determines what characters are allowed in identifiers. The collation determines how strings are sorted by the ORDER BY and GROUP BY clauses of the SELECT statement.
You can change the default server character set and collation with the --character-set-server and --collation-server options when you start the server. The collation must be a legal collation for the default character set. (Use the SHOW COLLATION statement to determine which collations are available for each character set.) See Section 5.2.2, “Command Options”.
The character sets available depend on the --with-charset= and charset_name--with-extra-charsets= options to configure, and the character set configuration files listed in list-of-charsets | complex | all | none. See Section 2.4.15.2, “Typical configure Options”. SHAREDIR/charsets/Index
If you change the character set when running MySQL, that may also change the sort order. Consequently, you must run myisamchk -r -q --set-collation=collation_name on all MyISAM tables, or your indexes may not be ordered correctly.
When a client connects to a MySQL server, the server indicates to the client what the server's default character set is. The client switches to this character set for this connection.
You should use mysql_real_escape_string() when escaping strings for an SQL query. mysql_real_escape_string() is identical to the old mysql_escape_string() function, except that it takes the MYSQL connection handle as the first parameter so that the appropriate character set can be taken into account when escaping characters.
If the client is compiled with paths that differ from where the server is installed and the user who configured MySQL didn't include all character sets in the MySQL binary, you must tell the client where it can find the additional character sets it needs if the server runs with a different character set from the client. You can do this by specifying a --character-sets-dir option to indicate the path to the directory in which the dynamic MySQL character sets are stored. For example, you can put the following in an option file:
[client] character-sets-dir=/usr/local/mysql/share/mysql/charsets
You can force the client to use specific character set as follows:
[client]
default-character-set=charset_name
This is normally unnecessary, however.
In MySQL 5.0, character set and collation are specified separately. This means that if you want German sort order, you should select the latin1 character set and either the latin1_german1_ci or latin1_german2_ci collation. For example, to start the server with the latin1_german1_ci collation, use the --character-set-server=latin1 and --collation-server=latin1_german1_ci options.
For information on the differences between these two collations, see Section 9.11.2, “West European Character Sets”.
By default, mysqld produces error messages in English, but they can also be displayed in any of these other languages: Czech, Danish, Dutch, Estonian, French, German, Greek, Hungarian, Italian, Japanese, Korean, Norwegian, Norwegian-ny, Polish, Portuguese, Romanian, Russian, Slovak, Spanish, or Swedish.
To start mysqld with a particular language for error messages, use the --language or -L option. The option value can be a language name or the full path to the error message file. For example:
shell> mysqld --language=swedish
Or:
shell> mysqld --language=/usr/local/share/swedish
The language name should be specified in lowercase.
By default, the language files are located in the share/ directory under the MySQL base directory. LANGUAGE
You can also change the content of the error messages produced by the server. Details can be found in the MySQL Internals manual, available at http://forge.mysql.com/wiki/MySQL_Internals_Error_Messages. If you upgrade to a newer version of MySQL after changing the error messages, remember to repeat your changes after the upgrade.
This section discusses the procedure for adding a new character set to MySQL. You must have a MySQL source distribution to use these instructions. To choose the proper procedure, determine whether the character set is simple or complex:
-
If the character set does not need to use special string collating routines for sorting and does not need multi-byte character support, it is simple.
-
If it needs either of those features, it is complex.
For example, latin1 and danish are simple character sets, whereas big5 and czech are complex character sets.
In the following instructions, the name of the character set is represented by MYSET.
For a simple character set, do the following:
-
Add
MYSETto the end of thesql/share/charsets/Indexfile. Assign a unique number to it. -
Create the file
sql/share/charsets/. (You can use a copy ofMYSET.confsql/share/charsets/latin1.confas the basis for this file.)The syntax for the file is very simple:
-
Comments start with a “
#” character and continue to the end of the line. -
Words are separated by arbitrary amounts of whitespace.
-
When defining the character set, every word must be a number in hexadecimal format.
-
The
ctypearray takes up the first 257 words. Theto_lower[],to_upper[]andsort_order[]arrays take up 256 words each after that.
-
-
Add the character set name to the
CHARSETS_AVAILABLEandCOMPILED_CHARSETSlists inconfigure.in. -
Reconfigure, recompile, and test.
For a complex character set, do the following:
-
Create the file
strings/ctype-in the MySQL source distribution.MYSET.c -
Add
MYSETto the end of thesql/share/charsets/Indexfile. Assign a unique number to it. -
Look at one of the existing
ctype-*.cfiles (such asstrings/ctype-big5.c) to see what needs to be defined. Note that the arrays in your file must have names likectype_,MYSETto_lower_, and so on. These correspond to the arrays for a simple character set. See Section 5.10.4, “The Character Definition Arrays”.MYSET -
Near the top of the file, place a special comment like this:
/* * This comment is parsed by configure to create ctype.c, * so don't change it unless you know what you are doing. * * .configure. number_
MYSET=MYNUMBER* .configure. strxfrm_multiply_MYSET=N* .configure. mbmaxlen_MYSET=N*/The configure program uses this comment to include the character set into the MySQL library automatically.
The
strxfrm_multiplyandmbmaxlenlines are explained in the following sections. You need include them only if you need the string collating functions or the multi-byte character set functions, respectively. -
You should then create some of the following functions:
-
my_strncoll_MYSET() -
my_strcoll_MYSET() -
my_strxfrm_MYSET() -
my_like_range_MYSET()
-
-
Add the character set name to the
CHARSETS_AVAILABLEandCOMPILED_CHARSETSlists inconfigure.in. -
Reconfigure, recompile, and test.
The sql/share/charsets/README file includes additional instructions.
If you want to have the character set included in the MySQL distribution, mail a patch to the MySQL internals mailing list. See Section 1.6.1, “MySQL Mailing Lists”.
to_lower[] and to_upper[] are simple arrays that hold the lowercase and uppercase characters corresponding to each member of the character set. For example:
to_lower['A'] should contain 'a' to_upper['a'] should contain 'A'
sort_order[] is a map indicating how characters should be ordered for comparison and sorting purposes. Quite often (but not for all character sets) this is the same as to_upper[], which means that sorting is case-insensitive. MySQL sorts characters based on the values of sort_order[] elements. For more complicated sorting rules, see the discussion of string collating in Section 5.10.5, “String Collating Support”.
ctype[] is an array of bit values, with one element for one character. (Note that to_lower[], to_upper[], and sort_order[] are indexed by character value, but ctype[] is indexed by character value + 1. This is an old legacy convention for handling EOF.)
You can find the following bitmask definitions in m_ctype.h:
#define _U 01 /* Uppercase */ #define _L 02 /* Lowercase */ #define _N 04 /* Numeral (digit) */ #define _S 010 /* Spacing character */ #define _P 020 /* Punctuation */ #define _C 040 /* Control character */ #define _B 0100 /* Blank */ #define _X 0200 /* heXadecimal digit */
The ctype[] entry for each character should be the union of the applicable bitmask values that describe the character. For example, 'A' is an uppercase character (_U) as well as a hexadecimal digit (_X), so ctype['A'+1] should contain the value:
_U + _X = 01 + 0200 = 0201
If the sorting rules for your language are too complex to be handled with the simple sort_order[] table, you need to use the string collating functions.
The best documentation for this is the existing character sets. Look at the big5, czech, gbk, sjis, and tis160 character sets for examples.
You must specify the strxfrm_multiply_ value in the special comment at the top of the file. MYSET=NN should be set to the maximum ratio the strings may grow during my_strxfrm_ (it must be a positive integer). MYSET
If you want to add support for a new character set that includes multi-byte characters, you need to use the multi-byte character functions.
The best documentation for this is the existing character sets. Look at the euc_kr, gb2312, gbk, sjis, and ujis character sets for examples. These are implemented in the ctype- files in the charset_name.cstrings directory.
You must specify the mbmaxlen_ value in the special comment at the top of the source file. MYSET=NN should be set to the size in bytes of the largest character in the set.
If you try to use a character set that is not compiled into your binary, you might run into the following problems:
-
Your program uses an incorrect path to determine where the character sets are stored. (Default
/usr/local/mysql/share/mysql/charsets). This can be fixed by using the--character-sets-diroption when you run the program in question. -
The character set is a multi-byte character set that cannot be loaded dynamically. In this case, you must recompile the program with support for the character set.
-
The character set is a dynamic character set, but you do not have a configure file for it. In this case, you should install the configure file for the character set from a new MySQL distribution.
-
If your
Indexfile does not contain the name for the character set, your program displays the following error message:ERROR 1105: File '/usr/local/share/mysql/charsets/?.conf' not found (Errcode: 2)
In this case, you should either get a new
Indexfile or manually add the name of any missing character sets to the current file.
For MyISAM tables, you can check the character set name and number for a table with myisamchk -dvv tbl_name.
The MySQL server maintains several time zone settings:
-
The system time zone. When the server starts, it attempts to determine the time zone of the host machine and uses it to set the
system_time_zonesystem variable. The value does not change thereafter.You can set the system time zone for MySQL Server at startup with the
--timezone=option to mysqld_safe. You can also set it by setting thetimezone_nameTZenvironment variable before you start mysqld. The allowable values for--timezoneorTZare system-dependent. Consult your operating system documentation to see what values are acceptable. -
The server's current time zone. The global
time_zonesystem variable indicates the time zone the server currently is operating in. The initial value fortime_zoneis'SYSTEM', which indicates that the server time zone is the same as the system time zone.The initial global server time zone value can be specified explicitly at startup with the
--default-time-zone=option on the command line, or you can use the following line in an option file:timezonedefault-time-zone='
timezone'If you have the
SUPERprivilege, you can set the global server time zone value at runtime with this statement:mysql>
SET GLOBAL time_zone =timezone; -
Per-connection time zones. Each client that connects has its own time zone setting, given by the session
time_zonevariable. Initially, the session variable takes its value from the globaltime_zonevariable, but the client can change its own time zone with this statement:mysql>
SET time_zone =timezone;
The current session time zone setting affects display and storage of time values that are zone-sensitive. This includes the values displayed by functions such as NOW() or CURTIME(), and values stored in and retrieved from TIMESTAMP columns. Values for TIMESTAMP columns are converted from the current time zone to UTC for storage, and from UTC to the current time zone for retrieval. The current time zone setting does not affect values displayed by functions such as UTC_TIMESTAMP() or values in DATE, TIME, or DATETIME columns.
The current values of the global and client-specific time zones can be retrieved like this:
mysql> SELECT @@global.time_zone, @@session.time_zone;
timezone values can be given in several formats, none of which are case sensitive:
-
The value
'SYSTEM'indicates that the time zone should be the same as the system time zone. -
The value can be given as a string indicating an offset from UTC, such as
'+10:00'or'-6:00'. -
The value can be given as a named time zone, such as
'Europe/Helsinki','US/Eastern', or'MET'. Named time zones can be used only if the time zone information tables in themysqldatabase have been created and populated.
The MySQL installation procedure creates the time zone tables in the mysql database, but does not load them. You must do so manually using the following instructions. (If you are upgrading to MySQL 4.1.3 or later from an earlier version, you can create the tables by upgrading your mysql database. Use the instructions in Section 5.5.8, “mysql_upgrade — Check Tables for MySQL Upgrade”. After creating the tables, you can load them.)
Note
Loading the time zone information is not necessarily a one-time operation because the information changes occasionally. For example, the rules for Daylight Saving Time in the United States, Mexico, and parts of Canada changed in 2007. When such changes occur, applications that use the old rules become out of date and you may find it necessary to reload the time zone tables to keep the information used by your MySQL server current. See the notes at the end of this section.
If your system has its own zoneinfo database (the set of files describing time zones), you should use the mysql_tzinfo_to_sql program for filling the time zone tables. Examples of such systems are Linux, FreeBSD, Sun Solaris, and Mac OS X. One likely location for these files is the /usr/share/zoneinfo directory. If your system does not have a zoneinfo database, you can use the downloadable package described later in this section.
The mysql_tzinfo_to_sql program is used to load the time zone tables. On the command line, pass the zoneinfo directory pathname to mysql_tzinfo_to_sql and send the output into the mysql program. For example:
shell> mysql_tzinfo_to_sql /usr/share/zoneinfo | mysql -u root mysql
mysql_tzinfo_to_sql reads your system's time zone files and generates SQL statements from them. mysql processes those statements to load the time zone tables.
mysql_tzinfo_to_sql also can be used to load a single time zone file or to generate leap second information:
-
To load a single time zone file
tz_filethat corresponds to a time zone nametz_name, invoke mysql_tzinfo_to_sql like this:shell>
mysql_tzinfo_to_sqltz_filetz_name| mysql -u root mysqlWith this approach, you must execute a separate command to load the time zone file for each named zone that the server needs to know about.
-
If your time zone needs to account for leap seconds, initialize the leap second information like this, where
tz_fileis the name of your time zone file:shell>
mysql_tzinfo_to_sql --leaptz_file| mysql -u root mysql
If your system is one that has no zoneinfo database (for example, Windows or HP-UX), you can use the package of pre-built time zone tables that is available for download at the MySQL Developer Zone:
http://dev.mysql.com/downloads/timezones.html
This time zone package contains .frm, .MYD, and .MYI files for the MyISAM time zone tables. These tables should be part of the mysql database, so you should place the files in the mysql subdirectory of your MySQL server's data directory. The server should be stopped while you do this and restarted afterward.
Warning
Do not use the downloadable package if your system has a zoneinfo database. Use the mysql_tzinfo_to_sql utility instead. Otherwise, you may cause a difference in datetime handling between MySQL and other applications on your system.
For information about time zone settings in replication setup, please see Section 15.3.1, “Replication Features and Issues”.
Staying Current with Time Zone Changes
As mentioned earlier, when the time zone rules change, applications that use the old rules become out of date. To stay current, it is necessary to make sure that your system uses current time zone information is used. For MySQL, there are two factors to consider in staying current:
-
The operating system time affects the value that the MySQL server uses for times if its time zone is set to
SYSTEM. Make sure that your operating system is using the latest time zone information. For most operating systems, the latest update or service pack prepares your system for the time changes. Check the Web site for your operating system vendor for an update that addresses the time changes. -
If you replace the system's
/etc/localtimetimezone file with a verion that uses rules differing from those in effect at mysqld startup, you should restart mysqld so that it uses the updated rules. Otherwise, mysqld might not notice when the system changes its time. -
If you use named time zones with MySQL, make sure that the time zone tables in the
mysqldatabase are up to date. If your system has its own zoneinfo database, you should reload the MySQL time zone tables whenever the zoneinfo database is updated, using the instructions given earlier in this section. For systems that do not have their own zoneinfo database, check the MySQL Developer Zone for updates. When a new update is available, download it and use it to replace your current time zone tables. mysqld caches time zone information that it looks up, so after replacing the time zone tables, you should restart mysqld to make sure that it does not continue to serve outdated time zone data.
If you are uncertain whether named time zones are available, for use either as the server's time zone setting or by clients that set their own time zone, check whether your time zone tables are empty. The following query determines whether the table that contains time zone names has any rows:
mysql> SELECT COUNT(*) FROM mysql.time_zone_name;
+----------+
| COUNT(*) |
+----------+
| 0 |
+----------+
A count of zero indicates that the table is empty. In this case, no one can be using named time zones, and you don't need to update the tables. A count greater than zero indicates that the table is not empty and that its contents are available to be used for named time zone support. In this case, you should be sure to reload your time zone tables so that anyone who uses named time zones will get correct query results.
To check whether your MySQL installation is updated properly for a change in Daylight Saving Time rules, use a test like the one following. The example uses values that are appropriate for the 2007 DST 1-hour change that occurs in the United States on March 11 at 2 a.m.
The test uses these two queries:
SELECT CONVERT_TZ('2007-03-11 2:00:00','US/Eastern','US/Central');
SELECT CONVERT_TZ('2007-03-11 3:00:00','US/Eastern','US/Central');
The two time values indicate the times at which the DST change occurs, and the use of named time zones requires that the time zone tables be used. The desired result is that both queries return the same result (the input time, converted to the equivalent value in the 'US/Central' time zone).
Before updating the time zone tables, you would see an incorrect result like this:
mysql>SELECT CONVERT_TZ('2007-03-11 2:00:00','US/Eastern','US/Central');+------------------------------------------------------------+ | CONVERT_TZ('2007-03-11 2:00:00','US/Eastern','US/Central') | +------------------------------------------------------------+ | 2007-03-11 01:00:00 | +------------------------------------------------------------+ mysql>SELECT CONVERT_TZ('2007-03-11 3:00:00','US/Eastern','US/Central');+------------------------------------------------------------+ | CONVERT_TZ('2007-03-11 3:00:00','US/Eastern','US/Central') | +------------------------------------------------------------+ | 2007-03-11 02:00:00 | +------------------------------------------------------------+
After updating the tables, you should see the correct result:
mysql>SELECT CONVERT_TZ('2007-03-11 2:00:00','US/Eastern','US/Central');+------------------------------------------------------------+ | CONVERT_TZ('2007-03-11 2:00:00','US/Eastern','US/Central') | +------------------------------------------------------------+ | 2007-03-11 01:00:00 | +------------------------------------------------------------+ mysql>SELECT CONVERT_TZ('2007-03-11 3:00:00','US/Eastern','US/Central');+------------------------------------------------------------+ | CONVERT_TZ('2007-03-11 3:00:00','US/Eastern','US/Central') | +------------------------------------------------------------+ | 2007-03-11 01:00:00 | +------------------------------------------------------------+
Beginning with MySQL 5.0.25, the locale indicated by the lc_time_names system variable controls the language used to display day and month names and abbreviations. This variable affects the output from the DATE_FORMAT(), DAYNAME() and MONTHNAME() functions.
Locale names are POSIX-style values such as 'ja_JP' or 'pt_BR'. The default value is 'en_US' regardless of your system's locale setting, but any client can examine or set its lc_time_names value as shown in the following example:
mysql>SET NAMES 'utf8';Query OK, 0 rows affected (0.09 sec) mysql>SELECT @@lc_time_names;+-----------------+ | @@lc_time_names | +-----------------+ | en_US | +-----------------+ 1 row in set (0.00 sec) mysql>SELECT DAYNAME('2010-01-01'), MONTHNAME('2010-01-01');+-----------------------+-------------------------+ | DAYNAME('2010-01-01') | MONTHNAME('2010-01-01') | +-----------------------+-------------------------+ | Friday | January | +-----------------------+-------------------------+ 1 row in set (0.00 sec) mysql>SELECT DATE_FORMAT('2010-01-01','%W %a %M %b');+-----------------------------------------+ | DATE_FORMAT('2010-01-01','%W %a %M %b') | +-----------------------------------------+ | Friday Fri January Jan | +-----------------------------------------+ 1 row in set (0.00 sec) mysql>SET lc_time_names = 'es_MX';Query OK, 0 rows affected (0.00 sec) mysql>SELECT @@lc_time_names;+-----------------+ | @@lc_time_names | +-----------------+ | es_MX | +-----------------+ 1 row in set (0.00 sec) mysql>SELECT DAYNAME('2010-01-01'), MONTHNAME('2010-01-01');+-----------------------+-------------------------+ | DAYNAME('2010-01-01') | MONTHNAME('2010-01-01') | +-----------------------+-------------------------+ | viernes | enero | +-----------------------+-------------------------+ 1 row in set (0.00 sec) mysql>SELECT DATE_FORMAT('2010-01-01','%W %a %M %b');+-----------------------------------------+ | DATE_FORMAT('2010-01-01','%W %a %M %b') | +-----------------------------------------+ | viernes vie enero ene | +-----------------------------------------+ 1 row in set (0.00 sec)
The day or month name for each of the affected functions is converted from utf8 to the character set indicated by the character_set_connection system variable.
lc_time_names may be set to any of the following locale values:
ar_AE: Arabic - United Arab Emirates |
ar_BH: Arabic - Bahrain |
ar_DZ: Arabic - Algeria |
ar_EG: Arabic - Egypt |
ar_IN: Arabic - Iran |
ar_IQ: Arabic - Iraq |
ar_JO: Arabic - Jordan |
ar_KW: Arabic - Kuwait |
ar_LB: Arabic - Lebanon |
ar_LY: Arabic - Libya |
ar_MA: Arabic - Morocco |
ar_OM: Arabic - Oman |
ar_QA: Arabic - Qatar |
ar_SA: Arabic - Saudi Arabia |
ar_SD: Arabic - Sudan |
ar_SY: Arabic - Syria |
ar_TN: Arabic - Tunisia |
ar_YE: Arabic - Yemen |
be_BY: Belarusian - Belarus |
bg_BG: Bulgarian - Bulgaria |
ca_ES: Catalan - Catalan |
cs_CZ: Czech - Czech Republic |
da_DK: Danish - Denmark |
de_AT: German - Austria |
de_BE: German - Belgium |
de_CH: German - Switzerland |
de_DE: German - Germany |
de_LU: German - Luxembourg |
EE: Estonian - Estonia |
en_AU: English - Australia |
en_CA: English - Canada |
en_GB: English - United Kingdom |
en_IN: English - India |
en_NZ: English - New Zealand |
en_PH: English - Philippines |
en_US: English - United States |
en_ZA: English - South Africa |
en_ZW: English - Zimbabwe |
es_AR: Spanish - Argentina |
es_BO: Spanish - Bolivia |
es_CL: Spanish - Chile |
es_CO: Spanish - Columbia |
es_CR: Spanish - Costa Rica |
es_DO: Spanish - Dominican Republic |
es_EC: Spanish - Ecuador |
es_ES: Spanish - Spain |
es_GT: Spanish - Guatemala |
es_HN: Spanish - Honduras |
es_MX: Spanish - Mexico |
es_NI: Spanish - Nicaragua |
es_PA: Spanish - Panama |
es_PE: Spanish - Peru |
es_PR: Spanish - Puerto Rico |
es_PY: Spanish - Paraguay |
es_SV: Spanish - El Salvador |
es_US: Spanish - United States |
es_UY: Spanish - Uruguay |
es_VE: Spanish - Venezuela |
eu_ES: Basque - Basque |
fi_FI: Finnish - Finland |
fo_FO: Faroese - Faroe Islands |
fr_BE: French - Belgium |
fr_CA: French - Canada |
fr_CH: French - Switzerland |
fr_FR: French - France |
fr_LU: French - Luxembourg |
gl_ES: Galician - Galician |
gu_IN: Gujarati - India |
he_IL: Hebrew - Israel |
hi_IN: Hindi - India |
hr_HR: Croatian - Croatia |
hu_HU: Hungarian - Hungary |
id_ID: Indonesian - Indonesia |
is_IS: Icelandic - Iceland |
it_CH: Italian - Switzerland |
it_IT: Italian - Italy |
ja_JP: Japanese - Japan |
ko_KR: Korean - Korea |
lt_LT: Lithuanian - Lithuania |
lv_LV: Latvian - Latvia |
mk_MK: Macedonian - FYROM |
mn_MN: Mongolia - Mongolian |
ms_MY: Malay - Malaysia |
nb_NO: Norwegian(Bokml) - Norway |
nl_BE: Dutch - Belgium |
nl_NL: Dutch - The Netherlands |
no_NO: Norwegian - Norway |
pl_PL: Polish - Poland |
pt_BR: Portugese - Brazil |
pt_PT: Portugese - Portugal |
ro_RO: Romanian - Romania |
ru_RU: Russian - Russia |
ru_UA: Russian - Ukraine |
sk_SK: Slovak - Slovakia |
sl_SI: Slovenian - Slovenia |
sq_AL: Albanian - Albania |
sr_YU: Serbian - Yugoslavia |
sv_FI: Swedish - Finland |
sv_SE: Swedish - Sweden |
ta_IN: Tamil - India |
te_IN: Telugu - India |
th_TH: Thai - Thailand |
tr_TR: Turkish - Turkey |
uk_UA: Ukrainian - Ukraine |
ur_PK: Urdu - Pakistan |
vi_VN: Vietnamese - Vietnam |
zh_CN: Chinese - Peoples Republic of China |
zh_HK: Chinese - Hong Kong SAR |
zh_TW: Chinese - Taiwan |
lc_time_names currently does not affect the STR_TO_DATE() or GET_FORMAT() function.
: 科技


