Data Types: Overview
In MySQL, the available data types can be grouped into four major categories:
- Numeric: Numeric values
- Binary: Binary data strings
- Character: Text strings
- Temporal: Time and date values
Within each category, there are numerous specific data types that use varying amounts of memory and disk space, and thus have varying effects on performance. Choosing the best data type for the column has a rather small effect on performance in an individual record, but as the database grows these small effects can lead to larger effects. Take this into account early in the design process, before the effects become performance issues.
MySQL also supports spatial extensions as a storage engine feature. These enable the generation, storage, and analysis of geographic features in character and binary formats.
ABCs of data types:
- Appropriate: The data needs to be represented in the type that best fits the item it represents.
- Brief: Choose the data type that uses the least amount of storage space. This saves resources and increases performance.
- Complete: Choose the data type that allocates enough room to store the largest possible value for the particular item.
Creating Tables with Data Types
When you create a table, you declare each of its columns, including the column name, a data type that indicates what kind of values the column can hold, and possibly some attributes (options) that more specifically define how MySQL should handle the column.
Example 1: Column declarations
CREATE TABLE people ( id INT, first_name CHAR(30), last_name CHAR(30) );
The first example above creates a table named people, which contains an integer-valued numeric column named id and two 30-character string columns named first_name and last_name. The column definitions in the CREATE TABLE statement contain only names and data types. To more specifically control how MySQL handles a column, add attributes to the column definition.
Example 2: Disallowing negative and unknown values
CREATE TABLE people ( id INT UNSIGNED NOT NULL, first_name CHAR(30), last_name CHAR(30) );
The second example shows how to disallow negative values in the id column by adding the UNSIGNED attribute.
Numeric Data Types
Range of values that the data type represents and amount of space that column values require are 2 important factors to consider with numeric data types. For storing numeric data, MySQL provides the following numeric data type classes:
- Integer:Integers do not have a fractional part. That is, a single integer value has no decimal places.
- Floating-Point:Represents approximate-value numbers that have an integer part, a fractional part, or both. This data type class represents values in the native binary floating-point format (IEEE 754) used by the server host’s CPU. This is a very efficient type for storage and computation, but values are subject to rounding error.
- Fixed-Point:Has an integer part, a fractional part, or both
- BIT:BIT column specifications take a width indicating the number of bits per value, from 1 to 64 bits.
Precision and scale are terms that apply to floating-point and fixed-point values, which can have both an integer part and a fractional part.
- Precision:The number of significant digits
- Scale:The number of digits to the right of the decimal point
Numeric Data Types
The integer types are listed in order of ascending precision. That is, each successive integer data type can hold a larger range of integer values than the previous one in the list.
The floating-point data types include FLOAT and DOUBLE. Each of these types can be used to represent approximate-value numbers that have an integer part, a fractional part, or both. FLOATand DOUBLE data types represent values in the native binary floating-point format (IEEE 754) used by the server host’s CPU. This is a very efficient type for storage and computation, but values are subject to rounding error. For both FLOAT and DOUBLE the default value is NULL if the column is nullable, and 0 (numerical zero) if the column is not nullable.
|Integer||TINYINT||Very small integer data type|
|Integer||SMALLINT||Small integer data type|
|Integer||MEDIUMINT||Medium-sized integer data type|
|Integer||INT||Normal- (average-) sized integer data type|
|Integer||BIGINT||Large integer data type|
|Floating-Point||FLOAT||Small, single-precision (four-byte) floating- point number|
|Floating-Point||DOUBLE||Normal, double-precision (eight-byte) floating- point number|
|Fixed-Point||DECIMAL||Exact-value numbers that have an integer part, a fractional part, or both|
DECIMAL uses a fixed-decimal storage format: All values in a DECIMAL column have the same number of decimal places and are stored exactly as given. DECIMAL values are not stored quite as efficiently as FLOAT or DOUBLE values, but DECIMAL values are not subject to rounding errors, so they are more precise. For example, the DECIMAL data type is often used to store currency values, where the precision of each value is more important than its stored size. BITcolumn specifications take a width indicating the number of bits per value, from 1 to 64 bits.
Character String Data Types
A character string data type represents a sequence of alphanumeric characters of a given character set. It stores text or binary data and is implemented in nearly every programming language. A character string supports character sets and collation. Character string data types differ in several ways, including:
- Whether data is stored in a fixed or variable length format
- What the maximum length is that can be stored
- Whether the type supports unstructured string values
Character string data belongs to one of the following storage classes:
Represents true character string data types. You can use this type for storing unstructured, freely formatted character strings (if their length fits the defined amount of space).
Represents structured string types. Use these types when the values to be stored in a column are chosen from a fixed set of values. You define columns for both types in terms of string values, but MySQL represents them internally as integers.
|Text||CHAR||Fixed-length character string, up to a maximum of 255 characters|
|Text||VARCHAR||Variable-length character string, up to a maximum of 65,535 characters|
|Text||TINYTEXT||Variable-length character string, up to a maximum of 255 characters|
|Text||TEXT||Variable-length character string, up to a maximum of 65,535 characters|
|Text||MEDIUMTEXT||Variable-length character string, up to a maximum of 16,777,215 characters|
|Text||LONGTEXT||Variable-length character string, up to a maximum of 4,294,967,295 characters|
|Integer||ENUM||Enumeration consisting of a fixed set of legal values|
|Integer||SET||Set consisting of a fixed set of legal values|
Character Set and Collation Support
A character set is a named set of symbols and encodings. A character string belongs to a specific character set. Character strings have the following characteristics:
- The sequence consists of characters that belong to a specific character set.
- Multibyte character sets may require a fixed or a variable number of bytes per character.
- Comparisons are based on the collation of the character set associated with the string.
- Multibyte character comparisons are performed in character units, not in byte units.
- Collation verifies whether uppercase and lowercase versions of a character are equivalent. It also defines the character sort order.
- Collation determines whether different accent marks for a character are equivalent. It also governs how individual characters and character strings can be compared to each other.
- Binary character strings are sequences of bytes and do not have a character set or collation. Comparisons are based on the numeric values of those bytes.
Column definitions for string data types can specify a character set or collation or both for each column. Attributes apply to CHAR, VARCHAR, TEXT, ENUM, and SET data types, as in the following example:
CREATE TABLE t ( c1 VARCHAR(20) CHARACTER SET utf8, c2 TEXT CHARACTER SET latin1 COLLATE latin1_general_cs );
MySQL has an extensive list of character sets and collations from which to choose. Proper choice can have a significant impact on performance. To view the available character sets, use the following statement:
SHOW CHARACTER SET;
Collation selection can enable you to select different sort orders for the same character set. For example, to show all the latin1 character sets, use the following statement:
SHOW COLLATION LIKE 'latin1%';
Binary String Data Types
Binary string data types are basically a sequence of bytes. For example, binary digits (bits) grouped in eights. These binary types are string types in the sense that they are also a sequence of smaller units. Unlike character strings, the bytes that makeup such a binary string value do not represent characters. Consequently, binary strings do not have attached character semantics, and they lack character set and collation information that is present for the character string classes.
- Binary: Stores binary byte strings of both fixed and variable length.
- BLOB: Stored as a single value in a database management system. BLOBs are typically images, audio, or other multimedia objects, although sometimes binary code is stored as a BLOB. In MySQL, BLOBs are quite similar to the TEXT types without attached character set and collation.
|Binary||BINARY||Similar to the CHAR (fixed-length) type, but stores binary byte strings instead of nonbinary character strings|
|Binary||VARBINARY||Similar to the VARCHAR (variable-length) type, but stores binary byte strings instead of nonbinary character strings|
|BLOB||TINYBLOB||BLOB column with a maximum length of 255 bytes|
|BLOB||BLOB||BLOB column with a maximum length of 65,535 bytes|
|BLOB||MEDIUMBLOB||BLOB column with a maximum length of 16,777,215 bytes|
|BLOB||LONGBLOB||BLOB column with a maximum length of 4,294,967,295 bytes|
Temporal Data Types
Temporal data types store date, time, and year values (or a combination of these values).
|DATETIME||YYYY-MM-DD hh:mm:ss[.uuuuuu]||2006-08-04 12:59:02.123|
|TIMESTAMP||YYYY-MM-DD hh:mm:ss[.uuuuuu]||2006-08-04 12:59:02.12|
YYYY, MM, DD, hh, mm, ss, and uuuuuu represent the values of year, month, day of month, hour, minute, second, and the optional fractional second, respectively. Declare optional fractional seconds by providing an argument to the type. For example, TIME(3)is a TIME type with a fractional second component of up to three digits.
– DATE values range from 1000-01-01 to 9999-12-31.
– DATETIME values range from 1000-01-01 00:00:00.000000to 9999-12-31 23:59:59.999999.
– TIMESTAMP values are stored in UTC (converted to and from the local time as required), and range from 1970-01-01 00:00:00.000000 to 2038-01-19 03:14:07.999999.
– DATETIME and TIMESTAMP can be defined to automatically record the current date and time whenever the row is INSERTed or UPDATEd.
– TIMESTAMP columns differ in a number of ways from DATETIME columns:
- TIMESTAMP columns have a smaller range than DATETIME, and require fewer bytes of storage per value.
- You can set a TIMESTAMP value that does not allow NULLs to the current date and time by assigning it a NULL value.
Spatial Data Types
MySQL supports spatial data type extensions to enable the generation, storage, and analysis of geographic features. A geographic (or geospatial) feature is anything in the world that has a location:
- Entity: A mountain, a pond, or a city
- Space: A town district or the tropics
- Definable location:A crossroad (a particular place where two streets intersect)
You can use spatial extensions with the InnoDB, MyISAM, NDB, and ARCHIVE storage engine tables. MyISAM supports both spatial and non-spatial indexes. Other storage engines support non-spatial indexes.
MySQL implements a subset of the “SQL with Geometry Types” environment proposed by Open Geospatial Consortium (OGC).
Use for single geometry values:
- GEOMETRY: Root class of the hierarchy, values of any type
- POINT: Single location in coordinate space
- CURVE: One-dimensional geometry, a sequence of points
- LINESTRING: Curve with linear interpolation between points
- SURFACE: Two-dimensional geometry
- POLYGON: Plane representing a multisided geometry
Use to hold collections of geometry values:
- MULTIPOINT: Point elements
- MULTICURVE: Curve elements
- MULTILINESTRING: LineString elements
- MULTISURFACE: Surface elements
- MULTIPOLYGON: Polygon elements
- GEOMETRYCOLLECTION: Geometries of any class
Setting Data Types to NULL
In SQL, expressions can evaluate to the null value. The null value is a special value that represents the fact that a value cannot be computed or is not known. It can allow missing values or an empty query result.
|Unknown||There is a value, but the precise value is unknown at this time.|
|Not applicable||If a value is specified, it would not be accurately representative.|
When to Use NULL
In the beginning stages of database design, when you are making a list of the data to be included, it becomes clear that some data may not be available for all columns. Examine these cases and determine whether null values should be allowed. Also, you can change this for an existing table if a problem is detected due to occurrences of null values in the column.
When Not to Use NULL
There are cases when you should not allow null values in a column. The most common case is when it is a primary key. Another example is any column that must have a value for the database design to make sense.
Creating Tables with Column Attributes
This determines how MySQL handles columns:
CREATE TABLE t( i INT UNSIGNED NOT NULL, c CHAR(10) CHARACTER SET utf8, d DATE DEFAULT '2013-01-01');
The table in the example shown above, contains an integer column that is UNSIGNED and cannot contain NULLs, a string column that has a character set of utf8, and a date column that has a default value of ‘2013-01-01’.
Categories of column attributes:
• Numeric: Apply to numeric data types (other than BIT)
• String: Apply to the nonbinary string data types
• General: Apply to any data type
|Numeric||UNSIGNED||Causes negative values to be not permitted|
|Integer- only||AUTO_INCREMENT||Generates sequences of successive unique integer values|
|String||CHARACTER SET||Specifies the character set to use|
|String||COLLATE||Specifies the character set collation|
|String||BINARY||Specifies the binary collation|
|All*||NULL or NOT NULL||Indicates whether a column can contain NULL or not|
|All||DEFAULT||Provides a default value for a new record when no value is specified|
* With the exception of TIMESTAMP.
The following attributes can be used with all data types:
- NULLor NOT NULL: If you do not specify either of these attributes, the default is to allow NULLin the column.
- DEFAULT: For example, default values are used when an INSERT statement is executed that does not provide values for all columns in the table.
Choosing Data Types
– Consider which data types and character sets minimize storage and disk I/O.
– Use fixed-length data types: – If all stored string values have the same length
– Use variable-length data types:
- If stored string values vary
- For multibyte character sets
– Use a multibyte character set that uses less space for frequently used characters.
– Use additional Unicode character sets outside the Basic Multilingual Plane (BMP).