Storage Engines and MySQL
When you create a table, MySQL uses the InnoDB storage engine to create the storage for that table on the hard disk. You can choose an alternative storage engine to use for each table. Typically, you make this choice according to which storage engine offers features that best fit the needs of your application. Each storage engine has a particular set of operational characteristics. These characteristics include the types of locks that are used to manage query contention, and whether the storage engine supports transactions. These engine properties have implications for query processing performance, concurrency, and deadlock prevention.
Although there are many other storage engines available, InnoDB is the best fit for most use cases.
Available Storage Engines
MySQL provides the following storage engines:
- InnoDB
- MyISAM
- MEMORY
- ARCHIVE
- FEDERATED
- EXAMPLE
- BLACKHOLE
- MERGE
- NDBCLUSTER
- CSV
Third-party storage engines are also available.
MySQL provides and maintains several storage engines, and the MySQL server is also compatible with many third-party storage engines. A MySQL storage engine is a low-level component inside the database server that handles storing and retrieving data; it is accessed through an internal MySQL API or, in some situations, directly by an application. Note that one application can have more than one storage engine in use at any given time. The list above shows the currently supported storage engines, and some of the most commonly used ones are described in this lesson. Note that InnoDB and NDBCLUSTER are the only two MySQL storage engines that are transactional.
Third-party engines come from different sources, offer different features, and are not supported by MySQL. For documentation, installation guides, bug reporting, or any other assistance with these engines, contact the engine's developer directly.
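To see which storage engines a particular server supports (including any third-party engines that have been installed), you can query the server directly. The following is an illustrative excerpt only; the exact list and Support values vary by MySQL version and build.

mysql> SHOW ENGINES\G
*************************** 1. row ***************************
      Engine: InnoDB
     Support: DEFAULT
     Comment: Supports transactions, row-level locking, and foreign keys
Transactions: YES
          XA: YES
  Savepoints: YES
*************************** 2. row ***************************
      Engine: MyISAM
     Support: YES
     Comment: MyISAM storage engine
Transactions: NO
          XA: NO
  Savepoints: NO
...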
InnoDB Storage Engine
InnoDB, the default storage engine for MySQL, provides high reliability and high performance, as well as the following primary advantages:
- Transaction-safe (ACID compliant)
- MVCC (multi-version concurrency control): InnoDB row-level locking and Oracle-style consistent nonlocking reads
- Table data arranged to optimize primary key based queries
- Support for foreign-key referential integrity constraints
- Maximum performance on large data volumes
- Support for queries that mix tables from different storage engines
- Fast auto-recovery after a crash
- Buffer pool for caching data and indexes in memory
It is difficult for any other disk-based relational database engine to rival the efficiency of InnoDB. Here are some additional advantages of using InnoDB (a short example follows this list):
- Transaction-safe: ACID compliance is achieved with transaction commit, rollback, and crash-recovery capabilities to protect user data.
- Foreign key support: Includes cascaded deletes and updates
- Recovery and backup: Supports consistent and online logical backup
- Mixing queries: Within the same statement, you can mix InnoDB tables with tables from other MySQL storage engines. For example, you can use a join operation to combine data from InnoDB and MEMORY tables in a single query.
- Full-text indexing: Enables efficient searching for words or phrases within text columns
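As a brief, hypothetical illustration of the transaction and foreign-key support listed above (the table names and columns below are invented for this example):

CREATE TABLE parent (
  id INT NOT NULL PRIMARY KEY
) ENGINE=InnoDB;

CREATE TABLE child (
  id INT NOT NULL PRIMARY KEY,
  parent_id INT NOT NULL,
  FOREIGN KEY (parent_id) REFERENCES parent (id) ON DELETE CASCADE
) ENGINE=InnoDB;

START TRANSACTION;
INSERT INTO parent VALUES (1);
INSERT INTO child VALUES (10, 1);
COMMIT;                           -- both inserts become permanent together

START TRANSACTION;
DELETE FROM parent WHERE id = 1;  -- the cascade also removes the child row
ROLLBACK;                         -- undoes the delete in both tables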
InnoDB as Default Storage Engine
The default storage engine for new tables is InnoDB. Choose an alternative storage engine when creating or altering tables by using a clause such as ENGINE=[Storage Engine]. InnoDB is built into the MySQL server and addresses many industry trends that demand an efficient storage engine:
- Rising hard drive and memory capacity
- Growing performance-to-price ratio requirement
- Increased performance requirements combined with the need for reliability and crash recovery
- Increasingly large, busy, robust, distributed, and important MySQL databases
Displaying the Storage Engine Setting
Confirm the setting by displaying the current value of the default_storage_engine system variable:
mysql> SELECT @@default_storage_engine;
+--------------------------+
| @@default_storage_engine |
+--------------------------+
| InnoDB                   |
+--------------------------+
Use SHOW to confirm the storage engine, per table:
SHOW CREATE TABLE City\G
SHOW TABLE STATUS LIKE 'CountryLanguage'\G
For example:
mysql> SHOW CREATE TABLE City\G
*************************** 1. row ***************************
       Table: City
Create Table: CREATE TABLE `City` (
  ...
) ENGINE=InnoDB AUTO_INCREMENT=4081 DEFAULT CHARSET=latin1

mysql> SHOW TABLE STATUS LIKE 'CountryLanguage'\G
*************************** 1. row ***************************
    Name: CountryLanguage
  Engine: InnoDB
Use INFORMATION_SCHEMA to confirm the storage engine, per table:
SELECT TABLE_NAME, ENGINE
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_NAME = 'City'
  AND TABLE_SCHEMA = 'world_innodb'\G
Setting the Storage Engine
If a table is created without an ENGINE option to specify a storage engine explicitly, the MySQL server creates the table using the default engine, which is given by the value of the default_storage_engine system variable.
Set the server storage engine as part of the startup config file:
[mysqld]
default-storage-engine=[Storage Engine]
You can override the server default at the session level by using the SET statement. For example:
SET @@default_storage_engine=[Storage Engine];
Specify using the CREATE TABLE statement:
CREATE TABLE t (i INT) ENGINE = [Storage Engine];
Converting Existing Tables to InnoDB
Tables that currently use other storage engines can be converted to InnoDB by using ALTER TABLE. For example:
ALTER TABLE t ENGINE = InnoDB;
Using ALTER TABLE to change the engine is an expensive operation because it copies all the data internally from one engine to another. Do not convert MySQL system tables in the mysql database (such as user or host) to InnoDB. This is an unsupported operation. The system tables use the MyISAM engine.
When you make an InnoDB table that is a clone of a table that uses another storage engine, you can create the indexes either before or after inserting the data. Follow these steps to clone a table from another storage engine:
1. Create an empty InnoDB table with identical definitions.
2. Create the appropriate indexes.
3. Insert the rows:
INSERT INTO [innodb_table] SELECT * FROM [other_table];
4. Insert large tables in smaller pieces for greater control.
To get better control over the insertion process, you might insert big tables in pieces:
INSERT INTO newtable
SELECT * FROM oldtable
WHERE yourkey > something AND yourkey <= somethingelse;
During the conversion of big tables, increase the size of the InnoDB buffer pool to reduce disk I/O. You can also increase the sizes of the InnoDB log files. When you first start the server with InnoDB enabled, it is best to run the MySQL server mysqld directly from the command prompt (not from mysqld_safe or as a Windows service) so that you can see any diagnostic output directly.
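For example, you might temporarily raise settings such as the following in the option file for the duration of a large conversion; the values shown are illustrative only and should be sized to your hardware.

[mysqld]
# a larger buffer pool reduces disk I/O during the data copy
innodb_buffer_pool_size=2G
# larger redo log files reduce checkpoint activity; changing this
# value requires a clean server shutdown and restart
innodb_log_file_size=512M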
InnoDB System Tablespace
InnoDB operates using two primary disk-based resources:
- Tablespace: Stores table contents (data rows) and indexes in a single logical storage area
- Log files: Record transaction activity for rollback and recovery
InnoDB stores data, indexes, metadata, logs, and buffers in tablespaces. By default, data and indexes are stored in per-table tablespaces. InnoDB uses a shared tablespace to contain metadata, the undo log, the change buffer, and the doublewrite buffer.
The shared tablespace can occupy multiple files. You can configure the final file in the shared tablespace to be auto-extending, in which case InnoDB expands it automatically if the tablespace fills up.
The shared tablespace also contains a rollback segment by default. As transactions modify rows, undo log information is stored in the rollback segment; this information is used to roll back failed transactions. You can move the undo logs out of the shared tablespace file by setting the innodb_undo_tablespaces option to a nonzero value and, optionally, configuring innodb_undo_logs and innodb_undo_directory.
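A minimal option-file sketch, assuming a MySQL version that supports separate undo tablespaces (these must be configured when the data directory is initialized); the values and the directory path are examples only.

[mysqld]
# number of separate undo tablespaces (0 keeps undo logs in the shared tablespace)
innodb_undo_tablespaces=2
# number of rollback segments
innodb_undo_logs=128
# example location for the undo tablespace files
innodb_undo_directory=/var/lib/mysql-undo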
Data Tablespaces
Each new table that InnoDB creates sets up an .ibd file in the database directory to accompany the table's .frm file. The .ibd file acts as the table's own tablespace file, and InnoDB stores the table's data and indexes in it. You still need the shared tablespace because it contains the InnoDB data dictionary and the rollback segment. You can control this behavior with the innodb_file_per_table option. The default setting (file-per-table tablespaces) is required for some other features, such as table compression and fast truncation.
Rather than storing data in per-table tablespaces, you can store data in the shared tablespace by using the skip_innodb_file_per_table option or by setting the innodb_file_per_table option to OFF. Disabling the option does not affect the accessibility of any InnoDB tables that have already been created; those tables remain accessible. You can mix tablespace types among tables in the same database. Changing the setting merely changes the default for newly created tables, or for tables altered to set the engine to InnoDB (even tables that already use InnoDB).
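For example, on server versions where innodb_file_per_table is dynamic, you can check the current value and change the default for subsequently created tables; existing tables keep their current tablespace type until they are rebuilt.

SHOW VARIABLES LIKE 'innodb_file_per_table';
SET GLOBAL innodb_file_per_table=OFF;  -- new tables are created in the shared tablespace
SET GLOBAL innodb_file_per_table=ON;   -- new tables get their own .ibd file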
Tablespace Directory Structure
In addition to its tablespace files, the InnoDB storage engine manages a set of InnoDB-specific log files that contain information about ongoing transactions. As a client performs a transaction, the changes that it makes are held in the InnoDB log (ib_logfile). The more recent log contents are cached in memory. Normally, the cached log information is written and flushed to log files on disk at transaction commit time, although that may also occur earlier.
Shared Tablespace Configuration
You can increase the tablespace size by adding data files. Use the innodb_data_file_path option in the my.cnf file; the value must be a list of data file specifications:
[mysqld]
innodb_data_file_path=datafile_spec1[;datafile_spec2]...
If more than one data file is named, separate them by semicolon (;) characters, as shown in the example below.
Configuration example: Create a tablespace with a 50 MB (fixed) data file named ibdata1 and an auto-extended 50 MB data file named ibdata2:
[mysqld]
innodb_data_file_path=ibdata1:50M;ibdata2:50M:autoextend
Files are placed in the data directory by default. Explicitly specify the file location, if desired.
An easier way to increase the size of the InnoDB system tablespace is to configure it from the beginning to be auto-extending. Specify the autoextend attribute for the last data file in the tablespace definition. Then InnoDB increases the size of that file automatically in 64 MB increments when it runs out of space. You can change the increment size by setting the value of the innodb_autoextend_increment system variable, which is measured in MB. An added benefit of this approach is that you do not need to restart the server.
If your last data file was defined with the keyword autoextend, the procedure for reconfiguring the tablespace must take into account the size to which the last data file has grown. Obtain the size of the data file, round it down to the closest multiple of 1024 × 1024 bytes (= 1 MB), and specify the rounded size explicitly in innodb_data_file_path. Then you can add another data file (using a file name that does not yet exist). Remember that only the last data file in the innodb_data_file_path can be specified as auto-extending.
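For example, suppose the tablespace was originally configured as ibdata1:10M:autoextend and ibdata1 has since grown to roughly 988 MB. The figures here are purely illustrative; substitute the actual rounded size of your own data file and a file name that does not yet exist.

[mysqld]
innodb_data_file_path=ibdata1:988M;ibdata2:50M:autoextend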
Log Files and Buffers
As a client performs a transaction, the changes that it makes are held in the InnoDB log. The more recent log contents are cached in memory (log buffer). The cached log information is written and flushed to log files on disk at transaction commit time, although that may also occur earlier. If a crash occurs while the tables are being modified, the log files are used for auto-recovery. When the MySQL server restarts, it reapplies the changes recorded in the logs to ensure that the tables reflect all committed transactions.
InnoDB maintains its own buffer pool for caching frequently used data and indexes in main memory. InnoDB stores its tables and indexes in a tablespace. InnoDB tables can be very large, even on operating systems where file size is limited to 2 GB. You can enable multiple buffer pools to minimize contention. This cache applies to so many types of information, and speeds up processing so much, that you should assign up to 80% of your database servers’ physical memory to the InnoDB buffer pool.
If InnoDB configuration options are not specified, MySQL creates an auto-extending 10 MB data file named ibdata1 and two 5 MB log files named ib_logfile0 and ib_logfile1 in the MySQL data directory. To get good performance, you should specify explicit InnoDB parameters. You can choose a different number of InnoDB log files by using the innodb_log_files_in_group option. The default (and recommended) value is 2, resulting in the two files ib_logfile0 and ib_logfile1.
To change the location of the ib_logfile* files, specify a directory path with the innodb_log_group_home_dir option. This is useful if you want to limit the impact of a hardware failure by storing the log files on a different physical device from the one that contains the data files.
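A sketch of the relevant option-file entries, with illustrative values and an example path:

[mysqld]
# store the redo log files on a separate device from the data files
innodb_log_group_home_dir=/mnt/logdisk/innodb
innodb_log_files_in_group=2
innodb_log_file_size=256M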
Configuring Buffer Pools for Performance
When the InnoDB buffer pool is large, many data requests can be quickly retrieved from memory rather than from the hard disk. MySQL uses multiple buffer pools as a performance enhancement for large buffer pools, typically in the multi-gigabyte range. Each page in the buffer pool is assigned to one of the buffer pools randomly by using a hashing function. Each buffer pool manages its own free lists, flush lists, LRUs, and all other data structures connected to a buffer pool, and is protected by its own buffer pool mutex.
By default, MySQL configures eight buffer pool instances (except on 32-bit Windows systems). To change this, set the innodb_buffer_pool_instances configuration option to a value from 1 (the minimum) to 64 (the maximum). This option takes effect only when you set the innodb_buffer_pool_size to a size of 1 GB or more. The total size that you specify is divided among all the buffer pools. You should specify a combination of innodb_buffer_pool_instances and innodb_buffer_pool_size so that each buffer pool instance is at least 1 GB.
To preload the buffer pool at server restart, enable both of the following options (see the sample configuration after the list):
- innodb_buffer_pool_dump_at_shutdown
- innodb_buffer_pool_load_at_startup
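A sample configuration combining these settings; the sizes are illustrative and should be adjusted so that each buffer pool instance is at least 1 GB.

[mysqld]
innodb_buffer_pool_size=8G
innodb_buffer_pool_instances=8
innodb_buffer_pool_dump_at_shutdown=ON
innodb_buffer_pool_load_at_startup=ON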
NoSQL and the memcached API
InnoDB supports a plugin with an integrated memcached daemon, which operates in the same process space for efficient data communication with low overhead. By using the simpler memcached API (both text-based and binary protocols) instead of using the SQL layer for some operations, your applications can bypass the parsing and optimizing stages required when submitting SQL statements, and avoid the overhead of checking strongly-typed data. This type of application is called NoSQL (Not Only SQL).
In addition to the classic memcached environment, which loses the key/value store when memcached is restarted, the InnoDB integration means you can configure the memcached store to persist to InnoDB. Because the store is backed by InnoDB, you benefit from other features of InnoDB such as buffered data access and crash recovery. This means you get the speed and simplicity of using a memory-stored memcached store, backed by a robust and persistent InnoDB table.
You can still access the underlying table through SQL for reporting, analysis, ad hoc queries, bulk loading, set operations (such as union and intersection), and other operations (such as summarizing and aggregation) that are well suited to the expressiveness and flexibility of SQL. Because memcached consumes relatively little CPU, and its memory footprint is easy to control, it can run comfortably alongside a MySQL instance on the same system.
Configuring memcached
The innodb_memcached_config.sql script sets up the memcached configuration database and its tables. The default InnoDB backing store is the demo_test table in the test database. You can change this by modifying the containers table in the innodb_memcache database.
Configure memcached tables:
SOURCE [MYSQL_HOME]/scripts/innodb_memcached_config.sql
The above command creates the innodb_memcache database, which contains the following tables:
- cache_policies: Holds the backing-store policy for each memcached operation: local RAM cache, InnoDB, or both.
- containers: Holds the configuration of each InnoDB table used as a memcached backing store.
- config_options: Holds values for the separator and table_map_delimiter configuration options.
Install the memcached plugin:
INSTALL PLUGIN daemon_memcached SONAME "libmemcached.so";
The plugin enables a number of configurable variables. To view the memcached configuration variables, issue the following command:
SHOW VARIABLES LIKE 'daemon_memcached%';
For example, the following options configure the behavior of the memcached plugin (a sample configuration follows the list):
- daemon_memcached_enable_binlog: Enables binary logging of memcached operations, so that they can be replicated
- daemon_memcached_r_batch_size: Controls the read batch size, for performance tuning
- daemon_memcached_w_batch_size: Controls the write batch size. The default value is 1, so each memcached write operation is committed immediately.
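An illustrative option-file sketch; the values shown are just examples, and daemon_memcached_enable_binlog also requires the binary log (log-bin) to be enabled.

[mysqld]
daemon_memcached_enable_binlog=ON
daemon_memcached_r_batch_size=1
daemon_memcached_w_batch_size=1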
Key/Value Data
The memcached protocol supports operations such as add, delete, set, get, incr, and decr (an example follows the list below).
Each operation works with key/value pairs in the cache.
- Keys and values are passed as strings.
- Keys cannot contain whitespace characters such as spaces, carriage returns, newlines, tabs, and so on.
- A key is similar to a primary key in a table, and a value is similar to a second column in the same table.
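For example, a client connected to the memcached port (11211 by default) might issue the following text-protocol commands against the default container; the key name and value are arbitrary.

set population:Tokyo 0 0 7
9876543
STORED
get population:Tokyo
VALUE population:Tokyo 0 7
9876543
END
incr population:Tokyo 1
9876544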
The cache is a flat namespace in which each key must be unique. For example, if you are representing key/value pairs from different tables, prepend the table name to the key. A value can be a simple string (or a number stored as a string) or serialized complex data. The main benefit of using memcached is its efficiency when reading or writing key/value data. Conceptually, this is similar to a two-column table where the key column is a primary key (and therefore unique). Both key and value are stored as strings, and because of the nature of the memcached protocol, keys cannot contain spaces or newlines.
The direct, low-level database access path used by the memcached plugin is much more efficient for key/value lookups than equivalent SQL queries. The value is stored as a string, and can represent a simple value such as a number or a string, or a complex value that has been serialized (that is, put in a textual form that can reconstruct the original pointers and nesting). You must perform this serialization outside of memcached at the API interface (for example, by implementing the Java Serializable interface in your code) or by formatting the complex data as JSON or XML.
Similarly, simple numeric values must be converted from string to number in SQL statements that access the memcached table (for example, by using the CAST() function).
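For example, assuming the default demo_test backing table created by the configuration script (where column c1 holds the key and c2 holds the value as a string), a query might convert the stored value back to a number:

SELECT c1 AS cache_key,
       CAST(c2 AS UNSIGNED) AS population
FROM test.demo_test
WHERE c1 = 'population:Tokyo';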
Using NoSQL with MySQL Databases
To get the maximum benefit from memcached, use it as a fast key/value access method, with an InnoDB backing store. The memcached protocol does not support easy aggregation, joining data, or other features that the SQL protocol supports. Using InnoDB tables as containers for memcached key/value pairs allows you to execute powerful SQL statements performing grouping, aggregation, and joining of data created and modified with the memcached protocol.
The innodb_memcache.containers table allows you to configure one or more InnoDB tables to use as memcached backing stores. Specify the container table's name, schema, and the columns to use for the key, value, flags, and other settings. Each container has a memcached name, which you can use to identify the container for memcached operations. Operations target the container named default until you request a different container.
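A hypothetical example of registering an additional container; the backing table, schema, and column names are invented for illustration, and the column list follows the containers table definition created by the configuration script. Note that the memcached plugin typically picks up changes to this table only after it is restarted.

INSERT INTO innodb_memcache.containers
  (name, db_schema, db_table, key_columns, value_columns,
   flags, cas_column, expire_time_column, unique_idx_name_on_key)
VALUES
  ('city_cache', 'world_innodb', 'CityKV', 'k', 'v',
   'flags', 'cas_value', 'expiry', 'PRIMARY');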
The /usr/share/mysql/innodb_memcached_config.sql file creates and populates the innodb_memcache.containers table, and is a well-commented example of configuring a container.