In-Memory Column Store
In-Memory (IM) column store is a new pool in SGA introduced in Oracle Database Release 12.1, Patchset 1. Segments populated into the IM column store are converted into a columnar format. In-Memory segments are transactionally consistent with the buffer cache.
The In-Memory Column Store enables objects (tables, partitions, and other types) to be stored in memory in a new format known as the columnar format. This format enables scans, joins, and aggregates to perform much faster than the traditional on-disk format, providing fast reporting and DML performance for both OLTP and DW environments.
The in-memory columnar format does not replace the on-disk or buffer cache format. This means that when a segment such as a table or a partition is populated into the IM column store, the on-disk format segment is automatically converted into a columnar format and optionally compressed. The columnar format is a pure in-memory format. There is no columnar format storage on disk. It never causes additional writes to disk and therefore does not require any logging or undo space.
All data is stored on disk in the traditional row format. Moreover, the columnar format of a segment is a transaction-consistent copy of the segment either on disk or in the buffer cache. Transaction consistency between the two pools is maintained.
If sufficient space is allocated to the IM column store in SGA, a query accessing objects that are populated into the IM column store performs much faster. The improved performance allows more ad-hoc analytic queries to be executed directly on the real-time transaction data without impacting the existing workload. A lack of IM column store space does not prevent statements to execute against tables that could have been populated into IM column store.
The DBA must decide, according to the types of queries and DMLs executed against the segments, which segments should be defined as non in-memory segments and those as in-memory segments. The DBA can also define more precisely which columns are good candidates for IM column store:
- In row format exclusively: The segments being frequently accessed by OLTP style queries, operating on few rows returning many columns are good candidates for the buffer cache. These segments should not necessarily be defined as in-memory segments and will be sent to the buffer cache only.
- In dual format simultaneously: The segments being frequently accessed by analytical style queries, operating on many rows returning few columns are good candidates for IM column store. If a segment is defined as an in-memory segment but has some columns defined as non in-memory columns, the queries that select any non in-memory columns are sent to the buffer cache and those selecting in-memory columns only are sent to the IM column store. Any fetch-by-rowid is performed on the segment through the buffer cache.
Any DML performed on these objects is executed via the buffer cache.
Advantages of In-Memory Column Store
There are three distinct advantage of In-Memory Column Store:
- Queries run a lot faster: All data can be populated in memory in a compressed columnar format. No index is required and used. Queries run at least 100 times faster than when fetching data from buffer cache thanks to the columnar compressed format.
- DMLs are faster: Analytics indexes can be eliminated being replaced by scans of the IM column store representation of the table.
- Arbitrary ad-hoc queries run with good performance, since the table behaves as if all columns are indexed
In-Memory Column Store Pools
The In-Memory area is sub-divided into two pools: a 1MB pool used to store the actual column formatted data populated into memory, and a 64K pool used to store metadata about the objects that are populated into the IM column store. The amount of available memory in each pool is visible in theV$INMEMORY_AREA view. The relative size of the two pools is determined by internal heuristics; the majority of the In-Memory area memory is allocated to the 1MB pool.
sql> select pool, alloc_bytes, used_bytes, population_status from V$INMEMORY_AREA; POOL ALLOC_BYTES USED_BYTES POLULATE STATUS ---------- ----------- ---------- --------------- 1MB POOL 1710227456 16777216 DONE 64 KB POOL 419430400 1900544 DONE
The size of the In-Memory area, within the SGA, is controlled by the initialization parameter INMEMORY_SIZE (default 0). The In-Memory area must have a minimum size of 100MB. The current size of the In-Memory area is visible in V$SGA. Starting in 12.2, it is possible to increase the size of the In-Memory area on the fly, by increasing the INMEMORY_SIZE parameter via an ALTER SYSTEM command, assuming there is spare memory within the SGA. The INMEMORY_SIZE parameter must be increased by 128MB or more in order for this change to take effect. It is not possible to shrink the size of the In-Memory area on the fly. A reduction in the size of the INMEMORY_SIZE parameter will not take effect until the database instance is restarted. It is important to note that the In-Memory area is not impacted or controlled by Oracle Automatic Memory Management (AMM).
Implementing In-Memory Column Store
Before tables or materialized views can be populated into the IM column store, you must enable the IM column store for the database. Before this can be accomplished, you must ensure that the database is open and The COMPATIBLE initialization parameter is set to 12.1.0 or higher. To enable the In-Memory column store:
1. In SQL*Plus or SQL Developer, log in to the database with administrative privileges.
2. Set the INMEMORY_SIZE initialization parameter to a non-zero value. The minimum setting is 100M. When you set this initialization parameter in a server parameter file (SPFILE) using the ALTER SYSTEM statement, you must specify SCOPE=SPFILE.
SQL> ALTER SYSTEM SET INMEMORY_SIZE = 1G SCOPE=SPFILE;
3. Shut down the database, and then reopen it to initialize the IM column store in the SGA.
4. Check the amount of memory currently allocated for the IM column store:
SQL> SHOW PARAMETER INMEMORY_SIZE NAME TYPE VALUE ----------------------------------- ----------- ----- inmemory_size big integer 1G
If the compatibility level is 12.2 or greater, you can dynamically increase the IM column store using the SQL ALTER SYSTEM SET INMEMORY_SIZE command. For example:
SQL> ALTER SYSTEM SET INMEMORY_SIZE = 2G SCOPE=BOTH;
Note that the size of the IM column store cannot be decreased dynamically. If you wish to do this, you must issue an ALTER SYSTEM SET INMEMORY_SIZE…SCOPE=SPFILE command then restart the database.
In-Memory Column Store Population
Not all of the objects in an Oracle database need to be populated in the IM column store. The IM column store should be populated with the most performance-critical data in the database. Less performance-critical data can reside on lower cost flash or disk. Of course, if your database is small enough, you can populate all of your tables into the IM column store. In-Memory adds a new INMEMORY attribute for tables and materialized views. Only objects with the INMEMORY attribute are populated into the IM column store. The INMEMORY attribute can be specified on a tablespace, table, (sub)partition, or materialized view. If it is enabled at the tablespace level, then all new tables and materialized views in the tablespace will be enabled for the IM column store by default.
By default, all of the columns in an object with the INMEMORY attribute will be populated into the IM column store. However, it is possible to populate only a subset of columns if desired using the NO INMEMORY clause. This clause can also be used to indicate an object is no longer a candidate and remove it from the IM store:
ALTER TABLE sales MODIFY PARTITION SALES_Q2_2008 NO INMEMORY;
Only objects with the INMEMORY attribute are eligible for population. The INMEMORY attribute can be specified on a tablespace, table, partition, or materialized view.
SQL> ALTER TABLE sales INMEMORY
All columns in an object with the INMEMORY attribute will be populated into the IM column store. To exclude specific columns:
SQL> ALTER TABLE sales INMEMORY NO INMEMORY(prod_id);
If enabled at the tablespace level, then new materialized views and tables in the tablespace will be enabled by default:
SQL> ALTER TABLESPACE ts_data DEFAULT INMEMORY;
Prioritization of In-Memory Population
The order in which objects are populated is controlled by the keyword PRIORITY, which has five levels.
|CRITICAL||Object is populated immediately after database is opened|
|HIGH||Object is populated after all CRITICAL objects have been populated, if space remains available in the IM column store|
|MEDIUM||Object is populated after all CRITICAL and HIGH objects have been populated, and space remains available in the IM column store|
|LOW||Object is populated after all CRITICAL, HIGH, and MEDIUM objects have been populated, if space remains available in the IM column store|
|NONE||Objects only populated after they are scanned for the first time (Default), if space is available in the IM column store|
You can specify that the database populates objects in the IM column store either at database instance startup or when INMEMORY objects are accessed. The population algorithm also varies depending on whether you use single-instance or Oracle RAC.
DDL statements include an INMEMORY PRIORITY subclause that provides more control over the population queue.
Objects are populated into the IM column store either in a prioritized list immediately after the database is opened or after they are scanned (queried) for the first time. The order in which objects are populated is controlled by the keyword PRIORITY, which has five levels (see above). The default PRIORITY is NONE, which means an object is populated only after it is scanned for the first time. All objects at a given priority level must be fully populated before the population for any objects at a lower priority level can commence. However, the population order can be superseded if an object without a PRIORITY is scanned, triggering its population into IM column store.
In-Memory Column Store and Oracle RAC
Each node in an Oracle RAC environment has its own In-Memory (IM) column store. Oracle recommends that you equally size the IM column stores on each Oracle RAC node. For any Oracle RAC node that does not require an IM column store, set the INMEMORY_SIZE parameter to 0.
It is possible to have completely different objects populated on every node, or to have larger objects distributed across all of the IM column stores in the cluster. It is also possible to have the same objects appear in the IM column store on every node (on engineered systems, only). The distribution of objects across the IM column stores in a cluster is controlled by two additional sub-clauses to the INMEMORY attribute; DISTRIBUTE and DUPLICATE.
The DISTRIBUTE clause can be used to specify how an object is distributed across the cluster. By default(DISTRIBUTE AUTO), the type of partitioning used (if any) determines how the object is distributed. If the object is not partitioned it is distributed by rowid range.
Alternatively, you can specify the DISTRIBUTE BY clause to over-ride the default behavior. The DISTRIBUTE clause supports BY ROWID, BY PARTITION, BY SUBPARTITION, and FOR SERVICE:
SQL> ALTER TABLE lineorder INMEMORY DISTRIBUTE BY PARTITION; SQL> ALTER TABLE sales INMEMORY DISTRIBUTE FOR SERVICE sales_ebiz;
The DUPLICATE clause is used to control how an object should be duplicated across the IM column stores in the cluster. If you specify just DUPLICATE, then one mirrored copy of the data is distributed across the IM column stores in the cluster. If you want to duplicate the entire object in each IM column store in the cluster, then specify DUPLICATE ALL.
The IM column store is populated whenever a database instance restarts, which can be a slow operation that is I/O-intensive and CPU-intensive. In-Memory FastStart optimizes the population of database objects in the In-Memory column store by storing In-Memory compression units directly on disk.
When IM FastStart is enabled, the database periodically saves a copy of columnar data to disk for faster repopulation during instance restarts. If the database re-opens after being closed, then the database reads columnar data from the FastStart area, and then populates it into the IM column store, ensuring that all transactional consistencies are maintained.
An IM FastStart tablespace requires intermittent I/O while the database is open and operational. The performance gain occurs when the database re-opens because the database avoids the CPU-intensive compression and formatting of data.
- The FastStart enhancement to In-Memory column store was introduced in Oracle Database 12.2.
- The In-Memory column store is populated whenever a database instance restarts. This can be a slow operation that is I/O and CPU intensive.
- IM FastStart optimizes object population in the IM column store by storing IM compression units (IMCU) directly on disk. This results in faster repopulation during instance restarts
- If the database re-opens after being closed, then the database reads columnar data from the FastStart area. It is then populated it into the IM column store, ensuring that all transactional consistencies are maintained.
In-Memory FastStart Architecture
During the first population after the FastStart area is enabled, the database creates the FastStart area. The database manages the FastStart area automatically as follows:
- Whenever population or repopulation of an object occurs, the database writes its columnar data to the FastStart area. The Space Management Worker Processes (Wnnn) write IMCUs to the SecureFiles LOB named SYSDBinstance_name_LOBSEG$. The database writes FastStart metadata to the SYSAUX tablespace, which must be online. Depending on how much DML activity occurs for a CU, a lag can exist between the CUs in the FastStart area and the CUs in the IM column store. The “hotter” a CU is, the less frequently the database populates it in the IM column store and writes it to the FastStart area. If the database crashes, then some CUs that were populated in the IM column store may not exist in the FastStart area.
- If the attribute of a populated object is changed to NOINMEMORY, then the database automatically removes its IMCUs from the FastStart area.
- If the FastStart tablespace runs out of space, then the database uses an internal algorithm to drop the oldest segments, and continues writing to the FastStart area. If no space remains, then the database stops writing to the FastStart area.
The figure above shows PROD, CUST, and SALES populated in the IM column store. When the FastStart area is enabled, the database also writes the IMCUs for these segments to the FastStart area in the fs_tbs. If the database re-opens or if the instance restarts, then the database can validate the IMCUs for modifications to ensure the transactional consistency, and reuse the IMCUs. Regardless of whether the FastStart area is enabled, the database stores data blocks and segments on disk in the users tablespace.
Enabling In-Memory FastStart
A FastStart area is a designated tablespace where IM FastStart stores and manages data for INMEMORY objects. Oracle Database manages the FastStart tablespace without DBA intervention. Only one FastStart area, and one designated FastStart tablespace, is allowed for each PDB or non-CDB. You cannot alter or drop the tablespace while it is the designated IM FastStart tablespace. In an Oracle RAC database, all nodes share the FastStart data.
The DBMS_INMEMORY_ADMIN package provides procedures to enable and disable IM FastStart and also to migrate the IM FastStart to another tablespace, and a function to retrieve the name of the tablespace that is currently designated as the IM FastStart area. Use the DBMS_INMEMORY_ADMIN.FASTSTART_ENABLE procedure to enable a FastStart tablespace.
SQL> exec DBMS_INMEMORY_ADMIN.ENABLE_FASTSTART ('fs_tbs')
The Space Management Worker Processes (Wnnn) creates an empty SecureFiles LOB named SYSDBinstance_name_LOBSEG$
FastStart Area in Oracle RAC
The FastStart area is shared across all Oracle RAC nodes. This feature enables maximum sharing and reusability across the cluster. Only one copy of an IMCU resides in the FastStart area. For example, if DUPLICATE ALL is specified for an object in a four-node cluster, then four copies of the object exist in the IM column stores. However, the database saves only one copy to the FastStart area.
Any database instance in an Oracle RAC cluster can use an IMCU in the FastStart area. This feature improves performance of instance restarts in an Oracle RAC environment. For example, the sales table might have three partitions: sales_2014, sales_2015, and sales_2016, with each partition populated in a different instance.
An instance failure occurs, with one instance unable to restart. If sufficient space is available in the IM column stores, then the surviving instances can read the IMCUs that were previously populated in the inaccessible instance. Thus, all three sales table partitions are available to applications.
How the Database Reads from the FastStart Area
The FastStart area defines what data is loaded when the database reopens, but not when it is loaded. Population is controlled by the priority settings. When the database reopens, the standard PRIORITY rules determine population. For example, the database populates objects with PRIORITY NONE on demand. Objects with priority CRITICAL are higher in the automatic population queue than objects with priority LOW.
For example, in a single-instance database, the sales, customers, and product tables are populated with PRIORITY NONE in the IM column store. At every repopulation, the database saves the IMCUs for these tables to the FastStart area. Assume that the instance unexpectedly terminates. When you reopen the database, the IM column store is empty. If a query scans the sales, customers, or product table, then the database loads the IMCUs for this table from the FastStart area into the IM column store.
In most cases, the FastStart area increases the speed of population. However, if any CU stored in the FastStart area reaches an internal threshold of DML activity, then the database populates the row data from data files instead of from the FastStart area.