• About GEMINI Tables
  • Using GEMINI Tables

     


    About GEMINI Tables

    Contents

    GEMINI is a transaction-safe table handler for MySQL. It provides row-level locking, robust transaction support and reliable crash recovery. It is targeted for databases that need to handle heavy multi-user updates typical of transaction processing applications while still providing excellent performance for read-intensive operations. The GEMINI table type is developed and supported by NuSphere Corporation (see http://www.nusphere.com/).

    GEMINI provides full ACID transaction properties (Atomic, Consistent, Independent, and Durable) with a programming model that includes support for statement atomicity and all four standard isolation levels (Read Uncommitted, Read Committed, Repeatable Read, and Serializable) defined in the SQL standard.

    GEMINI tables support row-level and table-level locking to increase concurrency in applications and allow reading of tables without locking for maximum concurrency in a heavy update environment. The transaction, locking, and recovery mechanisms are tightly integrated to eliminate unnecessary administration overhead.

    In general, if GEMINI tables are selected for an application, it is recommended that all tables updated in the application be GEMINI tables to provide well-defined system behavior. If non-GEMINI tables are mixed into the application then, ACID transaction properties cannot be maintained. While there are clearly cases where mixing table types is appropriate, it should always be done with careful consideration of the impact on transaction consistency and recoverability needs of the application and underlying database.

    The GEMINI table type is derived from a successful commercial database and uses the storage kernel technology tightly integrated with MySQL server. The basic GEMINI technology is in use by millions of users worldwide in production environments today. This maturity allows GEMINI tables to provide a solution for those users who require transaction-based behavior as part of their applications.

    The GEMINI table handler supports a configurable data cache that allows a significant portion of any database to be maintained in memory while still allowing durable updates.

     GEMINI Features

    The following summarizes the major features provided by GEMINI tables.

     GEMINI Concepts

    This section highlights some of the important concepts behind GEMINI and the GEMINI programming model, including:

    These features are described below.

    ACID Transactions

    ACID in the context of transactions is an acronym which stands for Atomicity, Consistency, Isolation, Durability.

    Attribute Description
    Atomicity A transaction allows for the grouping of one or more changes to tables and rows in the database to form an atomic or indivisible operation. That is, either all of the changes occur or none of them do. If for any reason the transaction cannot be completed, everything this transaction changed can be restored to the state it was in prior to the start of the transaction via a rollback operation.
    Consistency Transactions always operate on a consistent view of the data and when they end always leave the data in a consistent state. Data may be said to be consistent as long as it conforms to a set of invariants, such as no two rows in the customer table have the same customer ID and all orders have an associated customer row. While a transaction executes, these invariants may be violated, but no other transaction will be allowed to see these inconsistencies, and all such inconsistencies will have been eliminated by the time the transaction ends.
    Isolation To a given transaction, it should appear as though it is running all by itself on the database. The effects of concurrently running transactions are invisible to this transaction, and the effects of this transaction are invisible to others until the transaction is committed.
    Durability Once a transaction is committed, its effects are guaranteed to persist even in the event of subsequent system failures. Until the transaction commits, not only are any changes made by that transaction not durable, but are guaranteed not to persist in the face of a system failures, as crash recovery will rollback their effects.

    Transaction COMMIT/ROLLBACK

    As stated above, a transaction is a group of work being done to data. Unless otherwise directed, MySQL considers each statement a transaction in itself. Multiple updates can be accomplished by placing them in a single statement, however they are limited to a single table.

    Applications tend to require more robust use of transaction concepts. Take, for example, a system that processes an order: A row may be inserted in an order table, additional rows may be added to an order-line table, updates may be made to inventory tables, etc. It is important that if the order completes, all the changes are made to all the tables involved; likewise if the order fails, none of the changes to the tables must occur. To facilitate this requirement, MySQL has syntax to start a transaction called BEGIN WORK. All statements that occur after the BEGIN WORK statement are grouped into a single transaction. The end of this transaction occurs when a COMMIT or ROLLBACK statement is encountered. After the COMMIT or ROLLBACK the system returns back to the behavior before the BEGIN WORK statement was encountered where every statement is a transaction.

    To permanently turn off the behavior where every statement is a transaction, MySQL added a variable called AUTOCOMMIT. The AUTOCOMMIT variable can have two values, 1 and 0. The mode where every statement is a transaction is when AUTOCOMMIT is set to 1 (AUTOCOMMIT=1). When AUTOCOMMIT is set to 0 (AUTOCOMMIT=0), then every statement is part of the same transaction until the transaction end by either COMMIT or ROLLBACK. Once a transaction completes, a new transaction is immediately started and the process repeats.

    Here is an example of the SQL statements that you may find in a typical order:

    BEGIN WORK;
       INSERT INTO order VALUES ...;
       INSERT INTO order-lines VALUES ...;
       INSERT INTO order-lines VALUES ...;
       INSERT INTO order-lines VALUES ...;
       UPDATE inventory WHERE ...;
    COMMIT;

    This example shows how to use the BEGIN WORK statement to start a transaction. If the variable AUTOCOMMIT is set to 0, then a transaction would have been started already. In this case, the BEGIN WORK commits the current transaction and starts a new one.

    Statement Atomicity

    As mentioned above, when running with AUTOCOMMIT set to 1, each statement executes as a single transaction. When a statement has an error, then all changes make by the statement must be undone. Transactions support this behavior. Non-transaction safe table handlers would have a partial statement update where some of the changes from the statement would be contained in the database and other changes from the statement would not. Work would need to be done to manually recover from the error.

    Recovery

    Transactions are the basis for database recovery. Recovery is what supports the Durability attribute of the ACID transaction.

    GEMINI uses a separate file called the Recovery Log located in the $DATADIR directory named gemini.rl. This file maintains the integrity of all the GEMINI tables. GEMINI can not recover any data from non-GEMINI tables. In addition, the gemini.rl file is used to rollback transactions in support of the ROLLBACK statement.

    In the event of a system failure, the next time the MySQL server is started, GEMINI will automatically go through its crash recovery process. The result of crash recovery is that all the GEMINI tables will contain the latest changes made to them, and all transactions that were open at the time of the crash will have been rolled back.

    The GEMINI Recovery Log reuses space when it can. Space can be reused when information in the Recovery Log is no longer needed for crash recovery or rollback.

    Isolation Levels

    There are four isolation levels supported by GEMINI:

    These isolation levels apply only to shared locks obtained by select statements, excluding select for update. Statements that get exclusive locks always retain those locks until the transaction commits or rolls back.

    By default, GEMINI operates at the READ COMMITTED level. You can override the default using the following command:

    SET [GLOBAL | SESSION] TRANSACTION ISOLATION LEVEL [READ UNCOMMITTED | READ COMMITTED | REPEATABLE READ | SERIALIZABLE ]

    If the SESSION qualifier used, the specified isolation level persists for the entire session. If the GLOBAL qualifier is used, the specified isolation level is applied to all new connections from this point forward. Note that the specified isolation level will not change the behavior for existing connections including the connection that executes the SET GLOBAL TRANSACTION ISOLATION LEVEL statement.

    Isolation Level Description
    READ UNCOMMITTED Does not obtain any locks when reading rows. This means that if a row is locked by another process in a transaction that has a more strict isolation level, the READ UNCOMMITTED query will not wait until the locks are released before reading the row. You will get an error if attempt any updates while running at this isolation level.
    READ COMMITTED Locks the requested rows long enough to copy the row from the database block to the client row buffer. If a READ COMMITTED query finds that a row is locked exclusively by another process, it will wait until either the row has been released, or the lock timeout value has expired.
    REPEATABLE READ Locks all the rows needed to satisfy the query. These locks are held until the transaction ends (commits or rolls back). If a REPEATABLE READ query finds that a row is locked exclusively by another process, it will wait until either the row has been released, or the lock timeout value has expired.
    SERIALIZABLE Locks the table that contains the rows needed to satisfy the query. This lock is held until the transaction ends (commits or rolls back). If a SERIALIZABLE query finds that a row is exclusively locked by another process, it will wait until either the row has been released, or the lock timeout value has expired.

    The statements that get exclusive locks are INSERT, UPDATE, DELETE and SELECT ... FOR UPDATE. Select statements without the FOR UPDATE qualifier get shared locks which allow other not "for update" select statements to read the same rows but block anyone trying to update the row from accessing it. Rows or tables with exclusive locks block all access to the row from other transactions until the transaction ends.

    In general terms, the higher the Isolation level the more likelihood of having concurrent locks and therefore lock conflicts. In such cases, adjust the -O gemini_lock_table_size accordingly.

    Row-Level Locking

    GEMINI uses row locks, which allows high concurrency for requests on the same table.

    In order to avoid lock table overflow, SQL statements that require applying locks to a large number of rows should either be run at the serializable isolation level or should be covered by a lock table statement.

    Memory must be pre-allocated for the lock table. The mysqld server startup option -0 gemini_lock_table_size can be used to adjust the number of concurrent locks.

     GEMINI Limitations

    The following limitations are in effect for the current version of GEMINI:


    Using GEMINI Tables

    Contents

    This section explains the various startup options you can use with GEMINI tables, how to backup GEMINI tables, some performance considerations and sample configurations, and a brief discussion of when to use GEMINI tables.

    Specifically, the topics covered in this section are:

     Startup Options

    The table below lists options to mysqld that can be used to change the behavior of GEMINI tables.

    Option Description
    --default-table-type=gemini Sets the default table handler to be GEMINI. All create table statements will create GEMINI tables unless otherwise specified with TYPE=table-type. As noted above, there is currently a limitation with TEMPORARY tables using GEMINI.
    --gemini-flush-log-at-commit Forces the recovery log buffers to be flushed after every commit. This can have a serious performance penalty, so use with caution.
    --gemini-recovery=FULL | NONE | FORCE Sets the recovery mode. Default is FULL. NONE is useful for performing repeatable batch operations because the updates are not recorded in the recovery log. FORCE skips crash recovery upon startup; this corrupts the database, and should be used in emergencies only.
    --gemini-unbuffered-io All database writes bypass the OS cache. This can provide a performance boost on heavily updated systems where most of the dataset being worked on is cached in memory with the gemini_buffer_cache parameter.
    --O gemini_buffer_cache=size Amount of memory to allocate for database buffers, including Index and Record information. It is recommended that this number be 10% of the total size of all GEMINI tables. Do not exceed amount of memory on the system!
    --O gemini_connection_limit=# Maximum number of connections to GEMINI; default is 100. Each connection consumes about 1K of memory.
    --O gemini_io_threads=# Number of background I/O threads; default is 2. Increase the number when using --gemini-unbuffered-io
    --O gemini_lock_table_size=# Sets the maximum number of concurrent locks; default is 4096. Using SET [ GLOBAL | SESSION ] TRANSACTION ISOLATION = ... will determine how long a program will hold row locks.
    --O gemini_lock_wait_timeout=seconds Number of seconds to wait for record locks when performing queries; default is 10 seconds. Using SET [ GLOBAL | SESSION ] TRANSACTION ISOLATION = ... will determine how long a program will hold row locks.
    --skip-gemini Do not use GEMINI. If you use --skip-gemini, MySQL will not initialize the GEMINI table handler, saving memory; you cannot use GEMINI tables if you use --skip-gemini.
    --transaction-isolation=READ-UNCOMMITTED | READ-COMMITTED | REPEATABLE-READ | SERIALIZABLE Sets the GLOBAL transaction isolation level for all users that connect to the server; can be overridden with the SET ISOLATION LEVEL statement.

     Creating GEMINI Tables

    GEMINI tables can be created by either using the CREATE TABLE syntax or the ALTER TABLE syntax.

    See Chapter 9, "MySQL Tutorial," in the MySQL Reference Manual for more information on how to create and use MySQL tables.

     Backing Up GEMINI Tables

    GEMINI supports both BACKUP TABLE and RESTORE TABLE syntax. To learn more about how to use BACKUP and RESTORE, see Section 7.3, "BACKUP TABLE Syntax," and Section 7.4, "RESTORE TABLE Syntax," in the MySQL Reference Manual.

    To backup GEMINI tables outside of the MySQL environment, you must first shut down the MySQL server. Once the server is shut down, you can copy the files associated with GEMINI to a different location. The files that make up the GEMINI table handler are:

    All the GEMINI files must be copied together. You can not copy just the .gmi and .gmd files to a different $DATADIR and have them become part of a new database. You can copy an entire $DATADIR directory to another location and start a MySQL server using the new $DATADIR.

     Restoring GEMINI Tables

    To restore GEMINI tables outside of the MySQL environment, you must first shut down the MySQL server. Once the server is shut down, you can remove all GEMINI files in the target $DATADIR and then copy the files previously backed up into the $DATADIR directory.

    As mentioned above, the files that make up the GEMINI table handler are:

    When restoring a table, all the GEMINI files must be copied together. You can not restore just the .gmi and .gmd files.

     Using Auto_Increment Columns With GEMINI Tables

    As mentioned previously, GEMINI tables support row-level and table-level locking to increase concurrency in applications and to allow reading of tables without locking for maximum concurrency in heavy update environments. This feature has several implications when working with auto_increment tables.

    In MySQL, when a column is defined as an auto_increment column, and a row is inserted into the table with a NULL for the column, the auto_increment column is updated to be 1 higher than the highest value in the column.

    With MyISAM tables, the auto_increment function is implemented by looking in the index and finding the highest value and adding 1 to it. This is possible because the entire ISAM table is locked during the update period and the increment value is therefore guaranteed to not be changing.

    With GEMINI tables, the auto_increment function is implemented by maintaining a counter in a separate location from the table data. Instead of looking at the highest value in the table index, GEMINI tables look at this separately maintained counter. This means that in a transactional model, unlike the bottleneck inherent in the MyISAM approach, GEMINI users do not have to wait until the transaction that added the last value either commits or rollbacks before looking at the value.

    Two side-effects of the GEMINI implementation are:

    Note that if you delete the row containing the maximum value for an auto_increment column, the value will be reused with a GEMINI table but not with a MyISAM table.

    See Section 7.7, "CREATE TABLE Syntax," in the MySQL Reference Manual for more information about creating auto_increment columns.

     Performance Considerations

    In addition to designing the best possible application, configuration of the data and the server startup parameters need to be considered. How the hardware is being used can have a dramatic affect on how fast the system will respond to queries. Disk Drives and Memory must both be considered.

    Disk Drives

    For best performance, you want to spread the data out over as many disks as possible. Using RAID 10 stripes work very well. If there are a lot of updates then the recovery log (gemini.rl) should be on a relatively quiet disk drive.

    To spread the data out without using RAID 10, you can do the following:

    Memory

    The more data that can be placed in memory the faster the access to the data. Figure out how large the GEMINI data is by adding up the .gmd and .gmi file sizes. If you can, put at least 10% of the data into memory. You allocate memory for the rows and indexes by using the gemini_buffer_cache startup parameter. For example:

    mysqld -O gemini_buffer_cache=800M

    would allocate 800MB of memory for the GEMINI buffer cache.

     Sample Configurations

    Based on the performance considerations above, we can look at some examples for how to get the best performance out of the system when using GEMINI tables.

    Hardware Configuration
    One CPU, 128MB memory, one disk drive

    Allocate 80MB of memory for reading and updating GEMINI tables by starting the mysqld server with the following option:

    -O gemini_buffer_cache=80M

    Two CPUs, 512MB memory, four disk drives

    Use RAID 10 to stripe the data across all available disks, or use the method described in the performance considerations section, above. Allocate 450MB of memory for reading/updating GEMINI tables:

    -O gemini_buffer_cache=450M

     When To Use GEMINI Tables

    Because the GEMINI table handler provides crash recovery and transaction support, there is extra overhead that is not found in other non-transaction safe table handlers. Here are some general guidelines for when to employ GEMINI and when to use other non-transaction safe tables (NTST).

    Access Trends Table Type Reason
    Read-only NTST Less overhead and faster
    Critical data GEMINI Crash recovery protection
    High concurrency GEMINI Row-level locking
    Heavy update GEMINI Row-level locking

    The table below shows how a typical application schema could be defined.

    Table Contents Table Type Reason
    account Customer account data GEMINI Critical data, heavy update
    order Orders for a customer GEMINI Critical data, heavy update
    orderline Orderline detail for an order GEMINI Critical data, heavy update
    invdesc Inventory description NTST Read-only, frequent access
    salesrep Sales rep information NTST Infrequent update
    inventory Inventory information GEMINI High concurrency, critical data
    config System configuration NTST Read-only