Exploring MariaDB’s Storage Engine Options

Author: Srinivasa Krishna | 7 min read | April 12, 2021

MariaDB is a flexible, modern relational database that’s open source and is capable of turning data into structured information. It supports many types of workloads in a single database platform and offers pluggable storage architecture for flexibility and optimization purposes.

You can set up storage engines on a per-database instance or per-table basis. Here are some of the storage engines you can leverage in MariaDB for your development projects.

InnoDB

Originally, the default MariaDB storage engine was XtraDB. Starting with MariaDB 10.2, InnoDB is the default. However, starting from MariaDB 5.5, InnoDB is included as a plugin.

MariaDB’s default storage engine is InnoDB. When you’re looking for a general-purpose option for your transactional workloads, or you’re not sure which solution to use, InnoDB is a brilliant choice. It works best in a mixed read/write environment and offers compression and encryption. You also need to use InnoDB if you want to set up multi-mastering clustering that supports synchronous replication.

CONNECT

The CONNECT handler was introduced in MariaDB 10.0. The strength of this handler is that it can access data from multiple places on a server as if it was a centralized database. In addition, CONNECT doesn’t use locking, meaning that data files are opened and closed for each query.

You may want to use CONNECT for importing and exporting data from a MariaDB database, and for all types of Business Intelligence applications. However, CONNECT isn’t appropriate for transactional applications.

S3

The S3 storage engine was introduced in MariaDB 10.5.4. S3 is read-only and can be used to archive MariaDB tables in Amazon S3. You can also use it with any of the many third-party public or private cloud systems that implement the S3 API, while still having the data accessible for reading in MariaDB.

Typically, you would use S3 when you have tables that are almost inactive, but you still need to maintain them. Using S3 allows you to move that type of table to an archiving service, using an S3 API. S3 compatible storage is much less costly than other alternatives, and many implementations provide reliable long-term storage.

MyRocks

This storage engine comes from Facebook developers and is optimized for space and write-intensive workloads. The compression functionality is excellent, and it’s a fork of RocksDB, which is a Google project that’s performance focused. When you have servers with SSDs and multi-core processors, you’ll see a good boost with MyRocks. Other advantages include:

It has 2x better compression than InnoDB meaning you get greater space efficiency.
It has a 10x less write amplification compared to InnoDB, giving you greater writing efficiency.
It avoids all compaction overheads when faster data loading is enabled because it writes data directly to the bottommost level.
It offers faster replication because there are no random reads for updating secondary keys, except for unique indexes.

ColumnStore

Put your analytics workloads into ColumnStore for a columnar format. This storage engine keeps the data store separate from the database, which allows it to be distributed on multiple servers. This architecture supports real-time ad hoc queries so you can access insights from your data faster. This storage engine supports hundreds of billions of rows and you do not need to use snowflake schema or indexes. Other characteristics include:

It is designed specially to handle analytical workloads.
Data is written by column rather than row and is automatically partitioned, therefore no indexes are necessary.
It can be used as the analytical storage engine for HTAP.
It is easily scalable.
It supports multiple connectors and data adapters to allow the use of commonly used business intelligence tools.
Network latency has only a small impact on Enterprise ColumnStore.

Xpand

When scaling is your focus, Xpand is your top choice. This storage engine allows you to distribute your tables and elastically scale them in a high availability environment. Its performance keeps up with its scale, supporting millions of transactions per second. One of the biggest advantages of Xpand is that you can scale without bringing in a specialized database solution. Other advantages include:

It provides distributed SQL capabilities and is ACID compliant.
It is highly available due to maintaining replicas of each slice, allowing it to recover from a node failure without losing data.
It can maintain multiple replicas of each slice and is zone aware, allowing it to recover from multi-node failures or zone failures without losing data.
Its rebalancer maintains data distribution, meaning that a node or zone failure causes the creation of new replicas for each slice, and the rebalancer then redistributes the data.
It has write scaling since every node writes concurrently and all nodes have the latest data, and it performs writes in parallel.
It scales out because each node can read and write, reads are lockless. Writes don’t block reads, and additional nodes can be added to increase capacity.

Aria

Do you have non-transactional workloads that are read-heavy and need a crash-safe option? Aria fills these requirements nicely. It’s a performance-focused storage engine for system tables and delivers the reliability that this data needs. In addition, Aria offers a number of advantages over MyISAM, including:

Aria can replay almost everything from the log, so making a backup is easy by just copying the log. Exceptions include Batch Insert into an empty table and Alter table.
You can do unit tests of most parts.
If you experience a crash, changes will go back to the start of a statement or the last Lock Tables statement.
It allows multiple concurrent inserters into the same table.

Spider

You can use a MongoDB-like sharding architecture with this virtual storage engine. Set up list, range, and hash schemas and create partitions you can spread over multiple databases. In most cases, you use Spider alongside another storage engine. It pairs nicely with InnoDB and MyRocks for scaling their respective workloads. You may also want to use Spider for the following applications:

Sharding a big table.
Tracking data for multiple locations or branches of a company.
Query tables on multiple MariaDB servers.
Migrating tables between servers.
Running queries on both MariaDB and remote tables on non-MariaDB databases.
Migrating tables from a non-MariaDB server to a MariaDB server.

All of MariaDB’s storage engines have their place, and the best part about this database is that you can pick and choose the combinations that work best for your workload. You don’t need to compromise on your requirements or have to use a different database technology. You get it all in one place with MariaDB, along with all the other benefits it provides.

Going Open-Source: Making the Move to MariaDB from Oracle

Download our white paper to learn more about this powerful database technology, its features, and how to handle the migration process.

Blog Author

Srinivasa Krishna

Global Practice Lead & Director, MySQL Services

Krishna is the Global Practice Lead of MySQL and MariaDB Services at Datavail. Over 16+ years of experience in database architecture and solutions with an extensive technical background in database operations, cloud migrations and modernizations specialized in MySQL, MariaDB and Percona platforms. He also has extensive experience in cloud strategies, migrations and supporting opensource database technologies in the cloud, including AWS RDS, Aurora. Deep understanding of relational, non-relational and hybrid data models, Krishna’s collaborated with Enterprise customers and led projects that delivered high-end database solutions in Finance, Healthcare, Retail, e-commerce, Manufacturing and Telecommunications industries and help customers. He’s active as a thought leader and presents at database conferences such in SSWUG, IOUG, DBTA and AWS Workshops.