Percona Server for MySQL 5.6.36-82.0 is Now Available

MySQL Performance Blog - Fri, 2017-05-12 17:43

Percona announces the release of Percona Server for MySQL 5.6.36-82.0 on May 12, 2017. Download the latest version from the Percona web site or the Percona Software Repositories. You can also run Docker containers from the images in the Docker Hub repository.

Based on MySQL 5.6.36, and including all the bug fixes in it, Percona Server for MySQL 5.6.36-82.0 is the current GA release in the Percona Server for MySQL 5.6 series. Percona Server for MySQL is open-source and free – this is the latest release of our enhanced, drop-in replacement for MySQL. Complete details of this release are available in the 5.6.36-82.0 milestone on Launchpad.

Bugs Fixed:
  • Deadlock could occur in I/O-bound workloads when server was using several small buffer pool instances in combination with small redo log files and variable innodb_empty_free_list_algorithm set to backoff algorithm. Bug fixed #1651657.
  • Querying TABLE_STATISTICS in combination with a stored function could lead to a server crash. Bug fixed #1659992.
  • tokubackup_slave_info file was created for a master server after taking the backup with Percona TokuBackup. Bug fixed #135.
  • Fixed a memory leak in Percona TokuBackup. Bug fixed #1669005.
  • Compressed columns with dictionaries could not be added to a partitioned table by using ALTER TABLE. Bug fixed #1671492.
  • Fixed a memory leak that happened in case of failure to create a multi-threaded slave worker thread. Bug fixed #1675716.
  • The combination of using any audit API-using plugin, like Audit Log Plugin and Response Time Distribution, with multi-byte collation connection and PREPARE statement with a parse error could lead to a server crash. Bug fixed #1688698 (upstream #86209).
  • The fix for bug #1433432 in Percona Server 5.6.28-76.1 caused a performance regression due to suboptimal LRU manager thread flushing heuristics. Bug fixed #1631309.
  • Creating Compressed columns with dictionaries in MyISAM tables by specifying partition engines would not result in an error. Bug fixed #1631954.
  • It was not possible to configure basedir as a symlink. Bug fixed #1639735.
  • Replication slave did not report Seconds_Behind_Master correctly when running in multi-threaded slave mode. Bug fixed #1654091 (upstream #84415).
  • DROP TEMPORARY TABLE would create a transaction in binary log on a read-only server. Bug fixed #1668602 (upstream #85258).
  • Creating a compression dictionary with innodb_fake_changes enabled could lead to a server crash. Bug fixed #1629257.

Other bugs fixed: #1660828 (upstream #84786), #1664519 (upstream #84940), #1674299, #1683456, #1670588 (upstream #84173), #1672389, #1674507, #1674867, #1675623, #1650294, #1659224, #1660565, #1662908, #1669002, #1671473, #1673800, #1674284, #1676441, #1676705, #1676847 (upstream #85671), #1677130 (upstream #85678), #1677162, #1678692, #1678792, #1680510 (upstream #85838), #1683993, #1684012, #1684078, #1684264, and #1674281.

Release notes for Percona Server for MySQL 5.6.36-82.0 are available in the online documentation. Please report any bugs on the launchpad bug tracker.

Categories: MySQL


MySQL Performance Blog - Thu, 2017-05-11 20:36

In this blog post, we’ll look at MyRocks and LOCK IN SHARE MODE.

Those who attended the March 30th webinar “MyRocks Troubleshooting” might remember our discussion with Yoshinori on LOCK IN SHARE MODE.

I did more tests, and I can confirm that his words are true: LOCK IN SHARE MODE works in MyRocks.

This quick example demonstrates this. The initial setup:

CREATE TABLE t (
  id int(11) NOT NULL,
  f varchar(100) DEFAULT NULL,
  PRIMARY KEY (id)
) ENGINE=ROCKSDB;

insert into t values(12345, 'value1'), (54321, 'value2');

In session 1:

session 1> begin;
Query OK, 0 rows affected (0.00 sec)

session 1> select * from t where id=12345 lock in share mode;
+-------+--------+
| id    | f      |
+-------+--------+
| 12345 | value1 |
+-------+--------+
1 row in set (0.01 sec)

In session 2:

session 2> begin;
Query OK, 0 rows affected (0.00 sec)

session 2> update t set f='value3' where id=12345;
ERROR HY000: Lock wait timeout exceeded; try restarting transaction

However, in the webinar I wanted to remind everyone about the differences between LOCK IN SHARE MODE and FOR UPDATE. To do so, I added the former to my “session 2” test for the webinar. Once I did, session 2 ignored the lock set in “session 1”: I could update the row and commit:

session 2> select * from t where id=12345 lock in share mode;
+-------+--------+
| id    | f      |
+-------+--------+
| 12345 | value1 |
+-------+--------+
1 row in set (0.00 sec)

session 2> update t set f='value3' where id=12345;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0

session 2> commit;
Query OK, 0 rows affected (0.02 sec)

I reported this behavior here, and also in the Percona Jira bug database: MYR-107. The bug is already fixed in the Facebook MySQL build.

This test clearly demonstrates that it is fixed in Facebook. In “session 1”:

session1> CREATE TABLE `t` (
    -> `id` int(11) NOT NULL,
    -> `f` varchar(100) DEFAULT NULL,
    -> PRIMARY KEY (`id`)
    -> ) ENGINE=ROCKSDB;
Query OK, 0 rows affected (0.00 sec)

session1> insert into t values(12345, 'value1'), (54321, 'value2');
Query OK, 2 rows affected (0.00 sec)
Records: 2 Duplicates: 0 Warnings: 0

session1> begin;
Query OK, 0 rows affected (0.00 sec)

session1> select * from t where id=12345 lock in share mode;
+-------+--------+
| id    | f      |
+-------+--------+
| 12345 | value1 |
+-------+--------+
1 row in set (0.00 sec)

And now in another session:

session2> begin;
Query OK, 0 rows affected (0.00 sec)

session2> select * from t where id=12345 lock in share mode;
+-------+--------+
| id    | f      |
+-------+--------+
| 12345 | value1 |
+-------+--------+
1 row in set (0.00 sec)

session2> update t set f='value3' where id=12345;
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction: Timeout on index: test.t.PRIMARY

If you want to test the fix with the Facebook MySQL build, you need to update submodules to download the patch: git submodule update.
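The expected behavior in the fixed build can be summarized with a toy lock-compatibility model. This is only a sketch of the general shared/exclusive rule, not MyRocks code: the RowLocks class and the session names are made up for illustration, and real lock managers queue waiters and time out instead of returning False.

```python
from dataclasses import dataclass, field

# Toy model of row-lock compatibility; hypothetical, not MyRocks internals.
SHARED, EXCLUSIVE = "S", "X"


@dataclass
class RowLocks:
    # Maps row key -> {transaction name: lock mode}
    held: dict = field(default_factory=dict)

    def acquire(self, txn: str, key: int, mode: str) -> bool:
        """Return True if granted, False if the request would have to wait."""
        holders = self.held.setdefault(key, {})
        others = {t: m for t, m in holders.items() if t != txn}
        if mode == SHARED:
            # A shared lock conflicts only with another transaction's exclusive lock.
            ok = all(m == SHARED for m in others.values())
        else:
            # An exclusive lock conflicts with any lock held by another transaction.
            ok = not others
        if ok and holders.get(txn) != EXCLUSIVE:
            # Record the lock, keeping an exclusive lock if one is already held.
            holders[txn] = mode
        return ok


locks = RowLocks()
s1_read = locks.acquire("session1", 12345, SHARED)      # granted
s2_read = locks.acquire("session2", 12345, SHARED)      # granted: shared locks coexist
s2_write = locks.acquire("session2", 12345, EXCLUSIVE)  # blocked by session1's shared lock
print(s1_read, s2_read, s2_write)  # True True False
```

In this model, taking a shared lock in session 2 does not entitle it to update the row: the update still needs an exclusive lock, which conflicts with the shared lock held by session 1.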

Categories: MySQL

Percona Server for MySQL 5.5.55-38.8 is Now Available

MySQL Performance Blog - Wed, 2017-05-10 17:43

Percona announces the release of Percona Server for MySQL 5.5.55-38.8 on May 10, 2017. Based on MySQL 5.5.55, including all the bug fixes in it, Percona Server for MySQL 5.5.55-38.8 is now the current stable release in the 5.5 series.

Percona Server for MySQL is open-source and free. You can find release details in the 5.5.55-38.8 milestone on Launchpad. Downloads are available here and from the Percona Software Repositories.

New Features:
  • Percona Server 5.5 packages are now available for Ubuntu 17.04 (Zesty Zapus).
Bugs Fixed:
  • If a bitmap write I/O error happened in the background log tracking thread while a FLUSH CHANGED_PAGE_BITMAPS was executing concurrently, it could cause a server crash. Bug fixed #1651656.
  • Querying TABLE_STATISTICS in combination with a stored function could lead to a server crash. Bug fixed #1659992.
  • Queries from the INNODB_CHANGED_PAGES table would needlessly read potentially incomplete bitmap data past the needed LSN range. Bug fixed #1625466.
  • It was not possible to configure basedir as a symlink. Bug fixed #1639735.

Other bugs fixed: #1688161, #1683456, #1670588 (upstream #84173), #1672389, #1675623, #1660243, #1677156, #1680061, #1680510 (upstream #85838), #1683993, #1684012, #1684025, and #1674281.

Find the release notes for Percona Server for MySQL 5.5.55-38.8 in our online documentation. Report bugs on the launchpad bug tracker.

Categories: MySQL

Percona-Lab/mongodb_consistent_backup: 1.0 Release Explained

MySQL Performance Blog - Wed, 2017-05-10 16:58

In this blog post, I will cover the Percona-Lab/mongodb_consistent_backup tool and the improvements in the 1.0.1 release of the tool.


mongodb_consistent_backup is a tool for performing cluster-consistent backups on MongoDB clusters or single replica sets. This tool is open source Python code, developed by Percona and published under our Percona-Lab GitHub repository. Percona-Lab is a place for code projects maintained and supported with only best-effort from Percona.

By considering the entire MongoDB cluster’s shards and individual shard members, mongodb_consistent_backup can back up a cluster with one or many shards to a single point in time. Single-point-in-time consistency of cluster backups is critical to data integrity for any “sharded” database technology, and is a topic often overlooked in database deployments.

This topic is explained in detail by David Murphy in this Percona blog:

1.0 Release

mongodb_consistent_backup originally was a single replica set backup script internal to Percona, which morphed into a large multi-threaded/concurrent Python project. It was released to the public (Percona-Lab) with some rough edges.

This release focuses on the efficiency and reliability of the existing components, on fixing many of the pain points in extending, deploying and troubleshooting the tool, and on adding some small features.

New Features: Config File Overhaul

The tool was moved to use a structured, nested YAML config file instead of the messy config implemented in 0.x.

You can see a full example of this new format at this URL:

Here’s an example of a very basic config file that’s using 3 x replica-set config servers as “seed hosts” (a new feature in 1.0!), username+password and the optional Nagios NSCA notification method:

production:
  host: csReplSet/config01:27019,config02:27019,config03:27019
  username: mongodb_consistent_password
  password: "correct horse battery staple"
  authdb: admin
  log_dir: /var/log/mongodb_consistent_backup
  backup:
    method: mongodump
    name: production-eu
    location: /var/lib/mongodb_consistent_backup
  archive:
    method: tar
  notify:
    method: nsca
    nsca:
      check_host: mongodb-production-eu
      check_name: "mongodb_consistent_backup"
      server:
  upload:
    method: none

New Features: Logging

Overall there is much more logged in this release, both in “regular” mode and “verbose” mode. A highlight for this release is live logging of the output of mongodump, something that was missing from the 0.x versions of the tool.

Now we can see the progress of the backup of each shard/replset in a cluster! Below we can see the backup of csReplSet (a config server replica set) dump many collections and complete its backup. After that, we can see the replica sets “test1” and “test2” dumping “wikipedia.pages”.

...
[2017-05-05 20:11:05,366] [INFO] [MongodumpThread-7] [MongodumpThread:wait:72] csReplSet/ done dumping config.settings (1 document)
[2017-05-05 20:11:05,367] [INFO] [MongodumpThread-7] [MongodumpThread:wait:72] csReplSet/ writing config.version to
[2017-05-05 20:11:05,372] [INFO] [MongodumpThread-7] [MongodumpThread:wait:72] csReplSet/ done dumping config.version (1 document)
[2017-05-05 20:11:05,373] [INFO] [MongodumpThread-7] [MongodumpThread:wait:72] csReplSet/ writing config.locks to
[2017-05-05 20:11:05,377] [INFO] [MongodumpThread-7] [MongodumpThread:wait:72] csReplSet/ done dumping config.locks (3 documents)
[2017-05-05 20:11:05,378] [INFO] [MongodumpThread-7] [MongodumpThread:wait:72] csReplSet/ writing config.databases to
[2017-05-05 20:11:05,381] [INFO] [MongodumpThread-7] [MongodumpThread:wait:72] csReplSet/ done dumping config.databases (1 document)
[2017-05-05 20:11:05,383] [INFO] [MongodumpThread-7] [MongodumpThread:wait:72] csReplSet/ writing config.tags to
[2017-05-05 20:11:05,385] [INFO] [MongodumpThread-7] [MongodumpThread:wait:72] csReplSet/ done dumping config.tags (0 documents)
[2017-05-05 20:11:05,387] [INFO] [MongodumpThread-7] [MongodumpThread:wait:72] csReplSet/ writing config.changelog to
[2017-05-05 20:11:05,399] [INFO] [MongodumpThread-7] [MongodumpThread:wait:72] csReplSet/ done dumping config.changelog (112 documents)
[2017-05-05 20:11:05,401] [INFO] [MongodumpThread-7] [MongodumpThread:wait:72] csReplSet/ writing captured oplog to
[2017-05-05 20:11:05,578] [INFO] [MongodumpThread-7] [MongodumpThread:run:133] Backup csReplSet/ completed in 0.71 seconds, 0 oplog changes
[2017-05-05 20:11:08,042] [INFO] [MongodumpThread-5] [MongodumpThread:wait:72] test1/ [........................] wikipedia.pages 636/35080 (1.8%)
[2017-05-05 20:11:08,071] [INFO] [MongodumpThread-6] [MongodumpThread:wait:72] test2/ [........................] wikipedia.pages 878/35118 (2.5%)
[2017-05-05 20:11:11,041] [INFO] [MongodumpThread-5] [MongodumpThread:wait:72] test1/ [#.......................] wikipedia.pages 1853/35080 (5.3%)
[2017-05-05 20:11:11,068] [INFO] [MongodumpThread-6] [MongodumpThread:wait:72] test2/ [#.......................] wikipedia.pages 2063/35118 (5.9%)
[2017-05-05 20:11:14,043] [INFO] [MongodumpThread-5] [MongodumpThread:wait:72] test1/ [##......................] wikipedia.pages 2983/35080 (8.5%)
[2017-05-05 20:11:14,075] [INFO] [MongodumpThread-6] [MongodumpThread:wait:72] test2/ [##......................] wikipedia.pages 3357/35118 (9.6%)
[2017-05-05 20:11:17,040] [INFO] [MongodumpThread-5] [MongodumpThread:wait:72] test1/ [##......................] wikipedia.pages 4253/35080 (12.1%)
[2017-05-05 20:11:17,070] [INFO] [MongodumpThread-6] [MongodumpThread:wait:72] test2/ [###.....................] wikipedia.pages 4561/35118 (13.0%)
[2017-05-05 20:11:20,038] [INFO] [MongodumpThread-5] [MongodumpThread:wait:72] test1/ [###.....................] wikipedia.pages 5180/35080 (14.8%)
[2017-05-05 20:11:20,067] [INFO] [MongodumpThread-6] [MongodumpThread:wait:72] test2/ [###.....................] wikipedia.pages 5824/35118 (16.6%)
[2017-05-05 20:11:23,050] [INFO] [MongodumpThread-5] [MongodumpThread:wait:72] test1/ [####....................] wikipedia.pages 6216/35080 (17.7%)
[2017-05-05 20:11:23,072] [INFO] [MongodumpThread-6] [MongodumpThread:wait:72] test2/ [####....................] wikipedia.pages 6964/35118 (19.8%)
...

Also, while backup data is gathered the status output from each Oplog tailing thread is now logged every 30 seconds (by default):

...
[2017-05-05 20:12:09,648] [INFO] [TailThread-2] [TailThread:status:60] Oplog tailer test1/ status: 256 oplog changes, ts: Timestamp(1494020048, 6)
[2017-05-05 20:12:11,033] [INFO] [TailThread-3] [TailThread:status:60] Oplog tailer test2/ status: 1588 oplog changes, ts: Timestamp(1494020049, 50)
[2017-05-05 20:12:22,804] [INFO] [TailThread-4] [TailThread:status:60] Oplog tailer csReplSet/ status: 43 oplog changes, ts: Timestamp(1494020062, 1)
...

You can now write log files to disk by setting the ‘log_dir’ config variable or ‘--log-dir’ command-line flag. One log file per backup is written to this directory, with a symlink pointing to the latest log file. The previous backup’s log file is automatically compressed with gzip.
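The log handling described above (one file per backup, a symlink to the latest, gzip for the previous file) can be sketched in a few lines of Python. This is a minimal illustration and not the tool's actual code: the function name, the "latest.log" link name and the "backup-0001" file names are all hypothetical.

```python
import gzip
import os
import shutil
import tempfile


def start_backup_log(log_dir: str, backup_name: str, link_name: str = "latest.log") -> str:
    """Create a new log file, gzip the previous one, and repoint the symlink."""
    os.makedirs(log_dir, exist_ok=True)
    link_path = os.path.join(log_dir, link_name)

    # Compress the log that the symlink currently points to, if any.
    if os.path.islink(link_path):
        previous = os.path.join(log_dir, os.readlink(link_path))
        if os.path.isfile(previous):
            with open(previous, "rb") as src, gzip.open(previous + ".gz", "wb") as dst:
                shutil.copyfileobj(src, dst)
            os.remove(previous)
        os.remove(link_path)

    # One log file per backup run.
    new_log = os.path.join(log_dir, backup_name + ".log")
    open(new_log, "a").close()

    # Relative symlink, so the directory can be moved as a unit.
    os.symlink(os.path.basename(new_log), link_path)
    return new_log


demo_dir = tempfile.mkdtemp()
first = start_backup_log(demo_dir, "backup-0001")
second = start_backup_log(demo_dir, "backup-0002")
print(sorted(os.listdir(demo_dir)))
# ['backup-0001.log.gz', 'backup-0002.log', 'latest.log']
```

Following the symlink always opens the log of the most recent run, while older logs stay on disk in compressed form.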

New Features: ZBackup

ZBackup is an open-source de-duplication, compression and (optional) encryption tool for archive-like data (similar to backups). Files that are fed into ZBackup are organized at a block-level into pieces called “bundles”. When more files are fed into ZBackup, it can re-use the bundles when it notices the same blocks are being backed up. This approach provides significant savings on disk space (required for many database backups). To add to the savings, all data in ZBackup is compressed using LZMA compression, which generally compresses better than gzip/deflate or zip. ZBackup also supports an optional AES-128 encryption at rest. You enable it by providing a key file to ZBackup that allows it to encode/decode the data.
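To make the block-reuse idea concrete, here is a small Python sketch of block-level de-duplication with LZMA compression. It is deliberately simplified and is not ZBackup's algorithm or on-disk format: the fixed 4096-byte block size, the SHA-256 keys and the DedupStore class are assumptions made for this example, and encryption is left out.

```python
import hashlib
import lzma
from typing import Dict, List

BLOCK_SIZE = 4096  # illustrative fixed block size, not ZBackup's chunking


class DedupStore:
    """Store each unique block once, LZMA-compressed, keyed by its SHA-256 digest."""

    def __init__(self) -> None:
        self.blocks: Dict[str, bytes] = {}

    def add(self, data: bytes) -> List[str]:
        """Split data into blocks and return the digests needed to rebuild it."""
        recipe = []
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            if digest not in self.blocks:
                # Only blocks that have never been seen consume new space.
                self.blocks[digest] = lzma.compress(block)
            recipe.append(digest)
        return recipe

    def restore(self, recipe: List[str]) -> bytes:
        return b"".join(lzma.decompress(self.blocks[d]) for d in recipe)


store = DedupStore()
# First backup: four distinct blocks.
backup1 = b"".join(bytes([i]) * BLOCK_SIZE for i in range(4))
# Second backup: the same data plus one new block.
backup2 = backup1 + bytes([9]) * BLOCK_SIZE

recipe1 = store.add(backup1)
recipe2 = store.add(backup2)
print(len(recipe1) + len(recipe2), "blocks referenced,", len(store.blocks), "blocks stored")
# 9 blocks referenced, 5 blocks stored
```

The second backup adds only one new block to the store, which is the effect that makes repeated backups of slowly changing data cheap.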

mongodb_consistent_backup 1.0.0 now supports ZBackup as a new archiving method!

Below is an example of ZBackup used on a small database (about 1GB) that is constantly growing.

This graph compares the size added on disk for seven backups taken 10 minutes apart using two methods. The first method is mongodb_consistent_backup with mongodump’s built-in gzip compression (available via the --gzip flag since 3.2) enabled. By default mongodump gzip is enabled by mongodb_consistent_backup (if it’s available), so this is a good “baseline”. The second method is mongodb_consistent_backup with mongodump gzip compression disabled and ZBackup used as the mongodb_consistent_backup archiving method, a post-backup stage in our tool. Notice that each backup in the graph after the first adds only 14-18 MB to the disk usage, meaning ZBackup was able to recognize similarities in the data.

To try out ZBackup as an archive method, use one of these methods:

  1. Set the field “method” under the “archive” section of your mongodb_consistent_backup config file to “zbackup” (example):
    production:
      ...
      archive:
        method: zbackup
      ...
  2. Or, add the command-line flag “--archive.method=zbackup” to your command line.

This archive method causes mongodb_consistent_backup to create a subdirectory in the backup location named “mongodb-consistent-backup_zbackup” and import completed backups into ZBackup after the backup stage. This directory contains the ZBackup storage files that it needs to operate, and they should not be modified!

Of course, there are trade-offs. ZBackup adds some additional system resource usage and time to the overall backup AND restore process – both importing and exporting data into ZBackup takes some additional time.

By default ZBackup’s restore uses a very small amount of RAM for cache, so increasing the cache with the “--cache-size” flag may improve restore performance. ZBackup uses threading, so more CPUs can also improve performance of backups and restores.

New Features: Docker Container

We now offer a Dockerfile for building mongodb_consistent_backup with all dependencies into a Docker container! The goal for the image is to be as “thin” as possible, and so the build merely downloads a prebuilt binary of the tool and installs dependencies. See:

Some interesting use cases for a Docker-based deployment of the tool come to mind:

  • Running MongoDB backups using ephemeral containers on Apache Mesos or Kubernetes (with persistent volumes or remote upload)
  • Restricting system resources used by mongodb_consistent_backup via Docker/cgroup’s isolation features
  • Simplified deployment or isolated dependencies (e.g., Python, Mongodump, etc.)

Up-to-date images of mongodb_consistent_backup are available at this Dockerhub URL: This image includes mongodb_consistent_backup, gzip-capable mongodump binaries and latest-stable ZBackup binaries.

To run the latest Dockerhub image:

$ docker run -i timvaillancourt/mongodb_consistent_backup:latest <mongodb_consistent_backup-flags here>

To just list the “help” page (all available options):

$ docker run -i timvaillancourt/mongodb_consistent_backup:latest --help
usage: mongodb-consistent-backup [-h] [-c CONFIGPATH]
                                 [-e {production,staging,development}] [-V] [-v]
                                 [-H HOST] [-P PORT] [-u USER] [-p PASSWORD]
                                 [-a AUTHDB] [-n BACKUP.NAME]
                                 [-l BACKUP.LOCATION] [-m {mongodump}]
                                 [-L LOG_DIR] [--lock-file LOCK_FILE]
                                 [--sharding.balancer.wait_secs SHARDING.BALANCER.WAIT_SECS]
                                 [--sharding.balancer.ping_secs SHARDING.BALANCER.PING_SECS]
                                 [--archive.method {tar,zbackup,none}]
                                 [--archive.tar.compression {gzip,none}]
                                 [--archive.tar.threads ARCHIVE.TAR.THREADS]
                                 [--archive.zbackup.binary ARCHIVE.ZBACKUP.BINARY]
                                 [--archive.zbackup.cache_mb ARCHIVE.ZBACKUP.CACHE_MB]
                                 [--archive.zbackup.compression {lzma}]
                                 ...
...

An example script for running the container with persistent Docker volumes is available here:

New Features: Multiple Seed Hosts + Config Servers

mongodb_consistent_backup 1.0 introduces the ability to define a list of multiple “seed” hosts, removing a potential single point of failure in your backups! If a host in the list is unavailable, it will be skipped.

Multiple hosts should be specified with this replica-set URL format, many hosts separated by commas:


Or you can specify a comma-separated list without the replica set name for non-replset nodes (eg: mongos or non-replset config servers):


Also, the functionality to use cluster Config Servers as seed hosts was added. Before version 1.0 a clustered backup needed to use a single mongos router as a seed host to find all shards and cluster members. Sometimes mongos routers can come and go as you scale, making this design brittle.

With this new functionality, mongodb_consistent_backup can use the Cluster Config Servers to map out the cluster. These are usually three fairly static hosts in an infrastructure, which makes the deployment and operation of the tool a bit simpler and more reliable.
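As an illustration of the seed-host idea, the sketch below parses a host string of the kind used in the config example earlier (csReplSet/config01:27019,config02:27019,config03:27019) and picks the first host that responds. It is not the tool's own parser: both function names are hypothetical, the is_up callback stands in for a real connection attempt, and falling back to port 27017 when none is given is an assumption of this example.

```python
from typing import Callable, List, Optional, Tuple

DEFAULT_PORT = 27017  # assumed fallback when a seed host omits the port


def parse_seed_hosts(seed: str) -> Tuple[Optional[str], List[Tuple[str, int]]]:
    """Split 'replset/host:port,host:port' into (replset or None, [(host, port)])."""
    replset = None
    hosts_part = seed
    if "/" in seed:
        replset, hosts_part = seed.split("/", 1)
    hosts = []
    for item in hosts_part.split(","):
        item = item.strip()
        if not item:
            continue
        host, _, port = item.partition(":")
        hosts.append((host, int(port) if port else DEFAULT_PORT))
    return replset, hosts


def first_available(
    hosts: List[Tuple[str, int]], is_up: Callable[[str, int], bool]
) -> Optional[Tuple[str, int]]:
    """Return the first host for which is_up(host, port) is true, skipping the rest."""
    for host, port in hosts:
        if is_up(host, port):
            return host, port
    return None


replset, hosts = parse_seed_hosts("csReplSet/config01:27019,config02:27019,config03:27019")
# Pretend config01 is down; a real check would try to connect instead.
chosen = first_available(hosts, lambda host, port: host != "config01")
print(replset, chosen)  # csReplSet ('config02', 27019)
```

With a list of seeds, losing one host only changes which entry is used for discovery; the backup itself can still start.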

Overall Improvements

As mentioned, a focus in this release was improving the existing code. A major refactoring of the code structure of the project was completed in 1.0, and moves the major “phases” or “stages” in the tool to their own Python sub-modules (e.g., “Backup” and “Archive”) that then auto-load their various “methods” like “mongodump” or “Zbackup”.

The code was broken into these high-level stages:

  1. Backup. The stage that gathers the backup of data. During this stage, Oplog tailing and resolving also occur if the backup is for a cluster. More backup methods are coming soon!
  2. Archive. The stage that archives and optionally compresses the backup data. The new ZBackup method also adds de-duplication and encryption ability to this stage.
  3. Upload. The stage that uploads the resulting data to a remote storage. Currently only AWS S3 is supported with Google Cloud Storage and Rsync being added as we speak.
  4. Notify. The stage that notifies external systems of the success/failure of the backup. Currently, our tool only supports Nagios NSCA, with plans for PagerDuty and others to be added.

Some interesting code enhancements include:

  • Reusing of database connections. This reduces the number of connections on seed hosts.
  • Replication heartbeat time (“operational lag”). This is now considered in replica set lag calculations.
  • Added thread safety for oplog tailing threads. This resolves some issues on extremely-overloaded hosts.

Another focus was efficiency and preventing race conditions. The tool should be much less susceptible to error as a result, although if you see any problems we’d like to hear about them on our GitHub “Issues” page.

Lastly, we encourage the open source community to contribute additional functionality to this tool via our GitHub!

Release Notes:

  • 1.0.0
    • Move to dynamic code “Submodules” and subclassing of repeated components
    • Restructuring of YAML config to nested config
    • Safe start/stopping of oplog tailer threads, additional checking on all thread states
    • File-based logging with gzip of old log
    • Oplog tailer ‘oplogReplay’ performance optimization
    • Fixes to oplog durability to-disk
    • Live mongodump output to stdout in realtime
    • Oplog tailer status logging
    • ZBackup archive method: supporting deduplication, compression and optional AES encryption
    • Support for list discovery/seed hosts
    • Support configdb servers as cluster seed hosts
    • Fewer (reused) database connections
    • Database connections to use strong write concern
    • Consider replication operational lag in secondary scoring
    • Backup metadata is written for future functionality and troubleshooting
    • mongodb_consistent_backup.Errors custom exceptions for proper exception handling
    • Python PyPi support added
    • Dockerfile support for running under containers
    • Additional log messages
    • Support for MongoDB 3.4 datatypes
    • Significant reworking of existing code for efficiency, reliability and readability

More about our releases can be seen here:

Categories: MySQL

MariaDB Handler_icp_% Counters: What They Are, and How To Use Them

MySQL Performance Blog - Tue, 2017-05-09 19:39

In this post we’ll see how MariaDB’s Handler_icp_% status counters (Handler_icp_attempts and Handler_icp_matches) measure ICP-related work done by the server and storage engine layers, and how to see whether our queries benefit from ICP.

These counters (as seen in SHOW STATUS output) are MariaDB-specific. In a later post, we will see how we can get this information in MySQL and Percona Server. This investigation spun off from comments in Michael’s post about the new MariaDB dashboard in PMM. Comments are very useful, so keep them coming!

Categories: MySQL

Percona Server for MongoDB 3.4.4-1.4 is Now Available

MySQL Performance Blog - Tue, 2017-05-09 17:18

Percona announces the release of Percona Server for MongoDB 3.4.4-1.4 on May 9, 2017. Download the latest version from the Percona web site or the Percona Software Repositories.

Percona Server for MongoDB is an enhanced, open source, fully compatible, highly-scalable, zero-maintenance downtime database supporting the MongoDB v3.4 protocol and drivers. It extends MongoDB with Percona Memory Engine and MongoRocks storage engine, as well as several enterprise-grade features:

Percona Server for MongoDB requires no changes to MongoDB applications or code.

This release is based on MongoDB 3.4.4 and includes the following additional changes:

  • #PSMDB-122: Added the script to binary tarball.
  • #PSMDB-127: Fixed cleanup of deleted documents and indexes for MongoRocks. When you upgrade to this release, deferred compaction may occur and cause database size to decrease significantly.
  • #PSMDB-133: Added the wiredTigerCheckpointSizeMB variable, set to 1000 in the configuration template for WiredTiger. Valid values are 32 to 2048 (2GB), with the latter being default.
  • #PSMDB-138: Implemented SERVER-23418 for MongoRocks.

Percona Server for MongoDB 3.4.4-1.4 release notes are available in the official documentation.

Categories: MySQL

Chasing a Hung MySQL Transaction: InnoDB History Length Strikes Back

MySQL Performance Blog - Mon, 2017-05-08 19:09

In this blog post, I’ll review how a hung MySQL transaction can cause the InnoDB history length to grow and negatively affect MySQL performance.

Recently I was helping a customer discover why SELECT queries were running slower and slower until the server was restarted (which got things back to normal). It took some time to get from that symptom to a final diagnosis. Please follow me on the journey of chasing this strange MySQL behavior!


Changes in the query response time can mean tons of things. We can check everything from the query plan to the disk performance. However, fixing it with a restart is less common. After looking at “show engine innodb status”, I noticed some strange lines:

Trx read view will not see trx with id >= 41271309593, sees < 41268384363
---TRANSACTION 41271309586, ACTIVE 766132 sec
2 lock struct(s), heap size 376, 0 row lock(s), undo log entries 1
...

There was a total of 940 transactions like this.
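Counting such transactions by hand is tedious, so a small script can pull them out of saved “show engine innodb status” output. The sketch below is only an illustration: it assumes the status text has been captured to a string, the long_running helper is hypothetical, and the second transaction in the sample (ACTIVE 3 sec) is invented to show the filter at work.

```python
import re

# Matches lines such as "---TRANSACTION 41271309586, ACTIVE 766132 sec"
TRX_LINE = re.compile(r"^---TRANSACTION (\d+), ACTIVE (\d+) sec")


def long_running(status_text: str, min_seconds: int):
    """Return (trx_id, active_seconds) for transactions active at least min_seconds."""
    found = []
    for line in status_text.splitlines():
        match = TRX_LINE.match(line.strip())
        if match and int(match.group(2)) >= min_seconds:
            found.append((int(match.group(1)), int(match.group(2))))
    return found


sample = """\
Trx read view will not see trx with id >= 41271309593, sees < 41268384363
---TRANSACTION 41271309586, ACTIVE 766132 sec
2 lock struct(s), heap size 376, 0 row lock(s), undo log entries 1
---TRANSACTION 41271309590, ACTIVE 3 sec
"""

hung = long_running(sample, min_seconds=3600)
print(hung)  # [(41271309586, 766132)]
```

A threshold of one hour is arbitrary here; pick one that is comfortably longer than any legitimate transaction in your workload.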

Another insight was the InnoDB transaction history length graph from Percona Monitoring and Management (PMM):

History length of 6 million and growing clearly indicates a problem.

Problem localized

There have been a number of blog posts describing a similar problem: Peter stated in a blog post: “InnoDB transaction history often hides dangerous ‘debt’“. As the InnoDB transaction history grows, SELECTs need to scan more and more previous versions of the rows, and performance suffers. That explains the issue: SELECT queries get slower and slower until restart. Peter also filed this bug: Major regression having many row versions.

But why does the InnoDB transaction history start growing? There are 940 transactions in this state: ACTIVE 766132 sec. MySQL’s process list shows those transactions in “Sleep” state. It turns out that those transactions were “lost” or “hung”. As we can also see, each of those transactions holds two lock structures and one undo record, so they are not committed and not rolled-back. They are sitting there doing nothing. In this case, with the default isolation level REPEATABLE-READ, InnoDB can’t purge the undo records (transaction history) for other transactions until these “hung” transactions are finished.
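The effect of one old transaction on purge can be illustrated with a deliberately simplified model. This is not how InnoDB is implemented: transactions are reduced to integer ids, each open transaction is represented only by the oldest id its read view may still need, and the purgeable function is a hypothetical helper for this example.

```python
def purgeable(undo_records, active_read_views):
    """Split undo records into (purgeable, retained).

    undo_records: ids of committed transactions that left undo history.
    active_read_views: one "low water mark" per open transaction, meaning that
    transaction may still need history from ids at or above that mark.
    """
    if not active_read_views:
        return list(undo_records), []
    # Purge can only advance up to the oldest view that is still open.
    oldest = min(active_read_views)
    can_purge = [t for t in undo_records if t < oldest]
    retained = [t for t in undo_records if t >= oldest]
    return can_purge, retained


history = list(range(100, 110))  # ten committed transactions

# One idle transaction opened its read view at id 102 and never finished.
purged, kept = purgeable(history, active_read_views=[102, 108])
print(len(purged), len(kept))  # 2 8

# After the idle transaction is killed, only the newer view remains.
purged_after, kept_after = purgeable(history, active_read_views=[108])
print(len(purged_after), len(kept_after))  # 8 2
```

However many transactions commit afterwards, the retained history keeps growing until the oldest view goes away, which matches the growth and sudden drop seen in the history length graph.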

The quick solution is simple: kill those connections and InnoDB will roll back those transactions and purge transaction history. After killing those 940 transactions, the graph looked like this:

However, several questions remain:

  1. What are the queries inside of this lost transaction? Where are they coming from? The problem is that neither MySQL’s process list nor InnoDB’s status shows the queries for this transaction, as it is not running those queries right now (the process list is a snapshot of what is happening inside MySQL right at this moment)
  2. Can we fix it so that the “hung” transactions don’t affect other SELECT queries and don’t cause the growth of transaction history?

As it turns out, it is very easy to simulate this issue with sysbench.

Test preparation

To add some load, I’m using sysbench with 16 threads (you can use fewer; it does not really matter here) and a script for a “write-only” load (running for 120 seconds):

conn=" --db-driver=mysql --mysql-host=localhost --mysql-user=user --mysql-password=password --mysql-db=sbtest "
sysbench --test=/usr/share/sysbench/tests/include/oltp_legacy/oltp.lua --mysql-table-engine=InnoDB --oltp-table-size=1000000 $conn prepare
sysbench --num-threads=16 --max-requests=0 --max-time=120 --test=/usr/share/sysbench/tests/include/oltp_legacy/oltp.lua --oltp-table-size=1000000 $conn --oltp-test-mode=complex --oltp-point-selects=0 --oltp-simple-ranges=0 --oltp-sum-ranges=0 --oltp-order-ranges=0 --oltp-distinct-ranges=0 --oltp-index-updates=1 --oltp-non-index-updates=0 run

Simulate a “hung” transaction

While the above sysbench is running, open another connection to MySQL:

use test;
CREATE TABLE `a` (
  `i` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
insert into a values(1);
begin;
insert into a values(1);
select * from a;

Note: we will need to run the SELECT as a part of this transaction. Do not close the connection.

Watch the history

mysql> select name, count from information_schema.INNODB_METRICS where name like '%hist%';
+----------------------+-------+
| name                 | count |
+----------------------+-------+
| trx_rseg_history_len | 34324 |
+----------------------+-------+
1 row in set (0.00 sec)

mysql> select name, count from information_schema.INNODB_METRICS where name like '%hist%';
+----------------------+-------+
| name                 | count |
+----------------------+-------+
| trx_rseg_history_len | 36480 |
+----------------------+-------+
1 row in set (0.01 sec)

We can see it is growing. Now it is time to commit or rollback or even kill our original transaction:

mysql> rollback;
...
mysql> select name, count from information_schema.INNODB_METRICS where name like '%hist%';
+----------------------+-------+
| name                 | count |
+----------------------+-------+
| trx_rseg_history_len |   793 |
+----------------------+-------+
1 row in set (0.00 sec)

As we can see, it has purged the history length.

Finding the queries from the hung transactions

There are a number of options to find the queries from that “hung” transaction. In older MySQL versions, the only way is to enable the general log (or the slow query log). Starting with MySQL 5.6, we can use the Performance Schema. Here are the steps:

  1. Enable performance_schema if not enabled (it is disabled on RDS / Aurora by default).
  2. Enable events_statements_history:
    mysql> update performance_schema.setup_consumers set ENABLED = 'YES' where NAME='events_statements_history';
    Query OK, 1 row affected (0.00 sec)
    Rows matched: 1 Changed: 1 Warnings: 0
  3. Run the query to find all transactions started more than 10 seconds ago (change the number of seconds to match your workload):
    SELECT ps.id as processlist_id,
      trx_started, trx_isolation_level,
      esh.EVENT_ID,
      esh.TIMER_WAIT,
      esh.event_name as EVENT_NAME,
      esh.sql_text as SQL,
      esh.RETURNED_SQLSTATE, esh.MYSQL_ERRNO, esh.MESSAGE_TEXT, esh.ERRORS, esh.WARNINGS
    FROM information_schema.innodb_trx trx
    JOIN information_schema.processlist ps ON trx.trx_mysql_thread_id = ps.id
    LEFT JOIN performance_schema.threads th ON th.processlist_id = trx.trx_mysql_thread_id
    LEFT JOIN performance_schema.events_statements_history esh ON esh.thread_id = th.thread_id
    WHERE trx.trx_started < CURRENT_TIME - INTERVAL 10 SECOND
      AND ps.USER != 'SYSTEM_USER'
    ORDER BY esh.EVENT_ID\G
    ...
    PROCESS ID: 1971
    trx_started: 2017-05-03 17:36:47
    trx_isolation_level: REPEATABLE READ
    EVENT_ID: 79
    TIMER_WAIT: 33767000
    EVENT NAME: statement/sql/begin
    SQL: begin
    RETURNED_SQLSTATE: 00000
    MYSQL_ERRNO: 0
    MESSAGE_TEXT: NULL
    ERRORS: 0
    WARNINGS: 0
    *************************** 9. row ***************************
    PROCESS ID: 1971
    trx_started: 2017-05-03 17:36:47
    trx_isolation_level: REPEATABLE READ
    EVENT_ID: 80
    TIMER_WAIT: 2643082000
    EVENT NAME: statement/sql/insert
    SQL: insert into a values(1)
    RETURNED_SQLSTATE: 00000
    MYSQL_ERRNO: 0
    MESSAGE_TEXT: NULL
    ERRORS: 0
    WARNINGS: 0
    *************************** 10. row ***************************
    PROCESS ID: 1971
    trx_started: 2017-05-03 17:36:47
    trx_isolation_level: REPEATABLE READ
    EVENT_ID: 81
    TIMER_WAIT: 140305000
    EVENT NAME: statement/sql/select
    SQL: select * from a
    RETURNED_SQLSTATE: NULL
    MYSQL_ERRNO: 0
    MESSAGE_TEXT: NULL
    ERRORS: 0
    WARNINGS: 0
    Now we can see the list of queries from the old transaction (the MySQL query used was taken with modifications from this blog post: Tracking MySQL query history in long running transactions).

At this point, we can chase this issue at the application level and find out why this transaction was not committed. The typical causes:

  • There is a heavy, non-database-related process inside the application code. For example, the application starts a transaction to get a list of images for analysis and then starts an external application to process those images (machine learning or similar), which can take a very long time.
  • The application got an uncaught exception and exited, but the connection to MySQL was not closed for some reason (e.g., it was returned to the connection pool).
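
One way to guard against the second cause is to make sure every transaction ends, whatever happens in the application code. The sketch below is a minimal illustration in Python and is not taken from the original post: it uses the standard-library sqlite3 module as a stand-in for a MySQL connection, and the helper name short_transaction is hypothetical. The same commit-or-rollback pattern should carry over to other DB-API drivers.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def short_transaction(conn):
    """Commit on success, roll back on any exception (hypothetical helper)."""
    try:
        yield conn
        conn.commit()
    except Exception:
        # Never leave the transaction open: an open transaction is what
        # keeps the InnoDB history from being purged.
        conn.rollback()
        raise


def demo():
    # sqlite3 is only a stand-in here; a MySQL driver would be used in practice.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE a (i INTEGER)")
    conn.commit()
    try:
        with short_transaction(conn):
            conn.execute("INSERT INTO a VALUES (1)")
            raise RuntimeError("simulated application failure")
    except RuntimeError:
        pass
    still_open = conn.in_transaction
    rows = conn.execute("SELECT COUNT(*) FROM a").fetchone()[0]
    conn.close()
    return still_open, rows
```

Slow, non-database work (such as the image processing in the first cause) is best done outside the with block, so the transaction stays open only for the statements that need it.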

We can also try to configure the timeouts on MySQL or the application so that the connections are closed after “N” minutes.

Changing the transaction isolation level to fix the InnoDB transaction history issue

Now that we know which transaction is holding up the purge process of InnoDB history, we can find this transaction and make changes so it will not “hang”. We can change the transaction isolation level from REPEATABLE READ (default) to READ COMMITTED. In READ COMMITTED, InnoDB does not need to retain the history once other transactions have committed their changes. (More details about different isolation methods and how they affect InnoDB transactions.) That will work in MySQL 5.6 and later. However, this doesn’t work in Amazon Aurora (as of now): even with READ COMMITTED isolation level, the history length still grows.

Here is the list of MySQL versions where changing the isolation level fixes the issue:

MySQL Version  Transaction isolation    InnoDB History Length
MySQL 5.6      repeatable read          history is not purged until “hung” transaction finishes
MySQL 5.6      read committed (fixed)   history is purged
Aurora         repeatable read          history is not purged until “hung” transaction finishes
Aurora         read committed           history is not purged until “hung” transaction finishes


Hung transactions can cause the InnoDB history length to grow and (surprisingly, at first glance) affect the performance of other running select queries. We can use the Performance Schema to chase the “hung” transaction. Changing the MySQL transaction isolation level can potentially help.

Categories: MySQL

How much disk space should I allocate for Percona Monitoring and Management?

MySQL Performance Blog - Thu, 2017-05-04 18:15

I heard a frequent question at last week’s Percona Live conference regarding Percona Monitoring and Management (PMM): How much disk space should I allocate for PMM Server?

First, let’s review the three components of Percona Monitoring and Management that consume non-negligible disk space:

  1. Prometheus data source for the time series metrics
  2. Query Analytics (QAN) which uses Percona Server XtraDB (Percona’s enhanced version of the InnoDB storage engine)
  3. Orchestrator, also backed by Percona Server XtraDB

Of these, you’ll find that Prometheus is generally your largest consumer of disk space. Prometheus hits a steady state of disk utilization once you reach the defined storage.local.retention period. If you deploy Percona Monitoring and Management 1.1.3 (the latest stable version), you’ll be using a retention period of 30 days. “Steady state” in this case means you’re not adding or removing nodes frequently, since each node comes with its own 1k-7k metrics to be scraped. Prometheus stores one time series per metric scraped, and automatically trims chunks (like InnoDB pages) from the tail of the time series once they exceed the retention period (so the disk requirement per static list of metrics remains “fixed” for the retention period).

However, if you’re in a dynamic environment with nodes being added and removed frequently, or you’re on the extreme end like these guys who re-deploy data centers every day, steady state for Prometheus may remain an elusive goal. The guidance you find below may help you establish at least a minimum disk provisioning threshold.

QAN is based on a web application and uses Percona Server 5.7.17 as its datastore. The Percona QAN agent runs one instance per monitored MySQL server, and obtains queries from either the slow log or Performance Schema. It performs analysis locally to generate a list of unique queries and their corresponding metrics: min, max, avg, med, and p95. There are dimensions based on Tmp table, InnoDB, Query time, Lock time, etc. Check the schema for a full listing, as there are actually 149 columns on this table (show create table pmm.query_class_metrics\G). While the table is wide, it isn’t too long: PMM Demo is ~9mil rows and is approximately 1 row per distinct query per minute per host.
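
The "1 row per distinct query per minute per host" rule of thumb can be turned into a rough row-count estimate. The sketch below is an illustration only: the function name and the sample inputs are hypothetical, and it assumes every distinct query shows up in every minute, so the result is best read as an upper bound.

```python
MINUTES_PER_DAY = 24 * 60


def estimate_qan_rows(distinct_queries, hosts, days):
    """Upper-bound row count: one row per distinct query per minute per host."""
    return distinct_queries * hosts * MINUTES_PER_DAY * days


# Hypothetical workload: 10 distinct queries on each of 25 hosts, kept for 1 day.
rows_per_day = estimate_qan_rows(10, 25, 1)
```

Multiplying the row count by an average row size measured on your own installation would then give a first guess at the table footprint.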

Finally, there is Orchestrator. While the disk requirements for Orchestrator are not zero, they are certainly dwarfed by Prometheus and QAN.  As you’ll read below, Percona’s Orchestrator footprint is a measly ~250MB, which is a rounding error. I’d love to hear other experiences with Orchestrator and how large your InnoDB footprint is for a large or active cluster.

For comparison, here is the resource consumption from Percona’s PMM Demo site:

  • ~47k time series
  • 25 hosts, which is on average ~1,900 time series/host, some are +4k
  • 8-day retention for metrics in Prometheus
  • Prometheus data is ~40GB
    • Which should not increase until we add more hosts, as this isn’t a dynamic Kubernetes environment
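
As a rough illustration of how the demo figures above could be used for sizing, the sketch below derives a bytes-per-series-per-day rate from them and scales it to a 30-day retention period. It assumes disk usage grows linearly with both the number of time series and the retention period, which is a simplification; the function names are hypothetical, and real usage depends on scrape interval and how well the data compresses.

```python
GB = 10**9


def bytes_per_series_day(total_bytes, series, retention_days):
    """Average disk bytes used by one time series for one day of retention."""
    return total_bytes / (series * retention_days)


def estimate_disk_bytes(series, retention_days, per_series_day):
    """Linear extrapolation: series x days x observed per-series daily rate."""
    return series * retention_days * per_series_day


# Figures from the PMM Demo list above: ~40GB, ~47k series, 8-day retention.
demo_rate = bytes_per_series_day(40 * GB, 47_000, 8)

# Same number of series, but with the 30-day retention mentioned earlier.
estimate_30d = estimate_disk_bytes(47_000, 30, demo_rate)
print(f"{demo_rate / 1000:.0f} KB per series per day, "
      f"{estimate_30d / GB:.0f} GB at 30 days")
```

Under these assumptions the same set of hosts would need roughly 150GB at 30-day retention; treat that as a starting point rather than a guarantee.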
Categories: MySQL

Storing UUID and Generated Columns

MySQL Performance Blog - Wed, 2017-05-03 18:15

A lot of things have been said about UUID, and storing UUID in an optimized way. Now that we have generated columns, we can store the decomposed information inside the UUID and merge it again with generated columns. This blog post demonstrates this process.

First, I used a simple table with one char field that I called uuid_char to establish a base case. I used this table with and without a primary key:

CREATE TABLE uuid_char (
  uuid char(36) CHARACTER SET utf8 NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

CREATE TABLE uuid_char_pk (
  uuid char(36) CHARACTER SET utf8 NOT NULL,
  PRIMARY KEY (uuid)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

I performed the tests on a local VM over MySQL 5.7.17 for 30 seconds, with only two threads, because I wanted to just compare the executions:

sysbench --oltp-table-size=100000000 --test=/usr/share/doc/sysbench/tests/db/insert_uuid_generated_columns.uuid_char.lua --oltp-tables-count=4 --num-threads=2 --mysql-user=root --max-requests=5000000 --report-interval=5 --max-time=30 --mysql-db=generatedcolumn run

One pair of executions is with the UUID generated by sysbench, which simulates the UUID that comes from the app:

rs = db_query("INSERT INTO uuid_char (uuid) VALUES " .. string.format("('%s')",c_val))

An alternative execution is for when the UUID is generated by the MySQL function uuid():

rs = db_query("INSERT INTO uuid_char (uuid) VALUES (uuid())")

Below we can see the results: 

The inserts are faster without a PK (but only by 5%), and using the uuid() function doesn’t impact performance.

Now, let’s see the alternative method, which is decomposing the UUID. It has four main information sets:

  • Timestamp: this is a number with seven decimals.
  • MAC: the MAC address of the device that creates the UUID
  • Unique value: this value prevents duplicates
  • UUID version: this will always be “1”, as we are going to use version 1. If you are going to use another version, you will need to review the functions that I used.

The structure of the table that we’ll use is:

CREATE TABLE `uuid_generated` (
  `timestamp` decimal(18,7) unsigned NOT NULL,
  `mac` bigint(20) unsigned NOT NULL,
  `temp_uniq` binary(2) NOT NULL,
  PRIMARY KEY (`timestamp`,`mac`,`temp_uniq`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

To understand how a UUID is unwrapped, I used this stored procedure (which receives a UUID and inserts it into the table):

CREATE PROCEDURE ins_generated_uuid (uuid char(38))
begin
  set @hex_timestamp = concat(substring(uuid, 16, 3), substring(uuid, 10, 4), substring(uuid, 1, 8));
  set @timestamp = concat(conv(@hex_timestamp,16,10) div 10000000 - (141427 * 24 * 60 * 60),'.',right(conv(@hex_timestamp,16,10),7));
  set @mac = conv(right(uuid,12),16,10);
  set @temp_uniq = unhex(substring(uuid,20,4));
  insert into uuid_generated (timestamp,mac,temp_uniq) values (@timestamp,@mac,@temp_uniq);
end ;;


  • @hex_timestamp is a temporary variable that collects the timestamp in hexadecimal format from the different sections of the UUID
  • @timestamp transforms the hexadecimal timestamp to a decimal number
  • @mac pulls the last number in the UUID (a MAC) so we can store it as a bigint
  • @temp_uniq is a value to conserve the uniqueness, which is why we store it as binary and it is at the end of the Primary Key

If I want to get the UUID again, I can use these two generated columns:

`hex_timestamp` char(40) GENERATED ALWAYS AS (conv(((`timestamp` * 10000000) + (((141427 * 24) * 60) * 600000000)),10,16)) VIRTUAL,
`uuid_generated` char(38) GENERATED ALWAYS AS (concat(right(`hex_timestamp`,8),'-',substr(`hex_timestamp`,4,4),'-1',left(`hex_timestamp`,3),'-',convert(hex(`temp_uniq`) using utf8),'-',lpad(conv(`mac`,10,16),12,'0'))) VIRTUAL,
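
To make the round trip easier to follow outside the database, here is a minimal Python sketch that mirrors the substring logic of the stored procedure and the two generated columns. It is an illustration under assumptions, not part of the benchmark: the function names and the sample UUID are hypothetical, and it only handles lowercase version-1 UUID strings.

```python
import uuid
from decimal import Decimal

# Seconds between the UUID epoch (1582-10-15) and the Unix epoch (1970-01-01):
# the same 141427-day offset used in the stored procedure.
EPOCH_OFFSET_SECONDS = 141427 * 24 * 60 * 60
TICKS_PER_SECOND = 10_000_000  # version-1 timestamps count 100-nanosecond ticks


def decompose_uuid1(text):
    """Split a version-1 UUID string into (timestamp, mac, temp_uniq)."""
    # time_hi (without the version digit) + time_mid + time_low
    hex_timestamp = text[15:18] + text[9:13] + text[0:8]
    ticks = int(hex_timestamp, 16)
    seconds = ticks // TICKS_PER_SECOND - EPOCH_OFFSET_SECONDS
    fraction = Decimal(ticks % TICKS_PER_SECOND) / TICKS_PER_SECOND
    timestamp = Decimal(seconds) + fraction   # Unix time with 7 decimals
    mac = int(text[-12:], 16)                 # last 12 hex digits as an integer
    temp_uniq = bytes.fromhex(text[19:23])    # 2 bytes, like binary(2)
    return timestamp, mac, temp_uniq


def rebuild_uuid1(timestamp, mac, temp_uniq):
    """Reassemble the UUID string, following the generated-column expressions."""
    ticks = int(timestamp * TICKS_PER_SECOND) + EPOCH_OFFSET_SECONDS * TICKS_PER_SECOND
    h = format(ticks, "015x")                 # 15 hex digits, like hex_timestamp
    return "-".join([h[-8:], h[3:7], "1" + h[:3],
                     temp_uniq.hex(), format(mac, "012x")])


SAMPLE = "e7b3c8a0-2f8a-11e7-9598-0800200c9a66"  # hypothetical version-1 UUID
ts, mac, uniq = decompose_uuid1(SAMPLE)
same_node = mac == uuid.UUID(SAMPLE).node        # cross-check with the stdlib parser
round_trip = rebuild_uuid1(ts, mac, uniq)
```

Decimal is used instead of float so that the seven fractional digits survive the round trip exactly, which matches the decimal(18,7) column in the table.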

We performed tests over five scenarios:

  • Without the generated columns, the insert used data generated dynamically
  • Same as before, but we added a char field that stores the UUID
  • With the char field, and adding the generated column
  • We used the stored procedure detailed before to insert the data into the table
  • We also tested the performance using triggers

The difference between the Base and the previous table structure with Primary Keys is very small. So, the new basic structure has no impact on performance.

We see that Base and +Char Field have the same performance. So leaving a char field has no performance impact (it just uses more disk space).

Using generated columns impacts performance. This is expected, as the columns are generated to validate the type before the row is inserted.

Finally, triggers and the stored procedure have the same impact on performance.

These are the three table structures:

CREATE TABLE `uuid_generated` (
  `timestamp` decimal(18,7) unsigned NOT NULL,
  `mac` bigint(20) unsigned NOT NULL,
  `temp_uniq` binary(2) NOT NULL,
  PRIMARY KEY (`timestamp`,`mac`,`temp_uniq`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

CREATE TABLE `uuid_generated_char` (
  `timestamp` decimal(18,7) unsigned NOT NULL,
  `mac` bigint(20) unsigned NOT NULL,
  `temp_uniq` binary(2) NOT NULL,
  `uuid` char(38) DEFAULT NULL,
  PRIMARY KEY (`timestamp`,`mac`,`temp_uniq`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

CREATE TABLE `uuid_generated_char_plus` (
  `timestamp` decimal(18,7) unsigned NOT NULL,
  `mac` bigint(20) unsigned NOT NULL,
  `temp_uniq` binary(2) NOT NULL,
  `uuid` char(38) DEFAULT NULL,
  `hex_timestamp` char(40) GENERATED ALWAYS AS (conv(((`timestamp` * 10000000) + (((141427 * 24) * 60) * 600000000)),10,16)) VIRTUAL,
  `uuid_generated` char(38) GENERATED ALWAYS AS (concat(right(`hex_timestamp`,8),'-',substr(`hex_timestamp`,4,4),'-1',left(`hex_timestamp`,3),'-',convert(hex(`temp_uniq`) using utf8),'-',lpad(conv(`mac`,10,16),12,'0'))) VIRTUAL,
  PRIMARY KEY (`timestamp`,`mac`,`temp_uniq`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

And this is the trigger:

DROP TRIGGER IF EXISTS ins_uuid_generated;
delimiter ;;
CREATE TRIGGER ins_uuid_generated BEFORE INSERT ON uuid_generated
FOR EACH ROW
begin
  set @hex_timestamp = concat(substring(NEW.uuid, 16, 3), substring(NEW.uuid, 10, 4), substring(NEW.uuid, 1, 8));
  set NEW.timestamp = concat(conv(@hex_timestamp,16,10) div 10000000 - (141427 * 24 * 60 * 60),'.',right(conv(@hex_timestamp,16,10),7));
  set NEW.mac = conv(right(NEW.uuid,12),16,10);
  set NEW.temp_uniq = unhex(substring(NEW.uuid,20,4));
end ;;
delimiter ;


Decomposing the UUID is an alternative to storing them in order, but it won’t speed up inserts. It is simpler to execute queries over a range of dates, and look at the row for a particular device, as you will be able to use the MAC (it is recommended to add an index for it). Generated columns give you the possibility to build the UUID back in just one string.

Categories: MySQL

Percona University in Europe May 9 and May 11

MySQL Performance Blog - Tue, 2017-05-02 18:20

In 2013 we started Percona University, which consists of technology discussion events held in different cities around the world. The next installments of Percona University in Europe are next week when I fly there for Percona University Berlin (May 9) and Percona University Budapest (May 11). Both events are free to attend, and you are very welcome to join us for either of them.

Below are some questions and answers about why you should attend a Percona University session:

What is Percona University? It is a half-day technical educational event, with a wider program when compared to a traditional meetup. Usually, we include about six hours of talks split with a 30-minute coffee break. We encourage people to join us at any point during these talks – we understand that not everyone can take half a day off from their work or studies.

What is on the agenda for each of the events? Full agendas and registration forms for the Berlin and Budapest events are available at the indicated links.

Does the word “University” mean that we won’t cover any in-depth topics, and these events would only interest college/university students? No, it doesn’t. We designed Percona University presentations for all kinds of “students,” including professionals with years of database industry experience. The word “University” means that this event series is about educating attendees on technical topics (it’s not a sales-oriented event, it’s about educating the community).

Does Percona University cover only Percona technology? We will definitely mention Percona technology, but we will also focus on real-world technical issues and recommend solutions that work (regardless of whether Percona developed them).

Are there other Percona University events coming up besides Berlin and Budapest? We will hold more Percona University events in different locations in the future. Our events newsletter is a good source of information about when and where they will occur. If you want to partner with Percona in organizing a Percona University event, contact our team. You can also check our list of technical webinars to get further educational insights.

These events are free and low-key! We want them to remain easy to organize in any city of the world. They aren’t meant to look like a full conference (like our Percona Live series). Percona University has a different format – it’s purposefully informal, and designed to be perfect for learning and networking. This is an in-person database community gathering, so feel free to come with interesting cases and tricky questions!

I hope to see many of you at Percona University in Europe, Berlin and Budapest editions!

Categories: MySQL

Webinar Thursday May 4, 2017: Percona Software News and Roadmap Update Q2 2017

MySQL Performance Blog - Mon, 2017-05-01 18:34

Come and listen to Percona CEO Peter Zaitsev on Thursday, May 4, 2017 at 11:00 am (PST) / 2:00 pm (EST) discuss Percona’s software news and roadmap, including Percona Server for MySQL and MongoDB, Percona XtraBackup, Percona Toolkit, Percona XtraDB Cluster and Percona Monitoring and Management.

During this webinar, Peter will talk about newly released features in Percona software, show a few quick demos and share with you highlights from the Percona open source software roadmap.

Peter will also talk about new developments in Percona commercial services, and finish with a Q&A.

You can register for the webinar here.

Peter Zaitsev, CEO of Percona

Peter Zaitsev co-founded Percona and assumed the role of CEO in 2006. As one of the foremost experts on MySQL strategy and optimization, Peter leveraged both his technical vision and entrepreneurial skills to grow Percona from a two-person shop to one of the most respected open source companies in the business. With over 150 professionals in 20 plus countries, Peter’s venture now serves over 3000 customers – including the “who’s who” of internet giants, large enterprises and many exciting startups. Percona was named to the Inc. 5000 in 2013, 2014 and 2015.

Peter was an early employee at MySQL AB, eventually leading the company’s High-Performance Group. A serial entrepreneur, Peter co-founded his first startup while attending Moscow State University where he majored in Computer Science. Peter is a co-author of High-Performance MySQL: Optimization, Backups, and Replication, one of the most popular books on MySQL performance. Peter frequently speaks as an expert lecturer at MySQL and related conferences, and regularly posts on the Percona Data Performance Blog. Fortune and DZone also tapped Peter as a contributor, and his recent ebook Practical MySQL Performance Optimization Volume 1 is among the most popular downloads.

Categories: MySQL

From Percona Live 2017: Thank You, Attendees!

MySQL Performance Blog - Fri, 2017-04-28 21:47

From everyone at Percona and Percona Live 2017, we’d like to send a big thank you to all our sponsors, exhibitors, and attendees at this year’s conference.

This year’s conference was an outstanding success! The event brought the open source database community together, with a technical emphasis on the core topics of MySQL, MariaDB, MongoDB, PostgreSQL, AWS, RocksDB, time series, monitoring and other open source database technologies.

We will be posting tutorial and session presentation slides at the Percona Live site, and all of them should be available shortly. 

Highlights This Year:

Thanks to Our Sponsors!

We would like to thank all of our valuable event sponsors, especially our diamond sponsors Continuent and VividCortex – your participation really makes the show happen.

We have developed multiple sponsorship options to allow participation at a level that best meets your partnering needs. Our goal is to create a significant opportunity for our partners to interact with Percona customers, other partners and community members. Sponsorship opportunities are available for Percona Live Europe 2017.

Download a prospectus here.

Percona Live Europe 2017: Dublin, Ireland!

This year’s Percona Live Europe will take place September 25th-27th, 2017, in Dublin, Ireland. Put it on your calendar now! Information on speakers, talks, sponsorship and registration will be available in the coming months.

We look forward to seeing you there!

Categories: MySQL

Percona Live 2017: Beringei – Facebook’s Open Source, In-Memory Time Series Database (TSDB)

MySQL Performance Blog - Thu, 2017-04-27 23:20

So that is just about a wrap here at Percona Live 2017 – except for the closing comments and prize giveaway. Before we leave, I have one more session to highlight: Facebook’s Beringei.

Beringei is Facebook’s open source, in-memory time series database. Justin Teller, Engineering Manager at Facebook, presented the session. According to Justin, large-scale monitoring systems cannot handle large-scale analysis in real time because the query performance is too slow. After evaluating and rejecting several disk-based and existing in-memory cache solutions, Facebook turned their attention to writing their own in-memory TSDB to power the health and performance monitoring system at Facebook. They presented “Gorilla: A Fast, Scalable, In-Memory Time Series Database” at VLDB 2015.

In December 2016, they open sourced the majority of that work with Beringei. In this talk, Justin started by presenting how Facebook uses this database to serve production monitoring workloads, with an overview of how they use it as the basis for a disaster-ready, high-performance distributed system. He closed by presenting some new performance analysis comparing (favorably) Beringei to Prometheus. Prometheus is an open source TSDB whose time series compression was inspired by the Gorilla VLDB paper and has similar compression behavior.

After the talk, Justin was kind enough to speak briefly with me. Check it out:

It’s been a great conference, and we’re looking forward to seeing you all at Percona Live Europe!

Categories: MySQL

Percona Live 2017: Hawkular Metrics, An Overview

MySQL Performance Blog - Thu, 2017-04-27 22:23

The place is still frantic here at Percona Live 2017 as everybody tries to squeeze in a few more talks before the end of the day. One such talk was given by Red Hat’s Stefan Negrea on Hawkular Metrics.

Hawkular Metrics is a scalable, long-term, high-performance storage engine for metric data. The session was an overview of the project that includes the history of the project, an overview of the Hawkular ecosystem, technical details of the project, developer features and APIs and third party integrations.

Hawkular Metrics is backed by Cassandra for scalability. Hawkular Metrics is used and exposed by Hawkular Services. The API uses JSON to communicate with clients.

Users of Hawkular Metrics include:

  • IoT enthusiasts who need to collect metrics, and possibly trigger alerts
  • Operators who are looking for a solution to store metrics from statsD, collectd, syslog
  • Developers of solutions who need long-term time series database storage
  • Users of ManageIQ who are looking for Middleware management
  • Users of Kubernetes/Heapster who want to store Docker container metrics in a long-term time series database storage, thanks to the Heapster sink for Hawkular.

Stefan was kind enough to speak with me after the talk. Check it out below:

There are more talks today. Check out Thursday’s schedule here. Don’t forget to attend the Closing Remarks and prize giveaway at 4:00 pm.

Categories: MySQL

Percona Live 2017: Lessons Learned While Automating MySQL Deployments in the AWS Cloud

MySQL Performance Blog - Thu, 2017-04-27 19:37

The last day of Percona Live 2017 is still going strong, with talks all the way until 4:00 pm (and closing remarks and a prize giveaway on the main stage then). I’m going to a few more sessions today, including one from Stephane Combaudon from Slice Technologies: Lessons learned while automating MySQL deployments in the AWS Cloud.

In this talk, Stephane discussed how automating deployments is a key success factor in the cloud. It is actually a great way to leverage the flexibility of the cloud. But while automation is often not too difficult for application code, it is much harder for databases. When Slice started automating their MySQL servers, they chose simple and production-proven components: Chef to deploy files, MHA for high availability and Percona XtraBackup for backups. But they faced several problems very quickly:

  • How do you maintain an updated list of MySQL servers in the MHA configuration when servers can be automatically stopped or started?
  • How can you coordinate your servers for them to know that they need to be configured as a master or as a replica?
  • How do you write complex logic with Chef without being trapped by Chef’s two-pass model?
  • How can you handle clusters with different MySQL versions, or a single cluster where all members do not use the same MySQL version?
  • How can you get reasonable backup and restore time when the dataset is over 1TB and the backups are stored on S3?

This session discussed the errors Slice made, and the solutions they found while tackling MySQL automation.

Stephane was kind enough to let me speak with him after the talk. Check it out below:

There are more talks today. Check out Thursday’s schedule here. Don’t forget to attend the Closing Remarks and prize giveaway at 4:00 pm.

Categories: MySQL

Percona Live 2017: Day Three Keynotes

MySQL Performance Blog - Thu, 2017-04-27 17:58

Welcome to the third (and final) day of the Percona Live Open Source Database Conference 2017, and the third (and final) set of Percona Live keynotes! The enthusiasm hasn’t waned here at Percona Live, and we had a full house on Thursday morning!

Day three of the conference kicked off with three keynote talks, and ended with the Community Awards Ceremony:

Spinaltap: Airbnb’s Change Data Capture System

Xinyao Hu (Airbnb)

In this talk, Xinyao introduced Airbnb’s change data capture system, Spinaltap. He briefly covered its design, and focused on various use cases inside Airbnb. These use cases covered both online serving production and offline large distributed batch processing.

How Percona Contributes to the Open Source Database Ecosystem

Peter Zaitsev (Percona)

Peter Zaitsev, CEO of Percona, discussed the growth and adoption of open source databases, and Percona’s commitment to remaining an unbiased champion of the open source database ecosystem. Percona remains committed to providing open source support and solutions to its customers, users and the community. He also provided updates and highlighted exciting new developments in Percona Server software for MySQL and MongoDB.

Monitoring without looking at MySQL

Jean-François Gagné

Jean-François Gagné presented a fascinating talk about using a single metric to observe system health: bookings per second. It wasn’t a technical deep-dive (not MySQL- or Linux-related), but it is one of the most important metrics for detecting problems (and customer behavior) on the website. Many things impact this metric, including the time of the day, the day of the week or the season of the year.

Community Award Ceremony

Daniel Nichter (Square), Emily Slocombe (SurveyMonkey)

The MySQL Community Awards initiative is an effort to acknowledge and thank individuals and corporations for their contributions to the MySQL ecosystem. It is a from-the-community, by-the-community and for-the-community effort. Awards are given for Community Contributor, Application, and Corporate Contributor. More information can be found here:

This year’s winners were:

  • Community: René Cannaò, Simon Mudd, Shlomi Noach
  • Application: Sysbench, Gh-ost
  • Corporate: GitHub, Percona

Congrats to the winners, the entire open source community, and to all the Percona Live attendees this year. There are still sessions today; check them out.

It’s been a great conference, and we’re looking forward to seeing you all at Percona Live Europe!

Categories: MySQL

Percona Live 2017: MySQL Makes Toast

MySQL Performance Blog - Thu, 2017-04-27 03:56

Every day at Percona Live 2017 brings something new and unusual – and on this particular day, we found out that MySQL makes toast.

A lot of people think that with MySQL and open source software, you can do anything. While many might view this metaphorically, Percona’s Alexander Rubin (Principal Consultant) takes this statement very seriously. He demonstrated on Tuesday at Percona Live that not only is it possible to accomplish just about anything with MySQL, but MySQL makes toast!

Originally, Alexander took on this project to provide an open source fix for MySQL Bug#2 (MySQL Doesn’t Make Toast). After some effort, and some ingenuity, he provided a patch for the infamous bug.

(You can find out all the details in his blog post here.)

Read up on how this was accomplished, and check out the pics below of Alexander demonstrating his ingenious method of grilling breakfast bread!


Alex Prepares to Amaze the Crowd with an Open Source Breakfast

The Crowd Gathers for a Tasty MySQL-Born Treat

Open Source Breakfast is Tiring, Time for a Rest

Don’t miss any of the fun tomorrow! You can find Thursday’s (4/27) session schedule here.

Categories: MySQL

Percona Live 2017: Database Management Made Simple – Amazon RDS

MySQL Performance Blog - Thu, 2017-04-27 03:28

Percona Live 2017 is done for Wednesday, but there was still time to get in one more talk before tonight’s Community Networking Reception – and the last one of the evening was about Amazon RDS.

Darin Briskman, Lead Developer Outreach & Technical Evangelist for Amazon, held two back-to-back sessions on Database management made simple – Amazon RDS. Amazon Relational Database Service (or Amazon RDS) is a distributed relational database service by Amazon Web Services (AWS).

Darin reviewed how Amazon Relational Database Service (Amazon RDS) makes it easy to set up, operate, and scale a relational database in the cloud. He showed how it provides cost-efficient and resizable capacity while managing time-consuming database administration tasks, freeing you to focus on your applications and business. This talk provided guidance and tips for optimizing MySQL-compatible workloads on RDS.

Darin was kind enough to speak with me about his talk afterward. Check it out below:

Don’t miss any of tomorrow’s talks! You can find Thursday’s (4/27) session schedule here.

Categories: MySQL

Percona Live 2017: Histograms in MySQL and MariaDB

MySQL Performance Blog - Thu, 2017-04-27 00:00

The afternoon at Percona Live 2017 is slipping by quickly, and people are still actively looking for sessions to attend – like the session I just sat in on histograms in MySQL and MariaDB.

Histograms are a type of column statistic that provides more detailed information about data distributions in table columns. A histogram sorts values into buckets.
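
To make the idea of buckets concrete, here is a small Python sketch of an equi-height histogram (each bucket covers roughly the same number of rows) and a selectivity estimate derived from it. This is a generic illustration with hypothetical function names, not the implementation used by MariaDB, MySQL or PostgreSQL.

```python
def build_equi_height(values, buckets):
    """Return the upper bound of each bucket; buckets hold ~equal row counts."""
    if not values or buckets < 1 or buckets > len(values):
        raise ValueError("need 1 <= buckets <= number of values")
    ordered = sorted(values)
    n = len(ordered)
    # The b-th bound is the value at the end of the b-th slice of sorted rows.
    return [ordered[(b * n) // buckets - 1] for b in range(1, buckets + 1)]


def estimate_fraction_le(bounds, x):
    """Estimated fraction of rows with value <= x, at bucket granularity."""
    full_buckets = sum(1 for upper in bounds if upper <= x)
    return full_buckets / len(bounds)


# Skewed column: 90 rows hold the value 1, the remaining 10 hold 2..11.
skewed = [1] * 90 + list(range(2, 12))
bounds = build_equi_height(skewed, 4)
estimate = estimate_fraction_le(bounds, 1)
```

For this skewed column the histogram estimates that 75% of rows have a value of at most 1 (the true figure is 90%), whereas assuming a uniform spread between 1 and 11 would suggest only about 9%. That gap is the kind of information an optimizer gains from a histogram.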

MariaDB Server has had histograms since MariaDB 10.0. Now, MySQL 8.0 will have them too. This session presented an overview of histogram implementations in MariaDB and MySQL 8.0, and looked at PostgreSQL for comparison. The session covered everything about histograms:

  • Why do query optimizers need histograms
  • What are the costs of collecting and maintaining a histogram in each database
  • How the query optimizers use histogram data
  • What are the strong and weak points of histogram implementation in each database

At the end, Sergei talked a bit about a related development in MariaDB Server: the optimizer will have the capability of using constraints.

Sergei was kind enough to speak briefly with me after his talk on histograms in MySQL and MariaDB. Check it out below:

Don’t miss any of tomorrow’s talks! You can find Thursday’s (4/27) session schedule here.

Categories: MySQL

Percona Live 2017: Deploying MongoDB on Public Clouds

MySQL Performance Blog - Wed, 2017-04-26 22:27

Today at Percona Live 2017, the afternoon is jam-packed with open source technology lectures filled with community members eager for the latest on the best strategies – including how you should deploy MongoDB on public clouds.

Dharshan Rangegowda (CEO of ScaleGrid) discussed deploying MongoDB on public clouds. ScaleGrid provides a fully managed Database-as-a-Service (DBaaS) solution used today by thousands of developers, startups, and enterprise customers. In this session, Dharshan talked about how public clouds like AWS and Azure have become very popular platforms over the past few years. These public clouds provide a plethora of infrastructure features to help make life easier. He dug into the features/assets that one should be actively leveraging.

On the flip side, there are also a number of potential pitfalls that require attention and might need a workaround. Dharshan reviewed some common architecture patterns you need to have in place to be successful with MongoDB on the public cloud, including high availability, disaster recovery, scaling, performance and others.

After the lecture, Dharshan was kind enough to talk briefly with me about his session. Check it out:

Don’t miss any of tomorrow’s talks! You can find Thursday’s (4/27) session schedule here.

Categories: MySQL