MySQL

MariaDB Handler_icp_% Counters: What They Are, and How To Use Them

MySQL Performance Blog - Tue, 2017-05-09 19:39

In this post we’ll see how MariaDB’s Handler_icp_% status counters (Handler_icp_attempts and Handler_icp_matches) measure ICP-related work done by the server and storage engine layers, and how to see whether our queries gain anything from using them.

These counters (as seen in SHOW STATUS output) are MariaDB-specific. In a later post, we will see how we can get this information in MySQL and Percona Server. This investigation spun off from comments in Michael’s post about the new MariaDB dashboard in PMM. Comments are very useful, so keep them coming!
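
As a quick illustration (a sketch: the orders table, its composite index on (customer_id, status), and the query are hypothetical; only the counter prefix comes from MariaDB), you can reset the session counters, run a query that is a candidate for Index Condition Pushdown, and then look at the counters:

FLUSH STATUS;
SELECT * FROM orders WHERE customer_id BETWEEN 1 AND 100 AND status = 'shipped';
SHOW SESSION STATUS LIKE 'Handler_icp%';

Roughly speaking, the attempts counter shows how often the pushed index condition was evaluated inside the storage engine, and the matches counter how often it passed; if both stay at zero, ICP was not used for that query.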

Categories: MySQL

Percona Server for MongoDB 3.4.4-1.4 is Now Available

MySQL Performance Blog - Tue, 2017-05-09 17:18

Percona announces the release of Percona Server for MongoDB 3.4.4-1.4 on May 9, 2017. Download the latest version from the Percona web site or the Percona Software Repositories.

Percona Server for MongoDB is an enhanced, open source, fully compatible, highly-scalable, zero-maintenance downtime database supporting the MongoDB v3.4 protocol and drivers. It extends MongoDB with Percona Memory Engine and MongoRocks storage engine, as well as several enterprise-grade features:

Percona Server for MongoDB requires no changes to MongoDB applications or code.

This release is based on MongoDB 3.4.4 and includes the following additional changes:

  • #PSMDB-122: Added the percona-server-mongodb-enable-auth.sh script to the binary tarball.
  • #PSMDB-127: Fixed cleanup of deleted documents and indexes for MongoRocks. When you upgrade to this release, deferred compaction may occur and cause database size to decrease significantly.
  • #PSMDB-133: Added the wiredTigerCheckpointSizeMB variable, set to 1000 in the configuration template for WiredTiger. Valid values are 32 to 2048 (2GB), with the latter being the default.
  • #PSMDB-138: Implemented SERVER-23418 for MongoRocks.

Percona Server for MongoDB 3.4.4-1.4 release notes are available in the official documentation.

Categories: MySQL

Chasing a Hung MySQL Transaction: InnoDB History Length Strikes Back

MySQL Performance Blog - Mon, 2017-05-08 19:09

In this blog post, I’ll review how a hung MySQL transaction can cause the InnoDB history length to grow and negatively affect MySQL performance.

Recently I was helping a customer discover why SELECT queries were running slower and slower until the server restarts (which got things back to normal). It took some time to get from that symptom to a final diagnosis. Please follow me on the journey of chasing this strange MySQL behavior!

Symptoms

Changes in the query response time can mean tons of things. We can check everything from the query plan to the disk performance. However, a problem that goes away after a restart is less common. After looking at SHOW ENGINE INNODB STATUS, I noticed some strange lines:

Trx read view will not see trx with id >= 41271309593, sees < 41268384363
---TRANSACTION 41271309586, ACTIVE 766132 sec
2 lock struct(s), heap size 376, 0 row lock(s), undo log entries 1
...

There were 940 transactions like this in total.
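
One way to count such transactions without grepping the status output (a sketch; the one-day threshold is arbitrary, pick whatever is clearly “too old” for your workload) is to query information_schema directly:

SELECT COUNT(*)
FROM information_schema.innodb_trx
WHERE trx_started < NOW() - INTERVAL 1 DAY;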

Another insight was the InnoDB transaction history length graph from Percona Monitoring and Management (PMM):

History length of 6 million and growing clearly indicates a problem.

Problem localized

A number of blog posts have described a similar problem; Peter covered it in “InnoDB transaction history often hides dangerous ‘debt’”. As the InnoDB transaction history grows, SELECTs need to scan more and more previous versions of the rows, and performance suffers. That explains the issue: SELECT queries get slower and slower until a restart. Peter also filed this bug: Major regression having many row versions.

But why does the InnoDB transaction history start growing? There are 940 transactions in this state: ACTIVE 766132 sec. MySQL’s process list shows those transactions in the “Sleep” state. It turns out that those transactions were “lost” or “hung”. As we can also see, each of those transactions holds two lock structures and one undo record, so they are neither committed nor rolled back. They are just sitting there doing nothing. In this case, with the default isolation level REPEATABLE READ, InnoDB can’t purge the undo records (transaction history) for other transactions until these “hung” transactions are finished.

The quick solution is simple: kill those connections, and InnoDB will roll back those transactions and purge the transaction history. After killing those 940 transactions, the history length graph started dropping immediately.
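
Rather than killing the connections one by one, a helper query along these lines (a sketch; adjust the age threshold to your workload) can generate the KILL statements from information_schema:

SELECT CONCAT('KILL ', trx_mysql_thread_id, ';') AS kill_statement
FROM information_schema.innodb_trx
WHERE trx_started < NOW() - INTERVAL 1 DAY;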

However, several questions remain:

  1. What are the queries inside this lost transaction? Where are they coming from? The problem is that neither MySQL’s process list nor InnoDB’s status shows the queries for this transaction, as it is not running those queries right now (the process list is a snapshot of what is happening inside MySQL right at this moment).
  2. Can we fix it so that the “hung” transactions don’t affect other SELECT queries and don’t cause the growth of transaction history?

Simulation

As it turns out, it is very easy to simulate this issue with sysbench.

Test preparation

To add some load, I’m using sysbench with 16 threads (you can use fewer, it does not really matter here) and a script for a “write-only” load (running for 120 seconds):

conn=" --db-driver=mysql --mysql-host=localhost --mysql-user=user --mysql-password=password --mysql-db=sbtest "
sysbench --test=/usr/share/sysbench/tests/include/oltp_legacy/oltp.lua --mysql-table-engine=InnoDB --oltp-table-size=1000000 $conn prepare
sysbench --num-threads=16 --max-requests=0 --max-time=120 --test=/usr/share/sysbench/tests/include/oltp_legacy/oltp.lua --oltp-table-size=1000000 $conn --oltp-test-mode=complex --oltp-point-selects=0 --oltp-simple-ranges=0 --oltp-sum-ranges=0 --oltp-order-ranges=0 --oltp-distinct-ranges=0 --oltp-index-updates=1 --oltp-non-index-updates=0 run

Simulate a “hung” transaction

While the above sysbench is running, open another connection to MySQL:

use test;
CREATE TABLE `a` (
  `i` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
insert into a values(1);
begin;
insert into a values(1);
select * from a;

Note: we will need to run the SELECT as a part of this transaction. Do not close the connection.

Watch the history

mysql> select name, count from information_schema.INNODB_METRICS where name like '%hist%';
+----------------------+-------+
| name                 | count |
+----------------------+-------+
| trx_rseg_history_len | 34324 |
+----------------------+-------+
1 row in set (0.00 sec)

mysql> select name, count from information_schema.INNODB_METRICS where name like '%hist%';
+----------------------+-------+
| name                 | count |
+----------------------+-------+
| trx_rseg_history_len | 36480 |
+----------------------+-------+
1 row in set (0.01 sec)

We can see it is growing. Now it is time to commit, roll back, or even kill our original transaction:

mysql> rollback;
...
mysql> select name, count from information_schema.INNODB_METRICS where name like '%hist%';
+----------------------+-------+
| name                 | count |
+----------------------+-------+
| trx_rseg_history_len |   793 |
+----------------------+-------+
1 row in set (0.00 sec)

As we can see, the history length has been purged.

Finding the queries from the hung transactions

There are a number of options to find the queries from that “hung” transaction. In older MySQL versions, the only way is to enable the general log (or the slow query log). Starting with MySQL 5.6, we can use the Performance Schema. Here are the steps:

  1. Enable performance_schema if not enabled (it is disabled on RDS / Aurora by default).
  2. Enable events_statements_history:
    mysql> update performance_schema.setup_consumers set ENABLED = 'YES' where NAME='events_statements_history';
    Query OK, 1 row affected (0.00 sec)
    Rows matched: 1  Changed: 1  Warnings: 0
  3. Run a query to find all transactions that started more than 10 seconds ago (change the number of seconds to match your workload):
    SELECT ps.id as processlist_id,
           trx_started, trx_isolation_level,
           esh.EVENT_ID,
           esh.TIMER_WAIT,
           esh.event_name as EVENT_NAME,
           esh.sql_text as SQL,
           esh.RETURNED_SQLSTATE, esh.MYSQL_ERRNO, esh.MESSAGE_TEXT, esh.ERRORS, esh.WARNINGS
    FROM information_schema.innodb_trx trx
    JOIN information_schema.processlist ps ON trx.trx_mysql_thread_id = ps.id
    LEFT JOIN performance_schema.threads th ON th.processlist_id = trx.trx_mysql_thread_id
    LEFT JOIN performance_schema.events_statements_history esh ON esh.thread_id = th.thread_id
    WHERE trx.trx_started < CURRENT_TIME - INTERVAL 10 SECOND
      AND ps.USER != 'SYSTEM_USER'
    ORDER BY esh.EVENT_ID\G
    ...
    PROCESS ID: 1971
    trx_started: 2017-05-03 17:36:47
    trx_isolation_level: REPEATABLE READ
    EVENT_ID: 79
    TIMER_WAIT: 33767000
    EVENT NAME: statement/sql/begin
    SQL: begin
    RETURNED_SQLSTATE: 00000
    MYSQL_ERRNO: 0
    MESSAGE_TEXT: NULL
    ERRORS: 0
    WARNINGS: 0
    *************************** 9. row ***************************
    PROCESS ID: 1971
    trx_started: 2017-05-03 17:36:47
    trx_isolation_level: REPEATABLE READ
    EVENT_ID: 80
    TIMER_WAIT: 2643082000
    EVENT NAME: statement/sql/insert
    SQL: insert into a values(1)
    RETURNED_SQLSTATE: 00000
    MYSQL_ERRNO: 0
    MESSAGE_TEXT: NULL
    ERRORS: 0
    WARNINGS: 0
    *************************** 10. row ***************************
    PROCESS ID: 1971
    trx_started: 2017-05-03 17:36:47
    trx_isolation_level: REPEATABLE READ
    EVENT_ID: 81
    TIMER_WAIT: 140305000
    EVENT NAME: statement/sql/select
    SQL: select * from a
    RETURNED_SQLSTATE: NULL
    MYSQL_ERRNO: 0
    MESSAGE_TEXT: NULL
    ERRORS: 0
    WARNINGS: 0
    Now we can see the list of queries from the old transaction (the MySQL query used was taken with modifications from this blog post: Tracking MySQL query history in long running transactions).

At this point, we can chase this issue at the application level and find out why this transaction was not committed. The typical causes:

  • There is a heavy, non-database-related process inside the application code. For example, the application starts a transaction to get a list of images for analysis and then starts an external application to process those images (machine learning or similar), which can take a very long time.
  • The application got an uncaught exception and exited, but the connection to MySQL was not closed for some reason (i.e., returned to the connection pool).

We can also try to configure the timeouts on MySQL or the application so that the connections are closed after “N” minutes.
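
A minimal sketch of the server-side approach (600 seconds is an arbitrary value; the global settings only apply to connections opened after the change, and idle-but-healthy connections get dropped too, so the application must be able to reconnect):

SET GLOBAL wait_timeout = 600;
SET GLOBAL interactive_timeout = 600;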

Changing the transaction isolation level to fix the InnoDB transaction history issue

Now that we know which transaction is holding up the purge process of InnoDB history, we can find this transaction and make changes so it will not “hang”. We can change the transaction isolation level from REPEATABLE READ (the default) to READ COMMITTED. In READ COMMITTED, InnoDB does not need to maintain the history for changes that other transactions have already committed. (More details about different isolation levels and how they affect InnoDB transactions.) That will work in MySQL 5.6 and later. However, this doesn’t work in Amazon Aurora (as of now): even with the READ COMMITTED isolation level, the history length still grows.
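
A sketch of how the change can be applied, either per session from the application or server-wide (the variable is named tx_isolation in MySQL 5.6/5.7):

-- per session, before starting the long transaction
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;

-- or server-wide, affecting new connections only
SET GLOBAL tx_isolation = 'READ-COMMITTED';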

Here is a summary of whether changing the isolation level fixes the issue in different MySQL versions:

MySQL Version   Transaction isolation   InnoDB History Length
MySQL 5.6       repeatable read         history is not purged until the “hung” transaction finishes
MySQL 5.6       read committed          (fixed) history is purged
Aurora          repeatable read         history is not purged until the “hung” transaction finishes
Aurora          read committed          history is not purged until the “hung” transaction finishes


Summary

Hung transactions can cause the InnoDB history length to grow and (surprisingly, at first glance) affect the performance of other running SELECT queries. We can use Performance Schema to chase the “hung” transaction. Changing the MySQL transaction isolation level can potentially help.

Categories: MySQL

How much disk space should I allocate for Percona Monitoring and Management?

MySQL Performance Blog - Thu, 2017-05-04 18:15

I heard a frequent question at last week’s Percona Live conference regarding Percona Monitoring and Management (PMM): How much disk space should I allocate for PMM Server?

First, let’s review the three components of Percona Monitoring and Management that consume non-negligible disk space:

  1. Prometheus data source for the time series metrics
  2. Query Analytics (QAN), which uses Percona Server XtraDB (Percona’s enhanced version of the InnoDB storage engine)
  3. Orchestrator, also backed by Percona Server XtraDB

Of these, you’ll find that Prometheus is generally your largest consumer of disk space. Prometheus hits a steady state of disk utilization once you reach the defined storage.local.retention period. If you deploy Percona Monitoring and Management 1.1.3 (the latest stable version), you’ll be using a retention period of 30 days. “Steady state” in this case means you’re not adding or removing nodes frequently, since each node comes with its own 1k-7k metrics to be scraped. Prometheus stores one time series per metric scraped, and automatically trims chunks (like InnoDB pages) from the tail of the time series once they exceed the retention period, so the disk requirement for a static list of metrics remains “fixed” for the retention period.

However, if you’re in a dynamic environment with nodes being added and removed frequently, or you’re on the extreme end like these guys who re-deploy data centers every day, steady state for Prometheus may remain an elusive goal. The guidance you find below may help you establish at least a minimum disk provisioning threshold.

QAN is based on a web application and uses Percona Server 5.7.17 as its datastore. The Percona QAN agent runs one instance per monitored MySQL server, and obtains queries from either the slow query log or Performance Schema. It performs analysis locally to generate a list of unique queries and their corresponding metrics: min, max, avg, med, and p95. There are dimensions based on Tmp table, InnoDB, Query time, Lock time, etc. Check the schema for a full listing, as there are actually 149 columns on this table (SHOW CREATE TABLE pmm.query_class_metrics\G). While the table is wide, it isn’t too long: the PMM Demo is ~9 million rows, roughly 1 row per distinct query per minute per host.
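
To see how much disk QAN is actually using on your own PMM Server, something like the following, run against the PMM Server’s MySQL instance, gives a quick answer (a sketch; it assumes the default pmm schema name):

SELECT table_name,
       ROUND((data_length + index_length) / 1024 / 1024, 1) AS size_mb
FROM information_schema.tables
WHERE table_schema = 'pmm'
ORDER BY (data_length + index_length) DESC;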

Finally, there is Orchestrator. While the disk requirements for Orchestrator are not zero, they are certainly dwarfed by Prometheus and QAN.  As you’ll read below, Percona’s Orchestrator footprint is a measly ~250MB, which is a rounding error. I’d love to hear other experiences with Orchestrator and how large your InnoDB footprint is for a large or active cluster.

For comparison, here is the resource consumption from Percona’s PMM Demo site:

  • ~47k time series
  • 25 hosts, which is ~1,900 time series/host on average; some are 4k+
  • 8-day retention for metrics in Prometheus
  • Prometheus data is ~40GB
    • This should not increase until we add more hosts, as this isn’t a dynamic Kubernetes environment
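
As a rough cross-check (a back-of-envelope sketch: the ~1.3 bytes/sample figure is the usual rule of thumb for Prometheus 1.x local storage, and one sample per second per series is an assumption, neither number comes from the demo itself):

47,000 series × 1 sample/s × 86,400 s/day × 8 days × ~1.3 bytes/sample ≈ 42 GB

which is in the same ballpark as the ~40GB observed.
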
Categories: MySQL

Storing UUID and Generated Columns

MySQL Performance Blog - Wed, 2017-05-03 18:15

A lot of things have been said about UUID, and storing UUID in an optimized way. Now that we have generated columns, we can store the decomposed information inside the UUID and merge it again with generated columns. This blog post demonstrates this process.

First, I used a simple table with one char field that I called uuid_char to establish a base case. I used this table with and without a primary key:

CREATE TABLE uuid_char (
  uuid char(36) CHARACTER SET utf8 NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

CREATE TABLE uuid_char_pk (
  uuid char(36) CHARACTER SET utf8 NOT NULL,
  PRIMARY KEY (uuid)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

I performed the tests on a local VM over MySQL 5.7.17 for 30 seconds, with only two threads, because I wanted to just compare the executions:

sysbench --oltp-table-size=100000000 \
  --test=/usr/share/doc/sysbench/tests/db/insert_uuid_generated_columns.uuid_char.lua \
  --oltp-tables-count=4 --num-threads=2 --mysql-user=root --max-requests=5000000 \
  --report-interval=5 --max-time=30 --mysql-db=generatedcolumn run

One pair of executions is with the UUID generated by sysbench, which simulates the UUID that comes from the app:

rs = db_query("INSERT INTO uuid_char (uuid) VALUES " .. string.format("('%s')",c_val))

An alternative execution is for when the UUID is generated by the MySQL function uuid():

rs = db_query("INSERT INTO uuid_char (uuid) VALUES (uuid())")

Below we can see the results: 

The inserts are faster without a PK (but only by 5%), and using the uuid() function doesn’t impact performance.

Now, let’s see the alternative method, which is decomposing the UUID. It carries four main pieces of information (see the annotated example after the list):

  • Timestamp: this is a number with seven decimals.
  • MAC: the MAC address of the device that creates the UUID
  • Unique value: this value avoids duplicate cases scenarios
  • UUID version: this will always be “1”, as we are going to use version 1. If you are going to use another version, you will need to review the functions that I used.

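For orientation, this is how the fields above map onto a version-1 UUID as produced by MySQL’s uuid() function (the value shown is just an illustrative placeholder):

SELECT uuid();
-- illustrative output: 1c9a6e72-300a-11e7-9598-0800200c9a66
--   1c9a6e72      -> time_low  (lowest 32 bits of the timestamp)
--   300a          -> time_mid  (middle 16 bits of the timestamp)
--   11e7          -> version "1" plus time_high (top 12 bits of the timestamp)
--   9598          -> clock sequence (the "unique value", temp_uniq above)
--   0800200c9a66  -> node, normally the MAC address of the host
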
The structure of the table that we’ll use is:

CREATE TABLE `uuid_generated` (
  `timestamp` decimal(18,7) unsigned NOT NULL,
  `mac` bigint(20) unsigned NOT NULL,
  `temp_uniq` binary(2) NOT NULL,
  PRIMARY KEY (`timestamp`,`mac`,`temp_uniq`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

To understand how a UUID is unwrapped, I used this stored procedure (which receives a UUID and inserts it into the table):

CREATE PROCEDURE ins_generated_uuid (uuid char(38))
begin
  set @hex_timestamp = concat(substring(uuid, 16, 3), substring(uuid, 10, 4), substring(uuid, 1, 8));
  set @timestamp = concat(conv(@hex_timestamp,16,10) div 10000000 - (141427 * 24 * 60 * 60),'.',right(conv(@hex_timestamp,16,10),7));
  set @mac = conv(right(uuid,12),16,10);
  set @temp_uniq = unhex(substring(uuid,20,4));
  insert into uuid_generated (timestamp,mac,temp_uniq) values (@timestamp,@mac,@temp_uniq);
end ;;
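
To use it, one can simply feed it a freshly generated UUID (a hypothetical usage example, not taken from the original post):

CALL ins_generated_uuid(uuid());
SELECT * FROM uuid_generated ORDER BY timestamp DESC LIMIT 1;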

Explanation:

  • @hex_timestamp is a temporary variable that collects the timestamp in hexadecimal format from the different sections of the UUID
  • @timestamp transforms the hexadecimal timestamp to a decimal number
  • @mac pulls the last number in the UUID (a MAC address) so we can store it as a bigint
  • @temp_uniq is a value that preserves uniqueness, which is why we store it as binary and put it at the end of the Primary Key

If I want to get the UUID back, I can use these two generated columns:

`hex_timestamp` char(40) GENERATED ALWAYS AS (conv(((`timestamp` * 10000000) + (((141427 * 24) * 60) * 600000000)),10,16)) VIRTUAL,
`uuid_generated` char(38) GENERATED ALWAYS AS (concat(right(`hex_timestamp`,8),'-',substr(`hex_timestamp`,4,4),'-1',left(`hex_timestamp`,3),'-',convert(hex(`temp_uniq`) using utf8),'-',lpad(conv(`mac`,10,16),12,'0'))) VIRTUAL,

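As a quick sanity check (a sketch, not from the original post), the uuid_generated_char_plus table shown further down keeps both the original UUID and the rebuilt one, so the two can be compared side by side:

SELECT uuid, uuid_generated
FROM uuid_generated_char_plus
LIMIT 5;
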
We performed tests over five scenarios:

  • Without the generated columns, the insert used data generated dynamically
  • Same as before, but we added a char field that stores the UUID
  • With the char field, and adding the generated column
  • We used the stored procedure detailed above to insert the data into the table
  • We also tested the performance using triggers

The difference between the Base and the previous table structure with Primary Keys is very small. So, the new basic structure has no impact on performance.

We see that Base and +Char Field have the same performance. So leaving a char field has no performance impact (it just uses more disk space).

Using generated columns impacts performance. This is expected, as the columns need to be generated (and their types validated) before the row is inserted.

Finally, the use of triggers and the stored procedure has the same impact on performance.

These are the three table structures:

CREATE TABLE `uuid_generated` (
  `timestamp` decimal(18,7) unsigned NOT NULL,
  `mac` bigint(20) unsigned NOT NULL,
  `temp_uniq` binary(2) NOT NULL,
  PRIMARY KEY (`timestamp`,`mac`,`temp_uniq`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

CREATE TABLE `uuid_generated_char` (
  `timestamp` decimal(18,7) unsigned NOT NULL,
  `mac` bigint(20) unsigned NOT NULL,
  `temp_uniq` binary(2) NOT NULL,
  `uuid` char(38) DEFAULT NULL,
  PRIMARY KEY (`timestamp`,`mac`,`temp_uniq`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

CREATE TABLE `uuid_generated_char_plus` (
  `timestamp` decimal(18,7) unsigned NOT NULL,
  `mac` bigint(20) unsigned NOT NULL,
  `temp_uniq` binary(2) NOT NULL,
  `uuid` char(38) DEFAULT NULL,
  `hex_timestamp` char(40) GENERATED ALWAYS AS (conv(((`timestamp` * 10000000) + (((141427 * 24) * 60) * 600000000)),10,16)) VIRTUAL,
  `uuid_generated` char(38) GENERATED ALWAYS AS (concat(right(`hex_timestamp`,8),'-',substr(`hex_timestamp`,4,4),'-1',left(`hex_timestamp`,3),'-',convert(hex(`temp_uniq`) using utf8),'-',lpad(conv(`mac`,10,16),12,'0'))) VIRTUAL,
  PRIMARY KEY (`timestamp`,`mac`,`temp_uniq`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

And this is the trigger:

DROP TRIGGER IF EXISTS ins_uuid_generated;
delimiter ;;
CREATE TRIGGER ins_uuid_generated BEFORE INSERT ON uuid_generated_char
FOR EACH ROW
begin
  set @hex_timestamp = concat(substring(NEW.uuid, 16, 3), substring(NEW.uuid, 10, 4), substring(NEW.uuid, 1, 8));
  set NEW.timestamp = concat(conv(@hex_timestamp,16,10) div 10000000 - (141427 * 24 * 60 * 60),'.',right(conv(@hex_timestamp,16,10),7));
  set NEW.mac = conv(right(NEW.uuid,12),16,10);
  set NEW.temp_uniq = unhex(substring(NEW.uuid,20,4));
end ;;
delimiter ;

Conclusions

Decomposing the UUID is an alternative to storing it in an ordered way, but it won’t speed up inserts. It makes it simpler to execute queries over a range of dates, and to look at the rows for a particular device, as you will be able to filter on the MAC (it is recommended to add an index for it; see the sketch below). Generated columns give you the possibility to rebuild the UUID as a single string.
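
For instance, a minimal sketch of that secondary index and the kind of device/date-range query it would serve (the MAC and timestamp values are purely illustrative):

ALTER TABLE uuid_generated ADD INDEX idx_mac (mac);

SELECT timestamp, mac, temp_uniq
FROM uuid_generated
WHERE mac = conv('0800200c9a66', 16, 10)
  AND timestamp BETWEEN 1493769600 AND 1493856000;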

Categories: MySQL

My proposals for Percona Live: Window Functions and ANALYZE for statements

Sergey Petrunia's blog - Mon, 2015-11-30 17:05

I’ve made two session proposals for the Percona Live conference:

If you feel these talks are worth it, please vote!

Categories: MySQL