MySQL

New MariaDB Dashboard in Percona Monitoring and Management Metrics Monitor

MySQL Performance Blog - Mon, 2017-04-03 21:56

In honor of the upcoming MariaDB M17 conference in New York City on April 11-12, we have enhanced Percona Monitoring and Management (PMM) Metrics Monitor with a new MariaDB Dashboard and multiple new graphs!

The Percona Monitoring and Management MariaDB Dashboard builds on the efforts of the MariaDB development team to instrument the Aria Storage Engine Status Variables related to Aria Pagecache and Aria Transaction Log activity, the tracking of Index Condition Pushdown (ICP), InnoDB Online DDL when using ALTER TABLE ... ALGORITHM=INPLACE, InnoDB Deadlocks Detected, and finally InnoDB Defragmentation. This new dashboard is available in Percona Monitoring and Management release 1.1.2. Download it now using our docker, VirtualBox or Amazon AMI installation options!

Percona Monitoring and Management (PMM) is a free and open-source platform for managing and monitoring MySQL®, MariaDB® and MongoDB® performance. You can run PMM in your own environment for maximum security and reliability. It provides thorough time-based analysis for MySQL, MariaDB® and MongoDB servers to ensure that your data works as efficiently as possible.

Aria Pagecache Reads/Writes

MariaDB 5.1 introduced the Aria Storage Engine, MariaDB’s MyISAM replacement. Originally known as the Maria storage engine, it was renamed in late 2010 to avoid confusion with the overall MariaDB project name. The Aria Pagecache Status Variables graph plots the count of disk block reads and writes, which occur when the data isn’t already in the Aria Pagecache. We also plot the reads and writes from the Aria Pagecache, which count the reads/writes that did not incur a disk lookup (as the data was previously fetched and available from the Aria Pagecache):
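
If you want to eyeball the underlying counters directly on a MariaDB server, a check along the following lines should work. This is a minimal sketch: the Aria_pagecache_read*/Aria_pagecache_write* names are the standard Aria status counters, but verify them against your MariaDB version.

-- Pagecache read/write requests vs. actual disk block reads/writes
SHOW GLOBAL STATUS LIKE 'Aria_pagecache_read%';
SHOW GLOBAL STATUS LIKE 'Aria_pagecache_write%';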

Aria Pagecache Blocks

Aria reads and writes to the pagecache in order to cache data in RAM and avoid or delay activity related to disk. Overall, this translates into faster database query response times:

  • Aria_pagecache_blocks_not_flushed: The number of dirty blocks in the Aria pagecache.
  • Aria_pagecache_blocks_unused: Free blocks in the Aria pagecache.
  • Aria_pagecache_blocks_used: Blocks used in the Aria pagecache.

Aria Pagecache Total Blocks is calculated using Aria System Variables and the following formula: aria_pagecache_buffer_size / aria_block_size:
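
If you want to compute the same numbers by hand, something like the following should do it. This is a sketch only: it simply applies the formula above (using integer division) and pulls the block counters listed earlier.

-- Dirty, free and used blocks in the Aria pagecache
SHOW GLOBAL STATUS LIKE 'Aria_pagecache_blocks%';
-- Total blocks = aria_pagecache_buffer_size / aria_block_size
SELECT @@GLOBAL.aria_pagecache_buffer_size DIV @@GLOBAL.aria_block_size AS aria_pagecache_total_blocks;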

Aria Transaction Log Syncs

As Aria strives to be a fully ACID- and MVCC-compliant storage engine, an important factor is support for transactions. A transaction is the unit of work in a database that defines how to implement the four properties of Atomicity, Consistency, Isolation, and Durability (ACID). This graph tracks the rate at which Aria fsyncs the Aria Transaction Log to disk. You can think of this as the “write penalty” for running a transactional storage engine:
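
The underlying counter can also be checked directly. A minimal sketch, assuming the Aria_transaction_log_syncs status variable is present in your MariaDB build (the graph plots its rate of change):

-- Cumulative number of Aria transaction log fsyncs since startup
SHOW GLOBAL STATUS LIKE 'Aria_transaction_log_syncs';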

InnoDB Online DDL

MySQL 5.6 introduced in-place DDL operations via ALTER TABLE ... ALGORITHM=INPLACE, which in some cases avoid performing a table copy and thus don’t block INSERT/UPDATE/DELETE. MariaDB implemented three measures to track ongoing InnoDB Online DDL operations, which we plot via the following three status variables (a quick way to query them is shown below the list):

  • Innodb_onlineddl_pct_progress: Shows the progress of the in-place alter table. It might not be accurate, as in-place alter is highly dependent on the disk and buffer pool status
  • Innodb_onlineddl_rowlog_pct_used: Shows row log buffer usage in 5-digit integers (10000 means 100.00%)
  • Innodb_onlineddl_rowlog_rows: Number of rows stored in the row log buffer

For more information, please see the MariaDB blog post Monitoring progress and temporal memory usage of Online DDL in InnoDB.
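
To follow a long-running in-place ALTER from another session, polling the counters above is enough. A minimal sketch:

-- Progress of an ongoing ALTER TABLE ... ALGORITHM=INPLACE (MariaDB)
SHOW GLOBAL STATUS LIKE 'Innodb_onlineddl%';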

InnoDB Defragmentation

MariaDB merged the Facebook/Kakao patch for defragmenting InnoDB tablespaces into its 10.1 release. Your MariaDB instance needs to have been started with innodb_defragment=1, and your tables need to use innodb_file_per_table=1, for this to work. We plot the following three status variables (a quick sketch follows the list):

  • Innodb_defragment_compression_failures: Number of defragment re-compression failures
  • Innodb_defragment_failures: Number of defragment failures
  • Innodb_defragment_count: Number of defragment operations
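
A minimal sketch of how you might trigger and observe defragmentation on such an instance. The table name below is a placeholder, and the behavior assumes MariaDB 10.1+ with innodb_defragment=1, where OPTIMIZE TABLE runs defragmentation instead of a full table rebuild:

-- Assumes innodb_defragment=1 and innodb_file_per_table=1
OPTIMIZE TABLE mydb.mytable;                    -- hypothetical table; defragments instead of rebuilding
SHOW GLOBAL STATUS LIKE 'Innodb_defragment%';   -- the counters plotted on the graph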

Index Condition Pushdown

Oracle introduced this in MySQL 5.6. From the manual:

Index Condition Pushdown (ICP) is an optimization for the case where MySQL retrieves rows from a table using an index. Without ICP, the storage engine traverses the index to locate rows in the base table and returns them to the MySQL server which evaluates the WHERE condition for the rows. With ICP enabled, and if parts of the WHERE condition can be evaluated by using only columns from the index, the MySQL server pushes this part of the WHERE condition down to the storage engine. The storage engine then evaluates the pushed index condition by using the index entry and only if this is satisfied is the row read from the table. ICP can reduce the number of times the storage engine must access the base table and the number of times the MySQL server must access the storage engine.

Essentially, the closer that ICP Attempts are to ICP Matches, the better!
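
The graph is driven by handler-level ICP counters, which you can also check by hand. A sketch, assuming MariaDB’s Handler_icp_attempts and Handler_icp_match status variables:

-- How often ICP was attempted vs. how often the pushed-down condition matched
SHOW GLOBAL STATUS LIKE 'Handler_icp%';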

InnoDB Deadlocks Detected (MariaDB 10.1 Only)

Ever since MySQL implemented a transactional storage engine there have been deadlocks. Deadlocks are conditions where different transactions are unable to proceed because each holds a lock that the other needs. In MariaDB 10.1, there is a status variable that counts the occurrences of deadlocks since server startup. Previously, you had to instrument your application to get an accurate count of deadlocks, because otherwise you could miss occurrences if your polling interval wasn’t frequent enough (even when using pt-deadlock-logger). Unfortunately, this status variable doesn’t appear to be present in the MariaDB 10.2.4 build I tested:
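
For reference, this is the kind of check the graph relies on. A sketch, assuming the counter is exposed as Innodb_deadlocks (as in MariaDB 10.1):

-- Cumulative deadlocks detected since server startup (MariaDB 10.1)
SHOW GLOBAL STATUS LIKE 'Innodb_deadlocks';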

Again, please download Percona Monitoring and Management 1.1.2 to take advantage of the new MariaDB Dashboard and new graphs!  For installation instructions, see the Deployment Guide.

You can see the MariaDB Dashboard and new graphs in action at the PMM Demo site. If you feel the graphs need any tweaking or if I’ve missed anything, leave a note on the blog. You can also write me directly (I look forward to your comments): michael.coburn@percona.com.

To start: on the ICP graph, should we have a line that defines the percentage of successful ICP matches vs. attempts?

Categories: MySQL

Percona Monitoring and Management 1.1.2 is Now Available

MySQL Performance Blog - Mon, 2017-04-03 17:12

Percona announces the release of Percona Monitoring and Management 1.1.2 on April 3, 2017.

For installation instructions, see the Deployment Guide.

This release includes several new dashboards in Metrics Monitor, updated versions of software components used in PMM Server, and a number of small bug fixes.

Thank You to the Community!

We would like to mention some of the key contributors in this release, and thank the community for continued support of PMM:

New Dashboards and Graphs

This release includes the following new dashboards:

  • MariaDB dashboard includes three new graphs for the Aria storage engine. There will be a detailed blog post about monitoring possibilities with these new graphs:

The new MariaDB dashboard also includes three new graphs for monitoring InnoDB within MariaDB. We are planning to move them into one of the existing InnoDB dashboards in the next PMM release:

  • The InnoDB Defragmentation graph shows how OPTIMIZE TABLE impacts defragmentation on tables when running MariaDB with innodb_file_per_table=1 and innodb_defragment=1.

  • The InnoDB Online DDL graph includes metrics related to online DDL operations when using ALTER TABLE ... ALGORITHM=INPLACE in MariaDB.

  • The InnoDB Deadlocks Detected graph currently works only with MariaDB 10.1. We are planning to add support for MariaDB 10.2, Percona Server, and MySQL in the next PMM release.

  • The Index Condition Pushdown graph shows how InnoDB leverages the Index Condition Pushdown (ICP) routines. Currently this graph works only with MariaDB, but we are planning to add support for Percona Server and MySQL in the next PMM release.

Updated Software

PMM is based on several third-party open-source software components. We ensure that PMM includes the latest versions of these components in every release, making it the most secure, stable and feature-rich database monitoring platform possible. Here are some highlights of changes in the latest releases:

  • Grafana 4.2 (from 4.1.1)
    • HipChat integration
    • Templating improvements
    • Alerting enhancements
  • Consul 0.7.5 (from 0.7.3)
    • Bug fix for serious server panic
  • Prometheus 1.5.2 (from 1.5.1)
    • Prometheus binaries are built with Go1.7.5
    • Fixed two panic conditions and one series corruption bug
  • Orchestrator 2.0.3 (from 2.0.1)
    • GTID improvements
    • Logging enhancements
    • Improved timing resolution and faster discoveries
Other Changes in PMM Server
  • Migrated the PMM Server docker container to use CentOS 7 as the base operating system.
  • Changed the entry point so that supervisor is PID 1.
  • PMM-633: Set the following default values in my.cnf:
    [mysqld]
    # Default MySQL Settings
    innodb_buffer_pool_size=128M
    innodb_log_file_size=5M
    innodb_flush_log_at_trx_commit=1
    innodb_file_per_table=1
    innodb_flush_method=O_DIRECT
    # Disable Query Cache by default
    query_cache_size=0
    query_cache_type=0
  • PMM-676: Added descriptions for graphs in Disk Performance and Galera dashboards.
Changes in PMM Client
  • Fixed pmm-admin remove --all to clear all saved credentials.
  • Several fixes to mongodb_exporter including PMM-629 and PMM-642.
  • PMM-504: Added ability to change the name of a client with running services: $ sudo pmm-admin config --client-name new_name --force

    WARNING: Some Metrics Monitor data may be lost when renaming a running client.

About Percona Monitoring and Management

Percona Monitoring and Management is an open-source platform for managing and monitoring MySQL and MongoDB performance. Percona developed it in collaboration with experts in the field of managed database services, support and consulting.

PMM is a free and open-source solution that you can run in your own environment for maximum security and reliability. It provides thorough time-based analysis for MySQL and MongoDB servers to ensure that your data works as efficiently as possible.

A live demo of PMM is available at pmmdemo.percona.com.

Please provide your feedback and questions on the PMM forum.

If you would like to report a bug or submit a feature request, use the PMM project in JIRA.

Categories: MySQL

Performance Evaluation of SST Data Transfer: With Encryption (Part 2)

MySQL Performance Blog - Thu, 2017-03-30 21:08

In this blog post, we’ll look at the performance of SST data transfer using encryption.

In my previous post, we reviewed SST data transfer in an unsecured environment. Now let’s take a closer look at a setup with encrypted network connections between the donor and joiner nodes.

The base setup is the same as the previous time:

  • Database server: Percona XtraDB Cluster 5.7 on donor node
  • Database: sysbench database – 100 tables, 4M rows each (total ~122GB)
  • Network: donor/joiner hosts are connected with dedicated 10Gbit LAN
  • Hardware: donor/joiner hosts – boxes with 28 Cores+HT/RAM 256GB/Samsung SSD 850/Ubuntu 16.04

The setup details for the encryption aspects in our testing:

  • Cryptography libraries: openssl-1.0.2, openssl-1.1.0, libgcrypt-1.6.5(for xbstream encryption)
  • CPU hardware acceleration for AES – AES-NI: enabled/disabled
  • Ciphers suites: aes(default), aes128, aes256, chacha20(openssl-1.1.0)

Several notes regarding the above aspects:

  • Cryptography libraries. Almost every Linux distribution is currently based on openssl-1.0.2, the previous stable version of the OpenSSL library. The latest stable version (1.1.0) has various performance/scalability fixes, and also supports new ciphers that may notably improve throughput. However, it’s problematic to upgrade from 1.0.2 to 1.1.0, or even just to find packages for openssl-1.1.0 for existing distributions, because replacing OpenSSL triggers an update/upgrade of a significant number of packages. So in order to use openssl-1.1.0, you will most likely need to build it from sources. The same applies to socat – it will require some effort to build socat with openssl-1.1.0.
  • AES-NI. The Advanced Encryption Standard Instruction Set (AES-NI) is an extension to the x86 instruction set from Intel and AMD. The purpose of AES-NI is to improve the performance of encryption and decryption operations using the Advanced Encryption Standard (AES), such as the AES128/AES256 ciphers. If your CPU supports AES-NI, there should be an option in the BIOS that allows you to enable/disable that feature. In Linux, you can check /proc/cpuinfo for the presence of an “aes” flag. If it’s present, then AES-NI is available and exposed to the OS. There is a way to check what acceleration ratio you can expect from it:
    # AES_NI disabled with OPENSSL_ia32cap
    OPENSSL_ia32cap="~0x200000200000000" openssl speed -elapsed -evp aes-128-gcm
    ...
    The 'numbers' are in 1000s of bytes per second processed.
    type             16 bytes     64 bytes    256 bytes    1024 bytes   8192 bytes
    aes-128-gcm      57535.13k    65924.18k   164094.81k   175759.36k   178757.63k

    # AES_NI enabled
    openssl speed -elapsed -evp aes-128-gcm
    The 'numbers' are in 1000s of bytes per second processed.
    type             16 bytes     64 bytes    256 bytes    1024 bytes   8192 bytes
    aes-128-gcm     254276.67k   620945.00k   826301.78k   906044.07k   923740.84k
    Our interest is in the very last column: 178MB/s (without AES-NI) vs. 923MB/s (with AES-NI).
  • Ciphers. In our testing for network encryption with socat+openssl 1.0.2/1.1.0, we used the following cipher suites:
    DEFAULT – if you don’t specify a cipher/cipher string for the OpenSSL connection, this suite will be used
    AES128 – suite with aes128 ciphers only
    AES256 – suite with aes256 ciphers only
    Additionally, for openssl-1.1.0, there is an extra cipher suite:
    CHACHA20 – cipher suites using the ChaCha20 algorithm
    In the case of xtrabackup, where internal encryption is based on libgcrypt, we use the AES128/AES256 ciphers from this library.
  • SST methods. Streaming database files from the donor to the joiner with the rsync protocol over an OpenSSL-encrypted connection:
    (donor) rsync | socat+ssl    socat+ssl | rsync (daemon mode) (joiner)
    The current approach of wsrep_sst_rsync.sh doesn’t allow you to use the rsync SST method with SSL. However, there is a project that tries to address the lack of SSL support for rsync method. The idea is to create a secure connection with socat and then use that connection as a tunnel to connect rsync between the joiner and donor hosts. In my testing, I used a similar approach.

    Also note that in the chart below there are results for two variants of rsync: “rsync” (the current approach) and “rsync_improved” (the improved one). I explained the difference between them in my previous post.

  • Backup data on the donor side and stream it to the joiner in xbstream format over an OpenSSL encrypted connection

    (donor) xtrabackup| socat+ssl socat+ssl | xbstream (joiner)

    In my testing for streaming over encrypted connections, I used the --parallel=4 option for xtrabackup. In my previous post, I showed that this is an important factor in getting the best time. There is also a way to pass the name of the cipher that socat will use for the OpenSSL connection to the wsrep_sst_xtrabackup-v2.sh script with the sockopt option. For instance:

    [sst]
    inno-backup-opts="--parallel=4"
    sockopt=",cipher=AES128"

  • Backup data on the donor side, encrypt it internally (with libgcrypt) and stream the data to the joiner in xbstream format, and afterwards decrypt the files on the joiner

    (donor) xtrabackup | socat socat | xbstream ; xtrabackup decrypt (joiner)

    The xtrabackup tool has a feature to encrypt data when performing a backup. That encryption is based on the libgcrypt library, and it’s possible to use AES128 or AES256 ciphers. For encryption, it’s necessary to generate a key and then provide it to xtrabackup to perform encryption on the fly. There is a way to specify the number of threads that will encrypt data, along with the chunk size, to tune the encryption process.

    The current version of xtrabackup supports an efficient way to read, compress and encrypt data in parallel, and then write/stream it. On the receiving side, however, we can’t decompress/decrypt the stream on the fly. At first, the stream has to be received/written to disk with the xbstream tool, and only after that can you use xtrabackup with the --decrypt/--decompress modes to unpack the data. The inability to process data on the fly, and having to save the stream to disk for later processing, has a notable impact on the stream time from the donor to the joiner. We have a plan to fix that issue, so that encryption+compression+streaming of data with xtrabackup happens without the need to write the stream to disk on the receiver side.

    For my testing, in the case of xtrabackup with internal encryption, I didn’t use SSL encryption for socat.

Results (click on the image for an enlarged view):

Observations:
  • Transferring data with rsync is very inefficient, and the improved version is 2-2.5 times faster. Also, you may note that in the case of “no-aes-ni”, the rsync_improved method has the best time for the default/aes128/aes256 ciphers. The reason is that we perform both the data transfer in parallel (we spawn an rsync process for each file) and the encryption/decryption in parallel (socat forks extra processes for each stream). This approach allows us to compensate for the absence of hardware acceleration by using several CPU cores. In all other cases, we only use one CPU for streaming of data and encryption/decryption.
  • xtrabackup (with hardware-optimized crc32) shows the best time in all cases, except for the default/aes128/aes256 ciphers in “no-aes-ni” mode (where rsync_improved showed the best time). However, I would like to remind you that SST with rsync is a blocking operation, and during the data transfer the donor node becomes READ-ONLY. xtrabackup, on the other hand, uses backup locks and allows any operations on the donor node during SST.
  • On the boxes without hardware acceleration (no-aes-ni mode), the chacha20 cipher allows you to perform data transfer 2-3 times faster. It’s a very good replacement for “aes” ciphers on such boxes. However, the problem with that cipher is that it is available only in openssl-1.1.0. In order to use it, you will need a custom build of OpenSSL and socat for many distros.
  • Regarding xtrabackup with internal encryption (xtrabackup_enc): reading/encrypting and streaming data is quite fast, especially with the latest libgcrypt library (1.7.x). The problem is decryption. As I explained above, right now we need to get the stream and save the encrypted data to storage first, and then perform the extra step of reading/decrypting and saving the data back. That extra part consumes 2/3 of the total time. Improving the xbstream tool to perform stream decryption/decompression on the fly would allow you to get very good results.
Testing Details

For purposes of the testing, I’ve created a script “sst-bench.sh” that covers all the methods used in this post. You can use it to measure all the above SST methods in your environment. In order to run the script, you have to adjust several environment variables at the beginning of the script: the joiner IP, the datadir locations on the joiner and donor hosts, etc. After that, put the script on the “donor” and “joiner” hosts and run it as follows:

#joiner_host> sst_bench.sh --mode=joiner --sst-mode=<tar|xbackup|rsync> --cipher=<DEFAULT|AES128|AES256|CHACHA20> --ssl=<0|1> --aesni=<0|1>
#donor_host> sst_bench.sh --mode=donor --sst-mode=<tar|xbackup|rsync|rsync_improved> --cipher=<DEFAULT|AES128|AES256|CHACHA20> --ssl=<0|1> --aesni=<0|1>

Categories: MySQL

Performance Evaluation of SST Data Transfer: Without Encryption (Part 1)

MySQL Performance Blog - Wed, 2017-03-29 21:10

In this blog, we’ll look at evaluating the performance of an SST data transfer without encryption.

A State Snapshot Transfer (SST) operation is an important part of Percona XtraDB Cluster. It’s used to provision the joining node with all the necessary data. There are three methods of SST operation available: mysqldump, rsync, xtrabackup. The most advanced one – xtrabackup – is the default method for SST in Percona XtraDB Cluster.

We decided to evaluate the current state of xtrabackup, focusing on the process of transferring data between the donor and joiner nodes, to find out if there is any room for improvements or optimizations.

Taking into account that the security of the network connections used for Percona XtraDB Cluster deployment is one of the most important factors that affects SST performance, we will evaluate SST operations in two setups: without network encryption, and in a secure environment.

In this post, we will take a look at the setup without network encryption.

Setup:

  • database server: Percona XtraDB Cluster 5.7 on the donor node
  • database: sysbench database – 100 tables, 4M rows each (total ~122GB)
  • network: donor/joiner hosts are connected with dedicated 10Gbit LAN
  • hardware: donor/joiner hosts – boxes with 28 Cores+HT/RAM 256GB/Samsung SSD 850/Ubuntu 16.04

In our test, we will measure the amount of time it takes to stream all necessary data from the donor to the joiner with the help of one of SST’s methods.

Before testing, I measured read/write bandwidth limits of the attached SSD drives (with the help of sysbench/fileio): they are ~530-540MB/sec. That means that the best theoretical time to transfer all of our database files (122GB) is ~230sec.

Schematic view of SST methods:

  • Streaming DB files from the donor to joiner with tar
    (donor) tar | socat socat | tar (joiner)

    • tar is not really an SST method. It’s used here just to get some baseline numbers to understand how long it takes to transfer data without extra overhead.
  • Streaming DB files from the donor to joiner with rsync protocol
    (donor) rsync rsync(daemon mode) (joiner)

    • While working on the testing of the rsync SST method, I found that the current way of data streaming is quite inefficient: rsync parallelization is directory-based, not file-based. So if you have three directories – for instance sbtest (100 files/100GB), mysql (75 files/10MB), performance_schema (88 files/1MB) – the rsync SST script will start three rsync processes, where each process will handle its own directory. As a result, instead of a parallel transfer we end up with one stream that only streams the largest directory (sbtest). Replacing that approach with one that iterates over all files in the datadir and queues them to rsync workers allows us to speed up the transfer of data 2-3 times. On the charts, ‘rsync’ is the current approach and ‘rsync_improved’ is the improved one.
  • Backup data on the donor side and stream it to the joiner in xbstream format
    (donor) xtrabackup | socat socat | xbstream (joiner)

At the end of this post, you will find the command lines used for testing each SST method.

Streaming of our database files with tar took a minimal amount of time, and it’s very close to the best possible time (~230sec). xtrabackup is slower (~2x), as is rsync (~3x).

From profiling xtrabackup, we can clearly see two things:

  1. IO utilization is quite low
  2. A notable amount of time was spent in crc32 computation

Issue 1
xtrabackup can process data in parallel; however, by default it uses only a single thread. Our tests showed that increasing the number of parallel threads to 2/4 with the --parallel option allows us to improve IO utilization and reduce streaming time. One can pass this option to xtrabackup by adding the following to the [sst] section of my.cnf:

[sst]
inno-backup-opts="--parallel=4"

Issue 2
By default xtrabackup uses software-based crc32 functions from the libz library. Replacing this function with a hardware-optimized one allows a notable reduction in CPU usage and a speedup in data transfer. This fix will be included in the next release of xtrabackup.

We ran more tests for xtrabackup with the parallel option and hardware optimized crc32, and got results that confirm our analysis. Streaming time for xtrabackup is now very close to baseline and storage limits.

Testing details

For the purposes of testing, I’ve created a script “sst-bench.sh” that covers all the methods used in this post. You can try to measure all the above SST methods in your environment. In order to run the script, you have to adjust several environment variables at the beginning, such as the joiner IP, the datadir locations on the joiner and donor hosts, etc. After that, put the script on the “donor” and “joiner” hosts and run it as follows:

#joiner_host> sst_bench.sh --mode=joiner --sst-mode=<tar|xbackup|rsync>
#donor_host> sst_bench.sh --mode=donor --sst-mode=<tar|xbackup|rsync|rsync_improved>

Categories: MySQL

Webinar Thursday 3/30: MyRocks Troubleshooting

MySQL Performance Blog - Wed, 2017-03-29 20:51

Please join Percona’s Principal Technical Services Engineer Sveta Smirnova, and Senior Software Engineer George Lorch, MariaDB’s Query Optimizer Developer Sergei Petrunia and Facebook’s Database Engineer Yoshinori Matsunobu as they present MyRocks Troubleshooting on March 30, 2017 at 11:00 am PDT / 2:00 pm EDT (UTC-7).

Register Now

MyRocks is an alternative storage engine designed for flash storage. It provides great write workload performance and space efficiency. Like any other powerful engine, it has its own specific configuration scenarios that require special troubleshooting solutions.

This webinar will discuss how to deal with:

  • Data corruption issues
  • Inconsistent data
  • Locks
  • Slow performance

We will use well-known instruments and tools, as well as MyRocks-specific tools, and demonstrate how they work with the MyRocks storage engine.

Register for this webinar here.

Categories: MySQL

Troubleshooting Issues with MySQL Character Sets Q & A

MySQL Performance Blog - Tue, 2017-03-28 18:36

In this blog, I will provide answers to the Q & A for the Troubleshooting Issues with MySQL Character Sets webinar.

First, I want to thank everybody for attending the March 9 MySQL character sets troubleshooting webinar. The recording and slides for the webinar are available here. Below is the list of your questions that I wasn’t able to answer during the webinar, with responses:

Q: We’ve had some issues converting tables from utf8  to utf8mb4. Our issue was that the collation we wanted to use – utf8mb4_unicode_520_ci – did not distinguish between spaces and ideographic (Japanese) spaces, so we were getting unique constraint violations for the varchar fields when two entries had the same text with different kinds of spaces. Have you seen this problem and is there a workaround? We were wondering if this was related to the mother-child character bug with this collation.

A: Unfortunately this issue exists for many languages. For example, in Russian you cannot distinguish “е” and “ё” if you use utf8 or utf8mb4. However, there is hope for Japanese: Oracle announced that they will implement new language-specific utf8mb4 collations in MySQL 8.0. I already see 21 new collations in my 8.0.0 installation.

mysql> show collation like '%0900%';
+----------------------------+---------+-----+---------+----------+---------+
| Collation                  | Charset | Id  | Default | Compiled | Sortlen |
+----------------------------+---------+-----+---------+----------+---------+
| utf8mb4_0900_ai_ci         | utf8mb4 | 255 |         | Yes      |       8 |
| utf8mb4_cs_0900_ai_ci      | utf8mb4 | 266 |         | Yes      |       8 |
| utf8mb4_da_0900_ai_ci      | utf8mb4 | 267 |         | Yes      |       8 |
| utf8mb4_de_pb_0900_ai_ci   | utf8mb4 | 256 |         | Yes      |       8 |
| utf8mb4_eo_0900_ai_ci      | utf8mb4 | 273 |         | Yes      |       8 |
| utf8mb4_es_0900_ai_ci      | utf8mb4 | 263 |         | Yes      |       8 |
| utf8mb4_es_trad_0900_ai_ci | utf8mb4 | 270 |         | Yes      |       8 |
| utf8mb4_et_0900_ai_ci      | utf8mb4 | 262 |         | Yes      |       8 |
| utf8mb4_hr_0900_ai_ci      | utf8mb4 | 275 |         | Yes      |       8 |
| utf8mb4_hu_0900_ai_ci      | utf8mb4 | 274 |         | Yes      |       8 |
| utf8mb4_is_0900_ai_ci      | utf8mb4 | 257 |         | Yes      |       8 |
| utf8mb4_la_0900_ai_ci      | utf8mb4 | 271 |         | Yes      |       8 |
| utf8mb4_lt_0900_ai_ci      | utf8mb4 | 268 |         | Yes      |       8 |
| utf8mb4_lv_0900_ai_ci      | utf8mb4 | 258 |         | Yes      |       8 |
| utf8mb4_pl_0900_ai_ci      | utf8mb4 | 261 |         | Yes      |       8 |
| utf8mb4_ro_0900_ai_ci      | utf8mb4 | 259 |         | Yes      |       8 |
| utf8mb4_sk_0900_ai_ci      | utf8mb4 | 269 |         | Yes      |       8 |
| utf8mb4_sl_0900_ai_ci      | utf8mb4 | 260 |         | Yes      |       8 |
| utf8mb4_sv_0900_ai_ci      | utf8mb4 | 264 |         | Yes      |       8 |
| utf8mb4_tr_0900_ai_ci      | utf8mb4 | 265 |         | Yes      |       8 |
| utf8mb4_vi_0900_ai_ci      | utf8mb4 | 277 |         | Yes      |       8 |
+----------------------------+---------+-----+---------+----------+---------+
21 rows in set (0,03 sec)

In 8.0.1 they promised new case-sensitive and Japanese collations. Please see this blog post for details. The note about the planned Japanese support is at the end.

Meanwhile, I can only suggest that you implement your own collation as described here. You may use utf8_russian_ci collation from Bug #51976 as an example.

Although the user manual does not list utf8mb4 as a character set for which it’s possible to create new collations, you can actually do it. What you need to do is add a record about the character set utf8mb4 and the new collation into Index.xml, then restart the server.

<charset name="utf8mb4">
  <collation name="utf8mb4_russian_ci" id="1033">
    <rules>
      <reset>\u0415</reset><p>\u0451</p><t>\u0401</t>
    </rules>
  </collation>
</charset>

mysql> show collation like 'utf8mb4_russian_ci';
+--------------------+---------+------+---------+----------+---------+
| Collation          | Charset | Id   | Default | Compiled | Sortlen |
+--------------------+---------+------+---------+----------+---------+
| utf8mb4_russian_ci | utf8mb4 | 1033 |         |          |       8 |
+--------------------+---------+------+---------+----------+---------+
1 row in set (0,03 sec)

mysql> create table test_yo(gen varchar(100) CHARACTER SET utf8mb4, yo varchar(100) CHARACTER SET utf8mb4 collate utf8mb4_russian_ci) engine=innodb default character set=utf8mb4;
Query OK, 0 rows affected (0,25 sec)

mysql> set names utf8mb4;
Query OK, 0 rows affected (0,02 sec)

mysql> insert into test_yo values('ел', 'ел'), ('ель', 'ель'), ('ёлка', 'ёлка');
Query OK, 3 rows affected (0,05 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> insert into test_yo values('Ел', 'Ел'), ('Ель', 'Ель'), ('Ёлка', 'Ёлка');
Query OK, 3 rows affected (0,06 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> select * from test_yo order by gen;
+----------+----------+
| gen      | yo       |
+----------+----------+
| ел       | ел       |
| Ел       | Ел       |
| ёлка     | ёлка     |
| Ёлка     | Ёлка     |
| ель      | ель      |
| Ель      | Ель      |
+----------+----------+
6 rows in set (0,00 sec)

mysql> select * from test_yo order by yo;
+----------+----------+
| gen      | yo       |
+----------+----------+
| ел       | ел       |
| Ел       | Ел       |
| ель      | ель      |
| Ель      | Ель      |
| ёлка     | ёлка     |
| Ёлка     | Ёлка     |
+----------+----------+
6 rows in set (0,00 sec)

Q: If receiving utf8 on latin1 charset it will be corrupted. Just want to confirm that you can reformat as utf8 and un-corrupt the data? Also, is there a time limit on how quickly this needs to be done?

A: It will be corrupted only if you store utf8 data in the latin1 column. For example, if you have a table, defined as:

create table latin1( f1 varchar(100) ) engine=innodb default charset=latin1;

And then insert a word in utf8 format into it that contains characters that are not in the latin1 character set:

mysql> set names utf8;
Query OK, 0 rows affected (0,00 sec)

mysql> set sql_mode='';
Query OK, 0 rows affected, 1 warning (0,00 sec)

mysql> insert into latin1 values('Sveta'), ('Света');
Query OK, 2 rows affected, 1 warning (0,04 sec)
Records: 2  Duplicates: 0  Warnings: 1

The data in UTF8 will be corrupted and can never be recovered:

mysql> select * from latin1;
+-------+
| f1    |
+-------+
| Sveta |
| ????? |
+-------+
2 rows in set (0,00 sec)

mysql> select f1, hex(f1) from latin1;
+-------+------------+
| f1    | hex(f1)    |
+-------+------------+
| Sveta | 5376657461 |
| ????? | 3F3F3F3F3F |
+-------+------------+
2 rows in set (0,01 sec)

However, if your data is stored in the UTF8 column and you use latin1 for a connection, you will only get a corrupted result set. The data itself will be left untouched:

mysql> create table utf8(f1 varchar(100)) engine=innodb character set utf8;
Query OK, 0 rows affected (0,18 sec)

mysql> insert into utf8 values('Sveta'), ('Света');
Query OK, 2 rows affected (0,15 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> set names latin1;
Query OK, 0 rows affected (0,00 sec)

mysql> select f1, hex(f1) from utf8;
+-------+----------------------+
| f1    | hex(f1)              |
+-------+----------------------+
| Sveta | 5376657461           |
| ????? | D0A1D0B2D0B5D182D0B0 |
+-------+----------------------+
2 rows in set (0,00 sec)

mysql> set names utf8;
Query OK, 0 rows affected (0,00 sec)

mysql> select f1, hex(f1) from utf8;
+------------+----------------------+
| f1         | hex(f1)              |
+------------+----------------------+
| Sveta      | 5376657461           |
| Света      | D0A1D0B2D0B5D182D0B0 |
+------------+----------------------+
2 rows in set (0,00 sec)

Q: Can you discuss how charsets affect mysqldump? Specifically, how do we dump a database containing tables with different default charsets?

A: Yes, you can. MySQL can successfully convert data that uses different character sets, so your only job is to specify the --default-character-set option for mysqldump. In this case, strings in any character set you use can be converted to the character set specified. For example, if you use cp1251 and latin1, you may set --default-character-set to cp1251, utf8 or utf8mb4. However, you cannot set it to latin1, because Cyrillic characters exist in the cp1251 character set but do not exist in latin1.

The default value for mysqldump is utf8. You only need to change this default if you use values that are outside of the range supported by utf8 (for example, the smileys in utf8mb4).

Q: But if you use the --single-transaction option for mysqldump, you can only specify one character set in the default?

A: Yes, and this is OK: all data will be converted into this character set. Then, when you restore the dump, it will be converted back to the character set specified in the column definitions.

Q: I noticed that MySQL doesn’t support case-sensitive UTF-8 character sets. What do you recommend for implementing case-sensitive UTF-8, if it’s at all possible?

A: In the link I provided earlier, Oracle promises to implement case-sensitive collations for utf8mb4 in version 8.0.1. Before that happens, I recommend you to implement your own case-sensitive collation.

Q: How are tools like pt-table-checksum affected by charsets? Is it safe to use a 4-byte charset (like utf8mb4) as the default charset for all comparisons? Assuming our tables are a mix of latin1 , utf8 and utf8mb4.

A: With this combination, you won’t have any issues: pt-table-checksum uses a complicated set of functions that joins columns and calculates a crc32 checksum on them. In your case, all data will be converted to utf8mb4 and no conflicts will happen.

However, if you use incompatible character sets in a single table, you may get the error "Illegal mix of collations for operation 'concat_ws' ":

mysql> create table cp1251(f1 varchar(100) character set latin1, f2 varchar(100) character set cp1251) engine=innodb;
Query OK, 0 rows affected (0,32 sec)

mysql> set names utf8;
Query OK, 0 rows affected (0,00 sec)

mysql> insert into cp1251 values('Sveta', 'Света');
Query OK, 1 row affected (0,07 sec)

sveta@Thinkie:~/build/mysql-8.0/mysql-test$ ~/build/percona-toolkit/bin/pt-table-checksum h=127.0.0.1,P=13000,u=root,D=test
Diffs cannot be detected because no slaves were found. Please read the --recursion-method documentation for information.
03-18T03:51:58 Error executing EXPLAIN SELECT COUNT(*) AS cnt, COALESCE(LOWER(CONV(BIT_XOR(CAST(CRC32(CONCAT_WS('#', `f1`, `f2`, CONCAT(ISNULL(`f1`), ISNULL(`f2`)))) AS UNSIGNED)), 10, 16)), 0) AS crc FROM `db1`.`cp1251` /*explain checksum table*/: DBD::mysql::st execute failed: Illegal mix of collations for operation 'concat_ws' [for Statement "EXPLAIN SELECT COUNT(*) AS cnt, COALESCE(LOWER(CONV(BIT_XOR(CAST(CRC32(CONCAT_WS('#', `f1`, `f2`, CONCAT(ISNULL(`f1`), ISNULL(`f2`)))) AS UNSIGNED)), 10, 16)), 0) AS crc FROM `db1`.`cp1251` /*explain checksum table*/"] at /home/sveta/build/percona-toolkit/bin/pt-table-checksum line 11351.
03-18T03:51:58 Error checksumming table db1.cp1251: Error executing checksum query: DBD::mysql::st execute failed: Illegal mix of collations for operation 'concat_ws' [for Statement "REPLACE INTO `percona`.`checksums` (db, tbl, chunk, chunk_index, lower_boundary, upper_boundary, this_cnt, this_crc) SELECT ?, ?, ?, ?, ?, ?, COUNT(*) AS cnt, COALESCE(LOWER(CONV(BIT_XOR(CAST(CRC32(CONCAT_WS('#', `f1`, `f2`, CONCAT(ISNULL(`f1`), ISNULL(`f2`)))) AS UNSIGNED)), 10, 16)), 0) AS crc FROM `db1`.`cp1251` /*checksum table*/" with ParamValues: 0='db1', 1='cp1251', 2=1, 3=undef, 4=undef, 5=undef] at /home/sveta/build/percona-toolkit/bin/pt-table-checksum line 10741.
            TS ERRORS  DIFFS     ROWS  CHUNKS SKIPPED    TIME TABLE
03-18T03:51:58      2      0        0       1       0   0.003 db1.cp1251
03-18T03:51:58      0      0        2       1       0   0.167 db1.latin1
03-18T03:51:58      0      0        6       1       0   0.198 db1.test_yo
...

The tool continues working, and will process the rest of your tables. I reported this behavior as Bug #1674266.

Thanks for attending the Troubleshooting Issues with MySQL Character Sets webinar.

Categories: MySQL

What’s Next for SQL Databases?

MySQL Performance Blog - Mon, 2017-03-27 22:52

In this blog, I’ll go over my thoughts on what we can expect in the world of SQL databases.

After reading Baron’s prediction on databases, here:

https://www.xaprb.com/blog/defining-moments-in-database-history/

I want to provide my own view on what’s coming up next for SQL databases. I think we live in interesting times, when we can see the beginning of the next-generation of RDBMSs.

There are defining characteristics of such databases:

  1. Auto-scaling. The ability to add and use resources depending on the current load and database size. This is done transparently for users and DBAs.
  2. Auto-healing. The automatic handling of node failures.
  3. Multi-regional, cloud-agnostic, geo-distributed. The ability to support multiple data centers and multiple clouds, in different parts of the world.
  4. Transactional. All the above, with the ability to support multi-statements transactional workloads.
  5. Strong consistency. The full definition of strong consistency is pretty involved. For simplicity, let’s say it means that reads (in the absence of ongoing writes) will return the same data, despite what region or data center you are getting it from. A simple counter-example is the famous MySQL asynchronous replication, where (with the slave delay) reading the data on a slave can return very outdated data. I am focusing on reads, because in a distributed environment the consistent reads performance will be affected. This is where network latency (often limited by the speed of light) will define performance.
  6. SQL language. SQL, despite being old and widely criticized, is not going anywhere. This is a universal language for app developers to access data.

With this, I see following interesting projects:

  • Google Cloud Spanner (https://cloud.google.com/spanner/). Recently announced and still in the Beta stage. Definitely an interesting project, with the obvious limitation of running only in Google Cloud.
  • FaunaDB (https://fauna.com/). Also very recently announced, so it is hard to say how it performs. The major downside I see is that it does not provide SQL access, but uses a custom language.
  • Two open source projects:
    • CockroachDB (https://www.cockroachlabs.com/). This is still in the Beta stage, but definitely an interesting project to follow. Initially, the project planned to support only key-value access, but later they made a very smart decision to provide SQL access via a PostgreSQL-compatible protocol.
    • TiDB (https://github.com/pingcap/tidb). Right now in RC stages, and the target is to provide SQL access over a MySQL compatible protocol (and later PostgreSQL protocol).

Protocol compatibility is a wise approach, although not strictly necessary. It lowers the entry barrier for existing applications.

Both CockroachDB and TiDB, at the moment of this writing, still have rough edges and can’t be used in serious deployments (from my experience). I expect both projects to make big progress in 2017.

What shared characteristics can we expect from these systems?

As I mentioned above, we may see that the read performance is degraded (as latency increases), and often it will be defined more by network performance than anything else. Storage IO and CPU cycles will be secondary factors. There will be more work on how to understand and tune the network traffic.

We may need to get used to the fact that point or small range selects become much slower. Right now, we see very fast point selects for traditional RDBMSs (MySQL, PostgreSQL, etc.).

Heavy writes will be problematic. The problem is that all writes will need to go through the consistency protocol. Write-optimized storage engines will help (both CockroachDB and TiDB use RocksDB in the storage layer).

Long transactions (let’s say changing 100,000 or more rows) will also be problematic. There are just too many network round-trips and too much housekeeping work on each node, making long transactions an issue for distributed systems.

Another shared property (at least between CockroachDB and TiDB) is the active use of the Raft protocol to achieve consistency. So it will be important to understand how this protocol works to use it effectively. You can find a good overview of the Raft protocol here: http://container-solutions.com/raft-explained-part-1-the-consenus-problem/.

There probably are more NewSQL technologies than I have mentioned here, but I do not think any of them captured critical market- or mind-share. So we are at the beginning of interesting times . . .

What about MySQL? Can MySQL become the database that provides all these characteristics? It is possible, but I do not think it will happen anytime soon. MySQL would need to provide automatic sharding to do this, which will be very hard to implement given the current internal design. It may happen in the future, though it will require a lot of engineering efforts to make it work properly.

Categories: MySQL

Percona Toolkit 3.0.2 is now available

MySQL Performance Blog - Mon, 2017-03-27 18:48

Percona announces the availability of Percona Toolkit 3.0.2 on March 27, 2017.

Percona Toolkit is a collection of advanced command-line tools that perform a variety of MySQL and MongoDB server and system tasks too difficult or complex for DBAs to perform manually. Percona Toolkit, like all Percona software, is free and open source.

This release includes the following changes:

New Features
  • PT-73: Added support for SSL connections to pt-mongodb-summary and pt-mongodb-query-digest
  • 1642751: Enabled gathering of information about locks and transactions by pt-stalk using Performance Schema if it is enabled (Thanks, Agustin Gallego)
Bug Fixes
  • PT-74: Fixed gathering of security settings when running pt-mongodb-summary on a mongod instance that is specified as the host
  • PT-75: Changed the default sort order in pt-mongodb-query-digest output to descending
  • PT-76: Added support of & and # symbols in passwords for pt-mysql-summary
  • PT-77: Updated Makefile to support new MongoDB tools
  • PT-89: Fixed pt-stalk to run top more than once to collect useful CPU usage
  • PT-93: Fixed pt-mongodb-query-digest to make query ID match query key (Thanks, Kamil Dziedzic)
  • PT-94: Fixed pt-online-schema-change to not make duplicate rows in _t_new when updating the primary key. Also, see 1646713.
  • PT-101: Fixed pt-table-checksum to correctly use the --slave-user and --slave-password options. Also, see 1651002.
  • PT-105: Fixed pt-table-checksum to continue running if a database is dropped in the process

You can find release details in the release notes. Bugs can be reported on Toolkit’s launchpad bug tracker.

Categories: MySQL

$5 Percona Live keynote passes and $101 101 crash courses

MySQL Performance Blog - Fri, 2017-03-24 21:49

The Percona Live Open Source Database Conference 2017 in Santa Clara, California is just around the corner: April 24-27, 2017. We’re busy getting things ready to make sure everybody gets the most out of their time there. As part of that, we have some news and a couple of outstanding offers for you!

$5 Percona Live Keynote Passes and
$101 101 Crash Courses

Keynote Speakers
We are extremely pleased with our keynote speakers this year. We have a wide assortment of industry leaders and experts presenting a great set of topics, with more to come! Below are some of our keynote presentations:

Day 1

Day 2

Day 3

  • 9:00 AM – Peter Zaitsev, Percona: Closing Keynote
  • 9:30 AM – Jean Francois Gagne, Booking.com LT: Bookings-per-seconds
  • 9:50 AM – Community Award Ceremony

$5 Keynote Passes
To help make the keynotes and the community events accessible to the greatest number of community members, we are once again offering $5 Percona Live keynote passes for the Percona Live Open Source Database Performance Conference 2017.

A keynote pass provides access to the keynote addresses, Birds of a Feather sessions, the exhibit floor and the Community Networking Reception on Wednesday night. The first 100 people who register for an Expo-Only pass (new registrations only) using the discount code “KEY” will be able to register for just $5.

$101 101 Passes

For a limited time, you can get access to the 101 Crash Courses for only $101! Percona Live is once again hosting Crash Courses for developers, systems administrators, and other technical resources.

The MySQL 101 will be on Tuesday, April 25 and MongoDB 101 will be on Wednesday, April 26.

To learn more about our crash courses, read this blog. Register now using the following codes for your discount:

  • 101: $299 off either the MySQL or MongoDB tickets
  • 202: $498 off the combined MySQL/MongoDB ticket

This deal expires soon, so reserve your spot now!

All Percona Live registration options can be found here. Register now! See you at Percona Live Open Source Database Performance Conference 2017!

Make sure to follow us on Twitter using #PerconaLive, and to visit our conference website to stay up-to-date with the exciting announcements yet to come! We hope to see you in Santa Clara, CA in April!

Want to sponsor Percona Live? Booth selection for our limited sponsorship opportunities is on a first-come, first-served basis. Download the sponsorship prospectus.

Categories: MySQL

Percona Server for MySQL 5.7.17-12 is Now Available

MySQL Performance Blog - Fri, 2017-03-24 17:43

Percona announces the GA release of Percona Server for MySQL 5.7.17-12 on March 24, 2017. Download the latest version from the Percona web site or the Percona Software Repositories. You can also run Docker containers from the images in the Docker Hub repository.

Based on MySQL 5.7.17, including all the bug fixes in it, Percona Server for MySQL 5.7.17-12 is the current GA release in the Percona Server for MySQL 5.7 series. Percona provides completely open-source and free software. Find release details in the 5.7.17-12 milestone at Launchpad.

New Features:
  • Percona Server has implemented a new mysqldump --order-by-primary-desc option. This feature tells mysqldump to take the backup in descending primary key order (PRIMARY KEY DESC), which can be useful if the storage engine uses a reverse-order column family for the primary key.
  • mysqldump will now detect when MyRocks is installed and available by seeing if there is a session variable named rocksdb_skip_fill_cache and setting it to 1 if it exists.
  • mysqldump will now automatically enable session variable rocksdb_bulk_load if it is supported by the target server.
Bugs Fixed:
  • If the variable thread_handling was set to pool-of-threads in the MySQL configuration file, the server couldn’t be gracefully shut down. Bug fixed #1537554.
  • When innodb_ft_result_cache_limit was exceeded by internal memory allocated by InnoDB during the FT scan, not all memory was released, which could lead to a server assertion. Bug fixed #1634932 (upstream #83648).
  • Executing the FLUSH LOGS on a read-only slave with a user that doesn’t have the SUPER privilege would result in Error 1290. Bug fixed #1652852 (upstream #84350).
  • FLUSH LOGS was disabled with read_only and super_read_only variables. Bug fixed #1654682 (upstream #84437).
  • If a SHOW BINLOGS or PERFORMANCE_SCHEMA.GLOBAL_STATUS query and a transaction commit ran in parallel, they could deadlock. Bug fixed #1657128.
  • A long-running binary log commit would block SHOW STATUS, which in turn could block a number of other operations such as client connects and disconnects. Bug fixed #1646100.
  • Log tracking initialization did not find the last valid bitmap data correctly. Bug fixed #1658055.
  • A query using range scan with a complex range condition could lead to a server crash. Bug fixed #1660591 (upstream #84736).
  • Race condition between buffer pool page optimistic access and eviction could lead to a server crash. Bug fixed #1664280.
  • If the Audit Log Plugin was unable to create the file pointed to by audit_log_file, the server would crash during startup. Bug fixed #1666496.
  • A DROP TEMPORARY TABLE ... for a table created by a CREATE TEMPORARY TABLE ... SELECT ... would get logged in the binary log on a disconnect with mixed mode replication. Bug fixed #1671013.
  • TokuDB did not use an index even if its cardinality was good. Bug fixed #1671152.
  • Row-based replication events were not reflected in Rows_updated fields in the User Statistics INFORMATION_SCHEMA tables. Bug fixed #995624.
  • When the DuplicateWeedout strategy was used for joins, its use was not reported in the query plan info output extension for the slow query log. Bug fixed #1592694.
  • It was impossible to use column compression dictionaries with partitioned InnoDB tables. Bug fixed #1653104.
  • Diagnostics for OpenSSL errors have been improved. Bug fixed #1660339 (upstream #75311).

Other bugs fixed: #1665545, #1650321, #1654501, #1663251, #1659548, #1663452, #1670834, #1672871, #1626545, #1658006, #1658021, #1659218, #1659746, #1660239, #1660243, #1660348, #1662163 (upstream #81467), #1664219, #1664473, #1671076, and #1671123.

The release notes for Percona Server for MySQL 5.7.17-12 are available in the online documentation. Please report any bugs on the launchpad bug tracker.

Categories: MySQL

Percona Server 5.6.35-81.0 is Now Available

MySQL Performance Blog - Fri, 2017-03-24 17:25

Percona announces the release of Percona Server 5.6.35-81.0 on March 24, 2017. Download the latest version from the Percona web site or the Percona Software Repositories. You can also run Docker containers from the images in the Docker Hub repository.

Based on MySQL 5.6.35, and including all the bug fixes in it, Percona Server 5.6.35-81.0 is the current GA release in the Percona Server 5.6 series. Percona Server is open-source and free – this is the latest release of our enhanced, drop-in replacement for MySQL. Complete details of this release are available in the 5.6.35-81.0 milestone on Launchpad.

New Features:
  • Percona Server has implemented a new mysqldump --order-by-primary-desc option. This feature tells mysqldump to take the backup in descending primary key order (PRIMARY KEY DESC), which can be useful if the storage engine uses a reverse-order column for the primary key.
Bugs Fixed:
  • When innodb_ft_result_cache_limit was exceeded by internal memory allocated by InnoDB during the FT scan, not all memory was released, which could lead to a server assertion. Bug fixed #1634932 (upstream #83648).
  • Log tracking initialization did not find the last valid bitmap data correctly, potentially resulting in needless redo log retracking or a hole in the tracked LSN range. Bug fixed #1658055.
  • If the Audit Log Plugin was unable to create the file pointed to by audit_log_file, the server would crash during startup. Bug fixed #1666496.
  • A DROP TEMPORARY TABLE ... for a table created by a CREATE TEMPORARY TABLE ... SELECT ... would get logged in the binary log on a disconnect with mixed mode replication. Bug fixed #1671013.
  • TokuDB did not use an index even if its cardinality was good. Bug fixed #1671152.
  • Row-based replication events were not reflected in Rows_updated fields in the User Statistics INFORMATION_SCHEMA tables. Bug fixed #995624.
  • A long-running binary log commit would block SHOW STATUS, which in turn could block a number of other operations such as client connects and disconnects. Bug fixed #1646100.
  • It was impossible to use column compression dictionaries with partitioned InnoDB tables. Bug fixed #1653104.
  • Diagnostics for OpenSSL errors have been improved. Bug fixed #1660339 (upstream #75311).
  • When the DuplicateWeedout strategy was used for joins, its use was not reported in the query plan info output extension for the slow query log. Bug fixed #1592694.

Other bugs fixed: #1650321, #1650322, #1654501, #1663251, #1666213, #1652912, #1659548, #1663452, #1670834, #1672871, #1626545, #1644174, #1658006, #1658021, #1659218, #1659746, #1660239, #1660243, #1660255, #1660348, #1662163 (upstream #81467), #1664219, #1664473, #1671076, and #1671123.

Release notes for Percona Server 5.6.35-81.0 are available in the online documentation. Please report any bugs on the launchpad bug tracker.

Categories: MySQL

Percona Live Featured Session with Luís Soares: The New MySQL Replication Features in MySQL 8

MySQL Performance Blog - Fri, 2017-03-24 01:15

Welcome to another post in the series of Percona Live featured session blogs! In these blogs, we’ll highlight some of the session speakers that will be at this year’s Percona Live conference. We’ll also discuss how these sessions can help you improve your database environment. Make sure to read to the end to get a special Percona Live 2017 registration bonus!

In this Percona Live featured session, we’ll meet Luís Soares, Principal Software Engineer at Oracle. His session is The New MySQL Replication Features in MySQL 8 (with fellow presenter Lars Thalmann, Development Director at Oracle). The most popular high availability (HA) techniques deployed are based on making services redundant, in particular by means of replication. This fits quite naturally in the MySQL universe, as MySQL server has provided a mature replication solution for over a decade now. Moreover, the new replication developments (and their roadmap) show that MySQL is also catering for the requirements posed by popular environments such as the cloud.

I had a chance to speak with Luís about MySQL 8.0 replication:

Percona: How did you get into database technology? What do you love about it?

Luís: My background is in distributed systems, particularly in database replication, message passing technologies and fault-tolerance. It all started while I was taking my BSc in computer science. As I finished it, I felt very much drawn towards these subjects. That alone made me enroll in a Master’s course that allowed me to focus almost full time on database replication using group communication technologies. I continued to study this field for years, and further deepened my knowledge on this area. That was great fun and quite a learning experience!

Years went by, and eventually I ended up working at MySQL on the replication team. This happened after I came to a MySQL user conference to present some of the work that I was doing at the time.

These are very fond memories! But I digress!

Back to the point. In general, the thing I love about working on database replication is that I am constantly facing new and interesting problems. Data replication in itself is hard. Add to that the semantics and requirements of a database server, and complexity increases quite a bit. Also, building a generic database replication service that fits in a large set of use cases requires a lot of discipline and careful thinking when designing new features. And let’s not forget the Web itself, which is constantly changing. New technologies come and go at a fast pace. The volume of data that has to be handled, year after year, increases considerably. This poses scalability and integration challenges that need to be addressed.

All in all, these are very exciting times to work with high availability, data replication and data integration.

Now specifically about MySQL, I love the fact that I work on a popular database technology that embraced replication very early in its life cycle. Replication awareness runs deep in the product and in its ecosystem. Consequently, MySQL has an extensive user base exploring many different use case scenarios around replication. And this is extremely motivating, rewarding and exciting. I can honestly say that my day-to-day work is never boring!

Percona: Your talk is called The New MySQL Replication Features in MySQL 8. What are the key replication features in MySQL 8.0, and why are they important?

Luís: It was a huge amount of work to get the MySQL Group Replication plugin out with MySQL 5.7.17. Group Replication is a new plugin that gives the user some nice replication properties by building on group communication and state machine replication. It protects data against split-brain situations, enables fault tolerance and high availability, and coordinates the servers that commit transactions changing the data.

In addition to Group Replication, the team has also invested quite a bit in core replication features. Some of these features have already been released, and others will be released in a future MySQL DMR.

In the first 8.0 DMR (MySQL 8.0.0), replication gained better instrumentation for row-based replication: the user can observe the row-based replication applier's progress by querying Performance Schema tables. There is also enhanced global transaction identifier (GTID) history management, as the user can now set the GTID_PURGED variable in scenarios other than those where the server has an empty GTID execution history. And the user can now specify a stop condition when starting the relay log applier, even if multiple applier threads are started.
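
As a rough illustration of the GTID_PURGED change described above (a hedged sketch only; the exact rules are in the MySQL 8.0 reference manual, and the UUID below is just a placeholder):

-- MySQL 8.0 allows appending to gtid_purged (note the leading '+'),
-- even when the server's GTID execution history is not empty:
SET @@GLOBAL.gtid_purged = '+3e11fa47-71ca-11e1-9e33-c80aa9429562:1-100';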

All these features combined are of great help, since they reduce operations overhead through automation, better observability and coordination between servers.

Work continues on many fronts: performance, availability, scalability, efficiency and observability. Stay tuned!

Percona: How do these features make DBAs lives easier? What problems do they solve?

Luís: As mentioned above, the features in MySQL 8.0.0 take some of the operations burden from the DBA. Moreover, they allow the user to better observe what is happening inside the replication pipeline. This alone is quite interesting, since DBAs need to make decisions both when designing new deployments and when tackling issues, possibly having to meet very tight deadlines.

Simply put, these features will help DBAs to diagnose and fix problems faster.

Percona: What do you want attendees to take away from your session? Why should they attend?

Luís: Our session is primarily about the shiny new replication features already in MySQL 8. This is the first takeaway. To know, first hand, what is in MySQL 8 replication-wise. But there is another takeaway, and quite an interesting one. Part of the session is dedicated to presenting the overall ideas around MySQL replication. So attendees will get an overview of the roadmap, and will be able to participate and provide feedback along the way. They will learn more about the big picture, and we will bring together some of the hot MySQL technologies that we keep hearing about nowadays: Group Replication, InnoDB Clusters, Multi-Threaded Replication and more!

It will be fun.

Percona: What are you most looking forward to at Percona Live 2017?

Luís: As a conference participant, I look forward to doing some networking with the vibrant MySQL community. I must say that I really enjoy engaging in nice technical discussions about my favorite topics: fault-tolerance, replication, dependability and distributed systems overall. The conference gives me a great opportunity to do this.

As a MySQL developer, and one that has been developing MySQL replication for quite some time now, I look forward to talking about the recent work that my team has done and getting all the feedback I can.

As a bystander, conferences like Percona Live make me realize how much MySQL has grown, and how much it has evolved. Replication, for instance, has gained so many interesting features, release after release, over the last eight or nine years. The community has embraced and deployed them, and has often worked with the developers to improve them by providing feedback, feature requests or contributions. And this means that they are part of the story too!

These conferences are always a great learning experience! After spending a week with the MySQL community, I always feel refreshed, energized, extra motivated and with lots of food for thought when I get back home.

Go MySQL!

Register for Percona Live Data Performance Conference 2017, and see Luís present his session on The New MySQL Replication Features in MySQL 8 (with fellow presenter Lars Thalmann, Development Director at Oracle). Use the code FeaturedTalk and receive $100 off the current registration price!

Percona Live Data Performance Conference 2017 is the premier open source event for the data performance ecosystem. It is the place to be for the open source community, as well as businesses that thrive in the MySQL, NoSQL, cloud, big data and Internet of Things (IoT) marketplaces. Attendees include DBAs, sysadmins, developers, architects, CTOs, CEOs, and vendors from around the world.

The Percona Live Data Performance Conference will be April 24-27, 2017 at the Hyatt Regency Santa Clara & The Santa Clara Convention Center.

Categories: MySQL

Running Percona XtraDB Cluster on Windows … in Docker

MySQL Performance Blog - Thu, 2017-03-23 18:03

In this blog post, we’ll look at how to run Percona XtraDB Cluster on Windows using Docker.

This is a follow-up to my previous post on Percona XtraBackup on Windows. The fact is that with Docker you can now run a variety of software applications on Windows that previously were available only for Linux platforms.

We can run several nodes of Percona XtraDB Cluster (for evaluation and testing purposes) on a single Windows box.

The steps for this are:

  1. Setup Docker on the Windows box.
    https://docs.docker.com/docker-for-windows/install
  2. Create a Docker network.
    docker network create net1
  3. Bootstrap the cluster.
    docker run -e MYSQL_ROOT_PASSWORD=test -e CLUSTER_NAME=cl1 --name=node1 --net=net1 perconalab/percona-xtradb-cluster
  4. Join the nodes.
    docker run -e MYSQL_ROOT_PASSWORD=test -e CLUSTER_NAME=cl1 -e CLUSTER_JOIN=node1 --net=net1 perconalab/percona-xtradb-cluster

Repeat step 4 as many times as you want.

The result: the cluster is now running on Windows!
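
To confirm that the nodes really joined, connect to any node (for example, with docker exec -ti node1 mysql -uroot -ptest, matching the password used above) and check the Galera status counters. A minimal sanity check:

SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';   -- should match the number of containers you started
SHOW GLOBAL STATUS LIKE 'wsrep_cluster_status'; -- should report 'Primary'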

Categories: MySQL

The Puzzling Performance of the Samsung 960 Pro

MySQL Performance Blog - Thu, 2017-03-23 00:14

In this blog post, I’ll take a look at the performance of the Samsung 960 Pro SSD NVME.

First, I know the Samsung 960 Pro is a consumer SSD NVME drive, not intended for sustained data center workloads. But the AnandTech review looked good enough that I decided to take it for a test spin to see if it would work well with MySQL benchmarks.

Before that, I decided to run a simple sysbench file IO test to see how the drive handles a sustained workload, and whether it would start acting up.

My expectation for a consumer SSD drive is that its write consistency will suffer. Many of these drives can sustain high write bursts for short periods of time, but have to slow down to keep up with wear leveling (and the other internal activities SSDs must do). This is not what I saw, however.

I ran the benchmark on an Intel Xeon E5-2630L v3 with 64GB of RAM, Ubuntu 16.04 LTS, an XFS filesystem, and a Samsung 960 Pro 512GB (FW: 1B6QCXP7):

sysbench --num-threads=64 --max-time=86400 --max-requests=0 --test=fileio --file-num=1 --file-total-size=260G --file-io-mode=async --file-extra-flags=direct --file-test-mode=rndrd run

Note: I used asynchronous direct IO to keep it close to how MySQL (InnoDB) submits IO requests.

This is what the “Read Throughput” graph looks like in Percona Monitoring and Management (PMM):

As you can see, in addition to some reasonable ebbs and flows we have some major dips from about 1.5GB/sec of random reads to around 800MB/sec. This almost halves the performance. We can clearly see two of those dips, with the third one starting when the test ended.  

What is really interesting is that as I did a read-write test, it performed much more uniformly:

sysbench --num-threads=64 --max-time=86400 --max-requests=0 --test=fileio --file-num=1 --file-total-size=260G --file-io-mode=async --file-extra-flags=direct --file-test-mode=rndrw run

Any ideas on what the cause of such strange periodic IO performance regression for reads could be?

This does not look like thermal throttling. It is much too regular for that (and I checked the temperature – it wasn't any different during this performance regression).

One theory I have is “read disturb management”: could the SSD need to rewrite the data after so many reads? By my calculations (roughly 1.5GB/sec sustained over the eight hours between those gaps is about 43TB read, or some 166 passes over the 260GB file), every cell is read some 166 times between the dips. This doesn't sound like a lot.

What are your thoughts?

Categories: MySQL

Percona Server for MySQL 5.5.54-38.7 is Now Available

MySQL Performance Blog - Wed, 2017-03-22 17:26

Percona announces the release of Percona Server for MySQL 5.5.54-38.7 on March 22, 2017. Based on MySQL 5.5.54, including all the bug fixes in it, Percona Server for MySQL 5.5.54-38.7 is now the current stable release in the 5.5 series.

Percona Server for MySQL is open-source and free. You can find release details in the 5.5.54-38.7 milestone on Launchpad. Downloads are available here and from the Percona Software Repositories.

Bugs Fixed:
  • Log tracking initialization did not find the last valid bitmap data correctly, potentially resulting in needless redo log retracking or a hole in the tracked LSN range. Bug fixed #1658055.

Other bugs fixed: #1652912, and #1655587.

Find the release notes for Percona Server for MySQL 5.5.54-38.7 in our online documentation. Report bugs on the launchpad bug tracker.

Categories: MySQL

Dropping the Foreign Key Constraint Using pt-online-schema-change

MySQL Performance Blog - Tue, 2017-03-21 22:35

In this blog post, we’ll look at how to get rid of an unused Foreign Key (FK) constraint and/or related columns/keys with the help of pt-online-schema-change and the power of its plugins.

Before we proceed, here is a useful blog post written by Peter Zaitsev on Hijacking Innodb Foreign Keys.

If you are trying to get rid of an unused foreign key (FK) constraint and related columns on versions older than MySQL 5.6, or on tables that cannot be altered with ALTER TABLE ... ALGORITHM=INPLACE because of the limitations mentioned here (specifically, tables still using the 5.5 TIMESTAMP format), you can use pt-online-schema-change.
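
One quick way to check whether a given change can be done in place at all is to request the algorithm explicitly: if the server cannot perform the operation with ALGORITHM=INPLACE (for example, because the table still carries old-format temporal columns), the statement fails immediately instead of silently falling back to a table copy. A hedged sketch, with a table and column name that are purely illustrative:

ALTER TABLE legacy_table ADD COLUMN note VARCHAR(32), ALGORITHM=INPLACE, LOCK=NONE;
-- An ER_ALTER_OPERATION_NOT_SUPPORTED_REASON error here means an online
-- in-place ALTER is off the table, and pt-online-schema-change is the way to go.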

Dropping a foreign key (DROP FOREIGN KEY constraint_name) with pt-online-schema-change requires specifying _constraint_name rather than the real constraint_name. This is due to a limitation in MySQL: pt-online-schema-change adds a leading underscore to foreign key constraint names when creating the new table. Here is a simple example of one such case:

CREATE TABLE `test3` (
  `Id` int(11) NOT NULL DEFAULT '0',
  `Firstname` varchar(32) DEFAULT NULL,
  `City` varchar(32) DEFAULT NULL,
  PRIMARY KEY (`Id`),
  CONSTRAINT `FKID` FOREIGN KEY (`Id`) REFERENCES `test4` (`Id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

To drop the constraint, we are supposed to add an underscore prior to constraint_name FKID:

[root@siddhant ~]# pt-online-schema-change --user=root --execute --set-vars=foreign_key_checks=0 --alter-foreign-keys-method=rebuild_constraints --alter="DROP FOREIGN KEY _FKID" D=apps02,t=test3 --socket=/tmp/mysql-master5520.sock
Operation, tries, wait:
  analyze_table, 10, 1
  copy_rows, 10, 0.25
...
Altering `apps02`.`test3`...
Creating new table...
Created new table apps02._test3_new OK.
Altering new table...
...
2017-02-11T12:45:12 Dropped old table `apps02`.`_test3_old` OK.
2017-02-11T12:45:12 Dropping triggers...
2017-02-11T12:45:12 Dropped triggers OK.
Successfully altered `apps02`.`test3`.
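
If you want to double-check which foreign key constraint names actually ended up on the table (before or after a run like the one above), you can query information_schema. A quick sketch for the schema used in this example:

SELECT CONSTRAINT_SCHEMA, CONSTRAINT_NAME, TABLE_NAME, REFERENCED_TABLE_NAME
FROM information_schema.REFERENTIAL_CONSTRAINTS
WHERE CONSTRAINT_SCHEMA = 'apps02' AND TABLE_NAME = 'test3';
-- An empty result after the run above means the FKID constraint is gone.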

There is one case where this approach breaks down: if, for some reason, you already have an FK constraint whose name starts with an underscore, the above method of adding one more underscore to the already underscored name will fail with an error while dropping it:

Error altering new table `apps02`.`_test3_new`: DBD::mysql::db do failed: Error on rename of './apps02/_test3_new' to './apps02/#sql2-697-19' (errno: 152) [for Statement "ALTER TABLE `apps02`.`_test3_new` DROP FOREIGN KEY ___FKID"] at /usr/bin/pt-online-schema-change line 9069.

In such cases, we have to use the --plugin option together with a file that defines the pt_online_schema_change_plugin class and an after_alter_new_table hook that drops the FK constraint. For example, here is a table with an FK constraint whose name starts with underscores:

CREATE TABLE `test` (
  `Id` int(11) NOT NULL DEFAULT '0',
  `Firstname` varchar(32) DEFAULT NULL,
  `City` varchar(32) DEFAULT NULL,
  PRIMARY KEY (`Id`),
  CONSTRAINT `___fkId` FOREIGN KEY (`Id`) REFERENCES `test2` (`Id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

Here we have a table with the foreign key ___fkId, whose name uses three underscores. Our plugin for dropping the constraint should be as follows:

[root@siddhant ~]# cat ptosc_plugin_drop_fk.pl
package pt_online_schema_change_plugin;
use strict;

sub new {
   my ($class, %args) = @_;
   my $self = { %args };
   return bless $self, $class;
}

sub after_alter_new_table {
   my ($self, %args) = @_;
   my $new_tbl = $args{new_tbl};
   my $dbh     = $self->{cxn}->dbh;
   my $sth = $dbh->prepare("ALTER TABLE $new_tbl->{name} DROP FOREIGN KEY __fkId");
   $sth->execute();
}

1;

NOTE: The DROP FOREIGN KEY statement in the plugin uses one underscore less than the original foreign key constraint name (__fkId vs. ___fkId). Also, the ALTER statement itself will be a no-op ALTER (i.e., --alter="ENGINE=INNODB").

Here is the pt-online-schema-change execution example with the plugin.

[root@siddhant ~]# pt-online-schema-change --user=root --execute --set-vars=foreign_key_checks=0 --alter-foreign-keys-method=rebuild_constraints --alter="ENGINE=INNODB" --plugin=/root/ptosc_plugin_drop_fk.pl D=apps01,t=test --socket=/tmp/mysql-master5520.sock
Created plugin from /root/ptosc_plugin_drop_fk.pl.
Operation, tries, wait:
  analyze_table, 10, 1
  copy_rows, 10, 0.25
  create_triggers, 10, 1
  drop_triggers, 10, 1
  swap_tables, 10, 1
  update_foreign_keys, 10, 1
Altering `apps01`.`test`...
Creating new table...
Created new table apps01._test_new OK.
Altering new table...
Altered `apps01`.`_test_new` OK.
2017-02-11T11:26:14 Creating triggers...
2017-02-11T11:26:14 Created triggers OK.
2017-02-11T11:26:14 Copied rows OK.
2017-02-11T11:26:14 Swapping tables...
2017-02-11T11:26:14 Swapped original and new tables OK.
2017-02-11T11:26:14 Dropping old table...
2017-02-11T11:26:14 Dropped old table `apps01`.`_test_old` OK.
2017-02-11T11:26:14 Dropping triggers...
2017-02-11T11:26:14 Dropped triggers OK.
Successfully altered `apps01`.`test`.

Categories: MySQL

Webinar Wednesday March 22, 2017: TokuDB Troubleshooting

MySQL Performance Blog - Tue, 2017-03-21 14:02

Please join Percona’s Principal Technical Services Engineer, Sveta Smirnova, Senior Software Engineer, George Lorch and Software Engineer, Vlad Lesin as they present TokuDB Troubleshooting on March 22, 2017 at 11:00 am PDT / 2:00 pm EDT (UTC-7).

TokuDB is an alternative storage engine, designed for big data applications. It provides great write workload scalability. While its feature set is similar to InnoDB's, this engine has its own specific configuration settings and troubleshooting instruments. This webinar will look at how to use them for optimal performance.

We will discuss how to deal with:

  • Data corruption issues
  • Inconsistent data
  • Locks
  • Slow performance

We will use well-known instruments and tools, and show how they work with the TokuDB storage engine.

Register for the webinar here.

Vladislav Lesin, Software Engineer

Vladislav Lesin is a software engineer at Percona, having joined in April 2012. Before coming to Percona, he worked on improving the performance and reliability of high-load projects with LAMP architectures. His work included developing fast servers and modules in C and C++, monitoring project state, finding bottlenecks, and patching open source projects including nginx, memcache, sphinx, php and ejabberd. He took part in developing not only server-side applications, but desktop and mobile ones too. He also has experience in project/product management, hiring, and partner negotiations.

Before that, he worked at several IT companies, where he developed desktop applications in C++ for areas such as industrial automation, parallel computing and media production. He holds a Master's Degree in Technique and Technology from Tula State University. He now lives in Tula with his wife and daughter.

Sveta Smirnova, Principal Technical Services Engineer

Sveta joined Percona in 2015. Her main professional interests are problem-solving, working with tricky issues, bugs, finding patterns that can solve typical issues quicker and teaching others how to deal with MySQL issues, bugs and gotchas effectively. Before joining Percona Sveta worked as a Support Engineer in the MySQL Bugs Analysis Support Group in MySQL AB-Sun-Oracle.

She is the author of the book “MySQL Troubleshooting” and JSON UDF functions for MySQL.

George Lorch, Software Engineer

George joined the Percona development team in April 2012. George has over 20 years of experience in software support, development, architecture and project management. Prior to joining Percona, George was focused on Windows-based enterprise application server development and network protocol classification and optimization, with heavy doses of database schema design, architecture and tuning.

Categories: MySQL

Running Percona XtraBackup on Windows … in Docker

MySQL Performance Blog - Mon, 2017-03-20 23:06

In this blog, we’ll look at running Percona XtraBackup on Windows via a Docker container.

The question whether Percona XtraBackup is available for Windows comes up every so often. While we are not planning to provide regular releases for Windows, I decided to share a way to run Percona XtraBackup in a Docker container (especially since Docker support for Windows has become more and more stable).

For this exercise, I created a playground Docker image: perconalab/percona-xtrabackup.

First, we need to prepare a few things to make it work:

  1. Install Docker on Windows (the current version I am running is 17.03)
  2. Enable the sharing of disk C in Docker settings
  3. Find out the IP address MySQL is running on (192.168.1.122 in my case)
  4. Grant backup-required privileges for the xtrabackup user:

GRANT RELOAD,PROCESS,LOCK TABLES,REPLICATION CLIENT ON *.* TO 'xtrabackup'@'192.%' IDENTIFIED by 'xtrapassword'

Now, let’s assume our datadir is in C:/mysqldata, and we want to back up to C:/mysqlbackup. Needless to say, the XtraBackup image must run on the same server as MySQL’s datadir (since XtraBackup needs to access the data in order to copy it).
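
If you are not sure where the datadir actually lives before wiring up the volume mounts, the quickest way is to ask the server itself:

SELECT @@datadir;
-- or, equivalently:
SHOW VARIABLES LIKE 'datadir';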

Now to take a backup we execute:

docker run --rm -it -v //C/mysqldata:/var/lib/mysql -v //C/mysqlbackup:/xtrabackup_backupfiles perconalab/percona-xtrabackup --backup --host=192.168.1.122 --user=xtrabackup --password=xtrapassword

We find our backup in C:/mysqlbackup when it is done.

Enjoy!

Categories: MySQL

Prophet: Forecasting our Metrics (or Predicting the Future)

MySQL Performance Blog - Mon, 2017-03-20 19:54

In this blog post, we’ll look at how Prophet can forecast metrics.

Facebook recently released a forecasting tool called Prophet. Prophet can forecast a particular metric we are interested in. It works by fitting time-series data in order to predict how that metric will look in the future.

For example, it could be used to:

  • Predict how much HTTP traffic we will get, and scale accordingly when needed
  • See if a particular feature of our application will have success or if its usage will decline
  • Get an approximate date when our database server’s resources will be exhausted
  • Forecast new customer sign-ups and resize the staff accordingly
  • See what next year’s Black Friday or Cyber Monday will look like, and if we have the resources to handle them
  • Predict how many animals will enter a shelter in the coming years, as I did in a personal project I will show here

At its core, it uses a Generalized Additive Model. It is basically the merging of two models. First, a generalized linear model that, in the case of Prophet, can be a linear or logistic regression (depending on what we choose). Second, an additive model applied to that regression. The final graph represents the combination of those two. That is, the smoothed regression area of the variable to predict. For more technical details of how it works, check out Prophet’s paper.
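
Concretely, the model described in the Prophet paper decomposes the series into a trend term, a seasonal term and a holiday term, plus noise:

y(t) = g(t) + s(t) + h(t) + ε(t)

where g(t) is the (linear or logistic) trend, s(t) the periodic seasonality, h(t) the effect of holidays, and ε(t) the error term.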

Most of the previous points can be summarized in a simple concept, capacity planning. Let’s see how it works.

Usage Example

Prophet provides either a Python or R library. The following example will use the Python one. You can install it using:

pip install prophet

Prophet expects the metric to come in a particular structure: a Pandas DataFrame with two columns, ds and y:

     ds           y
0    2013-10-01   34
1    2013-10-02   43
2    2013-10-03   20
3    2013-10-04   12
4    2013-10-05   46
...  ...          ...

The data I am going to use here is from Kaggle Competition Shelter Animal Outcomes. The idea is to find out how Austin Animal Center‘s workload will evolve in the future by trying to predict the number of animal outcomes per day for the next three years. I am using this dataset because it has enough data, shows a very simple trend and it is a non-technical metric (no previous knowledge on the topic is needed). The same method can be applied to most of the services or business metrics you could have.

At this point, we have the metric stored in a local variable, called “series” in this particular example. Now we only need to fit the model to it:

from prophet import Prophet  # the package installed above; older releases shipped it as "fbprophet"

m = Prophet()
m.fit(series)

and define how far into the future we want to predict (three years in this case):

future = m.make_future_dataframe(periods=365*3)

Now we generate the forecast and plot it:

import matplotlib.pyplot as plt

forecast = m.predict(future)  # predict over the extended dataframe
m.plot(forecast)
plt.title("Outcomes forecast per Year", fontsize=20)
plt.xlabel("Year", fontsize=20)
plt.ylabel("Number of outcomes", fontsize=20)
plt.show()

The graph shows a smoothed regression surface. We can see that the data provided covers from the last months of 2013 to the beginning of 2016. From that point on, the values are predictions.

We can already see some interesting things. Our data shows a large increase during the summer months, and the model predicts that this will continue in the future. But this representation also has some problems. As we can see, there are at least three outliers with values > 65. The fastest way to deal with outliers is to simply remove them.

Categories: MySQL

Defining Moments in Database History

Xaprb, home of innotop - Sun, 2017-03-19 20:59

The rise of the LAMP stack in the early- to mid-2000s created a shift in the technology landscape, as well as the impetus for contenders to emerge. I’ve been reflecting on key factors in that phenomenon and what’s happened since then—and what it can teach us about what’s happening now.

What was it about the LAMP stack, anyway? All of the ingredients in that stack were interesting and signaled tectonic shifts (or were the result of them), but I think MySQL in particular was the bellwether for today’s database trends.

MySQL is the database that came to power much of the Internet as we know it today. MySQL was remarkable for many reasons, although it’s easy to forget them in hindsight. It wasn’t the first or perhaps even the best open source database, but it was just enough better that it became the best for the situation at hand. And ultimately it became a commercial success that even in hindsight seems improbable.

I’ve thought of many necessary conditions for MySQL to flourish in 2000-2010. The big question is which combination of those conditions were sufficient. I am not certain of the answer, but I’m certain the answer is plural.

And yet, partially because of its enormous popularity, MySQL helped spur the rise of NoSQL in 2008-2009. These databases sought to define a new moment in database history: one in which legacy relational technology would finally be replaced by an utterly new generation. The disruptor was being disrupted.

Where do we find ourselves today? Relational implementations rapidly improved (enter NewSQL), and NoSQL was backronymed to mean “not only SQL” instead of being a rejection of SQL. Many NoSQL databases today sport SQL-like languages.

Was NoSQL just a flare-up? Is there a real need for next-generation data storage and processing? Or is good old relational going to improve and obviate every next-gen data technology anyway?

I believe relational will endure, and continue to evolve to address new use cases, but is already past its heyday of complete dominance. I see a few current trends, and I’m sure that at least some of them will become equally enduring. I think we are seeing historic shifts in database technology emerge right now.

Next-Generation General-Purpose Databases

Relational, and SQL, are painful. SQL is a Yoda language that causes a lot of problems. It obscures intent, introduces illogical logic such as tri-valued truth, prompts books from the likes of Celko and Date about the small subset of how to do it right, and creates endless opportunities for the server to do things you didn’t intend and cause incredibly subtle bugs and performance disasters.

Not least, SQL is practically an open sore when it’s written in a program. Think about it: you’ve got this nice strictly-typed language with all sorts of compiler guarantees, and in the middle of it is a meaningless string blob that isn’t compiled, syntax-checked, or type-checked. It is bound to a foreign source of data through an API that isn’t knowable to the program or compiler, and may change without warning. It’s “I give up, random potentially correct garbage of dubious meaning goes here.” It’s the equivalent of an ugly CDATA in an XML document.

This should present significant opportunities for improvement. One can imagine a number of sensible first steps to take: find a way for the program and the database to use the same language and toolset; design a database query language that works similarly to a programming language; memory-map the database into the program; and so on. Problems begin immediately, and indeed the relational model was created to solve many of those issues—issues that have been happily and naively reinvented ever since. Those who are ignorant of history are doomed to repeat it.

But into this fray waded a brave new generation in 2009, with map-reduce databases, key-value databases, Javascript databases, and so forth. There were even some databases that were overall platforms, such as Firebase. All with some good ideas, all going in some productive direction, all with at least some aspects that could be legitimately criticized.

I’ve been an avid student of emerging databases and have even been seen as a champion of some of them. A while ago I predicted that MongoDB, Redis, and Riak would survive in a meaningful way. Of these, Riak seems to have been sidelined, but MongoDB and Redis are going strong.

Which other NoSQL databases have had impact on par with those two? Perhaps Cassandra, and arguably Neo4J, but both of those are less mainstream. MongoDB and Redis are ubiquitous.

Why? It’s instructive to look at the problems they solve. Redis starts with a simple conceptual foundation: label a piece of data, then you can use the label to fetch and manipulate the data. The data can be richly structured in ways that are familiar to programmers, and the operations you can perform on these structures are a Swiss Army knife of building blocks for applications. The types of things that otherwise force you to write boilerplate code or build frameworks. Redis focuses on doing these things well, and doesn’t try to solve a lot of other problems.

MongoDB also starts with a simple concept, essentially that databases should store nested, structured “documents” that can map directly to the structs or objects you use in your programming language. And on top of this, MongoDB adds another power tool: the programming language you use to query the database is the ubiquitous JavaScript, arguably the most popular and flexible programming language today. There’s much more, too, such as built-in scalability so you don’t have to build “sharding” into your app. (Anyone who’s done that knows that you’re actually building a new custom database in your app code.)

Many of the NoSQL databases that sprang up like weeds in 2009 didn’t solve these types of problems in these kinds of ways. For example, Cassandra solved the scalability problem, but gave the programmer only limited expressive power. Ultimately, a highly scalable but not very “powerful” database can be less attractive than one that acts as a force multiplier for programmer productivity. To a first approximation, high scalability is a tech ops force multiplier, and devs outnumber ops tenfold.

Perhaps this is what makes Redis and MongoDB endure. I don’t know, but I am sure it’s part of what makes them a joy to use. And for better or for worse, from where I sit they seem to be the most viable answer to the proposition “a more modern database is a practical and useful thing to create.”

Time Series Databases

Another distinct emerging category is time series databases. These databases store facts with timestamps, and treat the time as a native and essential part of the data model. They allow you to do time-based analysis. Not only that, they really view temporal queries as central. Many of them even make time a mandatory dimension of any query.

I wrote extensively about time series databases previously. For example, I argued that the world is time series and I shared my requirements for a time series database a bit later. (That latter article is not something I agree with fully today).

InfluxDB is on a very steep growth trajectory as it seeks to define what it means for a database to be natively time oriented, and answer the question of whether that is enough for a database, or if there’ll be a “last mile problem” that will make people want some of the stuff they can get from other types of databases too. Defining the boundaries of a database’s functionality is hard. But InfluxDB seems to be doing an admirable job of it.

An alternative is ElasticSearch, which offers time series functionality in some ways, but not as the sole and central concept. It’s really a distributed search engine that knows about time. This quite naturally and properly raises the question: if you’re going to use a non-time-series database that knows about time, why use a search engine? Why not a relational database that has time series functionality?

There are many, many others. Time will tell what survives and what set of problems is worth solving and leaves nothing essential unsatisfied. I’d bet on InfluxDB at this point, personally. But one thing is certain: time series is important enough that first-class time series databases are necessary and worthwhile. It’s not enough to foist this use case onto another “yeah we do that too” database.

Stream-Oriented Databases

The other enduring standalone category I see today is stream-oriented, pub-sub, queueing, or messaging—choose your terminology; they’re different but related. These databases are essentially logs or buses (and some of them have names that indicate this). Instead of permanently storing the data and letting you retrieve and mutate it, the concept is insertion, immutable storage in order, and later reading it out again (potentially multiple times, potentially deleting on retrieval).

Why would you want this? It’s not obvious at first glance, but this “river of data, from which everything in the enterprise can drink” architecture is at once enormously powerful and enormously virtuous. It enables data processing patterns that otherwise require contortions and great effort, but makes them clean and easy.

The typical enterprise data architecture quickly becomes a nightmare spaghetti tangle. Data flows through the architecture in weird ways that are difficult to understand and manage. And problems like performance, reliability, and guarantees about hard things such as processing order, are prime motivators for a lot of complexity that you can solve or avoid with a queue or streaming database.

There are a lot of concepts related to these databases and their interplay with other types of database; too many to list here. I’ll just say that it’s a fundamental mindset shift, similar to the type of epiphany you get the first time you really understand purely functional programming. For example, you suddenly want to abolish replication forevermore, and you never want anything to poll or batch process again, ever.

Lots of technologies such as Spark are emerging around these areas. But in my view, Apache Kafka is the undisputed game-changer. It’s truly a watershed technology. Rather than try to explain why, I’ll just point you to the commercial company behind Kafka, Confluent. Read their materials. I know many of the people working there; they are genuine, smart, and it’s not marketing fluff. You can drink from their well. Deeply.

Conclusions

If anyone thought that NoSQL was just a flare-up and it’s died down now, they were wrong. NoSQL did flare up, and we did see a lot of bad technology emerge for a time. But the pains and many of the solutions are real. A key determinant of what’ll survive and what’ll be lost to history is going to be product-market fit. In my opinion, three important areas where markets aren’t being satisfied by relational technologies are relational and SQL backwardness, time series, and streaming data. Time will tell if I’m right.

Categories: MySQL