Percona Server for MongoDB 3.0.12-1.7 is now available

MySQL Performance Blog - Fri, 2016-06-24 16:14

Percona announces the release of Percona Server for MongoDB 3.0.12-1.7 on June 24, 2016. Download the latest version from the Percona web site or the Percona Software Repositories.

Percona Server for MongoDB 3.0.12-1.7 is an enhanced, open source, fully compatible, highly scalable, zero-maintenance downtime database supporting the MongoDB v3.0 protocol and drivers. Based on MongoDB 3.0.12, it extends MongoDB with MongoRocks and PerconaFT storage engines, as well as features like external authentication and audit logging. Percona Server for MongoDB requires no changes to MongoDB applications or code.

NOTE: The MongoRocks storage engine is still under development. There is currently no officially released version of MongoRocks that can be recommended for production.

This release includes all changes from MongoDB 3.0.12, and the following known issue that will be fixed in a future release:

  • The --version does not correctly report the software version. The effected binaries are:
    • bsondump
    • mongodump
    • mongoexport
    • mongofiles
    • mongoimport
    • mongooplog
    • mongorestore
    • mongostat
    • mongotop

The release notes are available in the official documentation.


Categories: MySQL

Migrate from MS SQL Server to MySQL

MySQL Performance Blog - Thu, 2016-06-23 22:56

In this blog series, I will share my experiences as I migrate commercial databases (i.e., Microsoft SQL or Oracle) to open source (MySQL). More specifically, we will look at how you can migrate from MS SQL Server to MySQL.

For this first blog post I’ve chosen Jira database as an example, and used three different tools to migrate Jira database in Microsoft SQL Server to MySQL:

  1. MySQL Workbench (opensource)
  2. Amazon DMS (cloud tool)
  3. Ispirer MnMTK 2015 (commercial tool)

When I started my research, I was under the impression that Jira database would be easy to migrate (no stored procedures, no triggers, etc.). It turned out that there were some problems that I was able to fix.

One of the reasons I chose Jira as opposed to some standard MS SQL database (such as AdventureWorks2014) is that it is a non-standard choice. Most of the software vendors use standard databases to test their software, and it works perfectly on those standard databases. Jira is not a usual choice and will be closer to real life.

MySQL Workbench

MySQL Workbench supports Microsoft SQL Server migration. The migration is straightforward except the issues with character sets. I have experienced the error “Could not successfully convert UCS-2 string to UTF-8”.

It turns out (with the help of Martin Brennan’s blog) that we will need to use “ODBC (FreeTDS)” drive for MS SQL, and enable sending Unicode data as UTF8:

After changing those settings, I was able to successfully migrate Jira database from MS SQL to MySQL.

Advantages and disadvantages:

  • Plus: free and open source tool, multi-platform
  • Plus: successful migration for Jira
  • Plus: supports multi-threaded migrations (increase worker tasks if needed, default value is 2)
  • Minus: needed some tweaks to work with character sets
  • Minus: not very easy to debug errors

Amazon DMS

AWS Database Migration Service supports migrating from MS SQL to MySQL, but the actual migration method is different from other tools. It uses the source database (MS SQL server in this case) replication feature to stream the data to the target database (MySQL). Amazon DMS starts a temporary “migration” instance that is used to stream data. Both the source and destination database can be in AWS (EC2 or RDS) or outside AWS (no restriction).

The important limitation for MS SQL migration: it only works with MS SQL Server versions that support replication (subscription service). It doesn’t work with MS SQL Express edition. Also, if the subscription service is not enabled the DMS can’t even see the schema(s) to migrate (full list of limitations for MS SQL Server migration).

I’ve also gotten errors around the “constraint” name being too long:

2016-04-02T18:20:23 [TARGET_LOAD ]E: Failed to execute statement: 'ALTER TABLE `dbo`.`AO_38321B_CUSTOM_CONTENT_LINK` ADD CONSTRAINT `AO_38321B_CUSTOM_CONTENT_LINK_pk_AO_38321B_CUSTOM_CONTENT_LINK_ID` PRIMARY KEY ( `ID` )' [122502] ODBC general error. (ar_odbc_stmt.c:4048)

The problem here is that “AO_38321B_CUSTOM_CONTENT_LINK_pk_AO_38321B_CUSTOM_CONTENT_LINK_ID” is too long a string for MySQL. At the same time, this name does not really matter as this is the PRIMARY KEY.

After changing the “constraint” name in MS SQL to smaller strings, I could migrate all tables to MySQL.

Amazon DMS notes: Amazon DMS lets you migrate from a database located anywhere (not necessarily in AWS) to another database located anywhere (not necessarily in AWS) — however, the traffic will go thru AWS. Thus the migration path is fastest and the most beneficial if either the source or target (or both) instances are in AWS (for example, ec2 or rds instances).

Advantages and disadvantages:

  • Plus: successful migration for Jira
  • Plus: Multithreaded execution (this is a huge advantage for migrating large databases)
  • Plus: In addition to migration, you can also use a replication link between the SQL Server and the new MySQL to fetch the new changes. This is a huge advantage when migrating a large database with lots of traffic and tight downtime requirements.
  • Minus: replication should be enabled to perform a migration, which means that migrating from SQL Server Express isn’t supported.
  • (Can be plus and minus): All traffic is going through a cloud environment.

Potential issues

(This section has been updated) I’ve figured out that the table structures generated by Workbench, Amazon DMS and SQLWays are different. For example:

Workbench generated:


AWS DMS generated:


SQLWays wizard generated:


In AWS DMS version, the index on CONTENT_KEY is missing and ID is not declared as auto_increment. However, the Workbench “lost” the character set. SQLWays wizard has created partial key (191 characters).

At the same time, I was able to start Jira on top of two versions of the MySQL database (migrated by using Workbench and Amazon DMS).

Ispirer MnMTK 2015

Ispirer MnMTK 2015 toolkit is a commercial (not open-source) software application that lets you migrate from MS SQL Server to MySQL (among other databases). Ispirer has provided me with a demo license so I can test the migration.

I was able to migrate the Jira database from MS SQL to MySQL with the Ispirer SQLWays Wizard:

One issue with this process is that SQL Ways Wizard relies on the MySQL command line utility (“mysql”), which should be in the path. If you do not have MySQL installed on the migration machine, or it is not in the path, the migration will fail:

To fix simply add the MySQL “bin” directory to the path. In addition, you can use the SQL Ways Wizard to generate scripts and run those scripts on the destination host where the utilities are installed.

Advantages and disadvantages:

  • Plus: successful migration for Jira
  • Plus: support from SQL ways: can work on fixing potential migration issues (requires paid license)
  • Plus: can convert stored procedures, triggers, foreign key constraints
  • Minus: commercial, not open source software.
  • Minus: only runs on Windows (however, target machine for the database migration can be Linux)
  • Minus: no multi-treaded migration support (can manually run multiple instances of SQL Ways)


All tools I tried finally worked, but at the same time I was surprised with the number of issues I found. Migrating a very simple database (no stored procedures, no triggers, no foreign key constraints) should be easier.

Another surprise was that all tools are focused on a nice GUI with “next” buttons. For migrating one database to another, I would prefer using a command line tool interface (may be similar to Percona toolkit or iconv):

# dbmigrate --source user:pass@sqlserverhost:1433 --target user:pass@mysqlhost:3309 --parallel 8 --verbose --overwrite

Actually, Ispirer MnMTK does have a command line migration utility included, but it only works on Windows.

Until somebody develops a better command line tool, any of the above solutions will help you migrate from MS SQL Server to MySQL.

Categories: MySQL

Troubleshooting configuration issues: Q & A

MySQL Performance Blog - Wed, 2016-06-22 23:18

In this blog, I will provide answers to the Q & A for the Troubleshooting configuration issues webinar.

First, I want to thank you for attending the June, 9 webinar. The recording and slides for the webinar are available here. Below is the list of your questions that I wasn’t able to answer during the webinar, with responses:

Q: What are the predefined roles installed per default?

A: MySQL does not support roles by default. But, if you need roles you can either use MariaDB or emulate roles with Percona PAM Plugin and proxy users. But MySQL has pre-built user accounts. One of them is root@localhost, which has all available privileges, and anonymous account that can access the test database without a password. See the user manual for more information.

Q: How do you determine the root cause for a query that just hangs?

A: There are several possible reasons for a hanging query. They include:

  1. Poorly optimized query – it doesn’t really “hang,” but just requires a lot of time to execute. Dealing with these types of queries was discussed in the Troubleshooting Slow Queries webinar.
  2. The query is waiting for a lock set by another query or transaction. These issues were discussed in the Troubleshooting locking issues webinar
  3. A bug in MySQL.

When I see hanging queries I look into all the above possibilities. But I prefer to start at query optimization.

Q: Can we get the powerpoint slides?

A: I do not use PowerPoint to create slides. Slides are available in PDF format only.

Q: 1) Is it safety to use SQL_LOG_BIN = 0 for specific connections (statements), for example for DELETE, when we need to keep old data on a slave, but delete from master? What are side-effects? Can it break replication?

A: Using SQL_LOG_BIN = 0 itself is safe, but you need to understand what you are doing. For example, if you delete data in a table that has a unique key on the master, and then insert a row that has the same unique value that existed in one of rows you deleted, replication will fail with “Duplicate key” error.

Q: Is it reasonable to disable query_cache_type (set 0) on MySQL instances with very big (huge) amount of data?

A: Yes. I would recommend it.

Q: 3) How does the number of innodb_buffer_pool_instances affect performance? Does a lot of innodb_buffer_pool_instances = high performance?

A: InnoDB introduced buffer pool instances to reduce contention “as different threads read and write to cached pages“. However, they improve performance only if you have many concurrent threads inside InnoDB.

Q: I have a question, where can I download the threadpool plugin at Percona? I checked your download page and couldn’t find it. Is it bundled in the Percona official release? wW have 5.6.28,29 and 30 and there is no in the plugin directory. Can you let me know how to get it?

A: Percona built thread pool into Percona Server; a separate download isn’t necessary. See the user manual for instructions.


Categories: MySQL

Sneak peek at the Percona Live Europe Amsterdam 2016 talks

MySQL Performance Blog - Wed, 2016-06-22 17:46

On behalf of the Percona Live Conference Committee, I am excited to announce the sneak peek schedule for the Percona Live Europe Amsterdam 2016 talks!

Percona Live Europe will feature a variety of formal tracks and sessions related to MySQL, MongoDB and ODBMS from members of the open source community. With many slots to fill, there will be no shortage of great content this year. Though we won’t finalize the entire conference schedule until mid-July, this preview list of talks is sure to whet your appetite! So without further ado, here is the SNEAK PEEK SCHEDULE!

Want to Present at Percona Live Europe Amsterdam?

We are still looking for people to give talks! The committee has begun their work rating talks, but there is still time to submit. We are looking for a range of interesting talks and subjects around MySQL, MongoDB, and ODBMS. Some topics of interest include:

  • Database Virtualization
  • Integration or coexistence between SQL and NoSQL
  • Automating service life-cycle
  • Load balancing
  • Cluster control
  • Multiple Data Store Technologies and Management Solutions

But hurry, the call for papers closes July 18th! If selected, you get a free pass, and the chance to dazzle your open source community peers. Apply now!

Become a Percona Live Europe Amsterdam Sponsor

Sponsor the Percona Live Europe Amsterdam conference. There are still sponsorship opportunities at Percona Live Europe. Being a sponsor allows you to demonstrate thought leadership, promote brand awareness and support the open source community. Need a reason? Here are ten reasons to become a sponsor. Sponsor the Percona Live Europe now.

Sponsor Percona Live Europe now.

Categories: MySQL

Docker automatic MySQL slave propagation

MySQL Performance Blog - Tue, 2016-06-21 23:42

In this post, we’ll discuss Docker automatic MySQL slave propagation for help with scaling.

In my previous posts on the Docker environment, I covered Percona XtraDB Cluster. Percona XtraDB Cluster can automatically scale by conveniently adding new nodes using the highly automated State Snapshot Transfer. State Snapshot Transfer allows a new node to copy data from an existing node (I still want to see how this is possible with MySQL Group Replication).

This is not the case with regular MySQL Replication. With MySQL Replication, the slave setup still requires manual steps (well, unless you’ve already scripted it for your environment). At least these are “simple” steps (ha!). Percona XtraBackup can setup replication with less work (see this link for details:, but it still requires poking around and switching between servers.

However, nothing stops us from making it more automatic (similar to SST in Percona XtraDB Cluster), especially with Docker images. Why Docker? Because Docker provides a highly-controlled environment where we can orchestrate how scripts get executed. Severalnines provides a great intro into MySQL with Docker.

There are a few more components for this setup:

Before jumping to my solution, I should point to some work in this area by Joyent:

I propose my image, with sources on GitHub

First, we need to start a master node:

docker run -d -p 3306:3306 --net=replicaset_net --name=replicaset_master -e MYSQL_ROOT_PASSWORD=Theistareyk perconalab/ps-master-slave --innodb-buffer-pool-size=2G

I assume that we’ve created the network replicaset_net already, either bridge or overlay.

You can create a slave by pointing to the master container:

docker run -d -p 3306 --net=replicaset_net --name=replicaset_slave1 -e MYSQL_ROOT_PASSWORD=Theistareyk -e MASTER_HOST=replicaset_master perconalab/ps-master-slave --innodb-buffer-pool-size=2G

The started node will automatically connect to MASTER_HOST, copy the data and perform all the steps needed to start the slave.

You can even copy data from a running slave, instead of the master, like this:

docker run -d -p 3306 --net=replicaset_net --name=replicaset_slave2 -e MYSQL_ROOT_PASSWORD=Theistareyk -e MASTER_HOST=replicaset_master -e SLAVE_HOST=replicaset_slave1 perconalab/ps-master-slave --innodb-buffer-pool-size=2G

This node will copy data from SLAVE_HOST, but then will point itself to MASTER_HOST.

Docker Network lets you use container names "replicaset_master" and "replicaset_slave1" instead of IP addresses, which is very convenient.

As the result of above, we have one master and two slaves running. We can start as many slave nodes as needed.

Please remember, this is more proof-of-concept than “production ready” images, but it gives a good direction for implementation.

Categories: MySQL

Running Percona XtraDB Cluster nodes with Linux Network namespaces on the same host

MySQL Performance Blog - Mon, 2016-06-20 22:52

This post is a continuance of my Docker series, and examines Running Percona XtraDB Cluster nodes with Linux Network namespaces on the same host.

In this blog I want to look into a lower-level building block: Linux Network Namespace.

The same as with cgroups, Docker uses Linux Network Namespace for resource isolation. I was looking into cgroup a year ago, and now I want to understand more about Network Namespace.

The goal is to both understand a bit more about Docker internals, and to see how we can provide network isolation for different processes within the same host. You might need to isolate process when running several MySQL or MongoDB instances on the same server (which might come in handy during testing). In this case, I needed to test ProxySQL without Docker.

We can always use different ports for different MySQL instances (such as 3306, 3307, 3308), but it quickly gets complicated.

We could also use IP address aliases for an existing network interface, and use bind=<IP.ADD.RE.SS> for each instance. But since Percona XtraDB Cluster can use three different IP ports and network channels for communications, this also quickly gets complicated.

Linux Network Namespace provides greater network isolation for resources so that it can be a better fit for Percona XtraDB Cluster nodes. Now, setting up Network namespaces in and of itself can be confusing; my recommendation is if you can use Docker, use Docker instead. It provides isolation on process ID and mount points, and takes care of all the script plumbing to create and destroy networks. As you will see in our scripts, we need to talk about directory location for datadirs.

Let’s create a network for Percona XtraDB Cluster with Network Namespaces.

I will try to do the following:

  • Start four nodes of Percona XtraDB Cluster
  • For each node, create separate network namespace so the nodes will be able to allocate network ports 3306, 4567, 4568 without conflicts
  • Assign the nodes IP addresses:
  • Create a “bridge interface” for the nodes to communicate, using IP address

For reference, I took ideas from this post: Linux Switching – Interconnecting Namespaces

First, we must create the bridge interface on the host:

BRIDGE=br-pxc brctl addbr $BRIDGE brctl stp $BRIDGE off ip addr add dev $BRIDGE ip link set dev $BRIDGE up

Next, we create four namespaces (one per Percona XtraDB Cluster node) using the following logic:

for i in 1 2 3 4 do ip netns add pxc_ns$i ip link add pxc-veth$i type veth peer name br-pxc-veth$i brctl addif $BRIDGE br-pxc-veth$i ip link set pxc-veth$i netns pxc_ns$i ip netns exec pxc_ns$i ip addr add 10.200.10.$((i+1))/24 dev pxc-veth$i ip netns exec pxc_ns$i ip link set dev pxc-veth$i up ip link set dev br-pxc-veth$i up ip netns exec pxc_ns$i ip link set lo up ip netns exec pxc_ns$i ip route add default via done

We see the following interfaces on the host:

1153: br-pxc: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP link/ether 32:32:4c:36:22:87 brd ff:ff:ff:ff:ff:ff inet scope global br-pxc valid_lft forever preferred_lft forever inet6 fe80::2ccd:6ff:fe04:c7d5/64 scope link valid_lft forever preferred_lft forever 1154: br-pxc-veth1@if1155: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br-pxc state UP qlen 1000 link/ether c6:28:2d:23:3b:a4 brd ff:ff:ff:ff:ff:ff link-netnsid 8 inet6 fe80::c428:2dff:fe23:3ba4/64 scope link valid_lft forever preferred_lft forever 1156: br-pxc-veth2@if1157: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br-pxc state UP qlen 1000 link/ether 32:32:4c:36:22:87 brd ff:ff:ff:ff:ff:ff link-netnsid 12 inet6 fe80::3032:4cff:fe36:2287/64 scope link valid_lft forever preferred_lft forever 1158: br-pxc-veth3@if1159: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br-pxc state UP qlen 1000 link/ether 8a:3a:c1:e0:8a:67 brd ff:ff:ff:ff:ff:ff link-netnsid 13 inet6 fe80::883a:c1ff:fee0:8a67/64 scope link valid_lft forever preferred_lft forever 1160: br-pxc-veth4@if1161: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br-pxc state UP qlen 1000 link/ether aa:56:7f:41:1d:c3 brd ff:ff:ff:ff:ff:ff link-netnsid 11 inet6 fe80::a856:7fff:fe41:1dc3/64 scope link valid_lft forever preferred_lft forever

We also see the following network namespaces:

# ip netns pxc_ns4 (id: 11) pxc_ns3 (id: 13) pxc_ns2 (id: 12) pxc_ns1 (id: 8)

After that, we can check the namespace IP address:

# ip netns exec pxc_ns3 bash # ip addr 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever 1159: pxc-veth3@if1158: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000 link/ether 4a:ad:be:6a:aa:c6 brd ff:ff:ff:ff:ff:ff link-netnsid 0 inet scope global pxc-veth3 valid_lft forever preferred_lft forever inet6 fe80::48ad:beff:fe6a:aac6/64 scope link valid_lft forever preferred_lft forever

To enable communication from inside the network namespace to the external world, we should add some iptables rules, e.g.:

iptables -t nat -A POSTROUTING -s -o enp2s0f0 -j MASQUERADE iptables -A FORWARD -i enp2s0f0 -o $BRIDGE -j ACCEPT iptables -A FORWARD -o enp2s0f0 -i $BRIDGE -j ACCEPT

where enp2s0f0 is an interface that has an external IP address (by some reason modern Linux distros decided to use names like enp2s0f0 for network interfaces, instead old good "eth0").

To start a node (or mysqld instance) inside a network namespace, we should use ip netns exec prefix for commands.

For example to start Percona XtraDB Cluster first node, in the namespace pxc_ns1, with IP address, we use:

ip netns exec pxc_ns1 mysqld --defaults-file=node.cnf --datadir=/data/datadir/node1 --socket=/tmp/node1_mysql.sock --user=root --wsrep_cluster_name=cluster1

To start following nodes:

NODE=2 ip netns exec pxc_ns${NODE} mysqld --defaults-file=node${NODE}.cnf --datadir=/data/datadir/node${NODE} --socket=/tmp/node${NODE}_mysql.sock --user=root --wsrep_cluster_address="gcomm://" --wsrep_cluster_name=cluster1   NODE=3 ip netns exec pxc_ns${NODE} mysqld --defaults-file=node${NODE}.cnf --datadir=/data/datadir/node${NODE} --socket=/tmp/node${NODE}_mysql.sock --user=root --wsrep_cluster_address="gcomm://" --wsrep_cluster_name=cluster1   etc

As the result of this procedure, we have four Percona XtraDB Cluster nodes running in an individual network namespace, not worrying about IP address and ports conflicts. We also allocated a dedicated IP range for our cluster.

This procedure isn’t trivial, but it is easy to script. I also think provides a good understanding what Docker, LXC or other containerization technologies do behind the scenes with networks.


Categories: MySQL

Webinar Thursday June 23: Choosing a MySQL High Availability Solution Today

MySQL Performance Blog - Mon, 2016-06-20 19:28

Please join Percona, Technical Account Manager, Michael Patrick on Thursday, June 23, 2016 at 10 AM PDT (UTC-7) as he presents “Choosing a MySQL High Availability Solution Today.”

High availability (HA) is one of the solutions to improve performance, avoid data outages, and recover quickly from disasters. An HA environment helps guarantee that your database doesn’t have a single point of failure, accommodates rapid growth and exponentially increasing database size, and enables the applications that power your business.

Michael will discuss various topologies for achieving High Availability with MySQL.

Topics include:

  • Percona XtraDB Cluster
  • DRBD
  • MHA
  • MySQL Orchestrator

Each solution has advantages and challenges. Attendees will gain a deeper understanding of how to choose the best solution for their needs while avoiding some of the pitfalls of making the wrong choices. Avoid the costly mistakes that commonly cause outages and lost revenue. Plus get the latest and greatest developments in the technologies!

Register now.

Michael Patrick Technical Account Manager

Mike came to Percona in 2015 after working for a variety of large corporations running hundreds of MySQL and Percona XtraDB Clusters in production environments. He is skilled in performance tuning, server auditing, high availability, multi-data center replication, migration, and other MySQL-related activities. Mike holds a B.S. in Computer Science from East Tennessee State University. In his off time, he enjoys Martial Arts and Cave Exploration. He lives in East Tennessee with his wife and he has four children.

Categories: MySQL

InnoDB locks and transaction isolation level

MySQL Performance Blog - Fri, 2016-06-17 18:49

What is the difference between InnoDB locks and transaction isolation level? We’ll discuss it in this post.

Recently I received a question from a user about one of my earlier blog posts. Since it wasn’t sent as a comment, I will answer it here. The question:

> I am reading your article:

> Full table scan locks whole table.

> Some bad select (read) query can do full table scan on InnoDB, does it lock whole table please?

> My understanding was that SELECT (read) blocks another DML only in MyISAM.

To answer this question, we to need understand two different concepts: locking and transaction isolation level.

In the default transaction isolation mode for InnoDB, REPEATABLE READ and the lower TRANSACTION ISOLATION levels, SELECT  doesn’t block any DML unless it uses SELECT ... FOR UPDATE or SELECT ... LOCK IN SHARE MODE. On slide 20 (31) of my presentation, I use the SELECT ... FOR UPDATE statement (this is why a lock is seen).

However, if transaction isolation mode is SERIALIZABLE, then SELECT can block updates. You can see this in the example below:

mysql1> set transaction isolation level serializable; Query OK, 0 rows affected (0,00 sec) mysql1> begin; Query OK, 0 rows affected (0,00 sec) mysql1> select * from employees join titles using(emp_no); c936e6fc4c6cbaf77679ba5013339dff - 443308 rows in set (29,69 sec) mysql2> begin; Query OK, 0 rows affected (0,00 sec) mysql2> update titles set title='Engineer' where emp_no=15504; ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

Regarding the different level locks set by InnoDB, I recommend that you follow the links on slide 37 (67) of the presentation, especially “InnoDB Locking Explained With Stick Figures” by Bill Karwin. Innodb table locks are also explained very well in this blog post by Peter Zaitsev.

Categories: MySQL

MongoDB Security: Why pay for Enterprise when Open Source has you covered?

MySQL Performance Blog - Fri, 2016-06-17 16:55

Does ensuring MongoDB security justify the cost of the Enterprise version? In my opinion, the answer is no.

MongoDB Inc© blasted an email with a study showing that the average cost of a data breach can be $5.9M. You can find the key finding in IBM’s 2015 report here:

NH Learning Solutions

Key findings:

Data breaches cost the most in the US and Germany and the lowest in Brazil and India. The average per capita cost of a data breach is $217 in the US and $211 in Germany. The lowest cost is in Brazil ($78) and India ($56). The average total organizational cost in the US is $6.5 million and in Germany $4.9 million. The lowest organizational cost is in Brazil ($1.8 million) and India ($1.5 million).

So it might be even worse than you thought! The study presented a clear per-record cost, which helps companies determine the real cost to them. This brings us to the recently-published MongoDB Security Architecture white paper. While being a great document, it raises some questions that should be addressed. We should dissect exactly what paying for an enterprise build gets you that is not already available in the open source community version. 

The key security features in above white paper are:

  • Authentication. LDAP Authentication centralizes things with your company directory (for PCI)
  • Authorization. What role-based access controls the database provides
  • Encryption. Broken into “At-Rest” and “In-Transit” as part of regular PCI requirements
  • Governance. Document validation and even checking for sensitive data such as an SSN or birth data
  • Auditing. The ability to see who did what in the database (also required for PCI).

That list lets us break down each into why they are important, and is it something that should be free in the MongoDB ecosystem.


MongoDB has built-in users (off by default). It misses things, however, like password complexity, age-based rotation, centralization, and identification of user roles versus service functions. These are essential to passing PCI. PCI requires that people don’t use old passwords, easy-to-break passwords, and that user access gets revoked when there is a change in status (such as leaving a department or the company). Thankfully LDAP is an open-source project of its own. Many connectors allow the use of Windows Active Directory (AD) systems to talk with LDAP.

Using LDAP and AD, you can tie users in with your corporate directory. When they change roles or leave the company, they can be removed by HR from your database group. Practically, this means there are automated systems in place to ensure only those you want to access the data manually can do so, without accidently missing something. It is important to note that the MongoDB Community Edition© does not have LDAP support. For this you need MongoDB Inc’s© Enterprise build. You can also use Percona Server© for MongoDB. Percona’s build is open source and free, and we offer a support contract so that if you require support or help it is available.


Role-based Authorization or (RBAC) is core to MongoDB Community©, MongoDB Enterprise© and Percona Server for MongoDB. In Mongo 2.6+ you can use built-in roles, or even craft your own down to what actions someone might be able to do – only exposing exactly what you want users to be able to do with razor precision. This is such a core MongoDB feature that it is everywhere and in every build regardless of vendor.


As I mentioned before, this is broken into two areas of discussion: At-Rest and In-Transit

At-Rest.: This is defined as on-disk, and typically refers to data saved to an encrypted storage location. This is to prevent theft by physical means and create backups that are stored in a fashion not easily read by any third party. There are practical limits to this. The biggest are trusting your sys-admins and assuming that a hacker has not been able to get access to the system to pretend they are one of your admins. This is not an issue unique to MongoDB or Percona. Such systems used more globally work here as well. They might include things like LUKS/cryptfs, or might go into even more secure methods such as signing encryption keys with LDAP, Smart Cards, and RSA type tokens.

When doing this level of encryption, you need to consider things like automounting and decrypting of drives. However, the general point is this is not something new to your system administrators, and they can manage this requirement in the same way they manage it in other parts of the system. The added benefit is a single procedure for storage encryption, not one per whatever technology a particular function uses.

In-Transit.: To keep this simple, we are just talking about using SSL on all connections (it can be as simple as that in the end). Depending on your internal requirements,©  and Percona Server for MongoDB all additionally support custom Signing Authorities (CA), x509 clients and member certificates. The x509 system is very useful when you want to make sure only authorized machines are allowed to talk to your system because they can even attempt to send a user/password to the system.


Put simply; this is the ability to enforce complex standards on the system by using Document Validation. This is an important feature that is available to MongoDB Community©, MongoDB Enterprise© and Percona Server for MongoDB. Governance is about the insertion and updating of data. It is also useful for checking if a field name like bday, birthday, ssn, social, ect is defined. We are not limited to those: you could also do string regex’s on things like user_id to check for a $regex such as “^d{3}-d{2}-d{4}$” (which would be a Social Security Number), or a checking for a credit card number. These examples are ways your DBAs and security architects can help prevent developers from exposing the company to added risk.

You can also ensure schema changes only occur when properly vetted by your DBA staff, as the developer code could fail if they change the format of what you’re storing in the database. This brings an additional layer of control to MongoDB’s dynamic production schema (allowing itself to store anything even if it should not).


Central to any good security design – and required by PCI – is being able to track what user did what action in the database (very similar to how you need to do it on your actual servers). At this time, MongoDB’s Community© build can’t track this. However, both MongoDB Enterprise©  and Percona Server for MongoDB support this feature. Both work in similar ways, allowing you to filter output to a particular user, database, collection, or source location. This gives you a log to review in any security incident and, more importantly, shows your PCI auditor that you’ve taken the correct steps to both protect your database from an intrusion and understand and incursions depth (should one occur).

Hopefully, this has been a good overview of the security options in MongoDB Community© and Enterprise© versions and Percona Server for MongoDB. Even without an enterprise-style contract, you can fulfill all your PCI compliance needs and protect your company using reasonable and known methods. Please note Percona strives to bring enterprise features to the community, but not to the point of wanting to lock you into a non-open source build. If you need support with anything MongoDB, we have the community and its users as our first priority!

Categories: MySQL

Scaling Percona XtraDB Cluster with ProxySQL in Kubernetes

MySQL Performance Blog - Thu, 2016-06-16 18:38

How do you scale Percona XtraDB Cluster with ProxySQL in Kubernetes?

In my previous post I looked how to run Percona XtraDB Cluster in a Docker Swarm orchestration system, and today I want to review how can we do it in the more advanced Kubernetes environment.

There are already some existing posts from Patrick Galbraith ( and Raghavendra Prabhu ( on this topic. For this post, I will show how to run as many nodes as I want, see what happens if we add/remove nodes dynamically and handle incoming traffic with ProxySQL (which routes queries to one of working nodes). I also want to see if we can reuse the ReplicationController infrastructure from Kubernetes to scale nodes to a given number.

These goals should be easy to accomplish using our existing Docker images for Percona XtraDB Cluster (, and I will again rely on the running service discovery (right now the images only work with etcd).

The process of setting up Kubernetes can be pretty involved (but it can be done; check out the Kubernetes documentation to see how: It is much more convenient to use a cloud that supports it already (Google Cloud, for example). I will use Microsoft Azure, and follow this guide: Unfortunately the scripts from the guide install previous version of Kubernetes (1.1.2), which does not allow me to use ConfigMap. To compensate, I will duplicate the ENVIRONMENT variables definitions for Percona XtraDB Cluster and ProxySQL pods. This can be done more optimally in the recent version of Kubernetes.

After getting Kurbernetes running, starting Percona XtraDB Cluster with ProxySQL is easy using following pxc.yaml file (which you also can find with our Docker sources

apiVersion: v1 kind: ReplicationController metadata: name: pxc-rc app: pxc-app spec: replicas: 3 # tells deployment to run N pods matching the template selector: app: pxc-app template: # create pods using pod definition in this template metadata: name: pxc labels: app: pxc-app spec: containers: - name: percona-xtradb-cluster image: perconalab/percona-xtradb-cluster:5.6test ports: - containerPort: 3306 - containerPort: 4567 - containerPort: 4568 env: - name: MYSQL_ROOT_PASSWORD value: "Theistareyk" - name: DISCOVERY_SERVICE value: "" - name: CLUSTER_NAME value: "k8scluster2" - name: XTRABACKUP_PASSWORD value: "Theistare" volumeMounts: - name: mysql-persistent-storage mountPath: /var/lib/mysql volumes: - name: mysql-persistent-storage emptyDir: {} imagePullPolicy: Always --- apiVersion: v1 kind: ReplicationController metadata: name: proxysql-rc app: proxysql-app spec: replicas: 1 # tells deployment to run N pods matching the template selector: front: proxysql template: # create pods using pod definition in this template metadata: name: proxysql labels: app: pxc-app front: proxysql spec: containers: - name: proxysql image: perconalab/proxysql ports: - containerPort: 3306 - containerPort: 6032 env: - name: MYSQL_ROOT_PASSWORD value: "Theistareyk" - name: DISCOVERY_SERVICE value: "" - name: CLUSTER_NAME value: "k8scluster2" - name: MYSQL_PROXY_USER value: "proxyuser" - name: MYSQL_PROXY_PASSWORD value: "s3cret" --- apiVersion: v1 kind: Service metadata: name: pxc-service labels: app: pxc-app spec: ports: # the port that this service should serve on - port: 3306 targetPort: 3306 name: "mysql" - port: 6032 targetPort: 6032 name: "proxyadm" # label keys and values that must match in order to receive traffic for this service selector: front: proxysql

Here is the command to start the cluster:

kubectl create -f pxc.yaml

The command will start three pods with Percona XtraDB Cluster and one pod with ProxySQL.

Percona XtraDB Cluster nodes will register themselves in the discovery service and we will need to add them to ProxySQL (it can be done automatically with scripting, for now it is a manual task):

kubectl exec -it proxysql-rc-4e936

Increasing the cluster size can be done with the scale command:

kubectl scale --replicas=6 -f pxc.yaml

You can connect to the cluster using a single connection point with ProxySQL: You can find it this way:

kubectl describe -f pxc.yaml Name: pxc-service Namespace: default Labels: app=pxc-app Selector: front=proxysql Type: ClusterIP IP: Port: mysql 3306/TCP Endpoints: <none> Port: proxyadm 6032/TCP Endpoints: <none> Session Affinity: None

It exposes the endpoint IP address and two ports: 3306 for the MySQL connection and 6032 for the ProxySQL admin connection.

So you can see that scaling Percona XtraDB Cluster with ProxySQL in Kubernetes is pretty easy. In the next post, I want to run benchmarks in the different Docker network environments.

Categories: MySQL

Why MongoRocks: Deprecating PerconaFT and MongoDB Optimistic locking

MySQL Performance Blog - Thu, 2016-06-16 13:51

In this post, we’ll discuss the rationale behind deprecating PerconaFT and embracing RocksDB.

Why is Percona deprecating PerconaFT in favor of RocksDB?

Many of you may have seen Peter Zaitsev’s recent post about Percona embracing RocksDB and deprecating PerconaFT. I’m going to shed a bit more light on the issues between the locking models for PerconaFT’s and MongoDB’s core servers. When making this decision, we looked at how the differences between the engines measure up and impact other improvements we could make. In the end, we can do more for the community by focusing on engines that are in line with assumptions the core server makes every second in your daily operations.Then we have more resources available for improving the users’ experience by adding new tools, features, and improving the core server.

What is pessimistic locking?

Pessimistic locking locks an entity in the database for the entire time that it is actively used in application memory. A lock either limits or prevents other users from working with the entity in the database. A write lock indicates that the holder of the lock intends to update the entity and disallows anyone from reading, updating, or deleting the object. A read lock means that the owner of the lock does not want the object to change while it holds the lock, allowing others to read the entity but not update or delete it. The scope of a lock might be the entire database, a table, a collection of ranges of documents or a single document.

You can order pessimistic locks as follows (from broad to granular):

  1. Database locks
  2. Collection locks
  3. Range locks
  4. Document locks

The advantage of pessimistic locking is that changes to the database get made consistently and safely. The primary disadvantage is that this approach isn’t as scalable. The chance of waiting for a lock to be released increases when:

  • A system has a lot of users
  • The transactions (in MongoDB, there are transactions in the engine but not at the user level) involve a greater number of entities
  • When transactions are long-lived

Therefore, pessimistic locks limit the practical number of simultaneous users that your system can support.

What is optimistic locking?

In most database systems (NoSQL and RDBMS) expect collisions to be relatively uncommon. For example, although two clients are working with user objects, one might be working with the Bob Vader object while another works with the Luke Vader object. These won’t collide. In this case, optimistic locking becomes the most viable concurrency control strategy. If you accept the premise that collisions infrequently occur, instead of trying to prevent them you can choose to detect and then resolve them when they do occur.

MongoDB has something called a Write Conflict Exception (WCE). A WCE is like an engine-level deadlock. If a record inside the engine changes due to thread #1, thread #2 must wait for a time that it is safe to change the record, and retry then. Typically this occurs when a single document gets updated frequently. It can also occur when there are many updates, or there are ranges of locks happening concurrently. This is a perfect case of optimistic locking, preferring to resolve or retry operations when they occur rather than prevent them from happening.

Can you make these play well while limiting the amount of development resources needed?

These views are as polar opposite as you can get in the database world. In one view you lock as much as possible, preventing anyone else from making a change. In the other view you let things be as parallel as possible, and accept you will retry if two clients are updating the same document. With the nature of how many documents fit in a single block of memory, this has some real-world concerns. When you have more than one document in a memory block, you could have a situation where locking one document means 400% more documents get affected. For example, if we have an update using the IN operator with 25 entries, you could be blocking 125 documents (not 25 documents)!

That escalated rather quickly, don’t you think? Using optimistic locking in the same situation, you at most would have to retry five document write locks as the data changed. The challenge for optimistic locking is that if I have five clients that are all updating all documents, you get a flurry of updates. WCE’s come in and eventually resolve things. If you use pessimistic locking, everybody waits their turn, and each one would finish before the next could run.

Much of Percona’s engineering effort goes into what types of systems we should put in place to simulate cheap latches or locks in optimistic locking to allow pessimistic locking to work (without killing performance). This requires an enormous amount of work just to get on-par behavior from the system – specifically in update type workloads, given delete/inserts are very similar in the systems. As a result, we’ve spent more time improving the engine rather than adding additional variables and instrumentation.

Looking forward, MongoRocks aligns more to WiredTiger in its locking structure (they both run as log sequence merges or LSMs), and this means more time working on new optimizer patterns, building things to improve diagnostics or tuning the engine/system to your specific needs. We think you will be excited to see some of the progress we have been discussing for Mongo 3.4 (some of which might even make it directly into 3.2).

What is the MongoRocks anyhow and how does it compare to PerconaFT?

The last concept I want to cover is what RocksDB is exactly, what its future is and how it stacks up to PerconaFT. The most important news is Facebook is working on the core engine, which is used both by MyRocks and MongoRocks (you might have seen some of their talks on the engine). This means Percona can leverage some of the brilliant people working on RocksDB inside Facebook and focus instead on the API linking the engine into place, as well as optimizing how it uses the engine – rather than building the engine completely. Facebook is making some very significant bets on the backend use of RocksDB in several parts of the system, and potentially some user parts of the system (which have historically used InnoDB).

So what is RocksDB, and how does it work? Its core is an LSM system, which means it puts new data into the newest files as an append. Over time, the files get merged into five different levels (not covered here). As part of this process, when you have an update a new copy of the data is saved at the end of the latest file, and a memory table points a record to that location for the latest “version” of that data. In the background, the older records are marked for deletion by something called a “tombstone.” There is a background thread merging still-relevant data from old files into new files so that empty old files get unlinked from the filesystem.

This streamlines the process better than B-Tree’s constant rebalancing and empty blocks in files that need to be compacted or re-used over time. Being write-optimized means that, like PerconaFT previously, it will be faster for write-based operations than something like WiredTiger. (WiredTiger in some cases can be faster on reads, with MMAP being the fastest possible approach for reads.) This also means things like TTL’s can work pretty well in an LSM since all the items that were inserted in time order age out, and the engine can just delete the unneeded file. This solves some of the reasons people needed partitions to begin with, and it still allows sharding to work well.

We are also very excited about creating additional tools that let you query your backups in MongoRocks, as well as some of the very simple ways it will take binary-based backups quickly, consistently (even when sharded) and continually.

I hope this explains more about lock types and what their implications mean as a follow up to Peter’s blog post about the direction Percona is moving regarding PerconaFT and MongoRocks. If you want to ask any more questions, or would like another blog that covers some areas of the system more deeply, please let us know via this blog, email, twitter or even pigeon!

Categories: MySQL

Troubleshooting hardware resource usage webinar: Q & A

MySQL Performance Blog - Wed, 2016-06-15 22:12

In this blog, I provide answers to the Q & A for the Troubleshooting hardware resource usage webinar.

First, I want to thank everybody who attended the May 26 webinar. The recording and slides for the webinar are available here. Below is the list of your questions that I wasn’t able to answer during the webinar, with responses:

Q: How did you find the memory IO LEAK?

A: Do you mean the replication bug I was talking about in the webinar? I wrote about this bug here. See also comments in the bug report itself.

Q: Do you have common formulas you use to tune MySQL?

A: There are best practices: relate thread concurrency to number of CPU cores you have, set InnoDB buffer pool size large enough so it can contain all your working dataset (which is not always possible), and do not set the Query Cache size larger than 512MB (or even better, turn it off) to avoid issues with global lock set when it needs to be de-fragmented. I prefer not to call them “formulas,” because all options need to be adjusted to match the workload. If this weren’t the case, MySQL Server would have an automatic configuration. There is also a separate webinar on configuration (Troubleshooting configuration issues) where I discuss these practices.

Q: Slide 11: is this real time? Can we get this info for a job that has already finished?

A: Yes, this is real time. No, it is not possible to get this info for a thread that does not exist.

Q: Slide 11: what do negative numbers mean?

A: Numbers are taken from the CURRENT_NUMBER_OF_BYTES_USED field for table memory_summary_by_thread_by_event_name in Performance Schema. These values, in turn, are calculated as (memory allocated by thread) – (memory freed by thread). Negative numbers here mean either a memory leak or incorrect calculation of memory used by the thread. I reported this behavior in the MySQL Bugs database. Please subscribe to the bug report and wait to see how InnoDB and Performance Schema developers answer.

Q: Are TokuDB memory usage stats recorded in the sys.memory_by_thread_by_current_bytes  table also?  Do we have to set something to enable this collection? I ran the query, but it shows 0 for everything.

A: TokuDB currently does not support Performance Schema, thus its memory statistics are not instrumented. See the user manual on how memory instrumentation works.

Q: With disk what we will check for disk I/O?

A: I quite don’t understand the question. Are you asking on which disk we should check IO statistics? For datadir and other disks, look at the locations where MySQL stores data and log files (if you set custom locations).

Q: How can we put CPU in parallel to process multiple client requests? Put multiple requests In memory by parallel way. By defining transaction. Or there any query hints?

A: We cannot directly put CPU in parallel, but we can affect it indirectly by tuning InnoDB threads-related options (innodb_threads_concurrency, innodb_read_io_threads, innodb_write_io_threads) and using the thread pool.

Q: Is there any information the Performance Schema that is not found in the SYS schema?

A: Yes. For example, sys schema does not have a view for statistics about prepared statements, while Performance Schema does, because sys schema takes its statement statistics from digest tables (which make no sense for prepared statements).

Q: What is your favorite tool to investigate a memory issue with a task/job that has already finished?

A: I don’t know that there is such a tool suitable for use in production. In a test environment, you can use valgrind or similar tools. You can also make core dumps of the mysqld process and investigate them after the issue is gone.

Categories: MySQL

Installing MongoDB 3.2 and upgrading MongoDB replica set

MySQL Performance Blog - Wed, 2016-06-15 20:55

In this post, we’ll examine a couple of ways for upgrading MongoDB replica set.

With the release of MongoDB 3.2, comes a rash of new features and improvements. One of these enhancements is improved replica sets. From MongoDB: “A replica set in MongoDB is a group of mongod processes that maintain the same data set. Replica sets provide redundancy and high availability, and are the basis for all production deployments.”

Config servers are replica sets!

This is HUGE. It signals a significant advancement in backups, metadata stability and overall maturity. It is a very long-awaited feature that shows MongoDB is maturing. It means:

  • Mongos’ can retry connection vs error
  • Unified and consistent backups!
  • Up to 50 secondaries
    • Remove bugs with Mongos’ not near config servers!

How do we activate all these new awesome features? Let’s do it!

Upgrading to 3.2
  • Replace binaries and restart one secondary at a time
  • Then primaries as well
  • Restart configs in reverse order
    • If configdb=con1, con2, con3
      • Restart con3, con2, and then finally con1 with 3.2
      • Do con1 as FAST as possible, while the balancer is also disabled
    • You no longer need to restart a mongos –upgrade (as of 3.2)
    • Restart all mongos, this will reset ALL connections at some point (whether you do at once or space it out).
Upgrading the replset to the new protocol

This is by far the easiest upgrade bit but DON’T do it until you know your stable on 3.2. Log into each primary and run:

>cfg = rs.conf(); { "_id" : "r1", "version" : 2, "members" : [ { "_id" : 0, "host" : "localhost:17001" }, { "_id" : 1, "host" : "localhost:17002" }, { "_id" : 2, "host" : "localhost:17003", } ] } >cfg.protocolVersion=1; >rs.reconfig(cfg); { "ok" : 1, "$gleStats" : { "lastOpTime" : Timestamp(1464947003, 1), "electionId" : ObjectId("7fffffff0000000000000018") } }


>db.getSiblingDB(‘config’).shards.forEach(function(shard){ x = new Mongo(; /* Assumes no auth needed */ conf =x.getDB("local").system.replset.findOne() conf.protcolVersion=1;conf.version++; x.getDB(‘admin’).runCommand({ replSetReconfig: conf }); });

The quick upgrade scripts

Does what it says: kills every process and launches them on 3.2 binaries with no other changes.


Simply runs the quick rs.reconfig() on each primary, adds the new settings to enable to new replication features.

Let’s upgrade the configs the right way!

This is not included as part of a normal upgrade so only do this AFTER you’re stable and don’t do it before upgrading the protocolVersion we just talked about. (I mean it! Disregard this advice and your life will seriously not be awesome!)

Upgrading to a Config ReplicaSet ( the official way)
  1. Run rs.initiate on the first config in the list (must be 3.2.4+)
    • Must be a fully configured document with configsrv:true defined.
  2. Restart same config server adding
    • configsvrMode = sccc
    • replSet = <name used in rs.initiate()>
    • storageEngine= WiredTiger
  3. Start the new config servers for the other two nodes (should be a new dbpath and port)
  4. Add those nodes to the replSet and check their status
  5. Remove the second original config server from the running
  6. Restart the 1st node you set “sccc” on to not have that setting
  7. At this point, the 1st node will transition to removed if using MMAP.
  8. Restart a mongos with a new configdb line
    • –configdb <replSetName>/node1:port,node2:port,…
    • Only replset members should be listed
  9. Verify you can work and query through mongos
  10. Repeat on all mongos
  11. Remove the 1st node with rs.remove
  12. Shutdown final original config and enable balancer
There is also an easy way.

The easy way, with a small maintenance window, which lets you just restore a good backup and have a nice and simple rollback plan:

  1. Stop all mongos after backing up the config directory
  2. Run rs.initiate on first config server
  3. Stop the 2nd, then the 3rd, restarting them with an empty dbpath directory
  4. Check the rs.status now
  5. Stop the 1st config server and restart with an empty dbpath directory
  6. Check Status
  7. Restart all mongos, adding <replSetName>/ to the front of the configdb line.
  8. Done!

Oh look there is a quick script we have made for you:

    • Kill all  config and mongos processes
    • Restart the first config server on non-standard port
    • Mongodump config database
    • Restart c1 as WiredTiger, clearing that data path
    • Import dump back into first config server
    • Restart on normal port
    • Initialize Replica Set
    • Restart second and third config server after clearing dbpath folder
    • After the initial sync, start all the mongos.
    • Done and script exits!


Categories: MySQL

MongoDB and non-existent collections

MySQL Performance Blog - Tue, 2016-06-14 22:29

In this blog, I will discuss how I found some of my basic SQL assumptions that don’t hold true when dealing with MongoDB and non-existent collections.

Coming from a MySQL background, I have some assumptions about databases that don’t apply to MongoDB (or other kinds of databases that are neither SQL-based nor relationally-inspired).

An example of this is the assumption that data is organized in rows that are part of a table, with all tables having a strict schema (i.e., a single type of row). When working with MongoDB, this assumption must be transformed into the idea that data is organized in documents that are part of a collection and have a flexible schema so that different types of documents can reside in the same collection.

That’s an easy adjustment to make because a dynamic schema is one of the defining features of MongoDB. There are other less-obvious assumptions that need to be adjusted or redefined as you get familiar with a new product like MongoDB (for example, MySQL does not currently support built-in sharding, while MongoDB does).

There is a more fundamental kind of assumption, and by “fundamental” I mean an assumption that is deeply ingrained because you rely on it so often it’s automatic (i.e., unconscious). We’re usually hit by these when changing programming languages, especially in dynamic ones (“Will I be able to add a number to a string? If so, how will it behave?”). These can make it hard to adjust to a new database (or programming language, operating system, etc.) because we don’t consciously think about them and so we may forget to verify if they hold in the new system. This can happen in “real” life too: try going to a country where cars drive on the other side of the road from yours!

While working on a MongoDB benchmark recently, I was hit by one of these assumptions. I thought sharing my mistake may help others who are also coming to MongoDB from an SQL background.

One of my computing assumptions can be summarized as “reading from a non-existent source will fail with an error.”

Sure enough, it seems to be true for my operating system:

telecaster:~ fernandoipar$ cat notexists cat: notexists: No such file or directory

And for MySQL:

mysql> select 1 from notexists; ERROR 1146 (42S02): Table 'test.notexists' doesn't exist

But what happens in MongoDB?

> db.notexists.find() > db.notexists.find().count() 0

No errors. You get no results. Interestingly, you can even count the number of documents in a cursor that is associated with a non-existent collection.

As I said, I hit this while working on a benchmark. How? I was comparing the throughput for different engines and various configurations, and after preparing the graphs they all showed the same behavior. While it’s not impossible for this to be accurate, it was very unlikely given what I was trying to measure. Some investigation led me to discover a mistake in the preparation phase of my benchmarks. To save time and to use the same data baseline for all tests, I was only running sysbench prepare once, backing up the data directory for each engine, and then restoring this backup before each experiment. The error was that I was restoring one subdirectory below MongoDB’s expected dbpath (i.e., to /data/db/db instead of /data/db), and so my scripts were reading from non-existent collections.

On a MySQL experiment, this would have immediately blown up in my face; with MongoDB, that is not the case.

On reflection, this behavior makes sense for MongoDB in that it is consistent with the write behavior. You don’t need to create a new collection, or even a new database. It’s enough that you write a document to it, and it gets created for you. If writing to a non-existent collection produces no errors, reading from one shouldn’t either.

Still, sometimes an application needs to know if a collection exists. How can you do this? There are multiple ways to do this, and I think the best approach is to verify their existence during the application initialization stage. Here are a couple of examples:

> db.notexists.stats() { "ok" : 0, "errmsg" : "Collection [test.notexists] not found." } > db.getCollectionNames().indexOf("notexists") >= 0 false

I hope you find this useful, and if you’ve been hit by similar problems (such as MongoDB and non-existent collections) due to broken assumptions when moving to MongoDB with an SQL background, I’d love to read about them in the comments!

Categories: MySQL

Scaling Percona XtraDB Cluster with ProxySQL in Docker Swarm

MySQL Performance Blog - Tue, 2016-06-14 17:07

In this post, we’ll look at scaling Percona XtraDB Cluster with ProxySQL in Docker Swarm.

In my previous post, I showed how to employ Percona XtraDB Cluster on multiple nodes in a Docker network.

The intention is to be able to start/stop nodes and increase/decrease the cluster size dynamically. This means that we should track running nodes, but also to have an easy way to connect to the cluster.

So there are two components we need: service discovery to register nodes and ProxySQL to handle incoming traffic.

The work with service discovery is already bundled with Percona XtraDB Cluster Docker images, and I have experimental images for ProxySQL

For multi-node management, we also need some orchestration tool, and a good start is Docker Swarm. Docker Swarm is simple and only provides basic functionality, but it works for a good start. (For more complicated setups, consider Kubernetes.)

I assume you have Docker Swarm running, but if not here is some good material on how to get it rolling. You also need to have service discovery running (see and my previous post).

To start a cluster with ProxySQL, we need a docker-compose definition file docker-compose.yml.:

version: '2' services: proxy: image: perconalab/proxysql networks: - front - Theistareykjarbunga ports: - "3306:3306" - "6032:6032" env_file: .env percona-xtradb-cluster: image: percona/percona-xtradb-cluster:5.6 networks: - Theistareykjarbunga ports: - "3306" env_file: .env networks: Theistareykjarbunga: driver: overlay front: driver: overlay

For convenience, both proxy and percona-xtradb-cluster share the same environment files (.env):


You can also get both files from

To start both the cluster node and proxy:

docker-compose up -d

We can start as many Percona XtraDB Cluster nodes as we want:

docker-compose scale percona-xtradb-cluster=5

The command above will make sure that five nodes are running.

We can check it with docker ps:

docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 725f5f2699cc percona/percona-xtradb-cluster:5.6 "/ " 34 minutes ago Up 38 minutes 4567-4568/tcp,>3306/tcp smblade04/swarm_percona-xtradb-cluster_5 1c85ea1367e8 percona/percona-xtradb-cluster:5.6 "/ " 34 minutes ago Up 38 minutes 4567-4568/tcp,>3306/tcp smblade04/swarm_percona-xtradb-cluster_2 df87e9c1342e percona/percona-xtradb-cluster:5.6 "/ " 34 minutes ago Up 38 minutes 4567-4568/tcp,>3306/tcp smblade04/swarm_percona-xtradb-cluster_4 cbb82f7a9789 perconalab/proxysql "/ " 36 minutes ago Up 40 minutes>3306/tcp,>6032/tcp smblade04/swarm_proxy_1 59e049fe22a9 percona/percona-xtradb-cluster:5.6 "/ " 36 minutes ago Up 40 minutes 4567-4568/tcp,>3306/tcp smblade04/swarm_percona-xtradb-cluster_1 0921a2611c3c percona/percona-xtradb-cluster:5.6 "/ " 37 minutes ago Up 42 minutes 4567-4568/tcp,>3306/tcp centos/swarm_percona-xtradb-cluster_3

We can see that Docker schedules containers on two different nodes, the Proxy SQL container is smblade04/swarm_proxy_1, and the connection point is

To register Percona XtraDB Cluster in ProxySQL we can just execute the following:

docker exec -it smblade04/swarm_proxy_1

The script will connect to the service discovery DISCOVERY_SERVICE (defined in .env file) and register nodes in ProxySQL.

To check that they are all running:

mysql -h10.20.2.66 -P6032 -uadmin -padmin MySQL [(none)]> select * from stats.stats_mysql_connection_pool; +-----------+-----------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+ | hostgroup | srv_host | srv_port | status | ConnUsed | ConnFree | ConnOK | ConnERR | Queries | Bytes_data_sent | Bytes_data_recv | Latency_ms | +-----------+-----------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+ | 0 | | 3306 | ONLINE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 212 | | 0 | | 3306 | ONLINE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 155 | | 0 | | 3306 | ONLINE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 136 | | 0 | | 3306 | ONLINE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 123 | | 0 | | 3306 | ONLINE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 287 | +-----------+-----------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+

We can connect to a cluster using a ProxySQL endpoint:

mysql -h10.20.2.66 -uproxyuser -psecret mysql -h10.20.2.66 -P3306 -uproxyuser -ps3cret -e "SELECT @@hostname" +--------------+ | @@hostname | +--------------+ | 59e049fe22a9 | +--------------+ mysql -h10.20.2.66 -P3306 -uproxyuser -ps3cret -e "SELECT @@hostname" +--------------+ | @@hostname | +--------------+ | 725f5f2699cc | +--------------+

We can see that we connect to a different node every time.

Now if we want to get crazy and make sure we have ten Percona XtraDB Cluster nodes running, we can execute the following:

docker-compose scale percona-xtradb-cluster=10 Creating and starting swarm_percona-xtradb-cluster_6 ... Creating and starting swarm_percona-xtradb-cluster_7 ... Creating and starting swarm_percona-xtradb-cluster_8 ... Creating and starting swarm_percona-xtradb-cluster_9 ... Creating and starting swarm_percona-xtradb-cluster_10 ...

And Docker Swarm will make sure ten nodes are running.

I hope this demonstrates that you can easily start playing with multi-nodes using Percona XtraDB Cluster. In the next post, I will show how to use Percona XtraDB Cluster with Kubernetes.

Categories: MySQL

RocksDB from Docker containers

MySQL Performance Blog - Mon, 2016-06-13 18:14

This post will discuss how to get RocksDB from Docker containers to use with Percona Server for MongoDB.

With our Percona Server for MongoDB 3.2 release, we made RocksDB a first class citizen. With this newly-available engine, we want to make it easy for everybody interested to try it. So it is now available in docker images from

If you have docker running, starting RocksDB is very easy:

docker run -d -p 27017:27017 percona/percona-server-mongodb --storageEngine=rocksdb

Then run:

mongo --eval "printjson(db.serverStatus())"

You should see this as part of the output:

"storageEngine" : { "name" : "rocksdb", "supportsCommittedReads" : true, "persistent" : true },

Give it a try, and let us know how RocksDB works for you!

Categories: MySQL

Webinar Thursday, June 16: MongoDB Schema Design

MySQL Performance Blog - Mon, 2016-06-13 15:49

Please join Jon Tobin, Director of Solutions Engineering at Percona on Thursday, June 16, 2016 10:00am PDT (UTC-7) for a webinar on “MongoDB® Schema Design.”

Jon will discuss the most common misconception when evaluating the use of MongoDB: that it is “schemaless.” THIS IS NOT TRUE. MongoDB has a document structure, and thus, a schema. While the structure is much more dynamic than that of most relational database models, choices that you make can and will pay themselves forward (or haunt you forever).

In this webinar, we’ll cover what a document is, how they can be structured, and what structures work (and don’t work) for a particular use case. We will also touch on design decisions and how they affect the ability of the cluster to scale in the future. Some of the topics that will be covered are:

  • Document Structure
  • Embedding vs Referencing
  • Normalization vs De-Normalization
  • Atomicity
  • MongoDB Sharding

Register here.

Jon Tobin, Director of Solution Engineering

When not saving kittens from sequoias or helping the elderly across busy intersections, Jon Tobin is Percona’s Director of Solutions Engineering. He has spent over 15 years in the IT industry. For the last 6 years, Jon has been helping innovative IT companies assess and address customer’s business needs through well-designed solutions.


Categories: MySQL

Percona XtraDB Cluster 5.6.30-25.16 is now available

MySQL Performance Blog - Fri, 2016-06-10 22:49

Percona is glad to announce the new release of Percona XtraDB Cluster 5.6 on June 10, 2016. Binaries are available from the downloads area or our software repositories.

Percona XtraDB Cluster 5.6.30-25.16 is now the current release, based on the following:

All of Percona software is open-source and free, and all the details of the release can be found in the 5.6.30-25.16 milestone on Launchpad.

For more information about relevant Codership releases, see this announcement.

New Features:

  • PXC now uses wsrep_desync_count introduced in Galera 3.16 by Codership, instead of the concept that was previously implemented by Percona. The following logic applies:
    • If a node is explicitly desynced, then implicitly desyncing a node using RSU/FTWRL is allowed.
    • If a node is implicitly desynced using RSU/FTWRL, then explicitly desyncing a node is blocked until implicit desync is complete.
    • If a node is explicitly desynced and then implicitly desycned using RSU/FTWRL, then any request for another implicit desync is blocked until the former implicit desync is complete.

Bugs Fixed:

  • Changing wsrep_provider while the node is paused or desynced is not allowed.
  • TOI now checks that a node is ready to process DDL and DML before starting execution, to prevent a node from crashing if it becomes non-primary.
  • The wsrep_row_upd_check_foreign_constraints function now checks that fk-reference-table is open before marking it open.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

Categories: MySQL

Running Percona XtraDB Cluster in a multi-host Docker network

MySQL Performance Blog - Fri, 2016-06-10 20:32

In this post, I’ll discuss how to run Percona XtraDB Cluster in a multi-host Docker network.

With our release of Percona XtraDB Cluster 5.7 beta, we’ve also decided to provide Docker images for both Percona XtraDB Cluster 5.6 and Percona XtraDB Cluster 5.7.

Starting one node is very easy, and not that different from starting Percona Server image. The only an extra requirement is to have the CLUSTER_NAME variable defined. The startup command might look like this:

docker run -d -p 3306:3306 -e MYSQL_ROOT_PASSWORD=Theistareyk -e CLUSTER_NAME=Theistareykjarbunga -e XTRABACKUP_PASSWORD=Theistare percona/percona-xtradb-cluster

You might also notice we can optionally define an XTRABACKUP_PASSWORD password, which a xtrabackup@localhost user will employ for the xtrabackup-SST method.

Running Percona XtraDB Cluster in single mode kind of defeats the purpose of having the cluster. With our docker images, we tried to resolve the following tasks:

  1. Run in multiple-host environment (followed by running in Docker Swarm and Kubernetes)
  2. Start as many nodes in the cluster as we want
  3. Register all nodes in the service discovery, so that the client can see how many nodes are running and their status
  4. Integrate with ProxySQL

Let’s review these points one by one.

Using a multi-host network is when a Docker network becomes helpful. The recent Docker versions come with a network overlay driver, which we will use to run a virtual network over multiple boxes. Starting Docker overlay network is out of scope for this post, but check out this great introduction material on how to get it working.

With the network running, we can create an overlay network for our cluster:

docker network create -d overlay cluster1_net

Then we can start containers:

docker run -d -p 3306 --net=cluster1_net -e MYSQL_ROOT_PASSWORD=Theistareyk -e CLUSTER_NAME=cluster1 ... -e XTRABACKUP_PASSWORD=Theistare percona/percona-xtradb-cluster

The cool bit is that we can start Percona XtraDB Cluster on any node in the network, and they will communicate over the virtual network.

If you want to stay within a single Docker host (for example during testing), you still can create a bridge network and use it in one host environment.

The script above will run . . . almost. The problem is that every additional node needs to know the address of the running cluster.

To address this (if you prefer a manual process) we introduced the CLUSTER_JOIN variable, which should point to the IP address of one running nodes (or be empty to start the new cluster).

In this case, getting the script above to work might look like below:

docker run -d -p 3306 --net=cluster1_net -e MYSQL_ROOT_PASSWORD=Theistareyk -e CLUSTER_NAME=cluster1 -e CLUSTER_JOIN= -e XTRABACKUP_PASSWORD=Theistare percona/percona-xtradb-cluster

I think manually tracking IP addresses requires unnecessary extra work, especially if we want to start and stop nodes on the fly. So we also decided to use service discovery — especially since you need it to run the Docker overlay network overlay. Right now we support the etcd discovery service, but it isn’t a problem to add more (such as Consul).

Starting etcd is also out of the scope of this post, but you can read about the procedure in the manual.

When you run etcd service discovery (on the host, for example) you can start the nodes:

docker run -d -p 3306 --net=cluster1_net -e MYSQL_ROOT_PASSWORD=Theistareyk -e CLUSTER_NAME=cluster1 -e DISCOVERY_SERVICE= -e XTRABACKUP_PASSWORD=Theistare percona/percona-xtradb-cluster

The node will register itself in the service discovery and will join existing $CLUSTER_NAME.

There is convenient way to check all nodes:

curl http://$ETCD_HOST/v2/keys/pxc-cluster/$CLUSTER_NAME/?recursive=true | jq { "action": "get", "node": { "key": "/pxc-cluster/cluster4", "dir": true, "nodes": [ { "key": "/pxc-cluster/cluster4/", "dir": true, "nodes": [ { "key": "/pxc-cluster/cluster4/", "value": "", "modifiedIndex": 19600, "createdIndex": 19600 }, { "key": "/pxc-cluster/cluster4/", "value": "2af0a75ce0cb", "modifiedIndex": 19601, "createdIndex": 19601 } ], "modifiedIndex": 19600, "createdIndex": 19600 }, { "key": "/pxc-cluster/cluster4/", "dir": true, "nodes": [ { "key": "/pxc-cluster/cluster4/", "value": "", "modifiedIndex": 26420, "createdIndex": 26420 }, { "key": "/pxc-cluster/cluster4/", "value": "cfb29833f1d6", "modifiedIndex": 26421, "createdIndex": 26421 } ], "modifiedIndex": 26420, "createdIndex": 26420 } ], "modifiedIndex": 19600, "createdIndex": 19600 } }

With this, you can start as many cluster nodes as you want and on any host in Docker Network. Now it is convenient to use an SQL proxy in front of the cluster. In this case, we will use ProxySQL (I will show that in a follow-up post).

In later posts, we will also review how to run Percona XtraDB Cluster nodes in an orchestration environment (like Docker Swarm and Kubernetes).

Categories: MySQL

Percona Monitoring and Management 1.0.1 Beta

MySQL Performance Blog - Fri, 2016-06-10 20:05

Percona is glad to announce the release of Percona Monitoring and Management 1.0.1 Beta on 10 June, 2016.

Like prior versions, PMM is distributed through Docker Hub and is free to download. Full instructions for download and installation of the server and client are available in the documentation.

Notable changes to the tool include:

  • Grafana 3.0
  • Replaced custom web server with NGINX
  • Eliminated most of the ports for PMM server container (now only two – 9001 and configurable 80)
  • Updated to the latest versions of Prometheus, exporters, QAN agent
  • Added mongodb_exporter
  • Added MongoDB dashboards
  • Replaced prom-config-api with Consul
  • Improvements to pmm-admin and ability to set server address with the port
  • Added “Server Summary” with aggregated query metrics to QAN app
  • MySQL dashboard updates, added “MySQL InnoDB Metrics Advanced” dashboard
The new server summary in PMM Beta 1.0.1



Metric rates in Query Analytics



Available Dashboards in Metrics


Full documentation is available, and includes details on installation and architecture, and a demonstration of the tool has been set up at

We have also implemented forums for the discussion of PMM.

Help us improve our software quality by reporting any bugs you encounter using our bug tracking system. As always, thanks for your continued support of Percona!

Categories: MySQL
Syndicate content