The Best Activity Tracking Watch

Xaprb, home of innotop - Tue, 2017-02-21 20:31

After thinking about smart watches, activity trackers, and similar devices for a while, I bought a Withings Steel HR. My goal was to find a traditional stylish-looking watch with long battery life, heart rate tracking, sleep tracking, and activity tracking. Here’s my experience thus far.

TL;DR: Using the Withings Steel HR has changed the way I use my smartphone. I love how much less distracted I am. I am happy with the health tracking features, and I like the traditional watch styling and long battery life.

In the last few years I’ve kept an interested eye on the explosion of health- and fitness-tracking devices, considering whether it was time for me to take the plunge and get one. Broadly speaking, there are a few categories of devices on the market, each appealing to a different user for various reasons. Here’s my summary:

  • If you want an iPhone on your wrist, get an Apple Watch. It has poor battery life but it’s seamlessly integrated with your Apple lifestyle.
  • If you want the best fitness/activity tracking, look into FitBit and Garmin. The hardware, apps, and tracking features are unmatched.
  • If you want a stylish analog watch with long battery life and full-featured activity tracking, check out Withings.
  • If you want a hybrid digital watch with pseudo-analog styling, take a look at the Garmin 235 or Ticwear.

There are many more choices than these, but those represent some of the leaders in the field. For more detail, read on.

Today’s smartwatches have many features that go far beyond watch functionality. You can make and receive calls and texts, respond to emails, dismiss calendar alarms, get navigation guidance, and much more. In addition to features you’re used to getting from a smartphone, many devices offer a broad range of features to measure your activity and vital signs: heart rate tracking, step counting, GPS tracking for runs and walks, sleep tracking, and so on.

I was looking for some of what I consider to be the most important features of these, without sacrificing the form factor and style of an analog watch. The Withings Steel HR, a new offering on the market, was my choice because as far as I could tell, it was the only analog watch that tracks heart rate, steps, and sleep, and has a good battery life. I pre-ordered it before it was available and have had it for a couple of months now. I’ll review it in detail below.

I also have some experience with friends who’ve had some of these devices. As with everything, these are intensely personal choices, and it’s best to go to a store where you can see and try as many of the models as possible. For example, FitBit is easily one of the leaders in the field of wearables, and makes fantastic products. But are they right for me? One of my friends doesn’t like her FitBit because of the flickering lights she sees at night in bed, which I know would bother me. I was told the FitBit Charge doesn’t truly detect your sleep automatically, though I’m not sure of that, and it may not be waterproof. And several friends complained about the difficulty of using all its many features. As for me, I simply don’t care for the aesthetics of the FitBit, nor the way it feels on my wrist; it has a rigid band that I dislike.

Here’s a summary of some wearables and their features, to give a sense of the market. This is not exhaustive and may not be current or accurate. Note that I’m not an Android user and there are far too many options in the Android market to summarize here. I’m also not listing Jawbone; I couldn’t find anyone who was willing to recommend a Jawbone.

Item              | Category           | Battery | Heart Rate | Sleep | Other Features
Apple Watch       | Smart Watch        | Short   | Yes        |       | Activities, Notifications, Apps
Ticwear           | Smart Watch        | Short   | Yes        |       | Activities, Notifications, GPS
Garmin 235        | Watch-Like Tracker | Medium  | Yes        | Yes   | Activities, Notifications, GPS
Garmin VivoMove   | Augmented Watch    | Long    |            | Yes   |
Withings Steel HR | Augmented Watch    | Long    | Yes        | Yes   | Activities, Notifications
Misfit Phase      | Augmented Watch    | Long    |            | Yes   | Activities
Fitbit Charge 2   | Fitness Tracker    | Medium  | Yes        | Yes   | Activities, Notifications
Garmin VivoSmart  | Fitness Tracker    | Medium  | Yes        | Yes   | Activities, Notifications, GPS

Each of these makes and models typically has options or related products that have more or less functionality. For example, Withings offers an Activite Pop that is simpler than the Steel HR, and doesn’t track heart rate. But its battery lasts 8 months instead of 25 days. Misfit has a more fully-featured watch, but it’s not analog. And so on.

It’s also worth noting that the space is fast-moving. While I was biding my time, trying to decide what I wanted, at least three watch and wearable companies were acquired or went out of business, and several new options became available.

The Withings Steel HR

My favorite watch was the Withings Steel HR. I’m a traditional, simplistic watch guy; my analog watch is a Timex Weekender. I wanted a minimalistic analog watch with long battery life and the following features if possible:

  • Sleep, heart rate, and step tracking
  • Activity tracking; I could take or leave this feature
  • I wasn’t really interested in text messages, notifications, and the like
  • Other bonuses: waterproof, quick charging

One of the reasons I wanted a minimalistic product, with fewer smartphone features and more activity-tracking features, is that in my experience a small device with a lot of features is hard to use. I’d rather have a few easy-to-use features than many. This biased me away from devices like the Garmin Forerunner 235 and Ticwatch.

The Withings Steel HR tracks steps continuously, and heart rate every few minutes instead of continuously, but has an exercise mode (just long-press the button on the side to activate) that tracks continuously. It tracks sleep automatically, detecting when you’re in bed and light/deep sleep. It’s able to vibrate on your wrist when you get a text message, call, or calendar notification, and displays the caller ID or similar. And it can act as an alarm clock, vibrating to wake you.

It also auto-detects activities and the companion app lets you set goals and review your health statistics.

It’s mostly an analog watch in appearance, although it has a notification area where caller ID and the like appear, and a goal tracker to show how much of your daily step tracking goal you’ve achieved.

I got the black 36mm model. I like the styling. I have found it functional, and I appreciate the long battery life. The band is very comfortable and flexible. I wear my Withings 24x7, even in the shower. Here’s a breakdown of how well things work:

  • The watch hands are slightly hard to see depending on the lighting, because they aren’t white; they are polished stainless steel or similar.
  • Sleep tracking is reasonably good, though it usually thinks I’ve gone to bed before I really do. Sometimes I sit on the couch and work in the evenings for a couple of hours, typing on my laptop or writing in my notebook, and it detects that I’m in “light sleep” during this time.
  • Heart rate tracking is only directionally accurate. Sometimes I look at it in the middle of an intense workout and it’s reporting a heart rate of 62 when I’m pretty sure I’m well above 120. I’ve found it to report high heart rates when I’m at rest, too. I’ve also found long gaps in the tracking when I review the statistics in the app, such as at night. It’s reasonably accurate, though, and over the long term it should be a good gauge of my resting heart rate trend, which is what I care about most.
  • Step tracking is quite accurate, to within 1% or so in my tests. I am unsure how the step measurements from my iPhone are reconciled with the step measurements from the watch. Maybe they are independent.
  • The battery life is about 15-20 days for me, depending on how often I activate the workout mode.
  • Waterproof enough that I wear it in the shower. I’ve found it to mist a bit in hot weather in direct sun once.
  • The setup was a bit finicky; syncing it to my phone with Bluetooth took a couple of tries initially. Since then it’s been fine.
  • The iPhone app is probably not as good as Garmin’s or FitBit’s, but it’s pretty good.
  • Text notifications don’t seem to work. (I have an iPhone). I don’t know about calendar notifications, because I don’t use the iPhone calendar app. Update: Withings support told me that a non-obvious iPhone notification setting is necessary to make text notifications work. It’s worked very well for me since then.
  • Call notifications work well, and the caller ID displays quickly and is surprisingly usable for such a small area.
  • At first I thought that the alarm didn’t work, but now I have no trouble with it.

All in all, I’m happy with it. And I have to say, I’ve changed my mind about getting notifications. My phone/text stays out of sight more now, and isn’t something I always have in front of me. And I have notification sounds turned to very low volume now, so calls and texts are unobtrusive. I don’t miss calls or texts as much anymore because of loud ambient noise or whatever. In general, I notice calls and texts more reliably now, and people around me don’t, and I fuss with my phone less. It’s an unexpected win for me.

If I were to use something else instead, it would probably be the Fitbit Charge line of products. What are your thoughts and experiences using any of these devices?


Categories: MySQL

MongoDB 3.4 Bundle Release: Percona Server for MongoDB 3.4, Percona Monitoring and Management 1.1, Percona Toolkit 3.0 with MongoDB

MySQL Performance Blog - Mon, 2017-02-20 21:51

This blog post is the first in a series on Percona’s MongoDB 3.4 bundle release. This release includes Percona Server for MongoDB, Percona Monitoring and Management, and Percona Toolkit. In this post, we’ll look at the features included in the release.

We have a lot of great MongoDB content coming your way in the next few weeks. However, I wanted first to give you a quick list of the major things to be on the lookout for.

This new bundled release ensures a robust, secure database that you can adapt to changing business requirements. It helps demonstrate how organizations can use MongoDB (and Percona Server for MongoDB), PMM and Percona Toolkit together to benefit from the cost savings and agility provided by free and proven open source software.

Percona Server for MongoDB 3.4 delivers all the latest MongoDB 3.4 Community Edition features, additional Enterprise features and a greater choice of storage engines.

Some of these new features include:

  • Shard member types. All nodes now need to know what they do – this helps with reporting and architecture planning more than the underlying code, but it’s an important first step.
  • Sharding balancer moved to config server primary
  • Configuration servers must now be a replica set
  • Faster balancing – up to (shard count / 2) balancing migrations can now run concurrently!
  • Sharding and replication tags renamed to “zones” – again, an important first step
  • Default write behavior moved to majority – this could majorly impact many workloads, but moving to a default safe write mode is important
  • New decimal data type
  • Graph aggregation functions – we will talk about these more in a later blog, but for now note that graph and faceted searches are added.
  • Collations added to most access patterns for collections and databases
  • . . .and much more
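To give a taste of the new graph aggregation support, here is a small sketch of a `$graphLookup` pipeline (new in MongoDB 3.4) that walks a reports-to hierarchy. The `employees` collection and its fields are made up for illustration; only the pipeline shape comes from MongoDB 3.4.

```python
# Sketch: a $graphLookup aggregation pipeline (new in MongoDB 3.4).
# The "employees" collection and its fields are hypothetical; with
# pymongo you would run it via db.employees.aggregate(pipeline).
def reporting_chain_pipeline(start_name):
    return [
        # Start from one employee document
        {"$match": {"name": start_name}},
        # Recursively follow reportsTo -> name links up the hierarchy
        {"$graphLookup": {
            "from": "employees",            # collection to search recursively
            "startWith": "$reportsTo",      # initial value to look up
            "connectFromField": "reportsTo",
            "connectToField": "name",
            "as": "managementChain",        # output array of matched docs
        }},
    ]
```

The result of each matched document gains a `managementChain` array containing the whole chain of managers, something that previously required multiple round trips from the application.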

Percona Server for MongoDB includes all the features of MongoDB Community Edition 3.4, providing an open source, fully-compatible, drop-in replacement with many improvements, such as:

  • Integrated, pluggable authentication with LDAP that provides a centralized enterprise authentication service
  • Open-source auditing for visibility into user and process actions in the database, with the ability to redact sensitive information (such as user names and IP addresses) from log files
  • Hot backups for the WiredTiger engine to protect against data loss in the case of a crash or disaster, without impacting performance
  • Two storage engine options not supported by MongoDB Community Edition 3.4 (doubling the total engine count choices):
    • MongoRocks, the RocksDB-powered storage engine, designed for demanding, high-volume data workloads such as in IoT applications, on-premises or in the cloud.
    • Percona Memory Engine is ideal for in-memory computing and other applications demanding very low latency workloads.

Percona Monitoring and Management 1.1

  • Support for MongoDB and Percona Server for MongoDB
  • Graphical dashboard information for WiredTiger, MongoRocks and Percona Memory Engine
  • Cluster and replica set wide views
  • Many more graphable metrics available for both the OS and the database layer than currently provided by other tools in the ecosystem

Percona Toolkit 3.0

  • Two new tools for MongoDB are now in Percona’s Toolkit:
    • pt-mongodb-summary (the equivalent of pt-mysql-summary) provides a quick, at-a-glance overview of a MongoDB and Percona Server for MongoDB instance
      • This is useful for any DBA who wants a general idea of what’s happening in the system, what the state of their cluster/replica set is, and more.
    • pt-mongodb-query-digest (the equivalent of pt-query-digest for MySQL) offers a query review for troubleshooting
      • Query digest is one of the most-used Toolkit features ever, and MongoDB is no different. Typically you might only look at your best and worst query times and document scans. However, this tool also shows 90th percentiles, and whether your top 10 queries take seconds versus minutes.
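To illustrate why a 90th percentile is more informative than best/worst times alone, here is a tiny Python sketch. The latency numbers are made up; pt-mongodb-query-digest derives its statistics from real profiling data.

```python
# Illustration only: min/max hide the shape of the distribution, while
# p90 shows what the slowest "typical" executions look like.
from statistics import quantiles

def p90(samples):
    # quantiles(n=10) returns the nine decile cut points; index 8 is p90
    return quantiles(samples, n=10)[8]

latencies_ms = [4, 5, 5, 6, 7, 9, 11, 12, 15, 80]  # made-up samples
print("min:", min(latencies_ms),
      "max:", max(latencies_ms),
      "p90:", p90(latencies_ms))
```

Here the max (80 ms) is a single outlier, while p90 tells you most executions finish far faster, which is the kind of distinction a query digest surfaces per query fingerprint.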

For all of these topics, you will see more blogs in the next few weeks that cover them in detail. Some people have asked what Percona’s MongoDB commitment looks like. Hopefully, this series of blogs helps show how improving open source databases is central to the Percona vision. We are here to make the world better for developers, DBAs and other MongoDB users.


Percona Toolkit 3.0.1 is now available

MySQL Performance Blog - Mon, 2017-02-20 21:50

Percona announces the availability of Percona Toolkit 3.0.1 on February 20, 2017. This is the first general availability (GA) release in the 3.0 series, with a focus on adding MongoDB tools.

Downloads are available from the Percona Software Repositories.

NOTE: If you are upgrading using Percona’s yum repositories, make sure that you enable the basearch repo, because Percona Toolkit 3.0 is not available in the noarch repo.

Percona Toolkit is a collection of advanced command-line tools that perform a variety of MySQL and MongoDB server and system tasks too difficult or complex for DBAs to perform manually. Percona Toolkit, like all Percona software, is free and open source.

This release includes changes from the previous 3.0.0 RC and the following additional changes:

  • Added requirement to run pt-mongodb-summary as a user with the clusterAdmin or root built-in roles.

You can find release details in the release notes. Bugs can be reported on Toolkit’s launchpad bug tracker.


Percona Monitoring and Management 1.1.1 is now available

MySQL Performance Blog - Mon, 2017-02-20 21:49

Percona announces the release of Percona Monitoring and Management 1.1.1 on February 20, 2017. This is the first general availability (GA) release in the PMM 1.1 series with a focus on providing alternative deployment options for PMM Server:

NOTE: The AMIs and VirtualBox images above are still experimental. For production, it is recommended to run Docker images.

The instructions for installing Percona Monitoring and Management 1.1.1 are available in the documentation. Detailed release notes are available here.

There are no changes compared to the previous 1.1.0 Beta release, except small fixes for the MongoDB metrics dashboards.

A live demo of PMM is available at

We welcome your feedback and questions on our PMM forum.

About Percona Monitoring and Management
Percona Monitoring and Management is an open-source platform for managing and monitoring MySQL and MongoDB performance. Percona developed it in collaboration with experts in the field of managed database services, support and consulting.

PMM is a free and open-source solution that you can run in your own environment for maximum security and reliability. It provides thorough time-based analysis for MySQL and MongoDB servers to ensure that your data works as efficiently as possible.


Percona Server for MongoDB 3.4.2-1.2 is now available

MySQL Performance Blog - Mon, 2017-02-20 21:49

Percona announces the release of Percona Server for MongoDB 3.4.2-1.2 on February 20, 2017. It is the first general availability (GA) release in the 3.4 series. Download the latest version from the Percona web site or the Percona Software Repositories.

Percona Server for MongoDB is an enhanced, open source, fully compatible, highly-scalable, zero-maintenance downtime database supporting the MongoDB v3.4 protocol and drivers. It extends MongoDB with Percona Memory Engine and MongoRocks storage engine, as well as several enterprise-grade features:

Percona Server for MongoDB requires no changes to MongoDB applications or code.

This release is based on MongoDB 3.4.2 and includes changes from PSMDB 3.4.0 Beta and 3.4.1 RC, plus the following additional changes:

  • Fixed the audit log message format to comply with upstream MongoDB:
    • Changed params document to param
    • Added roles document
    • Fixed date and time format
    • Changed host field to ip in the local and remote documents

Percona Server for MongoDB 3.4.2-1.2 release notes are available in the official documentation.


How Venture Capitalists Have Helped Me

Xaprb, home of innotop - Sun, 2017-02-19 20:20

Venture capital is a competitive industry. Investors compete to win the best companies, so they pitch founders on the value they bring to their portfolio companies. When I was a new founder, their pitches didn’t resonate with me. I found it difficult to understand how they could help. A few years later, I get it; they really can add value. This is what I’ve found so far.

When I first began speaking to potential venture capital investors, I felt as though their pitches to me were all variations on “you’ll get access to our network.” This fell on mostly deaf ears. At that point in my career, it was actually an anti-value for me. My experience of “networking” was associated with cliques and special privileges shared between people based on belonging to a club. It sounded like a fraternity’s pitch to a would-be pledge, to be honest.

At this point I’ve had a few years’ experience working with some fairly active and involved investors, and on reflection, I see more of the value than I did at the beginning. If I were to explain this to past-me, here’s what I might try to emphasize.


Recruiting

In the early years, I didn’t understand how important it would be to hire carefully, nor how difficult and time-consuming it would be. I thought I was good at hiring, but I was wrong. This is a book-length topic, but my venture capital investors have helped in several concrete ways between then and now.

  • Selecting a recruiter. Recruiters have turned out to be far more important than I’d expected. My executive recruiter, in particular, has become an extension of my team and a true partner to the business. But recruiting, as an industry, in many ways deserves the reputation it has. Exceptional recruiters are exceptional. My investors helped me find a recruiter who is good for me and for our business. Without my investors’ recruiting services, I’d probably have hired a bad recruiter, or one who was good at something I didn’t need (or good for a different company but not mine), wasted a lot of time and money, and potentially even failed to find good people for vital positions at crucial moments in the company’s growth. This is a life-and-death matter. The in-house recruiting specialists at my investors have made a huge difference here.
  • Introductions. Several of my most important hires have come through the wisdom and judgment of my investors combined with their extensive networks. Timing is so, so important. By knowing the right person at the right time, they’ve helped find the needle in the haystack.
  • Closing. High-performing people are rarely “on the job market” and are careful with their careers. Joining an early-stage startup with a first-time founder/CEO is a pretty risky move. Without investors, many people never would have even started a conversation with me, and the investors have been instrumental in getting to yes. The investors have helped explain what they saw in the company and its opportunity, lending an independent, third-party perspective that I would be unable to. “Why did you invest in this company?” is a question only an investor can answer.
  • Understanding The Market. My investors have a much broader view of what’s normal and expected in the industry, and can quickly give advice and guidance on what will and won’t work in recruiting-related matters. They’re scouts reporting from the front lines. They can help vet for common mistakes in our processes, provide data on compensation norms, give strategic advice on closing a particular candidate, and so on.

Note that my investors’ recruiting services aren’t for doing recruiting, per se. They’re for helping my company succeed in our own recruiting efforts.

Planning And Cross-Checking

Investors, both actual and potential, have helped review and clarify my plans. They have found things I overlooked, pointed out errors in my logic, and made my models much more rigorous. They have helped me understand the common language of things such as operating plans, showing me what types of models will be quick to evaluate and provide good answers, as well as what’s conventional and therefore easy to pattern-match.

My board member at NEA, Arjun Aggarwal, has spent a great deal of time helping build models for many aspects of the business, helping turn thoughts into spreadsheets. This is not typical; board members aren’t usually this active and involved. Arjun adds a lot of value to the team by doing this.

Speaking to investors generally results in at least some type of challenge to my thinking, even if very diplomatic. Every question is an effort to go a bit deeper. When I speak to venture capitalists, I write down the questions they ask me. Common themes always emerge. I am not a venture capitalist and don’t think like one. Being able to review my notes and see where I need to focus, both for their sake and for mine, is invaluable to me.

Pitch Practice And Feedback

I’m not a pitcher by nature. But virtually everything I do involves summarizing the business’s value, current status, and opportunity to someone, whether that’s a potential recruit, an investor, a partner, a customer, the board of directors, the all-staff meetings, and so on. Venture capitalists provide feedback on how well I’m doing that.

My investors have also gone beyond the call of duty to help me understand how to pitch better, build a better deck, and helped me with pitch practice and rehearsal. As I’ve leaned into this process, I’ve found it useful all day, every day.

When I’ve pitched potential investors, I’ve found it very useful to note and decode their feedback. Some will not say no in a direct way, leaning on compliments followed by encouragement to stay in touch. Others will take time to be very specific about why they’ve decided not to invest. Their feedback is clear guidance as to what they think the business should focus on achieving. It has to be taken with a grain of salt, but collating this feedback often results in advice that’s less conflicting than some other sources I’ve gone to for help. It also points out where I’m just doing a bad job communicating our strengths; I’ve gotten feedback that we should do X when, in fact, we already do X and I just wasn’t saying it very well.

Press and Media Relations

Early startups generally can’t and shouldn’t spend money on an expensive PR firm. Both of my major investors have PR staff and services who have helped us with periodic work we otherwise wouldn’t have had resources to do well.

Similar to recruiters, PR firms are probably a trap for founders pretty often—not that they mean badly, but you need to know how to work with them or you’ll steer yourself astray. Working with our investors’ PR experts instead of with agencies has allowed us to get lots of help at particular times, without taking a big risk on a long-term commitment.

Introductions To Advisors

Various introductions to advisors, entrepreneurs-in-residence, and other helpful people have come through my investors. Many of these people have generously spent significant amounts of time with me and others on the team. We’ve dodged many serious mistakes as a result. We’ve also seized on opportunities we didn’t see ourselves, and found alternative ways to do things that produced surprising results at times. This is true both on the business and the technical sides.


If you’d asked me in 2013, I think I would have said that investors were perhaps exaggerating how much they could help us. I’d have said “all they do is say they’ll make introductions, and introductions are just going to use up precious time I need to conserve.” That’s not what I’ve found. I’ve received help I didn’t expect, didn’t know I needed, and that has made a big difference to the business.

PS: If I’ve omitted anything you’ve done for me, it’s forgetfulness, not passive aggressiveness.



MySQL Bug 72804 Workaround: “BINLOG statement can no longer be used to apply query events”

MySQL Performance Blog - Thu, 2017-02-16 23:39

In this blog post, we’ll look at a workaround for MySQL bug 72804.

Recently I worked on a ticket where a customer performed a point-in-time recovery (PITR) using a large set of binary logs. Normally we handle this by applying the last backup, then re-applying all binary logs created since the last backup. In the middle of the procedure, their new server crashed. We identified the binary log position and tried to restart the PITR from there. However, using the option --start-position, the restore failed with the error “The BINLOG statement of type Table_map was not preceded by a format description BINLOG statement.” This is a known bug and is reported as MySQL Bug #72804: “BINLOG statement can no longer be used to apply Query events.”

I created a small test to demonstrate the workaround that we implemented (and that worked).

First, I ran a large import process that created several binary logs. I used a small value for max_binlog_size and tested using the database “employees” (a standard database used for testing). Then I dropped the database.

mysql> set sql_log_bin=0;
Query OK, 0 rows affected (0.33 sec)
mysql> drop database employees;
Query OK, 8 rows affected (1.25 sec)

To demonstrate the recovery process, I joined all the binary log files into one SQL file and started an import.

sveta@Thinkie:~/build/ps-5.7/mysql-test$ ../bin/mysqlbinlog var/mysqld.1/data/master.000001 var/mysqld.1/data/master.000002 var/mysqld.1/data/master.000003 var/mysqld.1/data/master.000004 var/mysqld.1/data/master.000005 > binlogs.sql
sveta@Thinkie:~/build/ps-5.7/mysql-test$ binlogs.sql
sveta@Thinkie:~/build/ps-5.7/mysql-test$ mysql < binlogs.sql
ERROR 1064 (42000) at line 9020: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'inserting error

I intentionally generated a syntax error in the resulting file with the help of a script (which just inserts a bogus SQL statement in a random row). The error message clearly showed where the import stopped: line 9020. I then created a file that cropped out the part that had already been imported (lines 1-9020), and tried to import this new file.

sveta@Thinkie:~/build/ps-5.7/mysql-test$ tail -n +9021 binlogs.sql > binlogs_rest.sql
sveta@Thinkie:~/build/ps-5.7/mysql-test$ mysql < binlogs_rest.sql
ERROR 1609 (HY000) at line 134: The BINLOG statement of type `Table_map` was not preceded by a format description BINLOG statement.

Again, the import failed with exactly the same error as the customer. The reason for this error is that the BINLOG statement – which applies changes from the binary log – expects that the format description event gets run in the same session as the binary log import, but before it. The format description existed initially at the start of the import that failed at line 9020. The later import (from line 9021 on) doesn’t contain this format statement.

Fortunately, this format description event is the same for a given server version! We can simply take it from the beginning of the SQL log file (or the original binary log file) and put it at the top of the file created after the crash, the one without lines 1-9020.

With MySQL versions 5.6 and 5.7, this event is located in the first 11 rows:

sveta@Thinkie:~/build/ps-5.7/mysql-test$ head -n 11 binlogs.sql | cat -n
     1  /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
     2  /*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
     3  DELIMITER /*!*/;
     4  # at 4
     5  #170128 17:58:11 server id 1 end_log_pos 123 CRC32 0xccda074a Start: binlog v 4, server v 5.7.16-9-debug-log created 170128 17:58:11 at startup
     6  ROLLBACK/*!*/;
     7  BINLOG '
     8  g7GMWA8BAAAAdwAAAHsAAAAAAAQANS43LjE2LTktZGVidWctbG9nAAAAAAAAAAAAAAAAAAAAAAAA
     9  AAAAAAAAAAAAAAAAAACDsYxYEzgNAAgAEgAEBAQEEgAAXwAEGggAAAAICAgCAAAACgoKKioAEjQA
    10  AUoH2sw=
    11  '/*!*/;

Lines 1-6 are meta information and a ROLLBACK, and lines 7-11 are the format description event itself. The only thing we need to copy into our resulting file is these first 11 lines:

sveta@Thinkie:~/build/ps-5.7/mysql-test$ head -n 11 binlogs.sql > binlogs_rest_with_format.sql
sveta@Thinkie:~/build/ps-5.7/mysql-test$ cat binlogs_rest.sql >> binlogs_rest_with_format.sql
sveta@Thinkie:~/build/ps-5.7/mysql-test$ mysql < binlogs_rest_with_format.sql
sveta@Thinkie:~/build/ps-5.7/mysql-test$

After this, the import succeeded!
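The manual head/tail steps above can be wrapped in a small helper for repeated use. This is a sketch, not part of any Percona tool; the default of 11 header lines matches the format description event location in MySQL 5.6 and 5.7 as described above, and the function names are made up.

```python
# Sketch: rebuild a resumable binlog import file after a failure
# at a known line. header_lines=11 matches MySQL 5.6/5.7, where the
# format description event occupies the first 11 lines of the dump.
def build_resume_lines(sql_lines, failed_line, header_lines=11):
    # Keep the format description header (lines 1..header_lines),
    # then everything AFTER the line where the import failed.
    return sql_lines[:header_lines] + sql_lines[failed_line:]

def build_resume_file(src_path, failed_line, dst_path):
    with open(src_path) as src:
        lines = src.readlines()
    with open(dst_path, "w") as dst:
        dst.writelines(build_resume_lines(lines, failed_line))
```

For the example in this post, `build_resume_file("binlogs.sql", 9020, "binlogs_rest_with_format.sql")` would reproduce the head + tail + cat sequence shown earlier in one step.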


Percona Blog Poll Results: What Programming Languages Are You Using for Backend Development?

MySQL Performance Blog - Thu, 2017-02-16 21:53

In this blog we’ll look at the results from Percona’s blog poll on what programming languages you’re using for backend development.

Late last year we started a poll on what backend programming languages are being used by the open source community. The three components of the backend – server, application, and database – are what makes a website or application work. Below are the results of Percona’s poll on backend programming languages in use by the community:


One of the best-known and earliest web service stacks is the LAMP stack, which spelled out refers to Linux, Apache, MySQL and PHP/Perl/Python. We can see that this early model is still popular when it comes to the backend.

PHP still remains a very common choice for a backend programming language, with Python moving up the list as well. Perl seems to be fading in popularity, despite being used a lot in the MySQL world.

Java is also showing signs of strength, demonstrating the strides MySQL is making in enterprise applications. We can also see that JavaScript is increasingly being used not only as a front-end programming language, but also as a back-end language with Node.js.

Finally, Go is a language to look out for. Go is an open source programming language created by Google. It first appeared in 2009, and is already more popular than Perl or Ruby according to this poll.

Thanks to the community for participating in our poll. You can take our latest poll, on what database engine you are using to store time-series data, here.


MariaDB at Percona Live Open Source Database Conference 2017

MySQL Performance Blog - Thu, 2017-02-16 18:08

In this blog, we’ll look at how we plan to represent MariaDB at Percona Live.

The MariaDB Corporation is organizing a conference called M17 on the East Coast in April. Some Perconians (Peter Zaitsev, Vadim Tkachenko, Sveta Smirnova, Alex Rubin, Colin Charles) decided to submit some interesting talks for that conference. Percona also offered to sponsor the conference.

As of this post, the talks haven’t been accepted, and we were politely told that we couldn’t sponsor.

Some of the proposed talks were:

  • MariaDB Backup with Percona XtraBackup (Vadim Tkachenko)
  • Managing MariaDB Server operations with Percona Toolkit (Colin Charles)
  • MariaDB Server Monitoring with Percona Monitoring and Management (Peter Zaitsev)
  • Securing your MariaDB Server/MySQL data (Colin Charles, Ronald Bradford)
  • Data Analytics with MySQL, Apache Spark and Apache Drill (Alexander Rubin)
  • Performance Schema for MySQL and MariaDB Troubleshooting (Sveta Smirnova)

At Percona, we think MariaDB Server is an important part of the MySQL ecosystem. This is why the Percona Live Open Source Database Conference 2017 in Santa Clara has a MariaDB mini-track, consisting of talks from various Percona and MariaDB experts.

If any of these topics look enticing, come to the conference. We have MariaDB at Percona Live.

To make your decision easier, we’ve created a special promo code that gets you $75 off a full conference pass! Just use MariaDB@PL17 at checkout.

In the meantime, we will continue to write and discuss MariaDB, and any other open source database technologies. The power of the open source community is the free exchange of ideas, healthy competition and open dialog within the community.

Some of our past presentations are also relevant.

Categories: MySQL

Group Replication: Shipped Too Early

MySQL Performance Blog - Thu, 2017-02-16 00:02

This blog post is my overview of Group Replication technology.

With Oracle clearly entering the “open source high availability solutions” arena with the release of their brand new Group Replication solution, I believe it is time to review the quality of the first GA (production ready) release.

TL;DR: Having examined the technology, it is my conclusion that Oracle seems to have released the GA version of Group Replication too early. While the product is definitely “working prototype” quality, the release seems rushed and unfinished. I found a significant number of issues, and I would personally not recommend it for production use.

It is obvious that Oracle is trying hard to ship technology to compete with Percona XtraDB Cluster, which is probably why they rushed to claim Group Replication GA quality.

If you’re all set to follow along and test Group Replication yourself, simplify the initial setup by using this Docker image. We can review some of the issues you might face together.

For the record, I tested the version based on MySQL 5.7.17 release.

No automatic provisioning

First off, you'll find that there is NO way to automatically set up a new node.

If you need to set up a new node or recover an existing node from a fatal failure, you'll need to manually provision the slave.

Of course, you can clone a slave using Percona XtraBackup or LVM with some self-developed scripts. But given the high-availability nature of the product, one would expect Group Replication to re-provision any failed node automatically.

Bug: stale reads on nodes

Please see this bug:

One-line summary: while secondary nodes are “catching up” with whatever happened on the first node (it takes time to apply changes on secondary nodes), reads on a secondary node can return stale data (as shown in the bug report).

This behavior brings us back to the traditional asynchronous replication slave behavior (i.e., Group Replication’s predecessor).

It also contradicts the Group Replication documentation, which states: “There is a built-in group membership service that keeps the view of the group consistent and available for all servers at any given point in time.”

I might also mention here that Percona XtraDB Cluster prevents stale reads.

Bug: nodes become unusable after a big transaction, refusing to execute further transactions

There are two related bugs:

One-line summary: after running a big transaction, secondary nodes become unusable and refuse to perform any further transactions.

Obscure error messages

It is not uncommon to see cryptic error messages while testing Group Replication. For example:

mysql> commit;
ERROR 3100 (HY000): Error on observer while running replication hook 'before_commit'.

This is fairly useless and provides little help until I check the mysqld error log. The log provides a little bit more information:

2017-02-09T02:05:36.996776Z 18 [ERROR] Plugin group_replication reported: '[GCS] Gcs_packet's payload is too big. Only the packets smaller than 2113929216 bytes can be compressed.'
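For scale, it helps to see what the limit quoted in that message amounts to (plain arithmetic on the number from the log, nothing version-specific):

```shell
# The GCS packet limit from the error message, expressed in MiB
limit=2113929216
echo "$((limit / 1024 / 1024)) MiB"   # prints "2016 MiB", i.e. just under 2 GiB
```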


The issues listed above might not seem too bad at first, and you could assume that your workload won’t be affected. However, stale reads and node dysfunctions basically prevent me from running a more comprehensive evaluation.

My recommendation:

If you care about your data, then I recommend not using Group Replication in production. Currently, it looks like it might cause plenty of headaches, and it is easy to get inconsistent results.

For the moment, Group Replication looks like a more advanced – but broken – version of traditional MySQL asynchronous replication.

I understand Oracle’s dilemma. Usually people are hesitant to test a product that is not GA. So in order to get feedback from users, Oracle needs to push the product to GA. Oracle must absolutely solve the issues above during future QA cycles.

Categories: MySQL

Docker Images for Percona Server for MySQL Group Replication

MySQL Performance Blog - Wed, 2017-02-15 16:27

In this blog post, we’ll point to a new Docker image for Percona Server for MySQL Group Replication.

Our most recent release of Percona Server for MySQL (Percona Server for MySQL 5.7.17) comes with Group Replication plugins. Unfortunately, since this technology is very new, it requires some fairly complicated steps to set up and get running. To help with that process, I’ve prepared Docker images that simplify the setup procedure.

You can find the image here:

To start the first node (bootstrap the group):

docker run -d -p 3306 --net=clusternet -e MYSQL_ROOT_PASSWORD=passw0rd -e CLUSTER_NAME=cluster1 perconalab/pgr-57

To add nodes into the group after:

docker run -d -p 3306 --net=clusternet -e MYSQL_ROOT_PASSWORD=passw0rd -e CLUSTER_NAME=cluster1 -e CLUSTER_JOIN=CONTAINER_ID_FROM_THE_FIRST_STEP perconalab/pgr-57

You can also get a full script that starts “N” number of nodes, here:


Categories: MySQL

Percona Live Open Source Database Conference 2017 Crash Courses: MySQL and MongoDB!

MySQL Performance Blog - Tue, 2017-02-14 20:23

The Percona Live Open Source Database Conference 2017 will once again host crash courses on MySQL and MongoDB. Read below to get an outstanding discount on either the MySQL or MongoDB crash course (or both).

The database community constantly tells us how hard it is to find someone with MySQL and MongoDB DBA skills who can help with the day-to-day management of their databases. This is especially difficult when companies don’t have a full-time requirement for a DBA. Developers, system administrators and IT staff spend too much time trying to solve basic database problems that keep them from doing their other job duties. Eventually, the little problems or performance inefficiencies that start to pile up lead to big problems.

In answer to this growing need, Percona Live is once again hosting crash courses for developers, systems administrators and other technical resources. A crash course is a one-day training session on either MySQL 101 or MongoDB 101.

Don’t let the name fool you: these courses are led by Percona database experts who will show you the fundamentals of MySQL or MongoDB tools and techniques.

And it’s not just for DBAs: developers are encouraged to attend to hone their database skills.

Each course has its own full-day curriculum of topics this year: one for MySQL 101 and one for MongoDB 101.

Attendees will return ready to quickly and correctly take care of the day-to-day and week-to-week management of their MySQL or MongoDB environment.

The schedule and cost for the 101 courses (without a full-conference pass) are:

  • MySQL 101: Tuesday, April 25 ($400)
  • MongoDB 101: Wednesday, April 26 ($400)
  • Both MySQL and MongoDB 101 sessions ($700)

(Tickets to the 101 sessions do not grant access to the main Percona Live breakout sessions. Full Percona Live conference passes grant admission to the 101 sessions. 101 Crash Course attendees will have full access to Percona Live keynote speakers, the exhibit hall, and receptions.)

As a special promo, the first 101 people to purchase the single 101 talks receive a $299.00 discount off the ticket price! Each session only costs $101! Get both sessions for a mere $202 and save $498.00! Register now using the following codes for your discount:

  • 101: $299 off of either the MySQL or MongoDB tickets
  • 202: $498 off of the combined MySQL/MongoDB ticket
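The discount arithmetic from the offer above checks out:

```shell
# Sanity-checking the promo math with the prices quoted above
single=400   # regular price of one 101 session
both=700     # regular price of both sessions combined
echo "code 101: \$$((single - 299)) per session"   # $101
echo "code 202: \$$((both - 498)) for both"        # $202
```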

Click here to register.

Register for Percona Live 2017 now! Advanced Registration lasts until March 5, 2017. Percona Live is a very popular conference: this year’s Percona Live Europe sold out, and we’re looking to do the same for Percona Live 2017. Don’t miss your chance to get your ticket at its most affordable price. Click here to register.

Percona Live 2017 sponsorship opportunities are available now. Click here to find out how to sponsor.

Categories: MySQL

Percona Server for MongoDB 3.4 Product Bundle Release is a Valentine to the Open Source Community

MySQL Performance Blog - Tue, 2017-02-14 15:46

Percona today announced a Percona Server for MongoDB 3.4 solution bundle of updated products. This release enables any organization to create a robust, secure database environment that can be adapted to changing business requirements.

Percona Server for MongoDB 3.4, Percona Monitoring and Management 1.1, and Percona Toolkit 3.0 offer more features and benefits, with enhancements for both MongoDB® and MySQL® database environments. When these Percona products are used together, organizations gain all the cost and agility benefits provided by free, proven open source software that delivers all the latest MongoDB Community Edition 3.4 features, additional Enterprise features, and a greater choice of storage engines. Along with improved insight into the database environment, the solution provides enhanced control options for optimizing a wider range of database workloads with greater reliability and security.

The solution will be generally available the week of Feb. 20.

New Features and Benefits Summary

Percona Server for MongoDB 3.4

  • All the features of MongoDB Community Edition 3.4, which provides an open source, fully compatible, drop-in replacement
  • Integrated, pluggable authentication with LDAP to provide a centralized enterprise authentication service
  • Open-source auditing for visibility into user and process actions in the database, with the ability to redact sensitive information (such as usernames and IP addresses) from log files
  • Hot backups for the WiredTiger engine protect against data loss in the case of a crash or disaster, without impacting performance
  • Two storage engine options not supported by MongoDB Community Edition 3.4:

    • MongoRocks, the RocksDB-powered storage engine, designed for demanding, high-volume data workloads such as in IoT applications, on-premises or in the cloud
    • Percona Memory Engine is ideal for in-memory computing and other applications demanding very low latency workloads

Percona Monitoring and Management 1.1

  • Support for MongoDB and Percona Server for MongoDB
  • Graphical dashboard information for WiredTiger, MongoRocks and Percona Memory Engine

Percona Toolkit 3.0

  • Two new tools for MongoDB:
    • pt-mongodb-summary (the equivalent of pt-mysql-summary) provides a quick, at-a-glance overview of a MongoDB and Percona Server for MongoDB instance.
    • pt-mongodb-query-digest (the equivalent of pt-query-digest for MySQL) offers a query review for troubleshooting.

For more information, see Percona’s press release.

Categories: MySQL

ClickHouse: New Open Source Columnar Database

MySQL Performance Blog - Mon, 2017-02-13 23:24

For this blog post, I’ve decided to try ClickHouse: an open source column-oriented database management system developed by Yandex (it currently powers Yandex.Metrica, the world’s second-largest web analytics platform).

In my previous set of posts, I tested Apache Spark for big data analysis and used Wikipedia page statistics as a data source. I’ve used the same data as in the Apache Spark blog post: Wikipedia Page Counts. This allows me to compare ClickHouse’s performance to Spark’s.

I’ve spent some time testing ClickHouse with relatively large volumes of data (1.2TB uncompressed). Here is a list of ClickHouse advantages and disadvantages that I observed:

ClickHouse advantages

  • Parallel processing for single query (utilizing multiple cores)
  • Distributed processing on multiple servers
  • Very fast scans (see benchmarks below) that can be used for real-time queries
  • Column storage is great for working with “wide” / “denormalized” tables (many columns)
  • Good compression
  • SQL support (with limitations)
  • Good set of functions, including support for approximated calculations
  • Different storage engines (disk storage format)
  • Great for structural log/event data as well as time series data (the MergeTree engine requires a date field)
  • Index support (primary key only, not all storage engines)
  • Nice command line interface with user-friendly progress bar and formatting

Here is a full list of ClickHouse features

ClickHouse disadvantages

  • No real delete/update support, and no transactions (same as Spark and most of the big data systems)
  • No secondary keys (same as Spark and most of the big data systems)
  • Own protocol (no MySQL protocol support)
  • Limited SQL support, and the joins implementation is different. If you are migrating from MySQL or Spark, you will probably have to rewrite all queries with joins.
  • No window functions

Full list of ClickHouse limitations

Group by: in-memory vs. on-disk

Running out of memory is one of the potential problems you may encounter when working with large datasets in ClickHouse:

SELECT
    min(toMonth(date)),
    max(toMonth(date)),
    path,
    count(*),
    sum(hits),
    sum(hits) / count(*) AS hit_ratio
FROM wikistat
WHERE (project = 'en')
GROUP BY path
ORDER BY hit_ratio DESC
LIMIT 10

↖ Progress: 1.83 billion rows, 85.31 GB (68.80 million rows/s., 3.21 GB/s.) ██████████▋ 6%
Received exception from server:
Code: 241. DB::Exception: Received from localhost:9000, DB::Exception: Memory limit (for query) exceeded: would use 9.31 GiB (attempt to allocate chunk of 1048576 bytes), maximum: 9.31 GiB: (while reading column hits):

By default, ClickHouse limits the amount of memory for group by (it uses a hash table for group by). This is easily fixed – if you have free memory, increase this parameter:

SET max_memory_usage = 128000000000; #128G

If you don’t have that much memory available, ClickHouse can “spill” data to disk by setting this:

set max_bytes_before_external_group_by=20000000000; #20G
set max_memory_usage=40000000000; #40G

According to the documentation, if you need to use max_bytes_before_external_group_by, it is recommended to set max_memory_usage to ~2x the size of max_bytes_before_external_group_by.

(The reason for this is that the aggregation is performed in two phases: (1) reading the data and building intermediate results, and (2) merging the intermediate results. The spill to disk can only happen during the first phase. If no spill occurs, ClickHouse may need the same amount of RAM for phases 1 and 2.)
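A minimal sketch of deriving the two settings together, following the ~2x rule with the values from the example above:

```shell
# Derive max_memory_usage (~2x) from a chosen spill threshold,
# per the ClickHouse documentation's recommendation
spill=20000000000   # max_bytes_before_external_group_by (20G)
echo "SET max_bytes_before_external_group_by = ${spill};"
echo "SET max_memory_usage = $((spill * 2));"   # 40000000000 (40G)
```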

Benchmarks: ClickHouse vs. Spark

Both ClickHouse and Spark can be distributed. However, for the purpose of this test I’ve run a single node for both ClickHouse and Spark. The results are quite impressive.

Benchmark summary

 Size / compression           Spark v. 2.0.2                ClickHouse
 Data storage format          Parquet, compressed: snappy   Internal storage, compressed
 Size (uncompressed: 1.2TB)   395G                          212G


 Test                           Spark v. 2.0.2            ClickHouse   Diff
 Query 1: count (warm)          7.37 sec (no disk IO)     6.61 sec     ~same
 Query 2: simple group (warm)   792.55 sec (no disk IO)   37.45 sec    21x better
 Query 3: complex group by      2522.9 sec                398.55 sec   6.3x better


ClickHouse vs. MySQL

I wanted to see how ClickHouse compared to MySQL. Obviously, we can’t compare some workloads. For example:

  • Storing terabytes of data and querying (“crunching” would be a better word here) data without an index. It would take weeks (or even months) to load data and build the indexes. That is a much more suitable workload for ClickHouse or Spark.
  • Real-time updates / OLTP. ClickHouse does not support real-time updates / deletes.

Usually big data systems provide us with real-time queries. Systems based on map/reduce (i.e., Hive on top of HDFS) are just too slow for real-time queries, as it takes a long time to initialize the map/reduce job and send the code to all nodes.

Potentially, you can use ClickHouse for real-time queries. It does not support secondary indexes, however. This means it will probably scan lots of rows, but it can do it very quickly.

To do this test, I’m using the data from the Percona Monitoring and Management system. The table I’m using has 150 columns, so it is good for column storage. The size in MySQL is ~250G:

mysql> show table status like 'query_class_metrics'\G
*************************** 1. row ***************************
           Name: query_class_metrics
         Engine: InnoDB
        Version: 10
     Row_format: Compact
           Rows: 364184844
 Avg_row_length: 599
    Data_length: 218191888384
Max_data_length: 0
   Index_length: 18590056448
      Data_free: 6291456
 Auto_increment: 416994305

Scanning the whole table is significantly faster in ClickHouse. Retrieving just ten rows by key is faster in MySQL (especially from memory).

But what if we only need to scan a limited number of rows and do a GROUP BY? In this case, ClickHouse may be faster. Here is an example (a real query used to create sparklines):


SELECT
    (1480888800 - UNIX_TIMESTAMP(start_ts)) / 11520 AS point,
    FROM_UNIXTIME(1480888800 - (SELECT point) * 11520) AS ts,
    COALESCE(SUM(query_count), 0) / 11520 AS query_count_per_sec,
    COALESCE(SUM(Query_time_sum), 0) / 11520 AS query_time_sum_per_sec,
    COALESCE(SUM(Lock_time_sum), 0) / 11520 AS lock_time_sum_per_sec,
    COALESCE(SUM(Rows_sent_sum), 0) / 11520 AS rows_sent_sum_per_sec,
    COALESCE(SUM(Rows_examined_sum), 0) / 11520 AS rows_examined_sum_per_sec
FROM query_class_metrics
WHERE query_class_id = 7 AND instance_id = 1259
  AND (start_ts >= '2014-11-27 00:00:00' AND start_ts < '2014-12-05 00:00:00')
GROUP BY point;
...
61 rows in set (0.10 sec)

# Query_time: 0.101203  Lock_time: 0.000407  Rows_sent: 61  Rows_examined: 11639  Rows_affected: 0

explain SELECT ...
*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: query_class_metrics
   partitions: NULL
         type: range
possible_keys: agent_class_ts,agent_ts
          key: agent_class_ts
      key_len: 12
          ref: NULL
         rows: 21686
     filtered: 100.00
        Extra: Using index condition; Using temporary; Using filesort
*************************** 2. row ***************************
           id: 2
  select_type: DEPENDENT SUBQUERY
        table: NULL
   partitions: NULL
         type: NULL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: NULL
     filtered: NULL
        Extra: No tables used
2 rows in set, 2 warnings (0.00 sec)

It is relatively fast.

ClickHouse (some functions are different, so we will have to rewrite the query):

SELECT
    intDiv(1480888800 - toRelativeSecondNum(start_ts), 11520) AS point,
    toDateTime(1480888800 - (point * 11520)) AS ts,
    SUM(query_count) / 11520 AS query_count_per_sec,
    SUM(Query_time_sum) / 11520 AS query_time_sum_per_sec,
    SUM(Lock_time_sum) / 11520 AS lock_time_sum_per_sec,
    SUM(Rows_sent_sum) / 11520 AS rows_sent_sum_per_sec,
    SUM(Rows_examined_sum) / 11520 AS rows_examined_sum_per_sec,
    SUM(Rows_affected_sum) / 11520 AS rows_affected_sum_per_sec
FROM query_class_metrics
WHERE (query_class_id = 7) AND (instance_id = 1259)
  AND ((start_ts >= '2014-11-27 00:00:00') AND (start_ts < '2014-12-05 00:00:00'))
GROUP BY point;

61 rows in set. Elapsed: 0.017 sec. Processed 270.34 thousand rows, 14.06 MB (15.73 million rows/s., 817.98 MB/s.)

As we can see, even though ClickHouse scans more rows (270K vs. 11K – over 20x more), the ClickHouse query executes faster (0.10 seconds in MySQL compared to 0.017 seconds in ClickHouse). The column store format helps a lot here, as MySQL has to read all 150 columns (stored inside InnoDB pages) while ClickHouse only needs to read seven columns.
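A rough per-row comparison using the stats reported above (a sketch; the ClickHouse byte count is approximated from the query output, the MySQL figure is InnoDB's Avg_row_length):

```shell
# Back-of-the-envelope bytes-read-per-row:
# ClickHouse read ~14.06 MB for ~270.34K rows; InnoDB's avg_row_length is 599
echo "ClickHouse: ~$((14060000 / 270340)) bytes/row (7 columns)"   # ~52
echo "MySQL:      ~599 bytes/row (all 150 columns)"
```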

Wikipedia trending article of the month

Inspired by the article about finding trending topics using Google Books n-grams data, I decided to implement the same algorithm on top of the Wikipedia page visit statistics data. My goal here is to find the “article trending this month,” which has significantly more visits this month compared to the previous month. As I was implementing the algorithm, I came across another ClickHouse limitation: the join syntax is limited. In ClickHouse, you can only join with the “using” keyword, which means the fields you’re joining on need to have the same name. If a field name differs, we have to rename it in a subquery.

Below is an example.

First, create a temporary table to aggregate the visits per month per page:

CREATE TABLE wikistat_by_month ENGINE = Memory AS
SELECT
    path,
    mon,
    sum(hits) / total_hits AS ratio
FROM
(
    SELECT
        path,
        hits,
        toMonth(date) AS mon
    FROM wikistat
    WHERE (project = 'en')
      AND (lower(path) NOT LIKE '%special%')
      AND (lower(path) NOT LIKE '%page%')
      AND (lower(path) NOT LIKE '%test%')
      AND (lower(path) NOT LIKE '%wiki%')
      AND (lower(path) NOT LIKE '%index.html%')
) AS a
ANY INNER JOIN
(
    SELECT
        toMonth(date) AS mon,
        sum(hits) AS total_hits
    FROM wikistat
    WHERE (project = 'en')
      AND (lower(path) NOT LIKE '%special%')
      AND (lower(path) NOT LIKE '%page%')
      AND (lower(path) NOT LIKE '%test%')
      AND (lower(path) NOT LIKE '%wiki%')
      AND (lower(path) NOT LIKE '%index.html%')
    GROUP BY toMonth(date)
) AS b USING (mon)
GROUP BY path, mon, total_hits
ORDER BY ratio DESC

Ok.
0 rows in set. Elapsed: 543.607 sec. Processed 53.77 billion rows, 2.57 TB (98.91 million rows/s., 4.73 GB/s.)

Second, calculate the actual list:

SELECT
    path,
    mon + 1,
    a_ratio AS ratio,
    a_ratio / b_ratio AS increase
FROM
(
    SELECT path, mon, ratio AS a_ratio
    FROM wikistat_by_month
    WHERE ratio > 0.0001
) AS a
ALL INNER JOIN
(
    SELECT path, CAST((mon - 1) AS UInt8) AS mon, ratio AS b_ratio
    FROM wikistat_by_month
    WHERE ratio > 0.0001
) AS b USING (path, mon)
WHERE (mon > 0) AND (increase > 2)
ORDER BY mon ASC, increase DESC
LIMIT 100

┌─path───────────────────────────────────────────────┬─plus(mon, 1)─┬──────────────────ratio─┬───────────increase─┐
│ Heath_Ledger │ 2 │ 0.0008467223172121601 │ 6.853825241458039 │
│ Cloverfield │ 2 │ 0.0009372609760313347 │ 3.758937474560766 │
│ The_Dark_Knight_(film) │ 2 │ 0.0003508532447770276 │ 2.8858100355450484 │
│ Scientology │ 2 │ 0.0003300109101992719 │ 2.52497180013816 │
│ Barack_Obama │ 3 │ 0.0005786473399980557 │ 2.323409928527576 │
│ Canine_reproduction │ 3 │ 0.0004836300843539438 │ 2.0058985801174662 │
│ Iron_Man │ 6 │ 0.00036261003907049 │ 3.5301196568303888 │
│ Iron_Man_(film) │ 6 │ 0.00035634745198422497 │ 3.3815325090507193 │
│ Grand_Theft_Auto_IV │ 6 │ 0.0004036713142943461 │ 3.2112732008504885 │
│ Indiana_Jones_and_the_Kingdom_of_the_Crystal_Skull │ 6 │ 0.0002856570195547951 │ 2.683443198030021 │
│ Tha_Carter_III │ 7 │ 0.00033954377342889735 │ 2.820114216429247 │
│ EBay │ 7 │ 0.0006575000133427979 │ 2.5483158977946787 │
│ Bebo │ 7 │ 0.0003958340022793501 │ 2.3260912792668162 │
│ Facebook │ 7 │ 0.001683658379576915 │ 2.16460972864883 │
│ Yahoo!_Mail │ 7 │ 0.0002190640575012259 │ 2.1075879062784737 │
│ MySpace │ 7 │ 0.001395608643577507 │ 2.103263660621813 │
│ Gmail │ 7 │ 0.0005449834079575953 │ 2.0675919337716757 │
│ Hotmail │ 7 │ 0.0009126863121737026 │ 2.052471735190232 │
│ Google │ 7 │ 0.000601645849087389 │ 2.0155448612416644 │
│ Barack_Obama │ 7 │ 0.00027336526076130943 │ 2.0031305241832302 │
│ Facebook │ 8 │ 0.0007778115183044431 │ 2.543477658022576 │
│ MySpace │ 8 │ 0.000663544314346641 │ 2.534512981232934 │
│ Two-Face │ 8 │ 0.00026975137404447024 │ 2.4171743959768803 │
│ YouTube │ 8 │ 0.001482456447101451 │ 2.3884527929836152 │
│ Hotmail │ 8 │ 0.00044467667764940547 │ 2.2265750216262954 │
│ The_Dark_Knight_(film) │ 8 │ 0.0010482536106662156 │ 2.190078096294301 │
│ Google │ 8 │ 0.0002985028319919154 │ 2.0028812075734637 │
│ Joe_Biden │ 9 │ 0.00045067411455437264 │ 2.692262662620829 │
│ The_Dark_Knight_(film) │ 9 │ 0.00047863754833213585 │ 2.420864550676665 │
│ Sarah_Palin │ 10 │ 0.0012459220318907518 │ 2.607063205782761 │
│ Barack_Obama │ 12 │ 0.0034487235202817087 │ 15.615409029600414 │
│ George_W._Bush │ 12 │ 0.00042708730873936023 │ 3.6303098900144937 │
│ Fallout_3 │ 12 │ 0.0003568429236849597 │ 2.6193094036745155 │
└────────────────────────────────────────────────────┴──────────────┴────────────────────────┴────────────────────┘

34 rows in set. Elapsed: 1.062 sec. Processed 1.22 billion rows, 49.03 GB (1.14 billion rows/s., 46.16 GB/s.)

The response times are really good, considering the amount of data ClickHouse needed to scan (the first query scanned 2.57 TB of data).
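The month-over-month part of the algorithm can be sketched outside ClickHouse on toy data (file name and ratios are hypothetical): flag pages whose share of monthly hits more than doubled versus the previous month.

```shell
# Toy per-month hit ratios: path, month, ratio-of-total-hits (hypothetical)
cat > by_month_demo.txt <<'EOF'
Heath_Ledger 1 0.00012
Heath_Ledger 2 0.00084
Main_Page 1 0.00500
Main_Page 2 0.00510
EOF
# Report any path whose ratio more than doubled month-over-month
awk '{ r[$1 " " $2] = $3 }
END {
    for (k in r) {
        split(k, a, " ")
        prev = r[a[1] " " (a[2] - 1)]
        if (prev > 0 && r[k] / prev > 2)
            printf "%s %d %.1fx\n", a[1], a[2], r[k] / prev
    }
}' by_month_demo.txt
```

Here only Heath_Ledger qualifies (0.00084 / 0.00012 = 7.0x); Main_Page barely moved.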


The ClickHouse column-oriented database looks promising for data analytics, as well as for storing and processing structural event data and time series data. ClickHouse can be ~10x faster than Spark for some workloads.

Appendix: Benchmark details


  • CPU: 24xIntel(R) Xeon(R) CPU L5639 @ 2.13GHz (physical = 2, cores = 12, virtual = 24, hyperthreading = yes)
  • Disk: 2 consumer-grade SSDs in software RAID 0 (mdraid)

Query 1

select count(*) from wikistat


:) select count(*) from wikistat;

SELECT count(*)
FROM wikistat

┌─────count()─┐
│ 26935251789 │
└─────────────┘

1 rows in set. Elapsed: 6.610 sec. Processed 26.88 billion rows, 53.77 GB (4.07 billion rows/s., 8.13 GB/s.)


spark-sql> select count(*) from wikistat;
26935251789
Time taken: 7.369 seconds, Fetched 1 row(s)

Query 2

select count(*), month(dt) as mon from wikistat where year(dt)=2008 and month(dt) between 1 and 10 group by month(dt) order by month(dt);


:) select count(*), toMonth(date) as mon from wikistat where toYear(date)=2008 and toMonth(date) between 1 and 10 group by mon;

SELECT
    count(*),
    toMonth(date) AS mon
FROM wikistat
WHERE (toYear(date) = 2008) AND ((toMonth(date) >= 1) AND (toMonth(date) <= 10))
GROUP BY mon

┌────count()─┬─mon─┐
│ 2100162604 │   1 │
│ 1969757069 │   2 │
│ 2081371530 │   3 │
│ 2156878512 │   4 │
│ 2476890621 │   5 │
│ 2526662896 │   6 │
│ 2489723244 │   7 │
│ 2480356358 │   8 │
│ 2522746544 │   9 │
│ 2614372352 │  10 │
└────────────┴─────┘

10 rows in set. Elapsed: 37.450 sec. Processed 23.37 billion rows, 46.74 GB (623.97 million rows/s., 1.25 GB/s.)


spark-sql> select count(*), month(dt) as mon from wikistat where year(dt)=2008 and month(dt) between 1 and 10 group by month(dt) order by month(dt);
2100162604  1
1969757069  2
2081371530  3
2156878512  4
2476890621  5
2526662896  6
2489723244  7
2480356358  8
2522746544  9
2614372352  10
Time taken: 792.552 seconds, Fetched 10 row(s)

Query 3

SELECT path, count(*), sum(hits) AS sum_hits, round(sum(hits) / count(*), 2) AS hit_ratio FROM wikistat WHERE project = 'en' GROUP BY path ORDER BY sum_hits DESC LIMIT 100;


:) SELECT
:-]     path,
:-]     count(*),
:-]     sum(hits) AS sum_hits,
:-]     round(sum(hits) / count(*), 2) AS hit_ratio
:-] FROM wikistat
:-] WHERE (project = 'en')
:-] GROUP BY path
:-] ORDER BY sum_hits DESC
:-] LIMIT 100;

SELECT
    path,
    count(*),
    sum(hits) AS sum_hits,
    round(sum(hits) / count(*), 2) AS hit_ratio
FROM wikistat
WHERE project = 'en'
GROUP BY path
ORDER BY sum_hits DESC
LIMIT 100

┌─path───────────────┬─count()─┬───sum_hits─┬─hit_ratio─┐
│ Special:Search     │   44795 │ 4544605711 │ 101453.41 │
│ Main_Page          │   31930 │ 2115896977 │  66266.74 │
│ Special:Random     │   30159 │  533830534 │  17700.54 │
│ Wiki               │   10237 │   40488416 │   3955.11 │
│ Special:Watchlist  │   38206 │   37200069 │    973.67 │
│ YouTube            │    9960 │   34349804 │   3448.78 │
│ Special:Randompage │    8085 │   28959624 │    3581.9 │
│ Special:AutoLogin  │   34413 │   24436845 │    710.11 │
│ Facebook           │    7153 │   18263353 │   2553.24 │
│ Wikipedia          │   23732 │   17848385 │    752.08 │
│ Barack_Obama       │   13832 │   16965775 │   1226.56 │
│ index.html         │    6658 │   16921583 │   2541.54 │
…

100 rows in set. Elapsed: 398.550 sec. Processed 26.88 billion rows, 1.24 TB (67.45 million rows/s., 3.10 GB/s.)


spark-sql> SELECT
         >     path,
         >     count(*),
         >     sum(hits) AS sum_hits,
         >     round(sum(hits) / count(*), 2) AS hit_ratio
         > FROM wikistat
         > WHERE (project = 'en')
         > GROUP BY path
         > ORDER BY sum_hits DESC
         > LIMIT 100;
...
Time taken: 2522.903 seconds, Fetched 100 row(s)


Categories: MySQL

Percona Blog Poll: What Database Engine Are You Using to Store Time Series Data?

MySQL Performance Blog - Fri, 2017-02-10 15:19

Take Percona’s blog poll on what database engine you are using to store time series data.

Time series data is some of the most actionable data available when it comes to analyzing trends and making predictions. Simply put, time series data is data that is indexed not just by value, but by time as well – allowing you to view value changes over time as they occur. Obvious uses include the stock market, web traffic, user behavior, etc.

With the increasing number of smart devices in the Internet of Things (IoT), being able to track data over time is more and more important. With time series data, you can measure and make predictions on things like energy consumption, pH values, water consumption, data from environment-aware machines like smart cars, etc. The sensors used in IoT devices and systems generate huge amounts of time-series data.

How is all of this data collected, segmented and stored? We’d like to hear from you: what database engine are you using to store time series data? Please take a few seconds to answer the poll on our site, and help the community learn which database engines help solve critical database issues. Select from one to three database engines as they apply to your environment, and feel free to add a comment if your engine isn’t listed.

Categories: MySQL

Using NVMe Command Line Tools to Check NVMe Flash Health

MySQL Performance Blog - Thu, 2017-02-09 19:50

In this blog post, I’ll look at the types of NVMe flash health information you can get from using the NVMe command line tools.

Checking SATA-based drive health is easy. Whether it’s an SSD or older spinning drive, you can use the smartctl command to get a wealth of information about the device’s performance and health. As an example:

root@blinky:/var/lib/mysql# smartctl -A /dev/sda
smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.4.0-62-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   100   000    Pre-fail  Always       -       0
  5 Reallocated_Sector_Ct   0x0032   100   100   010    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       41
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       2
171 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
172 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
173 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       1
174 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   000    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0022   065   059   000    Old_age   Always       -       35 (Min/Max 21/41)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   100   100   000    Old_age   Always       -       0
202 Unknown_SSD_Attribute   0x0030   100   100   001    Old_age   Offline      -       0
206 Unknown_SSD_Attribute   0x000e   100   100   000    Old_age   Always       -       0
246 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       145599393
247 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       4550280
248 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       582524
180 Unused_Rsvd_Blk_Cnt_Tot 0x0033   000   000   000    Pre-fail  Always       -       1260
210 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0

While smartctl might not know all vendor-specific smart values, typically you can Google the drive model along with “smart attributes” and find documents like this to get more details.

If you move to newer-generation NVMe-based flash storage, smartctl won’t work anymore – at least it doesn’t work with the packages available for Ubuntu 16.04 (what I’m running). It looks like support for NVMe in Smartmontools is coming, and it would be great to have a single tool that supports both SATA and NVMe flash storage.

In the meantime, you can use the nvme tool available from the nvme-cli package. It provides some basic information for NVMe devices.

To get information about the NVMe devices installed:

root@alex:~# nvme list
Node             SN                   Model                                    Version  Namespace Usage                      Format           FW Rev
---------------- -------------------- ---------------------------------------- -------- --------- -------------------------- ---------------- --------
/dev/nvme0n1     S3EVNCAHB01861F      Samsung SSD 960 PRO 1TB                  1.2      1         689.63  GB /   1.02  TB    512   B +  0 B   1B6QCXP7

To get SMART information:

root@alex:~# nvme smart-log /dev/nvme0
Smart Log for NVME device:nvme0 namespace-id:ffffffff
critical_warning                    : 0
temperature                         : 34 C
available_spare                     : 100%
available_spare_threshold           : 10%
percentage_used                     : 0%
data_units_read                     : 3,465,389
data_units_written                  : 9,014,689
host_read_commands                  : 89,719,366
host_write_commands                 : 134,671,295
controller_busy_time                : 310
power_cycles                        : 11
power_on_hours                      : 21
unsafe_shutdowns                    : 8
media_errors                        : 0
num_err_log_entries                 : 1
Warning Temperature Time            : 0
Critical Composite Temperature Time : 0
Temperature Sensor 1                : 34 C
Temperature Sensor 2                : 47 C
Temperature Sensor 3                : 0 C
Temperature Sensor 4                : 0 C
Temperature Sensor 5                : 0 C
Temperature Sensor 6                : 0 C

To get additional SMART information (not all devices support it):

root@ts140i:/home/pz/workloads/1m# nvme smart-log-add /dev/nvme0
Additional Smart Log for NVME device:nvme0 namespace-id:ffffffff
key                               normalized raw
program_fail_count              : 100%       0
erase_fail_count                : 100%       0
wear_leveling                   :  62%       min: 1114, max: 1161, avg: 1134
end_to_end_error_detection_count: 100%       0
crc_error_count                 : 100%       0
timed_workload_media_wear       : 100%       37.941%
timed_workload_host_reads       : 100%       51%
timed_workload_timer            : 100%       446008 min
thermal_throttle_status         : 100%       0%, cnt: 0
retry_buffer_overflow_count     : 100%       0
pll_lock_loss_count             : 100%       0
nand_bytes_written              : 100%       sectors: 16185227
host_bytes_written              : 100%       sectors: 6405605

Some of this information is self-explanatory, and some of it isn’t. After looking at the NVMe specification document, here is my read on some of the data:

Available Spare. Contains a normalized percentage (0 to 100%) of the remaining spare capacity that is available.

Available Spare Threshold. When the Available Spare capacity falls below the threshold indicated in this field, an asynchronous event completion can occur. The value is indicated as a normalized percentage (0 to 100%).

(Note: I’m not quite sure what the practical meaning of “asynchronous event completion” is, but it looks like something to avoid!)

Percentage Used. Contains a vendor specific estimate of the percentage of the NVM subsystem life used, based on actual usage and the manufacturer’s prediction of NVM life.

(Note: the number can be more than 100% if you’re using storage for longer than its planned life.)

Data Units Read/Data Units Written. The number of 512-byte data units read or written, reported in an unusual unit: each reported unit corresponds to 1,000 of the 512-byte sectors, so you can multiply the value by 512,000 to get the value in bytes. It does not include metadata accesses.

Host Read/Write Commands. The number of commands of the appropriate type issued. Combined with the Data Units values above, this lets you compute the average I/O size for “physical” reads and writes.
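For instance, using the numbers from the smart-log output above, a quick back-of-the-envelope sketch (the 512,000-bytes-per-unit convention is as described above, and may differ on other devices):

```python
# Values copied from the "nvme smart-log" output shown earlier.
data_units_read = 3_465_389      # each unit = 1,000 x 512-byte sectors
host_read_commands = 89_719_366

# Total bytes read: one data unit is 1000 * 512 = 512,000 bytes.
bytes_read = data_units_read * 512_000
print(f"Total read: {bytes_read / 1e12:.2f} TB")       # about 1.77 TB

# Average "physical" read size per host read command.
avg_read_size = bytes_read / host_read_commands
print(f"Average read size: {avg_read_size / 1024:.1f} KiB")
```

For this device the average physical read works out to roughly 19 KiB, which is the kind of number that helps you reason about whether your workload is doing small random or large sequential I/O.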

Controller Busy Time. Time in minutes that the controller was busy servicing commands. This can be used to gauge long-term storage load trends.

Unsafe Shutdowns. The number of times a power loss happened without a shutdown notification being sent. Depending on the NVMe device you’re using, an unsafe shutdown might corrupt user data.

Warning Temperature Time/Critical Temperature Time. The time in minutes a device operated above the warning or critical temperature. Both should be zero.

Wear_Leveling. This shows how much of the rated cell life was used, as well as the min/max/avg write counts for different cells. In this case, it looks like the cells are rated for 1800 writes, and about 1100 were used on average.

Timed Workload Media Wear. The media wear by the current “workload.” This device allows you to measure some statistics from the time you reset them (called the “workload”) in addition to showing the device lifetime values.

Timed Workload Host Reads. The percentage of IO operations that were reads (since the workload timer was reset).

Thermal Throttle Status. This shows if the device is throttled due to overheating, and when there were throttling events in the past.

Nand Bytes Written. The bytes written to NAND cells. For this device, the measured unit seems to be in 32MB values. It might be different for other devices.

Host Bytes Written. The bytes written to the NVMe storage from the system, also measured in 32MB units. The absolute scale of these values is not very important; they are most helpful for finding the write amplification of your workload, which is the ratio of NAND writes to host writes. For this example, the Write Amplification Factor (WAF) is 16185227 / 6405605 = 2.53.
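As a sanity check, the WAF arithmetic above can be reproduced directly (the sector counts are the ones reported by smart-log-add; since both counters use the same 32MB unit, the unit cancels out of the ratio):

```python
# NAND and host write counters from the "nvme smart-log-add" output above.
nand_sectors_written = 16_185_227   # writes that actually hit the NAND
host_sectors_written = 6_405_605    # writes issued by the host

# Write amplification: NAND writes divided by host writes.
waf = nand_sectors_written / host_sectors_written
print(f"Write Amplification Factor: {waf:.2f}")   # prints 2.53
```

A WAF of 2.53 means the device internally writes about two and a half times as much data to flash as the host asks it to, which directly eats into the rated endurance.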

As you can see, the NVMe command line tools provide a lot of good information for understanding the health and performance of NVMe devices. You don’t need to use vendor specific tools (like isdct).

Categories: MySQL

Percona Server 5.6.35-80.0 is Now Available

MySQL Performance Blog - Wed, 2017-02-08 18:09

Percona announces the release of Percona Server 5.6.35-80.0 on February 8, 2017. Download the latest version from the Percona web site or the Percona Software Repositories.

Based on MySQL 5.6.35, and including all the bug fixes in it, Percona Server 5.6.35-80.0 is the current GA release in the Percona Server 5.6 series. Percona Server is open-source and free – this is the latest release of our enhanced, drop-in replacement for MySQL. Complete details of this release are available in the 5.6.35-80.0 milestone on Launchpad.

New Features:
  • Kill Idle Transactions feature has been re-implemented by setting a connection socket read timeout value instead of periodically scanning the internal InnoDB transaction list. This makes the feature applicable to any transactional storage engine, such as TokuDB and, in the future, MyRocks. This re-implementation also addresses some existing bugs, including server crashes: #1166744, #1179136, #907719, and #1369373.
Bugs Fixed:
  • Logical row counts for TokuDB tables could get inaccurate over time. Bug fixed #1651844 (#732).
  • Repeated execution of SET STATEMENT ... FOR SELECT FROM view could lead to a server crash. Bug fixed #1392375.
  • CREATE TEMPORARY TABLE would create a transaction in binary log on a read-only server. Bug fixed #1539504 (upstream #83003).
  • If temporary tables from CREATE TABLE ... AS SELECT contained compressed attributes it could lead to a server crash. Bug fixed #1633957.
  • Using the per-query variable statement with subquery temporary tables could cause a memory leak. Bug fixed #1635927.
  • Fixed new compilation warnings with GCC 6. Bugs fixed #1641612 and #1644183.
  • A server could crash if a bitmap write I/O error happens in the background log tracking thread while a FLUSH CHANGED_PAGE_BITMAPS is executing concurrently. Bug fixed #1651656.
  • TokuDB was using the wrong function to calculate free space in data files. Bug fixed #1656022 (#1033).
  • CONCURRENT_CONNECTIONS column in the USER_STATISTICS table was showing incorrect values. Bug fixed #728082.
  • InnoDB index dives did not detect some of the concurrent tree changes, which could return bogus estimates. Bug fixed #1625151 (upstream #84366).
  • INFORMATION_SCHEMA.INNODB_CHANGED_PAGES queries would needlessly read potentially incomplete bitmap data past the needed LSN range. Bug fixed #1625466.
  • Percona Server cmake compiler would always attempt to build RocksDB even if -DWITHOUT_ROCKSDB=1 argument was specified. Bug fixed #1638455.
  • Adding COMPRESSED attributes to InnoDB special tables fields (like mysql.innodb_index_stats and mysql.innodb_table_stats) could lead to server crashes. Bug fixed #1640810.
  • Lack of free pages in the buffer pool is not diagnosed with innodb_empty_free_list_algorithm set to backoff (which is the default). Bug fixed #1657026.
  • mysqld_safe now limits the use of rm and chown to avoid privilege escalation. chown can now be used only for the /var/log directory. Bug fixed #1660265. Thanks to Dawid Golunski.
  • Renaming a TokuDB table to a non-existent database with tokudb_dir_per_db enabled would lead to a server crash. Bug fixed #1030.
  • Read Free Replication optimization could not be used for TokuDB partition tables. Bug fixed #1012.

Other bugs fixed: #1486747 (upstream #76872), #1633988, #1638198 (upstream #82823), #1638897, #1646384, #1647530, #1647741, #1651121, #1156772, #1644569, #1644583, #1648389, #1648737, #1650247, #1650256, #1650324, #1650450, #1655587, and #1647723.

Release notes for Percona Server 5.6.35-80.0 are available in the online documentation. Please report any bugs on the launchpad bug tracker.

Categories: MySQL

MySQL super_read_only Bugs

MySQL Performance Blog - Wed, 2017-02-08 15:23

In this blog post, we describe an issue with MySQL 5.7’s super_read_only feature when used alongside GTID in chained slave instances.


MySQL 5.7.5 and onward introduced the gtid_executed table in the mysql database to store every GTID. This allows slave instances to use the GTID feature regardless of whether the binlog option is set. Here is an example of the rows in the gtid_executed table:

mysql> SELECT * FROM mysql.gtid_executed;
+--------------------------------------+----------------+--------------+
| source_uuid                          | interval_start | interval_end |
+--------------------------------------+----------------+--------------+
| 00005730-1111-1111-1111-111111111111 |              1 |            1 |
| 00005730-1111-1111-1111-111111111111 |              2 |            2 |
| 00005730-1111-1111-1111-111111111111 |              3 |            3 |
| 00005730-1111-1111-1111-111111111111 |              4 |            4 |
| 00005730-1111-1111-1111-111111111111 |              5 |            5 |
| 00005730-1111-1111-1111-111111111111 |              6 |            6 |
| 00005730-1111-1111-1111-111111111111 |              7 |            7 |
| 00005730-1111-1111-1111-111111111111 |              8 |            8 |
| 00005730-1111-1111-1111-111111111111 |              9 |            9 |
| 00005730-1111-1111-1111-111111111111 |             10 |           10 |
...

To save space, this table needs to be compressed periodically by replacing individual GTID rows with a single row that represents the whole interval of identifiers. For example, the GTIDs above can be represented by the following row:

mysql> SELECT * FROM mysql.gtid_executed;
+--------------------------------------+----------------+--------------+
| source_uuid                          | interval_start | interval_end |
+--------------------------------------+----------------+--------------+
| 00005730-1111-1111-1111-111111111111 |              1 |           10 |
...
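Conceptually, the compression step just collapses contiguous (interval_start, interval_end) rows into one covering row. A minimal Python sketch of that idea (an illustration, not MySQL's actual implementation):

```python
def compress_gtid_rows(rows):
    """Collapse contiguous (interval_start, interval_end) rows into the
    smallest set of covering intervals, which is conceptually what the
    gtid_executed compression thread does."""
    compressed = []
    for start, end in sorted(rows):
        # Extend the previous interval if this one is contiguous with it.
        if compressed and start <= compressed[-1][1] + 1:
            compressed[-1] = (compressed[-1][0], max(compressed[-1][1], end))
        else:
            compressed.append((start, end))
    return compressed

# The ten single-GTID rows from the example above collapse into one row.
rows = [(n, n) for n in range(1, 11)]
print(compress_gtid_rows(rows))  # [(1, 10)]
```

A gap in the sequence (say, GTID 4 missing) would leave two rows, which is why the table only shrinks once the intervals are actually contiguous.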

On the other hand, we have the super_read_only feature. If this option is set to ON, MySQL won’t allow any updates – even from users that have SUPER privileges. It was first implemented in WebScaleSQL and later ported to Percona Server 5.6. Mainstream MySQL implemented a similar feature in version 5.7.8.

The Issue [1]

MySQL’s super_read_only feature won’t allow the compression of the mysql.gtid_executed table. If a high number of transactions run on the master instance, it causes the gtid_executed table to grow to a considerable size. Let’s see an example.

I’m going to use MySQL Sandbox to quickly set up a master/slave configuration, and sysbench to simulate a high number of transactions on the master instance.

First, set up replication using GTID:

make_replication_sandbox --sandbox_base_port=5730 /opt/mysql/5.7.17 --how_many_nodes=1 --gtid

Next, set up the variables for a chained slave instance:

echo "super_read_only=ON" >> node1/my.sandbox.cnf
echo "log_slave_updates=ON" >> node1/my.sandbox.cnf
node1/restart

Now, generate a high number of transactions:

sysbench --test=oltp.lua --mysql-socket=/tmp/mysql_sandbox5730.sock \
  --report-interval=1 --oltp-tables-count=100000 --oltp-table-size=100 \
  --max-time=1800 --oltp-read-only=off --max-requests=0 --num-threads=8 \
  --rand-type=uniform --db-driver=mysql --mysql-user=msandbox \
  --mysql-password=msandbox --mysql-db=test prepare

After running sysbench for a while, we can see the number of rows in the gtid_executed table growing quickly:

slave1 [localhost] {msandbox} ((none)) > select count(*) from mysql.gtid_executed ;
+----------+
| count(*) |
+----------+
|   300038 |
+----------+
1 row in set (0.00 sec)

By reviewing SHOW ENGINE INNODB STATUS, we can find a compression thread running and trying to compress the gtid_executed table.

---TRANSACTION 4192571, ACTIVE 0 sec fetching rows
mysql tables in use 1, locked 1
9 lock struct(s), heap size 1136, 1533 row lock(s), undo log entries 1525
MySQL thread id 4, OS thread handle 139671027824384, query id 0 Compressing gtid_executed table

This thread runs and takes ages to complete (or may never complete). It has been reported as #84332.

The Issue [2]

What happens if you have to stop MySQL while the thread compressing the gtid_executed table is running? In this special case, if you run the flush-logs command before or at the same time as mysqladmin shutdown, MySQL will actually stop accepting connections (all new connections hang waiting for the server) and will start to wait for the thread compressing the gtid_executed table to complete its work. Below is an example.

First, execute the flush logs command and obtain ERROR 1290:

$ mysql -h -P 5731 -u msandbox -pmsandbox -e "flush logs ;"
ERROR 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement

We try to shut down the instance, but it hangs:

$ mysqladmin -h -P 5731 -u msandbox -pmsandbox shutdown
^CWarning; Aborted waiting on pid file: '' after 175 seconds

This bug has been reported and verified as #84597.

The Workaround

If you already have an established connection to your database with SUPER privileges, you can disable the super_read_only feature dynamically. Once that is done, the pending thread compressing the gtid_executed table completes its work and the shutdown finishes successfully. Below is an example.

We check rows in the gtid_executed table:

$ mysql -h -P 5731 -u msandbox -pmsandbox -e "select count(*) from mysql.gtid_executed ;"
+----------+
| count(*) |
+----------+
|   300038 |
+----------+

We disable the super_read_only feature on an already established connection:

mysql> set global super_read_only=OFF ;

We check the rows in the gtid_executed table again, verifying that the compress thread ran successfully.

$ mysql -h -P 5731 -u msandbox -pmsandbox -e "select count(*) from mysql.gtid_executed ;"
+----------+
| count(*) |
+----------+
|        1 |
+----------+

Now we can shutdown the instance without issues:

$ mysqladmin -h -P 5731 -u msandbox -pmsandbox shutdown

You can disable the super_read_only feature before you shut down the instance to compress the gtid_executed table. If you run into the bug above and don’t have any established connections to your database, the only way to shut down the server is by issuing kill -9 on the mysqld process.


As shown in this blog post, some of the mechanics of MySQL 5.7’s super_read_only command are not working as expected. This can prevent some administrative operations, like shutdown, from happening.

If you are using the super_read_only feature on MySQL 5.7.17 or older, including Percona Server 5.7.16 or older (which ports the mainstream implementation – unlike Percona Server 5.6, which ported Webscale’s super_read_only implementation) don’t use FLUSH LOGS.

Categories: MySQL

Percona Monitoring and Management 1.1.0 Beta is Now Available

MySQL Performance Blog - Tue, 2017-02-07 22:31

Percona announces the release of Percona Monitoring and Management 1.1.0 Beta on February 7, 2017. This is the first beta in the PMM 1.1 series with a focus on providing alternative deployment options for PMM Server:

The instructions for installing Percona Monitoring and Management 1.1.0 Beta are available in the documentation. Detailed release notes are available here.

New in PMM Server:

  • Grafana 4.1.1
  • Prometheus 1.5.0
  • Consul 0.7.3
  • Updated the MongoDB ReplSet dashboard to show the storage engine used by the instance
  • PMM-551: Fixed QAN changing query format when a time-based filter was applied to the digest

New in PMM Client:

  • PMM-530: Fixed pmm-admin to support special characters in passwords
  • Added displaying of original error message in pmm-admin config output

Known Issues:

  • Several of the MongoDB RocksDB metrics do not display correctly. This issue will be resolved in the production release.

A live demo of PMM is available at

We welcome your feedback and questions on our PMM forum.

About Percona Monitoring and Management
Percona Monitoring and Management is an open-source platform for managing and monitoring MySQL and MongoDB performance. Percona developed it in collaboration with experts in the field of managed database services, support and consulting.

PMM is a free and open-source solution that you can run in your own environment for maximum security and reliability. It provides thorough time-based analysis for MySQL and MongoDB servers to ensure that your data works as efficiently as possible.

Categories: MySQL

Overview of Different MySQL Replication Solutions

MySQL Performance Blog - Tue, 2017-02-07 15:04

In this blog post, I will review some of the MySQL replication concepts that are part of the MySQL environment (and Percona Server for MySQL specifically). I will also try to clarify some of the misconceptions people have about replication.

Since I’ve been working on the Solution Engineering team, I’ve noticed that – although information is plentiful – replication is often misunderstood or incompletely understood.

So What is Replication?

Replication guarantees that information gets copied and deliberately populated into another environment, based on the transactions of the source environment, instead of being stored in only one location.

The idea is to use secondary servers on your infrastructure for either reads or other administrative solutions. The below diagram shows an example of a MySQL replication environment.


Fine, But What Choices Do I Have in MySQL?

You actually have several different choices:

Standard asynchronous replication

Asynchronous replication means that the transaction is completed entirely on the local environment, and is not influenced by the replication slaves themselves.

After completing its changes, the master writes the data modification or the actual statement to the binary log (the difference between row-based and statement-based replication – more on this later). The master’s dump thread reads the binary log and sends it to the slave’s IO thread, which places it in the slave’s own preprocessing queue (called the relay log).

The slave executes each change on the slave’s database using the SQL thread.


Semi-synchronous replication

Semi-synchronous replication means that the slave and the master communicate with each other to guarantee the correct transfer of the transaction. The master only populates the binlog and continues its session if at least one of the slaves confirms that the transaction was properly placed in its relay log.

Semi-synchronous replication guarantees that a transaction is correctly copied, but it does not guarantee that the commit on the slave actually takes place.

Important to note is that semi-sync replication makes sure that the master waits to continue processing transactions in a specific session until at least one of the slaves has ACKed the reception of the transaction (or reaches a timeout). This differs from asynchronous replication, as semi-sync allows for additional data integrity.

Keep in mind that semi-synchronous replication impacts performance because it needs to wait for the round trip of the actual ACK from the slave.

Group Replication

This is a new concept introduced in MySQL Community Edition 5.7, and became GA in MySQL 5.7.17. It’s a rather new plugin built for virtually synchronous replication.

Whenever a transaction is executed on a node, the plugin tries to get consensus with the other nodes before returning it completed back to the client. Although the solution is a completely different concept compared to standard MySQL replication, it is based on the generation and handling of log events using the binlog.

Below is an example architecture for Group Replication.

If Group Replication interests you, read the following blog posts:

There will be a tutorial at the Percona Live Open Source Database Conference in Santa Clara in April, 2017.

Percona XtraDB Cluster / Galera Cluster

Another solution that allows you to replicate information to other nodes is Percona XtraDB Cluster. This solution focuses on delivering consistency, and also uses a certification process to guarantee that transactions avoid conflicts and are performed correctly.

In this case, we are talking about a clustered solution. Each environment is subject to the same data, and there is communication in-between nodes to guarantee consistency.

Percona XtraDB Cluster has multiple components:

  • Percona Server for MySQL
  • Percona XtraBackup for performing snapshots of the running cluster (if recovering or adding a node).
  • wsrep patches / Galera Library

This solution is virtually synchronous, which is comparable to Group Replication. However, it also has the capability to use multi-master replication. Solutions like Percona XtraDB Cluster are components that improve the availability of your database infrastructure.

A tutorial on Percona XtraDB Cluster will be given at the Percona Live Open Source Database Conference in Santa Clara in April 2017.

Row-Based Replication Vs. Statement-Based Replication

With statement-based replication, the SQL query itself is written to the binary log. For example, the exact same INSERT/UPDATE/DELETE statements are executed by the slave.

There are many advantages and disadvantages to this system:

  • Auditing the database is much easier as the actual statements are logged in the binary log
  • Less data is transferred over the wire
  • Non-deterministic queries can create actual havoc in the slave environment
  • There might be a performance disadvantage with some queries using statement-based replication (e.g., an INSERT based on a SELECT)
  • Statement-based replication is slower due to SQL optimizing and execution

Row-based replication is the default choice since MySQL 5.7.7, and it has many advantages. The row changes are logged in the binary log, and it does not require context information. This removes the impact of non-deterministic queries.

Some additional advantages are:

  • Performance improvements with high concurrency queries containing few row changes
  • Significant data-consistency improvement

And, of course, some disadvantages:

  • Network traffic can be significantly larger if you have queries that modify a large number of rows
  • It’s more difficult to audit the changes on the database
  • Row-based replication can be slower than statement-based replication in some cases
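The practical difference between the two formats can be illustrated with a toy simulation (hypothetical Python, not MySQL code): statement-based replication re-executes the statement on the slave, so a non-deterministic function such as RAND() can produce a different value there, while row-based replication ships the row values the master actually produced:

```python
import random

def apply_statement_based(slave_rows, statement_fn):
    # Statement-based: the slave re-executes the statement, which
    # re-evaluates any non-deterministic function (think RAND(), UUID()).
    slave_rows.append(statement_fn())

def apply_row_based(slave_rows, row_image):
    # Row-based: the slave applies the row image from the master verbatim.
    slave_rows.append(row_image)

master, stmt_slave, row_slave = [], [], []

# Master runs the equivalent of: INSERT INTO t VALUES (RAND())
value = random.random()
master.append(value)

apply_statement_based(stmt_slave, random.random)  # re-evaluated: may diverge
apply_row_based(row_slave, value)                 # identical by construction

assert row_slave == master
```

This is why MySQL logs non-deterministic statements in row format even when statement-based logging is configured (mixed mode), and why row-based is the safer default.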
Some Misconceptions About Replication

Replication is a cluster.

Standard asynchronous replication is not a synchronous cluster. Keep in mind that standard and semi-synchronous replication do not guarantee that the environments are serving the same dataset. This is different when using Percona XtraDB Cluster, where every server actually needs to process each change. If not, the impacted node is removed from the cluster. Asynchronous replication does not have this fail safe. It still accepts reads while in an inconsistent state.

Replication sounds perfect, I can use this as a manual failover solution.

Theoretically, the environments should be comparable. However, there are many parameters influencing the efficiency and consistency of the data transfer. As long as you use asynchronous replication, there is no guarantee that the transaction correctly took place. You can circumvent this by enhancing the durability of the configuration, but this comes at a performance cost. You can verify the consistency of your master and slaves using the pt-table-checksum tool.

I have replication, so I actually don’t need backups.

Replication is a great solution for having an accessible copy of the dataset (e.g., reporting issues, read queries, generating backups). This is not a backup solution, however. Having an offsite backup provides you with the certainty that you can rebuild your environment in the case of any major disasters, user error or other reasons (remember the Bobby Tables comic). Some people use delayed slaves. However, even delayed slaves are not a replacement for proper disaster recovery procedures.

I have replication, so the environment will now load balance the transactions.

Although you’ve potentially improved the availability of your environment by having a secondary instance running with the same dataset, you still might need to point the read queries towards the slaves and the write queries to the master. You can use proxy tools, or define this functionality in your own application.

Replication will slow down my master significantly.

Replication has only minor performance impacts on your master. Peter Zaitsev has an interesting post on this here, which discusses the potential impact of slaves on the master. Keep in mind that writing to the binary log can potentially impact performance, especially if you have a lot of small transactions that are then dumped and received by multiple slaves.

There are, of course, many other parameters that might impact the performance of the actual master and slave setup.

Categories: MySQL