MySQL

Percona Live MySQL Conference Call for Proposals closes Nov. 16

MySQL Performance Blog - Tue, 2014-11-11 17:40

The Percona Live MySQL Conference in Silicon Valley (April 13-16, 2015) features a full day of tutorial sessions and three full days with keynote talks and breakout sessions from leading MySQL experts from around the world. Birds of a Feather sessions, two networking receptions, and a community dinner all offer great opportunities to network with the MySQL community. To make the Percona Live MySQL Conference 2015 even better than the great 2014 event, we need outstanding content from community members.

Submit your speaking proposals now for tutorials or breakout sessions on the Percona Live conference website. Accepted speakers receive a full conference pass and the sincere gratitude of attendees who will eagerly await your presentations. The Percona Live MySQL Conference call for proposals closes November 16.

We are currently seeking proposals for both breakout and tutorial sessions on the following topics:

  • High Availability
  • DevOps
  • Programming
  • Performance Optimization
  • Replication & Backup
  • MySQL in the Cloud
  • MySQL and NoSQL
  • MySQL Case Studies
  • Security
  • What’s New in MySQL

All submissions will be reviewed by the outstanding Conference Committee of community members:

  • Shlomi Noach from Outbrain
  • Tamar Bercovici from Box
  • Colin Charles from MariaDB
  • Sean Chighizola from Big Fish Games
  • Jeremy Cole from Google
  • Harrison Fisk from Facebook
  • Patrick Galbraith from HP
  • Jay Janssen from Percona
  • Chris Schneider from Ning.com
  • John Scott from Wellcentive
  • Gwen Shapira from Cloudera
  • Shivinder Singh from Verizon
  • Calvin Sun from Twitter
  • Morgan Tocker from Oracle
  • Peter Zaitsev from Percona

Submit your proposals now by visiting the Percona Live MySQL Conference 2015 website. The submission deadline is November 16.

If you plan to attend the conference but won’t be submitting, register now to get the lowest available rate. Visit the Percona Live MySQL Conference 2015 website for full details.

The post Percona Live MySQL Conference Call for Proposals closes Nov. 16 appeared first on MySQL Performance Blog.

Categories: MySQL

OpenStack Live Call for Proposals closes November 16

MySQL Performance Blog - Tue, 2014-11-11 15:20

The OpenStack Live conference in Silicon Valley (April 13-14, 2015) will emphasize the essential elements of making OpenStack perform better, with a focus on the critical role of MySQL and Trove. If you use OpenStack and have a story to share or a skill to teach, we encourage you to submit a speaking proposal for a breakout or tutorial session. The OpenStack Live call for proposals is your chance to put your ideas, case studies, best practices and technical knowledge in front of an intelligent, engaged audience of OpenStack users. If you are selected as a speaker, you will receive one complimentary full conference pass. November 16 is the last day to submit.

We are seeking submissions for both breakout and tutorial sessions on the following topics:

  • Performance Optimization of OpenStack
  • OpenStack Operations
  • OpenStack Trove
  • Replication and Backup for OpenStack
  • High Availability for OpenStack
  • OpenStack User Stories
  • Monitoring and Tools for OpenStack

All submissions will be reviewed by our highly qualified Conference Committee:

  • Mark Atwood from HP
  • Rich Bowen from Red Hat
  • Andrew Mitty from Comcast
  • Jason Rouault from Time Warner
  • Peter Boros from Percona

If you don’t plan to submit a speaking proposal, now is a great time to purchase your ticket at the low Super Saver rates. Visit the OpenStack Live 2015 conference website for full details.

The post OpenStack Live Call for Proposals closes November 16 appeared first on MySQL Performance Blog.

Categories: MySQL

Impressions from MongoDB Day London 2014

MySQL Performance Blog - Mon, 2014-11-10 15:36

I visited MongoDB Day in London on November 6. Here are a few observations:

App-Developer Centric. It is interesting to see how much MongoDB is about developers; the ops side is treated as a necessary evil that developers have to deal with. Ops coverage was thin as a matter of principle: there were no talks about choices of operating systems or hardware for MongoDB beyond flash storage and more memory.

Development Stacks. Being application centric, there was good coverage of MongoDB-powered stacks – MEAN and METEOR specifically got attention. The METEOR presentation by Henrik Ingo was especially cool – real-time view synchronization between the Web browser (or mobile app) and the database, as well as the same language for server-side and client-side development, is a really great concept. Henrik did not go into detail on how well it scales, though, beyond mentioning that it does not work with sharded MongoDB at this point.

Sharding Focused. Where in the MySQL world the approach is to see which applications can get by without sharding, in the MongoDB world almost everything is sharded – often with multiple instances of the MongoDB daemon running on the same OS image (on different ports). It is acknowledged that MongoDB does not scale up very well with database-level locking, though this is about to change.

Cluster aware connector. Where in the MySQL world the traditional API is to connect to a single node, in the MongoDB world you often connect to a “cluster” containing many replica sets with data sharded across them. This is really good, as it means you do not have to emulate a single server with the cluster (in particular, you do not need a single highly available IP).

Pluggable Storage Engines. This was the big thing at this show, a top topic of the keynote as well as of in-depth sessions. Unlike in MySQL, the MongoDB storage engine is chosen for the whole instance, not per collection/table. This is a transformational time for MongoDB, with the core storage engine being worked on to get document-level locking and the “WiredTiger” storage engine being added as a write-optimized option. Hopefully MongoDB is acting to protect itself from something akin to the InnoDB-Oracle fiasco in the MySQL space.

MMS. The MongoDB Monitoring Service, which now includes Backups and Deployments, was showcased a lot as an answer to all problems in the MongoDB space. A lot of work has been put into this product, including a really advanced configurator for Amazon Web Services deployments where you can configure many instance properties directly.

Not much hardcore detail. I will end basically where I started: there was not much depth for those longing for details. I’m not an expert in MongoDB (yet), so even relatively basic detail about how exactly MongoDB operates would have sufficed, but it was not covered. Most helpful for me were some side conversations, where I heard about things like the challenges of adding elements to a large array that is part of a document, or whether MongoDB has covering-index optimizations as powerful as those that exist for MySQL. Perhaps this was because it was a smaller show; maybe the next MongoDB World event will have more of this kind of in-depth content.

The post Impressions from MongoDB Day London 2014 appeared first on MySQL Performance Blog.

Categories: MySQL

Percona at AWS re:Invent and events around the world

MySQL Performance Blog - Mon, 2014-11-10 14:56

November is a busy month of events for Percona. Conferences in Moscow, London, Paris, Las Vegas, Buenos Aires, and Montevideo mean our experts are racking up the frequent flyer miles meeting users of Percona software from around the world, sharing our insights on MySQL performance and operations, and learning how others are growing and scaling their applications.

Percona started the month in Moscow attending and speaking at Highload++. This event from October 31 to November 1 welcomed thousands of experts from the biggest and brightest companies in Europe and around the world with a keen interest in developing large scale web applications.

From Moscow we traveled to London to host the annual Percona Live London event from November 3-4. If you missed the keynotes, tutorials, sessions, and general MySQL camaraderie, we’re meeting again April 13-16, 2015, in Santa Clara, CA, at the Percona Live MySQL Conference & Expo, the premier event for the rich and diverse MySQL ecosystem. Based on growing demand, we’ve expanded the conference with OpenStack Live, a unique event for learning key considerations for cloud strategies, how to improve overall performance, and operational best practices for managing OpenStack and maximizing the MySQL core. The Calls for Papers for both Percona Live and OpenStack Live close soon, so submit your speaker proposals today!

The next stop was Paris to attend, exhibit, and speak at the OpenStack Summit. This event, which ran from November 3-7, drew over 4,000 developers, users, and administrators of OpenStack cloud software seeking to learn about the technology and plan future development. Percona is pleased to play a part in the OpenStack community by participating in the Trove project for DBaaS as well as supporting those that need help with the MySQL core of the technology. In their keynote presentation, Time Warner Cable even featured Percona as a key partner helping them to radically change their business using OpenStack.

This week we’re in Las Vegas for AWS re:Invent at The Venetian running from November 11-14. AWS re:Invent will welcome thousands of developers and operations experts interested in maximizing the power of cloud infrastructure. We want to meet organizations that are creating applications on the Amazon Web Services platform using MySQL and Percona open source software.

Visit us in the Venetian Expo Hall at booth 553 by the Developer Lounge and tell us about your application. We’ll be located next to Heroku and the team from Atlassian. Stop by and enter for a chance to win a Kindle Fire HDX 8.9!

After Las Vegas, Percona MySQL experts will be speaking at two more events, both in South America.

At the MySQL, NoSQL and Cloud 2014 Latin America Conference, Percona Support Engineer Marcos Albe, Percona Consultants Francisco Bordenave and Daniel Guzman, and Senior Remote DBA Max Bubenick will speak on topics ranging from MySQL performance and TokuDB to best operational practices for database backup and data recovery. At TechMeetup v2014, Percona Consultant Fernando Ipar will lead workshops focused on Big Data Analytics and using the Go programming language. Percona is sponsoring both of these events, so visit us at our booths.

We’re looking forward to meeting many users of Percona software as well as some of our customers. If you’re at these events, keep a look out for our team in the Percona shirts. Remember to visit us at AWS re:Invent in Las Vegas this week at booth 553 and enter our drawing to win a Kindle Fire HDX 8.9!

Expect more posts soon about all of our presentations at these events so stay tuned.

The post Percona at AWS re:Invent and events around the world appeared first on MySQL Performance Blog.

Categories: MySQL

OpenStack MySQL HA Solution = Percona XtraDB Cluster: Webinar

MySQL Performance Blog - Thu, 2014-11-06 15:48

MySQL is the most popular and widely used open source database on the planet. OpenStack is fast evolving into the de facto open source cloud platform. OpenStack services such as Nova (Compute), Cinder (Storage) and Neutron (Networking), to name just a few, all use MySQL as their default datastore.

OpenStack is a cloud platform and therefore needs to be highly available. Understanding how to accomplish MySQL high availability is a key piece of making OpenStack highly available. This webinar will cover MySQL high-availability options for OpenStack, with a primary focus on high availability with Percona XtraDB Cluster/Galera, along with tradeoffs relative to other options and architectural considerations.

This webinar is titled “Percona XtraDB Cluster as a MySQL HA Solution for OpenStack” and will be held next Wednesday, November 12, at 1 p.m. EST (10 a.m. PST). Register free right here!

The post OpenStack MySQL HA Solution = Percona XtraDB Cluster: Webinar appeared first on MySQL Performance Blog.

Categories: MySQL

Data inconsistencies on MySQL replicas: Beyond pt-table-checksum

MySQL Performance Blog - Tue, 2014-11-04 08:00

Percona Toolkit’s pt-table-checksum is a great tool to find data inconsistencies between a MySQL master and its replicas. However, it is sometimes not enough to know that there are inconsistencies and to let pt-table-sync fix the issue: you may want to know exactly which rows are different, to identify the statements that created the inconsistency. This post shows one way to achieve that goal.

The issue

Let’s assume you have 2 servers running MySQL 5.5: db1, the master, and db2, the replica. You want to upgrade to MySQL 5.6 using an in-place upgrade, and to play it safe, you will upgrade db2 (the slave) first. If all goes well, you will promote it and upgrade db1.

A good thing to do after upgrading db2 is to check for potential data inconsistencies with pt-table-checksum. Once checksumming is done, you can run the following query on db2 to see if there is any data drift:

mysql> SELECT db, tbl, SUM(this_cnt) AS total_rows, COUNT(*) AS chunks
    -> FROM percona.checksums
    -> WHERE (master_cnt <> this_cnt
    ->     OR master_crc <> this_crc
    ->     OR ISNULL(master_crc) <> ISNULL(this_crc))
    -> GROUP BY db, tbl;
+------+-------+------------+--------+
| db   | tbl   | total_rows | chunks |
+------+-------+------------+--------+
| mydb | items |    3745563 |     17 |
+------+-------+------------+--------+

This indicates that inconsistencies can be found in mydb.items in 17 chunks. Now the question is: which rows are different on db1 and db2?

The solution

The previous query shows that we will find inconsistencies in 17 of the chunks pt-table-checksum used. But what is a chunk?

mysql> SELECT * FROM percona.checksums WHERE this_crc != master_crc AND tbl='items'\G
***************** 1. row *****************
            db: mydb
           tbl: items
         chunk: 28
    chunk_time: 0.123122
   chunk_index: PRIMARY
lower_boundary: 7487511
upper_boundary: 7563474
      this_crc: 2c11da8d
      this_cnt: 75964
    master_crc: 66a1c22c
    master_cnt: 75964
            ts: 2014-10-22 01:21:26
[...]

So the first chunk with inconsistencies is chunk #28, which is the set of rows where the primary key is >= 7487511 and <= 7563474.

Let’s export all these rows on both instances:

# db1
mysql> SELECT * INTO outfile '/tmp/items_db1.txt' FROM mydb.items WHERE id BETWEEN 7487511 AND 7563474;
# db2
mysql> SELECT * INTO outfile '/tmp/items_db2.txt' FROM mydb.items WHERE id BETWEEN 7487511 AND 7563474;

Then let’s use diff to isolate the non-matching rows:

# Using diff to compare rows
# diff items_db1.txt items_db2.txt
75872,75874c75872,75874
< 7563382 2127002 3 0 2014-10-22 02:51:33
< 7563383 2127002 4 0 2014-10-22 02:51:33
< 7563384 2127002 5 0 2014-10-22 02:51:33
---
> 7563382 2127002 3 0 2014-10-22 02:51:34
> 7563383 2127002 4 0 2014-10-22 02:51:34
> 7563384 2127002 5 0 2014-10-22 02:51:34
[...]

We can see that some datetime fields are off by 1 second on the 5.6 instance.

In this case, the binlogs showed queries like:

INSERT INTO items ([...],posted_at) VALUES ([...],'2014-10-22 02:51:33.835249');

MySQL 5.5 simply truncates '2014-10-22 02:51:33.835249' to '2014-10-22 02:51:33' (discarding the fractional part), while MySQL 5.6 rounds it up to '2014-10-22 02:51:34'.
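This behavior change is easy to reproduce on a throwaway table (a minimal sketch; the table name is hypothetical):

mysql> CREATE TABLE dt_test (d DATETIME);
mysql> INSERT INTO dt_test VALUES ('2014-10-22 02:51:33.835249');
mysql> SELECT d FROM dt_test;

On 5.5 the SELECT returns 2014-10-22 02:51:33; on 5.6 it returns 2014-10-22 02:51:34.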

Now it’s easy to fix the application so that it works both with MySQL 5.5 and 5.6 and then continue testing MySQL 5.6.

Conclusion

The method shown above is an easy way to find the exact records that are inconsistent between the MySQL master and a replica. It is not useful if you only want to resync the slave (in this case, just run pt-table-sync) but it can be a first step in understanding how inconsistencies are created.

The post Data inconsistencies on MySQL replicas: Beyond pt-table-checksum appeared first on MySQL Performance Blog.

Categories: MySQL

Percona XtraBackup 2.2.6 is now available

MySQL Performance Blog - Mon, 2014-11-03 15:16

Percona is glad to announce the release of Percona XtraBackup 2.2.6 on November 3, 2014. Downloads are available from our download site here and Percona Software Repositories. We’re also happy to announce that Ubuntu 14.10 and CentOS 7 users can download, install, and upgrade Percona XtraBackup 2.2.6 from Percona’s software repositories.

Percona XtraBackup enables MySQL backups without blocking user queries, making it ideal for companies with large data sets and mission-critical applications that cannot tolerate long periods of downtime. Offered free as an open source solution, Percona XtraBackup drives down backup costs while providing unique features for MySQL backups.

New Features:

  • Percona XtraBackup now reads server options from SHOW VARIABLES rather than the my.cnf configuration file.
  • Percona XtraBackup now has more verbose output during the initial table scan and in innobackupex: it prints a diagnostic message before performing a tablespace scan (which may take a long time on systems with large numbers of tablespaces), and a timestamped diagnostic message before starting a backup, apply-log or copy-back operation, making it easier for users to measure the duration of each operation.

Bugs Fixed:

  • innobackupex didn’t take the default datadir, which caused backups to fail if the datadir wasn’t specified in the my.cnf configuration file. Bug fixed #936934.
  • innobackupex will now fail with an error when innobackupex --slave-info is used on a multi-threaded non-GTID slave, because Exec_Master_Log_Pos cannot be trusted for a multi-threaded slave. Bug fixed #1372679.
  • InnoDB log scanning failure (bug #60788) would cause backups to fail. Fixed by porting the fix from MySQL 5.7. Bug fixed #1375383.
  • Options --apply-log and --decompress weren’t marked as mutually exclusive, i.e., if they were both specified, only --decompress would work, which could lead to the MySQL instance being started with an unprepared backup. Fixed by making mutually exclusive categories of options: 1. --decompress, --decrypt; 2. --copy-back; 3. --move-back; 4. --apply-log. Bug fixed #1376874.
  • innobackupex wasn’t creating directories specified in innodb_data_home_dir and innodb_log_group_home_dir when --copy-back option was used. Bug fixed #1382347.
  • Percona XtraBackup now supports all option modifiers supported by upstream MySQL: skip, disable, enable, maximum, loose. Bug fixed #664128.
  • Percona XtraBackup would fail to perform a full backup on Percona Server 5.5 if innodb_log_file_size variable wasn’t set in the [mysqld] section of my.cnf. Bug fixed #1334062.

Other bugs fixed: #1379905, #1386013, #1072695, #1375241, #1182841, and #1343722.

Release notes with all the bugfixes for Percona XtraBackup 2.2.6 are available in our online documentation. Bugs can be reported on the launchpad bug tracker. Percona XtraBackup is an open source, free MySQL hot backup software that performs non-blocking backups for InnoDB and XtraDB databases.

The post Percona XtraBackup 2.2.6 is now available appeared first on MySQL Performance Blog.

Categories: MySQL

Percona Server 5.6.21-70.0 is now available

MySQL Performance Blog - Mon, 2014-11-03 15:14

Percona is glad to announce the release of Percona Server 5.6.21-70.0 on October 30, 2014. Download the latest version from the Percona web site or from the Percona Software Repositories. We’re also happy to announce that Ubuntu 14.10 and CentOS 7 users can download, install, and upgrade Percona Server 5.6 from Percona’s software repositories.

Based on MySQL 5.6.21, including all the bug fixes in it, Percona Server 5.6.21-70.0 is the current GA release in the Percona Server 5.6 series. Percona Server is open-source and free. Complete details of this release can be found in the 5.6.21-70.0 milestone on Launchpad.

Bugs Fixed:

  • Values of IP and DB fields in the Audit Log Plugin were incorrect. Bug fixed #1379023.
  • Specifying the --malloc-lib during the server start would produce two LD_PRELOAD entries, if a system-wide jemalloc library was installed. Bug fixed #1382069.
  • In multi-threaded slave replication setup, an incomplete log event group (the one which doesn’t end with COMMIT/ROLLBACK/XID) in a relay log could have caused a replication stall. An incomplete log event group might occur as a result of one of the following events: 1) slave crash; 2) STOP SLAVE or FLUSH LOGS command issued at a specific moment; 3) server shutdown at a specific moment. Bug fixed #1331586 (upstream #73066).
  • Purging bitmaps exactly up to the last tracked LSN would abort XtraDB changed page tracking. Bug fixed #1382336.
  • mysql_install_db script would silently ignore any mysqld startup failures. Bug fixed #1382782 (upstream #74440).

Other bugs fixed: #1369950, #1335590, #1067103, and #1282599.

Release notes for Percona Server 5.6.21-70.0 are available in the online documentation. Please report any bugs on the launchpad bug tracker.

The post Percona Server 5.6.21-70.0 is now available appeared first on MySQL Performance Blog.

Categories: MySQL

Backup and restore of MySQL to OpenStack Swift

MySQL Performance Blog - Sat, 2014-11-01 13:00

MySQL database usage is popular in OpenStack. The core OpenStack services for Compute (Nova), Storage (Cinder), Networking (Neutron), Image (Glance) and Identity (Keystone) all use a MySQL database.

MySQL, as the world’s most popular database, also runs inside OpenStack virtual machines and serves as the database backend for OpenStack cloud-based applications. These MySQL instances can be configured to run in virtual machines manually (by simply installing MySQL inside a VM and running it) or can be created on demand by OpenStack Database-as-a-Service (Trove).

In either case, the MySQL data is mission-critical. OpenStack cloud administrators and cloud guests/tenants need the ability to back up and restore their MySQL databases. mysqldump is the traditional way of doing MySQL backups and restores. However, based on the MySQL community’s experience, it is widely known that mysqldump has limitations, especially in the speed of backup and restore as databases grow larger.

Percona XtraBackup overcomes these limitations and enables faster, non-blocking MySQL backups and restores on larger datasets. It also offers other important abilities such as streaming and encryption. It works with MySQL as well as all the variants such as Percona Server and MariaDB. And all of this is open source and free. Percona XtraBackup is popular and widely used in the MySQL community, and more recently it has seen adoption in the OpenStack context. It is at the heart of backup/restore for Rackspace Cloud DBaaS, and the OpenStack Database-as-a-Service (Trove) implementation uses Percona XtraBackup as a pluggable backend for database backup and restore.

Until now, however, Percona XtraBackup did not have the ability to stream backups and restores of MySQL databases to and from cloud storage targets such as OpenStack Swift or Amazon Web Services S3. The new xbcloud tool contributed by Percona changes this: at a basic level, the tool enables MySQL backups directly to OpenStack Swift.

OpenStack Swift is the object storage service of OpenStack. It provides a highly available data storage platform that scales horizontally and offers disaster recovery of data across multiple datacenters and geographies. It is the default and popular choice for storing backups of infrastructure and tenant data in OpenStack clouds.

How does the xbcloud tool work? xbcloud uploads and downloads all or part of an xbstream archive to/from OpenStack Swift. So what is xbstream? xbstream is a streaming format available in Percona XtraBackup that overcomes limitations of traditional archive formats such as tar and cpio, which cannot stream dynamically generated files (for example, dynamically compressed files).

Archive uploading employs Swift’s multipart upload for Large Objects. Along with the data, xbcloud uploads the xbstream archive index file, which contains the list of files, their parts, and the offsets of those parts within the xbstream archive. This index is what makes it possible to download only part of an archive (one or several tables from a backup) on demand.

Details of usage and examples follow.

Backup:

innobackupex --stream=xbstream /tmp | xbcloud [options] put <name>

Example:

innobackupex --stream=xbstream --extra-lsndir=/tmp /tmp | xbcloud put --storage=Swift --swift-container=test --swift-user=test:tester --swift-url=http://192.168.8.80:8080/ --swift-key=testing --parallel=10 full_backup

Restore:

xbcloud [options] get <name> [<list-of-files>] | xbstream -x

Example:

xbcloud get --storage=Swift --swift-container=test --swift-user=test:tester --swift-url=http://192.168.8.80:8080/ --swift-key=testing full_backup | xbstream -xv -C /tmp/downloaded_full
innobackupex --apply-log /tmp/downloaded_full
innobackupex --copy-back /tmp/downloaded_full
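Because of the archive index, you can also restore only part of a backup by listing the files you need after the backup name, per the syntax above. A sketch using the same Swift settings (the table file name is hypothetical):

xbcloud get --storage=Swift --swift-container=test --swift-user=test:tester --swift-url=http://192.168.8.80:8080/ --swift-key=testing full_backup mydb/items.ibd | xbstream -xv -C /tmp/partial_restore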

Interested? Try it out. Percona XtraBackup and xbcloud are open source and free. You can find instructions to download, usage and release notes for PXB 2.3 Alpha (which includes xbcloud) here.

So what is next? We at Percona are excited about the possibilities with the new xbcloud tool and Percona XtraBackup in general. Our most immediate next steps are to act on community feedback, address bugs and issues, and move toward GA soon.

Beyond that, we are looking to enhance xbcloud to support other cloud platforms such as AWS S3 and Google Compute Engine.

We would love to hear from you on this. You can either leave feedback directly as comments on this post or file enhancement requests and bugs here.

The post Backup and restore of MySQL to OpenStack Swift appeared first on MySQL Performance Blog.

Categories: MySQL

Get a handle on your HA at Percona Live London 2014

MySQL Performance Blog - Fri, 2014-10-31 05:00

From left: Liz van Dijk, Frédéric Descamps and Kenny Gryp

If you’re following this blog, it’s quite likely you’re already aware of the Percona Live London 2014 conference coming up in just a few days. Just in case, though (you know, if you’re still looking for an excuse to sign up), I wanted to put a spotlight on the tutorial to be delivered by my esteemed colleagues Frédéric Descamps (@lefred) and Kenny Gryp (@gryp), and myself.

For the past two years at Percona, we’ve spent a substantial amount of time working with customers taking their first steps toward creating highly available MySQL environments built on Galera. Percona XtraDB Cluster allows you to get one up and running very fast, but as any weathered “HA” DBA will tell you, building the cluster is only the beginning. (Percona XtraDB Cluster is an open source (free) high-availability and high-scalability solution for MySQL clustering.)

Any cluster technology is likely to introduce a great amount of complexity to your environment, and in our tutorial we want to show you not only how to get started, but also how to avoid many of the operational pitfalls we’ve encountered. Our tutorial, Percona XtraDB Cluster in a nutshell, will be taking place on Monday 3 November and is a full-day (6 hours) session, with an intense hands-on approach.

We’ll be covering a great deal of practical topics, such as:

  • Things to keep in mind when migrating an existing environment over to PXC
  • How to manage and maintain the cluster, keeping it in good shape
  • Load balancing requests across the cluster
  • Considerations for deploying PXC in the cloud

Planning on attending? Be sure to come prepared! Given the hands-on approach of the tutorial, make sure you bring your laptop with enough disk space (~20GB) and processing power to run at least 4 small VirtualBox VMs.

We look forward to seeing you there!

The post Get a handle on your HA at Percona Live London 2014 appeared first on MySQL Performance Blog.

Categories: MySQL

Facebook MySQL database engineers ready for Percona Live London 2014

MySQL Performance Blog - Thu, 2014-10-30 05:00

With 1.28 billion active users, Facebook MySQL database engineers are active and extremely valuable contributors to the global MySQL community. So naturally they are also active participants in Percona Live MySQL conferences! And next week’s Percona Live London 2014 (Nov. 3-4) is no exception. (Register now and use the promotional code “Facebook” to save £30!)

I spoke with Facebook database engineers Yoshinori “Yoshi” Matsunobu and Shlomo Priymak about their upcoming sessions along with what’s new at Facebook since our last conversation back in April.

Tom: Yoshi, last year Facebook deployed MySQL 5.6 on all production environments – what have you and your team learned since doing that? And do you have a few best practices you could share? I realize you’ll be going into detail during your session in London (MySQL 5.6 and WebScaleSQL at Facebook), but maybe a few words on a couple of the bigger ones?

Yoshi: MySQL 5.6 has excellent replication enhancements for use in large-scale deployments. For example, the crash-safe slave feature makes it possible to recover without rebuilding a slave instance after a server crash. This can greatly minimize slave downtime, especially if your database size is large. There are many other new features such as GTID, the multi-threaded slave and streaming mysqlbinlog, and we actively use them in production.
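(As an illustration, not taken from the interview: a minimal my.cnf sketch of the stock MySQL 5.6 settings behind the features Yoshi mentions; the values are illustrative, not Facebook’s.)

[mysqld]
relay_log_info_repository = TABLE  # keep the replication position inside InnoDB, crash-safely
relay_log_recovery        = ON     # rebuild relay logs from the master after a slave crash
gtid_mode                 = ON     # GTID-based replication (also requires enforce_gtid_consistency)
enforce_gtid_consistency  = ON
slave_parallel_workers    = 8      # multi-threaded slave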

For InnoDB, Online DDL is a good example of easing operations. Many MySQL users do schema changes by switching masters; this can minimize downtime but requires operational effort. Online DDL makes things much easier.
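(For example, a minimal sketch of the 5.6 online DDL syntax; the table and column names are hypothetical.)

mysql> ALTER TABLE users ADD COLUMN last_login DATETIME, ALGORITHM=INPLACE, LOCK=NONE;

With LOCK=NONE, reads and writes to the table continue while the change runs.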

Tom: Facebook is an active and extremely valuable part of the overall MySQL community and ecosystem – what are some of the key features and improvements you’ve contributed in the past year since moving to MySQL 5.6?

Yoshi: For InnoDB, I think online defragmentation and faster full table scan are the most valuable contributions from Facebook in 5.6. I have received very positive feedback about faster InnoDB full table scan (Logical ReadAhead). My colleague Rongrong will speak about something interesting regarding online defragmentation at Percona Live London. For Replication, we have done many optimizations to make GTID and MTS work without pain. Semi-Synchronous mysqlbinlog and backported Loss-Less semisync from MySQL 5.7 are very useful when you use Semi-Synchronous replication.

Tom: Shlomo, your session, “MySQL Automation at Facebook Scale,” will be of great interest to DBAs at large and growing organizations, considering that Facebook has one of the world’s largest MySQL database clusters. What are the two or three most significant things that you’ve learned as a database engineer operating a cluster of this size? And has anything surprised you along the way (so far)?

Shlomo: This is a great question! We like to speak of “10x” at Facebook when thinking of scaling. For example, what would you do differently if the number of servers you had was 10x more than what it is? This type of mental exercise is surprisingly useful when working with systems at scale. If you, or any of the readers, try to extrapolate this about systems you manage, there will be things you’ll be imagining about how a system like this would be – and you won’t be too far from our reality in many aspects.

You’d imagine that we automate much of the single units of work, like master/slave failover, upgrades and schema changes. You’d suspect we have automated fault detection, self managing systems, good alarming and self remediation. You’d presume that if you’re used to running a command on 100 machines, you’ll now be running it on 1000. At least that’s what I thought to myself, so these are not the things that surprised me. There are a few fundamental shifts in one’s thinking when you get to these sizes, which I didn’t foresee.

The first one is that there is absolutely no such thing as “one-off.” If there is a server somewhere that hits a problem every three years, and you have 1000 servers, this will be happening daily! Take it to 10,000 servers, and you can see absolutely nothing is a “one-off”. We can’t write things off as “worst case, I’ll get an SMS.” Whatever it is, we have to chase it down and fix it. Not just that – to deploy a fix at scale can require writing fairly large amounts of code, a fix that could be deployed manually by a DBA in smaller environments.

The second one is adapting to constraints which are very pragmatic and tangible. If you’re on AWS, you’re pretty much isolated from things like worrying where your servers are physically located, when they go over their lifetime, and if the firmware on the switch in the rack needs to be upgraded. If you’re a small shop and have a few racks up in a co-lo, hardware maintenance is just not as frequent, but it becomes more painful as you grow.

At Facebook, we run our own datacenters! We need to work around interesting challenges, such as running datacenters that have highly variable compositions of server hardware. Since we have so many servers, something is always going on. Racks of servers need to be moved. Whole clusters need to be rebuilt or refreshed, to be made better, faster, stronger.  New datacenters are constructed, others decommissioned.

Tom: And this is where automation comes into the picture, right?

Shlomo: We have had to build a lot of automation to make these operations seamless, and we work closely with the Site Ops teams on the ground to coordinate these logistically complicated processes.

Another thing my team does in this space is planning capacity and hardware purchases. Since we build our own servers, the turnaround time between ordering and getting machines is quite long, so proper planning is paramount. Buy too much, and you’ve wasted millions of dollars. Buy too few servers, and there won’t be space for user growth and upcoming projects. The sheer scale makes these decisions more complicated and involved.

These things have actually made my job much more interesting, and I think I’d find it hard to adjust to a smaller environment.

Tom: Last April Facebook announced a move to the newly created WebScaleSQL. Yoshi, do you have an update on where WebScaleSQL is today? And I know it’s early, but has there been any impact on Facebook yet?

Yoshi: WebscaleSQL is a collaboration among engineers from several companies that face similar challenges in running MySQL at scale. Collaboration is nothing new to the MySQL community. The intent is to make this collaboration more efficient.

We are based on the latest upstream (currently MySQL-5.6.21), and added many features. We added patches to improve InnoDB performance around compression LRU flushing, locking, NUMA Support, and doublewrite. We statically link Semi-Sync based on lessons learned at Facebook environments (plugin-lock caused hot mutex contentions). We have many upcoming features such as async clients.

We will continue to track the upstream branch that is the latest, production-ready release (currently MySQL 5.6). We are continuing to push the generally useful changes we have from all of the participants.  If you think you have something to contribute, get in touch!

Tom: I remember being surprised earlier this year when you told me there was usually just one MySQL Operations team member on call at any given time thanks to “robots.” How many robots did your team build and what do they do? Oh, and should rank-and-file DBAs around the world be worried about losing their day jobs to these robots?

Shlomo: Instead of becoming obsolete as some fear, our team is shifting its focus from smaller to larger problems, as we rise higher in the levels of abstraction. Our team has progressed with the requirements of the role. From being a team of DBAs that automate some of their work, we have become more like Production Engineers. We design, write and maintain MySQL/Facebook-specific automation that does our work for us.

While we build these software “robots” to do our work, we also have to maintain them. The job of the oncall is to fix these robots when they malfunction, and that can sometimes be difficult due to the size of our codebase.

In regards to employment concerns, I’d say our work has become more interesting, and the amount has increased. It definitely did not decrease, so if Facebook is indicative of other companies, jobs are not at risk just yet. Speaking of jobs – if what we’re doing sounds interesting, we’re hiring!

Oh, and as for details about these “robots” – that’s the topic of my talk next week in London. Come and hear me speak if you want to know more!

Tom: Yoshi, you also will host a session titled “Fast Master Failover without Data Loss.” I don’t want to give too much away, but how did you get failover to work at scale – across vast datacenters?

Yoshi: Master failure is the norm at Facebook because of the large number of servers. Without automation, it is not realistic for a limited number of people to manage. We have a very interesting infrastructure to automate failure handling at Facebook scale. To automate things, reliability is important: unreliable automation makes engineers spend lots of time fixing things manually, and that increases downtime. It is also important to define what to automate and what we shouldn’t automate, to define failure scenarios, and to write good test cases and continuously integrate. There are multiple failure scenarios, like the ones below, and you’ll hear about each in detail at my session:

– mysqld crash
– mysqld stalls
– kernel panic and reboot
– error spikes caused by H/W failure
– error spikes caused by bad application logic
– rack switch down
– multiple rack switches down
– datacenter down

Tom: What other sessions, keynotes or events are you looking forward to at Percona Live London 2014? And are you guys planning on attending the MySQL Community Dinner?

Yoshi: “MySQL 5.7: Performance and Scalability Benchmark” (led by Oracle MySQL performance architect Dimitri Kravtchuk). And yes, we’re looking forward to meeting people at the MySQL Community Dinner!

Tom:  Thanks again Yoshi and Shlomo for taking the time to speak with me and I look forward to seeing you both in London next week!

And readers, I invite you to register now for Percona Live London using the promotional code “Facebook” to save £30. I also hope to see you at the MySQL Community Dinner next Monday (Nov. 3). Space is limited so be sure to reserve your spot now and join us aboard our private double-decker bus to the restaurant.

I’d also like to thank the Percona Live London 2014 Conference Committee for putting together a terrific event this year! The conference committee includes:

  • Dailymotion’s Cédric Peintre, conference chairman
  • Percona’s David Busby
  • MariaDB’s Colin Charles
  • ebay Classifieds Group’s Luis Motta Campos
  • Booking.com’s Nicolai Plum
  • Oracle’s Morgan Tocker
  • Spil Games’ Art van Scheppingen

The post Facebook MySQL database engineers ready for Percona Live London 2014 appeared first on MySQL Performance Blog.

Categories: MySQL

MySQL and Openstack deep dive talk at OpenStack Paris Summit (and more!)

MySQL Performance Blog - Wed, 2014-10-29 12:59

I will present a benchmarking talk next week (Nov. 4) at the OpenStack Paris Summit with Jay Pipes from Mirantis. In order to be able to talk about benchmarking, we had to be able to set up and tear down OpenStack environments really quickly. For the benchmarks, we are using a deployment on AWS (ironically) where the instances aren’t actually started and the tenant network is not reachable but all the backend operations still happen.

The first performance bottleneck we hit wasn’t at the MySQL level. We used Rally to benchmark the environment, booting 1,000 fake instances as a first pass.

The first bottleneck we saw was neutron-server eating up a single CPU core. We took a deeper look and saw that neutron-server was utilizing one core completely: by default, neutron does everything in a single process. After configuring the api workers and the rpc workers, performance became significantly better.

api_workers = 64
rpc_workers = 32
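(For reference, a sketch of where these settings live; the exact path may vary by distribution.)

# /etc/neutron/neutron.conf
[DEFAULT]
api_workers = 64
rpc_workers = 32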

Before adding the options:

u'runner': {u'concurrency': 24, u'times': 1000, u'type': u'constant'}}
+------------------+-----------+-----------+-----------+---------------+---------------+---------+-------+
| action           | min (sec) | avg (sec) | max (sec) | 90 percentile | 95 percentile | success | count |
+------------------+-----------+-----------+-----------+---------------+---------------+---------+-------+
| nova.boot_server | 4.125     | 9.336     | 15.547    | 11.795        | 12.362        | 100.0%  | 1000  |
| total            | 4.126     | 9.336     | 15.547    | 11.795        | 12.362        | 100.0%  | 1000  |
+------------------+-----------+-----------+-----------+---------------+---------------+---------+-------+
Whole scenario time without context preparation: 391.359671831

After adding the options:

u'runner': {u'concurrency': 24, u'times': 1000, u'type': u'constant'}}
+------------------+-----------+-----------+-----------+---------------+---------------+---------+-------+
| action           | min (sec) | avg (sec) | max (sec) | 90 percentile | 95 percentile | success | count |
+------------------+-----------+-----------+-----------+---------------+---------------+---------+-------+
| nova.boot_server | 2.821     | 6.958     | 36.826    | 8.165         | 10.49         | 100.0%  | 1000  |
| total            | 2.821     | 6.958     | 36.826    | 8.165         | 10.49         | 100.0%  | 1000  |
+------------------+-----------+-----------+-----------+---------------+---------------+---------+-------+
Whole scenario time without context preparation: 292.163493156

Stop by our talk at the OpenStack Paris Summit for more details!

In addition to our talk, Percona has two additional speakers at the OpenStack Paris Summit. George Lorch, Percona software engineer, will speak with Vipul Sabhaya of the HP Cloud Platform Services team on “Percona Server Features for OpenStack and Trove Ops.” Tushar Katarki, Percona director of product management, will present a vBrownBag Tech Talk entitled “MySQL High Availability Options for OpenStack.” Percona is exhibiting at the OpenStack Paris Summit conference, as well – stop by booth E20 and say hello!

At Percona, we’re pleased to see the adoption of our open source software by the OpenStack community and we are working actively to develop more solutions for OpenStack users. We also provide Consulting assistance to organizations that are adopting OpenStack internally or are creating commercial services on top of OpenStack.

We are also pleased to introduce the first annual OpenStack Live, a conference focused on OpenStack and Trove, which is April 13 & 14, 2015 in Santa Clara, California. The call for speaking proposals is now open for submissions which will be reviewed by our OpenStack Live Conference Committee (including me!).

The post MySQL and Openstack deep dive talk at OpenStack Paris Summit (and more!) appeared first on MySQL Performance Blog.

Categories: MySQL

How to deal with MySQL deadlocks

MySQL Performance Blog - Tue, 2014-10-28 07:00

A deadlock in MySQL happens when two or more transactions mutually hold and request locks, creating a cycle of dependencies. In a transactional system, deadlocks are a fact of life and not completely avoidable. InnoDB automatically detects transaction deadlocks, rolls back one transaction immediately and returns an error. It uses a metric to pick the transaction that is easiest to roll back. Though an occasional deadlock is not something to worry about, frequent occurrences call for attention.

Before MySQL 5.6, only the latest deadlock could be reviewed, using the SHOW ENGINE INNODB STATUS command. But with Percona Toolkit’s pt-deadlock-logger you can have deadlock information retrieved from SHOW ENGINE INNODB STATUS at a given interval and saved to a file or table for later diagnosis. For more information on using pt-deadlock-logger, see this post. With MySQL 5.6, you can enable the new variable innodb_print_all_deadlocks to have all InnoDB deadlocks recorded in the mysqld error log.
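A minimal sketch of both approaches (hostname, credentials and destination table are illustrative):

mysql> SET GLOBAL innodb_print_all_deadlocks = ON;  -- MySQL 5.6+: log every deadlock to the error log

$ pt-deadlock-logger h=localhost,u=root --dest D=percona,t=deadlocks --create-dest-table --interval 30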

Before and above all diagnosis, it is always an important practice to have the application catch the deadlock error (MySQL error no. 1213) and handle it by retrying the transaction.

How to diagnose a MySQL deadlock

A MySQL deadlock could involve more than two transactions, but the LATEST DETECTED DEADLOCK section only shows the last two. Also, it only shows the last statement executed in each of the two transactions, and the locks from the two transactions that created the cycle. What is missing are the earlier statements that might have actually acquired the locks. I will show some tips on how to collect those missing statements.

Let’s look at two examples to see what information is given. Example 1:

1 141013 6:06:22
2 *** (1) TRANSACTION:
3 TRANSACTION 876726B90, ACTIVE 7 sec setting auto-inc lock
4 mysql tables in use 1, locked 1
5 LOCK WAIT 9 lock struct(s), heap size 1248, 4 row lock(s), undo log entries 4
6 MySQL thread id 155118366, OS thread handle 0x7f59e638a700, query id 87987781416 localhost msandbox update
7 INSERT INTO t1 (col1, col2, col3, col4) values (10, 20, 30, 'hello')
8 *** (1) WAITING FOR THIS LOCK TO BE GRANTED:
9 TABLE LOCK table `mydb`.`t1` trx id 876726B90 lock mode AUTO-INC waiting
10 *** (2) TRANSACTION:
11 TRANSACTION 876725B2D, ACTIVE 9 sec inserting
12 mysql tables in use 1, locked 1
13 876 lock struct(s), heap size 80312, 1022 row lock(s), undo log entries 1002
14 MySQL thread id 155097580, OS thread handle 0x7f585be79700, query id 87987761732 localhost msandbox update
15 INSERT INTO t1 (col1, col2, col3, col4) values (7, 86, 62, "a lot of things"), (7, 76, 62, "many more")
16 *** (2) HOLDS THE LOCK(S):
17 TABLE LOCK table `mydb`.`t1` trx id 876725B2D lock mode AUTO-INC
18 *** (2) WAITING FOR THIS LOCK TO BE GRANTED:
19 RECORD LOCKS space id 44917 page no 529635 n bits 112 index `PRIMARY` of table `mydb`.`t2` trx id 876725B2D lock mode S locks rec but not gap waiting
20 *** WE ROLL BACK TRANSACTION (1)

Line 1 gives the time when the deadlock happened. If your application code catches and logs deadlock errors, which it should, then you can match this timestamp with the timestamps of deadlock errors in the application log. That gives you the transaction that got rolled back; from there, retrieve all statements from that transaction.

Line 3 & 11, take note of the transaction number and the ACTIVE time. If you log SHOW ENGINE INNODB STATUS output periodically (which is a good practice), then you can search previous outputs for the transaction number and hopefully see more statements from the same transaction. The ACTIVE seconds give a hint as to whether the transaction is a single-statement or a multi-statement one.

Line 4 & 12, the tables in use and locked counts are only with respect to the current statement. So having 1 table in use does not necessarily mean that the transaction involves 1 table only.

Line 5 & 13, this is worth attention, as it tells how many changes the transaction had made (the “undo log entries”) and how many row locks it held (the “row lock(s)”). This information hints at the complexity of the transaction.

Line 6 & 14, take note of the thread id, connecting host and connecting user. If you use different MySQL users for different application functions, which is another good practice, then you can tell which application area the transaction comes from based on the connecting host and user.

Line 9, for the first transaction, only the lock it was waiting for is shown, in this case the AUTO-INC lock on table t1. Other possible values are S for a shared lock and X for an exclusive lock, with or without gap locks.

Line 16 & 17, for the second transaction, it shows the lock(s) it held, in this case the AUTO-INC lock which was what TRANSACTION (1) was waiting for.

Line 18 & 19 shows which lock TRANSACTION (2) was waiting for. In this case, it was a shared not gap record lock on another table’s primary key. There are only a few sources for a shared record lock in InnoDB:
1) use of SELECT … LOCK IN SHARE MODE
2) on foreign key referenced record(s)
3) with INSERT INTO… SELECT, shared locks on source table
The current statement of trx(2) is a simple insert to table t1, so 1 and 3 are eliminated. By checking SHOW CREATE TABLE t1, you could confirm that the S lock was due to a foreign key constraint to the parent table t2.

Example 2: With the MySQL Community version, each record lock has its record content printed:

1 2014-10-11 10:41:12 7f6f912d7700
2 *** (1) TRANSACTION:
3 TRANSACTION 2164000, ACTIVE 27 sec starting index read
4 mysql tables in use 1, locked 1
5 LOCK WAIT 3 lock struct(s), heap size 360, 2 row lock(s), undo log entries 1
6 MySQL thread id 9, OS thread handle 0x7f6f91296700, query id 87 localhost root updating
7 update t1 set name = 'b' where id = 3
8 *** (1) WAITING FOR THIS LOCK TO BE GRANTED:
9 RECORD LOCKS space id 1704 page no 3 n bits 72 index `PRIMARY` of table `test`.`t1` trx id 2164000 lock_mode X locks rec but not gap waiting
10 Record lock, heap no 4 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
11 0: len 4; hex 80000003; asc ;;
12 1: len 6; hex 000000210521; asc ! !;;
13 2: len 7; hex 180000122117cb; asc ! ;;
14 3: len 4; hex 80000008; asc ;;
15 4: len 1; hex 63; asc c;;
16
17 *** (2) TRANSACTION:
18 TRANSACTION 2164001, ACTIVE 18 sec starting index read
19 mysql tables in use 1, locked 1
20 3 lock struct(s), heap size 360, 2 row lock(s), undo log entries 1
21 MySQL thread id 10, OS thread handle 0x7f6f912d7700, query id 88 localhost root updating
22 update t1 set name = 'c' where id = 2
23 *** (2) HOLDS THE LOCK(S):
24 RECORD LOCKS space id 1704 page no 3 n bits 72 index `PRIMARY` of table `test`.`t1` trx id 2164001 lock_mode X locks rec but not gap
25 Record lock, heap no 4 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
26 0: len 4; hex 80000003; asc ;;
27 1: len 6; hex 000000210521; asc ! !;;
28 2: len 7; hex 180000122117cb; asc ! ;;
29 3: len 4; hex 80000008; asc ;;
30 4: len 1; hex 63; asc c;;
31
32 *** (2) WAITING FOR THIS LOCK TO BE GRANTED:
33 RECORD LOCKS space id 1704 page no 3 n bits 72 index `PRIMARY` of table `test`.`t1` trx id 2164001 lock_mode X locks rec but not gap waiting
34 Record lock, heap no 3 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
35 0: len 4; hex 80000002; asc ;;
36 1: len 6; hex 000000210520; asc ! ;;
37 2: len 7; hex 17000001c510f5; asc ;;
38 3: len 4; hex 80000009; asc ;;
39 4: len 1; hex 62; asc b;;

Line 9 & 10: The ‘space id’ is the tablespace id, and ‘page no’ gives which page the record lock is on inside the tablespace. The ‘n bits’ value is not the page offset, but the number of bits in the lock bitmap. The page offset is the ‘heap no’ on line 10.

Line 11~15: It shows the record data in hex numbers. Field 0 is the cluster index (primary key); ignoring the highest bit, the value is 3. Field 1 is the transaction id of the transaction that last modified this record; the decimal value is 2164001, which is TRANSACTION (2). Field 2 is the rollback pointer. Starting from field 3 is the rest of the row data: field 3 is an integer column with value 8, and field 4 is a string column with the character ‘c’. By reading the data, we know exactly which row is locked and what its current values are.

What else can we learn from the analysis? Since most MySQL deadlocks happen between two transactions, we could start the analysis based on that assumption. In Example 1, trx (2) was waiting on a shared lock, so trx (1) held either a shared or an exclusive lock on that primary key record of table t2. Let’s say col2 is the foreign key column. By checking the current statement of trx (1), we know it did not require the same record lock, so it must have been some previous statement in trx (1) that required S or X lock(s) on t2’s PK record(s). Trx (1) only made 4 row changes in 7 seconds. You have now learned a few characteristics of trx (1): it does a lot of processing but makes few changes; the changes involve tables t1 and t2, including a single record insertion into t2. This information, combined with other data, could help developers locate the transaction.

Where else can we find previous statements of the transactions? Besides the application log and previous SHOW ENGINE INNODB STATUS output, you may also leverage the binlog, the slow log and/or the general query log. With the binlog, if binlog_format=statement, each binlog event has the thread_id. Only committed transactions are logged into the binlog, so we can only look for trx (2) there. In the case of Example 1, we know when the deadlock happened, and we know trx (2) started 9 seconds before it. We can run mysqlbinlog on the right binlog file and look for statements with thread_id = 155097580. It is always good to then cross-reference the statements with the application code to confirm.

$ mysqlbinlog -vvv --start-datetime="2014-10-13 6:06:12" --stop-datetime="2014-10-13 6:06:22" mysql-bin.000010 > binlog_1013_0606.out

With Percona Server 5.5 and above, you can set log_slow_verbosity to include the InnoDB transaction id in the slow log. Then, if you also have long_query_time = 0, you will be able to catch all statements, including rolled-back ones, in the slow log file. With the general query log, the thread id is included and can be used to look for related statements.
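A minimal sketch of that Percona Server setup (logging every statement is expensive, so lower long_query_time only temporarily):

mysql> SET GLOBAL slow_query_log = ON;
mysql> SET GLOBAL long_query_time = 0;          -- log every statement
mysql> SET GLOBAL log_slow_verbosity = 'full';  -- include InnoDB details such as the transaction id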

How to avoid a MySQL deadlock

There are things we could do to eliminate a deadlock after we understand it.

– Make changes to the application. In some cases, you can greatly reduce the frequency of deadlocks by splitting a long transaction into smaller ones, so locks are released sooner. In other cases, the deadlock arises because two transactions touch the same sets of data, in one or more tables, in different orders. Change them to access the data in the same order, in other words, serialize the access. That way you get a lock wait instead of a deadlock when the transactions happen concurrently.

– Make changes to the table schema, such as removing a foreign key constraint to detach two tables, or adding indexes to minimize the rows scanned and locked.

– In case of gap locking, you may change the transaction isolation level to READ COMMITTED for the session or transaction to avoid it. But then the binlog format for the session or transaction has to be ROW or MIXED.
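For example, a sketch of that last option:

mysql> SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
mysql> SET SESSION binlog_format = 'ROW';  -- statement-based binlogging is unsafe with READ COMMITTED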

The post How to deal with MySQL deadlocks appeared first on MySQL Performance Blog.

Categories: MySQL

MySQL & Friends Devroom FOSDEM 2015

MySQL Performance Blog - Mon, 2014-10-27 13:50

You can already feel the cold of February coming slowly… you can also smell waffles and fries, and see a large number of beards walking around with laptops… you are right, FOSDEM is coming! And as every year, the MySQL community will also be present! For the 4th year in a row, I’ll be organizing the MySQL & Friends Devroom.

The FOSDEM 2015 edition will be held January 31 and February 1 here in Brussels, and the MySQL & Friends Devroom is back on Sunday from 9 a.m. What is FOSDEM? It stands for the “Free and Open Source Software Developers’ European Meeting.” It’s a free event that offers open source communities a place to meet, share ideas and collaborate.

As every year, the “Call for Papers” has been announced on the MySQL mailing list, and you can still read it here. The CfP is open until December 7!

This year the committee responsible for selecting the talks is composed of:

* Dimitri Kravtchuk, representing Oracle
* Daniël van Eeden for the Community
* Roland Bouman for the Community
* Cédric Peintre for the Community
* Liz van Dijk, representing Percona
* Serge Frezefond, representing MariaDB
* René Cannaò, representing Blackbird IT

Thanks to all who have accepted this role, and I wish them the best in putting together the strongest schedule possible.

Don’t forget to submit your sessions in time (submit here, and don’t forget to select the MySQL track), and see you soon in Brussels to discover amazing MySQL-related stuff and have some beers with friends!

The post MySQL & Friends Devroom FOSDEM 2015 appeared first on MySQL Performance Blog.

Categories: MySQL

MariaDB 10.1: Better query optimization for ORDER BY … LIMIT

Sergey Petrunia's blog - Fri, 2014-10-10 23:47

For some reason, we’ve been getting a lot of issues with ORDER BY optimization recently. The fixes have passed Elena Stepanova’s scrutiny and I’ve pushed them to MariaDB 10.1. Now, MariaDB’s ORDER BY ... LIMIT optimizer:

  • Doesn’t make stupid choices when several multi-part keys and potential range accesses are present (MDEV-6402)
  • Always uses “range” (and not a full “index” scan) when it switches to an index to satisfy ORDER BY … LIMIT (MDEV-6657)
  • Tries hard to be smart and use cost/number of records estimates from other parts of the optimizer (MDEV-6384, MDEV-465, MySQL Bug#36817)
  • Takes full advantage of InnoDB’s Extended Keys feature when checking if filesort() can be skipped (MDEV-6796)

so, if you are struggling with poor optimization of ORDER BY ... LIMIT queries, help may be on the way.
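
As a hypothetical illustration of the pattern these fixes target (table, column and index names invented), a query like the one below should now favor a “range” read of an index on col1, stopping after the first 10 matching rows; whether the optimizer actually picks the index still depends on the available statistics:

EXPLAIN SELECT * FROM t1
WHERE col1 BETWEEN 10 AND 20
ORDER BY col1
LIMIT 10;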

I have to admit that optimizations for ORDER BY ... LIMIT are still far from perfect (both in MySQL and in MariaDB). The code is old and ugly, and having spent some time debugging it, I can easily believe it still does a poor job in some cases. The good part is that we now have some fresh knowledge of the ORDER BY … LIMIT optimization code, as well as RQG scripts for testing ORDER BY .. LIMIT queries. If you have any issues with ORDER BY … LIMIT optimization, we would appreciate bugs filed for them.

Categories: MySQL

A discovery - Index Condition Pushdown can cause a slowdown after all

Sergey Petrunia's blog - Fri, 2014-10-10 21:56

MariaDB 5.5 and then MySQL 5.6 got the Index Condition Pushdown (ICP) optimization (initially coded by yours truly). The idea of ICP is simple: after reading the index record, check the part of the WHERE condition that can be computed using index columns, and only then read the table record. That way, we avoid reading table rows that don’t satisfy the index condition.
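
A sketch of the typical shape of an ICP-friendly query (the table, index and constants here are hypothetical):

ALTER TABLE t1 ADD INDEX k (key_part1, key_part2);

-- Range access can use only key_part1; the LIKE predicate on key_part2
-- cannot narrow the range. Without ICP, the server reads the full table row
-- for every index entry in the range and checks the LIKE afterwards.
-- With ICP, the LIKE is checked on the index entry first, so non-matching
-- rows are never read. EXPLAIN shows "Using index condition" in Extra.
EXPLAIN SELECT * FROM t1
WHERE key_part1 BETWEEN 10 AND 20
  AND key_part2 LIKE '%abc%';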

It seems apparent that ICP can never make things slower. The WHERE clause has to be checked anyway, and not reading certain records can only make things faster.

That was what I thought, too, until recently Joffrey Michaie observed the contrary “in the wild”: a real-world case (MDEV-6713) where using Index Condition Pushdown was slower than not using it. The slowdown was about 20%, on both MariaDB and MySQL.

From what I have been able to investigate so far, the slowdown is caused by these three factors working together:

  • a VARCHAR(255) column in the index. MySQL’s in-memory data representation for VARCHARs is not space efficient. If a column is defined as VARCHAR(255), any value will occupy the entire 255 bytes.
  • InnoDB’s row prefetch cache. When InnoDB detects that one is reading a lot of rows from an index, it pre-fetches index records and stores them in an internal cache. The cache uses the inefficient in-memory data representation.
  • Design of MySQL’s Item classes. One can’t evaluate an expression over a table record that is in the InnoDB prefetch cache. Expression evaluation functions expect to find column values in the table’s “primary location”, internally known as table->record[0]. In order for ICP to check the condition, index columns have to be copied to table->record[0] first.

I hope we will be able to investigate this problem further and post more about it soon. For now, the news is that ICP can cause a slowdown when the index has big VARCHAR columns.
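
Until then, if you suspect you are hitting this case, ICP can be toggled per session via optimizer_switch (the flag exists in both MariaDB and MySQL 5.6), which makes it easy to compare timings with and without it:

SET optimizer_switch = 'index_condition_pushdown=off';
-- run the query and note the time, then switch it back on:
SET optimizer_switch = 'index_condition_pushdown=on';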

Categories: MySQL

MySQL compression: Compressed and Uncompressed data size

MySQL Performance Blog - Fri, 2014-10-10 14:34

MySQL has information_schema.tables, which contains information such as “data_length” or “avg_row_length.” The documentation on this table, however, is quite poor, assuming those fields are self-explanatory; they are not when it comes to tables that employ compression. And this is where inconsistency is born. Let’s take a look at the same table, containing some highly compressible data, stored with different storage engines that support MySQL compression.
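
(The post does not show the table definition; the sketch below is a hypothetical reconstruction of the kind of setup that produces numbers like the ones that follow. The table name comp matches the output; everything else is assumed.)

-- One highly compressible VARCHAR column, roughly 10KB per row.
CREATE TABLE comp (data VARCHAR(10000)) ENGINE=TokuDB;
-- For the other two runs, roughly:
--   ENGINE=ARCHIVE
--   ENGINE=InnoDB ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=4
INSERT INTO comp VALUES (REPEAT('a', 10000));
INSERT INTO comp SELECT * FROM comp;   -- repeat to double the row count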

TokuDB:

mysql> select * from information_schema.tables where table_schema='test' \G
*************************** 1. row ***************************
  TABLE_CATALOG: def
   TABLE_SCHEMA: test
     TABLE_NAME: comp
     TABLE_TYPE: BASE TABLE
         ENGINE: TokuDB
        VERSION: 10
     ROW_FORMAT: tokudb_zlib
     TABLE_ROWS: 40960
 AVG_ROW_LENGTH: 10003
    DATA_LENGTH: 409722880
MAX_DATA_LENGTH: 9223372036854775807
   INDEX_LENGTH: 0
      DATA_FREE: 421888
 AUTO_INCREMENT: NULL
    CREATE_TIME: 2014-10-10 07:59:05
    UPDATE_TIME: 2014-10-10 08:01:20
     CHECK_TIME: NULL
TABLE_COLLATION: latin1_swedish_ci
       CHECKSUM: NULL
 CREATE_OPTIONS:
  TABLE_COMMENT:
1 row in set (0.00 sec)

Archive:

mysql> select * from information_schema.tables where table_schema='test' \G
*************************** 1. row ***************************
  TABLE_CATALOG: def
   TABLE_SCHEMA: test
     TABLE_NAME: comp
     TABLE_TYPE: BASE TABLE
         ENGINE: ARCHIVE
        VERSION: 10
     ROW_FORMAT: Compressed
     TABLE_ROWS: 40960
 AVG_ROW_LENGTH: 12
    DATA_LENGTH: 501651
MAX_DATA_LENGTH: 0
   INDEX_LENGTH: 0
      DATA_FREE: 0
 AUTO_INCREMENT: NULL
    CREATE_TIME: NULL
    UPDATE_TIME: 2014-10-10 08:08:24
     CHECK_TIME: NULL
TABLE_COLLATION: latin1_swedish_ci
       CHECKSUM: NULL
 CREATE_OPTIONS:
  TABLE_COMMENT:
1 row in set (0.01 sec)

InnoDB:

mysql> select * from information_schema.tables where table_schema='test' \G
*************************** 1. row ***************************
  TABLE_CATALOG: def
   TABLE_SCHEMA: test
     TABLE_NAME: comp
     TABLE_TYPE: BASE TABLE
         ENGINE: InnoDB
        VERSION: 10
     ROW_FORMAT: Compressed
     TABLE_ROWS: 40660
 AVG_ROW_LENGTH: 4168
    DATA_LENGTH: 169480192
MAX_DATA_LENGTH: 0
   INDEX_LENGTH: 0
      DATA_FREE: 1572864
 AUTO_INCREMENT: NULL
    CREATE_TIME: 2014-10-10 08:33:22
    UPDATE_TIME: NULL
     CHECK_TIME: NULL
TABLE_COLLATION: latin1_swedish_ci
       CHECKSUM: NULL
 CREATE_OPTIONS: row_format=COMPRESSED KEY_BLOCK_SIZE=4
  TABLE_COMMENT:
1 row in set (0.00 sec)

From this we can see that Archive and InnoDB show the COMPRESSED values for DATA_LENGTH and AVG_ROW_LENGTH, while TokuDB (as of TokuDB 7.5.1, as shipped with Percona Server 5.6.21-69) shows the uncompressed ones.

The problem here is not only the lack of consistency, but also that we need to know BOTH numbers. We often need the uncompressed value to understand how much data there really is in the table, and the compressed value to know how much space it takes on disk. Comparing the two also helps us understand the compression ratio (or, on the contrary, the expansion due to storage overhead and indexes).

Looking at the Information_Schema tables available, I can’t find any way to see how much uncompressed data is stored in an InnoDB (or Archive) table. The simple trick I can use is running a query along these lines: SELECT SUM(LENGTH(col1)+LENGTH(col2)…) FROM T. This slightly overestimates the length by converting numbers and dates to strings, but it is good enough for most purposes.
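
For instance, for a hypothetical two-column table t, the trick looks like this:

-- Approximates the uncompressed data volume. Numbers and dates are
-- measured by the length of their string form, hence the slight overestimate.
SELECT SUM(LENGTH(col1) + LENGTH(col2)) AS uncompressed_bytes FROM t;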

TokuDB, though, while providing uncompressed information in the Information_Schema TABLES table, also lets you get the real on-disk storage information from its own information schema tables:

mysql> select * from information_schema.TokuDB_fractal_tree_info where table_schema='test' and table_name='comp' limit 10 \G
*************************** 1. row ***************************
        dictionary_name: ./test/comp-main
     internal_file_name: ./_test_sql_147e_292e_main_2c20c08_1_1b_B_0.tokudb
bt_num_blocks_allocated: 125
   bt_num_blocks_in_use: 125
      bt_size_allocated: 1880088
         bt_size_in_use: 1502232
           table_schema: test
             table_name: comp
  table_dictionary_name: main
*************************** 2. row ***************************
        dictionary_name: ./test/comp-status
     internal_file_name: ./_test_sql_147e_292e_status_2c20bdd_1_1b.tokudb
bt_num_blocks_allocated: 4
   bt_num_blocks_in_use: 4
      bt_size_allocated: 8704
         bt_size_in_use: 512
           table_schema: test
             table_name: comp
  table_dictionary_name: status
2 rows in set (0.01 sec)

This shows us that this table really consists of 2 files, each “bt_size_allocated” bytes in length. These numbers are close to what you will see on disk, but not exact: I see these files taking 1886208 and 16384 bytes, respectively. I wish an exact length were available to query, so we would not have to wonder how large the difference is and whether it can, in some cases, grow large enough to matter.

If you’re just looking for how much space has been allocated and how much is currently used for a given TokuDB table, you can use a query like this:

mysql> select sum(bt_size_allocated) total, sum(bt_size_in_use) used, sum(bt_size_allocated)-sum(bt_size_in_use) free from information_schema.TokuDB_fractal_tree_info where table_schema='test' and table_name='comp';
+---------+---------+--------+
| total   | used    | free   |
+---------+---------+--------+
| 1888792 | 1502744 | 386048 |
+---------+---------+--------+
1 row in set (0.01 sec)

To sum it up: there is room for improvement in how MySQL reports compressed and uncompressed data length, both in terms of consistency and in terms of the information available. It is great to see that TokuDB found a way to report both compressed and uncompressed data sizes, yet I would really love to see the actual size on disk that a given table is taking. And it would be great if there were some consistent way to query it from inside MySQL, without having to go down to the file-system level and deal with the different ways different storage engines place data on the file system. This becomes especially important with the coming of full tablespace support in MySQL 5.7, which will make it hard to find all files matching a given table on the filesystem.

The post MySQL compression: Compressed and Uncompressed data size appeared first on MySQL Performance Blog.

Categories: MySQL

Priority queue optimization for filesort is now visible in MariaDB 10.0

Sergey Petrunia's blog - Mon, 2014-07-14 17:11

TL;DR: Priority queue optimization for filesort with small LIMIT is now visible in MariaDB: there is a status variable and you can also see it in the slow query log (KB page link).

A longer variant:
One of the new optimizations in MySQL 5.6 is the ability to use a priority queue instead of sorting for ORDER BY … LIMIT queries. The optimization was ported into MariaDB 10.0 long ago, but we still get questions about if/when it will be ported. I guess the reason for this is that, besides the query speed, you can’t see this optimization: neither EXPLAIN, nor EXPLAIN FORMAT=JSON, nor PERFORMANCE_SCHEMA, nor status variables give any indication of whether filesort used a priority queue or the regular quicksort+merge algorithm.

MySQL 5.6 has only one way to check whether filesort used a priority queue: you need to enable the optimizer trace (SET optimizer_trace='enabled=on'), and then run the query (not EXPLAIN, but the query itself). Then you can look into the optimizer trace and find something like this:

... "filesort_priority_queue_optimization": { "limit": 10, "rows_estimate": 198717, "row_size": 215, "memory_available": 262144, "chosen": true }, ...

MariaDB doesn’t support optimizer_trace at the moment. Even if it did, I think it would be wrong to require one to dig into the optimizer trace to find out which query plan was picked.

The natural place to show the optimization would be EXPLAIN output. We could show something like “Using filesort (priority queue)”. This was my initial intent; after looking into the source code, it turned out to be difficult to do. The logic that makes the choice between using quicksort+merge and using a priority queue is buried deep inside the query execution code. (As if the mess caused by late optimizations of ORDER BY and UNIONs didn’t teach anybody on the MySQL team anything.)

As for query execution, there are two facilities where one could record execution-time details about the query plan: status variables and the slow query log.

Status variables

We’ve added a Sort_priority_queue_sorts status variable. Now the list of sort-related status variables is:

MariaDB [test]> show status like 'Sort%';
+---------------------------+-------+
| Variable_name             | Value |
+---------------------------+-------+
| Sort_merge_passes         | 0     |
| Sort_priority_queue_sorts | 1     |
| Sort_range                | 0     |
| Sort_rows                 | 11    |
| Sort_scan                 | 1     |
+---------------------------+-------+

(Sort_range + Sort_scan) gives the total number of sorts. Sort_priority_queue_sorts gives the number of sorts that were done using the priority queue.
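
To check a specific query, you can zero the session counters first and inspect them right after the query runs; a sketch, assuming some table t1 (FLUSH STATUS resets the session-level counters):

FLUSH STATUS;
SELECT * FROM t1 ORDER BY col2 LIMIT 10;
SHOW STATUS LIKE 'Sort_priority_queue_sorts';
-- a value of 1 means this filesort was done with the priority queue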

Slow query log

Percona’s extended statistics in the slow query log show the Filesort/Filesort_on_disk fields. We thought that adding information about priority queue use there would be appropriate. Now slow query log entries look like this:

# Time: 140714 18:30:39
# User@Host: root[root] @ localhost []
# Thread_id: 3  Schema: test  QC_hit: No
# Query_time: 0.053857  Lock_time: 0.000188  Rows_sent: 11  Rows_examined: 100011
# Full_scan: Yes  Full_join: No  Tmp_table: No  Tmp_table_on_disk: No
# Filesort: Yes  Filesort_on_disk: No  Merge_passes: 0  Priority_queue: Yes
SET timestamp=1405348239;
select * from t1 where col1 between 10 and 20 order by col2 limit 100;

pt-query-digest is able to parse slow query logs with the new field.

What about PERFORMANCE_SCHEMA

What about PERFORMANCE_SCHEMA? After all, it is the most powerful tool for tracking query execution. It has “absorbed” some status variables into the events_statements_history table. For sorting, it has these columns:

| SORT_MERGE_PASSES | bigint(20) unsigned | NO |  | NULL |  |
| SORT_RANGE        | bigint(20) unsigned | NO |  | NULL |  |
| SORT_ROWS         | bigint(20) unsigned | NO |  | NULL |  |
| SORT_SCAN         | bigint(20) unsigned | NO |  | NULL |  |

Should we add a SORT_PRIORITY_QUEUE_SORTS column there? We didn’t add it to 10.0 right now because of compatibility concerns. Some tools may rely on the structure of PERFORMANCE_SCHEMA tables. Also, PERFORMANCE_SCHEMA table definitions are stored on disk, and one would have to run mysql_fix_privilege_tables after a minor upgrade, which is not good.

Categories: MySQL

New feature in MariaDB 10.1: ANALYZE statement

Sergey Petrunia's blog - Mon, 2014-06-30 16:09

Last week, yours truly pushed a new feature into the MariaDB 10.1 tree: the ANALYZE statement.

The idea of this feature is to make it easy to compare the query plan with the actual query execution. The ANALYZE statement will run the statement and produce EXPLAIN-like output, where the optimizer’s estimates are followed by the numbers that were observed when running the query. The output looks like this:
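
(The original post shows the output as an image; the sketch below, with a made-up table and numbers, reconstructs its shape.)

MariaDB [test]> ANALYZE SELECT * FROM t1 WHERE col1 < 100;
+------+-------------+-------+------+---------------+------+---------+------+------+---------+----------+------------+-------------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | r_rows  | filtered | r_filtered | Extra       |
+------+-------------+-------+------+---------------+------+---------+------+------+---------+----------+------------+-------------+
|    1 | SIMPLE      | t1    | ALL  | NULL          | NULL | NULL    | NULL | 1000 | 1000.00 |   100.00 |       9.90 | Using where |
+------+-------------+-------+------+---------------+------+---------+------+------+---------+----------+------------+-------------+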

Here,

  • Next to rows there is an r_rows column, which shows how many records were actually read from the table.
  • Next to filtered there is an r_filtered column, which shows what fraction of the records was left after the part of the WHERE condition attached to the table was checked.

I think this should explain the feature. If you want more details, please refer to the KB article ANALYZE statement, which also discusses the meaning of the above EXPLAIN output.

Technical details and further plans

ANALYZE currently uses its own counters. Counting is done for all queries, including non-ANALYZE queries. This should be okay (i.e., have no visible overhead) as long as counting just increments integer variables in the query plan, without doing any atomic operations or making syscalls.

The upside of this approach is that it is now trivial to make the Explain in the slow query log feature also print ANALYZE output. When a query runs slowly, you will be able to see where exactly the optimizer was wrong.

The downside is that getting more data will not be as easy. So far, the most requested numbers beyond r_rows and r_filtered have been r_time (the amount of time spent reading the table) and r_io (the amount of IO done on the table). Counting the amount of time spent reading each row will impose CPU overhead; this is a known problem. Counting IO is just incrementing a counter, but it will require interaction between the ANALYZE code and storage engine code, which will add complexity.

There is the PERFORMANCE_SCHEMA feature, where others have already spent a lot of effort to count wait time and IO. It’s tempting to reuse it. The problem is, P_S collects the wrong data: its counters are global, while ANALYZE needs to count IO for each table reference separately. Consider a self-join. From the P_S point of view, it is reading from the same table. From the ANALYZE point of view, it is reading from two different table references. I’m currently not sure whether ANALYZE should, or could, rely on PERFORMANCE_SCHEMA.

A totally different angle is that the tabular EXPLAIN output doesn’t allow showing much data (for example, how many rows were there before/after GROUP BY?). Here the solution is clear, I think: support EXPLAIN FORMAT=JSON, and then add ANALYZE FORMAT=JSON, where we can provide lots of detail.

Categories: MySQL

Optimizations for derived tables in MySQL 5.6 and MariaDB 5.5+

Sergey Petrunia's blog - Mon, 2014-06-30 12:29

I had been involved with subquery optimizations fairly closely, but last week I was surprised to find out that MySQL 5.6 does not support derived table merging. This feature was among the subquery features in the abandoned MySQL 6.0. In MariaDB, it was finished and released as part of MariaDB 5.3/5.5. As for MySQL, neither MySQL 5.6 nor MySQL 5.7 has this feature.

So what is this “derived merge”? It’s simple to understand. When one writes complex queries, it is common to use FROM-clause subqueries as a way to structure the query:

select sum(o_totalprice)
from (select * from orders where o_orderpriority='1-URGENT') as high_prio_orders
where o_orderdate between '1995-01-01' and '1995-01-07'

The MySQL optimizer processes this syntax very poorly. The basic problem is that FROM-clause subqueries are always materialized exactly as specified. Conditions from outside the subquery are applied only after the materialization.

In our example, table orders has an index on o_orderdate, and there is a highly selective condition o_orderdate BETWEEN ... that one could use to read through the index. But the condition is located outside the subquery, so it will not be used when reading the table. Instead, we get the following plan:

+----+-------------+------------+------+---------------+------+---------+------+---------+-------------+
| id | select_type | table      | type | possible_keys | key  | key_len | ref  | rows    | Extra       |
+----+-------------+------------+------+---------------+------+---------+------+---------+-------------+
|  1 | PRIMARY     | <derived2> | ALL  | NULL          | NULL | NULL    | NULL | 1505799 | Using where |
|  2 | DERIVED     | orders     | ALL  | NULL          | NULL | NULL    | NULL | 1505799 | Using where |
+----+-------------+------------+------+---------------+------+---------+------+---------+-------------+

The meaning of it is:

  1. Do a full table scan on table `orders`. We expect to read 1.5M rows. Write the rows that match o_orderpriority='1-URGENT' into a temporary table.
  2. Read the temporary table back. Filter rows that match o_orderdate between ... and compute the query result

MySQL 5.6 has added some improvements to this (link to the manual). They are:

  • The temporary table is materialized as late as possible. This has no effect on the speed of our example query, but it may have an effect for more complex queries.
    • EXPLAIN also will not materialize the temporary table
  • The optimizer has an option to create and use an index on the temporary table.

However, the basic problem of materializing the FROM subquery before applying any other optimizations still remains.

In MariaDB, the EXPLAIN output is different:

+------+-------------+--------+-------+---------------+---------------+---------+------+------+------------------------------------+
| id   | select_type | table  | type  | possible_keys | key           | key_len | ref  | rows | Extra                              |
+------+-------------+--------+-------+---------------+---------------+---------+------+------+------------------------------------+
|    1 | SIMPLE      | orders | range | i_o_orderdate | i_o_orderdate | 4       | NULL | 4358 | Using index condition; Using where |
+------+-------------+--------+-------+---------------+---------------+---------+------+------+------------------------------------+

Note that we see only one line, and the table orders is accessed through the index on o_orderdate. Running EXPLAIN EXTENDED shows why:

Message: select sum(`dbt3sf1`.`orders`.`o_totalprice`) AS `sum(o_totalprice)` from `dbt3sf1`.`orders` where ((`dbt3sf1`.`orders`.`o_orderpriority` = '1-URGENT') and (`dbt3sf1`.`orders`.`o_orderDATE` between '1995-01-01' and '1995-01-07'))

There is no FROM-clause subquery anymore: it has been merged into the upper select. This allowed the optimizer to avoid materialization, and also to use the condition and the index on o_orderdate to construct a range access.

Query execution time for this particular example went down from 15 sec to 0.25 sec, but generally, the bigger your table, the bigger the difference.
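
If you are on MySQL and can’t rely on derived merging, the same transformation can be applied by hand: pull the subquery’s condition into the outer query, which is exactly the form MariaDB produced above.

select sum(o_totalprice)
from orders
where o_orderpriority = '1-URGENT'
  and o_orderdate between '1995-01-01' and '1995-01-07';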

Categories: MySQL