MySQL

Webinar Wednesday July 12, 2017: MongoDB Index Types – How, When and Where Should They Be Used?

MySQL Performance Blog - Tue, 2017-07-11 15:32

Join Percona’s Senior Technical Services Engineer, Adamo Tonete as he presents MongoDB Index Types: How, When and Where Should They Be Used? on Wednesday, July 12, 2017 at 11:00 am PDT / 2:00 pm EDT (UTC-7).

Register Now

MongoDB has 12 index types. Do you know how each works, or when you should use each of them? This talk will arm you with this knowledge, and help you understand how indexes impact performance, storage and even the sharding of your data. We will also discuss some solid index operational practices, as well as some settings for things like TTL you might not know exist. The contents of this webinar will make you a Rock Star!

Register for the webinar here.

Adamo Tonete, Senior Technical Services Engineer

Adamo joined Percona in 2015, after working as a MongoDB/MySQL database administrator for three years. As the main database admin of a startup, he was responsible for suggesting the best architecture and data flows for a worldwide company in a 24/7 environment. Before that, he worked as a Microsoft SQL Server DBA for a large e-commerce company, working mainly on performance tuning and automation. Adamo has almost eight years of experience working as a DBA, and in the past three years he has moved to NoSQL technologies (without giving up relational databases).

Categories: MySQL

Hitler Reacts to Removal of MySQL's Query Cache

Xaprb, home of innotop - Mon, 2017-07-10 21:40

The removal of the query cache in MySQL 8.0 improves user experience and has been celebrated by many members of the MySQL community. With this good news, obviously, Hitler isn’t happy. (Parody Video).

Categories: MySQL

Percona Server for MongoDB 3.4.5-1.5 is Now Available

MySQL Performance Blog - Mon, 2017-07-10 18:37

Percona announces the release of Percona Server for MongoDB 3.4.5-1.5 on July 10, 2017. Download the latest version from the Percona web site or the Percona Software Repositories.

Percona Server for MongoDB is an enhanced, open source, fully compatible, highly-scalable, zero-maintenance downtime database supporting the MongoDB v3.4 protocol and drivers. It extends MongoDB with the Percona Memory Engine and MongoRocks storage engine, as well as several enterprise-grade features.

Percona Server for MongoDB requires no changes to MongoDB applications or code.

This release is based on MongoDB 3.4.5 and includes the following additional changes:

Bugs Fixed

  • #PSMDB-67: Fixed mongod service status messages.
Categories: MySQL

Webinar Tuesday July 11, 2017: Securing Your MySQL/MariaDB Data

MySQL Performance Blog - Mon, 2017-07-10 16:51

Join Percona’s Chief Evangelist, Colin Charles as he presents Securing Your MySQL/MariaDB Data on Tuesday, July 11, 2017 at 7:00 am PDT / 10:00 am EDT (UTC-7).

Register Now

This webinar will discuss the features of MySQL/MariaDB that, when enabled and used properly, improve on a default MySQL installation. Many cloud-based applications fail to:

  • Use appropriate filesystem permissions
  • Employ TLS/SSL for connections
  • Require TLS/SSL with MySQL replication
  • Use external authentication plugins (LDAP, PAM, Kerberos)
  • Encrypt all your data at rest
  • Monitor your database with the audit plugin
  • Review and reject SQL injection attempts
  • Design application access using traditional firewall technology
  • Employ other MySQL/MariaDB security features

This webinar will demonstrate and advise on how to correctly implement the features above. We will end the presentation with some simple steps on how to hack a MySQL installation.

You can register for the webinar here.

Colin Charles, Percona Chief Evangelist

Colin Charles is the Chief Evangelist at Percona. He was previously on the founding team of MariaDB Server in 2009, worked at MySQL from 2005, and has been a MySQL user since 2000. Before joining MySQL, he worked actively on the Fedora and OpenOffice.org projects. He’s well known within open source communities in APAC, and has spoken at many conferences.

Categories: MySQL

MongoDB Indexing Types: How, When and Where Should They Be Used?

MySQL Performance Blog - Sat, 2017-07-08 22:18

In this blog post, we will talk about MongoDB indexing, and the different types of indexes that are available in MongoDB.

Note: We are hosting a webinar on July 12, 2017, where I will talk about MongoDB indexes and how to choose a good indexing plan.

MongoDB is a document-oriented NoSQL database. NoSQL databases share many features with relational databases, and one of them is indexes. The question is: how are documents indexed in such a database?

Remember that because MongoDB stores JSON documents, there is no predefined schema: the same field can be a string in one document and an integer in another.

MongoDB, like other databases, uses B-trees for indexing. With some exceptions, the algorithm is the same as in a relational database.

The B-tree can use integers and strings together to organize data. The most important thing to know is that an index-less database must read all the documents to filter what you want, while an indexed database can – through indexes – find the documents quickly. Imagine you are looking for a book in a disorganized library. This is how the query optimizer feels when we are looking for something that is not indexed.
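You can see this difference for yourself with explain() (a quick mongo shell illustration against a hypothetical books collection; a COLLSCAN stage means every document was read, an IXSCAN stage means an index was used):

db.books.find({ title: "Moby Dick" }).explain("executionStats")   // winningPlan stage: COLLSCAN
db.books.createIndex({ title: 1 })
db.books.find({ title: "Moby Dick" }).explain("executionStats")   // winningPlan stage: IXSCAN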

There are several different types of indexes available: single field, compound indexes, hashed indexes, geoIndexes, unique, sparse, partial, text… and so on. All of them help the database in some way, although they obviously also get in the way of write performance if used too much.

  • Single fields. Single fields are simple indexes that index only one field in a collection. MongoDB can usually only use one index per query, but in some cases the database can take advantage of more than one index to reply to a query (this is called index intersection). Also, $or actually executes more than one query at a time. Therefore $or and $in can also use more than one index.
  • Compound indexes. Compound indexes are indexes that have more than one indexed field, so ideally the most restrictive field should be to the left of the B-tree. If you want to index by sex and birth, for instance, the index should begin by birth as it is much more restrictive than sex.
  • Hashed indexes. Shards use hashed indexes, and create a hash according to the field value to spread the writes across the sharded instances.
  • GeoIndexes. GeoIndexes are a special index type that allows a search based on location, distance from a point and many other different features.
  • Unique indexes. Unique indexes work as in relational databases. They guarantee that the value doesn’t repeat and raise an error when we try to insert a duplicated value. Unique doesn’t work across shards.
  • Text indexes. For searching words within string content, text indexes work better than regular indexes on a single string field. There are different flags we can use, like assigning weights to control the results or using different collations.
  • Sparse/Partial indexes. Sparse and partial indexes seem very similar. However, a sparse index only indexes documents where the field exists, without checking its value, while a partial index applies a filter (like greater than) to decide what to index. This means a partial index doesn’t index all the documents that have the field, but only the documents that match the create-index filter. The mongo shell commands after this list show how each type is created.
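For reference, here is how each of these index types is created in the mongo shell (illustrative commands only – the collection and field names are hypothetical):

db.users.createIndex({ name: 1 })                                         // single field
db.users.createIndex({ birth: 1, sex: 1 })                                // compound: most restrictive field first
db.users.createIndex({ user_id: "hashed" })                               // hashed, as used for hashed sharding
db.places.createIndex({ location: "2dsphere" })                           // geoIndex
db.users.createIndex({ email: 1 }, { unique: true })                      // unique
db.articles.createIndex({ body: "text" }, { weights: { body: 10 } })      // text, with a weight
db.users.createIndex({ phone: 1 }, { sparse: true })                      // sparse: only documents having the field
db.users.createIndex({ age: 1 }, { partialFilterExpression: { age: { $gt: 21 } } })  // partial
db.sessions.createIndex({ createdAt: 1 }, { expireAfterSeconds: 3600 })   // TTL, the setting mentioned earlier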

We will discuss and talk more about indexes and how they work in my webinar MongoDB® Index Types: How, When and Where Should They Be Used? If you have questions, feel free to ask them in the comments below and I will try to answer all of them in the webinar (or in a complementary post).

I hope this blog post was useful. Please feel free to reach out to me on Twitter @AdamoTonete or @percona.

Categories: MySQL

Last Resort: How to Use a Backup to Start a Secondary Instance for MongoDB?

MySQL Performance Blog - Fri, 2017-07-07 18:56

In this blog post, I’ll look at how you can use a backup to start a secondary instance for MongoDB.

Although the documentation says it is not possible to use a backup to start a secondary, sometimes this is the only possible way to start a new instance. In this blog post, we will explain how to bypass this limitation and use a backup to start a secondary instance.

An initial sync, rsync or snapshot works fine when the instances are in the same data center, but across data centers it can fail or be really slow – much slower than moving a compressed backup between data centers.

Not every backup can be used as a source for starting a secondary. The backup must contain the oplog file. This means the backup must be taken on a previously existing replica set using the --oplog flag, which produces a point-in-time backup while dumping the collections. The time spent moving and restoring the backup file must be less than the oplog window.
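To judge whether your oplog window is large enough, compare your expected transfer-and-restore time against the “log length start to end” reported by the standard replica set helper (the values below are illustrative):

mongo
PRIMARY> rs.printReplicationInfo()
configured oplog size:   20480MB
log length start to end: 172800secs (48hrs)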

Please follow the next steps to create a secondary from a backup:

  1. Create a backup using the --oplog flag.
    mongodump --oplog -o /tmp/backup/
  2. Backup the replica set collection from the local database.

    mongodump -d local -c system.replset
  3. After the backup finishes, please confirm the oplog.rs file is in the backup folder.
  4. Use bsondump to convert the oplog to JSON. We will use the last oplog entry as a starting point for the replica set.
    bsondump oplog.rs > oplog.rs.txt
  5. Initiate the new instance without the replica set parameter configured. At this point, the instance will act as a single instance.
  6. Restore the database normally, using --oplogReplay to apply all the oplog entries that were recorded while the backup was running.
    mongorestore /tmp/backup --oplogReplay
  7. Connect to this server and use the local database to create the oplog.rs collection. Please use the same oplog size as the other members (e.g., 20 GB).
    mongo
    use local
    db.runCommand( { create: "oplog.rs", capped: true, size: (20 * 1024 * 1024 * 1024) } )
  8. From the oplog.rs.txt generated in step 4, get the last line and copy the ts and h field values into a MongoDB document.

    tail -n 1 oplog.rs.txt
  9. Insert that JSON document into the oplog.rs collection that was created before.
    mongo
    use local
    db.oplog.rs.insert({ ts: Timestamp(12354454, 1), h: NumberLong("-3434387384732843") })
  10. Restore the system.replset collection to the local database.
    mongorestore -d local -c system.replset ./backup/replset.bson
  11. Stop the service and edit the replica set name parameter (replSetName) to match the existing replica set.
  12. Connect to the primary and add this new host. The new host should start catching up on the oplog and get in sync after a few minutes/hours, depending on the number of operations the replica set handles. It is important to add this new secondary as a hidden member, without votes if possible, to avoid triggering an election. Once the secondary is added to the replica set, drivers will start using this host to perform reads; if you don’t add the server with hidden: true, the application will read inconsistent (old) data.

    mongo
    PRIMARY> rs.add({ _id: <your id>, host: "<newhost:27017>", hidden: true, votes: 0, priority: 0 })
  13. Please check the replication lag, and once the seconds-behind-master value is near zero, change the host’s parameters in the replica set to hidden: false and set the desired priority and votes.
    mongo
    PRIMARY> rs.printSlaveReplicationInfo()
  14. We are considering a replica set with three members, where the new secondary has ID 2 in the members array. Use the following commands to unhide the secondary and make it available for reads. The priority and votes depend on your environment. Please note you might need to change the member ID.
    mongo
    PRIMARY> cfg = rs.config()
    PRIMARY> cfg.members[2].hidden = false
    PRIMARY> cfg.members[2].votes = 1
    PRIMARY> cfg.members[2].priority = 1
    PRIMARY> rs.reconfig(cfg)

I hope this tutorial helps in an emergency situation. Please consider using initial sync, disk snapshots and hot backups before using this method.

Feel free to reach out to me on Twitter @AdamoTonete or @percona.

Categories: MySQL

ClickHouse: One Year!

MySQL Performance Blog - Thu, 2017-07-06 18:17

In this blog, we’ll look at ClickHouse on its one year anniversary.

It’s been a year already since the Yandex team released ClickHouse as open source software. I’ve had an interest in this project from the very start, as I didn’t think there was an open source analytical database that could compete with industry leaders like Vertica (for example).

This was an exciting year for ClickHouse early adopters. Let’s look at what it has accomplished so far.

ClickHouse initially generated interest due to the Yandex name – the most popular search engine in Russia. It wasn’t long before jaw-dropping responses popped up: guys, this thing is crazy fast! Many early adopters who tried ClickHouse were really impressed.

Fast doesn’t mean convenient, though. That was the main community concern about ClickHouse over the past months. Developed as an internal project for an internal customer (Yandex.Metrica), ClickHouse had a lot of limitations for general community use. It took several months before Yandex could restructure the team and mindset, establish proper communication with the community, and start addressing external needs. There are still a lot of things that need to be done. The public roadmap is not easily available, for example, but the wheels are pointed in the right direction. The ClickHouse team has added a lot of the features people were screaming for, and more are in ClickHouse’s future plans.

The Yandex team is actively attending and speaking at international conferences, and they speak even more often in Russia (no big surprise).

At Percona, we were very excited by Yandex’s ClickHouse performance claims, and could not resist running our own benchmarks.

ClickHouse did very well in these benchmarks. There are many other benchmarks by other groups as well, including a benchmark against Amazon RedShift by Altinity.

The first ClickHouse production deployments outside of Yandex started in October-November 2016. Today, Yandex reports that dozens of companies around the world are using ClickHouse in production, with the biggest installations operating with up to several petabytes of data. Hundreds of other enterprises are deploying pilot installations or actively evaluating the software.

There are also interesting reports from Cloudflare (How Cloudflare analyzes 1M DNS queries per second) and from Carto (Geospatial processing with ClickHouse).

There are also various community projects around ClickHouse worth mentioning.

Percona is also working to adapt ClickHouse to our projects. We are using ClickHouse to handle Query Analytics, and as long-term storage for metrics data inside a new version (under development) of Percona Monitoring and Management.

I will also be speaking about ClickHouse at BIG DATA DAY LA 2017 on August 5th, 2017. You are welcome to attend if you are in Los Angeles that day!

ClickHouse has the potential to become one of the leading open source analytical DBMSs – much like MySQL and PostgreSQL are leaders for OLTP workloads. We will see in the next couple of years whether that happens or not. Congratulations to the Yandex team on their one-year milestone!

Categories: MySQL

Differences in PREPARE Statement Error Handling with Binary and Text Protocol (Percona XtraDB Cluster / Galera)

MySQL Performance Blog - Wed, 2017-07-05 18:22

In this blog, we’ll look at the differences in how a PREPARE statement handles errors in binary and text protocols.

Introduction

Since Percona XtraDB Cluster is a multi-master solution, when an application executes conflicting workloads one of the workloads gets rolled back with a DEADLOCK error. While the same holds true even if you fire the workload through a PREPARE statement, there are differences between using the MySQL connector API (with binary protocol) and the MySQL client (with text protocol). Let’s look at these differences with the help of an example.

Base Workload
  • Say we have a two-node cluster (n1 and n2) with the following base schema and tables:
    use test;
    create table t (i int, k int, primary key pk(i)) engine=innodb;
    insert into t values (1, 10), (2, 20), (3, 30);
    select * from t;
  • Workload (w1) on node-1 (n1):
    prepare st1 from 'update t set k = k + 100 where i > 1';
    execute st1;
  • Workload (w2) on node-2 (n2):
    prepare st1 from 'update t set k = k + 100 where i > 1';
    execute st1;
  • The workloads are conflicting, and the first node to commit wins. Let’s assume n1 commits first, and the write-set is replicated to n2 while n2 is busy executing the local workload.
Different Scenarios Based on Configuration

wsrep_retry_autocommit > 0
  • n2 tries to apply the replicated workload, thereby forcefully aborting (brute-force abort) the local workload (w2).
  • The forcefully aborted workload is retried (based on wsrep_retry_autocommit value (default is 1)).
  • w2 succeeds on retry, and the new state of table t would be:
    (1, 10), (2, 220), (3, 230);
  • The user can increase wsrep_retry_autocommit to a higher value, though we don’t recommend this, as a higher value increases pressure on the system to retry failed workloads. A retry doesn’t ensure things always pass if additional conflicting replicated workloads are lined up in the certification queue. Also, it is important to note that a retry works only if AUTO_COMMIT=ON. For a begin-commit transaction, retry is not applicable: if we abort this transaction, the complete transaction is rolled back. The workload executor application should have the logic to expect workload failure due to conflict and handle it accordingly.
wsrep_retry_autocommit = 0
  • Let’s say the user sets wsrep_retry_autocommit=0 on node-2 and executes the same workload.
  • This time it doesn’t retry the local workload (w2). Instead, it sends an error to the end-user:
    mysql> execute st1;
    ERROR 1213 (40001): WSREP detected deadlock/conflict and aborted the transaction. Try restarting the transaction
    mysql> select * from t;
    +---+------+
    | i | k    |
    +---+------+
    | 1 |   10 |
    | 2 |  120 |
    | 3 |  130 |
    +---+------+
    3 rows in set (0.00 sec)
  • Only the replicated workload (w1) is applied, and local workload w2 fails with a DEADLOCK error.
Use of MySQL Connector API

Now let’s try to execute these workloads through Connector API. Connector API uses a binary protocol, which means it directly invokes the PREPARE statement and executes the statement using dedicated command codes (COM_STMT_PREPARE and COM_STMT_EXECUTE).

While the internal abort of the conflicting transaction remains the same, the COM_QUERY command code (text protocol) handling catches the “query interrupted” error and resets it to a “deadlock” error based on the wsrep_conflict_state value. It also has logic to retry auto-commit enabled statements.

This error handling and retry logic is not present when the COM_STMT_PREPARE and COM_STMT_EXECUTE command codes are used. This means the user sees a different error (non-masked error) when we try the same workload through an application using the MySQL Connector API:

Output of application from node-2:

Statement init OK!
Statement prepare OK!
Statement execution failed: Query execution was interrupted
exact error-code to be specific: ER_QUERY_INTERRUPTED - 1317 - "Query execution was interrupted"

As you can see above, the statement/workload execution fails with “Query execution was interrupted” error vs. “deadlock” error (as seen with the normal MySQL client).

Conclusion

While the core execution remains the same, error handling is different with the text and binary protocols. This issue/limitation was always present, but it’s more prominent now that sysbench 1.0 uses the PREPARE statement by default.

What does this mean for the end user/application?

  • Nothing has changed from the Percona XtraDB Cluster perspective, except that if your application is planning to use a binary protocol (like the Connector API), then the application should be able to handle both errors and retry the failed transaction accordingly, as sketched below.
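As an illustration, here is a minimal retry loop against the MySQL C API (a sketch, not reference code: execute_with_retry is a hypothetical helper, connection setup is omitted, and both error 1213 ER_LOCK_DEADLOCK and error 1317 ER_QUERY_INTERRUPTED are treated as retryable conflicts, per the behavior described above):

#include <mysql.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: prepare and execute `query` on an already-open
 * connection, retrying when the cluster aborts it due to a conflict.
 * 1213 = ER_LOCK_DEADLOCK (seen via text protocol), 1317 =
 * ER_QUERY_INTERRUPTED (seen via COM_STMT_PREPARE/COM_STMT_EXECUTE). */
static int execute_with_retry(MYSQL *conn, const char *query, int max_retries)
{
    for (int attempt = 1; attempt <= max_retries; attempt++) {
        MYSQL_STMT *stmt = mysql_stmt_init(conn);
        if (stmt == NULL)
            return -1;                      /* out of memory */

        if (mysql_stmt_prepare(stmt, query, strlen(query)) == 0 &&
            mysql_stmt_execute(stmt) == 0) {
            mysql_stmt_close(stmt);
            return 0;                       /* success */
        }

        unsigned int err = mysql_stmt_errno(stmt);
        mysql_stmt_close(stmt);

        if (err != 1213 && err != 1317)
            return -1;                      /* not a conflict: do not retry */

        fprintf(stderr, "conflict (errno %u), retrying (%d/%d)\n",
                err, attempt, max_retries);
    }
    return -1;                              /* retries exhausted */
}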
Categories: MySQL

Webinar Thursday July 6, 2017: Security and Encryption in the MySQL World

MySQL Performance Blog - Wed, 2017-07-05 16:25

Join Percona’s Solutions Engineer, Dimitri Vanoverbeke as he presents Security and Encryption in the MySQL World on Thursday, July 6, 2017, at 7:00 am PDT / 10:00 am EDT (UTC-7).

Register Now


MySQL and MariaDB Server provide many new features that help with security and encryption, both of which are extremely important in today’s world. Learn how to use these features, from roles to at-rest-encryption, to increase security. At the end of the webinar, you should understand how to have a securely configured MySQL instance!

Register for the webinar here.

Dimitri Vanoverbeke, Solutions Engineer

At the age of 7, Dimitri received his first computer. Since then, he’s addicted to anything with a digital pulse. Dimitri has been active in IT professionally since 2003, when he took various roles from internal system engineering to consulting. Before joining Percona, Dimitri worked as an open source consultant for a leading open source software consulting firm in Belgium. During his career, Dimitri became familiar with a broad range of open source solutions and with the devops philosophy. When not glued to his computer screen, he enjoys traveling, cultural activities, basketball and the great outdoors.

Categories: MySQL

Webinar Wednesday July 5, 2017: Indexes – What You Need to Know to Get the Most Out of Them

MySQL Performance Blog - Mon, 2017-07-03 17:00

Join Percona’s Senior Architect, Matthew Boehm, as he presents Indexes – What You Need to Know to Get the Most Out of Them on Wednesday, July 5, 2017 at 8:00 am PDT / 11:00 am EDT (UTC-7).

Register Now

Proper indexing is key to database performance. Find out how MySQL uses indexes for query execution, and how to come up with an optimal index strategy. In this session, you’ll also learn how to tell when you need an index, and how to get rid of indexes that you don’t need, in order to speed up queries.

Register for the webinar here.

Matthew Boehm, Architect

Matthew joined Percona in the fall of 2012 as a MySQL consultant. He quickly rose to Architect and became one of Percona’s training instructors. His areas of knowledge include the traditional Linux/Apache/MySQL/PHP stack, memcached, MySQL Cluster (NDB), massive sharding topologies, PHP development and a bit of MySQL-C-API development.

Categories: MySQL

Industry Recognition of Percona: Great Products, Great Team

MySQL Performance Blog - Fri, 2017-06-30 22:07

Percona’s focus on customer success pushes us to deliver the very best products, services and support. Three recent industry awards reflect our continued success in achieving these goals.

We recently garnered a Bronze Stevie Award “Company of the Year – Computer Software – Medium.” Stevies are some of the most coveted awards in the tech industry. Each year, more than 10,000 organizations in more than 60 nations compete for Stevies. It’s a huge honor to come away with a win.

Percona was also included in Database Trends and Applications’ (DBTA) annual DBTA 100 list of the companies that matter most in data. The directory was compiled by the editorial staff of DBTA and spotlights the companies that deal with evolving market demands through innovation in software, services, and hardware.

Zippia, a new website dedicated to helping recent grads with their career choices, ranked Percona in the top 10 of the “20 Best Companies To Work For In Raleigh, NC.” To compile the ranking, Zippia chose companies in a 25-mile radius that consistently get high reviews from career resource websites.  

Finally, Percona co-founder and CEO Peter Zaitsev was honored as one of the Top Rated CEOs in Raleigh, North Carolina, by Owler. Owler looked at 167,000 leaders across 50 cities and 25 industries worldwide and identified the top 1,000. This is a huge honor for Peter!

Driven by Peter’s vision, all of us at Percona remain committed to building an excellent organization that is the champion of unbiased open source database solutions. Our continued success, growth and recognition from organizations like the above lead us to believe that the best is yet to come. Stay tuned!

Categories: MySQL

How Does Percona Software Compare: the Value of Percona Software and Services

MySQL Performance Blog - Thu, 2017-06-29 22:22

In this blog post, I’ll discuss my experience as a Solutions Engineer in explaining the value of Percona software and services to customers.

The inspiration for this blog post was a recent conversation with a prospective client. They were exploring open source software solutions and professional services for the first time. This is a fairly common and recurring conversation with individuals and enterprises exploring open source for the first time. They generally find Percona through recommendations from colleagues, or organic Google searches. One of the most common questions Percona gets when engaging at this level is, “How does Percona’s software compare with << Insert Closed-source Enterprise DB Here >>?”

We’re glad you asked. Percona’s position on this question is to explain the value of Percona in the open source space. Percona does not push a particular flavor of open source software. We recommend an appropriate solution given each of our client’s unique business and application requirements. A client coming from this space is likely to have a lengthy list of compliance, performance and scalability requirements. We love these conversations. Our focus is always on providing our customers with the best services, support, and open-source technology solutions possible.

If there were a silver bullet for every database workload, Percona would not have been as successful over the past 10 (almost 11!) years. We know that no single option – MongoDB, MariaDB, MySQL Enterprise, MySQL Community, Percona Server for MySQL, etc. – is the right answer 100% of the time. Our competitors in this space will most likely disagree with this assessment, because they have an incentive to sell their own software. The quality of all of these offerings is extremely high, and our customers use each of them to great success. Using the wrong database software for a particular use case is like using a butter knife to cut a steak. You can probably make it work, but it will be a frustrating and unrewarding experience.

Usually, there are more efficient alternatives.

There is a better question to ask instead of software vs. software, which basically turns into an apples vs. oranges conversation. It’s: “Why should we partner with Percona as our business adopts more open-source software?” This is where Percona’s experience and expertise in the open-source database space shine.

Percona is invested in ensuring you are making the right decisions based on your needs. Choosing the wrong database solution or architecture can be catastrophic. A recent Ponemon study (https://planetaklimata.com.ua/instr/Liebert_Hiross/Cost_of_Data_Center_Outages_2016_Eng.pdf) estimates the average cost of downtime can be up to $8,800 per minute. Not only is downtime costly, but re-architecting your environment due to an errant choice in database solutions can be costly as well. Uber experienced these pains when using Postgres as their database solution and wrote an interesting blog post about their decision to switch back to MySQL (https://eng.uber.com/mysql-migration/). Postgres has, in turn, responded to Uber recently (http://thebuild.com/presentations/uber-perconalive-2017.pdf). The point of bringing up this high-profile switch is not to take sides, but to highlight the breadth of knowledge necessary to navigate the open source database environment. Given its complexity, the necessity of a trusted business partner with unbiased recommendations should be apparent.

This is where Percona’s experience and expertise in the open source database space shines. Percona’s only investors are its customers, and our only customers are those using our services. Because of this, our core values are intertwined with customer success and satisfaction. This strategy trickles down through support, to professional services and all the way to the engineering team, where efforts are focused on developing software and features that provide the most value to customers as a whole.

Focusing on a business partner in the open source space is a much more important and worthwhile effort than focusing on a software provider. Percona has hands-on, practical experience with a wide array of open-source technologies in the MySQL, MariaDB and MongoDB ecosystems. We’ve tracked our ability to improve the efficiency of our customers’ databases, and we see a 30% efficiency improvement on average, with some cases seeing a 10x boost in database performance (https://www.percona.com/blog/2016/07/20/value-database-support/). Open source software itself has proven time and again to be the choice of billion-dollar industries. Again, there is no one universal choice among companies. There are success stories that run the gamut of open source software solutions.

To summarize, there’s a reason we redirect the software vs. software question. It’s not because we don’t like talking about how great our software is (it is great, and holds its own with other open source offerings). It’s because the question does not highlight Percona’s value. Percona doesn’t profit from its software. Supporting the open-source ecosystem is our goal. We are heavily invested in providing our customers with the right open source software solution for their situation, regardless of whether it is ours or not. Our experience and expertise in this space make us the ideal partner for businesses adopting more open source technology in place of enterprise-licensed models.

Categories: MySQL

MySQL Encryption at Rest – Part 2 (InnoDB)

MySQL Performance Blog - Wed, 2017-06-28 20:51

Welcome to Part 2 in a series of blog posts on MySQL encryption at rest. This post covers InnoDB tablespace encryption.

At Percona, we work with a number of clients that require strong security measures for PCI, HIPAA and PHI compliance, where data managed by MySQL needs to be encrypted “at rest.” As with all things open source, there are several options for meeting the MySQL encryption-at-rest requirement. In this three-part series, we cover several popular options for encrypting data and present the various pros and cons of each solution. You may want to evaluate which parts of these tutorials work best for your situation before using them in production.

Part one of this series covered implementing disk-level encryption using crypt+LUKS.

In this post, we discuss InnoDB tablespace encryption (TE). You can choose to use just LUKS, just InnoDB TE, or you can use both. It depends on your compliance needs and your configuration. For example, say you put your binary logs and InnoDB redo logs on partition X, and your InnoDB tablespaces (i.e., your .ibd files) on partition Y. InnoDB TE only handles encrypting the tablespaces. You would need to use LUKS on partition X to encrypt those files.

InnoDB Tablespace Encryption Tutorial

The manual describes InnoDB encryption as follows:

InnoDB tablespace encryption uses a two-tier encryption key architecture, consisting of a master encryption key and tablespace keys. When an InnoDB table is encrypted, a tablespace key is encrypted and stored in the tablespace header. When an application or authenticated user wants to access encrypted tablespace data, InnoDB uses a master encryption key to decrypt the tablespace key. The decrypted version of a tablespace key never changes, but you can change the master encryption key as required. This action is referred to as master key rotation.

As a real-world example, picture a high-school janitor’s key-chain. Each key is different; each key works for only one door (or in MySQL’s case, table). The keys themselves never change, but the label on each key can change based on how the janitor wants to note which key works on which door. At any time, the janitor can re-label the keys with special names, thwarting any hooligans from opening doors they shouldn’t be opening.

Configuration Changes

As noted above, InnoDB uses a keyring to manage encryption keys on a per-table basis. But you must load the keyring plugin before InnoDB. Add the following lines to your my.cnf under the [mysqld] section. You may substitute a different path for the keyring file, but be sure that mysqld has permissions to this path:

early-plugin-load = keyring_file.so
keyring_file_data = /var/lib/mysql-keyring/keyring

You do not need to restart MySQL at this time, but you can if you wish. These settings are for when you restart MySQL in the future.

Install the Keyring Plugin in MySQL

If you don’t wish to restart MySQL at this time, you can still continue. Run the following SQL statements within MySQL to load the plugin and configure the keyring file:

mysql> INSTALL PLUGIN keyring_file SONAME 'keyring_file.so';
mysql> SET GLOBAL keyring_file_data = '/var/lib/mysql-keyring/keyring';

You can now CREATE encrypted tables or ALTER existing tables to encrypt them with a simple command:

mysql> CREATE TABLE t1 (c1 INT) ENCRYPTION='Y';
Query OK, 0 rows affected (0.04 sec)

mysql> ALTER TABLE users ENCRYPTION='Y';
Query OK, 1 row affected (0.04 sec)

Be careful with the above ALTER: It will most likely block all other transactions during this conversion.
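To check which tables are already encrypted, you can query the data dictionary (a quick sanity check; the exact CREATE_OPTIONS text can vary between versions):

mysql> SELECT TABLE_SCHEMA, TABLE_NAME, CREATE_OPTIONS
       FROM INFORMATION_SCHEMA.TABLES
       WHERE CREATE_OPTIONS LIKE '%ENCRYPTION%';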

Encrypting Tables Using pt-online-schema-change

Using pt-online-schema-change is fairly straightforward. While it has many options, the most basic usage to encrypt a single table is the following:

pt-online-schema-change --execute --alter "engine=innodb encryption='Y'" D=mydb,t=mytab

pt-online-schema-change creates an encrypted, shadow copy of the table mytab. Once the table is populated with the original data, it drop-swaps the table and replaces the original table with the new one.

It is always recommended to run pt-online-schema-change with the --dry-run flag first, in order to validate the command and the change.
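For example, to validate the same change as above before executing it (--dry-run makes no changes to the table):

pt-online-schema-change --dry-run --alter "engine=innodb encryption='Y'" D=mydb,t=mytab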

Encrypting All Tables in a Database

To encrypt all the tables in a database, we can simply take the above command and wrap it in a shell loop. The loop is fed a list of tables in the database from MySQL. You just need to change the dbname variable:

# dbname=mydb
# mysql -B -N -e "SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA='$dbname' and TABLE_TYPE='BASE TABLE'" |
    while read line; do
        pt-online-schema-change --execute --alter "engine=innodb encryption='Y'" D=$dbname,t=$line
    done

Be careful with the above command, as it will take off and blast through all the changes!

Rotating InnoDB Encryption Keys

Your compliance policy dictates how often you need to rotate the encryption keys within InnoDB:

mysql> ALTER INSTANCE ROTATE INNODB MASTER KEY;

This command creates a new master key, decrypts all the encrypted tablespace headers (where the per-table keys are stored) using the previous master key, and then re-encrypts the headers with the new key. Notice that the tables themselves are not re-encrypted – that would be an extremely intrusive and potentially dangerous operation. Recall from above that each table’s encryption key never changes (unless the entire table is decrypted). What can change is the key used to encrypt that key within each table’s header.

The keyring UDFs in MySQL

The keyring plugin can also be used to store sensitive data within MySQL for later retrieval. This functionality is beyond the scope of this blog as it doesn’t deal directly with InnoDB TE. You can read more about it in the MySQL manual.

Conclusion

InnoDB tablespace encryption makes it easy to use MySQL encryption at rest. Keep in mind the downsides of using only TE: InnoDB tablespace encryption doesn’t cover undo logs, redo logs or the main ibdata1 tablespace. Additionally, InnoDB encryption doesn’t cover binary logs or slow query logs.

Categories: MySQL

SSL Connections in MySQL 5.7

MySQL Performance Blog - Tue, 2017-06-27 21:16

This blog post looks at SSL connections and how they work in MySQL 5.7.

Recently I was working on an SSL implementation with MySQL 5.7, and I made some interesting discoveries. I realized I could connect to the MySQL server without specifying the SSL keys on the client side, and the connection was still secured by SSL. I was confused, and I did not understand what was happening.

In this blog post, I am going to show you how SSL works in MySQL 5.7, and how it worked previously in MySQL 5.6.

Let’s start with an introduction of how SSL worked in 5.6.

SSL in MySQL 5.6

The documentation for SSL in MySQL 5.6 is quite detailed, and it explains how SSL works. But first let’s make one thing clear: MySQL supports secure (encrypted) connections between clients and the server using the TLS (Transport Layer Security) protocol. TLS is sometimes referred to as SSL (Secure Sockets Layer), but MySQL doesn’t actually use the SSL protocol for secure connections because it provides weak encryption.

So when we/someone says MySQL is using SSL, it really means that it is using TLS. You can check which protocol you use:

show status like 'Ssl_version';
+---------------+---------+
| Variable_name | Value   |
+---------------+---------+
| Ssl_version   | TLSv1.2 |
+---------------+---------+

So how does it work? Let me quote a few lines from the MySQL documentation:

TLS uses encryption algorithms to ensure that data received over a public network can be trusted. It has mechanisms to detect data change, loss, or replay. TLS also incorporates algorithms that provide identity verification using the X509 standard. X509 makes it possible to identify someone on the Internet. In basic terms, there should be some entity called a “Certificate Authority” (or CA) that assigns electronic certificates to anyone who needs them. Certificates rely on asymmetric encryption algorithms that have two encryption keys (a public key and a secret key). A certificate owner can present the certificate to another party as proof of identity. A certificate consists of its owner’s public key. Any data encrypted using this public key can be decrypted only using the corresponding secret key, which is held by the owner of the certificate.

It works with key pairs (private and public): the server has the private keys and the client has the public keys. Here is a link showing how we can generate these keys.

MySQL 5.6 Client

When the server is configured with SSL, the client has to have the client certificates. Once it gets those, it can connect to the server using SSL:

mysql -utestuser -p -h192.168.0.1 -P3306 --ssl-key=/root/mysql-ssl/client-key.pem --ssl-cert=/root/mysql-ssl/client-cert.pem

We have to specify the key and the certificate. Otherwise, we cannot connect to the server with SSL. So far so good, everything works as the documentation says. But how does it work in MySQL 5.7?

SSL in MySQL 5.7

The documentation was very confusing to me, and did not help much. It described how to create the SSL keys (and even the server and client configuration) the same way as in MySQL 5.6. So if everything is the same, why can I connect without specifying the client keys? I did not have an answer for that. One of my colleagues (Alexey Poritskiy) found the answer after digging through the manuals for a while; it finally clearly described this behavior:

As of MySQL 5.7.3, a client need specify only the --ssl option to obtain an encrypted connection. The connection attempt fails if an encrypted connection cannot be established. Before MySQL 5.7.3, the client must specify either the --ssl-ca option, or all three of the --ssl-ca, --ssl-key, and --ssl-cert options.

These lines are in the “CREATE USER” syntax manual.

After this, I re-read the SSL documentation and found the following:

5.7.3: On the client side, an explicit --ssl option is no longer advisory but prescriptive. Given a server enabled to support secure connections, a client program can require a secure connection by specifying only the --ssl option. The connection attempt fails if a secure connection cannot be established. Other --ssl-xxx options on the client side mean that a secure connection is advisory (the connection attempt falls back to an unencrypted connection if a secure connection cannot be established).

I still think the SSL manual could be more expressive and detailed.

Before MySQL 5.7.3, SSL used key pairs on both sides. After that, it works a bit like websites (HTTPS): the client does not need the keys, and the connection can still be secure. (I would still like to see more detailed documentation on how it really works.)

Is This Good for Us?

In my opinion, this is a really good feature, but it didn’t get a lot of publicity. Prior to 5.7.3, if we wanted to use SSL we had to create the keys and use the client keys in every client application. It’s doable, but it is just a pain — especially if we are using different keys for every server with many users.

With this feature we can have a different key for every server, but on the client side we only have to enable the SSL connection. Most of the clients use it as the default if SSL is available.
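As an illustration (hypothetical host and user; note that MySQL 5.7.11 and later deprecate the client --ssl option in favor of --ssl-mode=REQUIRED), a client can demand an encrypted connection without any key files and then verify that the session really is encrypted:

mysql -utestuser -p -h192.168.0.1 -P3306 --ssl
mysql> SHOW STATUS LIKE 'Ssl_cipher';

A non-empty Ssl_cipher value confirms the connection is encrypted.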

I am pretty sure if people knew about this feature they would use it more often.

Limitations

I tested it with many client applications, and all worked without specifying the client keys. I only had issues with the MySQL 5.6 client, even though the server was 5.7. In that case, I had to specify the client keys. Without them, I couldn’t connect to the server.

It is possible some older applications won’t support this feature.

Conclusion

This is a great feature: easy to use, and it makes SSL implementations much easier. But I think the documentation should be clearer and more precise in describing how this new feature actually works. I have already submitted a request to update the documentation.

Categories: MySQL

Webinar Thursday June 29, 2017: Choosing a MySQL® High Availability Solution

MySQL Performance Blog - Tue, 2017-06-27 18:40

Join Percona’s Principal Technical Services Engineer, Marcos Albe as he presents Choosing a MySQL High Availability Solution on Thursday, June 29, 2017, at 11:00 am PDT / 2:00 pm EDT (UTC-7).

Register Now

The MySQL world is full of tradeoffs, and choosing a high availability (HA) solution is no exception. Learn to think about high availability “the Percona way,” and how to use the solutions that we deploy on a regular basis. In this webinar, we will cover:

  • Percona XtraDB Cluster
  • DRBD
  • MHA
  • MySQL Orchestrator

Register for the webinar here.

Marcos Albe, Principal Technical Services Engineer

After 12 years of working as a PHP/JS developer for local and remote firms, Marcos decided to pursue his true love and become a full-time DBA. Since then, he has provided MySQL support at Percona for the past six years. He helps leading web properties with advice on anything MySQL, as well as in-depth system performance diagnostic help.


Categories: MySQL

Webinar Tuesday June 27, 2017: MariaDB® Server 10.2 – The Complete Guide

MySQL Performance Blog - Mon, 2017-06-26 21:37

Join Percona’s Chief Evangelist, Colin Charles as he presents MariaDB Server 10.2: The Complete Guide on Tuesday, June 27, 2017, at 7:00 am PDT / 10:00 am EDT (UTC-7).

Register Now

The new MariaDB Server 10.2 release is out. It has some interesting new features, but beyond just a list of features we need to understand how to use them. This talk will go over everything new that MariaDB 10.2 has to offer.

In this webinar, we’ll learn about Window functions, common table expressions, finer-grained CREATE USER statements, and more – including getting mysqlbinlog up to parity with MySQL.

There are also unique MariaDB features that don’t exist in MySQL like encryption at rest, integrated Galera Cluster, threadpool, InnoDB defragmentation, roles, extended REGEXP, etc.

The webinar describes all the new features, both MySQL-compatible and MariaDB-only ones, and shows usage examples and practical use cases. We’ll also review the feature roadmap for MariaDB Server 10.3.

Register for the webinar here.

Colin Charles, Chief Evangelist

Colin Charles is the Chief Evangelist at Percona. He was previously part of the founding team of MariaDB Server (2009), worked at MySQL from 2005, and has been a MySQL user since 2000. Before joining MySQL, he actively worked on the Fedora and OpenOffice.org projects. He’s well known within open source communities in APAC and has spoken at many conferences.

Categories: MySQL

Percona XtraDB Cluster, Galera Cluster, MySQL Group Replication High Availability Webinar: Q & A

MySQL Performance Blog - Fri, 2017-06-23 18:45

Thank you for attending the Wednesday, June 21, 2017 high availability webinar titled Percona XtraDB Cluster, Galera Cluster, MySQL Group Replication. In this blog, I will provide answers to the Q & A for that webinar.

You can find the slides and a recording of the webinar here.

Is there a minimum MySQL server version for Group Replication?

MySQL Group Replication is GA since MySQL Community 5.7.17. This is the lowest version that you should use for the Group Replication feature. Otherwise, you are using a beta version.

Since 5.7.17 was the GA release, it’s strongly recommended you use the latest 5.7 minor release. Bugs get fixed and features added in each of the minor releases (as can be seen in the Limitations section in the slide deck).

In MySQL 5.6 and earlier versions, Group Replication is not supported. Note that Percona Server for MySQL 5.7.17 and beyond also ships with Group Replication.

Can I use Percona XtraDB Cluster with MariaDB v10.2? or must I use Percona Server for MySQL?

Percona XtraDB Cluster is Percona Server for MySQL and Percona XtraBackup with the modified Galera library. You cannot run Percona XtraDB Cluster on MariaDB.

However, as Percona XtraDB Cluster is open source, it is possible that MariaDB/Codership implements our modifications into their codebase.

If Percona XtraDB Cluster only allows InnoDB tables, how do we typically deal with applications that need to use MyISAM tables?

You cannot use MyISAM with Percona XtraDB Cluster, Galera or Group Replication. However, there is experimental MyISAM support in Galera/Percona XtraDB Cluster. But we strongly recommend that you don’t use this in production. It effectively executes all statements in Total Order Isolation, which results in bad performance.

What is a typical business use case for the Group Replication? I specifically like the writes order feature.

Typical use cases are:

  • Environments with strict durability requirements
  • Writing to multiple nodes simultaneously while keeping data consistent
  • Reducing failover time
  • Using other nodes for read-scaling, where reading stale data is more difficult for the application (as opposed to standard asynchronous replication)

The use cases for Galera and Percona XtraDB Cluster are similar.

Where do you run ProxySQL, on a separate server? We are using HAProxy.

You can deploy ProxySQL in many different ways. One common method of installation is to run ProxySQL on a separate layer of servers (ensuring there is failover on this layer). Another commonly used method is to run a ProxySQL daemon on every application server.

Do you support KVM?

Yes, there are no limitations on virtualization solutions.

Can you give some examples of an “arbitrator”?

Some useful links:

What does Percona XtraDB add to make it more performant than InnoDB?

The scalability and performance improvement of Percona XtraDB are listed on the Percona Server for MySQL documentation page: https://www.percona.com/doc/percona-server/LATEST/index.html

How scalable is Percona XtraDB Cluster storage wise? Do we have any limitations?

Storage happens through the storage engine (which is InnoDB). Percona XtraDB Cluster does not have any different limitations than Percona Server for MySQL or MySQL.

However, we need to also consider the practical side of things: the larger the cluster gets, the longer certain operations take. For example, when adding a new node to the cluster another node must be the donor and provide all the data. This will take substantially longer with larger datasets. Certain operational aspects might therefore become more complex.

Is there any development to add multiple nodes simultaneously?

No, at the moment only one node can join the cluster at the same time. Other nodes automatically wait until it is finished before joining.

Why does Galera say we cannot use READ COMMITTED isolation for multimaster mode, even though we can start the cluster with READ-COMMITTED?

You can use READ-COMMITTED as transaction isolation level. The limitation is that you cannot use SERIALIZABLE: http://galeracluster.com/documentation-webpages/isolationlevels.html.

Galera Cluster and MariaDB currently do not prevent a user from using this transaction isolation level. Percona XtraDB Cluster implemented the strict mode to prevent these operations: https://www.percona.com/doc/percona-xtradb-cluster/LATEST/features/pxc-strict-mode.html#explicit-table-locking
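For reference, the isolation level can be set per session like this (a sketch; with pxc_strict_mode=ENFORCING, Percona XtraDB Cluster is expected to reject the SERIALIZABLE attempt):

mysql> SET SESSION tx_isolation = 'READ-COMMITTED'; -- allowed
mysql> SET SESSION tx_isolation = 'SERIALIZABLE';   -- denied under PXC strict mode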

MariaDB 10.2 fixed the check constraints issue. When will Percona fix this issue?

There are currently no plans to support CHECK constraints in Percona Server for MySQL (and therefore Percona XtraDB Cluster as well).

As Percona Server is effectively a fully backwards-compatible (but modified) MySQL Community Server, CHECK constraints is a feature that normally would be implemented in MySQL Community first.

Can you share your performance benchmark git repository (if you have one)?

We don’t have a performance benchmark in a git repository. You can get detailed information about this benchmark in this blog: Performance improvements in Percona XtraDB Cluster 5.7.17-29.20.

On your slide pointing to scalability charts, how many nodes did you run your test against?

We used a three-node cluster for this performance benchmark.

The product is using master-master replication. As such, what do you mean when you talk about failover in such a configuration?

“Master-Master” replication in the MySQL world means that you configure replication from node1 to node2, and from node2 to node1. However, since replication is asynchronous there are no consistency guarantees when writing to both nodes at the same time. Data can diverge, and nodes can end up with different data.

Master-Master is often used in order to failover from one master to another without having to reconfigure replication. All that is needed is to:

  • Stop the traffic from the current master
  • Let the other master sync up (usually almost immediately)
  • Mark the old master read_only, set the new master as read_write (see the sketch after this list)
  • Start traffic to the new master
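On the MySQL side, the read_only flip in the third step is a pair of one-liners (a sketch; super_read_only, available in MySQL 5.7+ and Percona Server, additionally blocks SUPER users and may be preferable):

-- on the old master
mysql> SET GLOBAL read_only = ON;
-- on the new master
mysql> SET GLOBAL read_only = OFF;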

Where do you maintain the cluster state?

All technologies automatically maintain the cluster state as you add and remove nodes.

What are the network/IP requirements for Proxy SQL?

There are no specific requirements. More documentation about ProxySQL can be found here: https://github.com/sysown/proxysql/wiki.

Categories: MySQL

ClickHouse in a General Analytical Workload (Based on a Star Schema Benchmark)

MySQL Performance Blog - Thu, 2017-06-22 19:20

In this blog post, we’ll look at how ClickHouse performs in a general analytical workload using the star schema benchmark test.

We have mentioned ClickHouse in some recent posts (ClickHouse: New Open Source Columnar Database, Column Store Database Benchmarks: MariaDB ColumnStore vs. Clickhouse vs. Apache Spark), where it showed excellent results. ClickHouse by itself seems to be an event-oriented RDBMS, as its name suggests (clicks). Its primary use case, Yandex Metrica (a system similar to Google Analytics), also points to an event-based nature. We can also see there is a requirement for date-stamped columns.

It is possible, however, to use ClickHouse in a general analytical workload. This blog post shares my findings. For these tests, I used a Star Schema Benchmark, slightly modified to handle ClickHouse specifics.

First, let’s talk about schemas. We need to adjust to ClickHouse data types. For example, the biggest fact table in SSB is “lineorder”. Below is how it is defined for Amazon RedShift (as taken from https://docs.aws.amazon.com/redshift/latest/dg/tutorial-tuning-tables-create-test-data.html):

CREATE TABLE lineorder
(
  lo_orderkey        INTEGER NOT NULL,
  lo_linenumber      INTEGER NOT NULL,
  lo_custkey         INTEGER NOT NULL,
  lo_partkey         INTEGER NOT NULL,
  lo_suppkey         INTEGER NOT NULL,
  lo_orderdate       INTEGER NOT NULL,
  lo_orderpriority   VARCHAR(15) NOT NULL,
  lo_shippriority    VARCHAR(1) NOT NULL,
  lo_quantity        INTEGER NOT NULL,
  lo_extendedprice   INTEGER NOT NULL,
  lo_ordertotalprice INTEGER NOT NULL,
  lo_discount        INTEGER NOT NULL,
  lo_revenue         INTEGER NOT NULL,
  lo_supplycost      INTEGER NOT NULL,
  lo_tax             INTEGER NOT NULL,
  lo_commitdate      INTEGER NOT NULL,
  lo_shipmode        VARCHAR(10) NOT NULL
);

For ClickHouse, the table definition looks like this:

CREATE TABLE lineorderfull
(
  LO_ORDERKEY      UInt32,
  LO_LINENUMBER    UInt8,
  LO_CUSTKEY       UInt32,
  LO_PARTKEY       UInt32,
  LO_SUPPKEY       UInt32,
  LO_ORDERDATE     Date,
  LO_ORDERPRIORITY String,
  LO_SHIPPRIORITY  UInt8,
  LO_QUANTITY      UInt8,
  LO_EXTENDEDPRICE UInt32,
  LO_ORDTOTALPRICE UInt32,
  LO_DISCOUNT      UInt8,
  LO_REVENUE       UInt32,
  LO_SUPPLYCOST    UInt32,
  LO_TAX           UInt8,
  LO_COMMITDATE    Date,
  LO_SHIPMODE      String
) Engine=MergeTree(LO_ORDERDATE, (LO_ORDERKEY, LO_LINENUMBER), 8192);

From this we can see we need to use datatypes like UInt8 and UInt32, which are somewhat unusual in the database world.

The second table (RedShift definition):

CREATE TABLE customer
(
  c_custkey    INTEGER NOT NULL,
  c_name       VARCHAR(25) NOT NULL,
  c_address    VARCHAR(25) NOT NULL,
  c_city       VARCHAR(10) NOT NULL,
  c_nation     VARCHAR(15) NOT NULL,
  c_region     VARCHAR(12) NOT NULL,
  c_phone      VARCHAR(15) NOT NULL,
  c_mktsegment VARCHAR(10) NOT NULL
);

For ClickHouse, I defined as:

CREATE TABLE customerfull
(
  C_CUSTKEY    UInt32,
  C_NAME       String,
  C_ADDRESS    String,
  C_CITY       String,
  C_NATION     String,
  C_REGION     String,
  C_PHONE      String,
  C_MKTSEGMENT String,
  C_FAKEDATE   Date
) Engine=MergeTree(C_FAKEDATE, (C_CUSTKEY), 8192);

For reference, the full schema for the benchmark is here: https://github.com/vadimtk/ssb-clickhouse/blob/master/create.sql.

For this table, we need to define a rudimentary column C_FAKEDATE Date in order to use ClickHouse’s most advanced engine (MergeTree). I was told by the ClickHouse team that they plan to remove this limitation in the future.

To generate data acceptable by ClickHouse, I made modifications to ssb-dbgen. You can find my version here: https://github.com/vadimtk/ssb-dbgen. The most notable change is that ClickHouse can’t accept dates in CSV files formatted as “19971125”. It has to be “1997-11-25”. This is something to keep in mind when loading data into ClickHouse.

It is possible to do some preformatting during the load, but I don’t have experience with that. A common approach is to create a staging table with datatypes that match the loaded data, and then convert the values using SQL functions when inserting into the main table.
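A sketch of that staging approach (hypothetical table names, trimmed to two columns; concat, substring and toDate are standard ClickHouse functions):

CREATE TABLE lineorder_stage
(
    LO_ORDERKEY      UInt32,
    LO_ORDERDATE_RAW String -- raw '19971125' form, as found in the CSV
) ENGINE = Log;

CREATE TABLE lineorder_target
(
    LO_ORDERKEY  UInt32,
    LO_ORDERDATE Date
) ENGINE = MergeTree(LO_ORDERDATE, (LO_ORDERKEY), 8192);

INSERT INTO lineorder_target
SELECT
    LO_ORDERKEY,
    toDate(concat(substring(LO_ORDERDATE_RAW, 1, 4), '-',
                  substring(LO_ORDERDATE_RAW, 5, 2), '-',
                  substring(LO_ORDERDATE_RAW, 7, 2)))
FROM lineorder_stage;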

Hardware Setup

One of the goals of this benchmark to see how ClickHouse scales on multiple nodes. I used a setup of one node, and then compared to a setup of three nodes. Each node is 24 cores of “Intel(R) Xeon(R) CPU E5-2643 v2 @ 3.50GHz” CPUs, and the data is located on a very fast PCIe Flash storage.

For the SSB benchmark I use a scale factor of 2500, which provides (in raw data):

Table lineorder – 15 bln rows, raw size 1.7TB
Table customer – 75 mln rows

When loaded into ClickHouse, the table lineorder takes 464GB, which corresponds to a 3.7x compression ratio.

We compare a one-node (table names lineorderfull, customerfull) setup vs. a three-node (table names lineorderd, customerd) setup.

Single Table Operations

Query:

SELECT
    toYear(LO_ORDERDATE) AS yod,
    sum(LO_REVENUE)
FROM lineorderfull
GROUP BY yod

One node:

7 rows in set. Elapsed: 9.741 sec. Processed 15.00 billion rows, 90.00 GB (1.54 billion rows/s., 9.24 GB/s.)

Three nodes:

7 rows in set. Elapsed: 3.258 sec. Processed 15.00 billion rows, 90.00 GB (4.60 billion rows/s., 27.63 GB/s.)

We see a speed up of practically three times. Handling 4.6 billion rows/s is blazingly fast!

One Table with Filtering

SELECT sum(LO_EXTENDEDPRICE * LO_DISCOUNT) AS revenue
FROM lineorderfull
WHERE (toYear(LO_ORDERDATE) = 1993)
    AND ((LO_DISCOUNT >= 1) AND (LO_DISCOUNT <= 3))
    AND (LO_QUANTITY < 25)

One node:

1 rows in set. Elapsed: 3.175 sec. Processed 2.28 billion rows, 18.20 GB (716.60 million rows/s., 5.73 GB/s.)

Three nodes:

1 rows in set. Elapsed: 1.295 sec. Processed 2.28 billion rows, 18.20 GB (1.76 billion rows/s., 14.06 GB/s.)

It’s worth mentioning that during the execution of this query, ClickHouse was able to use ALL 24 cores on each box. This confirms that ClickHouse is a massively parallel processing system.

Two Tables (Independent Subquery)

In this case, I want to show how ClickHouse handles independent subqueries:

SELECT sum(LO_REVENUE)
FROM lineorderfull
WHERE LO_CUSTKEY IN
(
    SELECT C_CUSTKEY AS LO_CUSTKEY
    FROM customerfull
    WHERE C_REGION = 'ASIA'
)

One node:

1 rows in set. Elapsed: 28.934 sec. Processed 15.00 billion rows, 120.00 GB (518.43 million rows/s., 4.15 GB/s.)

Three nodes:

1 rows in set. Elapsed: 14.189 sec. Processed 15.12 billion rows, 121.67 GB (1.07 billion rows/s., 8.57 GB/s.)

However, we do not see the close-to-3x speedup on three nodes, because of the data transfer required to match LO_CUSTKEY with C_CUSTKEY across nodes.

Two Tables JOIN

Things get more complicated when a subquery needs to return columns for the result set, or for GROUP BY. In this case, we want to GROUP BY a column from the second table.

First, ClickHouse doesn’t support traditional subquery syntax, so we need to use JOIN. Second, ClickHouse strictly prescribes how JOINs must be written (a limitation that will also be lifted in the future). Our JOIN should look like:

SELECT
    C_REGION,
    sum(LO_EXTENDEDPRICE * LO_DISCOUNT)
FROM lineorderfull
ANY INNER JOIN
(
    SELECT
        C_REGION,
        C_CUSTKEY AS LO_CUSTKEY
    FROM customerfull
) USING (LO_CUSTKEY)
WHERE (toYear(LO_ORDERDATE) = 1993)
    AND ((LO_DISCOUNT >= 1) AND (LO_DISCOUNT <= 3))
    AND (LO_QUANTITY < 25)
GROUP BY C_REGION

One node:

5 rows in set. Elapsed: 31.443 sec. Processed 2.35 billion rows, 28.79 GB (74.75 million rows/s., 915.65 MB/s.)

Three nodes:

5 rows in set. Elapsed: 25.160 sec. Processed 2.58 billion rows, 33.25 GB (102.36 million rows/s., 1.32 GB/s.)

In this case the speedup is not even two times. This reflects the random distribution of data across nodes for the tables lineorderd and customerd. Both tables were defined as:

CREATE TABLE lineorderd AS lineorder
    ENGINE = Distributed(3shards, default, lineorder, rand());
CREATE TABLE customerd AS customer
    ENGINE = Distributed(3shards, default, customer, rand());

Where rand() specifies that records are distributed randomly across the three nodes. When we perform a JOIN on LO_CUSTKEY = C_CUSTKEY, matching records might be located on different nodes, forcing data transfer. One way to deal with this is to shard both tables by the join key, so that matching records end up on the same node and the JOIN can run locally. For example:

CREATE TABLE lineorderLD AS lineorderL
    ENGINE = Distributed(3shards, default, lineorderL, LO_CUSTKEY);
CREATE TABLE customerLD AS customerL
    ENGINE = Distributed(3shards, default, customerL, C_CUSTKEY);

Three Tables JOIN

This is where it becomes very complicated. Let’s consider the query that you would normally write:

SELECT
    sum(LO_REVENUE),
    P_MFGR,
    toYear(LO_ORDERDATE) yod
FROM lineorderfull, customerfull, partfull
WHERE C_REGION = 'ASIA'
    AND LO_CUSTKEY = C_CUSTKEY
    AND P_PARTKEY = LO_PARTKEY
GROUP BY P_MFGR, yod
ORDER BY P_MFGR, yod;

With ClickHouse’s limitations on JOIN syntax, the query becomes:

SELECT
    sum(LO_REVENUE),
    P_MFGR,
    toYear(LO_ORDERDATE) AS yod
FROM
(
    SELECT
        LO_PARTKEY,
        LO_ORDERDATE,
        LO_REVENUE
    FROM lineorderfull
    ALL INNER JOIN
    (
        SELECT
            C_REGION,
            C_CUSTKEY AS LO_CUSTKEY
        FROM customerfull
    ) USING (LO_CUSTKEY)
    WHERE C_REGION = 'ASIA'
)
ALL INNER JOIN
(
    SELECT
        P_MFGR,
        P_PARTKEY AS LO_PARTKEY
    FROM partfull
) USING (LO_PARTKEY)
GROUP BY P_MFGR, yod
ORDER BY P_MFGR ASC, yod ASC

By writing queries this way, we force ClickHouse to use the prescribed JOIN order — at this moment there is no optimizer in ClickHouse and it is totally unaware of data distribution.

There is also not much speedup when we compare one node vs. three nodes:

One node execution time:

35 rows in set. Elapsed: 697.806 sec. Processed 15.08 billion rows, 211.53 GB (21.61 million rows/s., 303.14 MB/s.)

Three nodes execution time:

35 rows in set. Elapsed: 622.536 sec. Processed 15.12 billion rows, 211.71 GB (24.29 million rows/s., 340.08 MB/s.)

There is a way to make the query faster for this 3-way JOIN, however. (Thanks to Alexander Zaytsev from https://www.altinity.com/ for help!)

Optimized query:

SELECT
    sum(revenue),
    P_MFGR,
    yod
FROM
(
    SELECT
        LO_PARTKEY AS P_PARTKEY,
        toYear(LO_ORDERDATE) AS yod,
        SUM(LO_REVENUE) AS revenue
    FROM lineorderfull
    WHERE LO_CUSTKEY IN
    (
        SELECT C_CUSTKEY
        FROM customerfull
        WHERE C_REGION = 'ASIA'
    )
    GROUP BY P_PARTKEY, yod
)
ANY INNER JOIN partfull USING (P_PARTKEY)
GROUP BY P_MFGR, yod
ORDER BY P_MFGR ASC, yod ASC

Optimized query time:

One node:

35 rows in set. Elapsed: 106.732 sec. Processed 15.00 billion rows, 210.05 GB (140.56 million rows/s., 1.97 GB/s.)

Three nodes:

35 rows in set. Elapsed: 75.854 sec. Processed 15.12 billion rows, 211.71 GB (199.36 million rows/s., 2.79 GB/s.)

That’s an improvement of about 6.5 times compared to the original query. The main reason is that the rewritten query aggregates inside the subquery first, so far fewer rows reach the JOIN with partfull. This shows the importance of understanding data distribution, and of writing the optimal query to process the data.

Another option for dealing with JOIN complexity, and to improve performance, is to use ClickHouse’s dictionaries. These dictionaries are described here: https://www.altinity.com/blog/2017/4/12/dictionaries-explained.
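As a rough sketch of how a dictionary lookup can replace a JOIN (this assumes a flat dictionary named customer is defined in the server’s dictionary configuration, keyed by C_CUSTKEY with C_REGION as a String attribute; the names are illustrative), the two-table query above could be rewritten as:

SELECT
    dictGetString('customer', 'C_REGION', toUInt64(LO_CUSTKEY)) AS C_REGION,
    sum(LO_EXTENDEDPRICE * LO_DISCOUNT)
FROM lineorderfull
WHERE (toYear(LO_ORDERDATE) = 1993)
    AND ((LO_DISCOUNT >= 1) AND (LO_DISCOUNT <= 3))
    AND (LO_QUANTITY < 25)
GROUP BY C_REGION

This avoids shipping customer rows between nodes, since each server loads the dictionary and resolves the region with a local key lookup.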

I will review dictionary performance in future posts.

Another traditional way to deal with JOIN complexity in an analytics workload is to use denormalization. We can move some columns (for example, P_MFGR from the last query) to the facts table (lineorder).
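A sketch of that idea (the lineorderdenorm table is hypothetical, with the column list trimmed for illustration): P_MFGR is materialized into the facts table once, after which the manufacturer part of the query no longer needs a JOIN at all.

CREATE TABLE lineorderdenorm (
    LO_ORDERKEY     UInt32,
    LO_LINENUMBER   UInt8,
    LO_PARTKEY      UInt32,
    LO_ORDERDATE    Date,
    LO_REVENUE      UInt32,
    P_MFGR          String
) Engine = MergeTree(LO_ORDERDATE, (LO_ORDERKEY, LO_LINENUMBER), 8192);

INSERT INTO lineorderdenorm
SELECT
    LO_ORDERKEY,
    LO_LINENUMBER,
    LO_PARTKEY,
    LO_ORDERDATE,
    LO_REVENUE,
    P_MFGR
FROM lineorderfull
ALL INNER JOIN
(
    SELECT
        P_MFGR,
        P_PARTKEY AS LO_PARTKEY
    FROM partfull
) USING (LO_PARTKEY);

The trade-off is extra storage and a more expensive load, in exchange for simpler and faster reporting queries.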

Observations

  • ClickHouse can handle general analytical queries (it requires special schema design and considerations, however)
  • Linear speedup is possible, but it depends on query design and requires advance planning; proper speedup depends on data locality
  • ClickHouse is blazingly fast (beyond what I’ve seen before) because it can use all available CPU cores for a query, as shown above: 24 cores for a single server and 72 cores across three nodes
  • Multi-table JOINs are cumbersome and require manual work to achieve better performance, so consider using dictionaries or denormalization
Categories: MySQL

Percona Monitoring and Management 1.1.5 is Now Available

MySQL Performance Blog - Wed, 2017-06-21 17:58

Percona announces the release of Percona Monitoring and Management 1.1.5 on June 21, 2017.

For installation instructions, see the Deployment Guide.

Changes in PMM Server

  • PMM-667: Fixed the Latency graph in the ProxySQL Overview dashboard to plot microsecond values instead of milliseconds.

  • PMM-800: Fixed the InnoDB Page Splits graph in the MySQL InnoDB Metrics Advanced dashboard to show correct page merge success ratio.

  • PMM-1007: Added links to Query Analytics from MySQL Overview and MongoDB Overview dashboards. The links also pass selected host and time period values.

    NOTE: These links currently open QAN2, which is still considered experimental.

Changes in PMM Client

  • PMM-931: Fixed pmm-admin script when adding MongoDB metrics monitoring for a secondary in a replica set.

About Percona Monitoring and Management

Percona Monitoring and Management (PMM) is an open-source platform for managing and monitoring MySQL and MongoDB performance. Percona developed it in collaboration with experts in the field of managed database services, support and consulting.

PMM is a free and open-source solution that you can run in your own environment for maximum security and reliability. It provides thorough time-based analysis for MySQL and MongoDB servers to ensure that your data works as efficiently as possible.

A live demo of PMM is available at pmmdemo.percona.com.

Please provide your feedback and questions on the PMM forum.

If you would like to report a bug or submit a feature request, use the PMM project in JIRA.

Categories: MySQL

Tracing MongoDB Queries to Code with Cursor Comments

MySQL Performance Blog - Wed, 2017-06-21 17:16

In this short blog post, we will discuss a helpful feature for tracing MongoDB queries: Cursor Comments.

Cursor Comments

Much like other database systems, MongoDB supports the ability for application developers to set comment strings on their database queries using the Cursor Comment feature. This feature is very useful for both DBAs and developers for quickly and efficiently tying a MongoDB query found on the database server to a line of code in the application source.

Once Cursor Comments are set in application code, they can be seen in the following areas on the server:

  1. The db.currentOp() shell command. If Auth is enabled, this requires a role that has the ‘inprog’ privilege.
  2. Profiles in the system.profile collection (per-db) if profiling is enabled.
  3. The QUERY log component.

Note: the Cursor Comment string shows as the field “query.comment” in the Database Profiler output, and as the field “originatingCommand.comment” in the output of the db.currentOp() command.

This is fantastic, because it makes comments visible in the areas commonly used to find performance issues!

Often it is very easy to find a slow query on the database server, but it is difficult to target the exact area of a large application that triggers the slow query. This can all be changed with Cursor Comments!

Python Example

Below is a snippet of Python code implementing a cursor comment on a simple query to the collection “test.test”. (If you do not use Python, most other languages and MongoDB drivers support the feature in a similar way.)

My goal in this example is to get the MongoDB Profiler to log a custom comment, and then we will read it back manually from the server afterward to confirm it worked.

In this example, I include the following pieces of data in my comment:

  1. The Python class
  2. The Python method that executed the query
  3. The file Python was executing
  4. The line of the file Python was executing

Unfortunately, three of the four useful details above are not built-in variables in Python, so the “inspect” module is required to fetch those details. Using the “inspect” module and setting a cursor comment for every query in an application is a bit clunky, so it is best to create a method to do this. I made a class-method named “find_with_comment” in this example code to do this. This method performs a MongoDB query and sets the cursor comment automagically, finally returning a regular pymongo cursor object.

Below is the simple Python example script. It connects to a Mongod on localhost:27017, and demonstrates everything for us. You can run this script yourself if you have the “pymongo” Python package installed.

Script:

from inspect import currentframe, getframeinfo
from pymongo import MongoClient

class TestClass:

    def find_with_comment(self, conn, query, db, coll):
        frame = getframeinfo(currentframe().f_back)
        comment = "%s:%s;%s:%i" % (self.__class__.__name__, frame.function, frame.filename, frame.lineno)
        collection = conn[db][coll]
        return collection.find(query).comment(comment)

    def run(self):
        uri = "localhost:27017"
        conn = MongoClient(uri)
        query = {'user.name': 'John Doe'}
        for doc in self.find_with_comment(conn, query, 'test', 'test'):
            print doc
        conn.close()

if __name__ == "__main__":
    t = TestClass()
    t.run()

There are a few things to explain in this code:

  1. Line #6-10: The “find_with_comment” method runs a pymongo query and handles adding our special cursor comment string. This method takes in the connection, query and db+collection names as variables.
  2. Line #7: uses the “inspect” module to read the calling Python “frame”, so we can fetch the file and line number that called the query.
  3. Line #12-18: The “run” method makes a database connection, runs the “find_with_comment” method with a query, prints the results and closes the connection. This method is just boilerplate to run the example.
  4. Line #20-22: This code instantiates TestClass and calls the “run” method to run our test.

Trying It Out

Before running this script, enable database profiling mode “2” on the “test” database. This is the database the script will query. The profiling mode “2” causes MongoDB to profile all queries:

$ mongo --port=27017
> use test
switched to db test
> db.setProfilingLevel(2)
{ "was" : 1, "slowms" : 100, "ratelimit" : 1, "ok" : 1 }
> quit()

Now let’s run the script. There should be no output from the script; it only runs a find query to generate a Profile.

I saved the script as cursor-comment.py and ran it like this from my Linux terminal:

$ python cursor-comment.py
$

Now, let’s see if we can find any Profiles containing the “query.comment” field:

$ mongo --port=27017
> use test
> db.system.profile.find({ "query.comment": {$exists: true} }, { query: 1 }).pretty()
{
        "query" : {
                "find" : "test",
                "filter" : {
                        "user.name" : "John Doe"
                },
                "comment" : "TestClass:run;cursor-comment.py:16"
        }
}

Now we know the exact class, method, file and line number that ran this profiled query! Great!

From this Profile we can conclude that the class-method “TestClass:run” initiated this MongoDB query from Line #16 of cursor-comment.py. Imagine this was a query that slowed down your production system and you need to know the source quickly. The usefulness of this feature/workflow becomes obvious, fast.

More on Python “inspect”

Instead of constructing a custom comment like the example above, you can also use Python “inspect” to collect the Python source code comment that precedes the code that is running. This might be useful for projects that have code comments that would be more useful than class/method/file/line number. As the comment is a string, the sky is the limit on what you can set!

Read about the .getcomments() method of “inspect” here: https://docs.python.org/2/library/inspect.html#inspect.getcomments
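For a quick illustration (the function and comment below are hypothetical), inspect.getcomments() returns the block of “#” comments found immediately above an object’s definition:

import inspect

# Fetch all active users; used by the nightly report job.
def find_active_users():
    pass

# Prints the comment line above, including its trailing newline
print inspect.getcomments(find_active_users)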

Aggregation Comments

MongoDB 3.5/3.6 added support for comments in aggregations. This is a great new feature, as aggregations are often heavy operations that would be useful to tie to a line of code as well!

This can be used by adding a “comment” field to your “aggregate” server command, like so:

db.runCommand({
    aggregate: "myCollection",
    pipeline: [
        { $match: { _id: "foo" } }
    ],
    comment: "fooMatch"
})

See more about this new feature in the following MongoDB tickets: SERVER-28128 and DOCS-10020.

Conclusion

Hopefully this blog gives you some ideas on how this feature can be useful in your application. Start adding comments to your application today!

Categories: MySQL