MySQL

Percona XtraBackup 2.3.6 is now available

MySQL Performance Blog - Tue, 2016-11-29 17:55

Percona announces the release of Percona XtraBackup 2.3.6 on November 29, 2016. Downloads are available from our download site or Percona Software Repositories.

Percona XtraBackup enables MySQL backups without blocking user queries, making it ideal for companies with large data sets and mission-critical applications that cannot tolerate long periods of downtime. Offered free as an open source solution, Percona XtraBackup drives down backup costs while providing unique features for MySQL backups.

This release is the current GA (Generally Available) stable release in the 2.3 series.

New Features
  • Percona XtraBackup now supports SHA256 passwords. Using the SHA256 algorithm requires either an SSL-encrypted connection or public key encryption for the password exchange, which is only available when both the client and server are linked with OpenSSL.
  • Percona XtraBackup now supports Command Options for Secure Connections (a connection sketch follows this list).
  • NOTE: Due to xbcrypt format changes, backups encrypted with this Percona XtraBackup version will not be recoverable by older versions.
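
For illustration, a minimal sketch of taking a backup over an encrypted connection. It assumes the new secure-connection options mirror the mysql client's --ssl-* options; the user, password, and certificate paths are placeholders, not part of the release notes.

    # Sketch only: option names assumed to mirror the mysql client's SSL options.
    innobackupex --user=backup --password=PLACEHOLDER \
        --ssl --ssl-ca=/etc/mysql/certs/ca.pem \
        --ssl-cert=/etc/mysql/certs/client-cert.pem \
        --ssl-key=/etc/mysql/certs/client-key.pem \
        /data/backups
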
Bugs Fixed:
  • Fixed intermittent assertion failures that occurred when Percona XtraBackup could not correctly identify the server version. Bug fixed #1568009.
  • The safe slave backup algorithm used delays between retries that were too short, which could cause backups to fail on busy servers. Bug fixed #1624473.
  • Fixed new compilation warnings with GCC 6. Bug fixed #1641612.
  • xbcrypt was not setting the initialization vector (IV) correctly (and thus was effectively not using an IV). This caused the same ciphertext to be generated across different runs for the same message and key; the IV provides the extra randomness that ensures the same ciphertext is not generated across runs. Bug fixed #1643949.
  • A backup would still report success even if XtraBackup failed to write the metadata. Bug fixed #1623210.
  • xbcloud now supports EMC ECS Swift API Authorization requests. Bugs fixed #1638017 and #1638020 (Txomin Barturen).
  • Percona XtraBackup would fail to back up MariaDB 10.2 with an unsupported server version error message. Bug fixed #1602842.

Other bugs fixed: #1639764 and #1639767.

Release notes with all the bugfixes for Percona XtraBackup 2.3.6 are available in our online documentation. Bugs can be reported on the launchpad bug tracker.

Categories: MySQL

Percona Server for MongoDB 3.0.14-1.9 is now available

MySQL Performance Blog - Mon, 2016-11-28 18:18

Percona announces the release of Percona Server for MongoDB 3.0.14-1.9 on November 28, 2016. Download the latest version from the Percona web site or the Percona Software Repositories.

Percona Server for MongoDB 3.0.14-1.9 is an enhanced, open source, fully compatible, highly scalable, zero-maintenance downtime database supporting the MongoDB v3.0 protocol and drivers. Based on MongoDB 3.0.14, it extends MongoDB with MongoRocks and PerconaFT storage engines, as well as features like external authentication and audit logging. Percona Server for MongoDB requires no changes to MongoDB applications or code.

NOTE: PerconaFT has been deprecated and will be removed in the future.

This release includes all changes from MongoDB 3.0.13 and MongoDB 3.0.14. We implemented no additional fixes or features.

You can find the release notes in the official documentation.

Categories: MySQL

Percona Server 5.7.16-10 is now available

MySQL Performance Blog - Mon, 2016-11-28 17:58

Percona announces the GA release of Percona Server 5.7.16-10 on November 28, 2016. Download the latest version from the Percona web site or the Percona Software Repositories.

Based on MySQL 5.7.16, including all the bug fixes in it, Percona Server 5.7.16-10 is the current GA release in the Percona Server 5.7 series. Percona Server is completely open source and free. Find release details in the 5.7.16-10 milestone at Launchpad.

Deprecated Features:
  • The metrics for scalability measurement feature is now deprecated. Users who have installed this plugin but are not using its capability are advised to uninstall it due to known crashing bugs; a removal sketch follows this list.
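
A minimal removal sketch, assuming the plugin is registered under the name scalability_metrics (this name is my assumption; confirm it against INFORMATION_SCHEMA.PLUGINS on your own server first):

    -- Check whether the deprecated plugin is installed on this server.
    SELECT PLUGIN_NAME, PLUGIN_STATUS
      FROM INFORMATION_SCHEMA.PLUGINS
     WHERE PLUGIN_NAME LIKE 'scalability%';

    -- If it is present and unused, remove it (plugin name assumed above).
    UNINSTALL PLUGIN scalability_metrics;
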
Bugs Fixed
  • When a stored routine called an administrative command such as OPTIMIZE TABLE, ANALYZE TABLE, ALTER TABLE, CREATE/DROP INDEX, etc., the effective value of log_slow_sp_statements was overwritten by the value of log_slow_admin_statements. Bug fixed #719368.
  • The server would not start after a crash with innodb_force_recovery set to 6 if the parallel doublewrite file existed. Bug fixed #1629879.
  • The Thread Pool "thread limit reached" and "failed to create thread" messages are now printed on the first occurrence as well. Bug fixed #1636500.
  • INFORMATION_SCHEMA.TABLE_STATISTICS and INFORMATION_SCHEMA.INDEX_STATISTICS tables were not correctly updated for TokuDB. Bug fixed #1629448.

Other bugs fixed: #1633061, #1633430, and #1635184.

The release notes for Percona Server 5.7.16-10 are available in the online documentation. Please report any bugs on the launchpad bug tracker.

Categories: MySQL

Percona Server 5.6.34-79.1 is now available

MySQL Performance Blog - Mon, 2016-11-28 17:21

Percona announces the release of Percona Server 5.6.34-79.1 on November 28, 2016. Download the latest version from the Percona web site or the Percona Software Repositories.

Based on MySQL 5.6.34, including all the bug fixes in it, Percona Server 5.6.34-79.1 is the current GA release in the Percona Server 5.6 series. Percona Server is open-source and free – this is the latest release of our enhanced, drop-in replacement for MySQL. Complete details of this release are available in the 5.6.34-79.1 milestone on Launchpad.

Deprecated features:
  • The metrics for scalability measurement feature is now deprecated. Users who have installed this plugin but are not using its capability are advised to uninstall it due to known crashing bugs.
Bugs fixed:
  • When a stored routine called an administrative command such as OPTIMIZE TABLE, ANALYZE TABLE, ALTER TABLE, CREATE/DROP INDEX, etc., the effective value of log_slow_sp_statements was overwritten by the value of log_slow_admin_statements. Bug fixed #719368.
  • The Thread Pool "thread limit reached" and "failed to create thread" messages are now printed on the first occurrence as well. Bug fixed #1636500.
  • INFORMATION_SCHEMA.TABLE_STATISTICS and INFORMATION_SCHEMA.INDEX_STATISTICS tables were not correctly updated for TokuDB. Bug fixed #1629448.

Other bugs fixed: #1633061, #1633430, and #1635184.

Release notes for Percona Server 5.6.34-79.1 are available in the online documentation. Please report any bugs on the launchpad bug tracker.

Categories: MySQL

A Great Mid-Priced Stereo System

Xaprb, home of innotop - Thu, 2016-11-24 23:21

You probably know that I like to be pretty minimalistic, and don’t accumulate a lot of “stuff” in my life. Yet for the few material things I value, I try to find the sweet spot: quality above average, price no higher than needed.

Music is one of the things I care a lot about: I have bought thousands of CDs. But I also value when my music sounds as good as possible. I treated myself to an upgraded stereo system so I’d enjoy better audio quality. Here’s my current system.

It’s quite simple: it’s designed to listen to stereo audio with as little processing and as few components as possible, but still be convenient. Here’s the full list of components and widgets:

I bought a couple of the items from Amazon Warehouse Deals and spent less than $700 on the whole setup. I cannibalized an old rubber doormat, cut in half, for padding under the speaker stands.

How does it sound? It’s entry-level audiophile, well above average performance on a small budget. For my purposes, it’s great: it fills a medium-sized room with accurate, uncolored sound.

The speakers are by far the most important part of the setup. Read anywhere online and you’ll see all the audiophiles raving about the ELAC speakers. They outperform speakers many times the cost. There’s a newer generation of ELACs that are supposed to be even better, for a small uptick in price. A reasonable upgrade for this setup would also be an ELAC subwoofer.

The amplifier is a now-discontinued generation from Cambridge Audio. Cambridge Audio is an audiophile brand with a near-legendary reputation. I wanted it to be simple (no fancy processing), slightly above bare-bones, but have plenty of horsepower (45 watts per channel) to drive through dynamic peaks and power good-quality components. There are only a few controls, including a bypass switch to remove the equalizer and so on from the circuit. Getting a discontinued model saved me a lot of money. If I wanted to step this up and spend a few hundred dollars more, ELAC's amplifiers are a great combination of price and value.

The speaker stands are serviceable, nothing fancy, but they place the speakers at the right height and they don’t cause issues with sound quality. The rubber doormat keeps them stable and vibration-free.

The rest of the components are things I already had: some headphone-to-coax adapters, 14-gauge speaker wire, and so on. The UPS is important to protect the components from electrical damage.

I mostly listen to my music from an iPhone or iPad over AirPlay via an Apple AirPort Express. Despite the theoretical problems with this (lossy compression, inferior components), the reality is lossy compression artifacts at 256kbps AAC or 320kbps MP3 are minor compared to the quality degradation of nearly any speakers you can buy.

I listen this way because it’s extremely convenient, and convenience enables more listening, which is where I get my enjoyment. (I used to design and build speaker enclosures myself. I learned to appreciate fine audio from a relative who’s made his living designing custom systems for wealthy people. I’ve listened to music on systems costing hundreds of thousands of dollars. I don’t think the high price makes it more enjoyable.)

The result sounds amazing overall. Yes: if I’d spent thousands of dollars more on my stereo, that one passage in that one song could have sounded better. But! I built a system that sounds incredible for a few hundred dollars. If I want to hear my music on a better system, I can visit someone who owns one.

Pro tip: if, like me, you listen via AirPlay for the convenience, you can Option-Click on your speaker icon in your Mac’s menu bar and select an AirPlay device for audio output.

Want to listen to some music that sounds great on a good system? Here’s a Spotify playlist just for you. Make sure you toggle high-quality playback in your Spotify preferences. Enjoy!

Categories: MySQL

6 Steps to Better Security and Privacy

Xaprb, home of innotop - Thu, 2016-11-24 15:52

I wrote previously about securing your digital life. Technology and digital threats are advancing so fast that we’re almost inevitably all going to be attacked in some way. Here are a few more steps I’ve taken recently.

To repeat some of the recommendations from my previous post, you should absolutely use the following:

  • 2-factor authentication everywhere possible (email, social media, etc)
  • a password safe like 1Password so you never try to remember a password
  • long, strong, randomly generated, unique, never-reused passwords
  • full-device/full-disk encryption on hard drives, laptops, tablets, phones etc

Although I don’t think it is realistic to think we can avert a catastrophic attack forever on a personal or global scale (think “digital 9-11”), I think mass attacks against easily identifiable vulnerable populations are much more common and damaging, so here are some additional steps to avoid being the “tallest poppy.”

1. Use a VPN (It’s Easy)

When you access the internet or use an app, you’re opening a series of data connections between your device and another computer somewhere. Many common vulnerabilities are at or near the start of this chain: from your device to the WiFi router, from the router to the DSL device down the street, etc. A virtual private network (VPN) creates an encrypted tunnel between your device and at least part of the chain.

A VPN is a big step up in security and privacy. For example:

  • A VPN can prevent people snooping on what you’re doing if you’re connected to an insecure WiFi point at the airport or coffee shop.
  • A VPN can prevent your internet service provider from logging and inspecting your browsing or other usage—some internet service providers even modify what you browse, injecting ads, tracking, and other stuff into websites!
  • If someone hacks into your home router or cable modem, they won’t be able to intercept VPN-tunneled traffic before it leaves the house.

These are really legitimate things to worry about: millions of cheap, old, unsecured, underscrutinized devices such as routers and modems are sitting exposed to the internet, and tons of them have known security holes.

VPNs sound obscure and hard to set up, but they’re not. You can get a subscription to a VPN service easily and cheaply. I use Private Internet Access and there are many others. Just search and read ratings from a few objective review sites.

A VPN service is also flexible. You can use a standard VPN client to connect; you don’t have to use the one they probably provide for you. I use Tunnelblick to connect to Private Internet Access, for example, because Tunnelblick is open source so I trust it more, and I already use it for other VPNs I connect to.

2. Create a Family Password

I’ve become increasingly convinced that we’re on the brink of widespread sophisticated automated telephone phishing attacks. Consider the following entirely realistic attack against your family’s personal information and finances:

  1. A robot telephones your parents and listens to the sound of their voice, then hangs up.
  2. The robot calls you and imitates the sound of your parent’s voice, engaging you in a simple conversation. “Baron, it’s mom. I am at the hospital. I need your social security number, quick.”

What would you do? Most people would blurt their SSN and ask questions later. You need a previously-agreed way to validate that it’s really your mother on the phone. There are a lot of other scenarios you can imagine where you’d give the robot some really sensitive information.

Does this attack sound far-fetched? It’s not. In the last year I’ve gotten amazingly sophisticated robot calls. They’ve been for relatively innocuous purposes (“can I count on you to donate to the veterans fund?”) but they illustrate how adept computers are at carrying on pretty convincing conversations with humans. I’ve been fascinated at how quickly they’ve gotten good at this. And a sophisticated attacker could easily ask me for some information that seems harmless, call someone else and ask for more, put the pieces together quickly to form the whole puzzle, then a human could use the information to call the bank and convince the agent that they’re me.

One way to avoid this is to agree on a family password or other cue. Many families have passwords for unexpected situations such as sending a coworker to pick up the kids from school when there’s an emergency, for example.

I’m not sure I have the right answer for this yet. Please comment if you have a suggestion. We need a technique that works for humans under stressful situations, but doesn’t fall prey to robo calls trying to create those situations and trick the humans into bypassing the system or revealing the secret.

3. Use Signal or WhatsApp

Encryption provides a host of benefits. We really, really, really need to normalize encryption as a global society. Plaintext communication needs to become weird, and encrypted communication needs to become easy and expected.

Right now it’s the reverse, and people who use encryption for some types of communication are outliers. We need herd immunity.

WhatsApp and several other messenger apps can use end-to-end encryption for text messages and the like. Sometimes by default, sometimes as a configuration option. Signal is a popular option; I like that it’s open-source so it’s verifiable.

4. Use Keybase

One way of normalizing encryption, and making yourself easily and publicly reachable for private communications, is to join Keybase.io. A simplified way to explain what Keybase does:

  1. It creates trustworthy identification: I am who you think I am, so you can trust that you’re communicating with me, not someone impersonating me.
  2. It provides a trustworthy way to encrypt that communication: if you encrypt something meant for me, only I can decrypt it.
  3. It ensures that you receive exactly what I communicate to you, without modification.

Keybase is very popular among engineers and techies, but we need more. The more people who join Keybase, the closer we are to critical mass and adoption thresholds; the closer we are to practical herd immunity.

5. Use HTTPS On Your Blog

If you have a personal blog or website, please use HTTPS (SSL) for it. There are several ways to do this. I use Netlify to host this blog, so SSL is provided for me. You can also use Let’s Encrypt. Setting up SSL on a personal site used to be hard. It’s now so easy that nobody should use plain-HTTP anymore.
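
As a sketch, if your site runs on a server you control behind nginx, the Let's Encrypt certbot client can issue and install a certificate with a couple of commands. The domain below is a placeholder, and how you install certbot depends on your distribution:

    # Assumes nginx and the certbot client are already installed; example.com is a placeholder.
    sudo certbot --nginx -d example.com -d www.example.com
    # Renewal is non-interactive, so it can run from cron:
    sudo certbot renew --quiet
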

Why? Again, encryption needs to become normalized and expected everywhere. Your website’s users deserve it. Even if they’re just reading a blog, having an HTTPS connection will prevent someone from snooping or modifying the information that is exchanged between their device and the blog server.

Google has a nice article about why every site should use HTTPS.

6. Don’t Use Fingerprint Unlock

Fingerprint readers on computers, phones, and tablets are useful. But it’s risky to unlock the device itself with them. (They are still good for things like unlocking apps that you’d otherwise have to unlock with a long, hard-to-type password.)

The problem is that you can’t change your fingerprint. You should never use something unchangeable for a password, especially on a device that has a lot of sensitive data and access to services. And there are validated cases of fingerprint-locked devices being unlocked by law enforcement, malicious people, etc.

The US government's attempted power grab during the 2016 San Bernardino shooter case, where they tried to use the fear surrounding the case to expand their powers of search and seizure, should give every thoughtful person serious pause.

Use a passcode, and configure your device to reset itself after 10 failed attempts.

Conclusions

Please share this post with friends and family. And please post comments below and let me know what you think.

Photo Credit

Categories: MySQL

Percona Server for MySQL 5.5.53-38.5 is now available

MySQL Performance Blog - Wed, 2016-11-23 18:59

Percona announces the release of Percona Server for MySQL 5.5.53-38.5 on November 23, 2016. Based on MySQL 5.5.53, including all the bug fixes in it, Percona Server for MySQL 5.5.53-38.5 is now the current stable release in the 5.5 series.

Percona Server for MySQL is open-source and free. You can find release details in the 5.5.53-38.5 milestone on Launchpad. Downloads are available here and from the Percona Software Repositories.

The metrics for scalability measurement feature is again built by default, but it is deprecated. Users who have installed this plugin but are not using its capability are advised to uninstall the plugin due to known crashing bugs. This feature was accidentally removed (instead of deprecated) in the previous release, which could cause issues for users that had it enabled.

Find the release notes for Percona Server for MySQL 5.5.53-38.5 in our online documentation. Report bugs on the launchpad bug tracker.

Categories: MySQL

Webinar Q/A: MySQL High Availability with Percona XtraDB Cluster 5.7

MySQL Performance Blog - Tue, 2016-11-22 20:34

In this blog post, I provide answers to the questions and queries that some of you raised during the webinar on November 17th.

I would like to thank everyone in the audience who attended the talk on November 17, 2016. You can also check out the recording and slides here.

Q. How is storage distribution done across the node?

A. Each node has independent storage and other resources. There is no sharing of resources; only the write-sets are replicated.

Q. If write-set propagation fails in some manner is there any retry mechanism?

A. Write-sets are written to the group channel, and the originating node waits for an acknowledgment from all nodes in the cluster. If a node fails to respond, it may lose its cluster membership. Each node must apply all write-sets, and only in the given order.

Q. Normally, we point to only one write node. In Percona XtraDB Cluster 5.7, can we point to two write nodes, balanced? Or should the solution be ProxySQL?

A. Percona XtraDB Cluster (PXC) is multi-master, so you can execute writes on multiple nodes (this is possible even with 5.6). ProxySQL will help you load-balance your traffic, but the ability to write to any node is inherent to PXC.

Q. Which service does a joining node have to call to get cluster membership? Is there some kind of registry service?

A. There is no special registry service. This is transparent to the end-user and is handled as part of gcomm communication layer.

Q. Would it be possible to get more information about setting up ProxySQL? We are currently using HAProxy but would like a more aware balancer.

A. These articles should help:

Q. Is there a recommended setup for Cluster (White Paper)? I did hear a lot of conflict issues between nodes. So I would like to see if there is a recommended setup.

A. There is no single recommended setup; there are a lot of blogs covering different use cases. The simplest is a three-node cluster on a LAN. Conflicts generally happen when users tend to update the same data through multiple nodes, so a disjoint workload distribution will help avoid them. That said, if conflicts are an inherent part of the application or workload, Percona XtraDB Cluster (PXC) is well armed to handle them.

Q. What is best way to figure out timeouts for geo clusters?

A. Study the latency and ensure that the timeouts are greater than the latency.

Q. Let's say we are running Percona XtraDB Cluster version 5.6 with a two-node cluster. Can I join a new node running the latest version, Percona XtraDB Cluster 5.7?

A. This scenario is possible, since Percona XtraDB Cluster (PXC) supports rolling upgrades, but a new node demanding SST from a 5.6 node will surely not work. Also, this should be a temporary state with a plan for a full upgrade, not something you want to keep running.

Q. Currently I am using Percona XtraDB Cluster 5.6, and mostly I am facing a deadlock situation. When an insert query is running on a big table, Percona tries to sync with another node, and at that time no DML query can be executed. At that point I need to shut down the other node, and then query execution is fine; afterwards I need to start the other nodes one by one. I have even changed many Galera/Percona wsrep_xx configuration options, but it did not help. Is this kind of issue solved in Percona XtraDB Cluster 5.7?

A. I am not sure I understood the complete setup, but let me try to summarize my understanding: you have DML running on node-1 that is replicated to node-2, and node-2's workload is trying to touch the same big table that is receiving the replicated write-sets. The local transaction may be aborted, as a replicated transaction always takes priority over a locally running transaction. There shouldn't be a need to shut down any of the nodes. If you still face this problem, you can file a detailed report on Launchpad or the forum and we can discuss what is going wrong.

Q. I need to build a DR platform. Which replication will be suitable for this? Do I need to upgrade to Percona XtraDB Cluster 5.7 on the DR side, or is a replication manager required?

A. For DR you can either use an extended cluster, so that the DR site gets write-sets immediately, or set up a new cluster and enable cluster-to-cluster replication using MySQL master-master asynchronous replication (given that DR is one-way, master-slave should also work). You don't need to upgrade, but it is better to use a consistent, up-to-date version for all nodes; in particular, a master-slave version mix-and-match may have compatibility issues.

Q. What are the major differences/benefits between Percona XtraDB Cluster 5.7 and MariaDB Cluster with Galera ?

A. Percona XtraDB Cluster (PXC) 5.7 is GA. MariaDB 10.2 is proposed to be GA by December 2016. Besides this, PXC is fully compatible with Percona Server and uses the XtraDB engine, and there are some small functional/usage and stability differences.

Q. How much time can a node be out of a cluster and still rejoin by applying write-sets? How is write-set retention managed?

A. The time a node can be offline without needing SST depends on two factors: the rate (and size) of replicated transactions, and the size of the Galera cache (gcache) that stores these write-sets. If you think you need a longer offline window, you should size the Galera cache accordingly.
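
As a rough sizing sketch (my own addition, not part of the original answer): sample the wsrep byte counters over a known interval, derive the replication rate, and multiply by the downtime you want to tolerate.

    -- Sample these counters, wait a known interval (say 60 seconds), then sample again.
    -- (delta bytes / interval seconds) gives the replication rate in bytes/sec;
    -- rate * desired_offline_seconds approximates the gcache.size you need.
    SHOW GLOBAL STATUS LIKE 'wsrep_replicated_bytes';
    SHOW GLOBAL STATUS LIKE 'wsrep_received_bytes';
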

Q. Can we have a sample config file for geo-clusters?

A. We will try to come up with one in due course through an upcoming blog post. In the meantime, you can look at existing posts on the Percona Database Performance Blog.

Q. What are the limits for max rows and max transaction size in Percona XtraDB Cluster (PXC) 5.7, especially for batch data loads across multi-region cluster nodes?

A. wsrep_max_ws_rows (default: 0, meaning no limit; maximum: 1048576) and wsrep_max_ws_size (default: 2G; range: 1024 bytes to 2G).
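
A quick way to check (and, with care, adjust) these limits on a running node; the SET GLOBAL values below are purely illustrative, and larger write-sets put extra pressure on the cluster:

    -- Current limits on this node:
    SHOW GLOBAL VARIABLES LIKE 'wsrep_max_ws%';

    -- Illustrative only (assumes you really need larger write-sets):
    SET GLOBAL wsrep_max_ws_rows = 1048576;
    SET GLOBAL wsrep_max_ws_size = 2147483647;  -- 2G, the documented upper bound
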

Q: Does Percona XtraDB Cluster (PXC) support MySQL’s GTIDs?

A. Yes, but for its own replication Percona XtraDB Cluster (PXC) uses its own GTIDs. This blog post will help clear up the confusion.

Q. How does Percona XtraDB Cluster (PXC) compare to MySQL’s Group Replication?

A. Both are trying to solve the same problem, except that Percona XtraDB Cluster (PXC) is mature and has been in the market for quite some time, while Group Replication is still being built.

Q. Does Percona XtraDB Cluster (PXC) have size limitations? I recently tried to set up a 2TB PXC cluster; however, during load tests there were a few instances where one node got out of sync. The server did a full copy of the data, but could not complete it because the load tests kept filling up the gcache.

A. There is no such known limitation. Generally, if a node's receive queue fills up, it will emit a FLOW CONTROL signal, so the queue normally stays small enough not to fill up the gcache. If you still have the log files, you can share them through Launchpad or the forum and we will try to look at them.

Q. How do you perform a major version upgrade? Per MySQL's documentation, you cannot replicate from a major version back to the previous major version, but it is fine to replicate from one major version to the next. So how would you do this in the cluster?

A. As per MySQL, you may face issues if you try to replicate from a lower version master (5.6) to a higher version slave (5.7), but it is not blocked; some of the semantics may be different. Percona XtraDB Cluster (PXC) write-sets are a different matter, though: they carry binlog events, and this write-set format has not changed in 5.7.

Q. Does Galera set a max number of nodes that can be part of the cluster?

A. No such realistic limitation.

Q. Are there docker images with this configured? Dockerhub or something?

A. This should help.

Q. What is the maximum latency that would be supported on the LAN before you would say that running a Percona XtraDB Cluster is not a good idea?

A. I guess this is configurable through the timeouts, so there is no recommended latency threshold for a LAN. The lower, the better.

Q. When you start a cluster, you bootstrap Node 1, then start Node 2 and Node 3. If you restart Node 1, it rejoins the cluster but not in a bootstrap state, which does not matter because it joins a live cluster. If my understanding is correct, bootstrap only matters for the first node starting. Is that correct? What would happen if Node 1 restarted with the bootstrap option: would it force the other nodes to sync against it, or would it join the running cluster?

A. When you start node-1 for the first time, it will create a new cluster, and node-2 and node-3 will join that existing cluster. Depending on how node-1 is restarted, it can either join the existing cluster or create another, independent cluster. The recommended way is to use a valid value of wsrep_cluster_address on all nodes and pass the extra parameter --wsrep_new_cluster only to the bootstrap node. If you later restart this node, avoid passing this parameter; the node will then try to join the existing cluster.
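
A minimal sketch of that recommendation, assuming a three-node Percona XtraDB Cluster 5.7 on systemd; the addresses are placeholders:

    # /etc/my.cnf on every node (placeholder addresses):
    #   [mysqld]
    #   wsrep_cluster_address = gcomm://10.0.0.1,10.0.0.2,10.0.0.3

    # First node, first start only (bootstraps a new cluster):
    systemctl start mysql@bootstrap.service
    # All other nodes, and any later restart of the first node:
    systemctl start mysql
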

Q. What is the overhead of running Percona Monitoring and Management (PMM)?

A. Percona Monitoring and Management (PMM) installs an agent on the node to collect a lot of statistics. From the Percona XtraDB Cluster (PXC) perspective, it only runs status queries, so it is pretty lightweight for PXC.

Q. Is it easy (any procedure) to move from codership galera to Percona XtraDB Cluster (PXC)?

A. I don't think there is a blog post about it, but they are fully compatible, so moving should be easy. I will find out if there is a set process for this.

Q. Where is the documentation for Cluster Safe Mode and other new features discussed here?

A. See the documentation for pxc_strict_mode. For PFS, you can check this out. ProxySQL and Percona Monitoring and Management (PMM) have blog posts too.

Q. Are there any integrity issues where a client believes a node is up while that node has lost the cluster?

A. No known issue.

Q. Is there any limit to running a huge number of databases? Say, several million?

A. No known issue.

Q. How does the performance of ProxySQL compare with HAProxy?

A. You can check this out.

Q. We use Nagios for monitoring, will a plug-in be added for monitoring the cluster, or will it be only Percona Monitoring and Management (PMM)?

A. Check this out.

Q. “Cross data center replication”. We have two data centers that have a ping latency of 2ms (consistent) and I would like to replicate between the two for DR (disaster recovery) purposes.

A. A 2 ms latency between two DCs on a consistent network sounds pretty good. Just tune the timeouts and things will work.

Q. Do you guys have sample config files for a quick spin-off of a 3-node cluster?

A. This should help.

Q. I see that there are added features like PAM authentication and thread pool, which are given for free in Percona. Can you elaborate on them?

A. Percona XtraDB Cluster (PXC) is PS compatible. So any feature that is present in PS will be part of Percona XtraDB Cluster (PXC).

Q. In the example that you showed, you had a 6-node cluster where 3 nodes were in Site A and 3 were in Site B. If the WAN link goes down, how does the cluster determine which data set is the master set once the WAN link comes back up after a few hours?

A. In the example I used two DCs. The recommendation is to use three DCs to avoid split-brain. If you have six nodes in two DCs and the WAN link goes down, it will create a split-brain situation and no node will accept workload unless the user sets weights to form a quorum or re-bootstraps the primary component.

I hope I was able to answer most of the questions and queries. If you have follow-up questions, please post them on the forum.

Categories: MySQL

Remembering Alan And Harry

Xaprb, home of innotop - Tue, 2016-11-22 00:39

Harry Weller died suddenly this week. Harry was one of the greatest venture capital investors in history. He led NEA’s investment in my company, VividCortex. Harry was an advocate and mentor to me, but he was more: in both life and death, he reminded me of Alan Rimm-Kaufman, a boss and mentor who died a handful of years ago.

Harry Weller

My first contact with Harry was when he reached out to me in early 2013. I was fundraising for VividCortex, and many potential investors were highly skeptical. It was discouraging at times, because some of the more deeply technical investors, those who’d run databases themselves, were more skeptical than those without domain knowledge. There were lots of helpful and supportive venture capitalists, but there was also meeting after meeting after meeting of “I don’t see this solving a problem. This isn’t a business. This is niche at best. Nobody has pain that this addresses.” Some were polite about it, some barely concealed their feelings.

In just a few words, Harry proved that he got it in a way few others did. He knew there was a need and opportunity, even though the timing wasn’t right. Over the following couple of years, we stayed in touch frequently. He never wrote more than a sentence or two, but he always did something helpful: an introduction, an encouragement, something. I sent him occasional updates that would fit onto an iPhone screen.

We brought the company through an existential crisis, then reached cash-flow positive and the business’s fundamental strength became obvious in the numbers. At each milestone, Harry wrote me a few words. When we won a huge deal, for example, Harry just wrote “can u call me” and his phone number. When I got him on the phone he celebrated our success. He cheered me more than I cheered myself.

One day his message was simply “is there a dollar amount of investment that gets me X% of the company with X% postmoney option pool? Hello from a ship in the Atlantic.” And that was how it started. I had a pitch deck, I had financials, a roadmap, a slide full of logos in a quadrant with us at the top right corner, all that jazz. But he barely glanced at it. We found a time between two board meetings, and I drove to his house and waited to meet him in person for the first time. He arrived ravenously hungry, and started pitching me on why we should work together while he made a smoothie, which he shared with me.

He was like a kid who couldn’t stand still. He paced restlessly, interrupting himself midsentence to call other partners at NEA and tell them how big this could be, then hanging up and resuming his thoughts. He told me to stop explaining why this was a good idea—he didn’t need convincing. He pushed me to think even bigger. It wasn’t a question of whether we’d make it. Hah! His biggest fear was that we wouldn’t execute fast enough to outrun the competition. He asked a few questions, did some mental math that was startlingly accurate, summed up the business in a few sentences, and told me if we didn’t fund up and sweep the market, our raw efficiency would become obvious to other investors and they’d fund someone to outgun us. “I’m telling you, I’ve seen this again and again,” he said. “You’ve built a great business, and you’re not going to be able to hide it. In enterprise infrastructure software, my one worry is that you can’t stay ahead of the pack that will come after you. And make no mistake. They will come after you.”

I wasn’t even officially fundraising, but I took his investment and we hit the ground running. Harry frequently called me out of the blue. When I called him I usually got voicemail, then a call back a short while later. I have no idea how he did it. He was on 14 boards and leading the East Coast practice at the world’s largest venture firm, among 99 other things. How can one person have so much bandwidth?

I didn’t have enough time with Harry to get to know him well, not even a year, and only a couple of hours all together. We weren’t close on a personal level. I was still adjusting to some of the ways he worked, still forming opinions about how he operated. He was teaching me to lead more strongly, and the way he did it was brilliantly simple and effective: he told me what the job needed and that if I didn’t do it, a new CEO would. His point was unmistakeable: there is no room in a fast-moving company to leave authority or responsibility vague or let it be owned by the group. It had the effect of pressuring everyone to meet a higher level of performance or leave. I’m still trying to learn how to do that myself.

If I tell the unvarnished truth, I was sometimes nervous of Harry’s motives, because I’d never worked with someone who could instinctively sense an area where I was slightly passive, and feint at it to force me to engage more strongly and produce better results. But in the last couple months, I honestly started to believe he was doing it just to make me be my best.

Then I got the call. Harry was dead. I was stunned.

Alan Rimm-Kaufman

As I said, Harry and I weren’t close, but my reaction was still very emotional, because Harry isn’t the first person to believe in me and create an amazing opportunity before dying unexpectedly. Before Harry, there was Alan Rimm-Kaufman, who shaped my career tremendously.

There were so many parallels and similarities between Harry and Alan, and my personal experience with the two of them.

I joined Alan in 2006 at the Rimm-Kaufman Group, a bootstrapped startup he founded in Charlottesville, Virginia. Alan had an infectious love of life and was unstoppably energetic. He’d enter a room with his face lit up, bubbling with an idea he couldn’t resist sharing. He was unable to stand still. He hopped around like a soccer goalie.

Alan would start a thought and someone else would venture a related idea. Alan would pounce on it. “Yes!” he’d say. With Alan, just as with Harry, you had zero doubt that he believed in you. Alan, also like Harry, was a man of towering genius. I sometimes wondered if the ideas were all his anyway, and he just withheld parts of them, letting us complete the thoughts he’d already had so we’d believe in ourselves too.

One day I said I might want to write a book someday. Alan didn’t even blink. “Oh yeah,” he said, as he bounced up and down on my yoga ball. “Oh, YEAH! You’ll be published in less than 18 months.” It wasn’t a question. It was foretold. When the opportunity to coauthor High Performance MySQL appeared a short time later, Alan switched me to half-time work with full-time benefits and gave me a private office to work in. He told me to shut the door in the morning and write until lunchtime. I don’t remember how long this went on—perhaps a year.

Alan also sent me to conferences, where Google tried to recruit me. I didn’t want to move to Silicon Valley. Alan thought I was crazy and sternly told me to go interview with them. The day before I left, he handed me a letter full of reasons I’d be better off not taking the job. It was classic Alan: ten hilarious statistics like the number of poisonous snake species in California versus Virginia. I didn’t take a job at Google, but I got to know many wonderful people there.

Then I broke Alan’s heart anyway: I left to work at Percona. When I walked into his office he knew. I remember him doing his best to be happy for me.

While I got absorbed into my new life, working all hours of the day and night, Alan’s leukemia returned. I later found out it was a big part of why he was so energetic. After it went into remission the first time, he’d attacked life and all of the crazy dreams he had for himself and his family with renewed vigor, no holds barred. This time, the disease was not to be beaten. It was looking hopeful, and he was feeling good, he wrote me in emails. But then, suddenly he was gone in a matter of days.

When I re-read my emails, I realized Alan was asking me to come see him before he died, but I had been too busy to pay attention.

I was devastated at Alan’s death. It was the first time anyone I cared about had died, and Alan had been like a father to me. But far worse, I was consumed with guilt at being too selfish to go see him, wracked with regret at taking him for granted. It took me a long time to forgive myself. Eventually, I realized that the regret itself was a waste. It was just a form of self-hatred so I could feel better for punishing myself. When I recovered enough to see this, I realized Alan would have wanted nothing more than for me to stop wallowing in guilt and live fully as he had. That was part of the reason I left my job to found VividCortex.

Alan and Harry’s Legacy

When Harry died, a flood of thoughts and feelings came back: all of the things I’d processed after Alan’s death. I was reminded of what came from that, my own growth, how the experience of knowing Alan changed me. There are so many parallels in the brief time I had with Harry.

Alan and Harry were intensely focused on all of life, and accomplished extraordinary things. They live on through family, work, religious community, and many other ways including me. I am part of their legacy, and I try to honor them by doing what they did: greet each day with joy, optimism, and energy. Put the pedal to the metal and don’t even think about the brakes.

I was looking forward to seeing Harry soon, and telling him what’s taking shape right now. But I don’t need to, because I already know what he’d do. He’d cheer me, because it’s risky and gutsy and crazy and of course it will work! He’d tell me damn the torpedoes and full speed ahead, he’d tell me to run, run, run! and do it before anyone else does. I know Alan would say exactly the same thing.

Words can never express my gratitude to Alan and Harry. Words are feeble. You know what’s sufficient? Actions, and actions alone.

PS: To the countless others who’ve also believed, supported, helped, and encouraged me: thank you, it means more than I can say, but watch me, and I’ll put my gratitude into action; and someday I hope I can do as you’ve done, too.

Categories: MySQL

Excel Hacks To Ignore Missing Data

Xaprb, home of innotop - Sun, 2016-11-20 16:11

I’ve done quite a bit of work with Excel over the last few years, and I’ve found a couple of recurring problems when there’s missing or error data in ranges. I’ve had to work around this enough times that I thought it was worth sharing the solutions I’ve used.

Using AGGREGATE() Instead Of MIN()

Aggregate functions such as MIN() and SUM() often produce errors when they’re defined over a range with missing or bogus data. Although this is good, I often want to create a spreadsheet that operates over a fixed range, with computations based on that range. And I want to avoid complicated IF() or error-handling in long formulas, which make them hard to maintain and understand. In some cases, I’ve found that there’s really no way to solve the problems through that approach anyway. For example, if I use IFERROR() and output a 0 when there’s an error, the next set of cells that are derived from that output will divide-by-zero and it’s just turtles all the way down.

With the AGGREGATE() function, I have found a way to solve some of these issues. In return for using a function whose meaning and syntax is a little confusing, I can centralize all the mess into one place instead of smearing it all over the entire spreadsheet.

Here’s an example. I’ve entered some numbers in a range, and then filled the rest of the range with =LOG(0) to create an error. Then I’ve tried to take the MIN() of B6:B15 in cell C6. This results in an error, of course.

You can solve this with the AGGREGATE() function. You have to look up the documentation for the syntax and meaning of each set of options, but in short, you can specify an aggregate operation, extra arguments, and what to do if there are errors. Here’s how to get the MIN() result I want:
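
The screenshot isn't reproduced here, but the formula is likely along these lines: in AGGREGATE(), function number 5 is MIN and option 6 tells it to ignore error values, so the errors from =LOG(0) are simply skipped.

    =AGGREGATE(5, 6, B6:B15)
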

Using OFFSET() To Create Ranges

Sometimes I haven’t been able to work around all the challenges using AGGREGATE() and I’ve needed to specify a variable-sized range. This is possible with the OFFSET() function.

For example, suppose I want to do linear regression using the LINEST() function, but part of the range might contain errors or non-values. After typing in the formula in cell D6 and using the “array formula” command (Control-Shift-Enter), I get the following:

OFFSET() can help. It defines a range relative to a specified cell. You specify where the range begins, and how many rows and columns it covers. As long as the range contains a contiguous set of values (all the non-values are at the end), OFFSET() together with COUNT() can define a variable-sized range that adapts to the valid set of cells. COUNT() counts the number of valid values in the range, so you can use it as the argument to OFFSET() to set the size of the resulting range.

Here’s the solution in action on my example spreadsheet. I’m using OFFSET() to specify the first two parameters to LINEST(), defining the ranges C6:C10 and B6:B10.
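
Again the screenshot is omitted, but assuming the full candidate ranges are B6:B15 and C6:C15 (with the invalid values at the end), the array formula would look something like this, entered with Control-Shift-Enter:

    =LINEST(OFFSET(C6,0,0,COUNT(C6:C15),1), OFFSET(B6,0,0,COUNT(C6:C15),1))
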

The caveat with OFFSET() is that, unlike explicitly setting a range with cell identifiers, Excel won't adjust the range when you do things like deleting rows from the middle of it.

Conclusion

Both of these workarounds introduce their own complexities: with AGGREGATE() you’re adding a more obscure syntax, and with OFFSET() you’re circumventing some of Excel’s features to help maintain and adapt your spreadsheet. In both cases, though, I’ve found that in some circumstances the net outcome is a simpler, more comprehensible spreadsheet overall.

Pic Credit (which has nothing to do with anything, but it’s pretty)

Categories: MySQL

Percona Server 5.5.53-38.4 is now available

MySQL Performance Blog - Fri, 2016-11-18 21:33

Percona announces the release of Percona Server 5.5.53-38.4 on November 18, 2016. Based on MySQL 5.5.53, including all the bug fixes in it, Percona Server 5.5.53-38.4 is now the current stable release in the 5.5 series.

Percona Server is open-source and free. You can find release details in the 5.5.53-38.4 milestone on Launchpad. Downloads are available here and from the Percona Software Repositories.

Removed Features:
  • The metrics for scalability measurement feature was removed.
Bugs Fixed:
  • When a stored routine called an "administrative" command such as OPTIMIZE TABLE, ANALYZE TABLE, ALTER TABLE, CREATE/DROP INDEX, etc., the effective value of log_slow_sp_statements was overwritten by the value of log_slow_admin_statements. Bug fixed #719368.
  • The Thread Pool "thread limit reached" and "failed to create thread" messages are now printed on the first occurrence as well. Bug fixed #1636500.

Other bugs fixed: #1612076, #1633061, #1633430, and #1635184.

Find the release notes for Percona Server 5.5.53-38.4 in our online documentation. Report bugs on the launchpad bug tracker.

Categories: MySQL

WiredTiger B-Tree versus WiredTiger In-Memory: Q & A

MySQL Performance Blog - Fri, 2016-11-18 18:15

In this blog, I will provide answers to the Q & A for the WiredTiger B-Tree versus WiredTiger In-Memory webinar.

First, I want to thank everybody for attending the October 13 webinar. The recording and slides for the webinar are available here. Below is the list of questions that I wasn't able to fully answer during the webinar, with responses:

Q: Does the In-Memory storage engine have an oplog? Do we need more RAM if the oplog is set to be bigger?
Q: So we turn off the oplog?
Q: How is data replicated without oplog? Do you confound it with journaling?

A: Percona Memory Engine for MongoDB can be started with or without an oplog, depending on whether it is started as part of a replica set or standalone (you cannot explicitly turn the oplog on or off). If it is created, the oplog is stored in memory as well, and you can still control its size with the --oplogSize option.

The recovery log (journal) is disabled for the Percona Memory Engine.
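
For illustration, two hypothetical mongod invocations; the engine name and options shown are standard mongod ones, while the data path, replica set name, and oplog size are placeholders:

    # Standalone: no oplog is created.
    mongod --storageEngine inMemory --dbpath /data/db

    # Replica set member: the oplog is created (in memory, too) and can be capped.
    mongod --storageEngine inMemory --dbpath /data/db --replSet rs0 --oplogSize 2048
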

Q: After a crash of the In-Memory storage engine, does it need a complete initial sync? Means, cloning all databases?

A: Yes.

Q: WiredTiger reserves 50% of RAM for de-compression. Is this also true for the In-Memory engine?

A: Where did you find this information? Please point to its location in the docs in the comments section below. I asked Percona developers to confirm or deny this for the Percona Memory Engine, and this was their answer:

WT decompresses data block-wise, and each block is of some reasonable size (usual numbers are a couple of megabytes, let's say). The decompressor knows the size of the uncompressed data by reading this information from the compressed block (it is stored there during compression). It creates an extra buffer of the uncompressed block size, decompresses the data into this buffer, then uses the decompressed buffer and frees the initial one. So there is no reserve of memory for either compression or decompression, and no docs stating that.

Please note that this comment applies only to block compression, which is only used during disk I/O when WiredTiger reads and writes blocks, thus not available for Percona Memory Engine.

Q: There is no compression of data in this engine?

A: The Percona Memory Engine uses only prefix compression for indexes. Theoretically, it can use other types of compression, dictionary and Huffman, but they are both disabled in MongoDB.

Q: With all the data in memory, is there much benefit to having indexes on the data?

A: Yes, because with index access you will read less data. While reading from memory is much faster than reading from disk, it is still faster to read just a few rows from memory than to scan millions.

Q: Our db is 70g. Will we need 70g memory to use Percona In-Memory?
Q: How much memory should be allocated for 70g db size?

A: What storage engine do you use? How do you calculate the size? If this is WiredTiger and you count the space it allocates, the answer is "yes, you need 70GB of RAM to use the Percona Memory Engine."

Q: What is the difference in size of data between WiredTiger on disks versus WiredTiger In-Memory?

A: There is no difference: the size is the same. Please note that WiredTiger (on which the Percona Memory Engine is based) can itself additionally allocate up to 50% of the amount specified in the --inMemorySize option. You can check db.serverStatus().inMemory.cache to find out how much of the specified memory is used for storing your data: "bytes currently in the cache" shows the total number of bytes occupied by the physical representation of all MongoDB's databases, and "maximum bytes configured" shows what is passed in the --inMemorySize option. The difference between the two is the amount of memory, in bytes, still available.
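
A small mongo shell sketch of that calculation, using only the serverStatus() fields named above:

    // Run in the mongo shell against a Percona Memory Engine instance.
    var cache = db.serverStatus().inMemory.cache;
    var used  = cache["bytes currently in the cache"];
    var max   = cache["maximum bytes configured"];
    print("in-memory bytes still available: " + (max - used));
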

Q: What is the way to convert data from disk to In-Memory? Using mongodump and rebuild the indexes?

A: Yes

Q: An enhancement request is to enable regular and In-Memory engines on the same MongoDB instance.

A: This is a MongoDB limitation, but noted and reported for Percona at https://jira.percona.com/browse/PSMDB-88.

Categories: MySQL

Help Us Shape the Future of Percona

MySQL Performance Blog - Thu, 2016-11-17 19:05

Let us know what you think about Percona, and what we should be thinking about for the future.

Over the last ten years, Percona has grown considerably. We’ve moved from being a strictly MySQL company, to a company that supports MongoDB and other open source databases. Percona Live and Percona Live Europe have become key meeting places for the open source community, and now are important hubs for learning about and discussing open source database solutions.

As we look forward to the next ten years of business, we want to get an idea of what you think of us. As we plan for the future, we’d like to hear about your experience with Percona today and get your input on how we can continue to evolve. 

To achieve that end, we’ve put together a survey of questions about us, our services, our products and the open source community’s perception of us. We would appreciate you taking the time to fill it out so we can know your thoughts. Your feedback helps us shape our company and grow the community.

Take the survey here: http://survey.newkind.com/r/rUkjDHPd

It should take 10-15 minutes to complete and will remain open until Friday, Dec. 2. Thanks again for helping us prepare for the future.

Categories: MySQL

All You Need to Know About GCache (Galera-Cache)

MySQL Performance Blog - Thu, 2016-11-17 01:21

This blog discusses some important aspects of GCache.

Why do we need GCache?

Percona XtraDB Cluster is a multi-master topology, where a transaction executed on one node is replicated to the other node(s) of the cluster. The transaction is copied from the group channel into the Galera Cache, and then applied.

The cache can be discarded immediately once the transaction is applied, but retaining it lets the node act as a DONOR, serving write-sets to a newly booted node.

So in short, GCache acts as a temporary storage for replicated transactions.

How is GCache managed?

Naturally, the first choice for caching these write-sets is a memory-allocated pool, which is governed by gcache.mem_store. However, this option is deprecated and buggy, and shouldn't be used.

Next on the list is on-disk files. Galera has two types of on-disk files to manage write-sets:

  • RingBuffer File:
    • A circular file (aka RingBuffer file). As the name suggests, this file is re-usable in a circular queue fashion, and is pre-created when the server starts. The size of this file is preconfigured and can’t be changed dynamically, so selecting a proper size for this file is important.
    • The user can set the size of this file using gcache.size. (There are multiple blogs about how to estimate the size of the Galera Cache, which is generally linked to downtime. If properly planned, the next booting node will find all the missing write-sets in the cache, thereby avoiding the need for SST.)
    • Write-sets are appended to this file and, when needed, the file is re-cycled for use.
  • On-demand page store:
    • If the transaction write-set is large enough not to fit in a RingBuffer File (actually large enough not to fit in half of the RingBuffer file) then an independent page (physical disk file) is allocated to cache the write-sets.
    • Again there are two types of pages:
      • Page with standard size: As defined by gcache.page_size (default=128M).
      • Page with non-standard page size: If the transaction is large enough not to fit into a standard page, then a non-standard page is created for the transaction. Let’s say gcache.page_size=1M and transaction write_set = 1.5M, then a separate page (in turn on-disk file) will be created with a size of 1.5M.

How long are on-demand pages retained? This is controlled using the following two variables:

  • gcache.keep_pages_size
    • keep_pages_size defines the total size of allocated pages to keep. For example, if keep_pages_size = 10M, then N pages that add up to 10M can be retained. If the N pages add up to more than 10M, pages are removed from the start of the queue until the size falls below the set threshold. A size of 0 means don't retain any pages.
  • gcache.keep_pages_count (PXC specific)
    • But before pages are actually removed, a second check is done based on the page count. Let's say keep_pages_count = N+M; then even though the N pages add up to 10M, they will be retained because the page-count threshold has not yet been hit. (The exception to this is non-standard pages at the start of the queue.)

So in short, both conditions must be satisfied before a page is removed. The recommendation is to use whichever condition is applicable in your environment.

Where are GCache files located?

The default location is the data directory, but this can be changed by setting gcache.dir. Given the temporary nature of the file and its iterative read/write cycle, it may be wise to place these files on a faster I/O disk. The default name of the file is gcache.cache; this is configurable by setting gcache.name.
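
Putting these knobs together, a my.cnf sketch; the sizes and directory below are placeholders rather than recommendations, and all gcache.* settings live inside wsrep_provider_options, separated by semicolons:

    [mysqld]
    wsrep_provider_options="gcache.dir=/var/lib/galera-cache;gcache.name=gcache.cache;gcache.size=2G;gcache.page_size=128M;gcache.keep_pages_size=0;gcache.keep_pages_count=0"
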

What if one of the nodes is DESYNCED and PAUSED?

If a node desyncs, it will continue to receive write-sets and apply them, so there is no major change in gcache handling.

If the node is desynced and paused, it can't apply write-sets and needs to keep caching them. This will, of course, affect the desynced/paused node, which will continue to create on-demand page stores. Since one of the cluster nodes can't proceed, it will not emit a "last committed" message. In turn, the other nodes in the cluster (which would otherwise purge their entries) will continue to retain the write-sets, even though they are not desynced or paused.

Categories: MySQL

Is Docker Good for Your Database?

MySQL Performance Blog - Wed, 2016-11-16 16:27

This blog post reviews the appropriateness of Docker and other container solutions for your database environment.

A few weeks back, I wrote a fairly high-level blog post about containers. It covered what you should consider when thinking about using Docker, rkt, LXC, etc. I hope you’ve taken the chance to give it a quick read. It’s a good way to understand some of the disciplines you need to consider before moving to a new technology. However, it sparked a conversation in our Solutions Engineering team. Hopefully, the same one that you’re having in your organization: should customers run their database in containers?

Before we start, I’ll admit that Percona uses containers. Percona Monitoring and Management (PMM for short) presents all of the pretty graphs and query analytics by running in a Docker container. We made that choice because the integration between the components is where we could provide the most value to users. Docker lets us distribute a single ready-to-go unit of awesomeness. In short, it has huge potential on the application side of your environment. 

However, for databases… here are some of our recommendations:

Quick n Dirty

Decision = NOT FOR DBs (as it sits right now)

This is not the case for every environment. It is the default that we think is the best recommendation for the majority of our customers. Please note that I am only making this recommendation for your database. If you’re using microservices for your application today, then it could make more sense to containerize your database, depending on the load characteristics of your database, your scaling needs and the skillset you currently have.

Why?

Lack of Synergy

Before you decide to shoot me, please take some time to understand where we’re coming from. First of all, people designed container solutions to deal with stateless applications that have ephemeral data. Containers spin up a quick microservice and then destroy it. This includes all the components of that container (including its cache and data). The transient nature of containers is because all of the components and services of that container are considered to be part of the container (essentially it’s all or nothing). Serving the container a data volume owned by the underlying OS by punching a hole through the container can be very challenging. Current methods are too unreliable for most databases.

Most of the development efforts put into the various solutions had one goal in mind: statelessness. There are solutions that can help keep your data persistent, but they are evolving very quickly. From what we can tell, they require a level of complexity that negates any efficiency gains by adding operational complexity (and risk). To further my point, this is precisely the conclusion that we’ve come to time and again when we’ve reviewed any “real world” information about the use of containers (especially Docker).

They’re Just Not Stable Yet

These container solutions are meant for quick development and deployment of applications that are broken into tiny components: microservices. Normally, these applications evolve very quickly in organizations that are very software/developer driven. That seems to be how these container solutions (again, especially Docker) are developed as well. New features are pushed out with little testing and design. The main focus seems to be the latest featureset and being first to market. They “beg for forgiveness” instead of “ask for permission.” On top of that, backward compatibility (from what we can tell) is a distant concern (and even that might be an overstatement). This means that you’re going to have to have a mature Continuous Delivery and testing environment as well as a known and tested image repository for your containers.

These are awesome tools to have for the right use cases, but they take time, money, resources and experience. In speaking with many of our customers, this is just not where they’re at as an organization. Their businesses aren’t designed around software development, and they simply don’t have the checkbooks to support the resources needed to keep this hungry machine fed. Rather, they are looking for something stable and performant that can keep their users happy 24×7. I know that we can give them a performant, highly-available environment that requires much less management if we strip out containers.

Is There Hope?

Absolutely, in fact, there’s a lot more than hope. There are companies running containers (including databases) at massive scale today! These are the types of companies that have very mature processes. Their software development is a core part of their business plan and value proposition. You probably know who I’m talking about: Uber, Google, Facebook (there are more; these are just a few). There’s even a good rundown of how you can get persistence in containers from Joyent. But as I said before, the complexity needed to get the basic features necessary to keep your data alive and available (the most basic use of a database) is much too high. When containers have a better and more stable solution for persistent storage volumes, they will be one step closer to being ready, in my opinion. Even then, containerizing databases in most organizations that aren’t dealing with large scale deployments (50+ nodes) with wildly varying workloads is probably unnecessary.

Don’t Leave Us Hanging…

I realize that the statement “you’re probably not ready to containerize your database” does not constitute a solution. So here it is: the Solutions Engineering team (SolEng for short) has you covered. Dimitri Vanoverbeke is in the process of writing a great blog series on configuration management. Configuration management solutions can greatly increase the repeatability of your infrastructure, and make sure that your IT/App Dev processes are repeatable in the physical configuration of your environment. Automating this process can lead to great gains. However, this should make use of a mature development/testing process as part of your application development lifecycle. The marriage of process and technology creates stable applications and happy customers.

Besides configuration management as an enhanced solution, there are some services that can make the life of your operations team much easier. Service discovery and health checking come to mind. My favorite solution is Consul, which we use extensively in PMM for configuration and service metadata. Consul can make sure that your frontend applications and backend infrastructure are working from a real-time snapshot of the state of your services.

Conclusion

There is a lot to think about when it comes to managing an environment, especially when your application develops at a quick pace. With the crafty use of available solutions, you can reduce the overhead that goes into every release. On top of that, you can increase resiliency and availability. If you need our help, please reach out. We’d love to help you!

Categories: MySQL

Webinar Thursday, November 17: MySQL High Availability with Percona XtraDB Cluster 5.7

MySQL Performance Blog - Tue, 2016-11-15 18:37

Join Percona’s Percona XtraDB Cluster Lead Software Engineer Krunal Bauskar for a webinar on Thursday, November 17, 2016, at 7:30 am PST on MySQL High Availability with Percona XtraDB Cluster 5.7.

Percona XtraDB Cluster 5.7 is our brand new MySQL 5.7 compatible Galera-based high availability (HA) solution. Whether you’re new to MySQL clustering technology, or experienced with Galera-based replication, this tutorial provides great insights into working with the software, including:

  • New and unique features of Percona XtraDB Cluster 5.7, including Cluster Safe Mode, instrumentation with Performance Schema, and extended support for encrypted tablespaces in a multi-master topology
  • Seamless integration with ProxySQL for better HA and read/write splitting
  • Improved security with native data at rest encryption and secure networking
  • Native integration with Docker, optimized for Container World
  • Monitoring with Percona Monitoring and Management (PMM)
  • Improved stability with many critical bug fixes and improved error messaging

This tutorial will demonstrate how to set up XtraDB Cluster, complete with High Availability Proxy and Monitoring, as well as perform the most important MySQL high availability management operations.

Register for this webinar here.

Krunal Bauskar, Percona XtraDB Cluster Lead Software Engineer

Krunal joined Percona in September 2015. Before joining Percona, he worked as part of the InnoDB team at MySQL/Oracle. He authored most of the temporary table revamp work, in addition to many other features. In the past, he worked with Yahoo! Labs researching big data issues, as well as working for a database startup that is now part of Teradata. His interests mainly include data management at any scale – something he has been working on for more than a decade now.

Categories: MySQL

Percona Monitoring and Management 1.0.6 is now available

MySQL Performance Blog - Tue, 2016-11-15 17:07

Percona announces the release of Percona Monitoring and Management 1.0.6 on November 15, 2016.

The instructions for installing or upgrading Percona Monitoring and Management 1.0.6 are available in the documentation. Detailed release notes are available here.

New in PMM Server:

  • Prometheus 1.2.2
  • External static files are now local for PMM home page
  • Metrics Monitor improvements:
    • Added Amazon RDS OS Metrics dashboard and CloudWatch data source.
    • Added the PMM Server host to metrics monitoring.
    • Refactored MongoDB dashboards.
    • Added File Descriptors graph to System Overview dashboard.
    • Added Mountpoint Usage graph to Disk Space dashboard.
  • Query Analytics improvements:
    • QAN data is now purged correctly.
    • QAN data retention is now configurable with the QUERIES_RETENTION option. The default is eight days.
    • Various small fixes to Query Analytics.

New in PMM Client:

  • Fixes for mysql:queries service using Performance Schema as query source:
    • Fixed crash when DIGEST_TEXT is NULL.
    • Removed iteration over all query digests on startup.
    • Added sending of query examples to QAN if available (depends on the workload).
  • Added query source information for mysql:queries service in pmm-admin list output.
  • Added purge command to purge metrics data on the server.
  • Updated mongodb_exporter with RocksDB support and various fixes.
  • Removed --nodetype and --replset flags for mongodb:metrics. The --cluster flag is now optional.
    It is recommended to re-add the mongodb:metrics service and purge the existing MongoDB metrics using the purge command (see the sketch after this list).
  • Enabled monitoring of file descriptors (requires re-adding linux:metrics service).
  • Improved full uninstallation when PMM Server is unreachable.
  • Added time drift check between server and client to pmm-admin check-network output.
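
As a rough illustration of that mongodb:metrics recommendation, the sequence below sketches the idea. The exact pmm-admin arguments are assumptions (syntax can differ between PMM Client releases), so verify them against pmm-admin --help before use.

pmm-admin rm mongodb:metrics       # remove the previously added service (assumed syntax)
pmm-admin purge mongodb:metrics    # use the new purge command to drop the old metrics data (assumed syntax)
pmm-admin add mongodb:metrics      # re-add the service; append the optional --cluster flag if you use it
pmm-admin list                     # confirm the service is back and check its query source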

Live demo of PMM is available at pmmdemo.percona.com.

We welcome your feedback and questions on our PMM forum.

About Percona Monitoring and Management
Percona Monitoring and Management is an open-source platform for managing and monitoring MySQL and MongoDB performance. It is developed by Percona in collaboration with experts in the field of managed database services, support and consulting.

PMM is a free and open-source solution that you can run in your own environment for maximum security and reliability. It provides thorough time-based analysis for MySQL and MongoDB servers to ensure that your data works as efficiently as possible.

Categories: MySQL

Using Vault with MySQL

MySQL Performance Blog - Tue, 2016-11-15 00:31

Using Vault with MySQL

In my previous post I discussed using GPG to secure your database credentials. This relies on a local copy of your MySQL client config, but what if you want to keep the credentials stored safely along with other super secret information? Sure, GPG could still be used, but there must be an easier way to do this.

This post will look at a way to use Vault to store your credentials in a central location and use them to access your database. For those of you that have not yet come across Vault, it is a great way to manage your secrets – securing, storing and tightly controlling access. It has the added benefits of being able to handle leasing, key revocation, key rolling and auditing.

During this blog post we’ll accomplish the following tasks:

  1. Download the necessary software
  2. Get a free SAN certificate to use for Vault’s API and automate certificate renewal
  3. Configure Vault to run under a restricted user and secure access to its files and the API
  4. Create a policy for Vault to provide access control
  5. Enable TLS authentication for Vault and create a self-signed client certificate using OpenSSL to use with our client
  6. Add a new secret to Vault and gain access from a client using TLS authentication
  7. Enable automated, expiring MySQL grants

Before continuing, a quick note: the following is a quick example to show you how you can get Vault up and running and use it with MySQL. It is not a guide to a production setup and does not cover High Availability (HA) implementations, etc.

Download time

We will be using some tools in addition to Vault: Let’s Encrypt, OpenSSL and json_pp (a command-line utility using JSON::PP). For this post we’ll be using Ubuntu 16.04 LTS, and we’ll presume that these aren’t yet installed.

$ sudo apt-get install letsencrypt openssl libjson-pp-perl

If you haven’t already heard of Let’s Encrypt then it is a free, automated, and open Certificate Authority (CA) enabling you to secure your website or other services without paying for an SSL certificate; you can even create Subject Alternative Name (SAN) certificates to make your life even easier, allowing one certificate to be used a number of different domains. The Electronic Frontier Foundation (EFF) provide Certbot, the recommended tool to manage your certificates, which is the new name for the letsencrypt software. If you don’t have letsencrypt/certbot in your package manager then you should be able to use the quick install method. We’ll be using json_pp to prettify the JSON output from the Vault API and openssl to create a client certificate.

We also need to download Vault, choosing the binary relevant for your Operating System and architecture. At the time of writing this, the latest version of Vault is 0.6.2, so the following steps may need adjusting if you use a different version.

# Download Vault (Linux x86_64), SHA256SUMS and signature
$ wget https://releases.hashicorp.com/vault/0.6.2/vault_0.6.2_linux_amd64.zip https://releases.hashicorp.com/vault/0.6.2/vault_0.6.2_SHA256SUMS.sig https://releases.hashicorp.com/vault/0.6.2/vault_0.6.2_SHA256SUMS

# Import the GPG key
$ gpg --keyserver pgp.mit.edu --recv-keys 51852D87348FFC4C

# Verify the checksums
$ gpg --verify vault_0.6.2_SHA256SUMS.sig
gpg: assuming signed data in `vault_0.6.2_SHA256SUMS'
gpg: Signature made Thu 06 Oct 2016 02:08:16 BST using RSA key ID 348FFC4C
gpg: Good signature from "HashiCorp Security <security@hashicorp.com>"
gpg: WARNING: This key is not certified with a trusted signature!
gpg:          There is no indication that the signature belongs to the owner.
Primary key fingerprint: 91A6 E7F8 5D05 C656 30BE F189 5185 2D87 348F FC4C

# Verify the download
$ sha256sum --check <(fgrep vault_0.6.2_linux_amd64.zip vault_0.6.2_SHA256SUMS)
vault_0.6.2_linux_amd64.zip: OK

# Extract the binary
$ sudo unzip -j vault_0.6.2_linux_amd64.zip -d /usr/local/bin
Archive:  vault_0.6.2_linux_amd64.zip
  inflating: /usr/local/bin/vault

Let’s Encrypt… why not?

We want to be able to access Vault from wherever we are, and we can put additional security in place to prevent unauthorised access, so we need to get ourselves encrypted. The following example shows the setup on a public server, allowing the CA to authenticate your request. More information on different methods can be found in the Certbot documentation.

$ sudo letsencrypt --webroot -w /home/www/vhosts/default/public -d myfirstdomain.com -d myseconddomain.com
#IMPORTANT NOTES:
# - Congratulations! Your certificate and chain have been saved at
#   /etc/letsencrypt/live/myfirstdomain.com/fullchain.pem. Your cert will
#   expire on 2017-01-29. To obtain a new or tweaked version of this
#   certificate in the future, simply run certbot again. To
#   non-interactively renew *all* of your certificates, run "certbot
#   renew"
# - If you like Certbot, please consider supporting our work by:
#
#   Donating to ISRG / Let's Encrypt: https://letsencrypt.org/donate
#   Donating to EFF:                  https://eff.org/donate-le
#

That’s all it takes to get a SAN SSL certificate! The server that this was executed on has a public webserver serving the domains that the certificates were requested for. During the request process a file is placed in the specified webroot and is used to authenticate the domain(s) for the request. Essentially, the command said:

myfirstdomain.com and myseconddomain.com use /home/www/vhosts/default/public for the document root, so place your files there

The Let’s Encrypt CA issues short-lived certificates (90 days), so you need to keep renewing them, but don’t worry, as renewal is as easy as creating them in the first place! You can test that renewal works OK as follows (without --dry-run, the command renews all certificates that you have):

$ sudo letsencrypt renew --dry-run
#
#-------------------------------------------------------------------------------
#Processing /etc/letsencrypt/renewal/myfirstdomain.com.conf
#-------------------------------------------------------------------------------
#** DRY RUN: simulating 'letsencrypt renew' close to cert expiry
#** (The test certificates below have not been saved.)
#
#Congratulations, all renewals succeeded. The following certs have been renewed:
#  /etc/letsencrypt/live/myfirstdomain.com/fullchain.pem (success)
#** DRY RUN: simulating 'letsencrypt renew' close to cert expiry
#** (The test certificates above have not been saved.)
#
#IMPORTANT NOTES:
# - Your account credentials have been saved in your Certbot
#   configuration directory at /etc/letsencrypt. You should make a
#   secure backup of this folder now. This configuration directory will
#   also contain certificates and private keys obtained by Certbot so
#   making regular backups of this folder is ideal.

Automating renewal

The test run for renewal worked fine, so we can now go and schedule this to take place automatically. I’m using systemd so the following example uses timers, but cron or similar could be used too. Here’s how to make systemd run the scheduled renewal for you, running at 0600 – the renewal process will automatically proceed for any previously-obtained certificates that expire in less than 30 days.

$ cat <<EOF | sudo tee /etc/systemd/system/cert-renewal.service
[Unit]
Description=SSL renewal

[Service]
Type=simple
ExecStart=/usr/bin/letsencrypt renew --quiet
User=root
Group=root
EOF

$ cat <<EOF | sudo tee /etc/systemd/system/cert-renewal.timer
[Unit]
Description=Automatic SSL renewal

[Timer]
OnCalendar=*-*-* 06:00:00
Persistent=true

[Install]
WantedBy=timers.target
EOF

$ sudo systemctl enable cert-renewal.timer
Created symlink from /etc/systemd/system/timers.target.wants/cert-renewal.timer to /etc/systemd/system/cert-renewal.timer.
$ sudo systemctl start cert-renewal.timer
$ sudo systemctl list-timers
NEXT                         LEFT     LAST  PASSED  UNIT                ACTIVATES
Tue 2016-11-01 06:00:00 UTC  6h left  n/a   n/a     cert-renewal.timer  cert-renewal.service

Getting started with Vault

Firstly, a quick reminder that this is not an in-depth review, how-to or necessarily best-practice Vault installation as that is beyond the scope of this post. It is just to get you going to test things out, so please read up on the Vault documentation if you want to use it more seriously.

Whilst there is a development server that you can fire up with the command vault server -dev to get yourself testing a little quicker, we’re going to take a little extra time and configure it ourselves and make the data persistent. Vault supports a number of backends for data storage, including Zookeeper, Amazon S3 and MySQL; however, the three maintained by HashiCorp are consul, file and inmem. The memory storage backend does not provide persistent data, so whilst there could possibly be uses for this it is really only useful for development and testing – it is the storage backend used with the -dev option to the server command. Rather than tackle the installation and configuration of Consul during this post, we’ll use file storage instead.

Before starting the server we’ll create a config, which can be written in one of two formats – HCL (HashiCorp Configuration Language) or JSON (JavaScript Object Notation). We’ll use HCL as it is a little cleaner and saves us a little extra typing!

# Create a system user
$ sudo useradd -r -g daemon -d /usr/local/vault -m -s /sbin/nologin -c "Vault user" vault
$ id vault
uid=998(vault) gid=1(daemon) groups=1(daemon)

# Create the config directories and remove global access
$ sudo mkdir /etc/vault /etc/ssl/vault
$ sudo chown vault.root /etc/vault /etc/ssl/vault
$ sudo chmod 750 /etc/vault /etc/ssl/vault
$ sudo chmod 700 /usr/local/vault

# Copy the certificates and key
$ sudo cp -v /etc/letsencrypt/live/myfirstdomain.com/*pem /etc/ssl/vault
/etc/letsencrypt/live/myfirstdomain.com/cert.pem -> /etc/ssl/vault/cert.pem
/etc/letsencrypt/live/myfirstdomain.com/chain.pem -> /etc/ssl/vault/chain.pem
/etc/letsencrypt/live/myfirstdomain.com/fullchain.pem -> /etc/ssl/vault/fullchain.pem
/etc/letsencrypt/live/myfirstdomain.com/privkey.pem -> /etc/ssl/vault/privkey.pem

# Create a combined PEM certificate
$ sudo cat /etc/ssl/vault/{cert,fullchain}.pem | sudo tee /etc/ssl/vault/fullcert.pem

# Write the config to file
$ cat <<EOF | sudo tee /etc/vault/demo.hcl
listener "tcp" {
  address       = "10.0.1.10:8200"
  tls_disable   = 0
  tls_cert_file = "/etc/ssl/vault/fullcert.pem"
  tls_key_file  = "/etc/ssl/vault/privkey.pem"
}
backend "file" {
  path = "/usr/local/vault/data"
}
disable_mlock = true
EOF

So, we’ve now set up a user and some directories to store the config, SSL certificate and key, and also the data, restricting access to the vault user. The config that we wrote specifies that we will use the file backend, storing data in /usr/local/vault/data, and the listener that will be providing TLS encryption using our certificate from Let’s Encrypt. The final setting, disable_mlock, is not recommended for production and is being used to avoid some extra configuration during this post. More details about the other options available for configuration can be found in the Server Configuration section of the online documentation.

Please note that the Vault datadir should be kept secured as it contains all of the keys and secrets. In the example, we have done this by placing it in the vault user’s home directory and only allowing the vault user access. You can take this further by restricting local access (via logins) and using access control lists.

Starting Vault

Time to start the server and see if everything is looking good!

$ sudo -su vault vault server -config=/etc/vault/demo.hcl >/tmp/vault-debug.log 2>&1 &
$ jobs
[1]  + running    sudo -su vault vault server -config=/etc/vault/demo.hcl > /tmp/vault-debug.lo
$ VAULT_ADDR=https://myfirstdomain.com:8200 vault status
Error checking seal status: Error making API request.

URL: GET https://myfirstdomain.com:8200/v1/sys/seal-status
Code: 400. Errors:

* server is not yet initialized

Whilst it looks like something is wrong (we need to initialize the server), it does mean that everything is otherwise working as expected. So, we’ll initialize Vault, which is a pretty simple task, but you do need to make a note of (and store) some of the information that the server gives you during initialization – the unseal keys and the initial root token. You should distribute these to somewhere safe, but for now we’ll store them with the config.

# Change to vault user
$ sudo su -l vault -s /bin/bash
(vault)$ export VAULT_ADDR=https://myfirstdomain.com:8200 VAULT_SSL=/etc/ssl/vault

# Initialize Vault and save the token and keys
(vault)$ vault init 2>&1 | egrep '^Unseal Key|Initial Root Token' >/etc/vault/keys.txt
(vault)$ chmod 600 /etc/vault/keys.txt

# Unseal Vault
(vault)$ egrep -m3 '^Unseal Key' /etc/vault/keys.txt | cut -f2- -d: | tr -d ' ' | while read key
do
  vault unseal -ca-cert=${VAULT_SSL}/fullchain.pem -client-cert=${VAULT_SSL}/client.pem -client-key=${VAULT_SSL}/privkey.pem ${key}
done
Sealed: true
Key Shares: 5
Key Threshold: 3
Unseal Progress: 1
Sealed: true
Key Shares: 5
Key Threshold: 3
Unseal Progress: 2
Sealed: false
Key Shares: 5
Key Threshold: 3
Unseal Progress: 0

# Check Vault status
(vault)$ vault status
Sealed: false
Key Shares: 5
Key Threshold: 3
Unseal Progress: 0
Version: 0.6.2
Cluster Name: vault-cluster-ebbd5ec7
Cluster ID: 61ae8f54-f420-09c1-90bb-60c9fbfa18a2
High-Availability Enabled: false

There we go, the vault is initialized and the status command now returns details and confirmation that it is up and running. It is worth noting here that each time you start Vault it will be sealed, which means that it cannot be accessed until 3 unseal keys have been used with vault unseal – for additional security here you would ensure that a single person cannot know any 3 keys, so that it always requires more than one person to (re)start the service.

Setting up a policy

Policies allow you to set access control restrictions to determine the data that authenticated users have access to. Once again, the documents used to write policies are in either the HCL or JSON format. They are easy to write and apply, the only catch being that the policies associated with a token cannot be changed (added/removed) once the token has been issued; you need to revoke the token and apply the new policies. However, if you want to change the policy rules then this can be done on-the-fly, as modifications apply on the next call to Vault.
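
For example, to swap the policies attached to an already issued token you would revoke that token and log in again, whereas policy rules can simply be rewritten in place. This is only a sketch: the token variable below is a placeholder, and the hyphenated subcommand reflects the CLI generation used in this post, so confirm it against vault --help on your version.

# Revoke an issued token so the next login picks up the new policy set ($OLD_CLIENT_TOKEN is a placeholder)
(vault)$ vault token-revoke $OLD_CLIENT_TOKEN
# Policy rules themselves can be rewritten on the fly
(vault)$ vault policy-write demo /etc/vault/demo-policy.hcl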

When we initialized the server we were given the initial root key and we now need to use that in order to start configuring the server.

(vault)$ export VAULT_TOKEN=$(egrep '^Initial Root Token:' /etc/vault/keys.txt | cut -f2- -d: | tr -d ' ')

We will create a simple policy that allows us to read the MySQL secrets, but prevents access to system information and commands.

(vault)$ cat <<EOF > /etc/vault/demo-policy.hcl
path "sys/*" {
  policy = "deny"
}
path "secret/mysql/*" {
  policy = "read"
  capabilities = ["list", "sudo"]
}
EOF
(vault)$ vault policy-write demo /etc/vault/demo-policy.hcl
Policy 'demo' written.

We have only added one policy here, but you should really create as many policies as you need to suitably control access amongst the variety of humans and applications that may be using the service. As with any kind of data storage, planning how to store your data is important, as it will help you write more compact policies with the level of granularity that you require. Writing everything in /secrets at the top level will most likely bring you headaches, or long policy definitions!

TLS authentication for MySQL secrets

We’re getting close to adding our first secret to Vault, but first of all we need a way to authenticate our access. Vault provides an API for access to your stored secrets, along with a wealth of commands available through direct use of the vault binary, as we are doing at the moment. We will now enable the cert authentication backend, which allows authentication using SSL/TLS client certificates.

(vault)$ vault auth-enable cert
Successfully enabled 'cert' at 'cert'!

Generate a client certificate using OpenSSL

The TLS authentication backend accepts certificates that are either signed by a CA or self-signed, so let’s quickly create ourselves a self-signed SSL certificate using openssl to use for authentication.

# Create working directory for SSL managment and copy in the config $ mkdir ~/.ssl && cd $_ $ cp /usr/lib/ssl/openssl.cnf . # Create a 4096-bit CA $ openssl genrsa -des3 -out ca.key 4096 Generating RSA private key, 4096 bit long modulus ...........++ ..........................................................................++ e is 65537 (0x10001) Enter pass phrase for ca.key: Verifying - Enter pass phrase for ca.key: $ openssl req -config ./openssl.cnf -new -x509 -days 365 -key ca.key -out ca.crt Enter pass phrase for ca.key: You are about to be asked to enter information that will be incorporated into your certificate request. What you are about to enter is what is called a Distinguished Name or a DN. There are quite a few fields but you can leave some blank For some fields there will be a default value, If you enter '.', the field will be left blank. ----- Country Name (2 letter code) [GB]: State or Province Name (full name) [Some-State]: Locality Name (eg, city) [Some-Place]: Organization Name (eg, company) [Percona]: Organizational Unit Name (eg, section) [Demo]: Comon Name (e.g. server FQDN or YOUR name) [ceri]: Email Address [thisisnotme@myfirstdomain.com]: # Create a 4096-bit Client Key and CSR $ openssl genrsa -des3 -out client.key 4096 Generating RSA private key, 4096 bit long modulus ......................++ ..................................++ e is 65537 (0x10001) Enter pass phrase for client.key: Verifying - Enter pass phrase for client.key: $ openssl req -config ./openssl.cnf -new -key client.key -out client.csr Enter pass phrase for client.key: You are about to be asked to enter information that will be incorporated into your certificate request. What you are about to enter is what is called a Distinguished Name or a DN. There are quite a few fields but you can leave some blank For some fields there will be a default value, If you enter '.', the field will be left blank. ----- Country Name (2 letter code) [GB]: State or Province Name (full name) [Some-State]: Locality Name (eg, city) [Some-Place]: Organization Name (eg, company) [Percona]: Organizational Unit Name (eg, section) [Demo]: Comon Name (e.g. server FQDN or YOUR name) [ceri]: Email Address [thisisnotme@myfirstdomain.com]: Please enter the following 'extra' attributes to be sent with your certificate request A challenge password []: An optional company name []: # Self-sign $ openssl x509 -req -days 365 -in client.csr -CA ca.crt -CAkey ca.key -set_serial 01 -out client.crt Signature ok subject=/C=GB/ST=Some-State/L=Some-Place/O=Percona/OU=Demo/CN=ceri/emailAddress=thisisnotme@myfirstdomain.com Getting CA Private Key Enter pass phrase for ca.key: # Create an unencrypted copy of the client key $ openssl rsa -in client.key -out privkey.pem Enter pass phrase for client.key: writing RSA key # Copy the certificate for Vault access $ sudo cp client.crt /etc/ssl/vault/user.pem

OK, there was quite a lot of information there. You can edit openssl.cnf to set reasonable defaults for yourself and save time. In brief, we have created our own CA, created a self-signed certificate and then created a single PEM certificate with a decrypted key (this avoids specifying the password to use it – you may wish to leave the password in place to add more security, assuming that your client application can request the password).
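
As a quick optional sanity check (using the file names from above), you can confirm that the client certificate chains back to the CA we just created before handing it over to Vault:

# Verify the self-signed client certificate against our own CA
$ openssl verify -CAfile ca.crt client.crt
client.crt: OK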

Adding an authorisation certificate to Vault

Now that we have created a certificate and a policy, we need to allow authentication to occur using the certificate. We will give the token a 1-hour expiration and allow access to the MySQL secrets via the demo policy that we created in the previous step.

(vault)$ vault write auth/cert/certs/demo display_name=demo policies=demo certificate=@${VAULT_SSL}/user.pem ttl=3600 Success! Data written to: auth/cert/certs/demo $ curl --cert user.pem --key privkey.pem ${VAULT_ADDR}/v1/auth/cert/login -X POST {"request_id":"d5715ce1-2c6c-20c8-83ef-ce6259ad9110","lease_id":"","renewable":false,"lease_duration":0,"data":null,"wrap_info":null,"warnings":null,"auth":{"client_token":"e3b98fac-2676-9f44-fdc2-41114360d2fd","accessor":"4c5b4eb5-4faf-0b01-b732-39d309afd216","policies":["default","demo"],"metadata":{"authority_key_id":"","cert_name":"demo","common_name":"thisisnotme@myfirstdomain.com","subject_key_id":""},"lease_duration":600,"renewable":true}}

Awesome! We requested our first client token using an SSL client certificate, we are logged in, and we were given our access token (client_token) in the response, which provides us with a 1-hour lease (lease_duration) to go ahead and make requests as a client without reauthentication. But there is nothing in the vault right now.

Ssshh!! It’s secret!

“The time has come,” the Vault master said, “to encrypt many things: our keys and passwords and top-secret notes, our MySQL DSNs and strings.”

Perhaps the easiest way to use Vault with your application is to store information there as you would do in a configuration file and read it when the application first requires it. An example of such information is the Data Source Name (DSN) for a MySQL connection, or perhaps the information needed to dynamically generate a .my.cnf. As this is about using Vault with MySQL we will do exactly that and store the user, password and connection method as our first secret, reading it back using the command line tool to check that it looks as expected.

(vault)$ vault write secret/mysql/test password="mysupersecretpassword" user="percona" socket="/var/run/mysqld/mysqld.sock"
Success! Data written to: secret/mysql/test
(vault)$ vault read secret/mysql/test
Key               Value
---               -----
refresh_interval  768h0m0s
password          mysupersecretpassword
socket            /var/run/mysqld/mysqld.sock
user              percona
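
Since the post mentions dynamically generating a .my.cnf, here is a minimal sketch of that idea. It assumes the secret path used above and that your vault binary supports the -field option to vault read (check vault read -h); it is not production-hardened.

# Build a client config from the secret so the credentials never need to live in a repo or script
(vault)$ cat <<EOF > ~/.my.cnf
[client]
user=$(vault read -field=user secret/mysql/test)
password=$(vault read -field=password secret/mysql/test)
socket=$(vault read -field=socket secret/mysql/test)
EOF
(vault)$ chmod 600 ~/.my.cnf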

A little while back (hopefully less than 1 hour ago!) we authenticated using cURL and gained a token, so now that we have something secret to read we can try it out. Fanfares and trumpets at the ready…

$ curl --cert user.pem --key privkey.pem -H 'Content-type: application/json' -H 'X-Vault-Token: 2f1fb630-cbe9-a8c9-5931-515a12d79291' ${VAULT_ADDR}/v1/secret/mysql/test -X GET 2>/dev/null | json_pp { "wrap_info" : null, "lease_id" : "", "request_id" : "c79033b1-f8f7-be89-4208-44d721a55804", "auth" : null, "data" : { "password" : "mysupersecretpassword", "socket" : "/var/run/mysqld/mysqld.sock", "user" : "percona" }, "lease_duration" : 2764800, "renewable" : false, "warnings" : null }

We did it! Now there is no longer any need to store passwords in your code or config files; you can just go and get them from Vault when you need them, such as when your application starts (holding them in memory), or on demand if your application can tolerate the additional latency. You would need to take further steps to make sure that your application is tolerant of Vault going down, as well as providing an HA setup of Vault to minimise the risk of the secrets being unavailable.

It doesn’t stop here though…

On-demand MySQL grants

Vault acts like a virtual filesystem and uses the generic storage backend by default, mounted as /secret, but due to powerful abstraction it is possible to use many other backends as mountpoints such as an SQL database, AWS IAM, HSMs and much more. We have kept things simple and been using the generic backend so far. You can view the available (mounted) backends using the mounts command:

(vault)$ vault mounts
Path     Type     Default TTL  Max TTL  Description
secret/  generic  system       system   generic secret storage
sys/     system   n/a          n/a      system endpoints used for control, policy and debugging

We are now going to enable the MySQL backend, add the management connection (which will use the auth_socket plugin) and then request a new MySQL user that will auto-expire!

# Create a dedicated MySQL user account
$ mysql -Bsse "CREATE USER vault@localhost IDENTIFIED WITH auth_socket; GRANT CREATE USER, SELECT, INSERT, UPDATE ON *.* TO vault@localhost WITH GRANT OPTION;"

# Enable the MySQL backend and set the connection details
(vault)$ vault mount mysql
(vault)$ vault write mysql/config/connection connection_url="vault:vault@unix(/var/run/mysqld/mysqld.sock)/"
Read access to this endpoint should be controlled via ACLs as it will return the connection URL as it is, including passwords, if any.

# Write the template for the readonly role
(vault)$ vault write mysql/roles/readonly sql="CREATE USER '{{name}}'@'%' IDENTIFIED WITH mysql_native_password BY '{{password}}' PASSWORD EXPIRE INTERVAL 1 DAY; GRANT SELECT ON *.* TO '{{name}}'@'%';"
Success! Data written to: mysql/roles/readonly

# Set the lease on MySQL grants
(vault)$ vault write mysql/config/lease lease=1h lease_max=12h
Success! Data written to: mysql/config/lease

Here you can see that a template is created so that you can customise the grants per role. We created a readonly role, so it just has SELECT access. We have set an expiration on the account so that MySQL will automatically mark the password as expired and prevent access. This is not strictly necessary, since Vault will remove the user accounts that it created as it expires the tokens, but adding this extra level in MySQL allows you to set the Vault lease (which seems to be global) a little longer than required and vary the effective lifetime by role using MySQL password expiration. You could also use it as a way of tracking which Vault-generated MySQL accounts are going to expire soon. The important part is that you ensure the application is tolerant of reauthentication, whether it hands off work whilst doing so, accepts added latency, or terminates and respawns.

Now we will authenticate and request our user to connect to the database with.

$ curl --cert user.pem --key privkey.pem -H 'Content-type: application/json' ${VAULT_ADDR}/v1/auth/cert/login -X POST 2>/dev/null | json_pp { "auth" : { "policies" : [ "default", "demo" ], "accessor" : "2e6d4b95-3bf5-f459-cd27-f9e35b9bed16", "renewable" : true, "lease_duration" : 3600, "metadata" : { "common_name" : "thisisnotme@myfirstdomain.com", "cert_name" : "demo", "authority_key_id" : "", "subject_key_id" : "" }, "client_token" : "018e6feb-65c4-49f2-ae30-e4fbba81e687" }, "lease_id" : "", "wrap_info" : null, "renewable" : false, "data" : null, "request_id" : "f00fe669-4382-3f33-23ae-73cec0d02f39", "warnings" : null, "lease_duration" : 0 } $ curl --cert user.pem --key privkey.pem -H 'Content-type: application/json' -H 'X-Vault-Token: 018e6feb-65c4-49f2-ae30-e4fbba81e687' ${VAULT_ADDR}/v1/mysql/creds/readonly -X GET 2>/dev/null | json_pp { "errors" : [ "permission denied" ] }

Oh, what happened? Well, remember the policy that we created earlier? We hadn’t allowed access to the MySQL role generator, so we need to update and apply the policy.

(vault)$ cat <<EOF | vault policy-write demo /dev/stdin
path "sys/*" {
  policy = "deny"
}
path "secret/mysql/*" {
  policy = "read"
  capabilities = ["list", "sudo"]
}
path "mysql/creds/readonly" {
  policy = "read"
  capabilities = ["list", "sudo"]
}
EOF
Policy 'demo' written.

Now that we have updated the policy to allow access to the readonly role (requests go via mysql/creds when requesting access), we can check that the policy has been applied and whether we get a user account for MySQL.

# Request a user account $ curl --cert user.pem --key privkey.pem -H 'Content-type: application/json' -H 'X-Vault-Token: 018e6feb-65c4-49f2-ae30-e4fbba81e687' ${VAULT_ADDR}/v1/mysql/creds/readonly -X GET 2>/dev/null | json_pp { "request_id" : "7b45c9a1-bc46-f410-7af2-18c8e91f43de", "lease_id" : "mysql/creds/readonly/c661426c-c739-5bdb-cb7a-f51f74e16634", "warnings" : null, "lease_duration" : 3600, "data" : { "password" : "099c8f2e-588d-80be-1e4c-3c2e20756ab4", "username" : "read-cert-401f2c" }, "wrap_info" : null, "renewable" : true, "auth" : null } # Test MySQL access $ mysql -h localhost -u read-cert-401f2c -p Enter password: Welcome to the MySQL monitor. Commands end with ; or g. Your MySQL connection id is 17 Server version: 5.7.14-8-log Percona Server (GPL), Release '8', Revision '1f84ccd' Copyright (c) 2009-2016 Percona LLC and/or its affiliates Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or 'h' for help. Type 'c' to clear the current input statement. mysql> show grants; +-----------------------------------------------+ | Grants for read-cert-401f2c@% | +-----------------------------------------------+ | GRANT SELECT ON *.* TO 'read-cert-401f2c'@'%' | +-----------------------------------------------+ 1 row in set (0.00 sec) # Display the full account information $ pt-show-grants --only='read-cert-401f2c'@'%' -- Grants dumped by pt-show-grants -- Dumped from server Localhost via UNIX socket, MySQL 5.7.14-8-log at 2016-11-08 23:28:37 -- Grants for 'read-cert-401f2c'@'%' CREATE USER IF NOT EXISTS 'read-cert-401f2c'@'%'; ALTER USER 'read-cert-401f2c'@'%' IDENTIFIED WITH 'mysql_native_password' AS '*FF157E33408E1FBE707B5FF89C87A2D14E8430C2' REQUIRE NONE PASSWORD EXPIRE INTERVAL 1 DAY ACCOUNT UNLOCK; GRANT SELECT ON *.* TO 'read-cert-401f2c'@'%';

Hurrah! Now we don’t even need to go and create a user; the application can get one when it needs one. We’ve made the account auto-expire so that the credentials are only valid for 1 day, regardless of Vault expiration, and also we’ve reduced the amount of time that the token is valid, so we’ve done a pretty good job of limiting the window of opportunity for any rogue activity.
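
And if the application finishes with the credentials early, they don’t need to live out the full lease either. A hedged sketch using the lease_id returned above (the revoke subcommand name matches the CLI generation used in this post, so confirm it with vault --help):

# Revoke the dynamic MySQL credentials as soon as they are no longer needed
(vault)$ vault revoke mysql/creds/readonly/c661426c-c739-5bdb-cb7a-f51f74e16634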

We’ve covered quite a lot in this post, though some of the detail has been left out to keep us on track. The online documentation for OpenSSL, Let’s Encrypt and Vault is pretty good, so you should be able to take a deeper dive should you wish to. Hopefully, this post has given a good enough introduction to Vault to get you interested and looking to test it out, as well as bringing the great Let’s Encrypt service to your attention, so that there’s very little reason not to provide a secure online experience for your readers, customers and services.

Categories: MySQL

MongoDB Through a MySQL Lens

MySQL Performance Blog - Mon, 2016-11-14 19:56

This blog post looks at MongoDB and MySQL, and covers high-level MongoDB strengths, weaknesses, features, and uses from the perspective of an SQL user.

Delving into NoSQL coming from an exclusively SQL background can seem like a daunting task. I have worked with SQL in both small MySQL environments and large Oracle SQL environments. 

When is it a good choice?

MongoDB is an incredibly robust, scalable, and operator-friendly database solution. MongoDB is a good choice when your developers will also be responsible for the database environment. In small shops and startups, this might be the case. MongoDB stores information in BSON (binary JSON), a binary-encoded extension of JSON (JavaScript Object Notation), and you interact with that data on the back end using a JSON-like syntax. JSON is easily relatable to other programming languages, and many developers will already have experience with it.

MongoDB is also a good option when you expect a great deal of your traffic to be writes. This is not to say that MySQL does not have good options when dealing with write-heavy environments, but MongoDB handles this with relative ease. Facebook designed the RocksDB storage engine for write-heavy environments, and it performs well (with benchmark testing demonstrating this).

MongoDB is a good choice when you need a schemaless, or schema-flexible, data structure. MongoDB handles changes to your data organization with relative ease and grace. This is the selling point of NoSQL solutions. There have been many improvements in the MySQL world that make online schema changes possible, but the ease at which this is done in MongoDB has not yet been matched. The ability to create records without defining structure gives MongoDB added flexibility.
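
As a small illustration (the collection and field names are invented for the example), documents of different shapes can be inserted into the same collection with no prior DDL:

# Two differently shaped documents in one collection -- no CREATE TABLE or ALTER TABLE needed
mongo --quiet --eval '
  db.customers.insert({name: "Ann", email: "ann@example.com"});
  db.customers.insert({name: "Bob", phones: ["555-0100", "555-0199"], vip: true});
  printjson(db.customers.find().toArray());
'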

Another reason to choose MongoDB is its functionality with regards to replication setup, built-in sharding, and auto elections. Setting up a replicated environment in MongoDB is easy, and the auto-election process allows a secondary to take over in the event of a primary database failure. Built-in sharding allows for easy horizontal scaling, which can be more complicated to manage, set up and configure in a MySQL environment.
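
For instance, a three-member replica set can be brought up with only a few commands, after which the members hold an election on their own; the host names and paths below are placeholders:

# On each host, start mongod with a shared replica set name (paths and ports are illustrative)
mongod --replSet rs0 --port 27017 --dbpath /data/rs0 --fork --logpath /data/rs0/mongod.log

# From the shell on any one member, initiate the set; an election then chooses the primary
mongo --eval 'rs.initiate({_id: "rs0", members: [
  {_id: 0, host: "mongo1.example.com:27017"},
  {_id: 1, host: "mongo2.example.com:27017"},
  {_id: 2, host: "mongo3.example.com:27017"}
]})'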

When should you choose something else?

MongoDB is a great choice for some use cases. It is also not a great choice for others. MongoDB might not be the right choice when your data is highly relational and structured. MongoDB does not support transactions, but there is atomicity at the document level. There are configuration considerations to make for a replicated environment with regards to write concern, but these come at the cost of performance. Write concern verifies that replicas have written the information. By default, MongoDB sets the write concern to request acknowledgment from the primary only, not replicas. This can lead to consistency issues if there is a problem with the replicas.
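
To make the write-concern tradeoff concrete (the collection name is arbitrary), a single write can demand acknowledgment from a majority of replica set members, trading extra latency for a stronger durability guarantee:

# Wait for a majority of members to acknowledge the write (or time out after 5 seconds)
mongo --eval '
  db.orders.insert(
    {order_id: 1001, status: "new"},
    {writeConcern: {w: "majority", wtimeout: 5000}}
  )
'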

How is the structure different?

Many concepts in the SQL world are relatable to the document structure of MongoDB. Let’s take a look at the high-level structure of a simple MongoDB environment to better understand how MongoDB is laid out.

The below chart relates MySQL concepts to MongoDB concepts (the mapping is taken from MongoDB’s documentation):

  • database → database
  • table → collection
  • row → BSON document
  • column → BSON field
  • index → index
  • table joins → embedded documents and linking
  • primary key → primary key (automatically set on the _id field)

Another interesting note is the mongod process. This is a daemon that processes data requests, much the same as the mysqld process for MySQL. This is the process that listens for MongoDB requests, and manages access to the database. As with MySQL, there are a number of start-up options for the mongod process. One of the most important configuration options is --config which specifies a config file to use for your mongod instance. Slightly different from MySQL, this file uses YAML formatting. Below is an example config file for MongoDB. Please note this is to demonstrate formatting. It isn’t optimized for any production database.

By definition, MongoDB is a document store database. This chart gives you some idea of how that relates to the structure of MySQL or any SQL flavor. Instead of building a table and adding data, you can immediately insert documents into a collection without having to define a structure. This is one of the advantages in flexibility that MongoDB offers over MySQL. It is important to note that just because MongoDB offers this flexibility does not mean that organizing a highly functional production MongoDB database is effortless. Similar to choosing any database, thought should be put into the structure and goal of the database to avoid pitfalls down the line.

# mongod.conf, Percona Server for MongoDB
# for documentation of all options, see:
#   http://docs.mongodb.org/manual/reference/configuration-options/

# Where and how to store data.
storage:
  dbPath: /var/lib/mongodb
  journal:
    enabled: true
  engine: rocksdb

# where to write logging data.
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongod.log

processManagement:
  fork: true
  pidFilePath: /var/run/mongod.pid

# network interfaces
net:
  port: 27017
  bindIp: 127.0.0.1

NOTE: YAML formatting does not handle tab. Use spaces to indent.
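
With a file like the one above saved as, say, /etc/mongod.conf (the path is just an assumption), the daemon is started against it with the --config option mentioned earlier:

# Start mongod from the YAML config instead of a long command line
mongod --config /etc/mongod.conf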

How is querying different?

Interacting with the database via the shell also offers something slightly different from SQL. MongoDB is queried using a JSON-like syntax. Again, this should be familiar to web developers, which is one of the appeals of using MongoDB. Below is an example of a query translated from SQL to MongoDB. We have a user table (collection) with just usernames and an associated ID.

In SQL:

select username from user where id = 2;

In MongoDB:

db.user.find({_id:2},{"username":1})

In the JSON format, we specify the user collection to query from and then the ID associated with the document we are interested in. Finally, the field is specified from which we want the value. The result of this query would be the username of the user that has an ID of 2.

Final thoughts

MongoDB is not a silver bullet for your MySQL woes. As both databases continue to evolve, their weaknesses and strengths slowly start to blend together. Do not let the flexibility of MongoDB’s structure fool you into thinking that you don’t need a plan for your database environment. That will surely lead to headaches down the road. The flexibility should allow for dynamic, fast changes, not thoughtless ones. I encourage any MySQL user to get their hands on a MongoDB instance for testing purposes. MongoDB is a popular option in the e-commerce and gaming world because of its flexibility in schema design and its ability to scale horizontally with large amounts of data.

Categories: MySQL

Amazon AWS Service Tiers

MySQL Performance Blog - Fri, 2016-11-11 20:33

This blog post discusses the differences between the Amazon AWS service tiers.

Many people want to move to an Amazon environment but are unsure which AWS service makes the most sense (EC2, RDS, Aurora). For database services, the tiering at Amazon starts with EC2, then moves up to RDS, and on to Aurora. As you move up the tiers, Amazon takes on more of the implementation and management of the database, which limits your optimization options. Obviously, moving up the tiers also increases basic costs, but there are tradeoffs at each level to consider.

  • EC2 (Elastic Compute Cloud) is a basic cloud platform. It provides the user with complete control of the compute environment, while reducing your need to monitor and manage hardware. From a database perspective, you can do almost anything in EC2 that you could do running a database on your own hardware. You can tweak OS and database settings, plus do all of the normal database optimization work you would do in a bare metal environment. In EC2, you can run a single server, master/slave, or a cluster, and you can use MySQL, MongoDB, or any other product. You can use AWS Snapshot Manager to take backups, or you can use another backup tool. This option is ideal if you want all the flexibility of running your own hardware without the hassles of daily hardware maintenance.
  • RDS (Relational Database Service) makes it easy to set up a relational database in the cloud. It offers similar resizing capabilities to EC2, but also automates a lot of tasks. RDS supports Aurora (more on that later), Postgres, MySQL, MariaDB, Oracle, and MSSQL. RDS simplifies deployment and automates some maintenance tasks, but this means that you are limited in the tweaks you can implement at the OS and database configuration level, so you will focus on query and schema changes to optimize a database in this environment. RDS also includes automated backups and provides options for read replicas that you can spread across multiple availability zones. In the EC2 world, all of these are items that you must consider and manage yourself. This choice is great if you are looking to implement a database but don’t want (or don’t know how) to take on a lot of the tasks, such as backups and replication setup, that are needed for a stable and highly available environment.
  • Aurora is one of the database options available through RDS. You might hear people refer to it either as Aurora or RDS Aurora (they’re both the same). With Aurora, Amazon takes on even more of the configuration and management options. This limits your optimization capabilities even more. It also means that there are far fewer things to worry about since Amazon handles so much of the administration. Aurora is MySQL-compatible, and is great if you want the power and convenience of MySQL with a minimum of effort on the hardware side. Aurora is designed to automatically detect database crashes and restart without the need for crash recovery or to rebuild the database cache. If the entire instance fails, Aurora will automatically failover to one of up to 15 read replicas.

With data in the cloud, security becomes a bigger concern. You continue to govern access to your content, platform, applications, systems, and networks, just like you would with data stored in your own datacenter. Amazon’s cloud offerings also support highly secure environments, like HIPAA and PCI compliance. They have designed the cloud environment to be a secure database environment while maintaining the necessary access for use and administration, even in these more regulated environments.

Storing data in the cloud is becoming more common. Amazon offers multiple platform options and allows for easy scalability, availability, and reliability.

Categories: MySQL