Sizzling SQL databases

First look: Google Cloud SQL takes on Amazon Aurora

Google’s second-generation SQL database as a service exits beta with dramatic improvements in performance and storage capacity

Google’s first attempt at MySQL as a service, Cloud SQL, launched in 2011, was something of a disappointment from the standpoint of performance and scalability. Cloud SQL Second Generation, which was released on Aug. 16 of this year, has no such problems.

In essence, Cloud SQL Second Generation is an implementation of MySQL 5.7 on top of Google’s Compute Engine and Persistent Disk. According to Google, Cloud SQL Second Generation runs seven times faster and has 20 times more storage capacity than its predecessor -- with lower costs, higher scalability, automated backups that can restore your database from any point in time, and 99.95 percent availability, anywhere in the world.

Cloud SQL features

Cloud SQL Second Generation is a fully managed database service, taking the knob-twiddling out of MySQL configuration. As you can see in the figure below, when you configure Cloud SQL you set the storage type, the initial storage capacity, and whether you want a fail-over replica, automated backups, binary logging, and automatic storage capacity expansion.

[Screenshot: the Google Cloud SQL "Create instance" form]

Creating a Google Cloud SQL instance involves making only the necessary decisions: the storage type, the initial storage capacity, and whether you want a fail-over replica, automated backups, and binary logging. Further down on this form, there is a check box to enable automatic storage capacity expansion.

You don’t see multiple pages of settings: The database has already been optimized for performance. By contrast, Amazon Aurora does show you all of the MySQL 5.6 settings and allows you to change them, but it also ships with well-tuned defaults.
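
If you’d rather script the setup than click through the console, the same handful of decisions maps onto a small request body in the Cloud SQL Admin API. Here’s a minimal Python sketch, assuming the v1beta4 API via the google-api-python-client library and application default credentials; the project and instance names are placeholders:

    # A minimal sketch of creating a Cloud SQL Second Generation instance
    # through the Cloud SQL Admin API (v1beta4). Names are placeholders.
    from oauth2client.client import GoogleCredentials
    from googleapiclient import discovery

    credentials = GoogleCredentials.get_application_default()
    sqladmin = discovery.build('sqladmin', 'v1beta4', credentials=credentials)

    body = {
        'name': 'my-instance',              # hypothetical instance name
        'region': 'us-central1',
        'settings': {
            'tier': 'db-n1-standard-1',     # machine type: 1 virtual CPU, 3.75GB RAM
            'dataDiskType': 'PD_SSD',       # SSD storage for better performance
            'dataDiskSizeGb': '10',         # initial storage capacity
            'storageAutoResize': True,      # automatic storage capacity expansion
            'backupConfiguration': {
                'enabled': True,            # automated backups ...
                'binaryLogEnabled': True,   # ... plus binary logging for recovery
            },
        },
    }
    operation = sqladmin.instances().insert(project='my-project', body=body).execute()
    print(operation['name'])  # a long-running operation you can poll for completion

Creation runs asynchronously; the returned operation can be polled through the API’s operations() collection until it reports DONE.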

I asked the Google Cloud SQL team how they optimized the database for performance. They answered:

Cloud SQL uses industry-standard best practice tuning parameters to achieve current performance levels. We will continue to optimize MySQL and the operating system to take advantage of our underlying Google Cloud Platform infrastructure. We will also continue to improve and innovate on our infrastructure to improve the product’s performance. Our current configuration is typical of what can be achieved using Compute Engine instances. Some specifics are:

  1. Ext4 filesystem options: noatime, discard
  2. Disk scheduler: set to “noop”
  3. MySQL options:
    • innodb_buffer_pool_size is set by machine type (amount of available RAM)
    • innodb_flush_method = O_DIRECT
    • innodb_flush_neighbor_pages = 0
    • innodb_log_file_size = 512MB
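
You can confirm most of the MySQL-level settings from any ordinary client session. Here’s a short sketch, assuming the PyMySQL driver and placeholder connection details; note that stock MySQL 5.7 names the flush-neighbors variable innodb_flush_neighbors, so the query checks both spellings:

    # Read back the InnoDB tuning parameters Google lists, using SHOW VARIABLES.
    # Host, user, and password are placeholders for your own instance.
    import pymysql

    conn = pymysql.connect(host='203.0.113.10', user='root', password='secret')
    with conn.cursor() as cur:
        cur.execute("SHOW VARIABLES WHERE Variable_name IN "
                    "('innodb_buffer_pool_size', 'innodb_flush_method', "
                    "'innodb_flush_neighbors', 'innodb_flush_neighbor_pages', "
                    "'innodb_log_file_size')")
        for name, value in cur.fetchall():
            print(name, '=', value)
    conn.close()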

The management of the database has been automated as much as makes sense. Backups, replication, patches, and updates are all handled for you on your schedule, in a way that ensures better than 99.95 percent availability. That can’t be five-nines availability, because patches and updates have to happen, but at least they occur only once every few months, during your chosen maintenance window.

If you want to clone a database using the console, it’s a matter of selecting the instance, dropping down the action menu, picking “Create clone,” and clicking through the “Create clone” page. Creating a replica and adding a failover are equally easy, and like cloning, they do not interfere with performance. You can also perform all of these actions from the gcloud command line or programmatically via the REST API.
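
For example, the console’s “Create clone” action corresponds to a single call against the Admin API. A sketch under the same v1beta4 assumptions as above, with placeholder names:

    # Clone an existing Cloud SQL instance -- the programmatic equivalent of
    # the console's "Create clone" action. Names are placeholders.
    from oauth2client.client import GoogleCredentials
    from googleapiclient import discovery

    credentials = GoogleCredentials.get_application_default()
    sqladmin = discovery.build('sqladmin', 'v1beta4', credentials=credentials)

    body = {'cloneContext': {'destinationInstanceName': 'my-instance-clone'}}
    operation = sqladmin.instances().clone(
        project='my-project', instance='my-instance', body=body).execute()
    print(operation['status'])  # cloning proceeds as a background operation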

[Diagram: the Google Cloud SQL Proxy connection path]

It is common to restrict access to MySQL database servers to specific IP addresses, but that can be an issue for developers’ machines with dynamic IP addresses. Cloud SQL Second Generation includes a proxy mechanism that sidesteps this issue without weakening security.

You can connect to and manage your Cloud SQL databases through the standard MySQL port and protocol, or through a new proxy mechanism, as shown in the figure above. The proxy uses OAuth and its own secure tunnel.
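
Because the proxy listens on the developer’s machine and forwards traffic over that authenticated tunnel, application code simply connects to localhost. A minimal sketch, assuming PyMySQL and a proxy already running on the standard MySQL port; the credentials are placeholders:

    # Connect through a locally running Cloud SQL Proxy. The proxy handles the
    # OAuth handshake and secure tunnel; the client sees a local MySQL endpoint.
    import pymysql

    conn = pymysql.connect(host='127.0.0.1', port=3306,  # the proxy's local listener
                           user='appuser', password='secret')
    with conn.cursor() as cur:
        cur.execute('SELECT VERSION()')
        print(cur.fetchone()[0])  # e.g. a 5.7.x version string
    conn.close()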

If you have enabled backups and binary logging, you can use point-in-time recovery to restore your Cloud SQL database to the state it had immediately prior to a specific event, such as an undesired table drop. Point-in-time recovery uses Cloud SQL’s clone functionality. Total recovery time is the time needed to create a clone from a snapshot (about a minute) plus the time MySQL takes to replay its binary log to the desired position, which depends on how much activity the log recorded.
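
In API terms, point-in-time recovery is a clone request that carries binary log coordinates, telling MySQL where to stop replaying. A sketch under the same v1beta4 assumptions; the log file name and position are placeholders you would determine by inspecting the binary log:

    # Point-in-time recovery expressed as a clone that stops binary-log replay
    # just before the bad event. All names and coordinates are placeholders.
    from oauth2client.client import GoogleCredentials
    from googleapiclient import discovery

    credentials = GoogleCredentials.get_application_default()
    sqladmin = discovery.build('sqladmin', 'v1beta4', credentials=credentials)

    body = {
        'cloneContext': {
            'destinationInstanceName': 'my-instance-recovered',
            'binLogCoordinates': {
                'binLogFileName': 'mysql-bin.000042',  # placeholder log file name
                'binLogPosition': 1234567,             # position just before the table drop
            },
        },
    }
    operation = sqladmin.instances().clone(
        project='my-project', instance='my-instance', body=body).execute()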

Cloud SQL scalability

The hardware options for Cloud SQL Second Generation range from one virtual CPU and 614MB of RAM to 16 virtual CPUs and 104GB of RAM, as shown in the figure below. Note that the high end is nominally half as capable as the biggest Amazon Aurora instance, although a Google Compute Engine virtual CPU doesn’t have exactly the same power as an Amazon virtual CPU, and Google Persistent Disk SSDs don’t have exactly the same performance as Amazon Aurora SSDs.

Within each grouping -- standard and high memory machines -- you choose the CPU and memory together, not separately. If you’re trying things out, you can run a db-f1-micro shared-core Cloud SQL machine (one virtual CPU and 614MB of RAM, no SLA) for a penny and a half per hour, plus storage and network egress charges. Even a starving writer can afford that.

[Screenshot: Google Cloud SQL machine types]

Cloud SQL supports a wide range of virtualized hardware resources, from one virtual CPU and 614MB of RAM to 16 virtual CPUs and 104GB of RAM. The rule of thumb is to start with the smallest machine type that has enough RAM to hold your largest database table.

If your maximum table size is big but your expected maximum load is relatively light, you can probably find an optimal machine size in the high-memory group. If the maximum table size is smaller or the load is relatively heavy, consider the standard machine group. Either way, the rule of thumb is to start with a machine that has enough memory to keep your biggest table entirely in RAM.
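
To apply that rule of thumb, you need to know how big your biggest table actually is. One quick way, sketched here with the PyMySQL driver and placeholder connection details, is to ask information_schema for data plus index size per table:

    # List the largest tables so you can pick a machine type with enough RAM
    # to hold the biggest one. Connection details are placeholders.
    import pymysql

    conn = pymysql.connect(host='127.0.0.1', user='appuser', password='secret')
    with conn.cursor() as cur:
        cur.execute(
            "SELECT table_schema, table_name, "
            "       ROUND((data_length + index_length) / 1024 / 1024 / 1024, 2) "
            "FROM information_schema.tables "
            "ORDER BY data_length + index_length DESC LIMIT 5")
        for schema, table, size_gb in cur.fetchall():
            print('%s.%s: %s GB' % (schema, table, size_gb))
    conn.close()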

You also choose storage size and storage performance together: within either SSDs or spinning disks, larger volumes deliver more throughput and IOPS. My rule of thumb for economy is to start with enough storage for your current needs and let the storage grow automatically; growing the storage does not incur downtime. For better storage performance, definitely choose SSDs and allocate more storage than you think you’ll need.

You can scale your machine type at any time, at the cost of a brief outage. When the machine type is changed, the database is shut down, the storage volume is moved to another virtual machine, and the migrated instance is restarted. This process incurs three to five minutes of downtime.
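
Scaling is the same kind of one-call operation as cloning. Here’s a sketch under the same v1beta4 assumptions, moving a placeholder instance to a larger tier; some revisions of the API also expect the current settingsVersion in the body:

    # Change the machine type (tier) of an existing instance. Expect a few
    # minutes of downtime while the instance restarts on the new VM.
    from oauth2client.client import GoogleCredentials
    from googleapiclient import discovery

    credentials = GoogleCredentials.get_application_default()
    sqladmin = discovery.build('sqladmin', 'v1beta4', credentials=credentials)

    body = {'settings': {'tier': 'db-n1-highmem-4'}}  # hypothetical target: 4 vCPUs, 26GB RAM
    operation = sqladmin.instances().patch(
        project='my-project', instance='my-instance', body=body).execute()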

Cloud SQL performance 

I’m about to show you some straightforward transactional benchmarks, conducted by Google and not yet replicated by InfoWorld, but remember that your applications may not have the same load profiles as these -- or any -- benchmarks. Google used Sysbench, exactly as Amazon did for its tests of Aurora. One big difference is that Amazon used its largest database instance, which has 32 virtual CPUs and 244GB of memory, and ran four client machines to saturate the database with load to get the highest possible transaction rate numbers. Google used its own largest Cloud SQL instance, with 16 virtual CPUs and 104GB of RAM, and chose equivalently sized Amazon RDS MySQL and Aurora instances for fair comparisons.

Google chose a table size that was roughly five times the available RAM, so the data would not fit into memory, and the benchmark tests would reflect storage performance. As I’ve repeated several times, if you want your own database to be fast, you will want the maximum table size to fit into memory.

The first chart below shows the transaction rate versus the number of concurrent Sysbench threads for Amazon RDS MySQL, Amazon Aurora, and Google Cloud SQL Second Generation. Each thread count ran for 30 minutes. For the eight-thread run, Google's results show that the Cloud SQL transaction rate peaks at the beginning and settles down, while the Aurora transaction rate is more consistent but lower. By the 32-thread run, the Cloud SQL and Aurora rates start the same, but the Cloud SQL rate settles to a lower point.

You would think that was a bad result for Google, but it’s not as bad as it might seem. To understand why, you need to look at the latency.

[Chart: database latency versus thread count for Google Cloud SQL, Amazon RDS MySQL, and Aurora]

Database latency for Google Cloud SQL Second Generation, Amazon RDS MySQL, and Aurora. Google’s results indicate that Cloud SQL has lower latency up to 16 threads.

Many real-world applications, in particular customer-facing websites, have upper limits on acceptable database latency if the application is to stay responsive. If you look at the latency at the higher thread counts in the chart above, you’ll see that it starts to climb at around 16 threads for the RDS MySQL database and at around 128 threads for the Aurora and Cloud SQL databases. You don’t want to operate your database in that region if you’re running a latency-sensitive application.

Cloud SQL vs. alternatives

The other number you need to know when choosing a database service is cost. To first order, the costs of Google Cloud SQL Second Generation and Amazon Aurora are about the same. The higher transaction rates Google reports for Cloud SQL Second Generation suggest that Cloud SQL provides better value than Aurora for applications that require low latency, but that isn’t the whole story for all use cases.

First, let’s recall the two other high-performance MySQL-compatible databases I’ve reviewed since looking at Aurora: DeepSQL and ClustrixDB. Both are competitive among the high-performance options we’re discussing. Both have specific strengths, can be run on either cloud, and deserve your consideration.

Second, let’s remember that Aurora can scale up beyond Google Cloud SQL Second Generation by roughly a factor of two in memory and number of CPUs, can handle many more threads, and offers more storage capacity (64TB versus 10TB) and more failover targets (15 versus one). If you need a MySQL database that exceeds the limits of Cloud SQL, then Aurora might be a better choice.

Third, let’s remember that databases work best when they are “near” the applications using them. If your apps are already in the Google or Amazon clouds, then it makes sense to keep your databases not only in the same cloud as the apps, but also in the same availability zone.

Finally, let me repeat the most important fact about benchmarks: They aren’t your application, and they may or may not bear any resemblance to your loads. Do your own tests before committing yourself to one database as a service or another -- moving databases from cloud to cloud is almost never as easy as people imagine.

At a Glance
  • Google Cloud SQL is a high-performance managed cloud MySQL 5.7 database service.

    Pros

    • Google benchmarks show high transaction rates at moderate loads
    • Wide range of virtualized hardware resources available
    • As close to maintenance-free as could be expected
    • Very good value

    Cons

    • Write performance cannot be expanded beyond 240MB/s without sharding
    • Memory cannot be expanded beyond 104GB
    • Storage cannot be expanded beyond 10TB

Copyright © 2016 IDG Communications, Inc.