How TiDB combines OLTP and OLAP in a distributed database

The architecture and use cases of a NewSQL hybrid transactional-analytical, MySQL-compatible, horizontally scalable database

TiDB is an open-source, cloud-native, MySQL-compatible distributed database that handles hybrid transactional and analytical processing (HTAP) workloads. It is a member of the “NewSQL” class of relational databases that are designed to be deployed at massive scale. For those of you wondering, the “Ti” stands for Titanium.

PingCAP started building TiDB just three and a half years ago, but already the product has gathered upwards of 15,000 GitHub stars, 200 contributors, 7200 commits, 2000 forks, and 300 production users. Recently TiDB also collected InfoWorld’s 2018 Bossie Award as one of the best open source software projects in the data storage and analytics space.

In this article, I’ll walk through the core features and architectural design of TiDB, cover the three main use cases for the database, and offer a preview of the forthcoming multicloud TiDB-as-a-service offering and TiDB Academy from PingCAP.

TiDB features

The core features of TiDB include elastic horizontal scalability, distributed transactions with ACID guarantees, high availability, and real-time analysis on live transaction data. Let’s take a look at the platform architecture behind these features. The TiDB platform has the following components:

  • TiDB: A stateless SQL layer that is MySQL compatible, built in Go.
  • TiKV: A distributed transactional key-value store, built in Rust. (TiKV recently became a Cloud Native Computing Foundation project.)
  • TiSpark: An Apache Spark plug-in that connects to TiKV or a specialized, columnar storage engine (something we are working on... stay tuned).
  • Placement Driver (PD): A metadata cluster powered by Etcd that manages and schedules TiKV.

TiKV is the foundational layer. This is where all of the data is persisted, automatically partitioned into smaller chunks (we call them “regions”), and automatically replicated and made strongly consistent by executing the Raft consensus protocol. Working with PD, the Placement Driver, TiKV can replicate data across nodes, data centers, and geographical locations. It can also dynamically remove hotspots as they form, and split or merge regions to improve performance and storage usage. We implement range-based sharding inside TiKV, instead of hash-based sharding, because our goal from the beginning was to support a full-featured relational database. Thus TiKV supports various types of scan operations—table scans, index scans, etc.
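To see why range-based sharding matters for scans, here is a minimal Python sketch of routing keys and key ranges to regions. The region boundaries and the routing logic are illustrative only; TiKV's real regions are dynamically split byte ranges managed by PD, not static string boundaries like these.

```python
import bisect

# Illustrative region boundaries: region i owns keys in
# [REGION_STARTS[i], REGION_STARTS[i+1]). Toy values, not TiKV's format.
REGION_STARTS = ["", "g", "n", "t"]

def region_for_key(key):
    """Route a single key to the region that owns it (a point lookup)."""
    return bisect.bisect_right(REGION_STARTS, key) - 1

def regions_for_range(start_key, end_key):
    """A range scan touches only the contiguous run of regions that
    overlap [start_key, end_key) -- the locality that hash-based
    sharding gives up, and the reason range sharding suits table
    scans and index scans."""
    first = region_for_key(start_key)
    last = region_for_key(end_key)
    return list(range(first, last + 1))

print(region_for_key("apple"))      # region 0
print(regions_for_range("k", "p"))  # regions [1, 2]
```

With hash sharding, the same `[k, p)` scan would have to fan out to every region, because lexically adjacent keys land on unrelated shards.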

The stateless SQL layer in TiDB is built to handle 100 percent of your online transaction processing (OLTP) workloads and 80 percent of common, ad-hoc, online analytical processing (OLAP) workloads. This layer, which demonstrates regular performance improvements (see our latest TPC-H benchmarks), leverages the distributed nature of TiKV to execute queries in parallel via a coprocessor layer, pushing down partial queries to different TiKV nodes simultaneously.
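The coprocessor idea can be sketched in a few lines of Python: each "TiKV node" computes a partial aggregate over its local rows, and the stateless SQL layer merges the partials into the final answer. The node layout, data, and merge logic here are illustrative, not TiDB's actual execution plan.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical rows (order amounts) stored per TiKV node.
NODE_ROWS = {
    "tikv-1": [10, 20, 30],
    "tikv-2": [5, 15],
    "tikv-3": [40],
}

def local_partial(rows):
    """What a coprocessor returns for AVG: (count, sum), not raw rows,
    so only tiny partial results cross the network."""
    return len(rows), sum(rows)

def distributed_avg():
    # Push the partial query down to every node in parallel...
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(local_partial, NODE_ROWS.values()))
    # ...then the SQL layer merges the partials into the final result.
    total_count = sum(c for c, _ in partials)
    total_sum = sum(s for _, s in partials)
    return total_sum / total_count

print(distributed_avg())  # 20.0
```

The key point is that aggregation work (and network traffic) scales with the number of nodes rather than funneling every row through a single SQL server.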

For more complex OLAP workloads, such as iterative analysis for training machine learning models or real-time business intelligence gathering, a second stateless SQL layer—TiSpark—reports for duty, also drawing data directly from TiKV. Whereas TiDB speaks MySQL, TiSpark exposes Spark SQL.

The TiDB platform architecture. 

TiDB architecture

As you might have noticed, the entire TiDB platform is modularized—all components are separate code bases and loosely coupled. You can deploy the entire TiDB platform as a complete package (most users do) or just parts of it depending on your needs. This modular architecture gives users maximum flexibility and fits the standard of a cloud-native architecture. As per the CNCF’s official definition, cloud-native techniques are ones that “enable loosely coupled systems that are resilient, manageable, and observable.”

As a TiDB user, you can scale your stateless SQL server or TiSpark layer (i.e. your compute resources) out or in independently of scaling TiKV (i.e. your storage capacity), allowing you to make the most of the resources you are consuming to fit your workloads. You can almost think of the TiDB stateless SQL servers as a microservice that sits on top of TiKV, which is a stateful application where your data is persisted. This design makes isolating bugs easier, and rolling upgrades and maintenance quicker and less disruptive.

The trade-off to these TiDB advantages is some additional complexity to deployment and monitoring—there are just more pieces to keep track of. However, with the rise of Kubernetes and the Operator pattern pioneered by CoreOS, deploying and managing TiDB is simple, straightforward, and increasingly automated. The open-source TiDB Operator for Kubernetes allows you to deploy, scale, upgrade, and maintain TiDB in any cloud environment—public, private, or hybrid. TiDB installs Prometheus and Grafana by default, so monitoring comes “out of the box.” (See our tutorial on TiDB Operator.)
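As a sketch of what Operator-driven deployment looks like, here is a minimal TidbCluster manifest. Treat the field names as an assumption to check against the TiDB Operator documentation for your version; the replica counts are illustrative.

```yaml
# Hypothetical minimal TidbCluster manifest for the TiDB Operator.
# Verify the exact schema against the Operator docs for your release.
apiVersion: pingcap.com/v1alpha1
kind: TidbCluster
metadata:
  name: demo-cluster
spec:
  pd:
    replicas: 3   # PD needs a quorum, so use an odd count
  tikv:
    replicas: 3   # stateful storage layer; scale for capacity
  tidb:
    replicas: 2   # stateless SQL layer; scale for compute
```

Because compute (tidb) and storage (tikv) are separate fields, scaling one is a one-line change that does not touch the other.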

Ultimately, flexible scalability of your technical assets is crucial for business success. It’s the difference between becoming the next Facebook and the next Friendster. TiDB’s modularity and Kubernetes orchestration let you bring that flexible scalability to your database services.

Finally, let’s look at the three main use cases for TiDB: MySQL scalability, HTAP real-time analytics, and unified data storage.

A sample Grafana dashboard monitoring a TiDB deployment. 

TiDB use case: MySQL scalability

Because TiDB speaks MySQL (it is compatible with both the MySQL wire protocol and MySQL ecosystem tools like MyDumper and MyLoader), it’s a natural fit for MySQL users who have trouble scaling. To be clear, TiDB is not a replacement for MySQL; rather, it complements MySQL. MySQL is still a great option as a single-instance database, so if your data size or workload is small, stick with MySQL. But if you are scratching your head over issues like these:

  • Considering how to replicate, migrate, or scale your database for extra capacity
  • Looking for ways to optimize your existing storage capacity
  • Getting concerned about slow query performance
  • Researching middleware scaling solutions or implementing a manual sharding policy

Then it’s the right time to start thinking about using a distributed SQL database like TiDB, which takes care of all these concerns out of the box for you. The shortcomings of MySQL sharding solutions are why Mobike, one of the world’s largest dockless bike-sharing platforms, adopted TiDB (read the Mobike case study). Mobike operates nine million smart bikes in 200 cities serving 200 million users, so it’s not hard to imagine the scaling bottlenecks its team experienced with MySQL. Mobike addressed its need for elastic scalability by deploying TiDB alongside MySQL, along with PingCAP’s suite of enterprise tools including Syncer, which automatically syncs MySQL masters with a TiDB cluster.

A key differentiator between TiDB and other MySQL-compatible databases is TiDB’s distributed architecture. MySQL is a 23-year-old technology that was never meant to work as a distributed system. For example, unlike TiDB, MySQL cannot generate query plans that push down partial queries into multiple machines simultaneously to do parallel processing. TiDB’s SQL parser, cost-based optimizer, and coprocessor layer were built from the ground up to leverage the computing resources and parallelism of a distributed database, so MySQL users can have more power at their disposal.

TiDB use case: HTAP real-time analytics

HTAP (hybrid transaction and analytical processing) is a term, coined by Gartner in 2014, that describes a database architecture that breaks down the wall between transactional and analytical data workloads. The goal is to give businesses real-time analytics and thus enable real-time decision-making. Other industry analyst firms have come up with their own terms for this architecture—451 Research’s “HOAP” (hybrid operational-analytic processing), Forrester’s “Translytical,” and IDC’s “ATP” (analytical transaction processing).

As we’ve discussed, TiDB breaks down the wall between OLTP and OLAP by decoupling its compute layer and storage layer, and using different stateless SQL engines (TiDB and TiSpark) for different analytics tasks. Both engines connect to the same persistent data store (TiKV), making real-time analytics and decision-making a natural product of the system. Cumbersome ETL processes are minimized, “t+1” delays are no longer just part of life, and the data that lives inside TiDB can be used more creatively than before.
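The architecture above can be boiled down to a toy illustration: an OLTP-style point write/read and an OLAP-style scan aggregate operating on the same store, with no ETL copy in between. The dictionary, key format, and function names are illustrative stand-ins, not TiDB APIs.

```python
store = {}  # stands in for the shared TiKV key-value store

# --- OLTP path (the MySQL-speaking TiDB layer) ---
def insert_order(order_id, amount):
    store[f"order:{order_id}"] = amount

def get_order(order_id):
    return store[f"order:{order_id}"]

# --- OLAP path (a TiSpark-style engine reading the same store) ---
def total_revenue():
    # Scans the live transactional data directly -- no "t+1" ETL copy.
    return sum(v for k, v in store.items() if k.startswith("order:"))

insert_order(1, 25)
insert_order(2, 75)
print(get_order(1))     # 25  -- transactional point read
print(total_revenue())  # 100 -- analytical scan over the same data
```

In a traditional stack, `total_revenue` would run against a warehouse loaded overnight; here the analytic read sees the write the moment it lands.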

Yiguo.com, a large fresh-produce delivery platform that serves 5 million users, is running Apache Spark on top of TiDB (read the Yiguo.com case study) to accelerate complex queries. By upgrading its infrastructure from SQL Server and strengthening its existing MySQL deployment with TiDB, Yiguo.com can run complex joins with high performance throughout Singles’ Day, China’s largest online shopping day, to gain insights and make decisions in real time.

TiDB use case: Unified data storage

A distributed, modular, HTAP database, TiDB is designed to scale compute and storage capacity horizontally and flexibly to adapt to different workloads, while also serving as a “single source of truth.” By providing scalable SQL services on top of a key-value store, TiDB aims to dramatically reduce the human and technology costs associated with maintaining the data management layer in your infrastructure stack.

For Ele.me, one of the largest food delivery platforms in the world, unifying data storage was one of the key reasons for adopting TiDB and TiKV (read the Ele.me case study). Previously, Ele.me’s data was scattered across a number of different databases—MongoDB, MySQL, Cassandra, Redis. Eventually, this makeshift stack became untenable due to mounting operational and maintenance costs. Now 80 percent of the live traffic to Ele.me, from some 260 million users, is served by a single TiKV deployment. This TiKV cluster spans four data centers, each with 100-plus nodes storing dozens of terabytes of data—always on, always available.

Multicloud TiDB-as-a-service and TiDB Academy

Since PingCAP began building TiDB more than three years ago, the database has been battle-tested in many scenarios. Today, more than 300 companies are relying on TiDB in production to meet their OLTP/OLAP, database scalability, real-time analytics, and unified storage needs. However, there are still many to-do’s on the TiDB roadmap.

One of those items is a fully managed TiDB-as-a-service offering that can be used in any cloud setting—public, private, or hybrid. PingCAP has been working on an enterprise-grade, fully managed TiDB offering based on Kubernetes and will release the first version by the end of this year. If you are interested in early access to this offering, please sign up here.

Another project in the works at PingCAP is TiDB Academy—a set of self-paced, hands-on courses designed to help database administrators, devops pros, and system architects understand the architecture, design choices, strengths, and trade-offs of TiDB. The first course, “Distributed Database with TiDB for MySQL DBAs,” is ready for enrollment. You can sign up here.

And if you just want to give TiDB a spin, check out our TiDB Quick Start Guide.

Kevin Xu (twitter: @kevinsxu) is general manager of global strategy and operations at PingCAP, with a special focus on cloud product management and strategy. 

New Tech Forum provides a venue to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to newtechforum@infoworld.com.

Copyright © 2018 IDG Communications, Inc.