When it comes to databases, why ‘I can’t quit you, baby’

Leaving legacy RDBMSs is hard, but eventually enterprises will break free of Oracle’s and others’ last grip on their data infrastructure


If any company had a reason to dump Oracle, it’s Amazon. And yet, 14 years after Amazon lamented its “straining database infrastructure on Oracle” and started to “evaluate if we could develop a purpose-built database that would support our business needs for the long term,” the commerce and cloud provider won’t be free of Oracle until the first quarter of 2020, as reported by CNBC’s Jordan Novet.

That “I can’t quit you, baby” reality, to use Led Zeppelin’s lyrics, is not so much a testament to Oracle’s database prowess as to the friction inherent in moving data. Or, as Gartner analyst Merv Adrian once put it, “The greatest force in legacy databases is inertia.”

Why even mighty Amazon is stuck on its legacy Oracle databases

Amazon may have pushed Oracle’s database beyond its ability to scale as early as 2004, as Amazon CTO Werner Vogels has called out, but only a decade later was Amazon seriously considering replacing the venerable technology. As Novet’s interviews reveal:

Amazon began moving off Oracle about four or five years ago, said one of the people, who asked not to be named because the project is confidential. Some parts of Amazon's core shopping business still rely on Oracle, the person said, and the full migration should wrap up in about 14 to 20 months. Another person said that Amazon had been considering a departure from Oracle for years before the transition began but decided at the time that it would require too much engineering work with perhaps too little payoff.

“Too much engineering work with perhaps too little payoff” perfectly describes why most legacy tech sticks around. Once an application is written to run on the mainframe, there’s often little point in rewriting it to run elsewhere. In Adrian’s words, “When someone has invested in the schema design, physical data placement, network architecture, etc. around a particular tool, that doesn’t get lifted and shifted easily.”

What’s not broken needn’t be fixed. Or won’t be, anyway.

And so Amazon has been stuck on Oracle, even with the legacy database incapable of scaling to meet Amazon’s needs. Rather than rearchitect, Amazon has simply built new applications that run on its own database technologies like DynamoDB and Aurora.

Meanwhile, Oracle Chairman Larry Ellison gloated on a December 2017 earnings call that Amazon was forced to pay up $50 million, and hundreds of millions over the past few years. Such gloating, however, can’t paper over Oracle’s complete failure to compete with Amazon Web Services in the cloud, where its market share is a rounding error. This wouldn’t be such a big deal if it weren’t for the reality that data increasingly is born in the cloud, and so stays there with cloud databases like those from AWS or Microsoft Azure.

You’re likely not using your RDBMS as intended

Enterprises that catalog their current applications may discover, as Amazon did in 2005, that their relational databases “were frequently not used for their relational capabilities.” Digging deeper, enterprises might also see, like Amazon, that “About 70 percent of operations were of the key-value kind, where only a primary key was used and a single row would be returned. About 20 percent would return a set of rows, but still operate on only a single table.”

In other words, enterprises that look closely at their data just might see that Oracle (or whatever relational database they use) is a really poor fit for their needs.
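To make that kind of catalog concrete, here is a minimal Python sketch of how a team might bucket the SELECT statements in a query log into the three patterns Amazon described: key-value lookups, single-table reads, and multi-table queries. Everything in it is an assumption for illustration: the `PRIMARY_KEYS` mapping, the table names, and the sample log are hypothetical, and the regex heuristic is no substitute for a real SQL parser (it does not recognize comma-style joins, for instance).

```python
import re
from collections import Counter

# Hypothetical mapping of table name -> primary-key column; supply your own.
PRIMARY_KEYS = {"orders": "order_id", "customers": "customer_id"}


def classify(sql: str) -> str:
    """Bucket one SELECT statement by access pattern (rough heuristic)."""
    # Normalize case and whitespace, and drop a trailing semicolon.
    text = " ".join(sql.lower().split()).rstrip(";")
    # Every table named after FROM or JOIN, including those in subqueries.
    tables = re.findall(r"\b(?:from|join)\s+([a-z_][a-z0-9_]*)", text)
    if not tables:
        return "other"
    if len(tables) > 1:
        return "multi_table"
    pk = PRIMARY_KEYS.get(tables[0])
    where = text.split(" where ", 1)[1] if " where " in text else ""
    # Key-value access: the WHERE clause is exactly "<primary key> = <value>".
    if pk and re.fullmatch(rf"{re.escape(pk)}\s*=\s*\S+", where):
        return "key_value"
    return "single_table"


def summarize(queries):
    """Return the share of queries that falls into each bucket."""
    counts = Counter(classify(q) for q in queries)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {bucket: n / total for bucket, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical sample log, for illustration only.
    sample_log = [
        "SELECT * FROM orders WHERE order_id = 42",
        "SELECT * FROM orders WHERE status = 'open'",
        "SELECT o.total FROM orders o JOIN customers c "
        "ON o.customer_id = c.customer_id",
    ]
    print(summarize(sample_log))
```

A large `key_value` share in the result would suggest, as it did for Amazon, that much of the workload never touches the relational features the database is being paid to provide.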

The battleground for data is about enabling transformation

Like Amazon, however, enterprises might also spend the next 14 years largely leaving an RDBMS in place for those old workloads because there’s “too much engineering work with perhaps too little payoff.” For new applications, by contrast, there’s plenty of payoff.

Not only is data increasingly born in the cloud, but the ways you manage it have multiplied considerably. This doesn’t mean relational databases like Oracle are dead. It simply means that a single application will often rely on a variety of databases. As Vogels has written:

We are increasingly seeing customers wanting to build Internet-scale applications that require diverse data models. In response to these needs, developers now have the choice of relational, key-value, document, graph, in-memory, and search databases. Each of these databases solve a specific problem or a group of problems.

This, however, is a forward-looking statement. It’s about what companies are building or will build, rather than what they have already built (and must continue to live with).

This is the battleground for data. It’s not about legacy applications and the legacy databases that support them. No, it’s all about the future of the enterprise as companies seek to put a bewildering array of data to work in transforming how they operate.

This is a world that Oracle, the industry leader, thus far has failed to impact. If this continues, Amazon will not be the only one to eventually break free of Oracle’s last grip on its data infrastructure; most other enterprises will, too.

Copyright © 2018 IDG Communications, Inc.