Does data gravity hinder multi-cloud adoption?

  • The competing forces of data gravity and bandwidth will continue to impact the adoption and success of cloud computing for the foreseeable future
  • Hybrid multi-cloud scenarios, which combine on-prem workloads with multiple cloud services, will likely dominate the IT landscape for the upcoming era of computing

Data gravity is the concept that data tends to stay where it is, while applications and services are attracted to, and build up around, that data. Managing data dependencies across multi-cloud platforms while maintaining acceptable application performance, however, remains a major engineering challenge due to the real-world constraints of latency and bandwidth.

While data expands in volume and data gravity keeps it in place, bandwidth grows too. Nielsen’s Law of Internet Bandwidth observes that a high-end user’s bandwidth grows by about 50% each year, which compounds to an order of magnitude roughly every six years. That pace is currently faster than the rate of data growth, but unlike bandwidth, data growth has no practical physical limit.
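The compounding behind that claim is easy to check. A minimal sketch, using the 50% annual rate from Nielsen’s Law cited above (the helper function and its name are purely illustrative):

```python
import math

ANNUAL_GROWTH = 1.5  # Nielsen's Law: bandwidth grows ~50% per year

def years_to_multiply(factor: float, growth: float = ANNUAL_GROWTH) -> float:
    """Years of compound growth needed to increase capacity by `factor`."""
    return math.log(factor) / math.log(growth)

# One order of magnitude (10x) at 50% per year:
print(round(years_to_multiply(10), 1))  # -> 5.7 years
```

So a tenfold speedup arrives after roughly six years of steady 50% growth, which is where the "order of magnitude" framing comes from.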

In the coming years, datasets will only continue to grow, and exponentially at that, especially as organizations depend increasingly on artificial intelligence (AI), machine learning, and other data-intensive applications. According to IDC, worldwide data creation will grow to a mind-boggling 163 zettabytes by 2025. That’s ten times the amount of data produced in 2016.
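A quick sanity check shows how this squares with the bandwidth figures above. Assuming IDC’s commonly quoted baseline of roughly 16.1 ZB in 2016, which makes the window about nine years, the implied annual data growth is still below the 50% annual bandwidth growth:

```python
def implied_annual_growth(factor: float, years: float) -> float:
    """Compound annual growth rate implied by growing `factor`-fold over `years`."""
    return factor ** (1 / years) - 1

# Roughly tenfold data growth over about nine years (2016 -> 2025, per IDC):
rate = implied_annual_growth(163 / 16.1, 9)
print(f"{rate:.0%}")  # -> 29% per year
```

About 29% per year versus bandwidth’s 50%: the rates differ, but since data has no physical ceiling, the gap is not guaranteed to hold.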

How does data gravity influence a multi-cloud strategy?

The bigger the total amount of data stored in one place, the more applications, services, and other data are pulled towards it: that is data gravity. As data grows, it becomes increasingly difficult and costly to move, and the network of connections between data, applications, and other services becomes more complex.

There are two challenges at the heart of data gravity: latency and scale. The speed of light sets a hard lower bound on how fast data can travel between two points on a network, so placing data as close as possible to the applications and services that use it reduces latency. And as your data grows in size, it becomes progressively harder and more expensive to move.
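To make the latency constraint concrete, here is a back-of-the-envelope sketch. The fiber propagation speed (roughly two-thirds of c) is a standard approximation, and the city-pair distance is an illustrative great-circle estimate, not a measured route:

```python
FIBER_SPEED_KM_S = 200_000  # light in optical fiber travels at ~2/3 of c

def min_rtt_ms(distance_km: float) -> float:
    """Best-case round-trip time in milliseconds over `distance_km` of fiber."""
    return 2 * distance_km / FIBER_SPEED_KM_S * 1000

# New York <-> London, ~5,600 km apart:
print(round(min_rtt_ms(5_600)))  # -> 56 ms, before any routing or queuing overhead
```

No engineering can push a transatlantic round trip below that floor, which is why co-locating data with the applications that consume it matters so much.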

One way to reduce latency is to put all of your data in a single cloud. Like the proverbial warning about putting all of your eggs in one basket, though, this approach has serious drawbacks. Working with a single vendor may sound easier in theory: one bill to pay, and the same underlying infrastructure for every application. In practice, the idea that a single public cloud will solve all of your problems is a pipedream that no organization can really make work.

In short, between the demand for edge computing, the need to comply with data sovereignty regulations, and the general need to stay nimble and flexible, keeping all of your data in one cloud is simply not practical for a business competing in today’s market. Hence the move towards multi-cloud adoption.

That being said, with a hybrid cloud infrastructure, organizations can place apps and services where their data lives, closer to where they are needed, addressing both latency problems and data sovereignty requirements. The key to making this work is a common operating environment across these various clouds and data center locations. If your organization has to maintain applications for many different operating environments, the associated complexity and cost can kill you competitively.

Industry analysts reckon that most organizations use a mix of Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform, VMware, on-premises storage, and more — so there needs to be a way to make apps and services portable and interchangeable, regardless of hosting service. With a common operating system, you can write applications once, run them where it makes sense, and manage your whole environment from one console.