Supercloud is an emerging trend in enterprise computing that is predicted to bring major changes to how companies build out their cloud architecture.
Over the past six months, SiliconANGLE Media has been following the growing number of companies turning to supercloud as a way to eliminate multicloud complexity and help their customers monetize data assets.
Building a supercloud isn't a one-size-fits-all project. There are as many flavors of supercloud as there are choices for cloud. Some, like Snowflake Inc., are opting for the proprietary variety. Taking the opposite side of the debate is Databricks Inc., which advocates building on open-source standardization.
"Open source can pretty much do anything," said Ali Ghodsi (pictured), co-founder and chief executive officer of Databricks Inc. "We think that open source is a force in software that's going to continue for decades, hundreds of years, and it's going to slowly replace all proprietary code in its way."
Ghodsi spoke with theCUBE industry analyst John Furrier at Supercloud 22, during an exclusive broadcast on theCUBE, SiliconANGLE Media's livestreaming studio. During "The Open Supercloud" session, they discussed the advantages and disadvantages of taking an open approach to supercloud.
Can open standards deliver the same control, governance, performance and security as de facto proprietary approaches when building an abstraction layer that leverages hyperscaler power to deliver a consistent experience to users and developers? Databricks has bet its fortune that they can.
The company's data lakehouse platform provides an example of an open-source supercloud in action. Built on a cloud data lake of structured and unstructured data hosted on the hyperscalers, and made reliable and performant by Delta Lake, the platform provides a common approach to data management, security and governance through its Unity Catalog layer.
"We're big believers in this data lakehouse concept, which is an open standard for simplifying the data stack and helping people to just get value out of their data in any environment," Ghodsi said.
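To make that pattern concrete, here is a minimal sketch of what the lakehouse layer looks like in practice, assuming the open-source pyspark and delta-spark packages; the path, table and column names are illustrative rather than anything Databricks-specific.

from pyspark.sql import SparkSession
from delta import configure_spark_with_delta_pip

# Configure a local Spark session with the open-source Delta Lake extensions.
builder = (
    SparkSession.builder.appName("lakehouse-sketch")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
)
spark = configure_spark_with_delta_pip(builder).getOrCreate()

# Land raw records as an ACID Delta table; the path could just as easily be
# s3://, abfss:// or gs:// storage on any of the hyperscalers.
events = spark.createDataFrame(
    [("card_123", 42.50), ("card_456", 9.99)], ["card_id", "amount"]
)
events.write.format("delta").mode("append").save("/tmp/lakehouse/events")

# The same files can then be queried like a warehouse table.
spark.read.format("delta").load("/tmp/lakehouse/events").show()

A table like this is what a governance layer such as Unity Catalog would then register, secure and expose consistently across clouds.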
Around 80% of Databricks' customer base is on more than one cloud, and those customers are struggling with the complexity, according to Ghodsi. Reconfiguring data management models over and over to integrate with the different proprietary technologies of the various cloud providers is a time-consuming and difficult task, brought about by the ad-hoc creation of multiclouds "by default rather than by design," as Dell Technologies Inc.'s co-chief operating officer Chuck Whitten has described it.
It's the operations teams that bear the brunt of integrating new technology and making sure it works, according to Ghodsi. And doing it in multiple environments, each with a different proprietary stack, is a tough challenge.
"So, they just want standardization," he said. "They want open-source technologies. They believe in the communities around it. They know that source code is open, so you can see if there are issues with it, if there are security breaches, those kinds of things."
Databricks didn't set out to build a supercloud. The company's mission is to help organizations move through the data/artificial intelligence maturity model, bringing them to the point where they can leverage prescriptive, automated AI and machine learning in the same way that enabled the tech giants to reach the level they are at today, according to Ghodsi.
"Google wouldn't be here today if it wasn't for AI," he said. "You know, we'd be using AltaVista or something."
The continuum starts when a company goes digital and begins to collect data, Ghodsi pointed out. It wants to clean that data and get insights out of it, then moves on to the crystal ball of predictive technology. The end comes when the company can finally automate the process completely and act on the predictions.
"So this credit card that got swiped, the AI thinks it is fraud, we're going to deny it," he said. "That's when you get real value."
The basis of Databricks' data lakehouse, which falls under theCUBE's definition of supercloud, started when the company developed the Delta Lake framework in 2019 as a way to help companies organize their messy data lakes. The same year, the project was donated to the Linux Foundation in order to encourage innovation. Then, at the start of Databricks' Data + AI Summit this past June, Databricks removed any differences between its branded Delta Lake and the open-source version by handing the reins of the storage framework to the Linux Foundation.
"What we're seeing with the data lakehouse is that slowly the open-source community is building a replacement for the proprietary data warehouse, Delta Lake, machine learning, real-time stack in open source, and we're excited to be part of it," Ghodsi said.
Potentially the most important protocol in the data lakehouse is Delta Sharing, according to Ghodsi. The open standard enables organizations to efficiently share large data sets without duplicating them. And because it is open source, any organization can use it to help build a supercloud in whatever design works best for it.
"You don't need to be a Databricks customer. You don't need to even like Databricks," Ghodsi said. "You just need to use this open-source project and you can now securely share data sets between organizations across clouds."
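As a rough illustration of that point, here is a minimal sketch of consuming a shared table with the open-source delta-sharing Python client, assuming the data provider has supplied a profile file; the file name and the share, schema and table names are hypothetical placeholders.

import delta_sharing

# Credentials "profile" file supplied by whoever is sharing the data.
profile = "config.share"

# Discover the tables that have been shared with this recipient.
client = delta_sharing.SharingClient(profile)
print(client.list_all_tables())

# Load one shared table straight into pandas; the recipient never has to
# copy or maintain the underlying data set.
table_url = profile + "#my_share.my_schema.my_table"
df = delta_sharing.load_as_pandas(table_url)
print(df.head())

Because the protocol is an open standard, the provider and the recipient need not run on the same cloud or use the same vendor's tools, which is exactly the point Ghodsi makes above.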
Open source has already become the software default, and in the next couple of years, it's going to be a requirement that software works across the different cloud environments, according to Ghodsi.
"Is it based on open source? Is it using this data lakehouse pattern? And if it's not, I think they're going to demand it," he said.
Here's the complete video interview, part of SiliconANGLE's and theCUBE's coverage of the Supercloud 22 event: