Databricks landed the biggest acquisition of the generative AI era to date when it bought MosaicML in June for $1.3 billion. It was a nearly 6x markup on MosaicML’s previous valuation of $222 million — a “bargain,” if you ask Databricks CEO Ali Ghodsi: “In hindsight, I would’ve paid even more.”
On Wednesday, Databricks announced a new product Ghodsi hopes will validate his sentiment. His company has grown into one of the hottest enterprise tech firms in the world, last valued at $43 billion in September, by selling its flagship data “lakehouse” software to companies looking for a way to store and analyze data that is both structured (think: spreadsheets of bank transactions) and unstructured (like images and raw text files). The new offering, called the Data Intelligence Platform, will inject MosaicML’s generative AI expertise into the lakehouse.
“I think of the lakehouse as a small platform. It’s the foundation, and this is the skyscraper on top of it,” Ghodsi told Forbes.
Now, instead of needing to know coding languages like Python to analyze a company’s data, workers will be able to ask questions in plain English to get insights about the data, as if using a primitive version of ChatGPT, ultimately allowing many more users to make use of data. This is powered by MosaicML’s large language models, the underlying technology that has powered the current AI boom. Customers can train these custom models using proprietary datasets that they have stored in Databricks.
Among beta testers, Ghodsi said Databricks is seeing a lot of CEOs and executives use the new product to answer questions they’d otherwise have to task technical staff to find out. At Tufts Medicine, some doctors have been using the tool to review patient data. “This enables many more people in the organization to get the insights and do the kind of work that previously only data scientists could do,” Ghodsi said. “I have no doubt in my mind that this is the future of all data platforms — Databricks, Snowflake, [Google’s] BigQuery and so on.”
As Databricks, which ranked No. 2 on Forbes’ Cloud 100 list, gears up for a hotly anticipated IPO, it finds itself in a two-front war against longtime data platform rivals (Snowflake, its closest parallel, is trading on the public market at a $55 billion market capitalization) and the new crop of AI model providers led by OpenAI (No. 1 on Cloud 100) and Anthropic (No. 73). Databricks announced in September that its revenue run rate had surpassed $1.5 billion; OpenAI is fast catching up, at $1.3 billion, according to an October report in The Information.
With ChatGPT, OpenAI lowered the bar for consumers to access AI, and its revenue in turn has so far been reportedly dominated by subscriptions to the chatbot. But more companies are leveraging big models like its GPT-4 or Anthropic’s Claude 2 to build their own AI apps, like productivity tool Notion, which earlier this week released a new AI-powered feature. Databricks is helping other customers tackle the problem from a different angle: instead of using a massive model, it helps them create smaller, bespoke models.
“What we are seeing most interest in is people who have very sensitive data who want to build their own AI,” Ghodsi said. “We’re helping them do that.” Already Databricks is at capacity on the 15,000 GPUs it leases for these purposes, Ghodsi said, meaning some clients will have to wait. The new product, which lowers the bar for workers to access the data on top of these custom AI models, represents another step in Databricks’ quest to carve out a niche in the fast-moving AI arena.
Still, there remains a learning curve to using the product. When Ghodsi demoed a version of it to Forbes, two queries returned the message “Oh no! An internal error happened.” Databricks clarified afterwards that a sample dataset was used for the demo, and that a user needs to be well-versed in the subject matter to ask the right questions that will return useful answers. The product, then, likely won’t make data scientists obsolete anytime soon. “I think they will continue to be around and they’re going to become even more important because they’re the experts that can troubleshoot those errors you saw,” Ghodsi said.
Update: Databricks clarified after the story was published that the part of the product that returned the error messages is still in beta testing and was not part of Wednesday’s product launch. Databricks said the current product includes a different conversational assistant that can answer simple questions, and it is planning to roll out the more advanced one soon.