Internet giants like Google and Amazon have benefited greatly from using machine learning algorithms to enhance their online services. Other, smaller businesses want similar tools but can’t afford the scientific and engineering muscle required to build them properly. Both new and established high-tech players are now racing to make machine learning technology accessible to more companies, and there doesn’t seem to be a shortage of investor cash for smart startups in the field.
One such startup is Seattle-based Dato, which until today was called GraphLab. The company announced the name change this morning, together with an $18.5 million Series B funding round. That is a big round for Dato, which had raised $6.75 million before it.
Its customers include Adobe, PayPal, and Cisco. Online music radio service Pandora uses Dato’s software to power its recommendation engine. Real estate database Zillow uses it to generate price estimates.
Roots in Academia
Dato’s goal is to democratize machine learning, said Carlos Guestrin, its founder and CEO. About seven years ago, when he was an associate professor at Carnegie Mellon University, Guestrin and a group of students started an open source project called GraphLab. The goal was to develop large-scale machine learning algorithms to analyze graphs. They tried doing it on Hadoop, but it proved too slow, so they built a new system that allowed them to write those algorithms faster and more easily.
GraphLab the company was formed only in 2013, after Guestrin moved from Pittsburgh, Pennsylvania, to Seattle to join the faculty of the University of Washington as Amazon Professor of Machine Learning. The word “Amazon” is in his title because Jeff Bezos has donated money to the university to attract academic talent like Guestrin, and his appointment was a result of the grant.
Machine Learning at Scale
The company launched its flagship product, called GraphLab Create, in October 2014. As Guestrin describes it, the product is meant to enable software engineers and data scientists to build creative, intelligent applications that can transform their businesses.
While it has some elements of the original open source project, GraphLab Create is its own animal, built from scratch, he said. It is also open source. Unlike the original academic project at Carnegie Mellon, the company’s technology is not limited to graphs. It can use a variety of data types, including text and images, which is why the name was changed.
“Dato” is a Spanish and Portuguese word for “datum,” a single piece of information, the singular form of “data.” Guestrin is Brazilian, and Spanish and Portuguese are both his first languages. He likes the word, because it’s short and simple but has a meaning – a rarity in the startup world. “I find it very beautiful,” he said. “GraphLab didn’t capture where we were today.”
The company gave a lot of thought to infrastructure and scale when engineering GraphLab Create. The product supports every stage of the application lifecycle, from development to production. A developer can prototype, build, and debug an application that uses its machine learning capabilities on a desktop, then deploy it on a single Linux server or, if it needs to run at scale, on a Hadoop YARN cluster in a public cloud.
Money Pouring Into Machine Learning
Guestrin says Dato’s only competition is customers trying to build something similar themselves. And that may be the case for the specific type of technology the company has built, but the machine learning market in general is buzzing with activity.
Venture capital has been pouring into the space. Vulcan Capital, which led Dato’s recent round, and Madrona Venture Group, which participated in it, also took part in a $21 million Series B for a Seattle-based machine learning startup called Context Relevant last May. Another example is Clari, a CRM startup that raised $20 million in June to invest in machine learning capabilities for its product.
In December, a company called Scaled Inference announced a $13.6 million Series A round led by Khosla Ventures. Vinod Khosla, the famous Silicon Valley investor, likes to say in public appearances how much better the world would be if machines took over some functions currently performed by humans.
Established high-tech giants have also been investing a lot of cash in machine learning. After years of using machine learning technology to fuel its online services, Google last year launched a neural network that recommends ways to optimize its global data center fleet for efficiency. Facebook has had a dedicated artificial intelligence lab since 2013.
Microsoft is working on real-time language translation during Skype conversations (the company launched a preview version in December); CERN and Yandex used a machine learning system to recognize certain particle collisions in the Large Hadron Collider in a simulation of the moments that followed the Big Bang; and a system created by IBM Research was recently able to distinguish images of malignant skin cancer with 95 percent accuracy by scanning 3,000 images.
Going After the Generalist AI Niche
There are myriad applications for machine learning. Some companies are building machine learning capabilities for specific purposes, while others, like Dato, want to enable developers to make machine learning part of their applications, whatever each individual application’s function may be. The goal is along the same lines as that of IBM Watson, which is perhaps the most widely publicized machine learning technology.
Ultimately, Dato wants to enable developers to be creative with their data on any type of machine or cloud, Guestrin said. “You can be super creative, explore, and build a cool intelligent application on a laptop and deploy it as a service.”