Meet the Startup Building Artificial Intelligence for Everyone
A startup that’s new on the scene is taking on Google, Facebook, and other tech giants by developing new artificial intelligence technology for image and natural language processing, and building it into tools that are accessible to companies large and small, as well as to anyone who wants to experiment with them via the web. That startup is MetaMind, founded by Stanford PhD Richard Socher, who wants to prove that his technology can process images and text better than any other deep learning technology out there.
Deep learning, a subset of artificial intelligence, refers to a specific way of building and training neural networks to recognize images and understand natural language, using software that operates somewhat like the networks of neurons in the human brain. As MetaMind’s recently-launched website explains, deep learning is a “rapidly growing branch of artificial intelligence” that “comprises a set of techniques that don’t require domain experts to program knowledge into algorithms. Instead, these techniques can learn by observing data.”
As VentureBeat reported recently, Socher never intended to build a career at the “bleeding edge of artificial intelligence.” Instead, he wanted to combine two subjects that he already enjoyed, language and math. However, Socher ended up developing a technology called recursive neural networks, and four months ago launched a startup, MetaMind, with the backing of Khosla Ventures and Marc Benioff.
Socher worked with natural language processing in college, but thought that there wasn’t enough math involved. In graduate school, he did research in computer vision, which had more math but was too specific. For his PhD, he focused on machine learning at Stanford, and saw Andrew Ng give a talk about deep learning and its applications in computer vision.
He tells VentureBeat, “I felt like those were great ideas, but they didn’t quite fit for natural-language processing yet. I invented a bunch of new models for deep learning applied to natural-language processing.” Socher’s recursive neural network examines connections between two consecutive words, then examines connections between that pair of words and the word to its left, continuing until it has examined all of the linguistic components at play.
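The composition order described above can be sketched in a few lines of Python. This is a toy illustration only, not Socher’s actual model: the embedding size, the random word vectors, the untrained weights, and the function names (`compose`, `encode_sentence`) are all assumptions made for the example. It merges the last two words into a phrase vector, then repeatedly merges that phrase with the word to its left.

```python
import math
import random

DIM = 4  # hypothetical embedding size, chosen only for illustration


def random_vector(rng, n):
    """Return n small random values (stand-ins for learned parameters)."""
    return [rng.uniform(-0.5, 0.5) for _ in range(n)]


def compose(left, right, weights, bias):
    """Merge two child vectors into one parent: tanh(W [left; right] + b)."""
    children = left + right  # list concatenation, length 2 * DIM
    return [
        math.tanh(sum(w * c for w, c in zip(row, children)) + b)
        for row, b in zip(weights, bias)
    ]


def encode_sentence(words, embeddings, weights, bias):
    """Fold a sentence right to left into a single vector."""
    if not words:
        raise ValueError("sentence must contain at least one word")
    vectors = [embeddings[w] for w in words]
    phrase = vectors[-1]
    # Each step joins the phrase built so far with the word to its left.
    for word_vector in reversed(vectors[:-1]):
        phrase = compose(word_vector, phrase, weights, bias)
    return phrase


rng = random.Random(0)
sentence = ["two", "men", "are", "resting"]
embeddings = {w: random_vector(rng, DIM) for w in sentence}
weights = [random_vector(rng, 2 * DIM) for _ in range(DIM)]  # untrained
bias = [0.0] * DIM

sentence_vector = encode_sentence(sentence, embeddings, weights, bias)
print(sentence_vector)
```

In a trained network the weights and word vectors would be learned from data, and the merge order would typically follow a parse tree rather than a fixed right-to-left sweep.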
Instead of working in academia, or taking one of the many jobs offered by major companies over the years, Socher wanted to bring machine learning technologies to a wide variety of people and businesses. He gathered funding to do so, building a team of 10 employees and already attracting paying customers, ranging from small companies to Fortune 500 companies.
Wired’s Cade Metz reports that in recent years, tech giants like Google, Facebook, Microsoft, and Baidu have all focused on deep learning as a promising path to the future of automated computer systems. They’ve hired researcher after researcher from the small pool of academics who specialize in the technology. While Socher, as one of these researchers, received “some very, very attractive offers,” he wanted to begin his own company, one that would build deep learning technologies that anyone can use. Socher tells Wired:
They’re doing some amazing work—Google and Microsoft and Facebook and so on—and their work is impacting a lot of people. But I felt like there’s a lot more potential if you give those tools to the remaining Fortune 500 companies—or to people on the internet, just to let them play with them on their own.
MetaMind’s website already reflects that aspiration. The startup provides “labs with demos” of some of the natural language processing, computer vision, and database predictions technologies that it plans to offer to enterprises, as well as to anyone on the internet. MetaMind will deliver artificial intelligence tools in three categories: language, vision, and database.
The platform can solve natural language processing tasks, classify images, and correct entries in a database. The language “smart module” includes capabilities for easy text classification, sentiment analysis, semantic similarity, summarization, question answering, and more. The vision smart module is capable of general object classification, food classification, localization, segmentation, and others, and users can train their own deep classifier. The database module is capable of autocompletion of missing entries and prediction of columns in a database.
When experimenting with MetaMind’s tools for language processing and image classification, visitors to the website can train MetaMind’s machine learning algorithms for their own tasks, and share the resulting classifier with others. Existing classifiers that are featured on the site include a Twitter sentiment classifier, a media bias classifier, a Kickstarter classifier, and an online peer grading classifier. The site explains how users can train their own text classifier and teach it to complete a task:
If there is a classification problem for which we do not have a classifier, you can upload labeled training data and, with just a few clicks, you can train a classifier to predict on new datasets. We will give you an estimate of how well it works on new data.
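One common way to produce that kind of estimate is to hold back part of the labeled data, train on the rest, and measure accuracy on the held-back portion. The sketch below shows the idea with a small word-count Naive Bayes classifier. It is an assumption-laden illustration, not MetaMind’s method: the classifier, the function names, and the toy training data are all hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


def train_naive_bayes(examples):
    """Count words per label from a list of (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for text, label in examples:
        tokens = text.lower().split()
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return word_counts, label_counts, vocab


def predict(model, text):
    """Return the label with the highest smoothed log-probability."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for token in text.lower().split():
            # Add-one smoothing keeps unseen words from zeroing the score.
            score += math.log((word_counts[label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def holdout_accuracy(examples, test_fraction=0.25, seed=0):
    """Train on part of the data and report accuracy on the unseen rest."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    test, train = shuffled[:n_test], shuffled[n_test:]
    model = train_naive_bayes(train)
    correct = sum(predict(model, text) == label for text, label in test)
    return correct / n_test


# Toy labeled data, invented for this example.
labeled = [
    ("great product love it", "positive"),
    ("love this wonderful service", "positive"),
    ("great wonderful experience", "positive"),
    ("really love the great support", "positive"),
    ("terrible product hate it", "negative"),
    ("hate this awful service", "negative"),
    ("terrible awful experience", "negative"),
    ("really hate the terrible support", "negative"),
]

estimate = holdout_accuracy(labeled)
full_model = train_naive_bayes(labeled)
print("estimated accuracy on new data:", estimate)
print("prediction:", predict(full_model, "love great"))
```

With only a handful of examples the estimate is very noisy; larger datasets and repeated splits (cross-validation) give a steadier figure.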
MetaMind’s deep learning algorithms can analyze daily Twitter trends, and they can also score the semantic similarity between two sentences, such as “Two men are taking a break from a trip on a snowy road” and “Two men are taking a break from a trip on a road covered by snow.” (The relatedness score of those two sentences comes out to 4.05 out of 5. A score of 1 would mean that they aren’t related at all, while 5 would mean that one is an almost-perfect paraphrase of the other.) This kind of technology could help a company answer questions from customers, who can ask what amounts to the same question in a huge variety of different ways.
And MetaMind says that its deep learning algorithms for sentiment analysis can recognize positive, neutral, or negative sentiment more accurately than any other API currently available, classifying “overall sentiment as well as entity-level sentiment.” The sentiment module can be used in finance, marketing, and social media analysis.
MetaMind’s website also enables users to try out its computer vision tools. Just as users can train and share their own text classification models, they can create image classifiers for any set of labels, teaching the algorithm by showing it examples of each label. The image prediction infrastructure that users can experiment with can distinguish between 22,000 different general object categories or identify different types of food, with accuracy as good as Google’s system. Users can create general image classifiers, food classifiers, or other deep classifiers with the IcMe (Image Classification Made Easy) project, which enables users to automatically label images with classifiers based on convolutional neural networks.
MetaMind’s explanation of the possibilities with its vision models emphasizes how the technology is accessible to anyone on the internet. “MetaMind’s goal is to make deep learning as easy as drag, drop and learn. Hence, we let you create your own image classifier for any set of labels you find interesting directly. All you need to do is teach the algorithm by giving it examples of each label.”
Wired points out that while getting a deep learning tool to recognize photos of chocolate chip cookies, or asking it to identify photos of bald men on horses, is a fun party trick, these kinds of abilities are effective tools for just about any kind of online business. Google and Facebook use deep learning systems to better understand search queries, or more accurately identify images.
But Socher tells Wired that MetaMind is thinking beyond those applications, and is already working with a wide range of businesses, from those who want to identify food photos to those who want to automatically analyze body scans and X-rays. MetaMind’s technology for medical image understanding, for example, is able to “make the expertise of the world’s best radiologists available across the globe via automated medical image diagnostics,” according to the startup’s website. Re/Code reports that MetaMind wants to help radiologists identify cancer, insurers assess houses, and nutritionists label food.
While MetaMind is just one of an array of startups looking to bring this kind of advanced artificial intelligence to the world outside of Google, Facebook, Yahoo, and Twitter (companies that have all acquired such startups), it’s unique in its goal to build a broad set of tools. As VentureBeat reports, Socher thinks that MetaMind has an advantage over those companies in that it draws on New York University professor Yann LeCun’s convolutional neural networks for mining images, as well as Socher’s recursive neural networks, which have achieved breakthroughs in text processing.
Another advantage that may define MetaMind’s contribution to the area is its wide range of capabilities in language processing. A notable startup that remains independent from the powerful tech giants is Clarifai, which focuses on image search. Socher’s focus at Stanford, by contrast, was natural language processing, a field in which researchers look to build systems that can understand more than just words, and are able to process sentences or even entire paragraphs.
LeCun, a deep learning “founding father” who now helms Facebook’s artificial intelligence lab, has identified systems that can truly understand language as the “next frontier.” Though tools like Google Now or Siri can understand the words that a user says, they can’t truly understand the meaning of those words. One of the academic community’s hopes for deep learning is that the area will lead to computers that can understand language. LeCun says that it’s because deep learning systems can train themselves on tasks as they work that many researchers think these machines can help with natural language processing.
Wired reports that MetaMind will both act as a deep learning consultant and offer its deep learning services and software to businesses. Its online service, running on hundreds of machines with thousands of graphics processors, will enable businesses to run deep learning processes without setting up their own dedicated hardware to do so. (And if a company does want to set up its own hardware dedicated to deep learning processes, MetaMind will provide the necessary software and expertise.)
The fact that the tools MetaMind has made available on its site are simple for just about anyone to learn to use demonstrates that its technology really can be as simple as dragging and dropping, even when the tasks at hand call for multiple types of deep learning at one time. VentureBeat notes that while Google and Microsoft have each recently announced advances in processing images and text together, Socher was working on the task last year, and has a paper on the topic slated for publication in February.
“We call it ‘drag, drop and learn,'” MetaMind chief executive Sven Strohband, who joined the company from Khosla Ventures, told Re/Code. “All you need is a Web browser, and you can use deep learning technology.” But deep learning isn’t the solution to every problem, and even Strohband acknowledges that. “We are trying to not be very abstract. There are specific tasks, and we are trying to be the best in the world.”
As Technology Review reported in an interview with Demis Hassabis — founder of the startup DeepMind, which Google acquired earlier this year — the goal with artificial intelligence is to create computers that can solve any problem. Hassabis explained, “AI has huge potential to be amazing for humanity. It will really accelerate progress in solving disease and all these things we’re making relatively slow progress on at the moment.”
Google acquired DeepMind shortly after it demonstrated that its software is capable of teaching itself to play classic video games. Technology Review notes that while researchers are currently looking for ways that DeepMind’s technology could improve existing Google products, if the technology progresses as Hassabis and many others hope, it will change the role that computers play in many fields.
For technology in every subfield of artificial intelligence, including the areas that startups like MetaMind aim to take on, the number of potential applications is practically limitless. Though MetaMind’s demos point to some use cases for its deep learning technology, the wide range of tools made available on the website suggests that the startup is open even to applications that it hasn’t yet imagined. Such applications could make better, more sophisticated services and tools available to businesses and consumers, and accelerate the pace at which advances in artificial intelligence and deep learning take hold in internet companies big and small.
As Strohband told VentureBeat of MetaMind’s technology, “We believe that this should be more available to lots of people, because we think that there’s lots of uses there. People use them for things we couldn’t have anticipated, really, quite frankly.” It’s that wide range of ideas and applications that is likely to lead to the next compelling use case for each breakthrough made by artificial intelligence researchers, and to prove to the world that not every breakthrough has to originate with Google or Facebook.