H2O puts machine learning on autopilot

H2O's Driverless AI promises to bring ML analysis to nontechnical users, and to take the drudgery out of model selection for experts

By Serdar Yegulalp, Senior Writer, InfoWorld

H2O.ai, which makes applications meant to put machine learning within reach of business users, has introduced a product that lets people familiar with tools like Tableau extract insights from data without expertise in deploying or tuning machine learning models.

Driverless AI, currently in beta, is billed by H2O.ai as an “expert system for AI” — a way to automate the kinds of expertise that data scientists bring to developing machine learning models. The target audience is non-expert users, who can take datasets and run GPU-accelerated ML algorithms against them to extract useful results, without understanding the ins and outs of data science.

In addition to business users who are eager to use ML in their organizations but lack the expertise, H2O is pitching Driverless AI to data scientists. The company sees it as a way for expert users to automate some of the more tedious parts of analyzing a dataset, such as selecting which of several automatically trained models is the best fit.

The end user sets up data experiments through a web-based UI, and typically needs only to choose which target variable in the dataset to solve for. The app handles the selection and deployment of the underlying components, such as AutoML, XGBoost, or TensorFlow, and determines which ones yield the most accurate results. Details that would normally require a data scientist's attention, such as hyperparameter tuning, can be handled automatically.
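To make the idea of automated model selection concrete, here is a minimal sketch in plain Python. It is not Driverless AI's code or API. The toy dataset, the `churned` target column, the two candidate models (a majority-class baseline and a k-nearest-neighbors classifier), and the `k_grid` hyperparameter values are all assumptions chosen for illustration. The user supplies only the target column; the loop trains each candidate, scores it on held-out rows, and reports the winner.

```python
import random
from collections import Counter


def make_dataset(n=200, seed=0):
    """Build toy rows with two numeric features and a binary target (all hypothetical)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        usage = rng.uniform(0, 10)
        tickets = rng.uniform(0, 10)
        churned = 1 if tickets - usage + rng.gauss(0, 1) > 0 else 0
        rows.append({"usage": usage, "tickets": tickets, "churned": churned})
    return rows


def majority_model(train, target):
    """Baseline candidate: always predict the most common target value."""
    most_common = Counter(r[target] for r in train).most_common(1)[0][0]
    return lambda row: most_common


def knn_model(train, target, k):
    """Candidate with one hyperparameter, k: vote among the k closest training rows."""
    features = [c for c in train[0] if c != target]

    def predict(row):
        nearest = sorted(
            train, key=lambda t: sum((t[f] - row[f]) ** 2 for f in features)
        )[:k]
        return Counter(t[target] for t in nearest).most_common(1)[0][0]

    return predict


def accuracy(model, rows, target):
    """Fraction of rows whose target the model predicts correctly."""
    return sum(model(r) == r[target] for r in rows) / len(rows)


def auto_select(rows, target, k_grid=(1, 5, 15)):
    """Train every candidate, score it on a validation split, return the best."""
    split = int(len(rows) * 0.75)
    train, valid = rows[:split], rows[split:]
    candidates = {"majority": majority_model(train, target)}
    for k in k_grid:  # a very small hyperparameter search
        candidates[f"knn(k={k})"] = knn_model(train, target, k)
    scores = {name: accuracy(m, valid, target) for name, m in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores


rows = make_dataset()
best_name, leaderboard = auto_select(rows, target="churned")
print(best_name, leaderboard)
```

A production system would search far larger model families and use cross-validation rather than a single split, but the shape of the loop (enumerate candidates, score, rank) is the part being automated.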

The results of the analysis can be browsed via interactive charts presented through the web UI. Another highly touted feature is plain-English explanations of the significance of given variables in an analysis, although some of the explanations are clunky. Variables that are binary or limited to a few fixed categories (e.g., male vs. female, or yes/no/don’t-know) are explained using language better suited to data with a freer range of values (“If PERSONS_SEX is greater than 1...”).
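One way an explanation generator could avoid that clunkiness is to check how many distinct values a variable takes before choosing its wording. The sketch below is a hypothetical illustration, not how Driverless AI works: the `explain_rule` function, the `max_categories` cutoff, and the coding of `PERSONS_SEX` as 1 = male and 2 = female are all assumptions.

```python
def explain_rule(name, threshold, observed_values, labels=None, max_categories=5):
    """Render a 'value > threshold' rule as a sentence.

    labels is an optional (hypothetical) mapping from coded value to readable name.
    Variables with few distinct values are described by listing the values that
    satisfy the rule; everything else keeps the numeric comparison.
    """
    distinct = sorted(set(observed_values))
    if len(distinct) <= max_categories:
        labels = labels or {}
        names = [str(labels.get(v, v)) for v in distinct if v > threshold]
        if not names:
            return f"No observed value of {name} satisfies this rule"
        return f"If {name} is " + " or ".join(names)
    return f"If {name} is greater than {threshold}"


# Assumed coding for illustration only: 1 = male, 2 = female.
categorical = explain_rule("PERSONS_SEX", 1, [1, 2, 2, 1], {1: "male", 2: "female"})
continuous = explain_rule("INCOME", 50000, range(0, 100000, 1000))
print(categorical)  # If PERSONS_SEX is female
print(continuous)   # If INCOME is greater than 50000
```

The design choice here is to decide the phrasing from the observed data rather than from the column's declared type, since coded categories are often stored as integers.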

Many common data analysis use cases are pre-programmed into Driverless AI, and H2O has said it will add more over time. End users cannot yet add their own scenarios.

Pieces of the H2O stack, like the core H2O processing engine and the above-mentioned ML libraries, are available as open source. Driverless AI itself, while built atop those components, is a proprietary product with per-GPU pricing. The company offers Docker containers that can be run on GPU-equipped appliances, and it will also make cloud editions available.

Serdar Yegulalp is a senior writer at InfoWorld, focused on machine learning, containerization, devops, the Python ecosystem, and periodic reviews.