A Medley of Potpourri: AutoAI

Monday, January 15, 2024

AutoAI

From Wikipedia, the free encyclopedia

https://en.wikipedia.org/wiki/AutoAI

Automated Artificial Intelligence (AutoAI) is a variation of automated machine learning (AutoML) technology that extends the automation of model building to the full life cycle of a machine learning model. It applies intelligent automation to the task of building predictive machine learning models by preparing data for training, identifying the best type of model for the given data, and then choosing the features, or columns of data, that best support the problem the model is solving. Finally, automation evaluates a variety of tuning options to reach the best result as it generates, then ranks, model-candidate pipelines. The best-performing pipelines can be put into production to process new data and deliver predictions based on the model training. Automated artificial intelligence can also be applied to checking that the model does not have inherent bias and to automating the tasks for continuous improvement of the model. Managing an AutoAI model requires frequent monitoring and updating, handled by a process known as model operations, or ModelOps.

Automated Machine Learning and Data Science (AMLDS) is a small team within IBM Research that was formed to apply techniques from artificial intelligence (AI), machine learning (ML), and data management to accelerate and optimize the creation of machine learning and data science workflows. AMLDS is credited with driving the development of AutoAI.

Use case

A typical use case for AutoAI would be training a model to predict how customers might respond to a sales incentive. The model is first trained with actual data on how customers responded to the promotion. When the trained model is presented with new data, it can provide a prediction of how a new customer might respond, with a confidence score for the prediction. Prior to AutoML, data scientists had to build these predictive models by hand, testing various combinations of algorithms and then checking how predictions compared to actual results. AutoML automated the processes of preparing the data for training, applying algorithms to process the data, and then further optimizing the results. AutoAI extends this intelligent automation, allowing significantly more combinations of factors to be tested in order to generate model-candidate pipelines that reflect and address the problem more accurately. Once built, the model is evaluated for bias and updated to improve performance.

The AutoAI process

The user initiates the process by providing a set of training data and identifying the prediction column, which sets up the problem to solve. For example, the prediction column might contain values of yes or no in response to an offered incentive. In the data pre-processing stage, AutoAI applies various algorithms, or estimators, to analyze, clean (for example, remove redundant information or impute missing data), and prepare structured raw data for machine learning (ML).
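As a rough illustration of the cleaning step described above, the following minimal sketch imputes missing values and drops redundant columns using only the Python standard library. The function name preprocess, the column names, and the specific rules (mean or most-common-value imputation, dropping constant and duplicated columns) are assumptions chosen for the example, not AutoAI's actual estimators.

```python
from statistics import mean, mode


def preprocess(rows, target):
    """Return cleaned copies of `rows` (a list of dicts); None marks a missing value."""
    cleaned = [dict(r) for r in rows]
    features = [c for c in cleaned[0] if c != target]

    # Impute missing values: column mean for numbers, most common value otherwise.
    for col in features:
        present = [r[col] for r in cleaned if r[col] is not None]
        numeric = all(isinstance(v, (int, float)) for v in present)
        fill = mean(present) if numeric else mode(present)
        for r in cleaned:
            if r[col] is None:
                r[col] = fill

    # Drop redundant columns: constant columns and exact copies of an earlier column.
    kept, seen = [], set()
    for col in features:
        values = tuple(r[col] for r in cleaned)
        if len(set(values)) > 1 and values not in seen:
            kept.append(col)
            seen.add(values)

    return [{c: r[c] for c in kept + [target]} for r in cleaned]


# Hypothetical training data; "responded" is the prediction column.
rows = [
    {"age": 30, "age_copy": 30, "region": "east", "country": "US", "responded": "yes"},
    {"age": None, "age_copy": None, "region": "west", "country": "US", "responded": "no"},
    {"age": 50, "age_copy": 50, "region": None, "country": "US", "responded": "yes"},
    {"age": 40, "age_copy": 40, "region": "east", "country": "US", "responded": "no"},
]
cleaned = preprocess(rows, target="responded")
print(cleaned[1])  # the missing age is filled with the column mean
```

In this toy data set, age_copy duplicates age and country never varies, so both are dropped, while the prediction column is always kept.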

The next stage is automated model selection, which matches the data with a model type, such as classification or regression. For example, if there are only two distinct values in the prediction column, AutoAI prepares to build a binary classification model. If the set of possible answers is open-ended, as with continuous numeric values, AutoAI prepares a regression model, which employs a distinct set of algorithms, or problem-solving transformations. AutoAI tests candidate algorithms against small subsets of the data and ranks them, gradually increasing the size of the subset for the algorithms that prove most promising, to arrive at the best match. This process of iterative and incremental machine learning is what sets AutoAI apart from earlier versions of AutoML.
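The two ideas in this stage, picking a problem type from the prediction column and testing algorithms on growing subsets, can be sketched as follows. This is a simplified elimination scheme in the spirit of successive halving, written under stated assumptions: the type-detection rule, the three toy "algorithms", the synthetic data, and all function names are hypothetical and are not AutoAI's actual selection procedure.

```python
def infer_problem_type(target_values):
    """Simplistic heuristic (an assumption, not AutoAI's actual rule)."""
    distinct = set(target_values)
    if len(distinct) == 2:
        return "binary classification"
    if all(isinstance(v, (int, float)) for v in distinct):
        return "regression"
    return "multiclass classification"


# Three toy "algorithms". Each takes training pairs (x, label) and returns a predictor.
def majority(train):
    labels = [y for _, y in train]
    top = max(set(labels), key=labels.count)
    return lambda x: top


def nearest_neighbor(train):
    return lambda x: min(train, key=lambda pair: abs(pair[0] - x))[1]


def threshold_at_mean(train):
    cut = sum(x for x, _ in train) / len(train)
    return lambda x: "yes" if x > cut else "no"


def holdout_accuracy(predict, holdout):
    return sum(predict(x) == y for x, y in holdout) / len(holdout)


def select_incrementally(candidates, train, holdout, start=20):
    """Score all candidates on a small sample, keep the better half, double the sample, repeat."""
    survivors = dict(candidates)
    size = start
    while len(survivors) > 1:
        sample = train[: min(size, len(train))]
        scores = {name: holdout_accuracy(fit(sample), holdout)
                  for name, fit in survivors.items()}
        ranked = sorted(scores, key=scores.get, reverse=True)
        keep = (len(ranked) + 1) // 2
        survivors = {name: survivors[name] for name in ranked[:keep]}
        size *= 2
    return next(iter(survivors))


# Synthetic, deterministic data: customers with x > 50 respond "yes".
points = [(i * 37) % 101 for i in range(300)]
data = [(x, "yes" if x > 50 else "no") for x in points]
train, holdout = data[:200], data[200:]

candidates = {
    "majority": majority,
    "nearest_neighbor": nearest_neighbor,
    "threshold_at_mean": threshold_at_mean,
}
print(infer_problem_type([y for _, y in data]))
best = select_incrementally(candidates, train, holdout)
print(best)
```

The point of the design is that weak candidates, such as the majority-class baseline here, are discarded after seeing only a small sample, so the larger and more expensive training runs are spent on the promising ones.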

Feature engineering transforms the raw data into the combination of features that best represents the problem, in order to arrive at the most accurate prediction. Part of this process is evaluating how data in the training data source can best support an accurate prediction. Using algorithms, it weights some data as more important than others to achieve the desired result. AutoAI automates the consideration of various feature construction options in a non-exhaustive, structured manner, while progressively maximizing the accuracy of the model using reinforcement learning. The result is an optimized sequence of data transformations that matches the best algorithms from the model selection step.
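To make the idea of choosing among feature transformations concrete, here is a deliberately small sketch. It swaps in a much simpler technique than the one described above: instead of a reinforcement-learning search, it exhaustively scores four hand-picked transformations of a single column by absolute Pearson correlation with the target. The transform set, the scoring rule, and the data are all assumptions made for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical candidate transformations for one positive-valued column.
TRANSFORMS = {
    "identity": lambda v: v,
    "log": math.log,
    "sqrt": math.sqrt,
    "square": lambda v: v * v,
}


def best_transform(values, target):
    """Score each transform by |correlation| with the target and return the winner."""
    scores = {
        name: abs(pearson([f(v) for v in values], target))
        for name, f in TRANSFORMS.items()
    }
    return max(scores, key=scores.get), scores


# Synthetic column whose relationship to the target is logarithmic.
values = list(range(1, 21))
target = [3 * math.log(v) + 1 for v in values]

best_name, scores = best_transform(values, target)
print(best_name)
```

A real system would consider sequences of transformations across many columns, which is why an exhaustive search like this one quickly becomes impractical and a guided, non-exhaustive search is used instead.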

Finally, AutoAI applies the hyperparameter optimization step to refine and advance the best-performing model pipelines. Pipelines are model candidates, evaluated and ranked by metrics such as accuracy and precision. At the end of the process, the user can review the pipelines and choose the pipeline(s) to put into production to deliver predictions on new data.
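The ranking of pipelines by metrics such as accuracy and precision can be sketched as a small leaderboard. The pipeline names P1 to P3, their predictions, and the helper names below are hypothetical; the sketch only shows how the two metrics are computed and used for ordering, not how AutoAI builds its own leaderboard.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision(y_true, y_pred, positive="yes"):
    """Of the rows predicted positive, the fraction that really are positive."""
    truths_of_predicted_positive = [t for t, p in zip(y_true, y_pred) if p == positive]
    if not truths_of_predicted_positive:
        return 0.0  # no positive predictions at all
    return truths_of_predicted_positive.count(positive) / len(truths_of_predicted_positive)


def rank_pipelines(y_true, predictions, metric="accuracy"):
    """Build one leaderboard row per pipeline and sort by the chosen metric, best first."""
    board = [
        {
            "pipeline": name,
            "accuracy": accuracy(y_true, y_pred),
            "precision": precision(y_true, y_pred),
        }
        for name, y_pred in predictions.items()
    ]
    return sorted(board, key=lambda row: row[metric], reverse=True)


# Hypothetical holdout labels and the predictions of three candidate pipelines.
y_true = ["yes", "no", "yes", "yes", "no", "no", "yes", "no"]
predictions = {
    "P1": ["yes", "no", "yes", "no", "no", "no", "yes", "yes"],
    "P2": ["yes", "no", "yes", "yes", "no", "yes", "yes", "no"],
    "P3": ["no"] * 8,
}

leaderboard = rank_pipelines(y_true, predictions)
for row in leaderboard:
    print(row)
# Order by accuracy: P2 (0.875), P1 (0.75), P3 (0.5)
```

Passing metric="precision" re-sorts the same rows by precision, which mirrors the user reviewing the pipelines against whichever metric matters most for the problem.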

History

In August 2017, AMLDS announced that they were researching the use of automated feature engineering to eliminate guesswork in data science. AMLDS members Udayan Khurana, Horst Samulowitz, Gregory Bramble, Deepak Turaga, and Peter Kirchner, along with Fatemeh Nargesian of the University of Toronto and Elias Khalil of Georgia Tech, presented their preliminary research at IJCAI that same year.

Called “Learning-based Feature Engineering,” their method learned the correlations between feature distributions, target distributions, and transformations, built meta-models that used past observations to predict viable transformations, and generalized across thousands of data sets spanning different domains. To handle feature vectors of varied sizes, it used a Quantile Sketch Array to capture the essential character of a feature in a fixed-size form.
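The reason a fixed-size summary helps with feature vectors of varied sizes can be shown with a simplified sketch. The function below reduces a numeric column of any length to a normalized histogram with a fixed number of bins. It illustrates the general idea only: the name sketch_feature, the default of 10 bins, and the equal-width binning are assumptions, and this is not the exact definition of the Quantile Sketch Array used in the research.

```python
def sketch_feature(values, bins=10):
    """Summarize a numeric column of any length as `bins` normalized bucket counts."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid a zero width for constant columns
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)  # clamp the maximum into the last bin
        counts[idx] += 1
    return [c / len(values) for c in counts]


short_column = [3.0, 7.5, 1.2]
long_column = list(range(1000))

# Both summaries have the same length, so one meta-model can consume either.
print(len(sketch_feature(short_column)), len(sketch_feature(long_column)))
```

Because every column maps to a vector of the same length, a model trained on summaries from past data sets can be applied to a new column regardless of how many rows it has.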

In 2018, IBM Research announced Deep Learning as a Service, which opened popular deep learning libraries such as Caffe, Torch, and TensorFlow to developers in the cloud. Jean-Francois Puget, PhD, a distinguished engineer specializing in machine learning (ML) and optimization at IBM, entered the competition. He found out and decided to be ready for IBM AI and data science platforms like IBM Watson. In December 2018, IBM Research announced NeuNetS, a new capability that automated neural network model synthesis as part of automated AI model development and deployment.

In 2020, Liu et al. proposed a method for AutoML that used the alternating direction method of multipliers (ADMM) to configure multiple stages of an ML pipeline, such as transformations, feature engineering and selection, and predictive modeling. This was the first recorded time that IBM Research publicly applied the term “Auto” to machine learning.

AutoAI: The evolution of AutoML

2019 was the year that AutoML became more widely discussed as a concept. “The Forrester New Wave™: Automation-Focused Machine Learning Solutions, Q2 2019,” evaluated AutoML solutions and found that the more powerful versions offered feature engineering. A Gartner Technical Professional Advice report from August 2019 reported that, based on their research, AutoML could augment data science and machine learning. They described AutoML as the automation of data preparation, feature engineering and model engineering tasks.

AutoAI is the evolution of AutoML. One of AutoAI's principal inventors, Jean-Francois Puget, PhD, describes it as automatically performing data preparation, feature engineering, machine learning algorithm selection, and hyper-parameter optimization to find the best possible machine learning model. The hyper-parameter optimization algorithm used in AutoAI differs from the hyper-parameter tuning of AutoML. The algorithm is optimized for costly function evaluations, such as the model training and scoring that are typical in machine learning, and enables rapid convergence to a satisfactory solution even though each iteration takes a long time to evaluate.
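When each evaluation means training and scoring a model, the number of evaluations is the resource to protect. The sketch below shows that constraint in its simplest form: random search under a fixed evaluation budget, with a cache so that no configuration is ever evaluated twice. This is a stand-in technique, not the optimization algorithm AutoAI uses, and the search space, the objective function, and all names are hypothetical.

```python
import random


def budgeted_search(objective, space, budget, seed=0):
    """Random search that tries at most `budget` draws and caches every result."""
    rng = random.Random(seed)
    cache = {}
    for _ in range(budget):
        config = tuple(rng.choice(options) for options in space.values())
        if config not in cache:  # skip configurations that were already scored
            cache[config] = objective(dict(zip(space, config)))
    best = max(cache, key=cache.get)
    return dict(zip(space, best)), cache[best], len(cache)


# Hypothetical search space and a stand-in for "train the model and return its score".
space = {"depth": [2, 4, 8], "rate": [0.01, 0.1, 0.5]}


def expensive_score(params):
    return -((params["depth"] - 4) ** 2) - (params["rate"] - 0.1) ** 2


best_params, best_score, evaluations = budgeted_search(expensive_score, space, budget=30)
print(best_params, best_score, evaluations)
```

With nine possible configurations and a budget of 30 draws, the cache keeps the number of expensive evaluations at nine or fewer. Methods designed for costly evaluations go further by using past results to decide which configuration to try next.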

Research scientists at IBM Research published a paper, "Towards Automating the AI Operations Lifecycle", which describes the advantages and available technologies for automating more of the process, with the goal of limiting the human involvement required to build, test, and maintain a machine learning application. However, some HCI researchers argue that a machine learning application and its recommendations are inevitably acted on by human decision makers, so it is impossible to eliminate human involvement in the process. Rather, a more transparent and interpretable AutoAI design is key to gaining the trust of human users, but achieving such a design is itself quite a challenge.

Awards for AutoAI

Winner, Best Innovation in Intelligent Automation Award at the AIconics AI Summit (2019), San Francisco.

Winner, iF Design Guide award for Communication in a Software Application (2020)
