ML Model Training
- Overview
The process of training an ML model involves providing an ML algorithm (that is, the learning algorithm) with training data to learn from. The term ML model refers to the model artifact that is created by the training process.
The training data must contain the correct answer, which is known as a target or target attribute. The learning algorithm finds patterns in the training data that map the input data attributes to the target (the answer that you want to predict), and it outputs an ML model that captures these patterns. You can use the ML model to get predictions on new data for which you do not know the target.
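As a minimal sketch of this idea, the Python below trains a one-attribute linear model on a few labeled examples and then predicts the target for an input it has not seen. The data values and the function names `train` and `predict` are assumptions made for illustration, and ordinary least squares is only one of many possible learning algorithms.

```python
# Toy training data: each example pairs an input attribute with its known target.
# The numbers are made up for illustration.
training_data = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]


def train(examples):
    """Learning algorithm: ordinary least squares for one input attribute.

    Returns the model artifact, here just a (slope, intercept) pair.
    """
    n = len(examples)
    mean_x = sum(x for x, _ in examples) / n
    mean_y = sum(y for _, y in examples) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in examples)
    variance = sum((x - mean_x) ** 2 for x, _ in examples)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the trained model to an input whose target is unknown."""
    slope, intercept = model
    return slope * x + intercept


model = train(training_data)
print(predict(model, 10.0))
```

Here the tuple returned by `train` plays the role of the model artifact: it captures the pattern found in the training data and can be reused for new inputs.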
Before training your model, you can:
- Identify the problem and candidate algorithms.
- Identify data required to train the algorithms.
- Collect initial data.
- Assess the quality of the data and its suitability for the task.
- Plan what is needed to make the dataset suitable for the project.
ML is a set of algorithms that learn from data and/or experiences, rather than being explicitly programmed. Each task requires a different set of algorithms, and these algorithms detect patterns to perform certain tasks.
Here are some concepts related to ML:
- Representation: How the model looks and how knowledge is represented
- Evaluation: How good models are differentiated and how programs are evaluated
- Optimization: The process for finding good models and how programs are generated
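One way to see how these three concepts fit together is a very small classifier. In the sketch below, the representation is a single threshold, the evaluation is accuracy on labeled examples, and the optimization is an exhaustive search over candidate thresholds. The dataset and all function names are hypothetical and chosen only for illustration.

```python
# Hypothetical labeled data: (hours_studied, passed_exam)
data = [(1.0, 0), (2.0, 0), (3.0, 0), (4.0, 1), (5.0, 1), (6.0, 1)]


def predict(threshold, x):
    """Representation: the model is a single threshold on the input."""
    return 1 if x >= threshold else 0


def accuracy(threshold, examples):
    """Evaluation: fraction of examples the model labels correctly."""
    correct = sum(1 for x, y in examples if predict(threshold, x) == y)
    return correct / len(examples)


def fit(examples):
    """Optimization: try each observed input as the threshold and keep the best."""
    candidates = sorted(x for x, _ in examples)
    return max(candidates, key=lambda t: accuracy(t, examples))


best_threshold = fit(data)
print(best_threshold, accuracy(best_threshold, data))
```

Real learning algorithms use richer representations and more efficient optimizers, but the division of roles is the same.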
- The Steps of Training ML Models
Model training is a stage in the data science development lifecycle. It's the process of running an ML algorithm on a dataset, and then optimizing the algorithm to find certain patterns or outputs.
Model training involves learning good values for all the weights and the bias from labeled examples. The resulting function with rules and data structures is called the trained ML model.
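To make "learning good values for the weights and the bias" concrete, here is a rough sketch that fits one weight and one bias by batch gradient descent on mean squared error. The examples, the learning rate, the number of epochs, and the name `train_linear` are all assumptions for illustration; practical training code would normally rely on an ML library.

```python
# Labeled examples (input, label); generated from y = 2x + 1 for illustration.
examples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]


def train_linear(examples, learning_rate=0.05, epochs=2000):
    """Learn a weight and a bias by batch gradient descent on mean squared error."""
    weight, bias = 0.0, 0.0
    n = len(examples)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in examples:
            error = (weight * x + bias) - y
            # Partial derivatives of the mean squared error.
            grad_w += 2 * error * x / n
            grad_b += 2 * error / n
        # Step against the gradient to reduce the error.
        weight -= learning_rate * grad_w
        bias -= learning_rate * grad_b
    return weight, bias


weight, bias = train_linear(examples)
print(round(weight, 3), round(bias, 3))
```

The learned values should end up close to the weight 2 and bias 1 that were used to generate the labels.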
The process of training ML models can be divided into four steps:
- Data set split for training and evaluation
- Algorithm selection
- Hyperparameter tuning
- Model training
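The four steps can be sketched end to end on a synthetic dataset. The example below assumes a k-nearest-neighbors classifier as the selected algorithm and treats `k` as the hyperparameter to tune; the data, the 80/20 split, and the candidate values of `k` are illustrative choices, not recommendations.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Synthetic 1-D dataset: label is 1 when the input exceeds 5, else 0.
dataset = [(x, 1 if x > 5 else 0) for x in (random.uniform(0, 10) for _ in range(100))]

# Step 1: split the data for training and evaluation.
random.shuffle(dataset)
train_set, validation_set = dataset[:80], dataset[80:]


# Step 2: algorithm selection, here k-nearest neighbors.
def knn_predict(train, k, x):
    """Majority vote among the k training points closest to x."""
    neighbors = sorted(train, key=lambda example: abs(example[0] - x))[:k]
    votes = sum(label for _, label in neighbors)
    return 1 if votes * 2 > k else 0


def accuracy(train, k, examples):
    """Fraction of examples classified correctly with the given k."""
    correct = sum(1 for x, y in examples if knn_predict(train, k, x) == y)
    return correct / len(examples)


# Step 3: hyperparameter tuning, choosing k on the validation set.
candidate_ks = [1, 3, 5, 7]
best_k = max(candidate_ks, key=lambda k: accuracy(train_set, k, validation_set))

# Step 4: model training. k-NN has no fitting loop, so the "trained model"
# is the stored training data together with the chosen k.
print(best_k, accuracy(train_set, best_k, validation_set))
```

With an algorithm that has an explicit fitting step, step 4 would rerun training with the tuned hyperparameters, and a separate held-out test set would be used for the final performance estimate.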
The model's performance during training largely determines how well it will work once it is put into an application for end users.
- Training ML Models
Here are some steps to train a machine learning (ML) model:
- Data collection: Gather and measure information on targeted variables in an established system.
- Data preparation: Collect, clean, and organize data before using it to train a model. The quality of the data used to train a model significantly impacts the accuracy of its predictions.
- Choose a model: Select the appropriate model architecture and algorithms that can best solve the problem at hand.
- Train the model: During training, the learning algorithm processes the training data to work out the relationships between the input values and the target.
- Analyze and visualize: Visualize the data to have a better understanding of relationships within the dataset.
- Model evaluation: Model evaluation is one of the most important steps in the ML pipeline. The performance of a model can be measured via dozens of metrics.
- Parameter tuning: Tune the parameters.
- Make predictions: Ask the model to make predictions.
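For the model evaluation step, three of the most common metrics for a binary classifier are accuracy, precision, and recall. The sketch below computes them from scratch; the labels and predictions are made up, and the helper names `confusion_counts` and `evaluate` are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def evaluate(y_true, y_pred):
    """Return accuracy, precision, and recall as a dictionary."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when there are no positive predictions or labels.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Made-up labels and predictions for eight examples.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = evaluate(y_true, y_pred)
print(metrics)
```

Which metric matters most depends on the task: precision is the usual focus when false positives are costly, and recall when missed positives are costly.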
- Foundation Models
Trained on massive datasets, foundation models are large deep learning neural networks that have changed the way data scientists approach ML.
A foundation model is a type of ML model that is pretrained to perform a range of tasks. Until recently, artificial intelligence (AI) systems were specialized tools, meaning that an ML model would be trained for a specific application or single use case.
Rather than develop AI from scratch, data scientists use a foundation model as a starting point to develop ML models that power new applications more quickly and cost-effectively.
Foundation models are ML models that can perform a variety of tasks, such as understanding language, generating text and images, and conversing in natural language. Researchers coined the term to describe ML models that are trained on a wide range of generalized and unlabeled data.
- Generative AI vs. AI/ML
Machine learning (ML) and generative AI (GenAI) are both data-driven learning methods, but they have different goals and strategies:
- ML: Focuses on analyzing data to find patterns and make accurate predictions. ML models can be trained to help businesses by processing data, finding patterns, and testing correlations. Deep learning models are a type of ML model loosely modeled on how humans process information.
- GenAI: Focuses on creating new data that resembles training data. Generative AI models are trained to recognize patterns in data and then use these patterns to generate new, similar data. For example, a model trained on English sentences can learn the statistical likelihood of one word following another, allowing it to generate coherent sentences.
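The word-following-word example can be illustrated with a toy bigram model: it counts which words follow which in a small corpus and then samples new sentences from those counts. This is a deliberately simplified sketch; the corpus, the `<s>` and `</s>` boundary markers, and the function names are assumptions for illustration, and modern generative models are far more sophisticated.

```python
import random
from collections import Counter, defaultdict

# Tiny made-up corpus; real models train on vastly more text.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat chased the dog",
]

# Count how often each word follows another (bigram counts).
follow_counts = defaultdict(Counter)
for sentence in corpus:
    words = ["<s>"] + sentence.split() + ["</s>"]
    for current_word, next_word in zip(words, words[1:]):
        follow_counts[current_word][next_word] += 1


def next_word_probability(current_word, next_word):
    """Estimated likelihood that next_word follows current_word."""
    counts = follow_counts[current_word]
    total = sum(counts.values())
    return counts[next_word] / total if total else 0.0


def generate(rng, max_words=12):
    """Sample a sentence one word at a time from the bigram counts."""
    word, output = "<s>", []
    for _ in range(max_words):
        counts = follow_counts[word]
        word = rng.choices(list(counts), weights=list(counts.values()))[0]
        if word == "</s>":
            break
        output.append(word)
    return " ".join(output)


print(next_word_probability("sat", "on"))
print(generate(random.Random(0)))
```

Because each word is sampled conditioned on the previous one, the output resembles the training sentences without necessarily copying any of them.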
Some common models used in GenAI include:
- Variational Autoencoders (VAEs)
- Generative Adversarial Networks (GANs)
- Autoregressive models
GenAI is powered by large machine learning models that are pre-trained on vast amounts of data. A subset of these models, called large language models (LLMs), is trained on trillions of words across many natural-language tasks.
[More to come ...]