Few-shot learning explained: What you should know
Training data quality and availability aren't always a given in machine learning projects. When data is limited, costly or nonexistent, few-shot learning can help.
Training data is the foundation of machine learning. But extensive training data isn't always readily available, nor are the computational resources needed to train an ML model. To address this issue, developers can use few-shot learning, an approach that relies on a significantly smaller data set.
Few-shot learning takes advantage of an ML model's existing training: having already evaluated a large set of training data, the model can produce accurate decisions and predictions on real-time, real-world production data. By identifying underlying features and structures, few-shot learning enables an existing model to generalize and accurately identify new data with only a few examples.
This technique has many applications, including computer vision, robotics and audio processing. It's useful when data or compute resources are limited, data is too costly, or data isn't properly labeled. But despite all its benefits, few-shot learning can also limit the diversity and complexity of an ML model.
What is few-shot learning?
Few-shot learning is an ML approach that enables a model to classify new data using a limited set of training examples.
Supervised learning typically uses thousands or even hundreds of thousands of labeled data points to train and refine an ML system's classification and decision-making abilities. However, such detailed training is impractical when large volumes of training data can't feasibly be obtained or simply don't exist.
Few-shot learning builds on a pretrained ML model that already performs well in data identification and classification tasks. With few-shot learning, the pretrained ML model receives additional training to add new classifications using just a few data samples. While few-shot learning isn't intended for training an ML model from scratch, it's a valuable strategy for quickly and easily extending an existing model's capabilities.
There are three primary approaches to few-shot learning, each based on the type of prior knowledge the ML model possesses:
- Prior knowledge on similarity. This approach relies on learned patterns in training data that enable the ML model to separate or classify previously unlearned data, as in the sketch after this list.
- Prior knowledge on learning. This method involves using prior knowledge to tune the algorithm -- a process known as hyperparameter tuning -- so that it can operate effectively with few examples.
- Prior knowledge on data. In this approach, the ML model has an understanding of the variability and structure of data, which aids in building models from limited examples. An example is using pen stroke data as a foundation for handwriting analysis.
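As an illustration of the similarity-based approach, the following minimal sketch classifies query examples by comparing their embeddings against per-class "prototypes" averaged from a handful of support examples, in the style of prototypical networks. The embed() function and its random weights are placeholders for a real pretrained encoder, and the data is synthetic.

```python
import numpy as np

# Similarity-based few-shot classification (prototypical-network style).
# embed() stands in for a pretrained encoder; here it is a random
# projection used purely as a placeholder.
rng = np.random.default_rng(0)
W = rng.normal(size=(64, 16))  # placeholder "encoder" weights

def embed(x: np.ndarray) -> np.ndarray:
    """Map raw features into an embedding space (placeholder encoder)."""
    return np.tanh(x @ W)

def prototypes(support_x, support_y, n_classes):
    """Average the support embeddings per class to form one prototype each."""
    emb = embed(support_x)
    return np.stack([emb[support_y == c].mean(axis=0) for c in range(n_classes)])

def classify(query_x, protos):
    """Assign each query to the class with the nearest prototype."""
    emb = embed(query_x)
    # Squared Euclidean distance from every query to every prototype
    dists = ((emb[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1)

# A 3-way, 5-shot episode with synthetic data: 3 classes, 5 examples each
support_x = rng.normal(size=(15, 64))
support_y = np.repeat(np.arange(3), 5)
protos = prototypes(support_x, support_y, n_classes=3)

query_x = rng.normal(size=(6, 64))
print(classify(query_x, protos))  # predicted class index per query
```

Because the heavy lifting happens in the embedding space, adding a new class requires only averaging a few new embeddings rather than retraining the encoder.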
As a prerequisite to few-shot learning, the ML model must already be trained on a viable data set. For example, consider a visual ML model trained to recognize bird species from thousands of diverse and accurately classified bird images.
If a new species of bird is discovered and only a few labeled images of it exist, few-shot learning can incorporate this new species into the model's training. Because the new data fits the underlying structures that the model has already learned, the model can learn to recognize the new species with only a handful of images.
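One common way to implement this is sketched below in PyTorch, assuming the torchvision library (and a pretrained-weights download): freeze the pretrained backbone and train only a new classification head on the handful of new images. The class count, label index and random tensors are hypothetical stand-ins for a real bird data set.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18, ResNet18_Weights

# Extend a pretrained image classifier to recognize a new class using only
# a handful of labeled examples. The backbone is frozen, so the few samples
# train only the small classification head.
NUM_CLASSES = 201  # hypothetical: 200 known bird species + 1 new species

model = resnet18(weights=ResNet18_Weights.DEFAULT)
for param in model.parameters():
    param.requires_grad = False          # freeze the pretrained backbone
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)  # new trainable head

# Placeholder "few-shot" batch: 5 images of the new species.
# In practice these would be real, preprocessed photos.
images = torch.randn(5, 3, 224, 224)
labels = torch.full((5,), 200)           # class index of the new species

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

model.train()
for _ in range(20):                      # a few quick passes over 5 samples
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()
```

In practice, a few examples of the previously learned species would also be replayed during this step so the model doesn't forget them; the sketch shows only the core mechanics.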
N-shot learning
The practical challenges of training data availability and quality have spawned a general category of ML training called n-shot learning, where n represents some small number of samples.
There are three typical variations of n-shot learning, illustrated in the episode-sampling sketch after this list:
- Few-shot learning, which uses a relatively small set of labeled data points to train an ML model.
- One-shot learning, a variant of few-shot learning that uses just one labeled sample for training.
- Zero-shot learning, an extreme approach that attempts to handle new data without any existing data samples.
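The n in n-shot learning maps directly to how training and evaluation episodes are built. The sketch below, using hypothetical helper names, samples one "N-way, K-shot" episode from a labeled pool; setting k_shot=1 yields one-shot evaluation, while zero-shot gives the model no labeled examples at all.

```python
import random
from collections import defaultdict

# Build one "N-way K-shot" evaluation episode from a labeled pool.
# n_way = number of classes in the episode; k_shot = labeled support
# examples per class (k_shot=1 is one-shot learning).
def sample_episode(dataset, n_way=5, k_shot=3, n_query=2, seed=42):
    """dataset: list of (features, label) pairs. Returns support and query sets."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in dataset:
        by_class[y].append(x)
    classes = rng.sample(sorted(by_class), n_way)
    support, query = [], []
    for c in classes:
        examples = rng.sample(by_class[c], k_shot + n_query)
        support += [(x, c) for x in examples[:k_shot]]
        query += [(x, c) for x in examples[k_shot:]]
    return support, query

# Toy pool: 10 classes, 8 examples each (features are just ints here)
pool = [(i * 100 + j, i) for i in range(10) for j in range(8)]
support, query = sample_episode(pool)
print(len(support), len(query))  # 15 support examples, 10 queries
```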
When is few-shot learning appropriate?
While supervised learning is often the ideal approach for ML models, it isn't always desirable, practical or even possible in real-world scenarios. Few-shot learning can supplement supervised learning in various situations, including the following:
- Data costs too much. Many businesses do not possess enough data to adequately train an ML model, leading them to purchase or license additional data from outside sources. If the costs of doing so are excessive, few-shot learning might be a more attractive option.
- Data isn't properly labeled. Training data requires accurate labels, but data quality problems and imperfect labeling are common. Often, labeling is performed by people with limited knowledge of the data. Poor labeling and other data quality issues might lead a business to use few-shot learning.
- Data is limited or doesn't exist. Ample data examples do not exist for every possible topic. For example, diagnosing rare diseases, identifying new species or analyzing unique samples might require few-shot learning due to the scarcity of training data.
- Compute resources are limited. Training an ML model using supervised learning can demand significant time and compute resources. A business that cannot allocate these resources might turn to few-shot learning.
Few-shot learning use cases
Numerous ML and AI fields can use few-shot learning, including the following:
- Computer vision. Few-shot learning can support tasks such as character recognition, image classification, object recognition, object tracking and object labeling. Few-shot learning can provide new classifications for similar but different data, as in the above example of adding a new bird species to a library of previously learned species.
- Robotics. Few-shot learning enables robots to learn tasks based on limited human demonstrations, such as how to move from one place to another or how to assemble certain parts.
- Audio processing. Few-shot learning can support tasks such as voice cloning, conversation and translation when few audio samples exist. For example, a speech recognition system could learn to accurately identify and transcribe a new speaker's voice with just a handful of voice samples.
- Natural language processing. Few-shot learning can assist with NLP tasks such as parsing, translation, sentence completion and sentiment analysis. For example, if an ML translation platform incorrectly translates a word, few-shot learning can correct the context, pronunciation or usage, or add new words to an existing translation capability.
- Healthcare. Few-shot learning is well proven in image analysis and can be a powerful addition to image processing and diagnostic capabilities. For example, it can help an image processing platform distinguish between a normal cell and various abnormal expressions of that cell, such as a cancerous version.
- Math and analytics. Few-shot learning can be applied in situations where data is limited or specific queries are not fully supported by the existing training data. It's useful for IoT analytics and mathematical tasks such as curve fitting and reasoning.
Few-shot learning vs. few-shot prompting: What's the difference?
The term few-shot has gained rapid acceptance in ML and AI circles, leading to some confusion due to its application in both few-shot learning and few-shot prompting.
Few-shot learning is an ML technique designed to expand a model's existing training using limited new examples or data points. It's useful for adding new data to a pretrained ML model or when little source data exists.
In contrast, few-shot prompting is a prompt engineering technique used in ML and AI to include several direct examples in the context of the user's prompt. Few-shot prompting is often used to help direct, format or structure an output in a preferred manner.
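To make the contrast concrete, here is a minimal sketch of few-shot prompting for sentiment analysis. Nothing about the model changes; the labeled examples live entirely inside the prompt text, which would then be sent to whatever LLM API is in use (the API call is omitted here because it varies by provider).

```python
# Few-shot prompting: the "training" examples are embedded directly in the
# prompt rather than used to update the model's weights.
EXAMPLES = [
    ("The package arrived two days early.", "positive"),
    ("The screen cracked within a week.", "negative"),
    ("It works, but setup took an hour.", "neutral"),
]

def build_few_shot_prompt(new_review: str) -> str:
    """Assemble a sentiment-classification prompt with in-context examples."""
    lines = ["Classify the sentiment of each review as positive, negative or neutral.", ""]
    for review, label in EXAMPLES:
        lines.append(f"Review: {review}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    lines.append(f"Review: {new_review}")
    lines.append("Sentiment:")
    return "\n".join(lines)

print(build_few_shot_prompt("Battery life is outstanding."))
```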
Advantages of few-shot learning
Few-shot learning offers several key advantages:
- Reduced data collection. Few-shot learning takes the focus off data collection, saving time, money and storage resources. Less data also means less labeling and classification work.
- Reduced computing resources. Few-shot learning requires significantly less computational power and time compared with supervised and other comprehensive ML model training approaches.
- Reduced data dependency. Few-shot learning reduces reliance on large data sets, enabling businesses to develop models and deliver meaningful platforms even when data is scarce or too expensive to acquire.
- Greater model flexibility. Models capable of supporting few-shot learning can adapt quickly to new data or situations using limited data sets, enhancing versatility in busy, rapidly changing environments.
Disadvantages of few-shot learning
Despite its advantages, few-shot learning has several potential drawbacks that business and project leaders should consider:
- Lower diversity. Comprehensive ML training benefits from high data diversity. For example, learning from a thousand images of bluebirds in varied poses, locations and plumages increases a model's probability of accurately identifying that species in the future. With only a few examples, the model's ability to succeed with new data might suffer.
- Limited support for complexity. ML models see relationships between data points, and complex tasks often require significant amounts of data to establish those patterns and relationships. Few-shot learning might not be suitable for complex tasks or determinations due to insufficient data samples.
- Exaggerated impact of bad data. Models trained with few-shot learning are severely affected by erroneous or incomplete data samples, known as noisy data. This makes data selection and usage critical in few-shot learning.
- Memorization instead of understanding. Limited data samples used in few-shot learning might cause the model to memorize the samples rather than analyze and assimilate data into useful patterns, a phenomenon known as overfitting. An overfitted model provides good results with its training data, but poor results with new, live data.
Stephen J. Bigelow, senior technology editor at TechTarget, has more than 20 years of technical writing experience in the PC and technology industry.