Data analytics and AI
This glossary contains definitions related to customer data analytics, predictive analytics, data visualization and operational business intelligence. Some definitions explain the meaning of words used to describe Hadoop and other software tools for big data analytics. Other definitions are related to the strategies that business intelligence professionals, data scientists, statisticians and data analysts use to make data-driven decisions.

Algorithms
Terms related to procedures or formulas for solving a problem by conducting a sequence of specified actions. In computing, algorithms in the form of mathematical instructions play an important part in search, artificial intelligence (AI) and machine learning.
-
What is the Twofish encryption algorithm?
Twofish is a symmetric-key block cipher with a block size of 128 bits and a variable-length key of 128, 192 or 256 bits.
-
What is domain generation algorithm (DGA)?
A domain generation algorithm (DGA) is a program that generates a large list of domain names. DGAs provide malware with new domains to evade security countermeasures.
-
What are diffusion models?
Diffusion models are a category of generative AI that excels at creating images, audio, video and other types of data by using a two-step process: forward diffusion and reverse diffusion.
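To make the forward diffusion step more concrete, here is a minimal sketch that gradually mixes Gaussian noise into a short signal. It assumes a simple variance-preserving noise schedule; the names `forward_diffusion`, `betas` and `signal` are illustrative and do not come from any particular library. Reverse diffusion is not shown because it requires a trained model to predict and remove the noise.

```python
import math
import random


def forward_diffusion(x0, betas, rng):
    """Return the samples x_0..x_T, each one noisier than the last."""
    trajectory = [list(x0)]
    x = list(x0)
    for beta in betas:
        # Shrink the current values slightly and add fresh Gaussian noise.
        # beta sets how much noise this step adds (assumed schedule).
        x = [
            math.sqrt(1.0 - beta) * v + math.sqrt(beta) * rng.gauss(0.0, 1.0)
            for v in x
        ]
        trajectory.append(x)
    return trajectory


rng = random.Random(0)  # fixed seed so repeated runs match
betas = [0.02 * (i + 1) for i in range(10)]  # noise grows step by step
signal = [1.0, -1.0, 0.5, 0.0]
steps = forward_diffusion(signal, betas, rng)
print(len(steps), "snapshots from clean signal to noisy signal")
```

After enough steps the original signal is mostly replaced by noise, which is the starting point a trained model would work backward from during reverse diffusion.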
Artificial intelligence
Terms related to artificial intelligence (AI), including definitions about machine learning and words and phrases about training data, algorithms, natural language processing, neural networks and automation.
-
What is Gemma? Google's open sourced AI model explained
Gemma is a collection of lightweight open source generative AI models designed mainly for developers and researchers.
-
What is an automation engineer and how do you become one?
An automation engineer designs and develops autonomous systems to manage repetitive tasks and improve efficiency and productivity.
-
What is lemmatization?
Lemmatization is the process of grouping together different inflected forms of the same word.
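As a rough illustration of grouping inflected forms, the sketch below maps words to a base form using a tiny, hand-written lookup table. The table and the function names are hypothetical; production lemmatizers typically rely on large dictionaries and part-of-speech information rather than a fixed mapping like this one.

```python
from collections import defaultdict

# Hypothetical lookup table, written by hand for this example only.
LEMMA_TABLE = {
    "running": "run",
    "ran": "run",
    "runs": "run",
    "better": "good",
    "best": "good",
    "mice": "mouse",
}


def lemmatize(word):
    """Return the base form of a word, or the lowercased word if unknown."""
    lowered = word.lower()
    return LEMMA_TABLE.get(lowered, lowered)


def group_by_lemma(words):
    """Group the original words under their shared base form."""
    groups = defaultdict(list)
    for word in words:
        groups[lemmatize(word)].append(word)
    return dict(groups)


groups = group_by_lemma(["Running", "ran", "runs", "mice", "mouse", "best"])
print(groups)
```

Note that irregular forms such as "ran" or "mice" cannot be handled by simply stripping suffixes, which is why a dictionary lookup is used here.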
Data and data management
Terms related to data, including definitions about data warehousing and words and phrases about data management.
-
What is data curation?
Data curation is the process of creating, organizing and maintaining data sets so people looking for information can access and use them.
-
What is unstructured data?
Unstructured data is information, in many different forms, that doesn't follow conventional data models, making it difficult to store and manage in a mainstream relational database.
-
What is taxonomy in computing?
Taxonomy is the science of classification according to a predetermined system, with the resulting catalog used to provide a conceptual framework for discussion, analysis or information retrieval.
Database management
Terms related to databases, including definitions about relational databases and words and phrases about database management.
-
What is taxonomy in computing?
Taxonomy is the science of classification according to a predetermined system, with the resulting catalog used to provide a conceptual framework for discussion, analysis or information retrieval.
-
What is data preprocessing? Key steps and techniques
Data preprocessing, a component of data preparation, describes any type of processing performed on raw data to prepare it for another data processing procedure.
-
What is data cleansing (data cleaning, data scrubbing)?
Data cleansing, also referred to as data cleaning or data scrubbing, is the process of fixing incorrect, incomplete, duplicate or otherwise erroneous data in a data set.
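A minimal sketch of what data cleansing can look like in practice, assuming a small list of contact records with hypothetical `name` and `email` fields. It normalizes formatting and removes duplicates, and for simplicity it drops incomplete records rather than trying to repair them.

```python
def clean_records(records):
    """Normalize formatting, drop incomplete rows and remove duplicates."""
    cleaned = []
    seen_emails = set()
    for record in records:
        # Treat missing values as empty strings, then normalize formatting.
        name = (record.get("name") or "").strip().title()
        email = (record.get("email") or "").strip().lower()
        # Skip records with no name or no plausible email address.
        if not name or "@" not in email:
            continue
        # Skip duplicates, using the normalized email as the key.
        if email in seen_emails:
            continue
        seen_emails.add(email)
        cleaned.append({"name": name, "email": email})
    return cleaned


raw = [
    {"name": "  ada lovelace ", "email": "ADA@example.com"},
    {"name": "Ada Lovelace", "email": "ada@example.com "},
    {"name": "", "email": "nobody@example.com"},
    {"name": "Alan Turing", "email": None},
    {"name": "grace hopper", "email": "grace@example.com"},
]
result = clean_records(raw)
print(result)
```

Which field identifies a duplicate, and whether incomplete rows are dropped or repaired, are design choices that depend on the data set.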