Data and data management
Terms related to data, including definitions about data warehousing and words and phrases about data management.
- file synchronization (file sync) - File synchronization (file sync) is a method of keeping files that are stored in several different physical locations up to date.
- firmographic data - Firmographic data is information that can be used to categorize organizations, such as location, name, number of clients and industry.
- FIX protocol (Financial Information Exchange protocol) - The Financial Information Exchange (FIX) protocol is an open specification intended to streamline electronic communications in the financial securities industry.
- foreign key - A foreign key is a column or columns of data in one table that refers to the unique data values -- often the primary key data -- in another table.
- framework - In general, a framework is a real or conceptual structure intended to serve as a support or guide for the building of something that expands the structure into something useful.
- garbage in, garbage out (GIGO) - Garbage in, garbage out, or GIGO, refers to the idea that in any system, the quality of output is determined by the quality of the input.
- General Data Protection Regulation (GDPR) - The General Data Protection Regulation (GDPR) is legislation that updated and unified data privacy laws across the European Union (EU).
- Google BigQuery - Google BigQuery is a cloud-based big data analytics web service for processing very large read-only data sets.
- Google Cloud Storage - Google Cloud Storage is an enterprise public cloud storage platform that can house large unstructured data sets.
- GPS coordinates - GPS coordinates are a unique identifier of a precise geographic location on the earth, usually expressed in alphanumeric characters.
- gradient descent - Gradient descent is an optimization algorithm that refines a machine learning (ML) model's parameters to create a more accurate model.
- Gramm-Leach-Bliley Act (GLBA) - The Gramm-Leach-Bliley Act (GLB Act or GLBA), also known as the Financial Modernization Act of 1999, is a federal law enacted in the United States to control the ways financial institutions deal with the private information of individuals.
- grid computing - Grid computing is a system for connecting a large number of computer nodes into a distributed architecture that delivers the compute resources necessary to solve complex problems.
- gzip (GNU zip) - Gzip (GNU zip) is a free and open source file format and software utility for file compression, based on the DEFLATE algorithm.
- Hadoop - Hadoop is an open source distributed processing framework that manages data processing and storage for big data applications in scalable clusters of computer servers.
- Hadoop data lake - A Hadoop data lake is a data management platform comprising one or more Hadoop clusters.
- Hadoop Distributed File System (HDFS) - The Hadoop Distributed File System (HDFS) is the primary data storage system Hadoop applications use.
- hashing - Hashing is the process of transforming any given key or a string of characters into another value.
- health informatics - Health informatics is the practice of acquiring, studying and managing health data and applying medical concepts in conjunction with health information technology systems to help clinicians provide better healthcare.
- Health IT (health information technology) - Health IT (health information technology) is the area of IT involving the design, development, creation, use and maintenance of information systems for the healthcare industry.
- heartbeat (computing) - In computing, a heartbeat is a program that runs specialized scripts automatically whenever a system is initialized or rebooted.
- heat map (heatmap) - A heat map is a two-dimensional representation of data in which various values are represented by colors.
- hierarchy - Generally speaking, hierarchy refers to an organizational structure in which items are ranked in a specific manner, usually according to levels of importance.
- histogram - A histogram is a type of chart that shows the frequency distribution of data points across a continuous range of numerical values.
- historical data - Historical data, in a broad context, is data collected about past events and circumstances pertaining to a particular subject.
- IBM IMS (Information Management System) - IBM IMS (Information Management System) is a database and transaction management system that was first introduced by IBM in 1968.
- ICD-10-CM (Clinical Modification) - The ICD-10-CM (International Classification of Diseases, 10th Revision, Clinical Modification) is a system used by physicians and other healthcare providers to classify and code all diagnoses, symptoms and procedures related to inpatient and outpatient medical care in the United States.
- IDoc (intermediate document) - IDoc (intermediate document) is a standard data structure used in SAP applications to transfer data to and from SAP system applications and external systems.
- in-memory analytics - In-memory analytics is an approach to querying data residing in a computer's random access memory (RAM) as opposed to querying data stored on physical drives.
- in-memory database - An in-memory database is a type of analytic database designed to streamline the work involved in processing queries.
- inductive argument - An inductive argument is an assertion that uses specific premises or observations to make a broader generalization.
- infographic - An infographic (information graphic) is a representation of information in a graphic format designed to make the data easily understandable at a glance.
- information - Information is the output that results from analyzing, contextualizing, structuring, interpreting or in other ways processing data.
- information asset - An information asset is a collection of knowledge or data that is organized, managed and valuable.
- information assurance (IA) - Information assurance (IA) is the practice of protecting physical and digital information and the systems that support the information.
- information governance - Information governance is a holistic approach to managing corporate information by implementing processes, roles, controls and metrics that treat information as a valuable business asset.
- information lifecycle management (ILM) - Information lifecycle management (ILM) is a comprehensive approach to managing an organization's data and associated metadata, from its creation and acquisition until it becomes obsolete and is deleted.
- information rights management (IRM) - Information rights management (IRM) is a discipline that involves managing, controlling and securing content from unwanted access.
- information systems (IS) - An information system (IS) is an interconnected set of components used to collect, store, process and transmit data and digital information.
- inline deduplication - Inline deduplication is the removal of redundancies from data before or as it is being written to a backup device.
- IT incident management - IT incident management is a component of IT service management (ITSM) that aims to rapidly restore services to normal following an incident while minimizing adverse effects on the business.
- Java Database Connectivity (JDBC) - Java Database Connectivity (JDBC) is an API packaged with Java SE that makes it possible to connect from a Java Runtime Environment (JRE) to external relational database systems.
- job - In certain computer operating systems, a job is the unit of work that a computer operator -- or a program called a job scheduler -- gives to the OS.
- job scheduler - A job scheduler is a computer program that enables an enterprise to schedule and, in some cases, monitor computer 'batch' jobs (units of work).
- job step - In certain computer operating systems, a job step is part of a job, a unit of work that a computer operator (or a program called a job scheduler) gives to the operating system.
- JOLAP (Java Online Analytical Processing) - JOLAP (Java Online Analytical Processing) is a Java application programming interface (API) for the Java 2 Platform, Enterprise Edition (J2EE) environment that supports the creation, storage, access and management of data in an online analytical processing (OLAP) application.
- key-value pair (KVP) - A key-value pair (KVP) is a set of two linked data items: a key, which is a unique identifier for some item of data, and the value, which is either the data that is identified or a pointer to the location of that data.
- knowledge base - In general, a knowledge base is a centralized repository of information.
- knowledge management (KM) - Knowledge management is the process an enterprise uses to gather, organize, share and analyze its knowledge in a way that's easily accessible to employees.
- knowledge-based systems (KBSes) - Knowledge-based systems (KBSes) are computer programs that use a centralized repository of data known as a knowledge base to provide a method for problem-solving.
- laboratory information system (LIS) - A laboratory information system (LIS) is computer software that processes, stores and manages data from patient medical processes and tests.
- Lambda architecture - Lambda architecture is an approach to big data management that provides access to batch processing and near real-time processing with a hybrid approach.
- legal health record (LHR) - A legal health record (LHR) refers to documentation about a patient's personal health information that is created by a healthcare organization or provider.
- Lisp (programming language) - Lisp, an acronym for list processing, is a functional programming language that was designed for easy manipulation of data strings.
- LTO-8 (Linear Tape-Open 8) - LTO-8, or Linear Tape-Open 8, is a tape format from the Linear Tape-Open Consortium released in late 2017.
- MariaDB - MariaDB is an open source relational database management system (RDBMS) that is a compatible drop-in replacement for the widely used MySQL database technology.
- Massachusetts data protection law - The Massachusetts data protection law is legislation that stipulates security requirements for organizations that handle the private data of residents.
- master data - Master data is the core data that is essential to operations in a specific business or business unit.
- master data management (MDM) - Master data management (MDM) is a process that creates a uniform set of data on customers, products, suppliers and other business entities from different IT systems.
- medical scribe - A medical scribe is a professional who specializes in documenting patient encounters in real time under the direction of a physician.
- metadata - Often referred to as data that describes other data, metadata is structured reference data that helps to sort and identify attributes of the information it describes.
- Microsoft Azure - Microsoft Azure, formerly known as Windows Azure, is Microsoft's public cloud computing platform.
- Microsoft Azure Data Lake - Microsoft Azure Data Lake is a highly scalable public cloud service that allows developers, scientists, business professionals and other Microsoft customers to gain insight from large, complex data sets.
- Microsoft MyAnalytics - Microsoft MyAnalytics is a personal analytics application in Office 365 that enables employees to gain insights into how they spend their time at work and how they can work smarter.
- Microsoft Office SharePoint Server (MOSS) - Microsoft Office SharePoint Server (MOSS) is the full version of a portal-based platform for collaboratively creating, managing and sharing documents and Web services.
- Microsoft Power BI - Microsoft Power BI is a business intelligence (BI) platform that provides nontechnical business users with tools for aggregating, analyzing, visualizing and sharing data.
- Microsoft System Center - Microsoft System Center is a suite of software products designed to simplify the deployment, configuration and management of IT infrastructure and virtualized software-defined data centers.
- Microsoft Visual FoxPro (Microsoft VFP) - Microsoft Visual FoxPro (VFP) is an object-oriented programming environment with a built-in relational database engine.
- middleware - Middleware is software that bridges the gap between applications and operating systems by providing a method for communication and data management.
- Monte Carlo simulation - A Monte Carlo simulation is a mathematical technique that simulates the range of possible outcomes for an uncertain event.
- MPP database (massively parallel processing database) - An MPP database is a database optimized for parallel processing, in which many operations are performed by many processing units at the same time.
- multidimensional database (MDB) - A multidimensional database (MDB) is a type of database that is optimized for data warehouse and online analytical processing (OLAP) applications.
- national identity card - A national identity card is a portable document, typically a plasticized card with digitally embedded information, that is used to verify aspects of a person's identity.
- noisy data - Noisy data is a data set that contains extra meaningless data.
- normal distribution - A normal distribution is a type of continuous probability distribution in which most data points cluster toward the middle of the range, while the rest taper off symmetrically toward either extreme.
- NoSQL (Not Only SQL database) - NoSQL is an approach to database management that can accommodate a wide variety of data models, including key-value, document, columnar and graph formats.
- NVDIMM (Non-Volatile Dual In-line Memory Module) - An NVDIMM (non-volatile dual in-line memory module) is hybrid computer memory that retains data during a service outage.
- object-oriented database management system (OODBMS) - An object-oriented database management system (OODBMS), sometimes shortened to ODBMS for object database management system, is a database management system (DBMS) that supports the modeling and creation of data as objects.
- OLAP (online analytical processing) - OLAP (online analytical processing) is a computing method that enables users to easily and selectively extract and query data in order to analyze it from different points of view.
- Open Database Connectivity (ODBC) - Open Database Connectivity (ODBC) is an open standard application programming interface (API) that allows application programmers to easily access data stored in a database.
- operational data store (ODS) - An operational data store (ODS) is a type of database that's often used as an interim logical area for a data warehouse.
- operational efficiency - Operational efficiency refers to an organization's ability to reduce waste of time, effort and material while still producing a high-quality service or product.
- operational intelligence (OI) - Operational intelligence (OI) is an approach to data analysis that enables decisions and actions in business operations to be based on real-time data as it's generated or collected by companies.
- Oracle - Oracle is one of the largest vendors in the enterprise IT market and the shorthand name of its flagship product, a relational database management system (RDBMS) that's formally called Oracle Database.
- pandemic plan - A pandemic plan is a documented strategy for business continuity in the event of a widespread outbreak of a dangerous infectious disease.
- parallel file system - A parallel file system is a software component designed to store data across multiple networked servers.
- pebibyte (PiB) - A pebibyte (PiB) is a unit of measure that describes data capacity.
- performance and accountability reporting (PAR) - Performance and accountability reporting (PAR) is the process of compiling and documenting factors that quantify an organization's achievements, efficiency and adherence to budget, comparing actual results against previously articulated goals.
- personal health record (PHR) - A personal health record (PHR) is an electronic summary of health information that is controlled by the patient rather than by the patient's healthcare provider.
- picture archiving and communication system (PACS) - Picture archiving and communication system (PACS) is a medical imaging technology used primarily in healthcare organizations to securely store and digitally transmit electronic images and clinically relevant reports.
- pivot table - A pivot table is a statistics tool that summarizes and reorganizes selected columns and rows of data in a spreadsheet or database table to obtain a desired report.
- PL/SQL (procedural language extension to Structured Query Language) - In Oracle database management, PL/SQL is a procedural language extension to Structured Query Language (SQL).
- precision agriculture - Precision agriculture (PA) is a farming management concept based on observing, measuring and responding to inter- and intra-field variability in crops.
- predictive modeling - Predictive modeling is a mathematical process used to predict future events or outcomes by analyzing patterns in a given set of input data.
- primary key (primary keyword) - A primary key, also called a primary keyword, is a column, or a set of columns, in a relational database table whose value is distinctive for each record.
- product data management (PDM) - Product data management (PDM) is the process of capturing and managing the electronic information related to a product so it can be reused in business processes such as design, production, distribution and marketing.
- public data - Public data is information that can be shared, used, reused and redistributed without restriction.
- qualitative data - Qualitative data is information that cannot be counted, measured or easily expressed using numbers.
- radiology information system (RIS) - A radiology information system (RIS) is a networked software system for managing medical imagery and associated data.
- raw data (source data or atomic data) - Raw data is the data originally generated by a system, device or operation that has not been processed or changed in any way.
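
The foreign key and primary key entries in this list can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The table names (customers, orders), the column names and the helper insert_rejected are hypothetical and chosen only for illustration. The sketch assumes an SQLite build with foreign key support, which has to be switched on per connection.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# customers.id is the primary key: its value must be distinct for each record.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# orders.customer_id is a foreign key that refers to the primary key of customers.
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customers(id),"
    " item TEXT NOT NULL)"
)

conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders (id, customer_id, item) VALUES (10, 1, 'tape drive')")


def insert_rejected(sql):
    """Return True if the statement violates a key constraint (hypothetical helper)."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# A second customer with id 1 would duplicate the primary key.
duplicate_pk_rejected = insert_rejected(
    "INSERT INTO customers (id, name) VALUES (1, 'Bob')"
)

# An order for customer 99 would point at a record that does not exist.
dangling_fk_rejected = insert_rejected(
    "INSERT INTO orders (id, customer_id, item) VALUES (11, 99, 'disk')"
)
```

In this sketch both rejected inserts raise sqlite3.IntegrityError, so the tables keep only the rows whose keys are consistent. Other database systems enforce the same two constraints, though the syntax and error types differ.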