What is an analytics database (analytical database)?
An analytics database, also called an analytical database, is a read-only system that stores historical data on business metrics such as sales performance and inventory levels. Business analysts, corporate executives and other workers run queries and reports against an analytics database to derive useful insights that can then inform and enhance their decisions. The information in an analytics database is regularly updated to include recent transactional data from an organization's operational systems.
Data in analytics databases
Analytical databases store and process massive amounts of data. They then generate insights that enable organizations to better understand their business, customers, competitors and markets. The information enables them to make better decisions to benefit the business. Some analytical databases are part of a data warehouse or data mart, which is designed to be flexible for users who need to create specific reports and queries.
The data stored in analytics databases can be of many types, such as pricing data, data about trading volumes in financial markets, transactional data, process data, machine data, etc. Some analytics databases may also be used to store and process semistructured or unstructured data, including natural language data. The data may come from many different sources, including enterprise resource planning (ERP) platforms, customer relationship management (CRM) tools and other business applications or proprietary data. Regardless of where the data comes from, analytics databases are optimized to deal with large volumes of data (big data) and also to perform analytics and querying at high speeds.
![Business intelligence vs. advanced analytics chart.](https://www.techtarget.com/rms/onlineImages/business_analytics-business_intelligence_vs_advanced_analytics_mobile.png)
Key characteristics of an effective analytics database
An analytics database is designed to store, manage, process and query massive amounts of data for the purposes of data analytics and reporting. The ability to deal with big data and to make the data easy to query, accessible and usable are critical features of analytics databases. Since these databases are capable of aggregating and analyzing large quantities of data, they are highly suitable for big data analytics and applications like market trend analysis and financial reporting.
Analytics databases are read-only systems that are optimized for read-heavy operations and complex queries. The way the databases are stored is also optimized. The use of columnar storage means that only the specific column associated with a particular query needs to be accessed and its related data processed. That makes it possible for the database to quickly find and retrieve individual records, regardless of the size of the data set.
Data retrieval in analytics databases is further accelerated by the use of data indexing. Various advanced indexing strategies are used to support different types of queries and to increase the speed at which they are executed.
Parallel processing is an important feature of efficient analytics databases. Query workloads are distributed over numerous servers (nodes) which enables faster query response times and data retrieval.
Analytics databases are built to be scalable, which is vital to accommodate the ever-increasing size of data sets used by modern organizations and to ensure that performance does not degrade with expanding data volumes. Analytics databases also compress data, which speeds up query performance, thus enhancing user experiences. Data compression also reduces the physical storage requirements for data and allows the database to continue to efficiently manage large data volumes.
Benefits of analytics databases
The ability to hold large volumes of data while being optimized for fast querying makes it possible for analytics databases to generate responses in real time. That capability makes them ideal for processing data from sources that continuously generate large quantities of data, such as IoT sensors, biometrics devices, social media and streaming video.
Most analytics databases include advanced statistical functionalities (aggregating, filtering, etc.). Those features allow them to generate better insights and create more useful reports for end users. Users can further improve data analysis and reporting by integrating the database with a data visualization tool or business intelligence (BI) platform.
Finally, most analytics databases support Structured Query Language. Considering that SQL is one of the most popular query languages in use today, SQL compatibility makes it easier for users to deal with and query the data in an analytics database.
Components and types of analytics databases
Two major components of analytical databases are the data model used and the query language supported.
Data model. The data model is the logical structure through which data is stored and retrieved. It visually represents how the database organizes and manages data and how data flows through the database. Two common types of data models are relational and object-oriented. In a relational data model, the data is stored in separate tables and joined when necessary, while in an object-oriented data model, the data is stored in fields of objects and represented as objects in software. Database data models can also be hierarchical (data is organized into a tree-like structure), network (linked records have many-to-many relationships) or object-oriented (the database is defined as a collection of reusable objects).
Query language. A query language is a standardized and interpreted programming language for the retrieval of information from any database management system, including analytics databases, without needing to know the internal structure of the database. The information is usually, but not always, in the form of a table or a set of tables.
SQL is ubiquitous and allows for efficient data manipulation and querying. Using SQL, analysts can perform complex queries, quickly retrieve and clean data, and generate useful reports. SQL is a declarative programming language -- as opposed to a procedural one -- meaning that its syntax is defined by the data it operates on rather than the steps to manipulate the data.
Other query languages used for analytics include the following:
The language chosen would depend on many factors, including the purpose of the database and the type of data to be stored in it. For example, SQL, MySQL and PostgreSQL are ideal for joining large data sets and dealing with complex queries. However, when working with large volumes of diverse data types -- which may include unstructured data -- NoSQL may be a better choice for a query language.
Types of analytics databases
There are five main types of analytic databases.
Columnar databases. A columnar database stores data in large contiguous blocks of memory called data columns. This is different from a row-oriented database, where data is stored in tables split across columns and rows. The use of columns makes columnar databases well suited to analytical processing and for meeting the requirements of data warehouses.
Columnar databases also provide a solution to the growing problem of data sprawl in organizations. Since they are designed to store, analyze and retrieve data by column, they can accelerate data loading and improve data analytics performance. Their ability to scale makes them valuable in environments characterized by large data volumes. The columnar architecture also allows for efficient data compression and fast aggregate queries. Examples of columnar databases include MariaDB and Apache Kudu.
Massively parallel processing databases. In an MPP database, data is stored on multiple servers rather than on a single server. Each server has its own set of data and its own computing power. MPP databases typically provide high availability and redundancy by using clusters of computers to serve multiple users. They also efficiently handle large amounts of data while providing fast access to that data. Examples include Apache Redshift, Vertica and Teradata.
In-memory databases. In-memory databases are optimized to store data in random-access memory. This contrasts with traditional databases that rely on solid-state or hard drives to store both data and program instructions. Because data is stored in memory, in-memory databases provide faster query processing and a near-real-time response. Storing data in memory offers several potential benefits over traditional databases, including less I/O overhead and the ability to handle larger databases. Examples of in-memory databases include SAP S/4HANA, Oracle TimesTen and Aerospike.
Online analytical processing databases. An OLAP database is an online database that stores multidimensional cubes of aggregated data. This structure is designed to facilitate data analysis based on multiple data attributes. Since the data is stored in a multidimensional format, OLAP databases can handle higher data volumes than traditional databases. In addition to running fast analytics, these databases can also create reports faster and more efficiently than traditional databases.
Examples of database structures include star schemas and snowflake schemas. OLAP databases are used in many industries, including financial services and retail. Examples include Snowflake and IBM Cognos.
![OLAP process diagram.](https://www.techtarget.com/rms/onlineimages/crm-olap_mobile.png)
Data warehouse appliances. A data warehouse is an information repository used to centralize, store and index vast amounts of data from disparate sources. Data warehouse appliances are integrated hardware and software tools designed to simplify and expedite data warehouse implementation and management, and to optimize the data warehouse performance. DWAs are commonly used in BI to track and analyze data from many sources, including ERP systems, CRM software and other databases. Numerous vendors offer data warehouse appliances, including IBM, Microsoft, Oracle and Teradata.
Analytics database vs. transactional database
An analytics database is specifically designed to support BI and analytics applications, typically as part of a data warehouse or data mart. This differentiates it from a transactional database, including online transaction processing databases that are used for processing sales transactions like order entries. Transactional databases are optimized for high-volume CRUD (Create, Read, Update, Delete) operations and for ensuring data integrity in real-time applications (e.g., sales). In contrast, analytics databases are optimized for dealing with large data volumes, complex queries and read-heavy environments, rather than for high transaction volumes or real-time applications.
Some transactional databases can also support data warehouses and BI applications. However, analytics databases generally offer performance and scalability advantages over transactional databases and also over conventional relational databases.
Analytics databases are also different from operational databases. The latter are not designed for efficient analysis, but are best-suited for looking up single rows of information related to operations.
Thinking of migrating to an open source database? Learn how in this article, which details the steps and considerations for making the move.