Get started
Bring yourself up to speed with our introductory content.
DataOps
DataOps is an Agile approach to designing, implementing and maintaining a distributed data architecture that will support a wide range of open source tools and frameworks in production.
Comparing DBMS vs. RDBMS: Key differences
A relational database management system is the most popular type of DBMS for business uses. Find out how RDBMS software differs from other DBMS technologies.
data observability
Data observability is a process and set of practices that aim to help data teams understand the overall health of the data in their organization's IT systems.
What key roles should a data management team include?
These 10 roles, with different responsibilities, are commonly a part of the data management teams that organizations rely on to make sure their data is ready to use.
Data tenancy maturity model boosts performance and security
A data tenancy maturity model can boost an organization's data operations and help improve the protection of customer data. Improvement is tracked through tiers of data tenancy.
Definitions to Get Started
- What is data transformation? Definition, types and benefits
- What is data egress? How it works and how to manage costs
- What is data?
- What is Microsoft Visual FoxPro (VFP)?
- What is corporate performance management (CPM)?
- What is NoSQL (Not Only SQL database)?
- What is a data fabric?
- What is Structured Query Language (SQL)?
What is a data warehouse analyst?
Data warehouse analysts help organizations manage the repositories of analytics data and use them effectively. Here's a look at the role and its responsibilities.
Data observability benefits entire data pipeline performance
Data observability benefits include improving data quality and identifying issues in the pipeline process, but it also poses challenges that organizations must solve to succeed.
OPAC (Online Public Access Catalog)
An OPAC (Online Public Access Catalog) is an online bibliography of a library collection that is available to the public.
primary key (primary keyword)
A primary key, also called a primary keyword, is a column, or set of columns, in a relational database table whose value is unique for each record.
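The uniqueness rule in this definition can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not taken from the linked article: the customers table, its columns and the try_insert helper are hypothetical names chosen for the example.

```python
import sqlite3

# In-memory database; the table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute("INSERT INTO customers (customer_id, name) VALUES (1, 'Ada')")


def try_insert(customer_id, name):
    """Return True if the row was stored, False if the primary key rejected it."""
    try:
        conn.execute(
            "INSERT INTO customers (customer_id, name) VALUES (?, ?)",
            (customer_id, name),
        )
        return True
    except sqlite3.IntegrityError:
        # Raised when the new row would duplicate an existing primary key value.
        return False


duplicate_accepted = try_insert(1, "Grace")  # reuses the key of an existing row
new_key_accepted = try_insert(2, "Grace")    # uses a key that is not yet taken
```

Here the database rejects the first insert because its key is already in use, while the second insert succeeds even though it repeats a non-key value.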
How to reap the benefits of data integration, step by step
A new book lays out a strong case for data integration and guides readers in how to carry out this essential process.
How to build an effective DataOps team
More organizations are turning to DataOps to bolster their data management operations. Learn how to build a team with the right people to ensure DataOps success.
data catalog
A data catalog is a software application that creates an inventory of an organization's data assets to help data professionals and business users find relevant data for analytics uses.
Key roles and responsibilities of the modern chief data officer
Chief data officer roles and responsibilities are expanding beyond data strategy, as they are increasingly tasked with cultivating a data-driven culture.
What is data lineage? Techniques, best practices and tools
Organizations can bolster data governance efforts by tracking the lineage of data in their systems. Get advice on how to do so and how data lineage tools can help.
The evolution of the chief data officer role
Chief data officers are taking on additional responsibilities beyond data management as they strive to transform organizations' data culture and focus on value creation.
10 trends shaping the chief data officer role
As data use increases and organizations turn to business intelligence to optimize information, these 10 chief data officer trends are shaping the role.
DBMS keys: 8 types of keys defined
Here's a guide to primary, super, foreign and candidate keys, what they're used for in relational database management systems and the differences among them.
How to build a data catalog: 10 key steps
A data catalog helps business and analytics users explore data assets, find relevant data and understand what it means. Here are 10 important steps for building one.
How to evaluate and optimize data warehouse performance
Organizations build data warehouses to satisfy their information management needs. Data warehouse optimization can help ensure that these warehouses achieve their full potential.
6 key steps to develop a data governance strategy
Data governance shouldn't be built around technology, but the other way around. Existing infrastructure, executive support, data literacy, metrics and proper tools are essential.
7 best practices for successful data governance programs
A comprehensive, companywide data governance program strengthens data infrastructure, improves compliance initiatives, supports strategic intelligence and boosts customer loyalty.
3 considerations for a data compliance management strategy
A data compliance management strategy is key for organizations to protect data the right way. Different positions have responsibility to ensure industry regulations are met.
Data Dredging (data fishing)
Data dredging -- sometimes referred to as data fishing -- is a data mining practice in which large data volumes are analyzed to find any possible relationships between them.
5 key elements of data tenancy
Data tenancy is a key piece of any data protection scheme and can be crafted around five building blocks to provide safe, secure data access to users.
data stewardship
Data stewardship is the management and oversight of an organization's data assets to help provide business users with high-quality data that is easily accessible in a consistent manner.
What a big data strategy includes and how to build one
Companies analyze stores of big data to improve how they operate. But those efforts will bring diminishing returns without a big data strategy. Here's how to build one.
How big data collection works: Process, challenges, techniques
Taming large amounts of data from multiple sources and deriving the greatest value to ensure trusted business decisions hinge on a foolproof system for collecting big data.
Self-service data preparation: What it is and how it helps users
Using self-service tools to properly prepare data simplifies analytics and visualization tasks for business users and speeds complex modeling processes for data scientists.
data profiling
Data profiling refers to the process of examining, analyzing, reviewing and summarizing data sets to gain insight into the quality of data.
data preprocessing
Data preprocessing, a component of data preparation, describes any type of processing performed on raw data to prepare it for another data processing procedure.
data cleansing (data cleaning, data scrubbing)
Data cleansing, also referred to as data cleaning or data scrubbing, is the process of fixing incorrect, incomplete, duplicate or otherwise erroneous data in a data set.
tree structure
A tree structure is a hierarchical data structure for placing and locating files (called records or keys) in a database.
What is a data mart (datamart)?
A data mart is a repository of data that is designed to serve a particular community of knowledge workers.
data lake
A data lake is a storage repository that holds a vast amount of raw data in its native format until it is needed for analytics applications.
compliance
Compliance is the state of being in accordance with established guidelines or specifications, or the process of becoming so.
What is dark data?
Dark data is digital information an organization collects, processes and stores that is not currently being used for business purposes.
What is semantic technology?
Semantic technology is a set of methods and tools that provide advanced means for categorizing and processing data, as well as for discovering relationships within varied data sets.
What is Apache Flink?
Apache Flink is a distributed data processing platform for use in big data applications, primarily involving analysis of data stored in Hadoop clusters.
What is Apache Spark?
Apache Spark is an open source parallel processing framework for running large-scale data analytics applications across clustered computers.
What is a multimodel database?
A multimodel database is a data processing platform that supports multiple data models, which define the parameters for how the information in a database is organized and arranged.
How to choose exactly the right data story for your audience
A data practitioner has two jobs: tell the right data story and tell it in the right way to win over project stakeholders, data expert Larry Burns says in his latest book.
Quiz: Test your understanding of the Hadoop ecosystem
This quiz will test your knowledge of Hadoop's basics, including its framework, capabilities and related technologies.
stream processing
Stream processing is a data management technique that involves ingesting a continuous data stream to quickly analyze, filter, transform or enhance the data in real time.
9 steps to a dynamic data architecture plan
Learn the nine steps to a comprehensive data architecture plan, including C-suite support, data personas, user needs, governance, catalogs, SWOT, lifecycles, blueprints and maps.
How to build a successful cloud data architecture
As enterprises vacate the premises and migrate their operations skyward, a cloud data architecture can provide the long-term flexibility to improve workflows, costs and security.
Db2
Db2 is a family of database management system (DBMS) products from IBM that serve a number of different operating system (OS) platforms.
Building a big data architecture: Core components, best practices
To process the infinite volume and variety of data collected from multiple sources, most enterprises need to get with the program and build a multilayered big data architecture.
Establish big data integration techniques and best practices
A big data integration strategy departs from traditional techniques, embraces several data processes working together and accounts for the volume, variety and velocity of data.
Who belongs on a high-performance data governance team?
Putting together a high-quality data governance team can be a challenge. Explore the necessary team members and best practices for a high-performing team.
How parallelization works in streaming systems
Dive into this book excerpt from 'Grokking Streaming Systems' and learn the crucial role the parallelization process plays in the design of a streaming system.
How a DataOps pipeline can support your data
DataOps has created a lot of hype as a data management pipeline because of its focus on collaboration and flexibility. Read on to find out how these priorities support your data.
Open source database migration guide: How to transition
Open source database transitions have been on the rise as they prove to be worthy competitors to commercial database options, but that transition requires strategy and user buy-in.
Why your data story matters and how to tell it
Data storytelling isn't just for business analysts. Find out how to build a data management story and why you need to have one in the first place.
Creating a data advantage by building a data ecosystem
Developing a data ecosystem will improve personalization and customer retention. Find out how data mining across channels can build a data advantage for your organization.
Enterprise data lakes hold the key to actionable insights
Technological pillars of sound business decisions, AI, machine learning and advanced analytics depend on the quantity, quality and integrity of information in data lakes.
What is feature engineering?
Feature engineering is the process that takes raw data and transforms it into features that can be used to create a predictive model using machine learning or statistical modeling, such as deep learning.
Top 5 U.S. open data use cases from federal data sets
The U.S. government has made data sets from many federal agencies available for public access to use and analyze. Check out some of the ways that data is being used.
Quiz on MongoDB 4 new features and database updates
Check out this excerpt from the new book Learn MongoDB 4.x from Packt Publishing, then quiz yourself on new updates and features to the database.
Google BigQuery
Google BigQuery is a cloud-based big data analytics web service for processing very large read-only data sets.
Why understanding data structures is so important to coders
Jay Wengrow talks about his new book on data structures and algorithms and about considerations for making your choices as efficient as possible.
Key steps in the feature engineering process
Feature engineering is key to machine learning algorithms. Read on to learn how those features are created and chosen to increase the accuracy of those models.
Apache Hadoop YARN
Apache Hadoop YARN is the resource management and job scheduling technology in the open source Hadoop distributed processing framework.
What is data aggregation?
Data aggregation is any process whereby data is gathered and expressed in a summary form.
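As a rough illustration of "gathered and expressed in a summary form," the sketch below groups a few raw records and reduces each group to a count, total and average. The sample sales records and the aggregate function are hypothetical, made up for this example rather than drawn from the linked definition.

```python
from collections import defaultdict

# Hypothetical raw records: (region, sale amount).
sales = [("north", 120.0), ("south", 80.0), ("north", 30.0), ("south", 20.0)]


def aggregate(records):
    """Group records by region and summarize each group."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for region, amount in records:
        totals[region] += amount
        counts[region] += 1
    # One summary row per region replaces the individual records.
    return {
        region: {
            "count": counts[region],
            "total": totals[region],
            "average": totals[region] / counts[region],
        }
        for region in totals
    }


summary = aggregate(sales)
```

Four detail rows become two summary rows, one per region, which is the kind of reduction a reporting query or dashboard typically performs.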
How to ensure your data lake security
Your data lake is full of sensitive information and securing that data is a top priority. These are the best practices to keep that information safe from hackers.
When a DIY database management system design is the best fit
Learn how a combination of homegrown, off-the-shelf and open source tools, plus proper motivation, can yield a DIY DBMS that meets corporate expectations, needs and ROI.
Building a database application the DIY way
Business users experience the trials, tribulations and exultations of building a DIY DBMS, especially when IT expertise is not readily available or costs are too high.
Developing an enterprise data strategy: 10 steps to take
Consultants detail 10 to-do items for data management teams looking to create a data strategy to help their organization use data more effectively in business operations.
What is Extract, Load, Transform (ELT)?
Extract, Load, Transform (ELT) is a data integration process for transferring raw data from a source server to a data system (such as a data warehouse or data lake) on a target server and then preparing the information for downstream uses.
Data warehousing design and value change with the times
Big data, the cloud and analytics profoundly shape data warehouse purpose and design. Learn how companies derive value from a repository that at times needs definition.
Third-party database tools boast attractive alternatives
For companies considering third-party database tools, this handbook provides expert advice on evaluating and deploying on-premises and cloud options from third parties.
T-SQL (Transact-SQL)
T-SQL (Transact-SQL) is a set of programming extensions from Sybase and Microsoft that add several features to the Structured Query Language (SQL), including transaction control, exception and error handling, row processing and declared variables.
What is database normalization?
Database normalization is intrinsic to most relational database schemes. It is a process that organizes data into tables so that results are always unambiguous.
What is a pivot table?
A pivot table is a statistics tool that summarizes and reorganizes selected columns and rows of data in a spreadsheet or database table to obtain a desired report.
SQL Server database design best practices and tips for DBAs
Good database design is a must to meet processing needs in SQL Server systems. In a webinar, consultant Koen Verbeeck offered advice on how to make that happen.
Big data containers gain wider appeal in system deployments
This handbook examines the use of Docker containers in Kubernetes clusters to run big data systems and offers insight on container deployment and management issues.
Data virtualization tools promote anywhere, anytime data access
This online handbook examines data virtualization software and how organizations are deploying and using the technology as part of their data integration processes.
Check SQL Server Query Store performance impact before using
Many IT teams hesitate to use SQL Server Query Store due to performance concerns. Consultant Andy Warren offers tips on how to test and get started with Query Store.
What is Data as a Service (DaaS)?
Data as a Service (DaaS) is an information provision and distribution model in which data files (including text, images, sounds, and videos) are made available to customers over a network, typically the Internet.
Advice on enterprise data cleansing from an SAP VP
SAP's Kristin McMahon details data cleansing best practices and explains why a good data cleanse needs continual communication, collaboration and oversight.
Data model design tips to help standardize business data
Data models should be understandable to business users and kept to a reasonable scope, say the leaders of a data modeling initiative at England's Environment Agency.
USAA adds data engineering skills to speed data science work
When the United Services Automobile Association's data science team wasn't getting data in the right format, the team lead realized the USAA needed more data engineers.
5 things to know about deploying big data systems in data containers
Planning for security and container APIs, and watching out for infrastructure sprawls are some issues to be aware of before deploying big data in containers.
DataOps is more than DevOps for data, Delphix CTO says
Data operations is young compared to DevOps, but it is increasingly used as part of projects that put data at the center of development. Here, Delphix CTO Eric Schrock makes observations about the trend.
HR makes major strides toward improving employee engagement
5 FAQs on SQL Server containers and how to manage them
Running SQL Server in containers creates new challenges for database administrators. The answers to these questions can guide you through some of them.
The Power BI-PowerShell cmdlet cheat sheet
DBAs can manage Power BI data sets, workspaces and reports with PowerShell. Using the two tools together makes for a more efficient and effective workflow.
Azure Data Studio (formerly SQL Operations Studio)
Azure Data Studio is a Microsoft tool, originally named SQL Operations Studio, for managing SQL Server databases and cloud-based Azure SQL Database and Azure SQL Data Warehouse systems.
SQL vs. NoSQL: What do you know about the database designs?
The decision to use a SQL database or a NoSQL database can be made wisely only if the ins and outs of both are understood. See how well you know the database architectures.
11 features to look for in data quality management tools
As the need for quality data has increased, so have the capabilities of data quality tools. Learn how collaboration, data lineage and other features enable data quality.
AI for analytics augments and bolsters business intelligence
What is an enterprise data strategy?
Defining a data strategy can help focus an organization's data management initiatives -- but it isn't the same as data governance. Expert Anne Marie Smith explains why.
customer data integration (CDI)
Customer data integration (CDI) is the process of defining, consolidating and managing customer information across an organization's business units and systems to achieve a "single version of the truth" for customer data.
5 to-dos for your GDPR compliance checklist
It's never too late to fine-tune your GDPR strategy. Expert Anne Marie Smith suggests a current state analysis of your PII protections, drafting a data privacy policy and more.
2 ways to attach SQL Server database files to Linux containers
SQL Server files can be stored outside of Docker containers in host directories or volumes. Here's how to set up SQL Server on Linux databases and attach them to containers.
Cloud vs. legacy ERP systems: Tug of war intensifies for SMBs
Aging legacy ERP systems at SMBs seem to be getting plenty of scrutiny these days. Heightened consumer demands, shifting technology landscapes and relentless market disruptions, not to mention maintenance costs, technical support and obsolescence, ...
How to attach databases to custom SQL Server containers
Deploying SQL Server in Docker containers for production applications typically requires custom containers. Here are guidelines on how to attach databases to them.
Good data quality for machine learning is an analytics must
As companies add machine learning applications, they need to really understand -- and be able to improve -- their data. That's where data quality initiatives come in.
Six sample databases for SQL Server and how to find them
SQL Server sample databases are useful for test and dev, but they can be difficult to parse. Use this SQL database sample overview to decide which to use and how to access them.
The benefits of columnar storage and the Parquet file format
What's behind Apache Parquet's growing popularity? It may be the file format's columnar storage orientation, which leads to benefits including improved query performance.
Four first steps for customer data management
Forrester's Mike Gualtieri details how to develop a unified plan to manage customer data that gives business users what they need to manage CRM programs.