Data management strategies
Data is a corporate asset that helps make more-informed business decisions, improve marketing campaigns, optimize business operations and reduce costs. Learn data management strategies to ensure IT systems run business applications and provide analytical information that drives decision-making and strategic planning by corporate executives, business managers and end users.
Top Stories
-
Tip
19 Mar 2025
Key considerations for data-intensive architectures
A shift toward data-driven decision-making amplifies the need for scalable systems that deliver real-time business insights supported by quality data. Continue Reading
By- Priyank Gupta, Sahaj Software
-
News
19 Mar 2025
Coalesce acquires data catalog upstart to aid transformation
After raising $50 million in venture capital funding in 2024, the startup acquired CastorDoc to add AI-powered data catalog capabilities to its data preparation platform. Continue Reading
By- Eric Avidon, Senior News Writer
-
News
15 Jan 2021
Informatica takes Customer 360 master data management to cloud
Updated MDM service benefits from integrations with the broader cloud-native Informatica platform that is built on top of a microservices Kubernetes-based architecture. Continue Reading
-
Feature
07 Jan 2021
Open source database comparison to choose the right tool
These are four of the most popular open source relational databases available to enterprises with a comparison chart to help you find the best option to fit your data. Continue Reading
-
Feature
30 Dec 2020
What FAIR data management means for your enterprise
The FAIR principles were made to promote the sharing of data in the research field, but their guidance can help organizations in other industries improve their own data practices. Continue Reading
-
Feature
28 Dec 2020
ChaosSearch looks to bring order to data lakes
Data lakes are like junk drawers in the sky, but new tech from ChaosSearch organizes the mess and makes it searchable. Here, CEO Ed Walsh shares the details and what's next in 2021. Continue Reading
-
Feature
23 Dec 2020
New data warehouse schema design benefits business users
The Unified Star Schema is a revolution in data warehouse schema design. Learn the benefits of this new architecture and read an excerpt from a new book about it. Continue Reading
By- Kara E. Joyce
- Technics Publications
-
News
24 Nov 2020
IBM to deliver refurbished Db2 for the AI and cloud era
IBM has a tuned-up version of Db2 planned, featuring a handful of AI and machine learning capabilities to make it easier for users to send and manage Db2 data across clouds. Continue Reading
By- Ed Scannell, Freelancer
-
Feature
20 Nov 2020
Maintaining data integrity key for data quality
Maintaining data integrity through improved communication and data literacy is paramount for enterprises seeking to ensure data quality and trust. Continue Reading
-
Feature
19 Nov 2020
Why understanding data structures is so important to coders
Jay Wengrow talks about his new book on data structures and algorithms and considerations for making your choices as efficient as possible. Continue Reading
-
News
12 Nov 2020
AWS Glue DataBrew a new no-code data preparation tool
AWS Glue DataBrew is a new feature that will enable users to extract, transform and load data to get it ready for analysis without having to write code. Continue Reading
By- Eric Avidon, Senior News Writer
-
News
12 Nov 2020
Databricks builds out SQL Analytics for data lakehouse
Databricks is building out its lakehouse platform with a new SQL Analytics service that will make it easier to run SQL queries with better visualization in the cloud. Continue Reading
-
News
03 Nov 2020
Informatica update improves cloud data management
Informatica is out with its fall 2020 update, integrating capabilities from its July acquisition of Compact Solutions that advance its enterprise data catalog. Continue Reading
-
News
30 Oct 2020
Dremio speeds up cloud data lakes for business intelligence
The Dremio fall 2020 update brings new performance to the vendor's cloud data lake engine technology, including Apache Arrow-based caching and runtime filtering. Continue Reading
-
Feature
16 Oct 2020
How AI data privacy can help your enterprise
Enterprises benefit in many ways from AI data privacy tools that reduce the need for manual efforts from data professionals. Read on for top use cases for the growing technology. Continue Reading
-
News
13 Oct 2020
Upsolver advances open cloud data lake, data pipeline efforts
Upsolver enhanced its data preparation platform to transform data lake content into a data lakehouse structure that enables data queries and analysis. Continue Reading
-
Feature
28 Sep 2020
3 growing applications of AI in data management
There are plenty of ways AI can augment data professionals throughout the data pipeline, from sifting through large data sets for duplicates to easing the preparation process. Continue Reading
By- Andy Hayler, Information Difference
-
Feature
25 Sep 2020
Key steps in the feature engineering process
Feature engineering is key to machine learning algorithms. Read on to learn how those features are created and chosen to increase the accuracy of those models. Continue Reading
-
News
22 Sep 2020
Ahana releases managed Cloud for Presto service
New Ahana cloud system brings a managed service to market that integrates data management and visualization capabilities to aid in business intelligence and data analytics. Continue Reading
-
Feature
15 Sep 2020
The evolving role of the chief data officer
The job of the chief data officer is expanding to be more strategic as the need for organizations to connect and make sense of vast amounts of data continues to grow. Continue Reading
-
Feature
03 Sep 2020
Reltio sets new course for master data management strategy
The founder of data management firm Reltio provides insight into the evolving nature of master data management, as organizations look to get a more complete view of their data. Continue Reading
-
Definition
31 Aug 2020
Apache Hadoop YARN
Apache Hadoop YARN is the resource management and job scheduling technology in the open source Hadoop distributed processing framework. Continue Reading
By- Craig Stedman, Industry Editor
- Jack Vaughan
-
News
20 Aug 2020
Elastic 7.9 platform improves data observability
Elastic updated its platform with a new unified agent for collecting data from different sources and a newly enhanced view for data observability across different types of data. Continue Reading
-
Feature
14 Aug 2020
How to streamline your data cleansing process
Data cleansing is an important part of maintaining data quality, and the process is easier if you keep ahead of it by upholding governance and quality standards. Continue Reading
By- Andy Hayler, Information Difference
-
Feature
07 Aug 2020
Top 7 data catalog use cases for enterprises
From data lake modernization to data democratization, there are many benefits to a data catalog. Experts talk about the top ways data catalog adoption can benefit enterprises. Continue Reading
-
News
04 Aug 2020
Isima emerges with bi(OS) converged data platform
Startup launches with the general availability of a converged platform that offers the promise of integrating multiple data management and analysis steps into a single offering. Continue Reading
-
News
30 Jul 2020
Hasura Cloud launches GraphQL-as-a-service platform
Hasura made its GraphQL platform available as a managed service, enabling organizations to connect and query different data sources using the open source GraphQL language. Continue Reading
-
Feature
17 Jul 2020
Top 5 feature engineering tips for better models
From understanding a model's expected goal to factoring in subject matter expertise, experts talk about the best ways to improve your feature engineering. Continue Reading
-
Feature
13 Jul 2020
Key points for a monitoring center pandemic action plan
On-site monitoring centers come under stress when it's necessary for most workers to telecommute. Here are key points to include in a crisis plan to continue service availability. Continue Reading
-
Feature
06 Jul 2020
Key factors for successful data lake implementation
There are many important parts to a data lake implementation, from technology to governance. Read on for the top factors to evaluate in your implementation strategy. Continue Reading
-
News
02 Jul 2020
Ahana brings PrestoDB SQL query engine to the cloud
Soon after announcing its seed round of funding, Ahana launched commercial support services for the open source federated SQL data query technology. Continue Reading
-
Feature
25 Jun 2020
10 chief data officer trends that are reshaping the role
The chief data officer role is appearing in more industries and changing in responsibilities. Experts talk about the ways the position is evolving across enterprises. Continue Reading
-
Feature
19 Jun 2020
Yellowbrick CEO outlines hybrid path to cloud data warehouse
Neil Carson, co-founder and CEO of Yellowbrick, details the state of the data warehouse market and outlines use cases for hybrid on-premises and cloud deployments. Continue Reading
-
Feature
10 Jun 2020
Choosing a modern data warehouse to fit your data needs
Choosing the right modern data warehouse for your enterprise data is a large task. Here are the key evaluation points to consider when choosing your platform. Continue Reading
-
Feature
04 Jun 2020
Organization and automation ease data preparation process
By laying down proper groundwork and investing in automated checks, companies can ease the data preparation process and ensure they are getting the most out of their data. Continue Reading
-
Feature
26 May 2020
Benefits of a data catalog and why you need one
Data catalogs help across enterprises, from breaking down data silos to ensuring data privacy. A new book from Technics Publications explains the benefits of data catalogs. Continue Reading
By- Kara E. Joyce
- Technics Publications
-
Feature
08 May 2020
Top -- and bottom -- 5 Apache Kafka use cases
Apache Kafka has many applications in big data, but what enterprise use cases are the best fit for the tool? Experts discuss where Kafka works best for your data. Continue Reading
-
Feature
01 May 2020
How to ensure your data lake security
Your data lake is full of sensitive information and securing that data is a top priority. These are the best practices to keep that information safe from hackers. Continue Reading
-
News
27 Apr 2020
Elastic expands data management efforts to workplace search
Elastic previewed a new search service that brings enterprise and software-as-a-service data sources together to help users find relevant data, wherever it resides. Continue Reading
-
Feature
21 Apr 2020
Master data management best practices for insurers
For insurers, master data management has the potential to make their operations more efficient and effective, but it requires company commitment and investment. Continue Reading
-
Feature
17 Apr 2020
Common data lake challenges and how to overcome them
Managing the data contained in your enterprise data lake presents many challenges. From the amount of data to data inconsistencies, here are some solutions to common issues. Continue Reading
By- Andy Hayler, Information Difference
-
Feature
25 Mar 2020
The business benefits of enterprise data governance and MDM
Data leaders from prominent large organizations provide insights into data governance best practices and benefits, at Informatica's MDM 360 and Data Governance virtual conference. Continue Reading
-
News
20 Mar 2020
Databricks bolsters security for data analytics tool
Databricks looks to bridge the gap between on-premises controls and the cloud with data analytics security policies in the company's latest platform update. Continue Reading
-
Feature
16 Mar 2020
How to build an effective streaming data architecture
Data architecture can be tricky when it comes to real-time analytics. Clear objectives and scalability are important factors when determining the streaming data architecture you need. Continue Reading
-
Tip
28 Jan 2020
Should you host your data lake in the cloud?
On premises or in the cloud: What's the better place for your data lake? Here are some things to consider before deciding where to deploy a big data environment. Continue Reading
By- Andy Hayler, Information Difference
-
News
22 Jan 2020
New Confluent Platform release boosts event streaming quality
Based on the open-source Kafka event streaming platform, the Confluent Platform 5.4 update adds new capabilities to help meet enterprise data management requirements. Continue Reading
-
News
20 Dec 2019
Apache Kafka version 2.4 improves streaming data performance
The latest release of the Apache Kafka open source event streaming platform adds improved replication and availability capabilities to help boost overall performance. Continue Reading
-
Feature
11 Nov 2019
Managing unstructured data is crucial to enterprises' AI goals
Unstructured data makes up a huge portion of most businesses' data volume. But, with data-hungry AI systems coming online, making sense of these stores has never been more important. Continue Reading
By- Bill Marcus
-
News
18 Oct 2019
Databricks contributes Delta Lake to the Linux Foundation
Databricks has found a new home at the Linux Foundation for its open source Delta Lake data lake project, in a bid to help grow a broader community and accelerate adoption. Continue Reading
-
Tip
08 Oct 2019
7 steps to a successful data lake implementation
Flooding a Hadoop cluster with data that isn't well organized and managed can stymie analytics efforts. Take these steps to help make your data lake accessible and usable. Continue Reading
By- David Loshin, Knowledge Integrity Inc.
-
News
25 Sep 2019
Cloudera Data Platform gives big data users multi-cloud path
Cloudera released a big data platform combining its technologies and ones from Hortonworks, initially in the AWS cloud but with multi-cloud support to come. Continue Reading
By- Craig Stedman, Industry Editor
-
News
11 Sep 2019
Stibo Systems advances multidomain MDM system
The new Stibo Systems 9.2 update expands the MDM platform's features with Sisense business intelligence integration and machine learning capabilities. Continue Reading
-
News
05 Sep 2019
Cloudera Data Platform to debut, as big data fortunes waver
The interim CEO of Cloudera is cautiously optimistic about growth prospects as the big data vendor acquires Arcadia Data and plans to bolster its own cloud platform. Continue Reading
-
Feature
27 Aug 2019
10 Apache Kafka best practices for data management pros
How can data management teams most effectively deploy and use Apache Kafka in data pipelines and streaming applications? Here are some key guidelines to follow. Continue Reading
-
Definition
26 Aug 2019
What is deterministic/probabilistic data?
Deterministic and probabilistic are opposing terms that can be used to describe customer data and how it is collected. Deterministic data is also referred to as first-party data. Probabilistic data is information that is based on relational patterns and the likelihood of a certain outcome. Continue Reading
-
News
08 Aug 2019
Elastic Stack 7.3 adds new features to expand data analysis
Elastic Stack, in a new update, adds data frames capabilities to enable new forms of analysis, as well as integrating more data sources that can be searched and analyzed. Continue Reading
-
News
06 Aug 2019
HPE buys MapR assets to fuel AI applications
Longtime independent big data vendor MapR goes out of business, selling technology and intellectual property to HPE. The move marks the continuing decline of the Hadoop market. Continue Reading
By- Shaun Sutner, Senior News Director
-
Tip
20 Jun 2019
Building leaner, meaner BI data sources
As business intelligence analysis and reporting platforms become increasingly important in the enterprise, so does the data that feeds them. Are your BI data sources up to par? Continue Reading
By- Scott Robinson, New Era Technology
-
Feature
13 Jun 2019
Microservices and big data start to get closer
Microservices are riding a wave of user interest, leading to changes in IT operations. ThoughtWorks expert Zhamak Dehghani discusses what that means for big data. Continue Reading
-
News
31 May 2019
MapR's future in jeopardy, layoffs loom
It's right there in a MapR letter to California's labor department: A leader in the Hadoop market is desperately seeking funding after poor sales of its promising data platform. Continue Reading
-
Tip
28 May 2019
The evolution of the data preparation process and market
Organizations have long struggled with inconsistent data and other issues. Expert Andy Hayler explores how that has led to the rise of the data preparation tools market. Continue Reading
By- Andy Hayler, Information Difference
-
Feature
22 May 2019
The main picks for Hadoop distributions on the market
Check out the current top Hadoop distribution vendors in the market to help you determine which product is best for your company. Continue Reading
-
Feature
21 May 2019
Inside view of Tibco integration architecture planning
Tibco's acquisitions of well-regarded, small software specialists such as SnappyData are part of a drive toward what it calls 'connected intelligence.' CTO Nelson Petracek provides background. Continue Reading
-
News
20 May 2019
Lumina launches Radiance, a data risk management platform
In an effort to prevent data loss, Lumina launched Radiance, a SaaS data risk management platform. It collects and analyzes data to help prevent risks and threats. Continue Reading
-
News
30 Apr 2019
Snowflake CEO Bob Muglia talks cloud data warehouse evolution
In this Q&A, now-former Snowflake CEO Bob Muglia discusses the vendor's decision to embrace cloud data warehousing and how the industry is changing as more data moves to the cloud. Continue Reading
By- Brian Holak and Jack Vaughan
-
News
17 Apr 2019
Google takes a run at enterprise cloud data management
New Google Cloud boss Thomas Kurian is putting databases and data management at the forefront at Google. The vendor has forged key data deals, showing a more mature Google Cloud. Continue Reading
-
Feature
17 Apr 2019
4 factors to consider in a Hadoop distributions comparison
Examine the key characteristics necessary to evaluate in a Hadoop distribution comparison, focusing on enterprise features, subscription options and deployment models. Continue Reading
By- David Loshin, Knowledge Integrity Inc.
-
News
15 Apr 2019
Kafka at center of new event processing infrastructure
Events are as important as data in emerging applications underlying many e-commerce efforts. Streams of events tell a company what motivates customers to use online products. Continue Reading
-
Feature
05 Apr 2019
USAA adds data engineering skills to speed data science work
When the United Services Automobile Association's data science team wasn't getting data in the right format, the team lead realized the USAA needed more data engineers. Continue Reading
-
News
04 Apr 2019
Tools manage performance for big data cloud applications
Tools such as Unravel and Pepperdata offer a way to measure performance of big data cloud applications, which may aid companies with on-premises configuration issues. Continue Reading
-
Tip
29 Mar 2019
5 things to know about deploying big data systems in data containers
Planning for security and container APIs, and watching out for infrastructure sprawl, are some of the issues to be aware of before deploying big data in containers. Continue Reading
-
News
25 Mar 2019
Facebook alumni forge own paths to big data analytics tools
Startups Interana and Rockset differ in their approaches to providing new query capabilities on fast-arriving big data. Both are led by technologists who started at Facebook. Continue Reading
-
Feature
27 Feb 2019
8 tips to improve the data curation process
A data curation and modeling strategy can ensure accuracy and enhance governance. Experts offer eight best practices for curating data. First, start at the source. Continue Reading
-
Reference
27 Feb 2019
states of digital data
A state of digital data is a way to describe the current functionality of a data file. There are three major states of data: data at rest, data in motion and data in use. Continue Reading
-
Feature
25 Feb 2019
Explore Hadoop distributions to manage big data
Discover the uses of Hadoop distributions and the first steps in evaluating these products, as well as how the merger of rivals Cloudera and Hortonworks affects the market. Continue Reading
By- David Loshin, Knowledge Integrity Inc.
-
News
14 Feb 2019
Originators form group to boost Presto SQL query engine
The Presto engine arose as an alternative to Hive for big data queries. Now, the Presto Software Foundation has formed to promote the SQL query software's virtues. Continue Reading
-
News
01 Feb 2019
Cloud data management, security top of mind for government
Federal government data officers grapple with cloud data management, weighing lower cost and efficiencies against security threats and vendor lock-in. Continue Reading
-
News
15 Jan 2019
Cloudera and Hortonworks combo to push CDP, machine learning
Two wunderkinds of Hadoop have formalized their merger. Cloudera and Hortonworks say they will place special focus on AI as they chart the stand-alone vendor's future. Continue Reading
-
Podcast
19 Dec 2018
Open source support was central to 2018 data deals
Mergers and acquisitions unsettled the big data status quo in 2018. Open source support made these couplings a bit different than those of the past, Talking Data podcasters said. Continue Reading
-
Tip
20 Nov 2018
Trifacta data prep tool helps blend disparate data sources
Handling diverse data sources usually consumes precious developer time. That led healthcare CRM company SymphonyRM to hand the data prep task to business analysts. Continue Reading
-
Tip
31 Oct 2018
How to build a master data index: Static vs. dynamic indexing
Expert David Loshin explores the differences between static and dynamic indexing in master data management systems, and which queries each approach can support. Continue Reading
By- David Loshin, Knowledge Integrity Inc.
-
Tip
31 Oct 2018
How deterministic and probabilistic matching work
Expert David Loshin explores the benefits and challenges of the two classes of record matching in master data management systems: deterministic matching vs. probabilistic matching. Continue Reading
By- David Loshin, Knowledge Integrity Inc.
-
Feature
11 Oct 2018
Cloud buoys data microservices -- for on-premises systems, too
Data in a microservices architecture is percolating anew. This news analysis looks at IBM Cloud Private for Data and other means to harmonize data in public and private locations. Continue Reading
-
News
04 Oct 2018
Cloudera-Hortonworks merger narrows Hadoop users' options
Hadoop users will have fewer choices as big data rivals Cloudera and Hortonworks unite. But the new company may be more competitive with AWS and Google. Continue Reading
-
Tip
27 Sep 2018
The gradual evolution of master data management software
Master data management began with a bang, then hit roadblocks due to complexity. Now, MDM is shifting toward more pragmatic projects tied to data governance. Continue Reading
By- Andy Hayler, Information Difference
-
Opinion
21 Sep 2018
5 trends driving the big data evolution
The speedy evolution of big data technologies is connected to five trends, including practical applications of machine learning and cheap, abundantly available compute resources. Continue Reading
By- Mike Matchett, Small World Big Data
-
News
13 Sep 2018
Containers key for Hortonworks alliance on big data hybrid
Hortonworks is joining with Red Hat and IBM to work together on a hybrid big data architecture format that will run using containers both in the cloud and on premises. Continue Reading
-
News
07 Sep 2018
Big data tooling rolls with the changing seas of analytics
Hadoop data tooling is expanding. A view holds that Hadoop is moving from alternate data warehousing to a full-fledged big data analytics offering. Continue Reading
-
Definition
06 Sep 2018
customer data integration (CDI)
Customer data integration (CDI) is the process of defining, consolidating and managing customer information across an organization's business units and systems to achieve a "single version of the truth" for customer data. Continue Reading
By- Jacqueline Biscobing, Senior Managing Editor, News
-
News
08 Aug 2018
Confluent Platform 5.0 aims to mainstream Kafka streaming
Confluent Platform updates seek to bring data streaming with Apache Kafka to a wider audience. A new GUI and user-defined functions are part of the 5.0 release. Continue Reading
-
Podcast
02 Jul 2018
Hadoop data governance services surface in wake of GDPR
GDPR influence is touching a Hadoop big data world that was immune to many privacy considerations until now. This podcast features the rise of Hadoop data governance for data lakes. Continue Reading
-
Tip
28 Jun 2018
Four first steps for customer data management
Forrester's Mike Gualtieri details how to develop a unified plan to manage customer data that gives business users what they need to manage CRM programs. Continue Reading
By- Mike Gualtieri, Forrester Research
-
News
25 Jun 2018
Hadoop data lake architecture tests IT on data integration
Hortonworks users talk about building Hadoop data lakes to support new applications -- and the challenges their teams face on ingesting and refining data for end users. Continue Reading
-
News
18 Jun 2018
Hortonworks cloud options grow via Google, Microsoft, IBM
Hortonworks now supports Google Cloud Storage and has also broadened cloud deals with Microsoft and IBM, aiming to increase cloud uses of its big data platform. Continue Reading
-
Feature
07 Jun 2018
Google Cloud data lake fuels cloud payment processing flow
To create a cloud payment processing system, Global Payments first had to deploy a data lake in the Google Cloud. Getting quick user feedback was another early step. Continue Reading
-
Tip
05 Jun 2018
Why Spark DataFrame, lazy evaluation models outpace MapReduce
Learn how the Spark DataFrame execution plan works and why its lazy evaluation model helps the processing engine to avoid the performance issues inherent in Hadoop MapReduce. Continue Reading
By- David Loshin, Knowledge Integrity Inc.
-
Podcast
01 Jun 2018
Starburst finds new worlds to conquer with SQL query engine
Relational databases may have hit a wall of late, but the SQL query engine seems poised for wider growth. Starburst, a retro startup of sorts, is among those looking to take it wider still. Continue Reading
-
Feature
09 Apr 2018
IT teams take big data security issues into their own hands
Data security needs to be addressed upfront in deployments of big data systems -- and users are likely to find they have to build some security capabilities themselves. Continue Reading
By- Craig Stedman, Industry Editor
-
Podcast
29 Mar 2018
Kubernetes container orchestration gets big data star turn
The new thing in big data is Kubernetes container orchestration. While it's still early, there are signs of activity, which are cited in this edition of the Talking Data podcast. Continue Reading