Data management strategies
Data is a corporate asset that helps make more-informed business decisions, improve marketing campaigns, optimize business operations and reduce costs. Learn data management strategies to ensure IT systems run business applications and provide analytical information that drives decision-making and strategic planning by corporate executives, business managers and end users.
Top Stories
-
News
20 Nov 2024
Snowflake partners with Anthropic to improve AI development
The alliance aims to make it easier and faster for the data cloud vendor's customers to use the Claude line of large language models when developing advanced applications. Continue Reading
By- Eric Avidon, Senior News Writer
-
Video
18 Nov 2024
Data engineers, data scientists and data analysts explained
Understanding the differences in the roles of data scientists, analysts and engineers can boost productivity and efficiency in a business. Continue Reading
By- Sabrina Polin, Managing Editor
-
Definition
01 Feb 2022
data transformation
Data transformation is the process of converting data from one format, such as a database file, XML document or Excel spreadsheet, into another. Continue Reading
-
Definition
31 Jan 2022
data preprocessing
Data preprocessing, a component of data preparation, describes any type of processing performed on raw data to prepare it for another data processing procedure. Continue Reading
-
News
27 Jan 2022
Confluent Kafka connects apps on new Dish 5G network
Streaming data is at the foundation of a new smart 5G service from Dish Network, where enterprises build applications that directly benefit and customize the network platform. Continue Reading
-
Feature
25 Jan 2022
Data access key to Regeneron's innovation efforts
After developing a COVID-19 treatment in mere months, Regeneron adopted a data catalog and is developing a data governance framework to speed up its drug development pipeline. Continue Reading
By- Eric Avidon, Senior News Writer
-
News
21 Jan 2022
Apache Hop data orchestration hits open source milestone
The open source technology moves beyond its roots to enable a full data platform as data moves from one source to another for operations, business intelligence and analytics. Continue Reading
-
News
20 Jan 2022
Coalesce launches with data transformation platform
The CEO and co-founder of Coalesce gives insight into the startup's mission and the challenges organizations face with transforming data so that it's useful for data analysts. Continue Reading
-
Feature
11 Jan 2022
'Building the Data Lakehouse' explores next-gen architecture
This book excerpt by 'father of the data warehouse' Bill Inmon and experts Mary Levins and Ranjeet Srivastava explores the latest methods for wrangling data into usable intel. Continue Reading
By- Technics Publications
-
News
11 Nov 2021
Informatica Cloud Data Marketplace brings data to business users
Fresh off its IPO, the data management vendor continues to expand its Intelligent Data Management Cloud services to enable organizations to effectively use data. Continue Reading
-
News
05 Nov 2021
Reltio advances master data management with $120M fund raise
Reltio now has a valuation of $1.7 billion as it raises new funding to help grow its cloud connected data platform to enable business processes and operations. Continue Reading
-
Definition
03 Nov 2021
What is data management as a service (DMaaS)?
Data management as a service (DMaaS) is a type of cloud service that provides enterprises with centralized storage for disparate data sources. Continue Reading
-
Definition
13 Oct 2021
data lake
A data lake is a storage repository that holds a vast amount of raw data in its native format until it is needed for analytics applications. Continue Reading
By- Craig Stedman, Industry Editor
- Ben Lutkevich, Site Editor
-
Definition
12 Oct 2021
What is dark data?
Dark data is digital information an organization collects, processes and stores that is not currently being used for business purposes. Continue Reading
-
Definition
06 Oct 2021
What is Apache Flink?
Apache Flink is a distributed data processing platform for use in big data applications, primarily involving analysis of data stored in Hadoop clusters. Continue Reading
-
Definition
06 Oct 2021
What is Apache Spark?
Apache Spark is an open source parallel processing framework for running large-scale data analytics applications across clustered computers. Continue Reading
-
Definition
05 Oct 2021
What is a multimodel database?
A multimodel database is a data processing platform that supports multiple data models, which define the parameters for how the information in a database is organized and arranged. Continue Reading
-
News
15 Sep 2021
Kafka streaming data gets governance from Confluent
Apache Kafka event streaming from Confluent helps online grocery provider Instacart deliver milk to customers when they need it, by helping to power real-time inventory updates. Continue Reading
-
Feature
09 Sep 2021
How VMware's CDO views data management
The chief data officer of the virtualization vendor details her views on the increasing challenges of dealing with growing volumes of data and the critical importance of data governance. Continue Reading
-
Quiz
31 Aug 2021
Quiz: Test your understanding of the Hadoop ecosystem
This quiz will test your knowledge of Hadoop basics, including its framework, capabilities and related technologies. Continue Reading
By- Melanie Luna, TechTarget
-
Feature
02 Aug 2021
The pros and cons of big data outsourcing
More companies are seeking outside help to capitalize on data's value. Examine the benefits and drawbacks that come with outsourcing big data processing projects. Continue Reading
-
News
22 Jul 2021
Observable expands enterprise data visualization platform
Observable expanded its platform for larger organizations to collaborate on data projects that make information more intuitive to understand and use. Continue Reading
-
Feature
16 Jul 2021
The value of PDF data extraction: Sifting for hidden data
During the process of data cleaning, there's a way to extract valuable hidden data. Learn how in this excerpt from 'Cleaning Data for Effective Data Science.' Continue Reading
By- David Mertz, Packt Publishing
-
Feature
16 Jun 2021
Healthcare device maker boosts production with data quality
One of the world's biggest ventilator manufacturers ramped up production during the pandemic by improving its own data health to better understand and optimize operations. Continue Reading
-
News
01 Jun 2021
After sluggish revenues, Cloudera goes private in $5.3B deal
Big data vendor Cloudera is looking to expand its SaaS capabilities by exiting the public markets and acquiring startups Cazena and Datacoral to bring new self-service features. Continue Reading
-
Feature
01 Jun 2021
How automated metadata management improves business insights
Automating metadata management can cut down time spent on tasks such as data tagging and cataloging. Explore how automated metadata management is improving data quality. Continue Reading
-
News
19 May 2021
Syncari raises $17.3M for automated cloud data sync
The data synchronization vendor is looking to solve the challenge of keeping data updated and automated across multiple cloud systems in a unified data model. Continue Reading
-
News
12 May 2021
Kafka users detail real-time data benefits
Confluent introduces a series of new real-time data efforts as users of the open source data-streaming technology outline real-world deployment applications. Continue Reading
-
Feature
11 May 2021
How to build an all-purpose big data pipeline architecture
Like a superhighway system and its many on- and off-ramps, an enterprise's big data pipeline transports infinite amounts of collected data from its sources to its destinations. Continue Reading
-
Feature
22 Apr 2021
Enterprise augmented data management benefits and growth
Gartner predicts plenty of growth in the booming augmented data management market, which helps data professionals focus on insights over administrative tasks. Continue Reading
-
Feature
09 Apr 2021
Why consider an augmented data catalog?
Automated and augmented data catalogs have been around for a few years, but adoption is still lagging. Find out why an enterprise may consider investing in the technology. Continue Reading
-
News
07 Apr 2021
Trifacta unveils new integrations to enable data wrangling
Trifacta on Wednesday unveiled an updated tool to enable customers to work with their data directly in Google BigQuery, along with two new integrations designed to improve data preparation. Continue Reading
By- Eric Avidon, Senior News Writer
-
Feature
31 Mar 2021
Why consider an open source data catalog
Enterprise data catalogs offer organizations plenty of benefits with metadata management and data organization. Find out why some enterprises choose open source data catalogs. Continue Reading
-
Guest Post
31 Mar 2021
How parallelization works in streaming systems
Dive into this book excerpt from 'Grokking Streaming Systems' and learn the crucial role the parallelization process plays in the design of a streaming system. Continue Reading
By- Josh Fischer and Ning Wang, Manning Publications
-
Feature
30 Mar 2021
How a DataOps pipeline can support your data
DataOps has created a lot of hype as a data management pipeline because of its focus on collaboration and flexibility. Read on to find out how these priorities support your data. Continue Reading
-
Feature
11 Mar 2021
Bias in big data: How to find it and mitigate influence
It's no secret that bias exists in large data sets; the key is addressing it. With transparency, diversity and accountability, limiting that bias can be possible. Continue Reading
-
Feature
25 Feb 2021
AWS Data Exchange and the third-party cloud data marketplace
The general manager of the AWS Data Exchange data feed service details what the cloud data marketplace is all about and where it's headed in the future. Continue Reading
-
Feature
22 Feb 2021
The top 5 graph database advantages for enterprises
Graph databases offer plenty of advantages to organizations in the way they connect data points to each other. Read on to see what experts say the top advantages are. Continue Reading
-
News
04 Feb 2021
Vendia raises $15.5M for serverless blockchain data sharing
Vendia is building out its data platform that uses distributed ledger blockchain technology to help organizations and developers more easily share data. Continue Reading
-
Feature
04 Feb 2021
Data catalog comparison to help you choose your best fit
Data catalog options vary across vendors, but, as with most decisions in the data realm, it takes self-knowledge to make the right choice and understand each option's capabilities. Continue Reading
-
Guest Post
29 Jan 2021
Creating a data advantage by building a data ecosystem
Developing a data ecosystem will improve personalization and customer retention. Find out how data mining across channels can build a data advantage for your organization. Continue Reading
By- Sankul Seth
-
News
29 Jan 2021
Apache Iceberg rising for new cloud data lake platforms
The open source Apache Iceberg project is helping to define a new data tier for cloud data lakes that can help to improve performance and access for large data sets. Continue Reading
-
Feature
28 Jan 2021
Pandemic exposes difficulty of data management in education
Limited resources and a shift to remote learning have shown the inequalities across school districts when it comes to data management and the negative impact this can have. Continue Reading
-
Feature
21 Jan 2021
Augmented data preparation the next step for self-service BI
Augmented data tools play a key role in expanding data use across organizations. Read on to find out how augmented data preparation tools democratize data in self-service BI. Continue Reading
-
News
15 Jan 2021
Informatica takes Customer 360 master data management to cloud
Updated MDM service benefits from integrations with the broader cloud-native Informatica platform that is built on top of a microservices Kubernetes-based architecture. Continue Reading
-
Feature
07 Jan 2021
Open source database comparison to choose the right tool
These are four of the most popular open source relational databases available to enterprises with a comparison chart to help you find the best option to fit your data. Continue Reading
-
Feature
30 Dec 2020
What FAIR data management means for your enterprise
The FAIR principles were made to promote the sharing of data in the research field, but their guidance can help organizations in other industries improve their own data practices. Continue Reading
-
Feature
28 Dec 2020
ChaosSearch looks to bring order to data lakes
Data lakes are like junk drawers in the sky, but new tech from ChaosSearch organizes the mess and makes it searchable. Here, CEO Ed Walsh shares the details and what's next in 2021. Continue Reading
-
Feature
23 Dec 2020
New data warehouse schema design benefits business users
The Unified Star Schema is a revolution in data warehouse schema design. Learn the benefits of this new architecture and read an excerpt from a new book about it. Continue Reading
By- Kara E. Joyce
- Technics Publications
-
Feature
21 Dec 2020
Data warehouse vs. data lake: Key differences
Data warehouses and data lakes are both data repositories common in the enterprise, but what are the main differences between the two and which is best for your data? Continue Reading
By- Bridget Botelho, Editorial Director, News
-
News
24 Nov 2020
IBM to deliver refurbished Db2 for the AI and cloud era
IBM has a tuned-up version of Db2 planned, featuring a handful of AI and machine learning capabilities to make it easier for users to send and manage Db2 data across clouds. Continue Reading
By- Ed Scannell, Freelancer
-
Feature
20 Nov 2020
Maintaining data integrity key for data quality
Maintaining data integrity through improved communication and data literacy is paramount for organizations in the enterprise seeking to ensure data quality and trust. Continue Reading
-
Feature
19 Nov 2020
Why understanding data structures is so important to coders
Jay Wengrow talks about his new book on data structures and algorithms and the considerations for making your choices as efficient as possible. Continue Reading
-
News
12 Nov 2020
AWS Glue DataBrew a new no-code data preparation tool
AWS Glue DataBrew is a new feature that will enable users to extract, transform and load data to get it ready for analysis without having to write code. Continue Reading
By- Eric Avidon, Senior News Writer
-
News
12 Nov 2020
Databricks builds out SQL Analytics for data lakehouse
Databricks is building out its lakehouse platform with a new SQL Analytics service that will make it easier to run SQL queries with better visualization in the cloud. Continue Reading
-
News
03 Nov 2020
Informatica update improves cloud data management
Informatica is out with its fall 2020 update, integrating capabilities from its July acquisition of Compact Solutions that advance its enterprise data catalog. Continue Reading
-
News
30 Oct 2020
Dremio speeds up cloud data lakes for business intelligence
The Dremio fall 2020 update brings new performance to the vendor's cloud data lake engine technology, including Apache Arrow-based caching and runtime filtering. Continue Reading
-
Feature
16 Oct 2020
How AI data privacy can help your enterprise
Enterprises benefit in many ways from AI data privacy tools that reduce the need for manual efforts from data professionals. Read on for top use cases for the growing technology. Continue Reading
-
News
13 Oct 2020
Upsolver advances open cloud data lake, data pipeline efforts
Upsolver enhanced its data preparation platform to transform data lake content into a data lakehouse structure that enables data queries and analysis. Continue Reading
-
Feature
28 Sep 2020
3 growing applications of AI in data management
There are plenty of ways AI can augment data professionals throughout the data pipeline, from sifting through large data sets for duplicates to easing the preparation process. Continue Reading
By- Andy Hayler, Information Difference
-
Feature
25 Sep 2020
Key steps in the feature engineering process
Feature engineering is key to machine learning algorithms. Read on to learn how those features are created and chosen to increase the accuracy of those models. Continue Reading
-
News
22 Sep 2020
Ahana releases managed Cloud for Presto service
New Ahana cloud system brings a managed service to market that integrates data management and visualization capabilities to aid in business intelligence and data analytics. Continue Reading
-
Feature
15 Sep 2020
The evolving role of the chief data officer
The job of the chief data officer is expanding to be more strategic as the need for organizations to connect and make sense of vast amounts of data continues to grow. Continue Reading
-
Feature
03 Sep 2020
Reltio sets new course for master data management strategy
The founder of data management firm Reltio provides insight into the evolving nature of master data management, as organizations look to get a more complete view of their data. Continue Reading
-
Definition
31 Aug 2020
Apache Hadoop YARN
Apache Hadoop YARN is the resource management and job scheduling technology in the open source Hadoop distributed processing framework. Continue Reading
By- Craig Stedman, Industry Editor
- Jack Vaughan
-
News
20 Aug 2020
Elastic 7.9 platform improves data observability
Elastic updated its platform with a new unified agent for collecting data from different sources and a newly enhanced view for data observability across different types of data. Continue Reading
-
Feature
14 Aug 2020
How to streamline your data cleansing process
Data cleansing is an important part of maintaining data quality, and the process is easier if you keep ahead of it by upholding governance and quality standards. Continue Reading
By- Andy Hayler, Information Difference
-
Feature
07 Aug 2020
Top 7 data catalog use cases for enterprises
From data lake modernization to data democratization, there are many benefits to a data catalog. Experts talk about the top ways data catalog adoption can benefit enterprises. Continue Reading
-
News
04 Aug 2020
Isima emerges with bi(OS) converged data platform
Startup launches with the general availability of a converged platform that offers the promise of integrating multiple data management and analysis steps into a single offering. Continue Reading
-
News
30 Jul 2020
Hasura Cloud launches GraphQL-as-a-service platform
Hasura made its GraphQL platform available as a managed service, enabling organizations to connect and query different data sources using the open source GraphQL language. Continue Reading
-
Feature
17 Jul 2020
Top 5 feature engineering tips for better models
From understanding a model's expected goal to factoring in subject matter expertise, experts talk about the best ways to improve your feature engineering. Continue Reading
-
Feature
13 Jul 2020
Key points for a monitoring center pandemic action plan
On-site monitoring centers come under stress when it's necessary for most workers to telecommute. Here are key points to include in a crisis plan to continue service availability. Continue Reading
-
Feature
06 Jul 2020
Key factors for successful data lake implementation
There are many important parts to a data lake implementation, from technology to governance. Read on for the top factors to evaluate in your implementation strategy. Continue Reading
-
News
02 Jul 2020
Ahana brings PrestoDB SQL query engine to the cloud
Soon after announcing its seed round of funding, Ahana launched commercial support services for the open source federated SQL data query technology. Continue Reading
-
Feature
25 Jun 2020
10 chief data officer trends that are reshaping the role
The chief data officer role is appearing in more industries and changing in responsibilities. Experts talk about the ways the position is evolving across enterprises. Continue Reading
-
Feature
19 Jun 2020
Yellowbrick CEO outlines hybrid path to cloud data warehouse
Neil Carson, co-founder and CEO of Yellowbrick, details the state of the data warehouse market and outlines use cases for hybrid on-premises and cloud deployments. Continue Reading
-
Feature
10 Jun 2020
Choosing a modern data warehouse to fit your data needs
Choosing the right modern data warehouse for your enterprise data is a large task. Here are the key evaluation points to consider when choosing your platform. Continue Reading
-
Feature
04 Jun 2020
Organization and automation ease data preparation process
By laying down proper groundwork and investing in automated checks, companies can ease the data preparation process and ensure they are getting the most out of their data. Continue Reading
-
Feature
26 May 2020
Benefits of a data catalog and why you need one
Data catalogs help across enterprises, from breaking down data silos to ensuring data privacy. A new book from Technics Publications explains the benefits of data catalogs. Continue Reading
By- Kara E. Joyce
- Technics Publications
-
Feature
08 May 2020
Top -- and bottom -- 5 Apache Kafka use cases
Apache Kafka has many applications in big data, but what enterprise use cases are the best fit for the tool? Experts discuss where Kafka works best for your data. Continue Reading
-
Feature
01 May 2020
How to ensure your data lake security
Your data lake is full of sensitive information and securing that data is a top priority. These are the best practices to keep that information safe from hackers. Continue Reading
-
News
27 Apr 2020
Elastic expands data management efforts to workplace search
Elastic previewed a new search service that brings enterprise and software-as-a-service data sources together to help users find relevant data wherever it resides. Continue Reading
-
Feature
21 Apr 2020
Master data management best practices for insurers
For insurers, master data management has the potential to make operations more efficient and effective, but it requires company commitment and investment. Continue Reading
-
Feature
17 Apr 2020
Common data lake challenges and how to overcome them
Managing the data contained in your enterprise data lake presents many challenges. From the amount of data to data inconsistencies, here are some solutions to common issues. Continue Reading
By- Andy Hayler, Information Difference
-
Feature
25 Mar 2020
The business benefits of enterprise data governance and MDM
Data leaders from prominent large organizations provide insights into data governance best practices and benefits, at Informatica's MDM 360 and Data Governance virtual conference. Continue Reading
-
News
20 Mar 2020
Databricks bolsters security for data analytics tool
Databricks looks to bridge the gap between on-premises controls and the cloud with data analytics security policies in the company's latest platform update. Continue Reading
-
Feature
16 Mar 2020
How to build an effective streaming data architecture
Data architecture can be tricky when it comes to real-time analytics. Clear objectives and scalability are important factors when determining the streaming data architecture you need. Continue Reading
-
Tip
28 Jan 2020
Should you host your data lake in the cloud?
On premises or in the cloud: What's the better place for your data lake? Here are some things to consider before deciding where to deploy a big data environment. Continue Reading
By- Andy Hayler, Information Difference
-
News
22 Jan 2020
New Confluent Platform release boosts event streaming quality
Based on the open-source Kafka event streaming platform, the Confluent Platform 5.4 update adds new capabilities to help meet enterprise data management requirements. Continue Reading
-
News
20 Dec 2019
Apache Kafka version 2.4 improves streaming data performance
The latest release of the Apache Kafka open source event streaming platform adds improved replication and availability capabilities to help boost overall performance. Continue Reading
-
Feature
11 Nov 2019
Managing unstructured data is crucial to enterprises' AI goals
Unstructured data makes up a huge portion of most businesses' data volume. But, with data-hungry AI systems coming online, making sense of these stores has never been more important. Continue Reading
By- Bill Marcus
-
News
18 Oct 2019
Databricks contributes Delta Lake to the Linux Foundation
Databricks has found a new home at the Linux Foundation for its open source Delta Lake data lake project, in a bid to help grow a broader community and accelerate adoption. Continue Reading
-
Tip
08 Oct 2019
7 steps to a successful data lake implementation
Flooding a Hadoop cluster with data that isn't well organized and managed can stymie analytics efforts. Take these steps to help make your data lake accessible and usable. Continue Reading
By- David Loshin, Knowledge Integrity Inc.
-
News
25 Sep 2019
Cloudera Data Platform gives big data users multi-cloud path
Cloudera released a big data platform combining its technologies and ones from Hortonworks, initially in the AWS cloud but with multi-cloud support to come. Continue Reading
By- Craig Stedman, Industry Editor
-
News
20 Sep 2019
Swim DataFabric platform helps to understand edge streaming data
Swim released its new Swim DataFabric, which integrates with Microsoft Azure to help users organize and gain insights from streaming data sources. Continue Reading
-
News
11 Sep 2019
Stibo Systems advances multidomain MDM system
The new Stibo Systems 9.2 update expands the MDM platform's features with Sisense business intelligence integration and machine learning capabilities. Continue Reading
-
News
05 Sep 2019
Cloudera Data Platform to debut, as big data fortunes waver
The interim CEO of Cloudera is cautiously optimistic about growth prospects as the big data vendor acquires Arcadia Data and plans to bolster its own cloud platform. Continue Reading
-
Feature
27 Aug 2019
10 Apache Kafka best practices for data management pros
How can data management teams most effectively deploy and use Apache Kafka in data pipelines and streaming applications? Here are some key guidelines to follow. Continue Reading
-
Definition
26 Aug 2019
What is deterministic/probabilistic data?
Deterministic and probabilistic are opposing terms that can be used to describe customer data and how it is collected. Deterministic data is also referred to as first-party data. Probabilistic data is information that is based on relational patterns and the likelihood of a certain outcome. Continue Reading
-
Feature
13 Aug 2019
Data management roles: Data architect vs. data engineer, others
Veteran data pro Michael Bowers differentiates between key data management positions, including their salaries and which ones can add the most business value. Continue Reading
-
News
08 Aug 2019
Elastic Stack 7.3 adds new features to expand data analysis
Elastic Stack, in a new update, adds data frames capabilities to enable new forms of analysis, as well as integrating more data sources that can be searched and analyzed. Continue Reading
-
News
06 Aug 2019
HPE buys MapR assets to fuel AI applications
Longtime independent big data vendor MapR goes out of business, selling technology and intellectual property to HPE. The move marks the continuing decline of the Hadoop market. Continue Reading
By- Shaun Sutner, News Director