Essential Guide

Editor's note

The rise of public cloud has legitimized the data analytics market -- making big data a bigger deal than ever before. Companies that have been collecting terabytes of data for years can now use public cloud as a cost-effective approach to mine and analyze that data. And successful big data analytics strategies often mean a competitive advantage for companies.

Amazon Web Services (AWS) offers a variety of data analytics tools. AWS customers can do everything from processing data in real time to implementing machine learning for applications. Currently, there are five primary AWS products for cloud-based analytics: Elastic MapReduce (EMR), Kinesis, Redshift, Data Pipeline and Machine Learning.

Third-party tools also exist to diversify and expand on the AWS analytics portfolio. While each service supports big data in its own way, it's key for administrators to understand each offering to ensure proper data integration.

1. Process big data, and then visualize it

Once you're ready to mine and process data from your databases, there is no shortage of tools to help with that task. In some situations, enterprises need instantaneous information -- such as monetary transactions, social media responses and clickstreams. Amazon Kinesis allows users to build a dashboard or application that monitors information as soon as it arrives from the data stream. Kinesis dashboards are one method for visualizing big data, but they might not suit the needs of every business. Third-party options like Tableau offer connectivity to EMR and other AWS products.

Viewing past data and using it to generate predictive algorithms is another challenge, and creating mathematical algorithms to interpret future data can be a tough and time-consuming task. Amazon Machine Learning provides visualization tools and helps create models to react to real-time data.
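
To make the stream-monitoring idea behind a Kinesis-style dashboard more concrete, here is a minimal sketch in Python. It does not call AWS at all: the `records` list is hypothetical sample clickstream data, and `SlidingWindowCounter` is an illustrative name, not part of any AWS SDK. In a real deployment, the records would presumably be read from a Kinesis stream instead. The sketch assumes timestamps arrive in non-decreasing order.

```python
from collections import Counter, deque


class SlidingWindowCounter:
    """Counts events per key over the most recent `window_seconds`.

    Illustrative helper only; assumes timestamps never go backward.
    """

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = deque()    # (timestamp, key) pairs in arrival order
        self.counts = Counter()  # live per-key counts inside the window

    def add(self, timestamp, key):
        # Record the new event, then evict anything that is now too old.
        self.events.append((timestamp, key))
        self.counts[key] += 1
        self._expire(timestamp)

    def _expire(self, now):
        # Drop events that are at least `window_seconds` older than `now`.
        while self.events and self.events[0][0] <= now - self.window_seconds:
            _, old_key = self.events.popleft()
            self.counts[old_key] -= 1
            if self.counts[old_key] == 0:
                del self.counts[old_key]

    def snapshot(self):
        # The values a dashboard would display at this moment.
        return dict(self.counts)


# Hypothetical clickstream records: (seconds since start, page visited).
records = [(0, "/home"), (1, "/cart"), (2, "/home"), (61, "/checkout")]

counter = SlidingWindowCounter(window_seconds=60)
for ts, page in records:
    counter.add(ts, page)

print(counter.snapshot())
```

Each call to `add` updates the counts immediately, which is the property that matters for monitoring data as it comes in; a dashboard would simply redraw from `snapshot()` on whatever refresh interval suits the business.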