Big data application development: An introduction to Hadoop

Getting into the big data market? Hadoop offers a platform for dealing with very large data sets, and the technology's vendors offer training and support for channel partners.

The numbers look promising: IDC forecasts that worldwide revenues for big data and business analytics will reach $150.8 billion in 2017, an increase of 12.4% over 2016.

However, big data application development is challenging. With the growth of mobile, social media, and the internet of things, the volume of data that enterprises collect has been increasing.

"Traditional database management systems (DBMSs) do not easily scale to support very large data sets," noted Geneva Lake, vice president of worldwide alliances at MapR Technologies Inc. Also, old-school systems do not work well with unstructured information such as video.

A new generation of data management technology emerged to fill the gaps. Hadoop traces its roots to the Google File System, a design first discussed publicly in the fall of 2003. By early 2006, the work it inspired had evolved into an open source project, and development was turned over to the Apache Software Foundation.

Hadoop is an open source framework for storing and processing large data sets using the MapReduce programming model. The software runs on clusters of commodity hardware. Leading Hadoop distributions come from vendors such as Cloudera Inc., Hortonworks Inc. and MapR Technologies, all of which run partner programs for channel companies.
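For readers new to the MapReduce programming model, the sketch below walks through its three stages (map, shuffle and reduce) on a word-count example. It is a single-process illustration in plain Python, not Hadoop code: the function names `map_phase`, `shuffle_phase`, `reduce_phase` and `word_count` are hypothetical, and a real Hadoop job would spread the same steps across many machines in a cluster.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key before they are reduced."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine every value for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run the three stages in sequence on an in-memory list of strings."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data big clusters", "data on commodity hardware"]
    print(word_count(sample))
```

Because each map call looks at one record and each reduce call looks at one key, the work can in principle be divided among many inexpensive machines, which is what makes the model a fit for commodity clusters.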

Big data application development

The complexity of big data application development and deployment is opening doors for channel partners that can help make their customers' projects successful. The task requires more than finding powerful hardware and software. Companies want to mine their information -- use it competitively after they collect it. This area presents opportunities and challenges to resellers and their customers.

With big data, application development and deployment demand a lot of custom integration work. An organization needs to pull information from a variety of sources. With the number of sources and formats rising, such work has become complex and time-consuming.

Then, the organization must present the data to employees in a manner that lets them slice and dice it. Unlike traditional systems that focus on automating routine tasks and reducing headcount, big data provides more nebulous business insights, such as how to improve sales.

"Businesses collect a lot of information but have trouble tapping into it in a way that provides them with ways to improve the organization," said John Bender, senior vice president and general manager at RCG Global Services, an IT consulting firm with more than 1,000 employees and a partnership with MapR Technologies.

Sometimes, a business goes down a few unproductive paths. Rather than a grandiose initial application, "organizations should start small with a system where they already have data and an understanding of a business problem," stated Paul Bachteal, senior director of global sales support at SAS Institute Inc., which has been working with the Hortonworks Hadoop distribution since 2013.

A sampling of Hadoop distribution vendor channel programs

Cloudera's partner ecosystem program offers a Certified Technology Program and a Cloudera Accelerator program.

Hortonworks' Partnerworks channel initiative includes programs for independent software and hardware vendors, resellers, consultants and managed services providers.

MapR Technologies' Converge Partners Program spans consulting, software, platform OEM and distribution partners, offering go-to-market efforts and development support. 

Reseller programs take shape

As the market has matured, the leading suppliers have built up their reseller programs to aid companies exploiting big data technology. The MapR Converge Partners program was launched in June 2016 and includes three levels of partners. An Affiliate designation enables resellers to begin working with MapR. In the Preferred level, resellers have regional practices and mature value-added offerings. The Elite grouping features resellers that have a significant presence in multiple geographies; these channel partners are assigned an account manager.

In July 2017, MapR added the Elite Premier partner category to its top tier. This program level includes executive sponsorships for joint business alignments; designated alliance managers; direct engagement with the MapR professional services team; access to customized partner training; and investments in joint marketing and sales programs.

The Hortonworks Partnerworks program, meanwhile, focuses on two reseller relationships. The Managed Service Provider certification provides a support-focused approach for companies delivering managed services that include Hortonworks products. The Modern Data Solutions level is designed for partners that have certified integrations with Hortonworks solutions.

At Cloudera, the company's channel program features a half-dozen partner types: technology providers (independent software vendors and independent hardware vendors), service providers, authorized resellers, OEMs, cloud partners, and training providers. The vendor has focused on providing training for IT professionals and developed nine courses based on various Hadoop skills. Since 2009, 40,000 individuals have completed Cloudera's certifications in areas such as Designing and Building Big Data Applications and Cloudera Search Training.
