
Apache Hadoop: What’s the Big Deal?

Eric Russo | October 11, 2013


There’s a lot of buzz surrounding Apache Hadoop, an open-source software project for the distributed processing of Big Data. Hadoop has become nearly synonymous with Big Data: the global market for Hadoop-MapReduce software is growing at a compound annual rate of 60.2%, with sales expected to increase from $77 million in 2011 to $812.8 million by 2016, according to IDC analysts. Vendors including IBM and Oracle offer support for tools and services in the Hadoop ecosystem.
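As a quick sanity check on those figures, the growth rate can be recomputed from the two sales numbers. This is a minimal sketch, assuming five compounding years between 2011 and 2016; the function name `cagr` is illustrative and not part of any IDC methodology.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values over `years` periods."""
    return (end_value / start_value) ** (1 / years) - 1


# Sales figures quoted above, in millions of US dollars.
sales_2011 = 77.0
sales_2016 = 812.8

# Assumption: growth compounds once per year from 2011 to 2016.
rate = cagr(sales_2011, sales_2016, 2016 - 2011)
print(f"Implied CAGR: {rate:.1%}")
```

Under that assumption the implied rate comes out at roughly 60%, which agrees with the figure attributed to IDC.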

Hadoop’s best features and potential applications

Proponents point to Hadoop’s scalability, cost-effectiveness, flexibility, and fault tolerance. Users can add nodes to a Hadoop cluster without making significant changes, such as altering the data format or the applications running atop it.

Hadoop can be run on commodity servers, making it an affordable solution for smaller enterprises. Because Hadoop does not rely on any specific data type, it can work with both structured and unstructured data from multiple sources. Users can join and aggregate the data in various ways.

Hadoop is also fault-tolerant and automatically redirects work so processing can continue if a node is lost. As IBM explains:

“It is designed to scale up from a single server to thousands of machines, with a very high degree of fault tolerance. Rather than relying on high-end hardware, the resiliency of these clusters comes from the software’s ability to detect and handle failures at the application layer.”

Although Hadoop is an analytics tool, most enterprise users are reportedly deploying it for storage and Extract, Transform, Load (ETL) tasks rather than for analytics.

As Cade Metz at Wired observes:

“Hadoop reinvented data analysis not only at Facebook and Yahoo but so many other web services. And then an army of commercial software vendors started selling the thing to the rest of the world. Soon, even the likes of Oracle and Greenplum were hawking Hadoop. These companies still treated Hadoop as an adjunct to the traditional database — as a tool suited only to certain types of data analysis. But now, that’s changing too.”

It is changing, but slowly, according to Matt Asay, vice president of corporate strategy at 10gen, which created MongoDB. He says:

“We’re still early in Hadoop’s technological and market evolution, in part due to the complexity of the technology, with 26% of even the most sophisticated Hadoop users citing how long it takes to get into production as a gating factor to its widespread use. Gartner reveals even lower rates of adoption of Big Data projects, often involving Hadoop, at a mere 6%, as enterprises try to grapple with both appropriate use cases and understanding the relevant technology. […] We’re still getting comfortable with Hadoop.”

Image by Intel Free Press.
