Untangling YARN – What Is It?
Author: Eric Russo | June 2, 2014
Apache Hadoop has released version 2.2.0, which now includes Apache YARN. YARN is widely regarded as one of the most significant changes in this release, but what is it?
YARN, or MapReduce 2.0, opens up Hadoop beyond MapReduce. Because it now separates resource management from the processing components of Hadoop, YARN enables users to interact in more varied and useful ways with their data.
YARN provides cluster resource management and allows applications and services to run natively in Hadoop. In the application stack, for example, YARN sits atop the Hadoop distributed file system, as do Tez — the execution engine for interactive SQL queries — Storm, Giraph, and HBase.
MapReduce previously sent jobs one-by-one to the Hadoop distributed file system (HDFS). Then, it extracted useful information from the data. Now, multiple search tools can be used simultaneously to search data within the HDFS storage system. Multiple applications can be run in Hadoop with YARN.
YARN also splits the two primary responsibilities of the MapReduce JobTracker component, resource management and job scheduling/monitoring, into separate daemons. This lets users manage cluster resources within Hadoop better than they could previously.
Another way to think of it is that YARN packages the resource management capabilities that were in MapReduce such that new engines can use them.
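To make that separation concrete, here is a minimal toy sketch in Python. It is not the Hadoop or YARN API: the ResourceManager and Application classes, the idea of "slots," and the task names are all hypothetical, chosen only to illustrate one shared component handing out cluster capacity while each engine decides for itself what to run in the capacity it receives.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceManager:
    """Toy stand-in for the cluster-wide resource management role (hypothetical)."""
    total_slots: int
    allocations: dict = field(default_factory=dict)

    def free_slots(self) -> int:
        # Capacity not yet handed out to any application.
        return self.total_slots - sum(self.allocations.values())

    def request(self, app_name: str, slots: int) -> int:
        # Grant as many of the requested slots as are currently free.
        granted = min(slots, self.free_slots())
        self.allocations[app_name] = self.allocations.get(app_name, 0) + granted
        return granted

    def release(self, app_name: str) -> None:
        # Return all of an application's slots to the shared pool.
        self.allocations.pop(app_name, None)


@dataclass
class Application:
    """Toy stand-in for an engine that schedules and monitors its own tasks."""
    name: str
    tasks: list

    def run(self, rm: ResourceManager) -> list:
        # Ask the shared manager for capacity, then decide locally what to run.
        granted = rm.request(self.name, len(self.tasks))
        return [f"{self.name}:{task}" for task in self.tasks[:granted]]


# Two different engines share one cluster of four slots.
rm = ResourceManager(total_slots=4)
batch = Application("mapreduce", ["map-0", "map-1", "reduce-0"])
stream = Application("storm", ["bolt-0", "bolt-1"])

batch_done = batch.run(rm)    # asks for 3 slots, all 3 are free
stream_done = stream.run(rm)  # asks for 2 slots, only 1 is left
print(batch_done, stream_done, rm.free_slots())
```

In this sketch neither engine knows about the other; both only talk to the shared manager. That is the sense in which resource management has been packaged so that new engines can reuse it.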
Rohit Bakhshi, product manager at Hortonworks, told InfoQ:
YARN is but one part of the larger Hadoop ecosystem. InfoWorld explains:
Several organizations are now building applications on YARN, according to Hortonworks.
Bakhshi added:
This latest iteration of Hadoop was in development for about four years. Organizations reportedly using Hadoop include Amazon Web Services, AOL, Apple, eBay, Facebook, Netflix, and Hewlett-Packard.