Why do you need a data landing area when working with Big Data? These staging areas hold unstructured, semi-structured, and unmodeled data that can be useful for data management and analytics. In Big Data projects, a segregated landing area can support both production and development, and it fills several critical roles in the enterprise.
So what exactly is a landing area? Good question. There is no consensus yet.
Itamar Ankorion, vice president of business development and corporate strategy at Attunity, writes:
“There is a lot of talk in the private sector about the big data ‘landing area,’ though definitions offered by technology evangelists and corporate executives are often divergent or vague. In some cases, these areas are where information is aggregated, cleaned and analyzed. In other situations, these regions consist of where the assets were originally conceived. Decision-makers need to define this concept, and many are choosing to consider the data warehousing environment to be where all the magic happens.”
IBM defines it as an area where data is segregated from other sources in the network. In the landing area, data can be initially integrated and aggregated. Users may also be able to explore the data, apply visualization tools, and conduct other types of initial analysis before the data moves into the system where more rigorous analytics can be applied.
“The role of a big data landing area is deliberately vague. It’s clearly not the production front-end access and sandboxing layer where you run your fast queries, do your interactive exploration, and build and score your predictive models. It’s clearly not the production hub layer, where you store your core system-of-reference data, manage metadata, and enforce data governance standards.”
However you wish to define it, a Big Data landing area allows database professionals to accomplish a variety of essential tasks prior to applying analytics or other tools designed to extract meaningful business intelligence.
A landing area can be used to acquire and collect data from any number of sources before it is sent downstream. It can also be used for numerous pre-analysis tasks, such as data aggregation, cleaning, or merging.
Then the data can be sent out to the enterprise’s front end. The landing area can also be a waypoint for data destined for archiving.
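The flow described above (collect from several sources, clean, merge, aggregate, then hand off downstream or to an archive) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source names, the `customer` and `amount` fields, and the helper functions `land`, `clean`, and `aggregate` are all hypothetical and do not come from any particular product.

```python
from collections import defaultdict


def land(sources):
    """Collect raw records from every source into one staging list (merge step)."""
    landed = []
    for name, records in sources.items():
        for record in records:
            # Tag each record with the (hypothetical) source it came from.
            landed.append({"source": name, **record})
    return landed


def clean(records):
    """Drop records without a customer id and normalize the remaining fields."""
    cleaned = []
    for r in records:
        customer = (r.get("customer") or "").strip().lower()
        if not customer:
            continue  # unusable record: no customer id
        cleaned.append({**r, "customer": customer, "amount": float(r.get("amount", 0))})
    return cleaned


def aggregate(records):
    """Sum amounts per customer, ready to send to the front end."""
    totals = defaultdict(float)
    for r in records:
        totals[r["customer"]] += r["amount"]
    return dict(totals)


# Hypothetical sample data standing in for two upstream systems.
sources = {
    "web": [{"customer": " Alice ", "amount": "10.5"}, {"customer": "", "amount": "3"}],
    "store": [{"customer": "alice", "amount": 4.5}, {"customer": "Bob", "amount": 2}],
}

landed = land(sources)        # raw, unmodeled data as it arrived
cleaned = clean(landed)       # pre-analysis cleanup
totals = aggregate(cleaned)   # summary for downstream consumers
archive = list(landed)        # raw copy kept as the archiving waypoint
```

Keeping the raw `landed` records separate from the cleaned and aggregated results mirrors the idea of a segregated area: downstream systems see only prepared data, while the untouched copy remains available for archiving or reprocessing.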
What is clear is that establishing a landing area is essential. It provides Big Data projects with a foundation for success.