| |
|
|
This page lists documentation resources specific to using Hive on Amazon Elastic MapReduce.
|
| |
|
|
This video walks you through using the AWS Console to start an interactive job flow, develop a simple log-parsing application with Apache Pig, and upload the finished application to S3 so that it can be run through the Console on a regular basis.
|
| |
|
|
The Amazon Elastic MapReduce (EMR) service allows users to create massively distributed data processing tasks built on Map and Reduce functions. Amazon Elastic Compute Cloud (EC2) allows users to run any software on a scale-out compute platform. EC2 can, for example, be used for large-scale data analysis by running an analytic database management system. Data analysis tasks often start with a processing phase in which unstructured or semi-structured data must be processed or transformed before it is loaded into a relational database. In this example we show how to use EMR to process a data set from S3 and load it into the Vertica Analytic Database running on EC2.
|