Resources

Articles & Tutorials
Type: Articles & Tutorials Video Tutorial: Getting started with Hive on Amazon Elastic MapReduce   
This video provides an introduction to using Apache Hive to operate a data warehouse with Amazon Elastic MapReduce. It takes you through the development of a Hive script using an interactive job flow and shows you how to deploy this script in Amazon S3 and how to run job flows that execute the script in batch mode.
Last Modified: Oct 1, 2009 8:23 PM
Type: Articles & Tutorials Running Hive on Amazon Elastic MapReduce
This page lists documentation resources specific to using Hive on Amazon Elastic MapReduce.
Last Modified: Oct 1, 2009 5:40 PM
Type: Articles & Tutorials Contextual Advertising using Hive and Amazon Elastic MapReduce   
This article is a tutorial showing how to get started with Apache Hive and Amazon Elastic MapReduce. It follows the story of an imaginary internet advertising company that operates a data warehouse using Hive and Amazon Elastic MapReduce. The company runs machines in Amazon EC2 that serve advertising impressions and redirect clicks to the sites being advertised. These machines record each impression and click in log files that are pushed to Amazon S3. An Amazon Elastic MapReduce job flow combines the logs into a table that is stored in Amazon S3 and that an analyst subsequently uses to evaluate an algorithm for contextual advertising.
Last Modified: Oct 1, 2009 5:43 PM
Type: Articles & Tutorials Additional Features of Hive in Amazon Elastic MapReduce   
This article describes the Hive extensions that make Hive work more easily with Amazon Elastic MapReduce.
Last Modified: Oct 1, 2009 5:38 PM
Type: Articles & Tutorials Operating a Data Warehouse with Hive, Amazon Elastic MapReduce and Amazon SimpleDB   
This article shows how to use Amazon Elastic MapReduce and Hive to process logs uploaded to Amazon S3 from a fleet of machines serving online advertising. The logs are processed and the resulting information is stored in a collection of relational tables that are persisted in Amazon S3 and can be queried using Hive. Summaries of the data are pushed to Amazon SimpleDB, where they are accessible to monitoring tools.
Last Modified: Oct 1, 2009 5:37 PM
Type: Articles & Tutorials Video: Getting Started with Apache Pig on Elastic MapReduce   
This video walks you through using the AWS Console to start an interactive job flow for developing a simple log parsing application using Apache Pig, and then uploading the finished application to S3, ready to be run through the Console on a regular basis.
Last Modified: Aug 11, 2009 10:26 AM
Type: Articles & Tutorials Parsing Logs with Apache Pig and Elastic MapReduce   
This tutorial shows you how to develop a simple log parsing application using Pig and Amazon Elastic MapReduce. The tutorial walks you through using Pig interactively (via SSH) on a subset of your data, which enables you to prototype your script quickly. It then takes you through uploading the script to Amazon S3 and running it on a larger set of input data.
Last Modified: Aug 10, 2009 5:22 PM
Type: Articles & Tutorials Introduction to Amazon Elastic MapReduce   
A step-by-step walkthrough of Amazon Elastic MapReduce.
Last Modified: Apr 7, 2009 4:30 PM
Type: Articles & Tutorials Finding Similar Items with Amazon Elastic MapReduce, Python, and Hadoop Streaming   
Data Wrangling blogger and AWS developer Peter Skomoroch gives us an introduction to Amazon Elastic MapReduce. Peter Skomoroch is a consultant at Data Wrangling in Arlington, VA where he mines large datasets to solve problems in search, finance, and recommendation systems.
Last Modified: Apr 7, 2009 6:05 PM
