Resources

Sample Data Processing Applications
Type: Job Flows Apache LogAnalysis using Pig   
Analyze your Apache logs using Pig and Amazon Elastic MapReduce.
Last Modified: Aug 10, 2009 5:09 PM
Type: Job Flows Processing and Loading Data from Amazon S3 to the Vertica Analytic Database   
The Amazon Elastic MapReduce (EMR) service allows users to create massively distributed data processing tasks built on Map and Reduce functions. Amazon Elastic Compute Cloud (EC2) allows users to run any software on a scale-out compute platform. EC2 can, for example, be used for large-scale data analysis by running an analytic database management system. Data analysis tasks often start with a processing phase in which unstructured or semi-structured data must be processed or transformed before it is loaded into a relational database. In this example we show how to use EMR to process a data set from S3 and load it into the Vertica Analytic Database running on EC2.
Last Modified: May 30, 2009 7:08 AM
Type: Job Flows LogAnalyzer for Amazon CloudFront   
Analyze your Amazon CloudFront Logs using Amazon Elastic MapReduce.
Last Modified: Jun 1, 2009 11:02 AM
Type: Job Flows Cascading.Multitool   
A command-line tool for processing large data sets.
Last Modified: Apr 6, 2009 2:49 PM
Type: Job Flows FreeBase   
FreebaseDataProcessor is a simple Hadoop streaming application that finds the most popular items in the given Freebase input data and loads them into Amazon SimpleDB.
Last Modified: Apr 2, 2009 1:53 PM
Type: Job Flows ItemSimilarity   
ItemSimilarity is a simple Hadoop streaming Python application that attempts to find similar items for each item in the input dataset. This example application finds similar artists using the Audioscrobbler user playlist dataset and Amazon Elastic MapReduce.
Last Modified: Apr 2, 2009 2:49 PM
Type: Job Flows CloudBurst   
CloudBurst provides highly sensitive short-read mapping with MapReduce.
Last Modified: Apr 2, 2009 1:53 PM
Type: Job Flows Word Count Example   
This example shows how to use Hadoop Streaming to count how many times each word occurs in a text collection.
Last Modified: Apr 2, 2009 1:53 PM
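
As a rough illustration of the word count technique described above, here is a minimal Python sketch of a streaming-style mapper and reducer. It is not the code shipped with the sample job flow. The function names (mapper, reducer, run_locally), the whitespace tokenization, and the lowercasing are assumptions made for this example. Hadoop Streaming passes records between stages as lines of text, conventionally a key and a value separated by a tab, and sorts by key before the reduce stage; run_locally imitates that in a single process.

```python
from itertools import groupby


def mapper(lines):
    """Emit one 'word<TAB>1' record per whitespace-separated token.

    Lowercasing is an assumption of this sketch, not part of the sample.
    """
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reducer(sorted_records):
    """Sum the counts for each word.

    The input must already be sorted by key, which is how a streaming
    reducer receives its records.
    """
    parsed = (record.rstrip("\n").split("\t", 1) for record in sorted_records)
    for word, group in groupby(parsed, key=lambda pair: pair[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


def run_locally(lines):
    """Imitate map -> sort -> reduce in one process (hypothetical helper)."""
    return list(reducer(sorted(mapper(lines))))
```

In an actual streaming job, the mapper and reducer would each live in their own script, reading lines from standard input and printing records to standard output. Calling run_locally on a small list of strings is a quick way to check the logic before submitting a job flow.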

