About This Sample
Messaging
A Rails application running on a 'Server' EC2 instance uploads to
Amazon S3 an image that the user submits through the application's
form. The Rails controller then puts a job message with details of the
uploaded object into the 'todo' queue using the ActiveMessaging Rails
plug-in. One or more EC2 'Worker' instances poll the 'todo' queue for
jobs. A worker reads a message, downloads and watermarks the image,
then uploads the new image to Amazon S3. The details of the new image
are added to the original job message, the new message is put into the
'done' SQS queue, and the original message in the 'todo' queue is
deleted.
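The 'todo' to 'done' hand-off described above can be sketched with in-memory arrays standing in for the two SQS queues. This is a minimal illustration, not the code in watermarker.rb: the message keys (bucket, key, watermarked_key) and the process_next_job helper are hypothetical, and the download, watermark and upload steps are reduced to a block.

```ruby
# In-memory stand-ins for the 'todo' and 'done' SQS queues (assumption:
# the real sample talks to Amazon SQS, not to Ruby arrays).
todo = [{ 'bucket' => 'mybucket', 'key' => 'photo.jpg' }]
done = []

# Hypothetical helper: take one job, do the work, publish the result,
# and only then remove the original job from 'todo'.
def process_next_job(todo, done)
  job = todo.first
  return nil if job.nil?

  new_key = yield(job)                              # download, watermark, upload
  result = job.merge('watermarked_key' => new_key)  # add the new image details
  done.push(result)                                 # put the message in 'done'
  todo.delete_at(0)                                 # delete the original from 'todo'
  result
end

result = process_next_job(todo, done) { |job| "watermarked-#{job['key']}" }
puts result.inspect
```

Because the original job is removed only after the result has been published, an error raised during the work leaves the job in 'todo' in this sketch.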
The Rails application retrieves messages at regular intervals
from the 'done' queue with the ActiveMessaging 'poller' daemon. An
ActiveRecord model encapsulates a message and the ActiveMessaging
plug-in saves all the SQS messages sent and received. The
Rails application controller reads messages sent and received from the
database rather than directly from the 'done' queue and sends them to
the view that displays jobs. This prevents a situation where many
users refresh their view to see their submitted and completed jobs and
every refresh makes new calls to Amazon SQS. With N users each
refreshing M times, the application would otherwise make on the order
of N x M requests to Amazon SQS.
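To make the saving concrete, here is a rough sketch that counts queue calls under both designs. The CountingQueue class and the numbers of users, refreshes and poll cycles are invented for illustration; the real application stores messages through ActiveRecord rather than in a Ruby array.

```ruby
# Hypothetical queue wrapper that counts how often it is called.
class CountingQueue
  attr_reader :calls

  def initialize(messages)
    @messages = messages
    @calls = 0
  end

  def receive_all
    @calls += 1
    @messages.dup
  end
end

users = 10
refreshes_per_user = 5

# Design 1: every page refresh reads the 'done' queue directly.
direct = CountingQueue.new(['job-1 done'])
users.times { refreshes_per_user.times { direct.receive_all } }

# Design 2: a poller reads the queue at a fixed interval and keeps the
# messages locally; page refreshes only read the local copy.
polled = CountingQueue.new(['job-1 done'])
poll_cycles = 3
cache = []
poll_cycles.times { cache = polled.receive_all }
users.times { refreshes_per_user.times { cache.first } }

puts "direct reads: #{direct.calls} queue calls"
puts "polled reads: #{polled.calls} queue calls"
```

In the second design the number of queue calls depends only on how often the poller runs, not on how many users refresh the page.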
Figure 1: Media Processing Pipeline with Amazon Web Services
It's worth noting that the format of the Amazon SQS messages created
in this application is shared with the 'boto' library from Mitch
Garnaat. The Ruby YAML library is very helpful in working with these
RFC-822 compliant messages.
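As a small sketch of why the YAML library helps, assume the message body is a set of simple "Key: value" lines. The key names and values below are hypothetical and are not taken from the sample application.

```ruby
require 'yaml'

# Hypothetical message body in header style ("Key: value" per line).
# The key names are invented for this example.
body = "Bucket: mybucket\nInputKey: photo.jpg\n"

# A block of "Key: value" lines is also a valid YAML mapping, so the
# YAML library can turn it into a Hash.
message = YAML.load(body)
puts message['InputKey']

# Add a field and write the message back out in the same line format.
message['OutputKey'] = 'watermarked-photo.jpg'
new_body = message.map { |key, value| "#{key}: #{value}\n" }.join
puts new_body
```

YAML converts values that look like numbers or booleans into those types, so values that must stay strings may need extra care.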
See Mitch Garnaat's Monster Muck Mashup - Mass Video Conversion Using AWS
Muck, Heavy Lifting
In this application the 'muck', or 'heavy lifting', is demonstrated by
watermarking images. See the 'watermarker.rb' file in the root of the
.ZIP file containing the sample application. Of course, your
application might do any other kind of work. If there were
intermediate steps between the non-watermarked and watermarked states,
more Amazon SQS queues might be used. In this sample application there
is no intermediate state, so the job goes directly from the 'todo'
queue to the 'done' queue when the work is performed.
Scaling
The amount of watermarking that can be done asynchronously, without
affecting the Rails application's performance, is increased simply by
starting more 'Worker' EC2 instances. Amazon SQS ensures that if there
is a job, one of the 'Workers' will pick it up and process it.
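The idea of several workers draining one queue can be illustrated in a single Ruby process. This is only an analogy under stated assumptions: Ruby's thread-safe Queue stands in for the 'todo' SQS queue, threads stand in for 'Worker' EC2 instances, and the job names are invented.

```ruby
# Ruby's thread-safe Queue stands in for the 'todo' SQS queue here.
# (Assumption: threads in one process replace separate EC2 'Worker'
# instances, which is only an analogy for the real deployment.)
todo = Queue.new
jobs = (1..6).map { |n| "image-#{n}.jpg" }
jobs.each { |job| todo << job }

done = Queue.new
worker_count = 3

workers = Array.new(worker_count) do |id|
  Thread.new do
    loop do
      job = begin
        todo.pop(true)   # non-blocking pop; raises ThreadError when empty
      rescue ThreadError
        break            # no more work for this worker
      end
      done << [id, "watermarked-#{job}"]
    end
  end
end
workers.each(&:join)

results = []
results << done.pop until done.empty?
puts results.map(&:last).sort.inspect
```

Raising worker_count changes how the jobs are shared out but not the set of results. Real SQS delivery differs in detail, for example a message stays in the queue until a worker deletes it.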
This code sample uses an SQLite3 database with the Rails application,
which is neither scalable nor persistent. Using Amazon SimpleDB, the
Rails application could be scaled by starting many instances of the
Rails application, all of which would use Amazon SimpleDB for
persisting the messages.
'Pulling'
This sample application code is accompanied by a public AMI. It
employs a 'pull'-like mechanism for deployment instead of 'pushing'
(which is what is done with Capistrano).
See PJ Cabrera's Using Parameterized Launches to Customize Your AMIs
This application further demonstrates using public key encryption to
protect the AWS keys that are passed to the Amazon EC2 instances at
launch time. The corresponding private key is bundled into the AMI. On
launching an instance of the AMI, the private key is deleted before
rc.local adds the keypair used to launch the instance to the SSH
authorized_keys file.
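The encrypt-then-decrypt round trip can be sketched with Ruby's OpenSSL bindings. This is an illustration under assumptions, not the code in launch.rb: it generates a throwaway key pair instead of using the application's published public key, and the credential string is a placeholder.

```ruby
require 'openssl'

# Throwaway key pair for the demo (assumption: the real sample ships a
# fixed public key, and the private key lives only inside the AMI).
private_key = OpenSSL::PKey::RSA.new(2048)
public_key  = private_key.public_key

# Placeholder credentials, not real AWS keys.
secret = 'EXAMPLEACCESSKEYexamplesecretaccesskey'

# Sender side: encrypt with the public key, then Base64 encode so the
# cipher text can travel as instance user data.
cipher_b64 = [public_key.public_encrypt(secret)].pack('m')

# Instance side: Base64 decode, then decrypt with the private key.
recovered = private_key.private_decrypt(cipher_b64.unpack1('m'))

puts recovered == secret
```

Only the holder of the private key can recover the plain text, which is why the keys can be passed as user data without being readable in transit.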
The ec2-run-instances command reads the user data it associates with
the instance from a configuration file, using the -f switch, instead
of from stdin. Using a configuration file works better from a shell
because of the length and contents of the encrypted, base 64 encoded
cipher text of the AWS keys.
Once the AWS keys are decrypted, the keys are put into the
appropriate configuration files (broker.yml, amazon_s3.yml) and either
the Rails 'script/server' and ActiveMessaging 'poller' are launched (if
the 'server' keyword is present in the instance user data), or only the
'watermarker.rb' script is run (if the 'worker' keyword is present in
the instance user data).
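The role-based start-up described above might look roughly like the following. The parse_user_data and commands_for helpers are hypothetical, and the user data layout (role keyword, bucket name, then the Base64 cipher text) is inferred from the configuration steps in this article rather than copied from launch.rb.

```ruby
# Hypothetical parser for the instance user data: line 1 is the role
# keyword, line 2 the bucket name, and the rest is the Base64 cipher
# text. The layout is inferred from the setup steps, not from launch.rb.
def parse_user_data(text)
  lines = text.lines.map(&:strip).reject(&:empty?)
  { role: lines[0], bucket: lines[1], cipher_b64: lines.drop(2).join("\n") }
end

# Map each role to the programs the article says are started for it.
def commands_for(role)
  case role
  when 'server' then ['script/server', 'script/poller run']
  when 'worker' then ['./watermarker.rb']
  else raise ArgumentError, "unknown role: #{role.inspect}"
  end
end

# Placeholder user data; the last two lines stand in for cipher text.
user_data = "worker\nmybucket\nQUJD\nREVG\n"
config = parse_user_data(user_data)
puts config[:bucket]
puts commands_for(config[:role]).inspect
```

Rejecting unknown roles early keeps a mistyped first line in the configuration file from silently starting nothing.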
See the 'launch.rb' file in the root of the accompanying code sample.
Prerequisites
- You are signed up and active for Amazon S3, SQS, EC2
- You can run the EC2 Command Line Tools (try running 'ec2-describe-instances')
- You have followed the Amazon EC2 getting started guide (you will have a key called gsg-keypair; use this or your preferred keypair where you see <mykeypair> in the instructions below)
- You have an empty bucket in Amazon S3 (create a new bucket if you need to)
If you prefer, or wish to later download the code and run it locally,
see the section below titled "Running the Sample Code Locally".
Otherwise this code sample is meant to be run inside the cloud with
its accompanying public AMI. You do not need to download the code.
Continue with the section immediately below titled "Running the
Sample".
Running the Sample
Create a configuration file for 'Servers'.
- Create a file called server.cfg. Edit and save the file with the below two lines in it (replace <mybucket> with the name of an empty bucket you own):
  server
  <mybucket>
- Download the application's public key that will be used to encrypt your AWS access and secret access keys (or copy the URL and download it from your browser):
  curl -O https://s3.amazonaws.com/aws-pipeline/aws-pipeline_public.pem
- Encrypt a copy of your AWS keys with the aws-pipeline application's public key, base 64 encode it and append it to your server.cfg
  (Substitute your AWS keys where you see <awsaccesskey> and <awssecretaccesskey>)
  (The aws-pipeline EC2 AMI contains a corresponding private key to decrypt your AWS keys)
  (You are trusting the owner of the AMI)
  (Only you can SSH into the instances of this AMI that you will launch)
  echo "<awsaccesskey><awssecretaccesskey>" | \
    openssl rsautl -encrypt -inkey aws-pipeline_public.pem -pubin | \
    openssl base64 >> server.cfg
Create a configuration file for 'Workers'.
- Create a copy of server.cfg called worker.cfg:
  cp server.cfg worker.cfg
- Edit worker.cfg and change the word 'server' on the first line to 'worker'. Save the file.
Launch and connect to Amazon EC2 instances.
- Create a security group for 'Servers' and allow traffic to port 80:
  ec2-add-group aws-pipeline -d "AWS Pipeline Instances"
  ec2-authorize aws-pipeline -P tcp -p 80
- Launch two 'Workers':
  ec2-run-instances ami-a128cdc8 -g aws-pipeline -k <mykeypair> -f worker.cfg -n 2
- Launch a single 'Server':
  ec2-run-instances ami-a128cdc8 -g aws-pipeline -k <mykeypair> -f server.cfg
  Example Output:
  INSTANCE i-5edf2f37 ami-a128cdc8 pending mykeypair 0 m1.small 2008-01-11T19:28:45+0000
- Wait 30 seconds and get the details of the fully booted-up instance using the instance ID from the output above (e.g. 'i-5edf2f37'):
  ec2-describe-instances i-5edf2f37
  Example Output:
  RESERVATION r-d40ce7bd 319268305561 default
  INSTANCE i-5edf2f37 ami-a128cdc8 ec2-72-44-56-6.z-1.compute-1.amazonaws.com domU-12-31-38-00-39-F2.compute-1.internal running mykeypair 0 m1.small 2008-01-11T19:28:45+0000
- Copy the public DNS name (ends with '.amazonaws.com') for the instance (e.g. 'ec2-72-44-56-6.z-1.compute-1.amazonaws.com'). Open it in your preferred browser.
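If you would rather script this step than copy the name by hand, a small sketch like the following pulls the public DNS name out of the ec2-describe-instances output. The public_dns helper is hypothetical, and the sample text is modelled on the example output in this article.

```ruby
# Sample text modelled on the example output in this article.
output = <<~TEXT
  RESERVATION r-d40ce7bd 319268305561 default
  INSTANCE i-5edf2f37 ami-a128cdc8 ec2-72-44-56-6.z-1.compute-1.amazonaws.com domU-12-31-38-00-39-F2.compute-1.internal running mykeypair 0 m1.small 2008-01-11T19:28:45+0000
TEXT

# Hypothetical helper: return the first field on an INSTANCE line that
# ends with '.amazonaws.com', or nil if the instance has no public
# DNS name yet (for example, while it is still pending).
def public_dns(describe_output)
  describe_output.each_line do |line|
    fields = line.split
    next unless fields.first == 'INSTANCE'

    name = fields.find { |field| field.end_with?('.amazonaws.com') }
    return name if name
  end
  nil
end

puts "http://#{public_dns(output)}/"
```

A nil result is a hint to wait a little longer and run ec2-describe-instances again.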
[Screenshots: "Upload a JPEG to watermark" and "Refresh for Completed Job"]
Running the Sample Code Locally
You will need the Ruby gems below. (Note that the RMagick gem requires
ImageMagick to be installed in your development environment.)
- rails (2.0.2)
- right_aws (1.7.1)
- aws-s3 (edge)
- daemons (edge)
- RMagick (edge)
- Download the code sample to a directory of your choice.
- Enter your AWS Access Key and Secret Access Key into the development section of both config/broker.yml and config/amazon_s3.yml.
- Run the following in the directory to which you downloaded the .ZIP file:
  cd aws-pipeline
  script/server &
  script/poller run &
  ./watermarker.rb &
- Open this link in your browser: http://localhost:3000/
Related Articles
Introduction to AWS for Ruby Developers
Monster Muck Mashup - Mass Video Conversion Using AWS
Using Parameterized Launches to Customize Your AMIs
Introduction to ActiveMessaging for Rails
Changelog
+ April 30 2008
- updated for SQS 2.0.
- upgraded to Rails 2.0.2.
- watermarker.rb uses right_aws instead of sqs gem for SQS 2.0 support.
- updated to latest versions of attachment_fu and activemessaging plugins.
Discussion
Please use this forum thread for submitting reviews, bugs, or discussion of this sample app:
[ANN] AWS Processing Pipeline with Ruby, Rails and ActiveMessaging