That's a huge difference! By packaging and compressing our files, we've reduced our total cost by $0.82, or 30%. Let's get started making this happen. We list much of the code in the following section; however, you can download the entire application and follow along if you prefer.

Pack and Store

If you read our first article on using Amazon S3 with Ruby on Rails, you will be familiar with some of the code that we use here. As in the previous article, we are going to need the following:
Let's start the coding!

Note: Although I use a Mac for development, the following commands are the same regardless of your operating system. You just need to have the tar utility installed on your computer and be able to run the "tar" command in your command-line tool. The following code was created using Ruby 1.8.6 and Ruby on Rails 1.2.3.

First, let's install all the gems we will need. Open a command-line tool and type the following:

$ sudo gem install aws-s3 archive-tarsimple

Now that we have all of the gems we need, we create a new Rails application. Go to the directory where you keep all of your Rails applications and type the following:

$ rails economical_s3

To be able to use the archive-tarsimple Ruby gem, we have to configure our application controller to provide access to all of its resources. Open app/controllers/application.rb and make it look like the following:

require "archive/tarsimple"
class ApplicationController < ActionController::Base
  include Archive
  session :session_key => '_economical_s3_session_id'
end

That was easy. Now, let's go into the economical_s3 directory and install the attachment_fu plug-in. We will use attachment_fu for the main product file upload, but for Amazon S3, we are going to use the Amazon S3 gem directly.

$ cd economical_s3

To use Amazon S3, we need to have an Amazon Web Services account, and then set up a bucket to hold our files. If you haven't already created your free Amazon Web Services account, do that now. Once you have an AWS account, let's set up a bucket. Go back to your command-line tool and type the following:

$ s3sh

The s3sh tool is like irb for the Amazon S3 gem. We will use this tool to connect to Amazon Web Services and create our bucket. First, we need to connect to Amazon S3 using our AWS credentials.

>> AWS::S3::Base.establish_connection!(

Now that we are connected, let's create our development bucket. Keep in mind that every bucket name must be unique, so name yours something fun.
>> Bucket.create('economicals3_development')
Now we have our bucket. Type exit and press ENTER. Open your favorite text editor, and then open config/amazon_s3.yml. Enter your bucket_name, access_key_id, and secret_access_key where required. You can find the access key values in your Amazon Web Services portal. When you have the keys entered and a bucket created, we are ready to code our application.

Let's take stock of where we are now. We installed our RubyGems, created a Rails application, installed our plug-ins, configured our application for Amazon S3, connected to Amazon S3, and created our bucket.

In the previous tutorial we used the attachment_fu plug-in to upload our attachments directly to Amazon S3. Today we will use it to temporarily store our product files in the file system. Every time a new file is uploaded, we check to see if we have 10 files that we can package together. When we hit 10, we create a package, ship it to Amazon S3, and remove the files from the file system. Believe it or not, this is a lot simpler (thanks to Ruby and Rails) than you might think. Let's get to it!

As with any Rails application, we need to create our databases and configure our connection. Let's keep it simple and name the database economical_s3_development. Then, open config/database.yml and type your server credentials. After we've done this, we create our scaffold files for our file management. Run this little command and observe all the files that Rails happily generates for us:

$ script/generate scaffold_resource ProductFile description:text size:integer content_type:string filename:string package_id:integer uploaded_on:datetime created_at:datetime updated_at:datetime

Next we need a place to keep track of our packages. These packages will contain our product files. After we generate the scaffold code for the packages, we will take a look at the relationships between product files and packages.
$ script/generate scaffold_resource Package filename:string created_at:datetime updated_at:datetime

Note that we are using the scaffold_resource generator instead of the scaffold generator. This creates our migrations and all of our forms with the database columns we want, all ready to go. One command does it all for us. Let's make sure that is true with a quick check of our migration files. Looks good. Let's run them:

$ rake db:migrate

So far, so good. Now that we have code and database tables, let's configure our ProductFile and Package models accordingly. First, we need to set up our relationships.

class ProductFile < ActiveRecord::Base
  belongs_to :package

Next we configure the ProductFile model to use attachment_fu. We don't need to do anything with the Package model.

Note: The following code limits the size of the files to between 1 KB and 500 MB. You can easily change this. Please see the README file that comes with the attachment_fu plug-in for all the details.

class ProductFile < ActiveRecord::Base
  belongs_to :package

The last step before delving into the controllers and making the magic happen is to update the "new" view of our product files. Open views/product_files/new.rhtml, remove all of the form fields except for description, add one more field for uploaded_data, and change the form_for tag.
<h1>New product_file</h1>
<%= error_messages_for :product_file %>
<% form_for(:product_file, :url => product_files_path, :html => { :multipart => true }) do |f| %>
<p>
<b>Description</b><br />
<%= f.text_area :description %>
</p>
<p>
<b>File</b><br />
<%= f.file_field :uploaded_data %>
</p>
<p>
<%= submit_tag "Create" %> or <%= link_to 'Back', product_files_path %>
</p>
<% end %>
Just for fun, let's run script/server at the root of our Rails application and see if we can upload a file. Done? Great.

Our next step is to add the functionality to keep track of the number of product files we have, and when that number hits 10, we create a package and send it to Amazon S3. The first thing we need is a place to store our number. We will keep it simple and store that number in our database. Create the migration, create the model, and then add the counter to the table.

$ script/generate migration CreateProductFileTrackingTable

Open the migration we just created and edit as follows:
class CreateProductFileTrackingTable < ActiveRecord::Migration
def self.up
create_table :product_file_counts do |t|
t.column :file_count, :integer, :default => 0, :null => false
t.column :last_uploaded_file, :integer, :default => 0, :null => false
t.column :updated_at, :datetime
end
ProductFileCount.create(:file_count => 0)
end
def self.down
drop_table :product_file_counts
end
end
This will generate our table and, because we already created our ProductFileCount model, it will create a blank record to work with in our product_file_counts table. Delete the migration that was generated when we created our ProductFileCount model (004), and run the new migration.

$ rake db:migrate

What we need to do now is add a method to our product_files controller to increment our counter every time we upload a file. We will use this same method to check the value in the database, and if we are at 10, it will create a package.

This code is the heart of our savings plan. It is heavily commented, and it generates log statements all over the place so we can see what is going on. After we define our method, we load the counter record and use it to increment the file_count column in the product_file_counts table.

Note: All of the following code is a single method in product_files_controller.rb.
def increment_and_package
@incrementer = ProductFileCount.find(:first)
@incrementer.file_count += 1
@incrementer.save
When that is done, we run our check to see if we have hit 10 files. If we have, we go to the database to get the last 10 files that have been uploaded. It is these 10 files that we will pack and ship.
if @incrementer.file_count == 10
# Check the last_uploaded_file value, if it isn't 0, we can pack and ship
if @incrementer.last_uploaded_file != 0
# Grab the last 10 files
logger.info "*** We are grabbing the last 10 files ***"
@files_to_pack = ProductFile.find(:all, :conditions => ['id > ?', @incrementer.last_uploaded_file], :order => 'id', :limit => 10)
else
# No package has been shipped yet, so take the first 10 files
logger.info "*** We are grabbing the first 10 files ***"
@files_to_pack = ProductFile.find(:all, :conditions => ["id <= ?", 10], :order => 'id')
end
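The selection step can be tried outside Rails. The following is a minimal sketch in plain Ruby, assuming the product file ids are available as a simple array; `next_batch` and `BATCH_SIZE` are hypothetical names that do not appear in the application. It sorts by id explicitly, because the later bookkeeping relies on the last file in the batch having the highest id.

```ruby
# Hypothetical helper, not part of the article's application: picks the
# next batch of product file ids that have not been packaged yet.
BATCH_SIZE = 10

# ids           - every product file id currently in the table
# last_uploaded - id of the last file in the previous package (0 if none)
def next_batch(ids, last_uploaded)
  unpackaged = ids.select { |id| id > last_uploaded } # skip packaged files
  oldest_first = unpackaged.sort                      # like ORDER BY id
  oldest_first.first(BATCH_SIZE)                      # one package's worth
end

all_ids = (1..25).to_a.shuffle

first_package  = next_batch(all_ids, 0)
second_package = next_batch(all_ids, first_package.last)

puts first_package.inspect
puts second_package.inspect
```

Because every id is greater than 0, the same comparison covers both the very first package and all later ones in this sketch.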
When we have the 10 files, we create a tar file locally and add the 10 files to it. Because every file in a single bucket needs a unique name, we compute the SHA-1 (Secure Hash Algorithm 1) hash of the current time and use that for the file name.
# Create a new tar file - use the time to make it unique
enc = Digest::SHA1.hexdigest(Time.now().to_s)
tar_file = Archive::Tar.new(enc + ".tar")
# Add the files to it
logger.info "*** Time to pack it up ***"
for prod_file in @files_to_pack
tar_file.add_to_archive("public" + prod_file.public_filename)
end
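One thing to keep in mind about the naming step is that `Time.now.to_s` only has one-second resolution, so two packages created in the same second would hash to the same name. The sketch below shows one possible way around that, assuming a random suffix is acceptable; `unique_archive_name` is a hypothetical helper, not something the application defines.

```ruby
require 'digest/sha1'
require 'securerandom'

# Hypothetical helper: builds an archive name from the current time plus
# a random suffix, so two packages created within the same second
# still get different names.
def unique_archive_name(now = Time.now)
  seed = "#{now.to_f}-#{SecureRandom.hex(8)}"
  Digest::SHA1.hexdigest(seed) + '.tar'
end

name = unique_archive_name
puts name
```

The result is still a 40-character hexadecimal digest followed by `.tar`, so it can be used wherever the time-based name is used.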
Now, to save space, we not only pack up the files, we also compress the resulting tar file. We are going to use bz2 (Bzip2) compression. Note: The amount of compression varies with the file type.
# Compress the file for packaging
logger.info "*** Time to compress it ***"
tar_file.compress_archive("bzip2")
filename = tar_file.archive_name + ".bz2"
my_file = File.open(filename)
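To see the pack-then-compress idea in isolation, here is a rough sketch that uses only Ruby's standard library. It is not the archive-tarsimple API: it assumes `Gem::Package::TarWriter` (shipped with RubyGems) as a stand-in for the tar step and swaps in gzip via `Zlib` for bzip2, since the standard library has no bzip2 binding. The helper names and sample file contents are made up for illustration.

```ruby
require 'rubygems/package'
require 'stringio'
require 'zlib'

# Packs a hash of { file name => contents } into one in-memory tar archive.
def build_tar(files)
  io = StringIO.new(String.new) # binary buffer for the archive bytes
  Gem::Package::TarWriter.new(io) do |tar|
    files.each do |name, data|
      tar.add_file_simple(name, 0644, data.bytesize) { |entry| entry.write(data) }
    end
  end
  io.string
end

# Lists the entry names of a tar archive held in a string.
def tar_entry_names(tar_bytes)
  names = []
  Gem::Package::TarReader.new(StringIO.new(tar_bytes)) do |tar|
    tar.each { |entry| names << entry.full_name }
  end
  names
end

# Made-up sample data; repetitive text compresses well.
files = {
  'report-1.txt' => 'quarterly numbers ' * 200,
  'report-2.txt' => 'quarterly numbers ' * 300
}

tar_bytes  = build_tar(files)
compressed = Zlib.gzip(tar_bytes) # gzip here, bzip2 in the article

puts "tar size:        #{tar_bytes.bytesize} bytes"
puts "compressed size: #{compressed.bytesize} bytes"
puts "entries:         #{tar_entry_names(Zlib.gunzip(compressed)).inspect}"
```

How much the archive shrinks depends heavily on the input; already-compressed formats such as JPEG or ZIP will see little benefit.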
After we have our compressed file, we need to read in the Amazon S3 config file we created.
# Read the config file and get the info to connect to S3
logger.info "*** Reading the S3 config file ***"
conf_file = RAILS_ROOT + '/config/amazon_s3.yml'
s3_conf = YAML.load_file(conf_file)[ENV['RAILS_ENV']].symbolize_keys
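The lookup above implies that config/amazon_s3.yml has one section per Rails environment. The following sketch assumes that layout and shows the same lookup in plain Ruby, with an explicit check for a missing section or key. The credential values are placeholders, the production section and its bucket name are invented for the example, and `s3_settings` is a hypothetical helper.

```ruby
require 'yaml'

# Assumed layout of config/amazon_s3.yml: one section per Rails
# environment. The values below are placeholders, not real credentials.
SAMPLE_CONFIG = <<~CONFIG
  development:
    bucket_name: economicals3_development
    access_key_id: YOUR_ACCESS_KEY_ID
    secret_access_key: YOUR_SECRET_ACCESS_KEY
  production:
    bucket_name: economicals3_production
    access_key_id: YOUR_ACCESS_KEY_ID
    secret_access_key: YOUR_SECRET_ACCESS_KEY
CONFIG

REQUIRED_KEYS = [:bucket_name, :access_key_id, :secret_access_key].freeze

# Returns the settings for one environment with symbol keys, and fails
# early with a clear message when the section or a key is missing.
def s3_settings(yaml_text, environment)
  section = YAML.safe_load(yaml_text)[environment]
  raise ArgumentError, "no S3 settings for #{environment.inspect}" if section.nil?

  settings = section.each_with_object({}) { |(key, value), out| out[key.to_sym] = value }
  missing = REQUIRED_KEYS - settings.keys
  raise ArgumentError, "missing keys: #{missing.join(', ')}" unless missing.empty?

  settings
end

s3_conf = s3_settings(SAMPLE_CONFIG, 'development')
puts s3_conf[:bucket_name]
```

Failing early like this gives a clearer error than a `nil` lookup would if the environment name is not set or is misspelled.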
Now we connect to Amazon S3 and find the bucket we will be using.
# Connect to S3
logger.info "*** Connecting to S3 ***"
conn = AWS::S3::Base.establish_connection!(:access_key_id => s3_conf[:access_key_id], :secret_access_key => s3_conf[:secret_access_key])
# find our bucket
logger.info "*** Finding our bucket ***"
AWS::S3::Bucket.find(s3_conf[:bucket_name])
After we are connected and we have our bucket, upload the file.
# Upload our file
logger.info "*** Uploading our file ***"
logger.info "*** The file name is: " + filename
if AWS::S3::S3Object.store(filename, open(filename), s3_conf[:bucket_name])
logger.info "*** File upload successful ***"
If the upload was successful, we create a new package using the name of our file, and we update our counter.
# Create a new package record from the filename
new_package = Package.create(:filename => filename)
# Update product_file_counts.last_uploaded_file with the id of the last product_file in the batch
logger.info "*** Incrementer time ***"
@incrementer.last_uploaded_file = @files_to_pack.last.id
@incrementer.file_count = 0
@incrementer.save
Next we update the record of each product file we uploaded with the upload time, and remove it from the file system. Lastly we remove our compressed file.
# Update the product_files in @files_to_pack with the new package id and remove the file
logger.info "*** Updating and deleting the product files ***"
for prod_file in @files_to_pack
prod_file.package = new_package
prod_file.uploaded_on = Time.now()
prod_file.save
# Remove the product_files from the file system
File.delete(RAILS_ROOT + "/public" + prod_file.public_filename)
end
# Finally remove the local bz2 file
File.delete(filename)
end # end file upload
end # end @incrementer.file_count
end # end increment_and_package
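As written, the method only packages when the counter is exactly 10, so if an upload fails the counter appears to move on to 11 and the batch is not retried. The sketch below illustrates one possible alternative in plain Ruby: keep the pending files until the upload succeeds and try again on the next upload. `PackagingCounter` and the uploader lambda are hypothetical stand-ins for the controller logic and the Amazon S3 call, and the batch size of 3 is only to keep the example short.

```ruby
# Hypothetical stand-in for the controller logic: counts uploads and hands
# a batch to an uploader once enough files are waiting.
class PackagingCounter
  attr_reader :packages

  def initialize(batch_size, uploader)
    @batch_size = batch_size
    @uploader   = uploader # any object that responds to call(batch)
    @pending    = []
    @packages   = []
  end

  # Number of files waiting to be packaged.
  def file_count
    @pending.size
  end

  # Registers one uploaded file; returns true when a package was shipped.
  def add(file_name)
    @pending << file_name
    return false if @pending.size < @batch_size

    batch = @pending.first(@batch_size)
    return false unless @uploader.call(batch) # keep the files on failure

    @packages << batch
    @pending = @pending.drop(@batch_size)
    true
  end
end

# Simulated uploader: the first attempt fails, later ones succeed.
attempts = 0
flaky_uploader = lambda do |_batch|
  attempts += 1
  attempts > 1
end

counter = PackagingCounter.new(3, flaky_uploader)
results = %w[a.pdf b.pdf c.pdf d.pdf].map { |name| counter.add(name) }

puts results.inspect
puts counter.packages.inspect
puts counter.file_count
```

In the run above, the third upload triggers a failed attempt, and the fourth upload retries the same three oldest files, leaving one file pending.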
When we have our method in place, we update the create method to make it all happen.
@product_file = ProductFile.new(params[:product_file])
respond_to do |format|
if @product_file.save
self.increment_and_package
If we have done our job, when we hit 10 uploads, the last 10 uploaded files will be packaged and sent to Amazon S3, and the locally stored files will be removed after the package_id field is updated.

The Next Level

I am sure that you noticed a few things missing, such as the ability to retrieve the packages. Well, if we showed you how to do everything, what fun would there be for you? One more way to take this application to the next level is to separate out the Amazon S3 upload and upload the files using BackgrounDRb, fully automating the upload and keeping the UI responsive. What else do you think you could add?

Conclusions

There is no doubt that with scalable, on-demand, pay-as-you-go storage, the Amazon Web Services tool Amazon S3 is shaking up traditional hosting models. Freely available RubyGems and Rails plug-ins make Ruby on Rails the ideal platform for creating web-scale applications that easily take advantage of these services. Using the built-in capabilities of Ruby, we lower our total storage costs even further and receive a faster return on investment.

Learning More About Amazon Web Services

This article highlights a few aspects of working with Amazon Web Services. Here are a few more resources available to Ruby and Rails developers to help you learn more:

Common AWS Resources
Great Resources for Ruby and Rails Developers
Ruby and AWS Real-World Examples

Here are some web sites that use Amazon Web Services and Ruby on Rails:

References
About the Author

After eight years as an MCSE and project manager, Robert Dempsey jumped from IT management and PHP/Visual Basic.NET development to Ruby on Rails. He is the project director of Atlantic Dominion Solutions, a Ruby on Rails development firm, and has recently launched Rails For All, a not-for-profit organization dedicated to promoting the use of Ruby on Rails. In addition, Robert presents on a regular basis at the Orlando Ruby Users Group, and has begun giving talks to Java user groups on topics including JRuby and Ruby on Rails.