Thread: [ANNOUNCE] s3sync 1.0.0 using Ruby!
Replies: 243 - Pages: 17 - Last Post: Jan 13, 2009 10:18 AM by: m3networks

Posts: 177, Registered: 3/28/06
[ANNOUNCE] s3sync 1.0.0 using Ruby!
Posted: Sep 23, 2006 8:48 PM PDT
This is a Ruby program that easily transfers directories between a local
directory and an S3 bucket:prefix. It behaves somewhat, but not precisely, like
the rsync program.
One benefit over some other comparable tools is that s3sync goes out of its way
to mirror the directory structure on S3, meaning you don't *need* to use s3sync
later in order to view your files on S3. You can just as easily use an S3
shell, a web browser (if you used the --public-read option), etc. Note that
s3sync is NOT necessarily going to be able to read files you uploaded via some
other tool. This includes things uploaded with the old Perl version! For best
results, start fresh!
s3sync runs happily on Linux, probably other *ix, and also Windows (except that
symlinks and permissions management features don't do anything on Windows).
For more information, check out:
http://s3.amazonaws.com/ServEdge_pub/s3sync/README.txt
To download s3sync along with its assorted Ruby libraries:
http://s3.amazonaws.com/ServEdge_pub/s3sync/s3sync.tar.gz
Let me know what you think, how it works for you, what features are missing, etc.
I hope one day to support incremental backups to S3, but I haven't figured out a satisfactory way to do that yet (it really needs key renaming to work well).
G

Posts: 69, Registered: 3/16/06
Re: [ANNOUNCE] s3sync 1.0.0 using Ruby!
Posted: Sep 25, 2006 12:30 AM PDT, in response to: greg13070
This looks pretty good, but I haven't gotten it to work yet. I keep getting "S3 command failed: .... With result 400 Bad Request". Any idea where to look for what's going wrong? The key/secret key both appear to be right.
Also - not sure if this will show up with the Ruby implementation, but I got stuck doing a backup with the Perl implementation because it was following symlinked directories in recursion. Somewhere in my system, there was a stupid directory that contained a symlink to itself... I haven't fixed that symlink yet (I hesitate to delete symlinks in system directories that I didn't create), so, once I get the Ruby version going, I'll be looking to see if that still goes wrong.
Thanks, though! This looks really great!

Posts: 177, Registered: 3/28/06
Re: [ANNOUNCE] s3sync 1.0.0 using Ruby!
Posted: Sep 25, 2006 1:01 AM PDT, in response to: Bryan Pendleton
I didn't get that error at all during my testing. A 400 usually means a badly formatted HTTP request. Do you have the capability to trace the TCP connection with tcpdump or similar and send it to me? (Make sure you don't use --ssl, or else it will all be encrypted and impossible to analyze.)
If you do this, also make sure to send the command line you're using and any env variables you have set (except your secret key, don't ever tell anyone that!).
As for symlinks: no recent version of s3sync (Ruby, or Perl > 0.3) should have ever "followed" symbolic links. Only the original implementation did that, before I got to it. If something is chasing symlinks, then it's definitely a bug.
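
To make "not following" symbolic links concrete, here is a minimal Ruby sketch of a directory walk that records a symlink as an entry but never descends into it. It is not taken from s3sync's source; the function name `walk_without_following` and the returned symbols are made up for illustration. The key point is that `File.lstat` describes the link itself, whereas `File.stat` would describe whatever the link points to.

```ruby
# Hypothetical helper (not s3sync code): list everything under `root`
# without ever recursing through a symbolic link.
def walk_without_following(root, rel = '', out = [])
  dir = rel.empty? ? root : File.join(root, rel)
  Dir.entries(dir).sort.each do |name|
    next if name == '.' || name == '..'
    child = rel.empty? ? name : File.join(rel, name)
    st = File.lstat(File.join(root, child)) # lstat: inspect the link itself
    if st.symlink?
      out << [child, :symlink]              # record the link, do not recurse
    elsif st.directory?
      out << [child, :directory]
      walk_without_following(root, child, out)
    else
      out << [child, :file]
    end
  end
  out
end
```

With a walk like this, a directory containing a link to itself (such as `loop -> .`) yields a single `:symlink` entry rather than an endless recursion.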

Posts: 69, Registered: 3/16/06
Re: [ANNOUNCE] s3sync 1.0.0 using Ruby!
Posted: Sep 25, 2006 3:41 PM PDT, in response to: greg13070
Seems to have something to do with symlinks. I've had it happen now several times on different source files, all of which are symlinks. The example I PMed you stopped on the first file because the first file was, in fact, a symlink. Other attempts I've made get further, but eventually get stuck on some symlink along the way.

Posts: 177, Registered: 3/28/06
Re: [ANNOUNCE] s3sync 1.0.0 using Ruby!
Posted: Sep 25, 2006 9:51 PM PDT, in response to: Bryan Pendleton
I can't tell anything from the tcpdump because only the first few bytes of each packet body were captured. Please use -s 0 so the packet body is not limited.

Posts: 177, Registered: 3/28/06
Re: [ANNOUNCE] s3sync 1.0.1
Posted: Sep 28, 2006 11:10 PM PDT, in response to: greg13070
2006-09-29:
Added support for --expires and --cache-control. E.g.:
--expires="Sat, 01 Dec 2007 16:00:00 GMT"
--cache-control="no-cache"
Thanks to Charles for pointing out the need for this, and supplying a patch
proving that it would be trivial to add =) Apologies for not including the short
form (-e) for the expires. I have a rule that options taking arguments should
use the long form.
http://s3.amazonaws.com/ServEdge_pub/s3sync/s3sync.tar.gz
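
The value passed to --expires is an HTTP date, and hand-typing one is error-prone because the weekday name has to match the date. As a rough sketch, Ruby's standard `time` library can generate the string. The helper name `caching_headers` below is hypothetical and only shows how the two header values might be assembled; it says nothing about how s3sync handles them internally.

```ruby
require 'time' # adds Time#httpdate

# Hypothetical helper (not s3sync code): build the two caching headers
# from a Time object and a Cache-Control directive.
def caching_headers(expires_at, cache_control)
  {
    'Expires'       => expires_at.httpdate, # RFC 1123 date, always in GMT
    'Cache-Control' => cache_control
  }
end

headers = caching_headers(Time.utc(2007, 12, 1, 16, 0, 0), 'no-cache')
puts headers['Expires']       # prints "Sat, 01 Dec 2007 16:00:00 GMT"
puts headers['Cache-Control'] # prints "no-cache"
```

The printed `Expires` string can be pasted directly as the argument to --expires.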

Posts: 10, Registered: 3/22/06
Re: [ANNOUNCE] s3sync 1.0.1
Posted: Oct 2, 2006 1:53 PM PDT, in response to: greg13070
Thanks for this awesome tool; it, honestly, is the only Linux command-line S3 backup tool I could get to work in anything resembling a reliable fashion, and for that I thank you doubly.
One thing: in synchronizing a directory of files, s3sync.rb is consistently failing on one file that's 33 MB in size; I get the following error:
put jlevine-backup provolone/mysqlbackup/20061002/spamassassin.sql.bz2 #<S3::S3Object:0xb7e81fe4> Content-Type application/x-bzip2 Content-Length 33731296
With result 400 Bad Request
Is there a known problem with large files, or am I experiencing another kind of error and incorrectly ascribing it to the file size?

Posts: 177, Registered: 3/28/06
Re: [ANNOUNCE] s3sync 1.0.1
Posted: Oct 2, 2006 3:16 PM PDT, in response to: J. Levine
I think there is a "known but not understood yet" bug that can cause 400 errors. Can you help me test by backing up just the directory containing that file? Make sure you do not use SSL (or else the connections will be impossible for me to inspect), and run this during the test:
tcpdump -p -s 0 -w tcpdump.cap -i eth0
(assuming your net card interface is called eth0; edit appropriately)
If you can post, PM, or email me a zip of the log file produced by tcpdump, I can hopefully find out what I'm doing wrong that creates a malformed request (which is the 400 error).
Please be advised that any confidential information in the files you transfer will be plainly visible to anyone who views the tcpdump capture file.
Your help is appreciated :)

Posts: 10, Registered: 3/22/06
Re: [ANNOUNCE] s3sync 1.0.1
Posted: Oct 2, 2006 4:29 PM PDT, in response to: greg13070
I just replied to you privately, with a link to the tcpdump file. It appears that Amazon is timing out the connection, and that s3sync isn't successful in re-establishing the connection...

Posts: 177, Registered: 3/28/06
[ANNOUNCE] s3sync 1.0.2
Posted: Oct 2, 2006 8:29 PM PDT, in response to: J. Levine
New version is out; it contains a fix for fail/retry situations. I also turned off the debug messages about HTTP streaming, which I'd forgotten to do before.
I recommend that all users update, and make sure your new version reads 1.0.2 or greater (it's in the upper right corner of the 'usage' and also near the beginning of the s3sync.rb file).
The archive is still hosted at:
http://s3.amazonaws.com/ServEdge_pub/s3sync/s3sync.tar.gz

Posts: 33, Registered: 2/8/06
Re: [ANNOUNCE] s3sync 1.0.2
Posted: Oct 4, 2006 11:17 PM PDT, in response to: greg13070
Thanks, your work is much appreciated. :)

Posts: 177, Registered: 3/28/06
[ANNOUNCE] s3sync 1.0.4
Posted: Oct 5, 2006 8:36 AM PDT, in response to: greg13070
Few bugs fixed; everyone should update.
http://s3.amazonaws.com/ServEdge_pub/s3sync/s3sync.tar.gz
By the way, now that AWS has addressed their keepalive issue, I'm proud to say that s3sync supports persistent connections. This will measurably decrease latency, especially with SSL sessions.
It's possible that it will uncover more bugs as well, of course :)
There are two known issues right now that I haven't addressed yet. They seem to be more nuisance than show-stopper, so I probably won't get to them immediately. This is what they look like:
Create node servers/mail.servedge.com/roundcubemail-svn/skins/default/images/buttons/.svn/props/ldap_pas.png.svn-work
/usr/lib/ruby/1.8/net/protocol.rb:133:in `sysread': Connection reset by peer (Errno::ECONNRESET)
from /usr/lib/ruby/1.8/net/protocol.rb:133:in `rbuf_fill'
<snip>
from ./s3sync.rb:520
Create node servers/mail.servedge.com/phpmyadmin
S3 command failed:
put ServEdge mail_Wed/home/servers/mail.servedge.com/phpmyadmin #<S3::S3Object:0x4065a21c> Content-Length 38
With result 400 Bad Request
The first one seems self-explanatory. I'm not catching connection resets. The second one only occurs when trying to create a node describing a directory, and only for certain directories. The contents (recursively) of that directory still get stored fine.
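
For the first issue, one common way to handle an uncaught `Errno::ECONNRESET` is to wrap each request in a small retry loop. The sketch below is only an illustration of that idea, not the fix that went into any s3sync release; the name `with_retries` and its parameters are invented here.

```ruby
require 'timeout'

# Hypothetical wrapper (not s3sync code): re-run the block when the peer
# drops the connection, backing off a little longer after each failure.
def with_retries(max_attempts: 3, base_delay: 0.5)
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue Errno::ECONNRESET, Errno::EPIPE, Timeout::Error
    raise if attempts >= max_attempts          # give up: re-raise the error
    sleep(base_delay * (2**(attempts - 1)))    # 0.5s, 1s, 2s, ...
    retry                                      # re-enter the begin block
  end
end
```

A caller would put the whole request inside the block, for example `with_retries { upload_one_file(path) }` where `upload_one_file` is whatever performs the PUT. When the request body is streamed from a file, the stream has to be rewound or reopened inside the block so that each attempt sends the full content.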

Posts: 69, Registered: 3/16/06
Re: [ANNOUNCE] s3sync 1.0.4
Posted: Oct 6, 2006 11:42 AM PDT, in response to: greg13070
1.0.4 is going really great... A couple of things:
Bugs/issues:
1) What's the nature of the issue on the directories not getting stored? Will I (I assume) lose directory permissions on restores if they never get stored?
2) There's still a problem with following symlinks. One of my directories has a symlink to itself (subdirname -> .), which is getting followed recursively. This means that the backup will never terminate, and that I end up storing a lot of copies of that directory in S3. Hrm.
Features/wants:
1) While I'm at it - multiple simultaneous transfers? It seems like S3 often has an incoming I/O limit, which is presumably bypassable with parallel sends. Maybe allow a settable number of simultaneous transfers to be started?
2) gzip compression on send? If you set the metadata right, it sounds like most browsers will transparently decompress gzip-compressed files, but it, of course, will often take less space to store files in s3 this way. Faster transfers/less storage cost sounds like a win, to me.
Thanks!
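
The "settable number of simultaneous transfers" request above could look roughly like the following worker-pool sketch. It is purely illustrative and assumes nothing about s3sync's design: `transfer_in_parallel` is a made-up name, and the block stands in for whatever uploads a single file.

```ruby
require 'thread'

# Hypothetical sketch (not s3sync code): run the given block once per path,
# using at most `workers` threads at a time. Returns { path => block result }.
def transfer_in_parallel(paths, workers: 4, &upload)
  queue = Queue.new
  paths.each { |p| queue << p }
  queue.close                       # once drained, pop returns nil
  results = {}
  lock = Mutex.new
  threads = Array.new(workers) do
    Thread.new do
      while (path = queue.pop)
        value = upload.call(path)
        lock.synchronize { results[path] = value }
      end
    end
  end
  threads.each(&:join)              # wait for every worker to finish
  results
end

# Stand-in "upload" that just returns the length of the path string.
sizes = transfer_in_parallel(%w[a.txt bb.txt], workers: 2) { |path| path.length }
puts sizes['bb.txt'] # prints 6
```

Whether parallel sends actually get around a per-connection throughput limit is something to measure rather than assume.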

Posts: 2, Registered: 10/7/06
Re: [ANNOUNCE] s3sync 1.0.0 using Ruby!
Posted: Oct 7, 2006 4:44 PM PDT, in response to: greg13070
Hi,
It wasn't clear from reading the README and the source whether s3sync copies all specified files in full, or whether it only transfers changed files (whether judged on the last time s3sync was successfully run, or by comparing the modification times of the local and s3 files).

Posts: 10, Registered: 3/22/06
Re: [ANNOUNCE] s3sync 1.0.0 using Ruby!
Posted: Oct 7, 2006 7:29 PM PDT, in response to: watercannon
I certainly won't answer for Greg, but I can't imagine that s3sync is able to only transfer file changes, since that would involve one of two things:
- a sync-like service running at Amazon S3's end that's able to cooperate with your local s3sync to determine what's changed in each file, or
- s3sync would have to download each file, check each for changes, and then upload those changes.
The first doesn't exist. The second wouldn't be of any assistance, since you'd actually *increase* the bandwidth being used -- you'd transfer the file in full back from S3, and then add on the transfer back to S3 of any file changes.
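
Skipping whole files that have not changed is a different question from sending partial changes, and it needs neither mechanism above. For an object stored with a single PUT, S3 reports an ETag that is the hex MD5 of the content, so a client can compare a local digest against the bucket listing without downloading anything. The sketch below illustrates only that comparison; the thread does not say which check s3sync actually uses, and the name `needs_upload?` is hypothetical.

```ruby
require 'digest/md5'

# Hypothetical helper (not s3sync code): true when the local file should be
# sent. `remote_etag` is the ETag string from a bucket listing (it normally
# arrives wrapped in double quotes), or nil when no such key exists yet.
def needs_upload?(local_path, remote_etag)
  return true if remote_etag.nil?                      # nothing stored yet
  local_md5 = Digest::MD5.file(local_path).hexdigest   # digest of local bytes
  local_md5 != remote_etag.delete('"').downcase        # strip quotes, compare
end
```

A changed file is still uploaded in full under this scheme; only the unchanged ones are skipped.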