Tuesday, April 13, 2010

Optimized Amazon S3 Sync Feature

Note: this post applies to CloudBerry Explorer 2.0 and later.

In this blog post we demonstrate an optimization that we added to our sync feature. Previous versions of CloudBerry S3 Explorer issued a HEAD request for each file on S3 to get its original Date Modified. This cost the user extra money, because Amazon charges per request and the number of requests equaled the number of files on S3. If, say, you have 70,000 files on S3 that you want to sync, every sync run would issue 70,000 HEAD requests.

In version 2.0 we are adding a "Compare Content" option, which is ON by default.

The option calculates an MD5 checksum for each local file that has a matching file on S3. It then compares the local file's MD5 with the ETag (the MD5 checksum that Amazon calculates when the file is uploaded to S3).
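The content comparison can be sketched roughly as follows. This is a minimal illustration in Python, not CloudBerry's actual code: the function names `local_md5` and `matches_etag` are hypothetical, and it assumes the ETag arrives as a hex MD5 string, possibly wrapped in double quotes.

```python
import hashlib


def local_md5(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a local file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_etag(path, etag):
    """Compare a local file's MD5 with an ETag taken from a bucket listing.

    Assumption: the ETag is a hex MD5 string, optionally quoted.
    """
    return local_md5(path) == etag.strip('"').lower()
```

Since the ETag is assumed here to come from the bucket listing, a match means the file can be skipped without any per-file request.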

If the ETag is different, we look at Date Created (the date the file was uploaded) and compare it to the local Date Modified. If the local Date Modified is newer than Date Created on S3, we make no additional requests for this file and upload it to S3.

If the local Date Modified is older than Date Created on S3, we request the headers for this file and check the Date Modified stored in our header.
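The decision steps described above could be sketched like this. It is an illustrative outline only: `Action`, `RemoteEntry`, and `decide` are hypothetical names, not part of CloudBerry Explorer. Two cases are not spelled out in the post and are assumptions here: a matching ETag is treated as "nothing to do", and equal dates fall through to the header request.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class Action(Enum):
    SKIP = "skip"                  # content identical, no request needed (assumption)
    UPLOAD = "upload"              # local file is newer, upload without extra requests
    HEAD_REQUEST = "head_request"  # need the stored Date Modified header to decide


@dataclass
class RemoteEntry:
    etag: str               # ETag from the bucket listing
    date_created: datetime  # upload date from the bucket listing


def decide(local_md5_hex, local_modified, remote):
    """Pick the next step for one file that exists both locally and on S3."""
    if local_md5_hex.lower() == remote.etag.strip('"').lower():
        return Action.SKIP
    if local_modified > remote.date_created:
        return Action.UPLOAD
    # Local copy is not newer than the upload date: fetch the headers.
    return Action.HEAD_REQUEST
```

With this flow, a header request is only issued for files whose content differs and whose local copy is not newer than the upload date, rather than for every file in the bucket.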

This way we can significantly reduce the number of requests to Amazon S3.

+++

CloudBerry S3 Explorer is a Windows freeware product that helps manage Amazon S3 storage and CloudFront. You can download it at http://cloudberrylab.com/

CloudBerry S3 Explorer PRO is a Windows program that helps manage Amazon S3 storage and CloudFront. It is currently in beta and free for all users. You can download it at http://cloudberrylab.com/

Like our products? Please help us spread the word about them. Learn here how to do it.
