s3bucketmap.py

26 June 2008

s3bucketmap.py: accessing an S3 bucket or portion thereof as a Python map

s3bucketmap.py implements Python mapping operations on an Amazon S3 bucket, with an optional prefix.

It runs on Python 2.5, and I have tested it on Windows Vista.

It is licensed under the MIT License.

sample code

The following Python code can be run as a script (downloadable as S3BucketMapExample.py). It expects the usual three arguments for accessing an S3 bucket: access key, secret access key and bucket name.

from s3bucketmap import S3BucketMap
import sys

# Arguments: access key, secret access key, bucket name, then the key prefix.
bucketMap = S3BucketMap(sys.argv[1], sys.argv[2], sys.argv[3], "testprefix/")

# Store a value, read it back, and test membership.
bucketMap["testKey"] = "some value"
print "bucketMap[testKey] = %r" % bucketMap["testKey"]
print "testKey in bucketMap = %r" % ("testKey" in bucketMap)
print "testKey2 in bucketMap = %r" % ("testKey2" in bucketMap)

# Add a second key, then iterate over the keys stored under the prefix.
bucketMap["anotherTestKey"] = "another value"
for key in bucketMap:
    print "key = %s" % key

When run on an empty bucket, it should produce the following output:

bucketMap[testKey] = 'some value'
testKey in bucketMap = True
testKey2 in bucketMap = False
key = anotherTestKey
key = testKey

Note: the version of S3BucketMapExample.py in the keevalbak GitHub repository can additionally read its configuration from a localenv module.

dependencies

s3bucketmap.py depends on one other Python package:

download

s3bucketmap.py

details

missing features

s3bucketmap.py is the simplest possible implementation of the Python mapping operations. It only stores or retrieves values which are byte strings, and it expects keys to be strings. All key strings are converted to Unicode, and then UTF-8 encoded to S3 keys.
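The key conversion described above can be sketched roughly as follows. This is an illustration only, not the code in s3bucketmap.py: the helper names to_s3_key and from_s3_key are hypothetical, the point at which the prefix is applied is an assumption, and the sketch is written for a current Python 3 interpreter rather than the Python 2.5 that s3bucketmap.py targets.

```python
def to_s3_key(key, prefix=""):
    """Hypothetical helper: build the UTF-8 encoded S3 key for a map key."""
    if isinstance(key, bytes):
        # Assumption: byte-string keys are plain ASCII text.
        key = key.decode("ascii")
    # Assumption: the prefix is prepended before encoding.
    return (prefix + key).encode("utf-8")


def from_s3_key(s3_key, prefix=""):
    """Hypothetical inverse: recover the map key from a stored S3 key."""
    text = s3_key.decode("utf-8")
    if not text.startswith(prefix):
        raise ValueError("S3 key %r lacks prefix %r" % (text, prefix))
    return text[len(prefix):]
```

Under these assumptions, to_s3_key("testKey", "testprefix/") gives the byte string b"testprefix/testKey", and from_s3_key reverses the conversion.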

s3bucketmap doesn't do any of the following:

One could add these features to s3bucketmap.py. However, it would be better to layer other map implementations on top of the s3bucketmap.S3BucketMap class than to add to the class itself.
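As a rough illustration of the layering approach, the sketch below wraps any map of byte strings in a second map that pickles its values. PickledMap is a hypothetical class, picked only as an example of a layer, and is not part of s3bucketmap.py. A plain dict stands in for the S3BucketMap so that the sketch runs without S3 credentials, and it assumes the inner map supports item deletion and iteration. It is written for Python 3.

```python
import pickle
from collections.abc import MutableMapping


class PickledMap(MutableMapping):
    """Hypothetical layer: stores picklable values in a byte-string map."""

    def __init__(self, inner):
        # inner is any map from string keys to byte-string values,
        # for example an S3BucketMap (a dict stands in for it below).
        self.inner = inner

    def __getitem__(self, key):
        return pickle.loads(self.inner[key])

    def __setitem__(self, key, value):
        self.inner[key] = pickle.dumps(value)

    def __delitem__(self, key):
        # Assumption: the inner map supports deletion.
        del self.inner[key]

    def __iter__(self):
        return iter(self.inner)

    def __len__(self):
        # Count by iterating, so the inner map need not implement len().
        return sum(1 for _ in self.inner)


store = {}  # stand-in for an S3BucketMap
layered = PickledMap(store)
layered["config"] = {"retries": 3, "hosts": ["a", "b"]}
```

The inner map only ever sees byte strings, so the underlying class stays as simple as it is now, and the extra behaviour lives entirely in the wrapper.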

Note: these features are included in Shove, which I was attempting to use as an interface to S3 for the application I was writing (yet to be released). I found that the additional features complicated the interface unnecessarily, so I created s3bucketmap.py as an alternative to Shove, one which provides access to S3 as a Python dictionary in the simplest and most straightforward manner possible.

prefix option

S3 doesn't have a direct notion of hierarchical folders, but it indirectly supports folder-like capabilities with its ability to search bucket keys based on a prefix.

In practice this means that one can treat a set of keys with a specified prefix as a sub-bucket, and efficiently operate on those keys regardless of how many other keys exist in the bucket.

s3bucketmap.py supports this feature of S3 by providing a prefix option to the constructor of the S3BucketMap class (with a default value of "").
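The effect of the prefix can be modelled in memory as below. PrefixView is a hypothetical stand-in written for this illustration, with a dict playing the part of the bucket. It is not how S3BucketMap is implemented, and the real S3 service filters keys by prefix on the server rather than scanning them all. The sketch is written for Python 3.

```python
class PrefixView(object):
    """Hypothetical in-memory model of a prefixed view onto a shared bucket."""

    def __init__(self, bucket, prefix=""):
        self.bucket = bucket  # a dict standing in for the S3 bucket
        self.prefix = prefix

    def __setitem__(self, key, value):
        self.bucket[self.prefix + key] = value

    def __getitem__(self, key):
        return self.bucket[self.prefix + key]

    def __contains__(self, key):
        return (self.prefix + key) in self.bucket

    def __iter__(self):
        # Yield only keys under the prefix, with the prefix stripped.
        for full_key in sorted(self.bucket):
            if full_key.startswith(self.prefix):
                yield full_key[len(self.prefix):]


bucket = {}
photos = PrefixView(bucket, "photos/")
notes = PrefixView(bucket, "notes/")
photos["cat"] = b"jpeg bytes"
notes["todo"] = b"buy milk"
whole = PrefixView(bucket)  # default prefix "" sees every key
```

Here photos and notes behave as two independent maps, although both write into the same bucket. The view with the default prefix sees the full keys "notes/todo" and "photos/cat".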

The forward slash character ("/", as in the "testprefix/" used in the sample code above) has no special meaning to S3. However, tools such as s3fox interpret slashes as separators for folder names, so it is worth ending the prefix argument passed to the S3BucketMap constructor with a slash; keys created with that prefix are then easier to browse.