How do I use S3 buckets to share data on Tier 2 storage with other users?

The answer depends on how you access MSI's Tier 2 storage. If you primarily use it through the Globus interface, we have separate directions for sharing there. This page covers how to share data using S3 command-line tools such as s3cmd.

The best way we have found to share data between users via the S3 interface is using bucket policies.

This involves writing a policy file (using a text editor) and then applying it to your bucket with s3cmd.

A simple policy file would look something like this:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {"AWS": [ "arn:aws:iam:::user/uid=54321", "arn:aws:iam:::user/uid=76543" ]},
      "Action": [ "s3:*" ],
      "Resource": ["arn:aws:s3:::mybucket/*", "arn:aws:s3:::mybucket"]
    }
  ]
}

Some notes about the file contents...

  • The "Version" line reflects a protocol revision - it is not a date that you can change.
  • The "Principal" line contains a comma-separated list of the users who should have access. It is meant to be a single line, though it may appear split on this webpage.
    • For most people, the username internal to Tier 2 is not the same as your Internet ID, but you can look it up using the "s3info" command - for example:
      % s3info info -u username
      Sharing address:
      Tier 2 username: uid=54321
    • For your own account, "s3info" returns both your S3 keys and your Tier 2 username; for other users it returns only their Tier 2 username.
  • The "Action" line contains a list of the S3 operations that are permitted. There are a lot of options. In this simple example we permit all of them.
  • Finally, the "Resource" line defines which paths the policy applies to. We list the bucket itself ("mybucket") and its contents ("mybucket/*") separately. You can apply policies to specific sub-paths as well.
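
If you would rather not hand-edit JSON, the file can also be generated. The following is a minimal Python sketch, not an MSI-provided tool: the function name build_policy and its parameters are hypothetical, and it assumes you have already looked up each Tier 2 username (for example "uid=54321") with s3info. It reproduces the structure of the example policy above.

```python
import json


def build_policy(bucket, tier2_users, actions=("s3:*",)):
    """Return a policy dict granting `actions` on `bucket` to each Tier 2 user.

    `tier2_users` holds Tier 2 usernames as reported by s3info, e.g. "uid=54321".
    """
    principals = [f"arn:aws:iam:::user/{user}" for user in tier2_users]
    return {
        "Version": "2012-10-17",  # protocol revision, not a date to edit
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": principals},
                "Action": list(actions),
                # The bucket's contents and the bucket itself are listed separately.
                "Resource": [f"arn:aws:s3:::{bucket}/*", f"arn:aws:s3:::{bucket}"],
            }
        ],
    }


def write_policy(path, policy):
    """Write the policy dict to `path` as indented JSON."""
    with open(path, "w") as fh:
        json.dump(policy, fh, indent=2)
        fh.write("\n")
```

Calling write_policy("s3policy-mybucket", build_policy("mybucket", ["uid=54321", "uid=76543"])) should produce a file equivalent to the hand-written example.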

With the policy file written, the next step is to apply it. Let's assume your policy file is called "s3policy-mybucket". You can apply it to your bucket with s3cmd like this:

s3cmd setpolicy s3policy-mybucket s3://mybucket
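
A syntax error in the policy file is easy to make when editing JSON by hand, so it can help to check the file before applying it. The sketch below is one possible wrapper, not an official MSI script: the function names are hypothetical, it only confirms that the file parses and has the two top-level keys used in the example, and it assumes s3cmd is installed and already configured for Tier 2.

```python
import json
import subprocess


def check_policy_file(path):
    """Parse the policy file and confirm it has the expected top-level keys."""
    with open(path) as fh:
        policy = json.load(fh)  # raises json.JSONDecodeError on a syntax error
    for key in ("Version", "Statement"):
        if key not in policy:
            raise ValueError(f"policy file {path!r} is missing {key!r}")
    return policy


def setpolicy_command(path, bucket):
    """Build the same s3cmd command line shown above."""
    return ["s3cmd", "setpolicy", path, f"s3://{bucket}"]


def apply_policy(path, bucket):
    """Check the file, then run s3cmd; raises CalledProcessError if s3cmd fails."""
    check_policy_file(path)
    subprocess.run(setpolicy_command(path, bucket), check=True)
```

For example, apply_policy("s3policy-mybucket", "mybucket") runs the command above only if the file parses. s3cmd also provides an "info" subcommand, which displays the policy currently set on a bucket, and a "delpolicy" subcommand to remove it.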

After this, all of the listed users should be able to list, read and write the bucket.
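
If write access is more than you want to grant, the "Action" list can be narrowed. The Python sketch below builds a read-only variant of the same policy; it is an illustration under assumptions rather than a tested MSI recipe. It uses the standard S3 action names "s3:ListBucket" and "s3:GetObject", and the function name read_only_policy is hypothetical. Check the result against your own bucket before relying on it.

```python
import json

# Standard S3 action names: list a bucket's contents, and read its objects.
READ_ONLY_ACTIONS = ["s3:ListBucket", "s3:GetObject"]


def read_only_policy(bucket, tier2_users):
    """Return a policy dict that lets the given Tier 2 users list and read `bucket`."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "AWS": [f"arn:aws:iam:::user/{user}" for user in tier2_users]
                },
                "Action": READ_ONLY_ACTIONS,
                # ListBucket acts on the bucket itself, GetObject on its contents.
                "Resource": [f"arn:aws:s3:::{bucket}/*", f"arn:aws:s3:::{bucket}"],
            }
        ],
    }


if __name__ == "__main__":
    # Print the policy as JSON so it can be saved to a file and applied with s3cmd.
    print(json.dumps(read_only_policy("mybucket", ["uid=54321"]), indent=2))
```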

There are a couple of small caveats here.

  1. Only the original bucket owner will see the bucket in a list of buckets (the output from "s3cmd ls"). Other users can list it only when they request it explicitly - for example "s3cmd ls s3://mybucket".
  2. A user who has access to a bucket through a policy like this should, in principle, also be able to access it through Globus. Unfortunately, this doesn't work. We are investigating this issue with Globus.

In the longer term, MSI hopes to provide wrapper scripts that make it easier to create simple policies like the one above.

 
