This page is currently incomplete and is being updated following recent developments.

S3

CSCS offers a public cloud object storage service, based on the Ceph Object Gateway. The service can be accessed from S3-compatible clients.

General information 

Endpoint: https://rgw.cscs.ch

URL: path-style, in the format https://rgw.cscs.ch/<bucket-name>/key-name

Publicly accessible object links (after setting proper bucket policy): https://rgw.cscs.ch/<tenant>:<bucket-name>/key-name
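As a minimal sketch of the two URL formats above, the following Python snippet builds both forms using only the standard library. The helper names object_url and public_object_url are hypothetical and introduced here for illustration only; they are not part of any CSCS or AWS tooling.

```python
from urllib.parse import quote

ENDPOINT = "https://rgw.cscs.ch"


def object_url(bucket: str, key: str) -> str:
    """Path-style URL: the bucket is the first path component."""
    return f"{ENDPOINT}/{bucket}/{quote(key)}"


def public_object_url(tenant: str, bucket: str, key: str) -> str:
    """Anonymous link: the bucket name is prefixed with '<tenant>:'."""
    return f"{ENDPOINT}/{tenant}:{bucket}/{quote(key)}"


print(object_url("test-bucket", "file.txt"))
# https://rgw.cscs.ch/test-bucket/file.txt
print(public_object_url("test_tenant", "test-public-bucket", "file.txt"))
# https://rgw.cscs.ch/test_tenant:test-public-bucket/file.txt
```

Keys are percent-encoded with quote, which leaves "/" untouched, so keys that contain "folder" prefixes keep their structure.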

Usage examples

AWS CLI

Configuration

The first step is to configure the profile:

> aws configure --profile naret-testuser
AWS Access Key ID [None]: [REDACTED]
AWS Secret Access Key [None]: [REDACTED]
Default region name [None]: cscs-zonegroup
Default output format [None]:

Then, settings such as the default endpoint and the path-style URLs can be placed in the configuration file:

[profile naret-testuser]
endpoint_url = https://rgw.cscs.ch
region = cscs-zonegroup
s3 =
    addressing_style = path


Creating a pre-signed URL

> aws --profile=naret-testuser s3 presign s3://test-bucket/file.txt --expires-in 300

https://rgw.cscs.ch/test-bucket/file.txt?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=IA6AOCNMKPDXQ0YNA3DP%2F20241209%2Fcscs-zonegroup%2Fs3%2Faws4_request&X-Amz-Date=20241209T080748Z&X-Amz-Expires=300&X-Amz-SignedHeaders=host&X-Amz-Signature=f2e2adb457f6fd43401124e4ea2650fba528e614ab661f9c05e2fa2e77691b5d

Notice that the tenant part is missing from the URL: this is because S3 doesn't natively deal with multitenancy. The correct object is retrieved based on the access key. A more thorough explanation can be found in the RGW documentation.
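To illustrate what the query parameters of a pre-signed URL carry, here is a small sketch that reads them back with the Python standard library. It assumes a Signature Version 4 URL like the one shown above. The function names are hypothetical, and the access key and signature in the sample URL are placeholders.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlsplit


def presigned_url_expiry(url: str) -> datetime:
    """Return the UTC time at which a SigV4 pre-signed URL stops being valid."""
    params = parse_qs(urlsplit(url).query)
    signed_at = datetime.strptime(params["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ")
    lifetime = timedelta(seconds=int(params["X-Amz-Expires"][0]))
    return signed_at.replace(tzinfo=timezone.utc) + lifetime


def credential_scope(url: str) -> list[str]:
    """Split X-Amz-Credential into access key, date, region, service, terminator."""
    params = parse_qs(urlsplit(url).query)
    return params["X-Amz-Credential"][0].split("/")


# Placeholder access key and signature; date and lifetime match the example above.
url = (
    "https://rgw.cscs.ch/test-bucket/file.txt"
    "?X-Amz-Algorithm=AWS4-HMAC-SHA256"
    "&X-Amz-Credential=ACCESSKEY%2F20241209%2Fcscs-zonegroup%2Fs3%2Faws4_request"
    "&X-Amz-Date=20241209T080748Z&X-Amz-Expires=300"
    "&X-Amz-SignedHeaders=host&X-Amz-Signature=PLACEHOLDER"
)
print(presigned_url_expiry(url).isoformat())  # 2024-12-09T08:12:48+00:00
print(credential_scope(url)[2])               # cscs-zonegroup
```

The access key embedded in X-Amz-Credential is what lets the gateway resolve the right tenant, even though the tenant does not appear in the path.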

Making a bucket's contents anonymously accessible from the Internet

First, a bucket policy needs to be written:

> cat test-public-bucket-anon-from-internet.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": [
        "arn:aws:s3:::test-public-bucket/*",
        "arn:aws:s3:::test-public-bucket"
      ]
    }
  ]
}
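When the same policy is needed for several buckets, the document can be generated instead of written by hand. The sketch below uses only the Python json module and reproduces the structure of the policy shown above. The function name public_read_policy is a hypothetical helper, not part of any CSCS tooling.

```python
import json


def public_read_policy(bucket: str) -> dict:
    """Build a policy that lets anyone read objects in the given bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                "Resource": [
                    f"arn:aws:s3:::{bucket}/*",
                    f"arn:aws:s3:::{bucket}",
                ],
            }
        ],
    }


policy_json = json.dumps(public_read_policy("test-public-bucket"), indent=2)
print(policy_json)
```

The printed JSON can be saved to a file and passed to the put-bucket-policy command with the file:// prefix.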

Then, it can be applied to the bucket:

> aws --profile=naret-testuser s3api put-bucket-policy --bucket test-public-bucket --policy file://test-public-bucket-anon-from-internet.json

At this point, the objects in test-public-bucket are accessible via direct links:

> curl https://rgw.cscs.ch/test_tenant:test-public-bucket/file.txt
This is a test.


s3cmd

Configuration

The first step is to configure the profile:

> s3cmd --configure

Enter new values or accept defaults in brackets with Enter.
Refer to user manual for detailed description of all options.

Access key and Secret key are your identifiers for Amazon S3. Leave them empty for using the env variables.
Access Key: [REDACTED]
Secret Key: [REDACTED]
Default Region [US]: cscs-zonegroup

Use "s3.amazonaws.com" for S3 Endpoint and not modify it to the target Amazon S3.
S3 Endpoint [s3.amazonaws.com]: rgw.cscs.ch

Use "%(bucket)s.s3.amazonaws.com" to the target Amazon S3. "%(bucket)s" and "%(location)s" vars can be used
if the target S3 system supports dns based buckets.
DNS-style bucket+hostname:port template for accessing a bucket [%(bucket)s.s3.amazonaws.com]: rgw.cscs.ch/%(bucket)s

Encryption password is used to protect your files from reading
by unauthorized persons while in transfer to S3
Encryption password:
Path to GPG program:

When using secure HTTPS protocol all communication with Amazon S3
servers is protected from 3rd party eavesdropping. This method is
slower than plain HTTP, and can only be proxied with Python 2.7 or newer
Use HTTPS protocol [Yes]: Yes

On some networks all internet access must go through a HTTP proxy.
Try setting it here if you can't connect to S3 directly
HTTP Proxy server name:

New settings:
  Access Key: [REDACTED]
  Secret Key: [REDACTED]
  Default Region: cscs-zonegroup
  S3 Endpoint: rgw.cscs.ch
  DNS-style bucket+hostname:port template for accessing a bucket: rgw.cscs.ch/%(bucket)s
  Encryption password:
  Path to GPG program: None
  Use HTTPS protocol: True
  HTTP Proxy server name:
  HTTP Proxy server port: 0

And then confirm.

IMPORTANT: The configuration is not complete yet. At this point, requests still fail with a signature error:

> s3cmd ls s3://test-bucket
ERROR: S3 error: 403 (SignatureDoesNotMatch)

To fix this, it is necessary to edit the .s3cfg file, normally located in the user's home directory, and change the signature_v2 setting to true.

> grep signature_v2 ~/.s3cfg
signature_v2 = True

> s3cmd ls s3://test-bucket
2024-12-09 08:05           15  s3://test-bucket/file.txt
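The signature_v2 change can also be scripted. The following is a minimal sketch, assuming that .s3cfg is a plain INI file with a [default] section, as written by s3cmd --configure. The function name enable_signature_v2 is hypothetical. Interpolation is disabled so that values such as rgw.cscs.ch/%(bucket)s are kept verbatim.

```python
import configparser
import io


def enable_signature_v2(cfg_text: str, section: str = "default") -> str:
    """Return the configuration text with signature_v2 set to True."""
    # interpolation=None keeps '%(bucket)s' templates untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    parser[section]["signature_v2"] = "True"
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


# Shortened sample of a .s3cfg file, for illustration only.
sample = """[default]
host_base = rgw.cscs.ch
host_bucket = rgw.cscs.ch/%(bucket)s
signature_v2 = False
"""
print(enable_signature_v2(sample))
```

To apply this to the real file, read the text from the .s3cfg file in the home directory and write the result back. Note that configparser drops comments when rewriting, so keeping a backup copy is advisable.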


Cyberduck

Configuration

In order to be able to connect to the S3 endpoint using Cyberduck, a profile supporting path-style requests must be downloaded from here.

Swift - DEPRECATED

CSCS offers a public cloud object storage service, based on OpenStack Swift. The service can be accessed through REST APIs compatible with the OpenStack Swift protocol.

Access

There are several ways of accessing the object storage service, from user-friendly graphical tools to software libraries which can be integrated with custom applications.

Web frontend

Users with the SwiftOperator role can use the OpenStack dashboard to access object storage by selecting the Object Store tab on the top menu bar of the Horizon GUI. Users who don't have the SwiftOperator role cannot access the object store from this web interface and must use other means, such as the CLI, Cyberduck, or the REST API.

Command line interface

A generic guide on how to connect to our OpenStack system Castor using the command line is available at this URL. Detailed documentation on how to use the Swift command line client can be found in the official OpenStack documentation. Please note that there are two ways of accessing Swift from the command line. The older Swift client (Object Storage service (swift) command-line client) has the most features but is no longer actively developed. The newer OpenStack unified CLI (OpenStackClient) is actively developed but does not cover all Swift features yet.

Cyberduck

A guide for connecting to our object storage service using the graphical file browsing client Cyberduck can be found here.

Swift REST API

The object storage is accessible through a REST API, defined in this official documentation page. You can access the REST API with curl, or alternatively you can use a software library such as the Python swiftclient module, which can be downloaded from GitHub or from PyPI.

Examples

Below is a list of how-tos for typical use cases. They do not cover all the possible operations of OpenStack Swift. For a complete list of functionalities, please refer to the official documentation of the CLI and the REST API.

Known issues and workarounds

Below is a list of known issues with their workarounds.

Access Control Lists

Users can be granted two different roles within a project:

  • SwiftOperator: can list, create and modify all containers and objects within a project, can configure read or write Access Control Lists (ACL)
  • member: can access objects in specific containers only if a read or write ACL was granted to them

In Swift, ACLs can be assigned to containers, not to objects. Our authentication system involves Keystone V3 and domains, which means that user names and project names might not be unique. Because of this, when creating an ACL rule, users need to specify project IDs and user IDs instead of project names and user names. To facilitate this, we regularly populate an object called user_ids inside a container called project_info in each project. The object user_ids contains the IDs of the members of the current project.

This is an example of how an operator can add a read ACL to a container:

swift post mycontainer --read-acl {PROJECT1_ID}:{USER1_ID},{PROJECT2_ID}:{USER2_ID}

The option --write-acl is used to configure write permissions. Please note that PROJECT_ID and USER_ID are long alphanumeric strings, so the actual command will look like the following:

swift post testcontainer --read-acl 62f7feebbfb94f3bbb501b0a060nfn2r:3bb7feebbfb94f3bb5mdob0a060b30eb

It is also possible to use the wildcard *, as described in the official ACL documentation.
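Since the ACL value is a comma-separated list of project_id:user_id pairs, it can be assembled programmatically, for instance from the IDs found in the user_ids object. The sketch below shows one way to do this in Python. The function name build_acl is a hypothetical helper, and passing "*" as an ID assumes the wildcard form described in the official ACL documentation.

```python
def build_acl(grants: list[tuple[str, str]]) -> str:
    """Join (project_id, user_id) pairs into a value for --read-acl or --write-acl."""
    for project_id, user_id in grants:
        if not project_id or not user_id:
            raise ValueError("project and user IDs must be non-empty")
    return ",".join(f"{project_id}:{user_id}" for project_id, user_id in grants)


acl = build_acl(
    [("62f7feebbfb94f3bbb501b0a060nfn2r", "3bb7feebbfb94f3bb5mdob0a060b30eb")]
)
print(f"swift post testcontainer --read-acl {acl}")
```

The printed line matches the example command shown earlier. Multiple pairs are joined with commas in the order given.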

Operators can use the swift stat command to list existing ACLs on a container.

To access objects in a project they are a member of, users can run a command like the following after authenticating: swift list {container_name}.

Only operators can list all the containers in a project. Normal users cannot list which containers they have access to. However, once operators have told them which containers they can access, they can list the contents of those containers. To access an object in a project they are not a member of, users who have been granted ACL access should instead use the following command:

swift --os-storage-url https://object.cscs.ch/v1/AUTH_{CHOSEN_PROJECT_ID} list {CONTAINER_NAME}
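The storage URL in the command above is simply the service endpoint followed by AUTH_ and the ID of the target project. As a small sketch, the following Python snippet builds that URL and the corresponding argument list. The helper names storage_url and list_command are hypothetical, and the project ID shown is a placeholder.

```python
SWIFT_ENDPOINT = "https://object.cscs.ch/v1"


def storage_url(project_id: str) -> str:
    """Storage URL of a project: endpoint plus 'AUTH_<project id>'."""
    return f"{SWIFT_ENDPOINT}/AUTH_{project_id}"


def list_command(project_id: str, container: str) -> list[str]:
    """Arguments for listing a container that belongs to another project."""
    return ["swift", "--os-storage-url", storage_url(project_id), "list", container]


# "0123abcd" is a placeholder, not a real project ID.
print(" ".join(list_command("0123abcd", "mycontainer")))
# swift --os-storage-url https://object.cscs.ch/v1/AUTH_0123abcd list mycontainer
```

The argument list can be handed to subprocess.run in an authenticated environment, which avoids quoting issues in container names.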

You can find more information about Swift ACLs in the official documentation.

Roles and project memberships have to be requested by contacting CSCS staff.

Data protection

Container versioning

Our object storage system automatically saves up to 3 versions of objects whenever they are modified or deleted. Object versions are automatically stored into the {your_container_name}_versions container and they are kept for 90 days. The user can recover an object by copying (CLI: swift copy) the desired version from the {your_container_name}_versions container into {your_container_name} or to a different one. The {your_container_name}_versions containers are automatically created by a daily cron job.

Backup

In addition to versioning, all data written to the object store is backed up to tape. This allows the recovery of the entire object store in case of major hardware or file system failures. The backup is taken once a day, with a retention policy of 3 months. Whenever an object changes, a new copy is created in the backend, with a maximum of three copies stored on tape. In case of a major outage, the entire storage will be restored within a few days.

Swift S3 API

In addition to the standard Swift API, the object storage service also exposes an S3 API, using the same service endpoint. The compatibility matrix between the Swift S3 API and the Amazon S3 API is described in this table.

These two APIs give access to the same object storage service, so whichever one you use, you will always access the same data. The S3 ACLs are disabled in order to allow both APIs to operate seamlessly. To set ACLs on your containers/buckets and objects, please use the Swift API.

To use the Swift S3 API, you first have to create a set of EC2 credentials. For this, you need to obtain a standard Keystone token and then use the OpenStack CLI as follows:

openstack ec2 credentials create

The output contains a pair of access and secret keys, which can be used with any S3 client. Below is a short list of the most common ones: