You may want to make use of Object Storage in your infrastructure. An S3-compatible service can be enabled for your tenant so you can store or retrieve data from buckets stored in the cloud, offered by both ECMWF and EUMETSAT.
At the moment, access to this service is not activated by default for every tenant. If you wish to use it, please raise an issue through the Support Portal requesting access.
You may also follow this guide to use any other S3-compatible storage service, such as AWS S3, from your instances in the European Weather Cloud; just adapt the host and credential information accordingly.
Managing your Object Storage with S3cmd
S3cmd is a free command line tool and client for uploading, retrieving and managing data in Amazon S3 and other cloud storage service providers that use the S3 protocol.
Many other advanced tools (e.g. https://rclone.org/) exist, as do APIs for many languages, but this article aims only to demonstrate the basics.
Install the tool
The easiest way is to install it through the system package manager. On CentOS:
sudo yum install s3cmd
Or for Ubuntu:
sudo apt install s3cmd
Alternatively, you may get the latest version from PyPI.
Configure s3cmd
You will need to configure s3cmd before you can use it. The tool reads its configuration from ~/.s3cfg
Create the configuration file if it does not exist:
touch ~/.s3cfg
Edit the file and set up at least the following parameters.
ECMWF:
- ECMWF CCI1 endpoint: host_base = object-store.os-api.cci1.ecmwf.int
- ECMWF CCI2 endpoint: host_base = object-store.os-api.cci2.ecmwf.int
EUMETSAT:
- EUMETSAT endpoint: host_base = s3.waw3-1.cloudferro.com
Fill in the <youraccesskey> and <yoursecretkey> that will be given to you by the provider:
host_base = <EUM or ECMWF endpoint>
host_bucket =
access_key = <youraccesskey>
secret_key = <yoursecretkey>
use_https = True
# Needed for EUMETSAT, as the provider currently supplies a "CloudFerro" SSL certificate. Skip if ECMWF.
check_ssl_certificate = False
Basic tasks
If you type s3cmd -h
you will see the different options of the command, but here are the basics:
List buckets
s3cmd ls
Create a bucket
s3cmd mb s3://yourbucket
List bucket contents
s3cmd ls s3://yourbucket
Get data from bucket
s3cmd get s3://newbucket/file.txt
Put data into bucket
s3cmd put file.txt s3://newbucket/
Remove data from bucket
s3cmd rm s3://newbucket/file.txt
Remove empty bucket
s3cmd rb s3://yourbucket/
Configure automatic expiry of data
s3cmd expire --expiry-days=14 s3://yourbucket/
Information about a bucket
s3cmd info s3://newbucket
Remove automatic expiry policy
s3cmd dellifecycle s3://yourbucket/
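Beyond single files, whole directories can be mirrored with s3cmd sync, which is not listed above; a sketch, where ./data/ and the bucket name are placeholders:

```shell
# Mirror a local directory into the bucket; --delete-removed also removes
# remote objects that no longer exist locally, so use it with care
s3cmd sync --delete-removed ./data/ s3://yourbucket/data/

# Mirror the bucket contents back, e.g. on another VM
s3cmd sync s3://yourbucket/data/ ./data/
```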
Mounting your bucket with S3FS via FUSE
You may also mount your bucket to expose the files in your S3 bucket as if they were on a local disk. Generally S3 cannot offer the same performance or semantics as a local file system, but it can be useful for legacy applications that mainly need to read data and expect the files to be in a conventional file system. You can find more information in the s3fs-fuse documentation.
S3FS installation
First of all, make sure you have S3FS installed on your VM. On CentOS:
sudo yum install epel-release
sudo yum install s3fs-fuse
On Ubuntu:
sudo apt install s3fs
Configure S3FS
You need to store your credentials in a file so S3FS can authenticate with the service. Replace <youraccesskey> and <yoursecretkey> with your actual credentials.
echo <youraccesskey>:<yoursecretkey> | sudo tee /root/.passwd-s3fs
sudo chmod 600 /root/.passwd-s3fs
Setting up an automatic mount
Assuming you want to mount your bucket in /mnt/yourbucket, here is what you need to do:
sudo mkdir /mnt/yourbucket
echo "s3fs#yourbucket /mnt/yourbucket fuse _netdev,allow_other,nodev,nosuid,uid=$(id -u),gid=$(id -g),use_path_request_style,url=<s3_endpoint> 0 0" | sudo tee -a /etc/fstab
sudo mount -a
Again, you must replace <s3_endpoint> with the relevant endpoint at ECMWF or EUMETSAT, and you may customise other mount options if you wish to do so. At this point you should have your bucket mounted and ready to use.
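As a convenience, the fstab entry can be assembled from shell variables and reviewed before anything is appended to /etc/fstab; a sketch with hypothetical values (bucket name, mount point and endpoint are placeholders to replace):

```shell
# Hypothetical values; replace with your own bucket, mount point and endpoint
bucket=yourbucket
mountpoint=/mnt/yourbucket
endpoint=https://object-store.os-api.cci1.ecmwf.int

# Build the fstab line so it can be inspected before touching /etc/fstab
fstab_line="s3fs#${bucket} ${mountpoint} fuse _netdev,allow_other,nodev,nosuid,uid=$(id -u),gid=$(id -g),use_path_request_style,url=${endpoint} 0 0"
echo "$fstab_line"

# When happy with it:
#   echo "$fstab_line" | sudo tee -a /etc/fstab && sudo mount -a
```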
5 Comments
Florian Pinault
If you want to give anonymous read permission to your data:
Note that the following will give read access to the contents of the bucket recursively, but not to the bucket itself (you need to run both to give access):
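The commands themselves did not survive in this comment; a plausible reconstruction using s3cmd's setacl subcommand (the bucket name is a placeholder):

```shell
# Anonymous read on every object currently in the bucket (recursive):
s3cmd setacl --acl-public --recursive s3://yourbucket

# Anonymous read on the bucket itself, so its contents can be listed:
s3cmd setacl --acl-public s3://yourbucket
```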
Soren Kalesse
You can also place the credential file passwd-s3fs globally under /etc. I prefer that over the /root/.passwd-s3fs location, because the contents of that file have global impact, especially when you are working with s3fs automounts.
It is also possible to put credentials for multiple buckets in the passwd-s3fs file. To do so, simply add one line per bucket and prefix the <access-key>:<secret-key> part with <bucket-name>:, like in this example:
Note that the file should still be secured with restrictive permissions, as above.
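The example itself did not survive in this comment; a sketch of the per-bucket format described above, with made-up bucket names and keys, written to a temporary path rather than /etc/passwd-s3fs:

```shell
# Write a multi-bucket credential file, one <bucket>:<access>:<secret> line
# per bucket (bucket names and keys here are made up). In production this
# would be /etc/passwd-s3fs, owned by root.
credfile=$(mktemp)
cat > "$credfile" <<'EOF'
bucket-one:AKIAEXAMPLEONE:secretkeyone
bucket-two:AKIAEXAMPLETWO:secretkeytwo
EOF

# s3fs refuses credential files that other users can read
chmod 600 "$credfile"
ls -l "$credfile"
```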
References: http://manpages.ubuntu.com/manpages/xenial/man1/s3fs.1.html, https://github.com/s3fs-fuse/s3fs-fuse/wiki/Fuse-Over-Amazon
Hella Riede
Hi Mike Grant
Quote from above: "July 2021: EUMETSAT's S3 provision has an incomplete SSL certificate; you can either set use_https to False..."
Doing that I get redirect errors.
Also, s3cmd told me to specify use_https = Yes or use_https = No instead of use_https = True or use_https = False. However, that might be because I use a version installed with pip; sometimes parameters differ between the pip version and a Linux repo version. When I set use_https = Yes, all seems normal.
Antonio Vocino
As of today (19 May 2022), for EUMETSAT's S3 provision, to make it work you have to set BOTH of these options in the configuration file:
use_https = True
check_ssl_certificate = False
Then the mounting is OK.
Hella Riede
Note that uploading with s3cmd may produce regular errors, always first failing and then (hopefully) succeeding.
This is described here: https://github.com/s3tools/s3cmd/issues/1114. The latest version of the master branch has a fix for this, but it has not been released yet, so package indexes like PyPI still carry the buggy version.