The S3 mover allows an S3-style object storage system to be used as a second tier for Lustre. The mover copies data between Lustre and an S3 bucket. Its configuration includes the archive ID and the bucket used to store data. A UUID is generated for each file, and the object is stored in the bucket under a path similar to the one used by the POSIX mover. Once the archive is complete, however, a URL for the object is returned to the Agent:
s3://<bucketName>/<prefix>/objects/xx/yy/UUID
The S3 mover can also be used to import an existing data set from S3 into Lustre. To do this, the "lhsm import" command is used to replicate the path names from S3 into the Lustre filesystem. The import also adds the S3 URL to each file so that the mover can retrieve the data.
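As an illustration of the URL layout described above, the following Python sketch builds and parses an object URL. It is not the mover's actual code: the function names are hypothetical, and it assumes that xx and yy are the first and second pairs of characters of the UUID, which the text above does not state explicitly.

```python
import uuid
from urllib.parse import urlparse


def object_key(prefix: str, file_id: str) -> str:
    """Return the bucket key <prefix>/objects/xx/yy/UUID for a file.

    Assumption: xx and yy are the first two character pairs of the UUID.
    """
    parts = (prefix.strip("/"), "objects", file_id[0:2], file_id[2:4], file_id)
    # Skip the prefix component when it is empty.
    return "/".join(p for p in parts if p)


def object_url(bucket: str, prefix: str, file_id: str) -> str:
    """Return the s3:// URL that would be handed back to the Agent."""
    return f"s3://{bucket}/{object_key(prefix, file_id)}"


def parse_object_url(url: str) -> tuple[str, str]:
    """Split an s3:// URL into (bucket, key), e.g. before a restore."""
    parsed = urlparse(url)
    if parsed.scheme != "s3":
        raise ValueError(f"not an s3 URL: {url}")
    return parsed.netloc, parsed.path.lstrip("/")


if __name__ == "__main__":
    # Hypothetical bucket and prefix names, for illustration only.
    url = object_url("mybucket", "archive1", str(uuid.uuid4()))
    print(url)
    print(parse_object_url(url))
```

Keeping the bucket and key recoverable from the URL alone means a restore or an import needs no lookup beyond the URL stored with the file.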
Storing Large Files in S3
AWS S3 limits a single object to 5TB (TODO: check limits for other S3 implementations), so larger files will need to be sharded across multiple S3 objects. The shard UUIDs and their respective ranges would be stored as metadata in a container object, and this object would be returned to the agent as the key. To distinguish container objects from normal objects, the scheme in the URL could be changed to s3c, e.g.:
s3c://<bucketName>/<prefix>/objects/xx/yy/UUID
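The sharding proposal could be sketched in Python as follows. This is only an illustration under stated assumptions: the container format (a JSON document with "size" and "shards" fields), the function names, and the use of offset/length pairs to describe ranges are all hypothetical, since the design above does not fix them. The size limit is passed in as a parameter because it may differ between S3 implementations.

```python
import json
import uuid


def plan_shards(file_size: int, max_object_size: int) -> list[dict]:
    """Split a file of file_size bytes into contiguous shards.

    Each shard gets its own UUID and covers at most max_object_size bytes.
    """
    if file_size < 0 or max_object_size <= 0:
        raise ValueError("file_size must be >= 0 and max_object_size > 0")
    shards = []
    offset = 0
    while offset < file_size:
        length = min(max_object_size, file_size - offset)
        shards.append({"uuid": str(uuid.uuid4()), "offset": offset, "length": length})
        offset += length
    return shards


def container_body(file_size: int, shards: list[dict]) -> str:
    """Serialize the shard list as the body of a container object.

    Assumption: JSON is used here only for readability.
    """
    return json.dumps({"size": file_size, "shards": shards}, indent=2)


def archive_scheme(file_size: int, max_object_size: int) -> str:
    """Pick the URL scheme: s3c for sharded files, s3 otherwise."""
    return "s3c" if file_size > max_object_size else "s3"


if __name__ == "__main__":
    # Small numbers stand in for real sizes: a 25-byte file, 10-byte limit.
    shards = plan_shards(25, 10)
    print(container_body(25, shards))
    print(archive_scheme(25, 10))
```

With this layout, a restore would fetch the container object first and then read each shard into its recorded offset, so shards could be transferred independently of one another.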