...

The S3 mover can also be used to import an existing data set from S3 into Lustre. To do this, the “lhsm import” command replicates the path names from S3 into the Lustre filesystem and adds the S3 URL to each file so that the mover can retrieve the data.

Storing Large Files in S3 (Proposed)

AWS S3 limits the size of a single object to 5 TB (TODO: check limits for other S3 implementations), so larger files will need to be sharded over multiple S3 objects. The shard UUIDs and their respective ranges would be stored in a container metadata object, and the name of this object would be returned to the agent as a key. To distinguish container "shard metadata" objects from normal objects, the type in the URL could be changed to s3c, e.g.:

...

name includes a ".shm" extension after the UUID.

s3://<bucketName>/<prefix>/o/UUID.shm
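
The proposed sharding scheme could be sketched roughly as follows. This is only an illustration, not the mover's implementation: the function names (plan_shards, container_metadata, container_url), the JSON layout of the container metadata object, and the use of random UUIDs as shard names are all assumptions, since the proposal does not yet define them. The shard size is passed in as a parameter because the limit may differ between S3 implementations.

```python
import json
import uuid
from dataclasses import dataclass, asdict


@dataclass
class Shard:
    key: str     # hypothetical shard object name (a UUID)
    offset: int  # first byte of the file stored in this shard
    length: int  # number of bytes stored in this shard


def plan_shards(file_size, max_shard_size):
    """Split a file of file_size bytes into consecutive ranges no larger
    than max_shard_size bytes each."""
    if file_size < 0:
        raise ValueError("file_size must not be negative")
    if max_shard_size <= 0:
        raise ValueError("max_shard_size must be positive")
    shards = []
    offset = 0
    while offset < file_size:
        length = min(max_shard_size, file_size - offset)
        shards.append(Shard(key=str(uuid.uuid4()), offset=offset, length=length))
        offset += length
    return shards


def container_metadata(file_size, shards):
    """Serialize the shard list as the body of the container metadata
    object (the JSON layout here is an assumption)."""
    return json.dumps(
        {"size": file_size, "shards": [asdict(s) for s in shards]},
        indent=2,
    )


def container_url(bucket, prefix, container_uuid):
    """Build the key returned to the agent, using the ".shm" naming
    shown above."""
    return f"s3://{bucket}/{prefix}/o/{container_uuid}.shm"
```

For example, plan_shards(10, 4) yields three shards covering bytes 0-3, 4-7 and 8-9. Storing the offset and length of every shard explicitly means a reader can locate the shard for any range from the metadata object alone, at the cost of a slightly larger metadata object.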