...

The current implementation of the Lustre POSIX copytool (lhsmtool_posix) is intended for demonstration purposes, not production use. It includes several interesting features that are not fully realized for production:

  • Creating a shadow namespace. The copytool saves the name of each archived file in a “shadow tree,” but there is no mechanism to keep these names in sync with changes made in the filesystem. In the case of hard links, only one name is saved.

  • File striping data is copied and stored with the object in the archive, and is used when the file is restored. However, there is no mechanism to allow the user to change the striping data before the file is restored.

  • The copytool uses FIDs to identify objects in the archive. FIDs are not globally unique identifiers and, like inode numbers, are not intended to be used to identify files outside of the filesystem.

  • The Lustre HSM interface and data movement functionality are tightly coupled in the code, with no abstraction for different data movers.

  • The copytool produces overly verbose logging, and does not capture performance metrics.

...

This design divides a "copytool" into two separate components: a low-level Agent and backend-specific data movers. In this approach, the data movers do not interact directly with Lustre. Each mover registers with the locally running agent, and the agent delivers actions for the mover to perform. The agent manages incoming requests from the HSM coordinator and forwards state updates back to the coordinator. The agent also stores the key provided by the mover for each archived file, and then provides this key in further actions on that file.

The movers are standalone processes and communicate with the agent using gRPC, an RPC framework built on Google's Protocol Buffers. When the agent starts, it starts the configured movers automatically.
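The split can be illustrated without assuming anything about the wire protocol. The following Go sketch models the agent/mover boundary as a plain interface; in the actual design this boundary is the gRPC service described below, and every name in the sketch (Action, Mover, dispatch) is illustrative rather than part of the real API.

Code Block
// Illustrative sketch of the agent/mover split; the real boundary is
// a gRPC service, not a Go interface, and these names are assumptions.
package mover

import "fmt"

// Action carries the fields a mover needs for one operation.
type Action struct {
	ID          uint64
	Op          string // "ARCHIVE", "RESTORE", "REMOVE", ...
	PrimaryPath string // path to read file data or metadata from
	WritePath   string // path to write restored data to
	Offset      uint64
	Length      uint64
	FileID      []byte // backend key returned by an earlier archive
}

// Mover is what a backend-specific data mover implements. It never
// talks to Lustre or the HSM coordinator directly.
type Mover interface {
	Archive(a Action) (fileID []byte, err error)
	Restore(a Action) error
	Remove(a Action) error
}

// dispatch is the agent-side loop: actions arriving from the HSM
// coordinator are handed to the registered mover, and the returned
// key is stored so it can be supplied with later actions on the file.
func dispatch(m Mover, actions <-chan Action) {
	for a := range actions {
		switch a.Op {
		case "ARCHIVE":
			key, err := m.Archive(a)
			fmt.Printf("action %d archived, key=%x err=%v\n", a.ID, key, err)
		case "RESTORE":
			fmt.Printf("action %d restored, err=%v\n", a.ID, m.Restore(a))
		case "REMOVE":
			fmt.Printf("action %d removed, err=%v\n", a.ID, m.Remove(a))
		}
	}
}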

...

GetActions(Handle) returns (stream ActionItem)

 

Code Block
message ActionItem {
    uint64 id = 1; // Unique identifier for this action, must be used in status messages
    Command op = 2; 
    string primary_path = 3; // Path to primary file (for metadata or reading)
    string write_path = 4; // Path for writing data (for restore)
    uint64 offset = 5; // Start IO at offset
    uint64 length = 6; // Number of bytes to copy
    bytes file_id = 7; // Archive ID of file (provided with Restore command)
    bytes data = 8; // Arbitrary data passed to action. Data Mover specific.
}

...

The op can be one of these Commands.

ARCHIVE

An Archive command indicates the mover must copy length bytes starting at offset from the primary_path to the backend. An identifier used to refer to the data stored in the backend can be returned as the file_id in the final ActionStatus message.
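As a concrete illustration, a POSIX-backend mover could satisfy an Archive action roughly as follows. This is a sketch only: the archive layout, the use of the google/uuid package, and the function name are assumptions, not part of the design.

Code Block
// Sketch of ARCHIVE semantics: copy `length` bytes starting at `offset`
// from primaryPath into a new backend object, and return the identifier
// that would be reported as file_id in the final ActionStatus message.
// The flat archiveRoot layout and the uuid dependency are assumptions.
package mover

import (
	"io"
	"os"
	"path/filepath"

	"github.com/google/uuid"
)

func archive(primaryPath, archiveRoot string, offset, length uint64) (string, error) {
	src, err := os.Open(primaryPath)
	if err != nil {
		return "", err
	}
	defer src.Close()

	// Name the backend object with a fresh UUID.
	id := uuid.New().String()
	dst, err := os.Create(filepath.Join(archiveRoot, id))
	if err != nil {
		return "", err
	}
	defer dst.Close()

	// Copy exactly the requested byte range.
	if _, err := src.Seek(int64(offset), io.SeekStart); err != nil {
		return "", err
	}
	if _, err := io.CopyN(dst, src, int64(length)); err != nil && err != io.EOF {
		return "", err
	}
	return id, nil // returned to the agent as file_id
}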

...

Code Block
xx = UUID[0:2]
yy = UUID[2:4]

 

 

Once the archive is complete, the mover returns the UUID to the agent. The UUID is used during the restore operation to locate the data object in the archive directory.
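Spelled out as code, this fan-out keeps any single directory in the archive from accumulating too many entries. Only the xx/yy slicing comes from this document; the root path and function name below are examples.

Code Block
// objectPath maps an object UUID to its location in the archive:
// <root>/xx/yy/UUID, where xx and yy are the first and second pairs
// of characters of the UUID string.
package mover

import "path/filepath"

func objectPath(root, id string) string {
	xx := id[0:2] // first two characters of the UUID
	yy := id[2:4] // next two characters
	return filepath.Join(root, xx, yy, id)
}

For a UUID beginning with "d3b0", for example, the object would land under <root>/d3/b0/.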

...

s3://<bucketName>/<prefix>/objects/xx/yy/UUID

The S3 mover can also be used to import an existing data set from S3 into Lustre. To do this, the “lhsm import” command is used to replicate the path names from S3 into the Lustre filesystem; the import also adds the S3 URL to each file so the mover can retrieve the data.
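On restore, the mover only needs to turn that stored URL back into an object fetch. A minimal sketch using the AWS SDK for Go (v1) follows; the function name, the error handling, and the assumption that the whole object is copied in one pass are all illustrative.

Code Block
// restoreFromURL fetches the object named by an s3://<bucket>/<key> URL
// (such as the one attached to an imported file) and writes its contents
// to writePath. Sketch only; not the project's actual mover code.
package mover

import (
	"io"
	"net/url"
	"os"
	"strings"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3"
)

func restoreFromURL(objectURL, writePath string) error {
	u, err := url.Parse(objectURL)
	if err != nil {
		return err
	}
	bucket := u.Host
	key := strings.TrimPrefix(u.Path, "/")

	svc := s3.New(session.Must(session.NewSession()))
	obj, err := svc.GetObject(&s3.GetObjectInput{
		Bucket: aws.String(bucket),
		Key:    aws.String(key),
	})
	if err != nil {
		return err
	}
	defer obj.Body.Close()

	dst, err := os.Create(writePath)
	if err != nil {
		return err
	}
	defer dst.Close()

	_, err = io.Copy(dst, obj.Body)
	return err
}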

Storing Large Files in S3 (Proposed)

There is a 5 TB object size limit in AWS S3 (TODO: check limits for other S3 implementations), so larger files will need to be sharded over multiple S3 objects. The shard UUIDs and their respective byte ranges would be stored in a container metadata object, and the name of this object would be returned to the agent as the key. To distinguish container "shard metadata" objects from normal objects, the type in the URL could be changed to s3c, e.g.:
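One possible shape for that container object is sketched below: a small manifest listing each shard's UUID and byte range, serialized here as JSON. The field names, the JSON encoding, and the inclusion of the total file size are assumptions about a proposal, not a settled format.

Code Block
// Sketch of a "shard metadata" container object for a file larger than
// the single-object limit. Its serialized form would be written to the
// archive (e.g. as the UUID.shm object mentioned below) and its name
// returned to the agent as the file's key. Layout is hypothetical.
package mover

import "encoding/json"

// Shard records where one piece of the original file is stored.
type Shard struct {
	UUID   string `json:"uuid"`   // S3 object holding this piece
	Offset uint64 `json:"offset"` // byte offset within the original file
	Length uint64 `json:"length"` // number of bytes in this shard
}

// Manifest is the container metadata object for a sharded file.
type Manifest struct {
	FileSize uint64  `json:"file_size"`
	Shards   []Shard `json:"shards"`
}

// encodeManifest produces the bytes to store in the container object.
func encodeManifest(m Manifest) ([]byte, error) {
	return json.MarshalIndent(m, "", "  ")
}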

...

name includes a ".shm" extension after the UUID.

s3://<bucketName>/<prefix>/

...

o/

...

UUID.shm