...
An administrator can enable or disable the Trash Can feature on a specified MDT via:
lctl set_param mdd.*.enable_trash_can_enable
An administrator can enable or disable the Trash Can feature on a specified directory or file via the file flag FS_UNRM_FL. All files under a directory flagged with FS_UNRM_FL inherit this flag:
# lfs trash set $file|$dir
# lfs trash clear $file|$dir
Move a deleted file into the Trash Can. Deleting a regular file while the Trash Can is enabled will mark it with FS_UNRM_FL upon its last unlink and move the file into the Trash Can directory "ROOT/.lustre/Trash/MDTxxxx/UID/pFID". A "trusted.unrm" XATTR is then set on the deleted file in the Trash Can. The XATTR contains the following information:
struct ll_trash_xattr {
	__u32 ltx_flags;	/* for future usage */
	__u32 ltx_uid;		/* original UID of the deleted file, used to restore on unrm */
	__u32 ltx_gid;		/* original GID of the deleted file, used to restore on unrm */
	__u32 ltx_projid;	/* original PROJID of the deleted file, used to restore on unrm */
	__u64 ltx_timestamp;	/* time the file moved into the Trash Can, or we could use ctime here? */
};
Where ltx_uid
/ltx_gid
/ltx_projid
are the original UID/GID/PROJID of the deleted file, mainly used for quota accounting for the restore operation; ltx_timestamp is the time that the file was moved into the Trash Can. It is used to determine whether the file has exceeded the specified retention period and should therefore be permanently removed from the Trash Can (maybe we could also use the inode ctime for this purpose instead of storing a separate timestamp?). While deleting the file, the full path information can be obtained in a way similar to fid2path().
...
Internally, the lfs trash list command looks up the FID and MDT of the current directory, or of the directory specified by DIR, and then lists the respective directory under $MOUNT/.lustre/trash/MDTxxxx/pFID/, where any files deleted from this directory would be moved (or the directory file descriptor returned via llapi_recycle_fid_get(MNTPT, mdt) if the .lustre/trash directory is not available). This is mainly for debugging, since users will generally use the virtual .Trash directory to interact with the Trash Can and restore files.
Deleting a file or directory in the Trash Can will remove the temporary file under "ROOT/.lustre/Trash" and free the data space on the Lustre OSTs permanently:
...
# lfs trash clear DIR ...
Restore a file in the Trash Can on a given MDT. This restores the file and its content according to the saved full path and then deletes the stub in the Trash Can.
...
The MDS is already monitoring the OST fullness every 5s to make object allocation decisions, so it can also make decisions about files to delete. Therefore, the MDT can periodically monitor the space usage of the trash user (quota) and of the entire file system and, taking into account the retention period and the deletion timestamp of the files, choose candidates to be deleted permanently to free up space.
Also, there needs to be some accounting of files in the Trash Can, so that "df" does not show the filesystem as 100% or 90% full all the time, but rather shows only the non-trash space usage (= real usage - trash usage).
...
If the same filename is repeatedly created and deleted within the same parent directory, then the deleted files will have conflicts when moved into the pFID
directory in the Trash Can. To disambiguate the files in the Trash Can, a timestamp should be appended to each conflicting filename, like filename.2025-04-03-00:11:24
, possibly adding .microseconds
if there is still a conflict. It isn't totally clear whether it would be better to use the timestamp from when the file was deleted, or when the file was created. Both have some value to help users distinguish between the different versions.
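A minimal sketch of the naming scheme described above follows. It assumes the timestamp is rendered in UTC, which the design does not specify, and trash_conflict_name() is a hypothetical helper name. The caller decides whether the deletion or creation time is passed in, since that question is still open.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper: build "<name>.YYYY-MM-DD-HH:MM:SS", optionally
 * followed by ".<microseconds>" when a conflict remains.
 * Returns 0 on success, -1 if the buffer is too small or the time
 * cannot be converted.
 */
int trash_conflict_name(char *buf, size_t len, const char *name,
			time_t sec, long usec, int with_usec)
{
	char stamp[32];
	struct tm *tm;
	int n;

	assert(buf != NULL && name != NULL);

	/* gmtime() is not thread-safe; threaded code would use gmtime_r(). */
	tm = gmtime(&sec);
	if (tm == NULL)
		return -1;
	if (strftime(stamp, sizeof(stamp), "%Y-%m-%d-%H:%M:%S", tm) == 0)
		return -1;

	if (with_usec)
		n = snprintf(buf, len, "%s.%s.%06ld", name, stamp, usec);
	else
		n = snprintf(buf, len, "%s.%s", name, stamp);

	/* snprintf() reports truncation by returning the untruncated length. */
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A caller would first try the plain filename, then the timestamped form, and only fall back to the microsecond suffix if that name also exists in the pFID directory.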
...
...
desirable to impose an upper limit on the number of versions that will be saved in the trash can.
Avoid preserving temporary files
Files that only exist for a very short time (e.g. temporary files) should not necessarily be preserved in the Trash Can, or they can quickly overwhelm the available capacity of the filesystem, and result in important files being purged from trash and/or filling the trash faster than files can be cleaned up. Files marked with the I_LINKABLE
flag on the MDS (from O_TMPFILE
or Lustre Volatile files, see LU-18844) should not be preserved in the Trash Can. It would be useful to have a tunable parameter that sets a minimum age for files to be preserved in the Trash Can (e.g. 65 minutes?) so that files that are frequently created and deleted are not preserved since they could consume a considerable amount of space.
JobID of process deleting a file
In LU-13031 the JobID of the process that first creates a file is stored in the user.job
xattr on the MDT inode, for diagnostic purposes and to allow determining provenance of each file later on. For the Trash Can, LU-17648 describes storing the JobID of the process that is deleting the file, for diagnostics such as determining rogue processes that are deleting files in the filesystem. Something like user.del
would be a reasonable default xattr name. The actual xattr name can be configured with the mdt.*.job_xattr_del
parameter.
.Trash Virtual Directory Support
.Trash virtual directory
It would be useful to implement a virtual ".Trash" subdirectory accessible in each directory in the filesystem that can be used to browse files/directories in the Trash Can and access them for recovery.
The FID of the ".Trash" directory is derived from the FID of the parent directory (pFID) by looking up the corresponding FID-named "stub" directory: ".lustre/trash/MDTxxxx/UID/pFID". Essentially, ".Trash" under each normal directory is just a virtual shortcut to the stub directory (if the parent is not a striped directory) that is accessible in each directory when specified by the name ".Trash". The files and directories under the ".Trash" directory can be accessed via the normal POSIX file system API, such as readdir()/stat()/getxattr(), so that normal tools such as "ls -l .Trash/" or "find .Trash" can be used to locate files for restoration or permanent removal. If there are no deleted files under a specific directory, then the virtual .Trash directory will not be accessible, and any lookup will return -ENOENT.
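From an application's point of view, browsing the virtual directory would use only standard POSIX calls. The sketch below assumes the behaviour proposed above (a missing .Trash simply means nothing was deleted from that directory); trash_count_entries() is a hypothetical helper name, not part of any Lustre API.

```c
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: count the entries under "<dir>/.Trash".
 * Returns the number of entries, or a negative errno value; -ENOENT
 * is expected when no files have been deleted from <dir>.
 */
int trash_count_entries(const char *dir)
{
	char path[4096];
	struct dirent *de;
	DIR *d;
	int count = 0;
	int n;

	assert(dir != NULL);

	n = snprintf(path, sizeof(path), "%s/.Trash", dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -ENAMETOOLONG;

	d = opendir(path);
	if (d == NULL)
		return -errno;

	while ((de = readdir(d)) != NULL) {
		/* Skip the self and parent entries. */
		if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
			continue;
		count++;
	}
	closedir(d);

	return count;
}
```

Callers should treat -ENOENT as "the trash is empty for this directory" rather than as a failure.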
.Trash pFID name lookup
The FID-based names of the stub directories, stored as .lustre/trash/MDTxxxx/UID/pFID, are needed for efficient lookup of the parent FID during unlink. However, these directory names are not very user-friendly when browsing the virtual .Trash
directory in the filesystem namespace. Rather than showing the pFID
name to users during readdir()
calls (ls
, find
, etc.) it would be better to look up the actual parent directory name via the FID→trusted.link
xattr on the parent and return this to clients. The FID number of the directory entry would be the FID of the stub directory itself, not the pFID that is used internally for identification. While copying the trusted.link
xattr over to the stub directory at creation would simplify this lookup, there is some risk that the name in the trusted.link
xattr would become stale if the parent directory is renamed. On the other hand, this may also be useful to preserve the original name of the directory in case some automated tool is renaming the original directory to a temporary name before deletion?
.Trash striped directory
For a striped directory, its ".Trash" directory is also a virtual striped directory with each stripe at the same location (MDT), where the shard FID is the FID of the corresponding stub directory on that MDT. If the stub directory on a certain MDT does not exist (or has not been created yet), the client lookup() or readdir() under the ".Trash" directory should skip that stripe. The master FID of the virtual ".Trash" directory could be the same as the FID of the parent directory, but with f_ver set to 1 (FID_VERSION_TRASH = 1) to distinguish them.
To avoid inconsistency, each access to the virtual striped ".Trash" directory should check and revalidate the virtual stripe LMV EA. For example, it should add new shards to the stripe EA after a new stub directory is created on a certain MDT.
.Trash directory migration
The case where a directory is restriped and its LMV layout changes must be handled. In this case, the files under the directory will be migrated to another MDT. To simplify the implementation, files in the Trash Can are not migrated according to the new LMV layout. As a result, a lookup() operation may be issued to the wrong MDT and incorrectly return -ENOENT (after files in the Trash Can are restored). However, readdir() operations will still return all of the directory entries in the striped trash even if the parent LMV layout was restriped, since the parent directory FID (pFID) remains the same as before restriping. Files restored from the Trash Can may need to be migrated to the appropriate shard according to their name hash once the LMV layout has changed.
Orphans in Trash Can
An orphan file is a file that is still open (not yet closed) by some user. Upon its last unlink, it can be moved directly into the Trash Can and marked with LUSTRE_ORPHAN_FL | LUSTRE_TRASH_FL. The orphan file cannot be permanently deleted from the Trash Can until its last close().
...