The Lustre code available from the Whamcloud git repository contains tools to test a Lustre installation. Since the creation of Lustre in 2001, these tools have matured and multiplied. To date, three different test suites are available.
Test suite overview
This document assumes that you have a Linux kernel compiled with the Lustre patches. Typical routes to getting a working Lustre kernel include:
- Downloading a pre-built kernel from a provider.
- Applying the Lustre patches and building your own kernel.
Details on both of these routes are provided on the wiki page: Putting together a Lustre filesystem.
Pre-requisites
The instructions on this page assume that you have the Lustre test suite installed. You can get this from source at http://git.whamcloud.com or as RPMs from the server builds at build.whamcloud.com.
llmount.sh
One of the simplest test suites consists of llmount.sh and llmountcleanup.sh. llmount.sh uses a collection of bash scripts to create a Lustre file system complete with MDS, MDT, OSS, OST and client using loop devices on a single machine. llmountcleanup.sh tears down the work llmount.sh performed and should return your system to normal.
Once llmount.sh has completed successfully you should see the following:
[root@client-10 ~]# /build/lustre-release/lustre/tests/llmount.sh
Stopping clients: client-10.lab.whamcloud.com /mnt/lustre (opts:)
Stopping clients: client-10.lab.whamcloud.com /mnt/lustre2 (opts:)
Loading modules from /build/lustre-release/lustre/tests/..
lnet.debug=0x33f1504
lnet.subsystem_debug=0xffb7e3ff
lnet options: 'networks=tcp0 accept=all'
Formatting mgs, mds, osts
Checking servers environments
Checking clients client-10.lab.whamcloud.com environments
Setup mgs, mdt, osts
Starting mds: -o loop /tmp/lustre-mdt /mnt/mds
lnet.debug=0x33f1504
lnet.subsystem_debug=0xffb7e3ff
lnet.debug_mb=256
Started lustre-MDT0000
Starting ost1: -o loop /tmp/lustre-ost1 /mnt/ost1
lnet.debug=0x33f1504
lnet.subsystem_debug=0xffb7e3ff
lnet.debug_mb=256
Started lustre-OST0000
Starting ost2: -o loop /tmp/lustre-ost2 /mnt/ost2
lnet.debug=0x33f1504
lnet.subsystem_debug=0xffb7e3ff
lnet.debug_mb=256
Started lustre-OST0001
Starting client: client-10.lab.whamcloud.com: -o user_xattr,acl,flock client-10.lab.whamcloud.com@tcp:/lustre /mnt/lustre
lnet.debug=0x33f1504
lnet.subsystem_debug=0xffb7e3ff
lnet.debug_mb=256
Using TIMEOUT=20
[root@client-10 ~]#
Configuring llmount.sh
llmount.sh takes its configuration from environment variables. If you want to override these values, you can copy the default values from /usr/lib64/lustre/tests/cfg/local.sh to a local file, modify your copy of local.sh, and then ensure the system-wide llmount.sh first sources your local.sh.
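The override step can be sketched as follows. This is a minimal illustration, assuming the defaults in local.sh are written with the shell `${VAR:-default}` idiom so that a value already present in the environment wins. The function name load_defaults and the two default values shown are placeholders for illustration, not a copy of the shipped file.

```shell
#!/bin/bash
# Hypothetical stand-in for a private copy of cfg/local.sh.
# Assumption: each default only applies when the variable is not already set.
load_defaults() {
    OSTCOUNT=${OSTCOUNT:-2}     # illustrative default: two OSTs
    NETTYPE=${NETTYPE:-tcp}     # illustrative default: tcp network
    export OSTCOUNT NETTYPE
}

# Usage sketch: a value exported before the defaults are loaded takes precedence.
#   export OSTCOUNT=4
#   load_defaults
#   echo "$OSTCOUNT"
```

Writing defaults this way means the copy never needs editing for one-off changes; exporting the variable before running llmount.sh is enough.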
Troubleshooting llmount.sh
llmount.sh falls over complaining ...
This error typically indicates a problem with an InfiniBand (IB) network. Even though llmount.sh does not connect to any external machines, the IB network must be working correctly. It is possible to switch to tcp for the purposes of running llmount.sh: select the network using export NETTYPE=tcp, and check that LNet is configured to use tcp in /etc/modules.conf. More details on LNet are available in the manual.
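As a rough sketch of the check described above, the helper below prints any lnet options lines found in a module configuration file, so you can confirm that tcp is listed. The function name lnet_options is made up for this example, and the `options lnet networks=...` line format is assumed from the usual modprobe syntax. The file path is the one named above and may differ on your distribution.

```shell
#!/bin/bash
# Print the "options lnet ..." lines from a modprobe-style config file.
# Returns non-zero when the file contains no such line.
lnet_options() {
    local conf=${1:-/etc/modules.conf}
    grep -E '^[[:space:]]*options[[:space:]]+lnet[[:space:]]' "$conf"
}

# Usage sketch:
#   export NETTYPE=tcp
#   lnet_options /etc/modules.conf | grep -q 'tcp' || echo "LNet is not set to tcp"
```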
llmount.sh complains that a value is undefined
Before you run llmount.sh it is necessary to set the debug size environment variable: export DEBUG_SIZE=256. Setting DEBUG_SIZE to this value ensures enough space is allocated for logs for all the CPUs in the system. If DEBUG_SIZE is too small, the parameter setting will complain during llmount.sh.
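A small guard along these lines avoids the undefined-value complaint. It is only a sketch: the function name ensure_debug_size is hypothetical, and 256 is the value suggested above rather than a computed requirement.

```shell
#!/bin/bash
# Set DEBUG_SIZE to the suggested value unless the caller already chose one.
ensure_debug_size() {
    : "${DEBUG_SIZE:=256}"
    export DEBUG_SIZE
}

# Usage sketch:
#   ensure_debug_size
#   /build/lustre-release/lustre/tests/llmount.sh
```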
You will now have a Lustre filesystem available to you in user space at /mnt/lustre/. You can test this by setting the stripe count to -1 (stripe across all OSTs) and writing a big file:
[root@client-10 ~]# lfs setstripe -c -1 /mnt/lustre
[root@client-10 ~]# lfs getstripe /mnt/lustre/
/mnt/lustre/
stripe_count: -1 stripe_size: 0 stripe_offset: -1
[root@client-10 ~]# dd if=/dev/zero of=/mnt/lustre/file.out bs=1MB count=400
400+0 records in
400+0 records out
400000000 bytes (400 MB) copied, 2.33261 seconds, 171 MB/s
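The write check can also be scripted. The sketch below is a convenience built on assumptions, not part of the Lustre test suite: write_test_file is a hypothetical helper that writes zeros with dd and prints the resulting file size in bytes, so the size can be compared with what was requested. It relies on GNU dd and GNU stat.

```shell
#!/bin/bash
# Write COUNT blocks of 1 MB (10^6 bytes, GNU dd's "MB" suffix) of zeros
# into DIR/file.out and print the size of the resulting file in bytes.
write_test_file() {
    local dir=$1 count=${2:-400}
    dd if=/dev/zero of="$dir/file.out" bs=1MB count="$count" 2>/dev/null || return 1
    stat -c %s "$dir/file.out"
}

# Usage sketch on the mounted file system:
#   lfs setstripe -c -1 /mnt/lustre
#   write_test_file /mnt/lustre 400
```

Taking the directory as a parameter lets the helper be tried against a scratch directory before pointing it at /mnt/lustre.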
Clean up after the tests:
/build/lustre-release/lustre/tests/llmountcleanup.sh
auster
Auster is a large suite of functional tests for Lustre. It provides very good coverage of Lustre functionality. Help is available from the command line:
$ /usr/lib64/lustre/tests/auster -h
Usage auster [options] suite [suite optoins] [suite [suite options]]
Run Lustre regression tests suites.
  -c CONFIG      Test environment config file
  -d LOGDIR      Top level directory for logs
  -D FULLLOGDIR  Full directory for logs
  -f STR         Config name (cfg/<name>.sh)
  -g GROUP       Test group file (Overrides tests listed on command line)
  -S TESTSUITE   First test suite to run allows for restarts
  -i N           Repeat tests N times (default 1). A new directory will be created under LOGDIR for each iteration.
  -k             Don't stop when subtests fail
  -R             Remount lustre between tests
  -r             Reformat (during initial configuration if needed)
  -s             SLOW=yes
  -v             Verbose mode
  -l             Send logs to the Maloo database after run (can be done later by running maloo_upload.sh)
  -h             This help.
Suite options
These are suite specific options that can be specified after each suite on the command line.
suite-name [options]
  --only LIST         Run only specific list of subtests
  --except LIST       Skip list of subtests
  --start-at SUBTEST  Start testing from subtest
  --stop-at SUBTEST   Stop testing at subtest
  --time-limit LIMIT  Don't allow this suite to run longer than LIMT seconds. [UNIMPLEMENTED]
Example usage:
Run all of sanity and all of replay-single except for 70b with SLOW=y using the default "local" configuration.
  auster -s sanity replay-single --except 70b
Run all tests in the regression group 5 times using large config.
  auster -f large -g test-groups/regression -r 5
Run tests using auster script
- Single node
# cd /usr/lib[64]/lustre/tests
# ./auster -rv runtests
Note: This is a very simple setup; not all tests can be run in this configuration.
- Multiple nodes
# cd /usr/lib[64]/lustre/tests
Edit cfg/local.sh
Minimum required variables: mds_HOST, ost_HOST, PDSH, MDSDEV (MDSDEV1 if Lustre 2.x), OSTCOUNT, OSTDEV#, MDS_MOUNT_OPTS, OST_MOUNT_OPTS
See Lustre Test Tools Environment Variable for more information
Make sure partitions on the disks are set up
If using real devices, make sure to set MDS_MOUNT_OPTS, OST_MOUNT_OPTS = ""
Edit cfg/ncli.sh if there is more than one client
Set RCLIENTS=<list of remote clients>
# ./auster -rvf ncli runtests (or any other test suite)
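For orientation, a multi-node cfg/local.sh might contain assignments like the following. Every value here is a placeholder assumption (host names, device paths, the pdsh invocation); only the variable names come from the list above. Substitute the hosts and devices of your own cluster.

```shell
# Hypothetical cfg/local.sh fragment for a setup with two OSTs.
# All values are placeholders; only the variable names are taken from the text.
mds_HOST=mds-node                 # placeholder host running the MDS
ost_HOST=oss-node                 # placeholder host running the OSS
PDSH="pdsh -S -Rssh -w"           # assumed pdsh invocation for remote commands
MDSDEV1=/dev/sdb                  # placeholder MDT device (MDSDEV before Lustre 2.x)
OSTCOUNT=2
OSTDEV1=/dev/sdc                  # placeholder OST devices, one per OST
OSTDEV2=/dev/sdd
MDS_MOUNT_OPTS=""                 # left empty when using real devices
OST_MOUNT_OPTS=""

# In cfg/ncli.sh, only when there is more than one client:
RCLIENTS="client-2 client-3"      # placeholder list of remote clients
```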
Test logs will be in /tmp/test_logs/YYYY-MM-DD
Subsequent runs do not need to reformat the filesystem, so the -r option can be omitted.
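The two notes above can be combined in a small wrapper. This is a sketch under stated assumptions: auster_flags and todays_log_dir are hypothetical helper names, and the log path simply follows the /tmp/test_logs/YYYY-MM-DD pattern given above.

```shell
#!/bin/bash
# Choose auster flags: reformat (-r) only on the first run, always verbose (-v).
auster_flags() {
    if [ "${1:-}" = "first" ]; then
        echo "-rv"
    else
        echo "-v"
    fi
}

# Directory where today's logs are expected, following the pattern above.
todays_log_dir() {
    echo "/tmp/test_logs/$(date +%Y-%m-%d)"
}

# Usage sketch:
#   cd /usr/lib64/lustre/tests
#   ./auster $(auster_flags first) runtests
#   ls "$(todays_log_dir)"
```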