The Lustre code available from the Whamcloud git repository includes tools for testing a Lustre installation. Since Lustre was created in 2001, these tools have matured and multiplied. To date, three different test suites are available.

Test suite overview

This document assumes that you have a Linux kernel compiled with the Lustre patches. Typical routes to a working Lustre kernel include:

  • Downloading a pre-built kernel from a provider.
  • Applying the Lustre patches and building your own kernel.
    Details on both of these routes are provided on the wiki page: Putting together a Lustre filesystem.

llmount.sh

One of the simplest test suites consists of llmount.sh and llmountcleanup.sh. llmount.sh uses a collection of bash scripts to create a Lustre file system complete with MDS, MDT, OSS, OST and Client using loop devices on a single machine. llmountcleanup.sh tears down the work llmount.sh performed and should return your system to normal.

Once llmount.sh has completed successfully, you should see output similar to the following:

[root@client-10 ~]# /build/lustre-release/lustre/tests/llmount.sh 
Stopping clients: client-10.lab.whamcloud.com /mnt/lustre (opts:)
Stopping clients: client-10.lab.whamcloud.com /mnt/lustre2 (opts:)
Loading modules from /build/lustre-release/lustre/tests/..
lnet.debug=0x33f1504
lnet.subsystem_debug=0xffb7e3ff
lnet options: 'networks=tcp0 accept=all'
Formatting mgs, mds, osts
Checking servers environments
Checking clients client-10.lab.whamcloud.com environments
Setup mgs, mdt, osts
Starting mds: -o loop  /tmp/lustre-mdt /mnt/mds
lnet.debug=0x33f1504
lnet.subsystem_debug=0xffb7e3ff
lnet.debug_mb=256
Started lustre-MDT0000
Starting ost1: -o loop  /tmp/lustre-ost1 /mnt/ost1
lnet.debug=0x33f1504
lnet.subsystem_debug=0xffb7e3ff
lnet.debug_mb=256
Started lustre-OST0000
Starting ost2: -o loop  /tmp/lustre-ost2 /mnt/ost2
lnet.debug=0x33f1504
lnet.subsystem_debug=0xffb7e3ff
lnet.debug_mb=256
Started lustre-OST0001
Starting client: client-10.lab.whamcloud.com: -o user_xattr,acl,flock client-10.lab.whamcloud.com@tcp:/lustre /mnt/lustre
lnet.debug=0x33f1504
lnet.subsystem_debug=0xffb7e3ff
lnet.debug_mb=256
Using TIMEOUT=20
[root@client-10 ~]# 
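
A quick way to confirm the run without reading the whole transcript is to count the "Started lustre-..." lines. The following is a minimal sketch, assuming the output was saved to a log file. The helper name count_started_targets and the /tmp/llmount.log path are illustrative and not part of the Lustre test scripts.

```shell
#!/bin/bash
# Sketch only: count_started_targets is a hypothetical helper, not a Lustre tool.
# Save the transcript first, for example:
#   /build/lustre-release/lustre/tests/llmount.sh 2>&1 | tee /tmp/llmount.log

# Print the number of targets that llmount.sh reported as started.
count_started_targets() {
    grep -c '^Started lustre-' "$1"
}

# LOG is an assumed location; override it to match where you saved the output.
LOG=${LOG:-/tmp/llmount.log}
if [ -r "$LOG" ]; then
    echo "Targets started: $(count_started_targets "$LOG")"
fi
```

For the session shown above the count would be 3: one MDT and two OSTs.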

Troubleshooting llmount.sh

llmount.sh falls over complaining ...

This error typically indicates a problem with an InfiniBand (IB) network. Even though llmount.sh does not connect to any external machines, the IB network must be working correctly. To run llmount.sh over tcp instead, select the network type with export NETTYPE=tcp and check that LNet is configured to use tcp in /etc/modules.conf. More details on LNet are available in the manual LINK TO THE MANUAL.
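
The switch to tcp can be sketched as follows. This is a minimal example, assuming the LNet settings live in an "options lnet networks=..." line as in the sample output; the helper lnet_uses_tcp and the MODCONF variable are hypothetical names introduced here, and the configuration file may be located elsewhere on your distribution.

```shell
#!/bin/bash
# Sketch only: lnet_uses_tcp and MODCONF are illustrative names, not Lustre tools.
MODCONF=${MODCONF:-/etc/modules.conf}

# Return 0 if the given module configuration file contains an
# "options lnet ... networks=..." line whose network list mentions tcp.
lnet_uses_tcp() {
    grep -Eq '^[[:space:]]*options[[:space:]]+lnet[[:space:]].*networks=[^[:space:]]*tcp' "$1" 2>/dev/null
}

# Select the tcp network type for llmount.sh.
export NETTYPE=tcp

if lnet_uses_tcp "$MODCONF"; then
    echo "LNet is configured for tcp in $MODCONF"
else
    echo "No tcp 'options lnet networks=...' line found in $MODCONF" >&2
fi
```

Run llmount.sh from the same shell afterwards so that it inherits NETTYPE.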

llmount.sh complains that a value is undefined

Before you run llmount.sh, set the debug size environment variable: export DEBUG_SIZE=256. This value ensures that enough space is allocated for the logs of all the CPUs in the system. If DEBUG_SIZE is too small, the parameter setting will complain during llmount.sh.
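
A small wrapper can make sure the variable is always set before the script starts. This is only a sketch: the LLMOUNT variable is a hypothetical name, and its default path is taken from the sample session above and may differ on your system.

```shell
#!/bin/bash
# Sketch only: set DEBUG_SIZE to 256 unless the caller already chose a value.
export DEBUG_SIZE=${DEBUG_SIZE:-256}

# LLMOUNT is an assumed location of llmount.sh; adjust it to your build tree.
LLMOUNT=${LLMOUNT:-/build/lustre-release/lustre/tests/llmount.sh}

if [ -x "$LLMOUNT" ]; then
    "$LLMOUNT"
else
    echo "llmount.sh not found at $LLMOUNT (DEBUG_SIZE=$DEBUG_SIZE is exported)" >&2
fi
```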

  1. You now have a Lustre file system available to you at /mnt/lustre/.
  2. You can test this by setting the stripe count to -1 (stripe across all OSTs) and writing a big file:
    [root@client-10 ~]# lfs setstripe -c -1 /mnt/lustre
    [root@client-10 ~]# lfs getstripe /mnt/lustre/
    /mnt/lustre/
    stripe_count:   -1 stripe_size:    0 stripe_offset:  -1
    [root@client-10 ~]# dd if=/dev/zero of=/mnt/lustre/file.out bs=1MB count=400
    400+0 records in
    400+0 records out
    400000000 bytes (400 MB) copied, 2.33261 seconds, 171 MB/s
    
  3. Clean up after the tests:
    /build/lustre-release/lustre/tests/llmountcleanup.sh
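
The checks in steps 1 and 2 can also be scripted and run before the clean-up. The following is a minimal sketch; lustre_mounted_at, file_has_size and the MNT variable are hypothetical names introduced here, and the expected size of 400000000 bytes comes from the dd run shown above.

```shell
#!/bin/bash
# Sketch only: both helpers are hypothetical and not part of the Lustre tests.

# Return 0 if a file system of type "lustre" is mounted at the given mount point.
# The optional second argument is the mounts table to read (default /proc/mounts).
lustre_mounted_at() {
    awk -v mp="$1" '$2 == mp && $3 == "lustre" { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}

# Return 0 if the file has exactly the expected size in bytes.
file_has_size() {
    [ "$(wc -c < "$1")" -eq "$2" ]
}

MNT=${MNT:-/mnt/lustre}
if lustre_mounted_at "$MNT"; then
    file_has_size "$MNT/file.out" 400000000 && echo "file.out has the expected size"
else
    echo "No Lustre file system mounted at $MNT" >&2
fi
```

Note that the mount point is given without a trailing slash, which is how it appears in /proc/mounts.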
    

test-acc.sh

auster

A description of Auster and instructions for using it are available on a separate wiki page LINK TO AUSTER PAGES
