Purpose

Describe the steps needed to build and test a Lustre 1.8 system (MGS, MDT, MDS, OSS, OST, client) on a CentOS 5 machine.

Prerequisite

  • A newly installed CentOS 5 machine with the name: client-10.

Overview

Lustre 1.8 servers require a patched and compiled kernel. Patches are readily available in the Whamcloud git source repository. A test suite is included with the Lustre 1.8 source. This document walks through the steps of patching the kernel, building Lustre and running a basic test of the complete system.

Procedure

The procedure requires an OS that is set up for development, including the Lustre source and kernel headers. Once that is in place, a new kernel can be patched, compiled, run and tested. Building an RPM-based kernel is described in detail on the Lustre.org wiki.

Provision Machine

Once CentOS 5.5 is provisioned on client-10, log in as root.

  1. Install development tools: yum groupinstall "Development Tools"
  2. Install the additional build tools: yum install rpm-build redhat-rpm-config unifdef gnupg quilt git
  3. Create a user build with the home directory /build
    useradd -d /build build
    
  4. Switch to the build user: su build
  5. Change to the build user's home directory, ~build
  6. Get the 1.8 branch from the Whamcloud git account.
    git clone git://git.whamcloud.com/fs/lustre-release.git
    cd lustre-release
    git checkout --track -b b1_8 origin/b1_8
    
  7. Run sh ./autogen.sh
  8. Resolve the outstanding dependencies until autogen.sh completes successfully. Success will look like:
    [root@client-10 lustre-release]# sh ./autogen.sh 
    Checking for a complete tree...
    checking for automake-1.9 >= 1.9... found 1.9.6
    ...
    Running automake-1.9...
    Running autoconf...
    [root@client-10 lustre-release]#
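
Before running autogen.sh it can save time to check that the build tools are actually on the PATH. The following is a minimal sketch, not part of the Lustre tree: missing_tools is a hypothetical helper, and the command names passed to it are an assumption derived from the yum packages above and the autogen.sh output (adjust the list to taste).

```shell
# Print each command from the argument list that is not found on PATH.
# (missing_tools is a hypothetical helper, not a Lustre or CentOS tool.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
    return 0
}

# Assumed command names for the packages installed above.
missing_tools gcc make rpmbuild unifdef gpg quilt git automake autoconf
```

An empty result means every listed command was found; any name printed still needs to be installed with yum.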
    

Prepare the kernel source

This section of the walk-through is taken from http://wiki.centos.org/HowTos/Custom_Kernel

  1. Get the kernel source. First create the directory structure, then get the source from the RPM. Create a .rpmmacros file to install the kernel source in our user dir.
    cd
    mkdir -p kernel/rpmbuild/{BUILD,RPMS,SOURCES,SPECS,SRPMS}
    cd kernel
    echo '%_topdir %(echo $HOME)/kernel/rpmbuild' > ~/.rpmmacros
    
  2. Install the kernel source:
    rpm -i http://mirror.centos.org/centos/5/updates/SRPMS/kernel-2.6.18-194.32.1.el5.src.rpm 2>&1 | grep -v mockb
    
  3. Expand the source. Using rpmbuild will also apply CentOS patches.
    cd ~/kernel/rpmbuild/SPECS
    rpmbuild -bp --target=`uname -m` ./kernel-2.6.spec
    

    This produces a long stream of output, which should end with:
    ...
    + echo 'Patch #20216 (xen-hvm-correct-accuracy-of-pmtimer.patch):'
    Patch #20216 (xen-hvm-correct-accuracy-of-pmtimer.patch):
    + patch -p1 --fuzz=2 -s
    + exit 0
    

At this point, we have the kernel source, with all the CentOS patches applied, in the directory /build/kernel/rpmbuild/BUILD/kernel-2.6.18/linux-2.6.18.x86_64

Patch the kernel source with the Lustre code.

  1. Add a unique build id so we can be certain our kernel is the one that booted. Edit ~build/kernel/rpmbuild/BUILD/kernel-2.6.18/linux-2.6.18.x86_64/Makefile and modify line 4, the EXTRAVERSION, to read:
    EXTRAVERSION = -lustre18
  2. Enter the directory /build/kernel/rpmbuild/BUILD/kernel-2.6.18/linux-2.6.18.x86_64
  3. Overwrite the .config file with /build/lustre-release/lustre/kernel_patches/kernel_configs/kernel-2.6.18-2.6-rhel5-x86_64-smp.config
    cp /build/lustre-release/lustre/kernel_patches/kernel_configs/kernel-2.6.18-2.6-rhel5-x86_64-smp.config ./.config
    
  4. Link the Lustre series and patches
    ln -s ~/lustre-release/lustre/kernel_patches/series/2.6-rhel5.series series
    ln -s ~/lustre-release/lustre/kernel_patches/patches patches
    
  5. Apply the patches to the kernel source using quilt
    quilt push -av
    ...
    ...
    Applying patch patches/jbd2_stats_proc_init-wrong-place.patch
    patching file fs/jbd2/journal.c
    Hunk #1 succeeded at 1042 (offset 143 lines).
    
    Now at patch patches/jbd2_stats_proc_init-wrong-place.patch
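
The EXTRAVERSION edit in step 1 can also be scripted instead of done by hand in an editor. This is only a sketch: set_extraversion is a hypothetical helper, and the demo runs against a scratch file with sample values rather than the real kernel Makefile.

```shell
# Rewrite the EXTRAVERSION line of a kernel Makefile in place.
# Usage: set_extraversion MAKEFILE SUFFIX   (hypothetical helper)
set_extraversion() {
    makefile=$1
    suffix=$2
    tmp=$(mktemp) || return 1
    # Replace the whole line that starts with EXTRAVERSION; copy the rest unchanged.
    sed "s/^EXTRAVERSION[[:space:]]*=.*/EXTRAVERSION = $suffix/" "$makefile" > "$tmp" &&
        cat "$tmp" > "$makefile"
    status=$?
    rm -f "$tmp"
    return $status
}

# Demo on a scratch copy; the values below are sample data, not the real Makefile.
demo=$(mktemp)
printf 'VERSION = 2\nPATCHLEVEL = 6\nSUBLEVEL = 18\nEXTRAVERSION = -prep\n' > "$demo"
set_extraversion "$demo" -lustre18
grep '^EXTRAVERSION' "$demo"
rm -f "$demo"
```

To use it for real, point it at the Makefile in the kernel source directory and pass -lustre18 as the suffix.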
    

Build the new kernel as an RPM.

  1. Go into the kernel source directory and issue the following commands to build a kernel rpm.
    cd /build/kernel/rpmbuild/BUILD/kernel-2.6.18/linux-2.6.18.x86_64
    make oldconfig || make menuconfig
    make include/asm
    make include/linux/version.h
    make SUBDIRS=scripts
    make include/linux/utsrelease.h
    make rpm
    
  2. Make a coffee while the build runs. NOTE: If you are asked to generate more entropy, trigger some disk or keyboard I/O. For example, in another terminal:
    grep -Ri 'whamcloud' /usr
    
  3. As user build, change to the directory ~build/lustre-release

At this point, you should have a fresh kernel RPM /build/kernel/rpmbuild/RPMS/x86_64/kernel-2.6.18lustre18-1.x86_64.rpm
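
Note that the hyphen in the EXTRAVERSION (-lustre18) does not appear in the RPM file name. The sketch below reproduces the naming seen in this walk-through so the expected file name can be derived in a script; kernel_rpm_name is a hypothetical helper, and the hyphen-stripping rule is an assumption based on the file names shown here.

```shell
# Build the kernel RPM file name from a kernel release string.
# Assumption: hyphens are dropped from the release, matching the
# file names shown in this walk-through. (Hypothetical helper.)
kernel_rpm_name() {
    release=$1   # for example 2.6.18-lustre18
    build=$2     # package build number, 1 for a first build
    arch=$3      # for example x86_64
    version=$(printf '%s' "$release" | tr -d '-')
    printf 'kernel-%s-%s.%s.rpm\n' "$version" "$build" "$arch"
}

kernel_rpm_name 2.6.18-lustre18 1 x86_64
```

If the EXTRAVERSION was left at a different value, the file name changes accordingly, which is a quick way to spot a build made from an unmodified Makefile.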

Configure and build Lustre

  1. Configure Lustre source
    [build@client-10 lustre-release]$ ./configure --with-linux=/build/kernel/rpmbuild/BUILD/kernel-2.6.18lustre18/
    ...
    ...
    EXTRA_KCFLAGS: -include /build/lustre-release/config.h  -g -I/build/lustre-release/lnet/include -I/build/lustre-release/lnet/include -I/build/lustre-release/lustre/include
    LLCFLAGS:      -g -Wall -fPIC -D_GNU_SOURCE
    
    Type 'make' to build Lustre.
    
  2. make rpms:
    [build@client-10 lustre-release]$ make rpms
    ...
    ...
    Wrote: /build/kernel/rpmbuild/RPMS/x86_64/lustre-debuginfo-1.8.5.54-2.6.18_194.32.1.el5.lustre18_201103071000.x86_64.rpm
    Executing(%clean): /bin/sh -e /var/tmp/rpm-tmp.15638
    + umask 022
    + cd /build/kernel/rpmbuild/BUILD
    + cd lustre-1.8.5.54
    + rm -rf /var/tmp/lustre-1.8.5.54-root
    + exit 0
    make[1]: Leaving directory `/build/lustre-release'
    
  3. You should now have built the following RPMs:
    ls ~build/kernel/rpmbuild/RPMS/x86_64/
    lustre-debuginfo-1.8.5.54-2.6.18_lustre18_201103081147.x86_64.rpm
    lustre-tests-1.8.5.54-2.6.18_lustre18_201103081147.x86_64.rpm
    lustre-source-1.8.5.54-2.6.18_lustre18_201103081147.x86_64.rpm
    lustre-modules-1.8.5.54-2.6.18_lustre18_201103081147.x86_64.rpm
    lustre-1.8.5.54-2.6.18_lustre18_201103081147.x86_64.rpm
    lustre-ldiskfs-3.1.5-2.6.18_lustre18_201103081148.x86_64.rpm
    lustre-ldiskfs-debuginfo-3.1.5-2.6.18_lustre18_201103081148.x86_64.rpm
    kernel-2.6.18lustre18-1.x86_64.rpm
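
Rather than reading the listing by eye, the packages needed for the install steps can be checked with a small loop. This is a sketch under assumptions: missing_rpms is a hypothetical helper, and the four package names are taken from the listing above (the debuginfo, tests and source packages are deliberately not checked).

```shell
# Print each expected package name that has no matching RPM file in DIR.
# A package matches when a file named NAME-<digit>...rpm exists.
# (missing_rpms is a hypothetical helper.)
missing_rpms() {
    dir=$1
    for name in kernel lustre lustre-modules lustre-ldiskfs; do
        found=no
        for f in "$dir/$name"-[0-9]*.rpm; do
            [ -e "$f" ] && found=yes
        done
        [ "$found" = yes ] || echo "$name"
    done
    return 0
}

missing_rpms /build/kernel/rpmbuild/RPMS/x86_64
```

No output means all four packages are present. Requiring a digit after the name keeps lustre from matching lustre-modules, and lustre-ldiskfs from matching lustre-ldiskfs-debuginfo.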
    

Installing the Lustre kernel and rebooting.

  1. As root, install the kernel:
    rpm -ivh ~build/kernel/rpmbuild/RPMS/x86_64/kernel-2.6.18lustre18-1.x86_64.rpm
    
  2. Check that /boot/grub/menu.lst selects the correct default kernel to boot. This is typically entry 0:
    default=0
    
  3. Reboot.
  4. Connect with conman and watch the machine come up.
  5. View the login prompt with satisfaction:
    CentOS release 5.5 (Final)
    Kernel 2.6.18-lustre18 on an x86_64
    
    client-10.lab.whamcloud.com login:
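
The check of menu.lst in step 2 can be done without counting entries by hand. The following sketch prints the title line that the default directive selects; grub_default_title is a hypothetical helper, and it assumes a plain GRUB legacy menu.lst with a numeric default and one title line per entry.

```shell
# Print the title of the entry selected by "default=N" in a GRUB legacy menu.lst.
# Entries are counted from 0 in the order their title lines appear.
# (grub_default_title is a hypothetical helper.)
grub_default_title() {
    awk '
        BEGIN          { want = 0; n = 0 }
        /^default[ =]/ { sub(/^default[ =]+/, ""); want = $1 + 0; next }
        /^title/       { titles[n++] = $0 }
        END            { if (want < n) print titles[want] }
    ' "$1"
}

menu=/boot/grub/menu.lst
if [ -r "$menu" ]; then
    grub_default_title "$menu"
fi
```

If the printed title does not name the lustre18 kernel, change the default value before rebooting. After the reboot, uname -r should report the same kernel as the login prompt.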
    

Installing Lustre.

  1. Become root and change directory to /build/kernel/rpmbuild/RPMS/x86_64/
  2. Install the kernel modules (lustre-modules), the user-space tools (lustre) and lustre-ldiskfs:
    rpm -ivh /build/kernel/rpmbuild/RPMS/x86_64/lustre-modules-1.8.5.54-2.6.18_lustre18_*.x86_64.rpm /build/kernel/rpmbuild/RPMS/x86_64/lustre-1.8.5.54-2.6.18_lustre18_*.x86_64.rpm /build/kernel/rpmbuild/RPMS/x86_64/lustre-ldiskfs-3.1.5-2.6.18_lustre18_*.x86_64.rpm
    

Installing e2fsprogs

e2fsprogs is needed to run the test suite.

  1. Download e2fsprogs from http://build.whamcloud.com/job/e2fsprogs/
  2. Install with rpm -ivh e2fsprogs

Testing Lustre

  1. As root, create a large enough debug buffer to contain the log for the total number of
  2. Run llmount.sh
    export DEBUG_SIZE=256
    /build/lustre-release/lustre/tests/llmount.sh
    
  3. You should see something like:
    [root@client-10 ~]# /build/lustre-release/lustre/tests/llmount.sh 
    Stopping clients: client-10.lab.whamcloud.com /mnt/lustre (opts:)
    Stopping clients: client-10.lab.whamcloud.com /mnt/lustre2 (opts:)
    Loading modules from /build/lustre-release/lustre/tests/..
    lnet.debug=0x33f1504
    lnet.subsystem_debug=0xffb7e3ff
    lnet options: 'networks=tcp0 accept=all'
    Formatting mgs, mds, osts
    Checking servers environments
    Checking clients client-10.lab.whamcloud.com environments
    Setup mgs, mdt, osts
    Starting mds: -o loop  /tmp/lustre-mdt /mnt/mds
    lnet.debug=0x33f1504
    lnet.subsystem_debug=0xffb7e3ff
    lnet.debug_mb=256
    Started lustre-MDT0000
    Starting ost1: -o loop  /tmp/lustre-ost1 /mnt/ost1
    lnet.debug=0x33f1504
    lnet.subsystem_debug=0xffb7e3ff
    lnet.debug_mb=256
    Started lustre-OST0000
    Starting ost2: -o loop  /tmp/lustre-ost2 /mnt/ost2
    lnet.debug=0x33f1504
    lnet.subsystem_debug=0xffb7e3ff
    lnet.debug_mb=256
    Started lustre-OST0001
    Starting client: client-10.lab.whamcloud.com: -o user_xattr,acl,flock client-10.lab.whamcloud.com@tcp:/lustre /mnt/lustre
    lnet.debug=0x33f1504
    lnet.subsystem_debug=0xffb7e3ff
    lnet.debug_mb=256
    Using TIMEOUT=20
    [root@client-10 ~]# 
    
  4. You now have a Lustre filesystem mounted at /mnt/lustre/.
  5. You can test it by striping across all OSTs and writing a big file:
    [root@client-10 ~]# lfs setstripe -c -1 /mnt/lustre
    [root@client-10 ~]# lfs getstripe /mnt/lustre/
    /mnt/lustre/
    stripe_count:   -1 stripe_size:    0 stripe_offset:  -1
    [root@client-10 ~]# dd if=/dev/zero of=/mnt/lustre/file.out bs=1MB count=400
    400+0 records in
    400+0 records out
    400000000 bytes (400 MB) copied, 2.33261 seconds, 171 MB/s
    
  6. Clean up after the tests:
    /build/lustre-release/lustre/tests/llmountcleanup.sh
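
The dd test in step 5 can be wrapped so that it also confirms the file size, rather than relying on the dd summary line. This is a minimal sketch: write_and_verify is a hypothetical helper, it uses bs=1000000 (the same 1,000,000-byte block that bs=1MB gave in the output above), and the demo writes to a scratch directory instead of /mnt/lustre.

```shell
# Write COUNT blocks of 1,000,000 zero bytes to DIR/file.out and
# succeed only if the resulting file has exactly the expected size.
# (write_and_verify is a hypothetical helper.)
write_and_verify() {
    dir=$1
    count=$2
    out="$dir/file.out"
    dd if=/dev/zero of="$out" bs=1000000 count="$count" 2>/dev/null || return 1
    expected=$((count * 1000000))
    actual=$(wc -c < "$out" | tr -d ' ')
    [ "$actual" -eq "$expected" ]
}

# Demo against a scratch directory; on the test system pass /mnt/lustre
# and 400 (while the filesystem is still mounted) to repeat the run above.
scratch=$(mktemp -d)
if write_and_verify "$scratch" 4; then
    echo "size matches"
else
    echo "size mismatch"
fi
rm -rf "$scratch"
```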
    

Congratulations, your mission is complete.

Troubleshooting.

  1. If InfiniBand is not working, you can switch to TCP: select the network using export NETTYPE=tcp. The Lustre tests default to 'tcp', but the automatically provisioned machines have LNET set up to use o2ib.

ENDS~
