...

The LUTF is designed with a Master-Agent approach to test LNet. The Master and Agent LUTF instances use a Python telnet module to communicate with each other, and more than one Agent can communicate with a single Master instance at the same time. The Master instance controls the execution of the Python test scripts that test LNet on the Agent instances. It collects the results of all the tests run on the Agents and writes them to a YAML file. It also controls the synchronization mechanism between test scripts running on different Agents.
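The Master's result-collection role described above can be sketched as follows. This is a minimal illustration, not the actual LUTF implementation: the class and method names (`LutfMaster`, `collect_result`, `dump_yaml`) are hypothetical, and the YAML summary is emitted with plain string formatting so the sketch needs no third-party library.

```python
# Hypothetical sketch of the Master's result-collection role.
# Several agents report per-test results; the master aggregates
# them and emits a single YAML-style summary.

class LutfMaster(object):
    def __init__(self):
        self.results = {}  # agent name -> list of (test name, status)

    def collect_result(self, agent, test_name, status):
        # Called once per test completion reported by an agent.
        self.results.setdefault(agent, []).append((test_name, status))

    def dump_yaml(self):
        # Emit a simple YAML-style summary via string formatting.
        lines = ["Tests:"]
        for agent in sorted(self.results):
            for test, status in self.results[agent]:
                lines.append("- agent: %s" % agent)
                lines.append("  name: %s" % test)
                lines.append("  status: %s" % status)
        return "\n".join(lines)

master = LutfMaster()
master.collect_result("agent1", "test_01", "PASS")
master.collect_result("agent2", "test_01", "PASS")
summary = master.dump_yaml()
```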

 

The diagram below shows how the LUTF interacts with LNet.

...

Figure 1: System Level Diagram

...

Building the LUTF

To build the LUTF, it is first necessary to set up an environment with all the required packages installed, and then to build using the GNU build system, just as the Lustre tree is built.
The following subsections outline the steps of the building process.

Environment Set-Up

  1. Python 2.7.5 is required, along with some other Python packages:
    1. netifaces
    2. PyYAML
    3. paramiko (some MR test scripts are written using paramiko, so it needs to be installed as well)
  2. SWIG (Simplified Wrapper and Interface Generator) is required to generate the glue code that allows the Python test scripts to call the DLC APIs.
  3. Passwordless SSH - nodes running the LUTF are required to set up passwordless SSH to each other.

Build along Lustre tree using GNU tools

  1. All the other test suites/scripts for Lustre are placed under the lustre/tests/ directory. Place the LUTF under lustre/tests as well.
  2. List LUTF as a subdirectory to be built in lustre/tests/Makefile.am.
  3. Create an autoMakefile.am under lustre/tests/ and also under lustre/tests/lutf/.
  4. Create a Makefile.am under lustre/tests/lutf/ to generate the required binary files and SWIG files.
    1. This also requires modifying configure.ac under the Lustre tree parent directory to add the Python path and other dependencies.
    2. Add the LTLIBRARIES and SOURCES entries to generate the SWIG wrapper files.
  5. Run "make distclean" to clean up any residual build artifacts.
  6. cd to the Lustre tree parent directory and run "sh autogen.sh".
  7. Run "./configure".
  8. Run "make".

LUTF/AT Integration

For LUTF-Autotest integration, the first step is to build the LUTF along with Lustre, just as the other test suites are built. The previous section, "Build along Lustre tree using GNU tools", covers this. Once the LUTF is built along with Lustre, all the binary files and SWIG-generated wrapper files needed to run the Python test scripts are available. After this:

  1. The config file (similar to the one Auster has) provided for the LUTF under lustre/tests/cfg/ is used to identify the nodes involved in the test suite and to set up environment variables.
  2. AT runs the Master script, which reads the config file, sets up the LUTF on the identified nodes, and triggers the execution of the test suite on the Agent nodes.
  3. AT collects the results of the test suite in the form of a YAML file (similar to Auster) and then passes the results to Maloo.

 

Infrastructure

Automatic Deployment

With LUTF-Autotest integration, an infrastructure is created that lets AT deploy the LUTF on the test nodes, collect the results of the tests run, and then pass the test results to Maloo to be displayed there.

Deploy LUTF

  1. A config file (similar to what Auster has) is provided by AT, which can define and set the environment variables. This file also has information about the nodes involved in the test suite and their IP addresses.
  2. A Master script is created which reads the IP addresses of the nodes involved in the test suite from the config file and runs the LUTF on the identified Agent and Master nodes.
  3. This Master script also triggers a child script that fetches information about the network interfaces (NIDs) on all the nodes involved in the test suite.
    1. This NID information can then be provided to each batch test (scripts that run all the similar tests related to one feature, bundled together) for execution.
  4. The Master script then triggers the batch test script to run on the Agent nodes through the Master node identified for the test suite.
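The Master script's first step, reading the node list from the config file, might look like the sketch below. It parses the `*_HOST` exports in the format of the sample LUTF config file shown later in this document; the function name `parse_lutf_config` is a hypothetical illustration, not the real script's interface.

```python
import re

def parse_lutf_config(text):
    """Extract master/agent host names from an Auster-style config.

    Looks for lines such as 'export master_HOST=onyx-15vm1' and
    'export agent1_HOST=onyx-16vm1' (the format of the sample
    LUTF config file in this document).
    """
    hosts = {"master": None, "agents": []}
    for line in text.splitlines():
        m = re.match(r"\s*export\s+(\w+)_HOST=(\S+)", line)
        if not m:
            continue
        name, value = m.group(1), m.group(2)
        if name == "master":
            hosts["master"] = value
        elif name.startswith("agent"):
            hosts["agents"].append(value)
    return hosts

sample = """
export master_HOST=onyx-15vm1
export agent1_HOST=onyx-16vm1
export agent2_HOST=onyx-17vm1
export AGENTCOUNT=2
"""
nodes = parse_lutf_config(sample)
```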

 

...

 

 

...

#!/bin/bash

# Key Exports
export master_HOST=onyx-15vm1
export agent1_HOST=onyx-16vm1
export agent2_HOST=onyx-17vm1
export agent3_HOST=onyx-18vm1
export AGENTCOUNT=3

VERBOSE=true

# ports for LUTF Telnet connection
export MASTER_PORT=8494
export AGENT_PORT=8094

# script and result paths
script_DIR=$LUSTRE/tests/lutf/python/test/dlc/
output_DIR=$LUSTRE/tests/lutf/python/tests/

 

 

Collect Results

...

 

 

...

TestGroup:
    test_group: review-ldiskfs
    testhost: trevis-13vm5
    submission: Mon May  8 15:54:41 UTC 2017
    user_name: root
autotest_result_group_id: 5e11dc5b-7dd7-48a1-b4a3-74a333acd912
test_sequence: 1
test_index: 10
session_group_id: cfeff6b3-60fc-438a-88ef-68e65a08694f
enforcing: true
triggering_build_number: 45090
triggering_job_name: lustre-reviews
total_enforcing_sessions: 5
code_review:
 type: Gerrit
 url: review.whamcloud.com
 project: fs/lustre-release
 branch: multi-rail
 identifiers:
 - id: 3fbd25eb0fe90e4f34e36bad006c73d756ef8499
issue_tracker:
 type: Jira
 url: jira.hpdd.intel.com
 identifiers:
 - id: LU-9119
Tests:
- name: dlc
  description: lutf dlc
  submission: Mon May  8 15:54:43 UTC 2017
  report_version: 2
  result_path: lustre-release/lustre/tests/lutf/python/tests/
  SubTests:
  - name: test_01
    status: PASS
    duration: 2
    return_code: 0
    error:
  - name: test_02
    status: PASS
    duration: 2
    return_code: 0
    error:
  duration: 5
  status: PASS
- name: multi-rail
  description: lutf multi-rail
  submission: Mon May  8 15:59:43 UTC 2017
  report_version: 2
  result_path: lustre-release/lustre/tests/lutf/python/tests/
  SubTests:
  - name: test_01
    status: PASS
    duration: 2
    return_code: 0
    error:
  - name: test_02
    status: PASS
    duration: 2
    return_code: 0
    error:
  duration: 5
  status: PASS

The LUTF shall be integrated with the Lustre tests under lustre/tests/lutf. The LUTF will be built and packaged with the standard Lustre build:

 


Code Block
sh ./autogen.sh
./configure --with-linux=<kernel path>
make
# optionally
make rpms
# optionally
make install


 

The make system will build the following items:

  1. lutf binary
  2. liblutf_agent.so - shared library to communicate with the LUTF backend.
  3. clutf_agent.py and clutf_agent.so - glue code that allows Python to call functions in liblutf_agent.so.
  4. _lnetconfig.so and lnetconfig.py - glue code that allows the Python test scripts to utilize the DLC interface.

The build process will check whether Python 2.7.5 and SWIG 2.0 or higher are installed before building. If these requirements are not met, the LUTF is not built.

If the LUTF is built, it will be packaged in the lustre-tests rpm and installed in /usr/lib64/lustre/tests/lutf.

Test Environment Set-Up

Each node which will run the LUTF will need to have the following installed:

  1. ncurses library
    1. yum install ncurses-devel
  2. readline library
    1. yum install readline-devel
  3. python 2.7.5
    1. https://www.python.org/download/releases/2.7.5/
    2. ./configure --prefix=<> --enable-shared # it is recommended to install in standard system path
    3. make; make install
  4. setuptools
    1. https://pypi.python.org/pypi/setuptools
    2. Download the package, untar it, and run python2.7 setup.py install
  5. psutil
    1. https://pypi.python.org/pypi?:action=display&name=psutil
      1. untar
      2. cd to the untarred directory
      3. python2.7 setup.py install
  6. netifaces
    1. https://pypi.python.org/pypi/netifaces
  7. Install PyYAML

The LUTF will also require that passwordless SSH is set up for all the nodes which run the LUTF.

LUTF/AT Integration

LUTF Deployment

The LUTF will provide a deployment script, lutf_deploy.py, which will download and install all the necessary elements defined above. If everything is successful, it will start the LUTF given the LUTF YAML configuration file, described later.

AT Integration

A script similar to auster, lutf_engage.py, will be provided by the LUTF. The purpose of the script is to manage which nodes the LUTF will be deployed on. Only the AT has knowledge of the nodes available; therefore the script will perform the following steps:

  1. Take as input the following parameters. NOTE: these parameters can be provided as a set of environment variables, or they can be placed in a YAML file whose path is then passed to the lutf_engage.py script. The second option is assumed in this HLD.
    1. IP address of the node to be used as master
    2. IP addresses of the nodes to be used as agents
    3. Two YAML configuration files, for the Master and Agent nodes
    4. YAML configuration file describing the tests to run
  2. Call the lutf_deploy.py script for each of the nodes provided. It will pass the Master YAML LUTF configuration file to the master node and the agent configuration file to the agent nodes.
    1. Query the LUTF master to ensure the expected number of agents are connected.
    2. If everything is correct, then continue with the tests; otherwise build a YAML block describing the error.
  3. Send the test YAML configuration file to the LUTF master and wait.
  4. Once the tests are completed, the LUTF master will return a YAML block describing the test results, described below.
    1. The LUTF Master will provide an API based around paramiko. The API is described below.
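Steps 2.1 and 2.2 above (verify the agent count, otherwise build a YAML block describing the error) can be sketched as below. The function `check_agents` and the error keys are illustrative assumptions, not the real lutf_engage.py interface; the error YAML is built by hand to keep the sketch self-contained.

```python
def check_agents(expected, connected):
    """Return (ok, yaml_text).

    If the number of connected agents matches what was requested,
    testing may proceed; otherwise return a YAML block describing
    the error, as step 2.2 requires.
    """
    if len(connected) == expected:
        return True, ""
    error = "\n".join([
        "error:",
        "  reason: agent count mismatch",
        "  expected: %d" % expected,
        "  connected: %d" % len(connected),
    ])
    return False, error

# Three agents were requested but only two registered with the master.
ok, err = check_agents(3, ["agent1", "agent2"])
```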

LUTF Configuration Files

Master YAML Configuration File

This configuration file describes the information the master needs in order to start.

Code Block
config:
   type: master
   mport: <master port>
   base_path: <base path to the LUTF directory - optional.
               if not present default to /usr/lib64/lustre/tests>
   extra_py: <extra python paths>

Agent YAML Configuration File

This configuration file describes the information the agent needs in order to start.

Code Block
config:
   type: agent
   maddress: <master address - optional>
   mport: <master port>
   dport: <agent daemon port>
   base_path: <base path to the LUTF directory>
   extra_py: <extra python paths>

Test YAML Configuration File

This configuration file describes the list of tests to run.

Code Block
config:
   type: tests
   tests:
      - 0: <test set name>
        1: <test set name>
        2: <test set name>
        ....
        N: <test set name>
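Once the file is parsed (e.g. with the PyYAML package listed in the environment setup), the numbered tests mapping can be flattened into an ordered list of test set names. The sketch below assumes the configuration has already been loaded into a Python dict; `ordered_tests` is a hypothetical helper, not part of the LUTF.

```python
def ordered_tests(config):
    """Flatten the 'tests' entry of a parsed test-configuration dict.

    The file stores tests as a list of {index: name} mappings
    (as in the Test YAML Configuration File above); return the
    names sorted by their numeric index.
    """
    pairs = []
    for entry in config["config"]["tests"]:
        for index, name in entry.items():
            pairs.append((int(index), name))
    return [name for _, name in sorted(pairs)]

# A parsed configuration matching the structure shown above.
parsed = {
    "config": {
        "type": "tests",
        "tests": [{0: "dlc"}, {1: "multi-rail"}],
    }
}
tests = ordered_tests(parsed)
```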

LUTF Result file

This YAML result file describes the results of the tests that were requested to run.

Code Block
TestGroup:
    test_group: review-ldiskfs
    testhost: trevis-13vm5
    submission: Mon May  8 15:54:41 UTC 2017
    user_name: root
autotest_result_group_id: 5e11dc5b-7dd7-48a1-b4a3-74a333acd912
test_sequence: 1
test_index: 10
session_group_id: cfeff6b3-60fc-438a-88ef-68e65a08694f
enforcing: true
triggering_build_number: 45090
triggering_job_name: lustre-reviews
total_enforcing_sessions: 5
code_review:
 type: Gerrit
 url: review.whamcloud.com
 project: fs/lustre-release
 branch: multi-rail
 identifiers:
 - id: 3fbd25eb0fe90e4f34e36bad006c73d756ef8499
issue_tracker:
 type: Jira
 url: jira.hpdd.intel.com
 identifiers:
 - id: LU-9119
Tests:
- name: dlc
  description: lutf dlc
  submission: Mon May  8 15:54:43 UTC 2017
  report_version: 2
  result_path: lustre-release/lustre/tests/lutf/python/tests/
  SubTests:
  - name: test_01
    status: PASS
    duration: 2
    return_code: 0
    error:
  - name: test_02
    status: PASS
    duration: 2
    return_code: 0
    error:
  duration: 5
  status: PASS
- name: multi-rail
  description: lutf multi-rail
  submission: Mon May  8 15:59:43 UTC 2017
  report_version: 2
  result_path: lustre-release/lustre/tests/lutf/python/tests/
  SubTests:
  - name: test_01
    status: PASS
    duration: 2
    return_code: 0
    error:
  - name: test_02
    status: PASS
    duration: 2
    return_code: 0
    error:
  duration: 5
  status: PASS
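A consumer of this result file (AT, or a script feeding Maloo) might summarize it as follows. The sketch assumes the YAML has already been parsed into a dict shaped like the sample above; `summarize` is a hypothetical helper, and the counts shown are derived only from the sample data.

```python
def summarize(results):
    """Count passing/failing subtests and total duration across all
    tests in a parsed LUTF result dict shaped like the sample file."""
    passed = failed = total_duration = 0
    for test in results.get("Tests", []):
        total_duration += test.get("duration", 0)
        for sub in test.get("SubTests", []):
            if sub.get("status") == "PASS":
                passed += 1
            else:
                failed += 1
    return {"passed": passed, "failed": failed,
            "duration": total_duration}

# A parsed subset of the sample result file above.
sample = {
    "Tests": [
        {"name": "dlc", "duration": 5,
         "SubTests": [{"name": "test_01", "status": "PASS"},
                      {"name": "test_02", "status": "PASS"}]},
        {"name": "multi-rail", "duration": 5,
         "SubTests": [{"name": "test_01", "status": "PASS"},
                      {"name": "test_02", "status": "PASS"}]},
    ]
}
stats = summarize(sample)
```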

 

 

A sample Config file used by Auster

 

 

Sample LUTF Config file

#!/bin/bash

# Key Exports
export master_HOST=onyx-15vm1
export agent1_HOST=onyx-16vm1
export agent2_HOST=onyx-17vm1
export agent3_HOST=onyx-18vm1
export AGENTCOUNT=3

VERBOSE=true

# ports for LUTF Telnet connection
export MASTER_PORT=8494
export AGENT_PORT=8094

# script and result paths
script_DIR=$LUSTRE/tests/lutf/python/test/dlc/
output_DIR=$LUSTRE/tests/lutf/python/tests/

 

 

Collect Results

  1. A YAML format is defined for the results of the entire test run, and a result YAML file is generated in that format.
  2. The YAML file also points to the path where the test result file for each test is stored.
  3. This YAML file is then passed to AT, which further passes it to Maloo.

A sample result YAML file from Auster
results.yml

 

 

Sample LUTF result YAML file

TestGroup:
    test_group: review-ldiskfs
    testhost: trevis-13vm5
    submission: Mon May  8 15:54:41 UTC 2017
    user_name: root
autotest_result_group_id: 5e11dc5b-7dd7-48a1-b4a3-74a333acd912
test_sequence: 1
test_index: 10
session_group_id: cfeff6b3-60fc-438a-88ef-68e65a08694f
enforcing: true
triggering_build_number: 45090
triggering_job_name: lustre-reviews
total_enforcing_sessions: 5
code_review:
 type: Gerrit
 url: review.whamcloud.com
 project: fs/lustre-release
 branch: multi-rail
 identifiers:
 - id: 3fbd25eb0fe90e4f34e36bad006c73d756ef8499
issue_tracker:
 type: Jira
 url: jira.hpdd.intel.com
 identifiers:
 - id: LU-9119
Tests:
- name: dlc
  description: lutf dlc
  submission: Mon May  8 15:54:43 UTC 2017
  report_version: 2
  result_path: lustre-release/lustre/tests/lutf/python/tests/
  SubTests:
  - name: test_01
    status: PASS
    duration: 2
    return_code: 0
    error:
  - name: test_02
    status: PASS
    duration: 2
    return_code: 0
    error:
  duration: 5
  status: PASS
- name: multi-rail
  description: lutf multi-rail
  submission: Mon May  8 15:59:43 UTC 2017
  report_version: 2
  result_path: lustre-release/lustre/tests/lutf/python/tests/
  SubTests:
  - name: test_01
    status: PASS
    duration: 2
    return_code: 0
    error:
  - name: test_02
    status: PASS
    duration: 2
    return_code: 0
    error:
  duration: 5
  status: PASS


Network Interface Discovery

The LUTF test scripts will need to be implemented in a generic way. This means that each test script which requires the use of interfaces will need to discover the interfaces available to it on the node. If there is a sufficient number of interfaces of the correct type, the test can continue; otherwise the test will be skipped and reported as such in the final result.
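In a real test script the interface list would come from the netifaces package listed in the environment setup (e.g. `netifaces.interfaces()`); the sketch below keeps the selection logic separate so it can be shown without live interfaces. `pick_interfaces` and the filter-by-name-prefix convention are illustrative assumptions, not the LUTF's actual discovery mechanism.

```python
def pick_interfaces(available, prefix, required):
    """Select interfaces of the wanted type (here, by name prefix).

    Returns (interfaces, skip): if fewer than `required` matching
    interfaces exist, the test should be skipped and reported as
    such, per the discovery rule above. In a real script,
    `available` would be the result of netifaces.interfaces().
    """
    matching = [i for i in available if i.startswith(prefix)]
    if len(matching) < required:
        return [], True        # not enough interfaces: skip the test
    return matching[:required], False

# Node with two ethernet interfaces: a test needing two can run...
ifs, skip = pick_interfaces(["lo", "eth0", "eth1"], "eth", 2)
# ...but a test needing three must be skipped.
ifs3, skip3 = pick_interfaces(["lo", "eth0", "eth1"], "eth", 3)
```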

...

Maloo

  1. A separate section is to be created in Maloo to display LUTF test results.
  2. The results from the output YAML file passed from AT are displayed in the LUTF results section.
  3. A test parameter specific to LUTF tests is to be defined that will allow running only the LUTF tests. This will help avoid running unnecessary tests for changes that only affect LNet.

...