...

The LUTF is meant to cover the following test use cases:

Use Case	Description
Single node configuration	Exercise the liblnetconfig API directly to configure LNet. Exercise the lnetctl utility to configure LNet.

All tests are run on one node.
Multi-node/no File system testing	Configure one or more nodes; run lnet_selftest; ensure traffic conforms to the configuration; repeat the above. These tests require node synchronization. For example, if a script is configuring node A, node B cannot start traffic until node A has finished configuration.
Multi-node/File system testing	Start file system traffic; perform some configuration changes which would change LNet behavior; ensure that the configuration changes are honored. These tests require node synchronization.
Error Injection testing	Either with or without a file system mount: inject various types of errors on different nodes in the setup; monitor statistics to determine how LNet is handling faults. These tests require node synchronization.
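
The node synchronization requirement in the multi-node use cases can be illustrated with a minimal sketch. It assumes the two nodes are simulated as threads in a single process; in the LUTF the nodes are separate Agents coordinated by the Master, so the function and variable names here are hypothetical and only the ordering guarantee is the point.

```python
import threading


def run_synchronized_test():
    """Simulate node B waiting for node A to finish configuration."""
    events = []                          # ordered log of what happened
    log_lock = threading.Lock()
    node_a_configured = threading.Event()

    def record(entry):
        with log_lock:
            events.append(entry)

    def node_a():
        record("A: configuring")
        # Configuration work would happen here.
        record("A: configured")
        node_a_configured.set()          # signal that traffic may start

    def node_b():
        # Block until node A reports that configuration is complete.
        if not node_a_configured.wait(timeout=5.0):
            record("B: timed out waiting for A")
            return
        record("B: starting traffic")

    # Node B is started first on purpose: it must still wait for node A.
    threads = [threading.Thread(target=node_b), threading.Thread(target=node_a)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return events
```

Even though node B starts first, it records its entry only after node A has signalled completion.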

LUTF Design Overview

The LUTF is designed with a Master-Agent approach to test LNet. The Master and Agent LUTF instances use a Python telnet module to communicate with each other, and more than one Agent can communicate with a single Master instance at the same time. The Master instance controls the execution of the Python test scripts that test LNet on the Agent instances. It collects the results of all the tests run on the Agents and writes them to a YAML file. It also controls the synchronization mechanism between test scripts running on different Agents.

The diagram below shows how the LUTF interacts with LNet.

[Gliffy diagram: LUTF design]

Figure 1: System Level Diagram

LUTF Data Flow

[Gliffy diagram: LUTF Data Flow]

LUTF Deployment

The LUTF will provide a dependency script, lutf_dep.py, which will download and install all the necessary elements defined above.

The LUTF will integrate with auster and should run like any other Lustre test. A bash wrapper script, lutf.sh, will be created to execute the LUTF.

SIDE NOTE: Since the LUTF simply runs Python scripts, it can run any test, including Lustre tests.

Auster

auster configuration scripts set up the environment variables required for the tests to run. These environment variables include:

The nodes involved in the tests
The devices to use for storage
The clients
The PDSH command to use

It also sets a host of Lustre-specific environment variables.

It then executes the test scripts, e.g. sanity.sh.

sanity.sh can then run scripts utilizing the information provided in the environment variables.

LUTF and Auster

The LUTF will build on the existing test infrastructure.

An lutf.sh script will be created, which will be executed from auster.

auster will continue to set up the environment variables it sets as of the time of this writing. lutf.sh will run the LUTF. Since the LUTF is run within the auster context, the Python test scripts will have access to these environment variables and can use them the same way the bash test scripts do. If LUTF Python scripts are executed on a remote node, the necessary information from the environment variables is delivered to these scripts.
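
As a rough illustration of how a Python test script might pick up the auster environment, the sketch below reads a few variables and packages them as a dictionary that could be handed to a script on a remote node. The variable names (CLIENTS, PDSH, NETTYPE) and both function names are assumptions made for this example; the real set is whatever auster exports for the run.

```python
import os

# Assumed variable names, for illustration only; the real set is whatever
# auster exports for the test run.
ASSUMED_VARIABLES = ("CLIENTS", "PDSH", "NETTYPE")


def collect_test_environment(names=ASSUMED_VARIABLES, environ=None):
    """Return the subset of the environment that a test script needs."""
    environ = os.environ if environ is None else environ
    return {name: environ[name] for name in names if name in environ}


def environment_for_remote(names=ASSUMED_VARIABLES, environ=None):
    """Package the same values as a plain dictionary for a remote script."""
    return {"environment": collect_test_environment(names, environ)}
```

Variables that are not set are simply omitted, so a script can decide for itself whether a missing value is an error.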

Test Prerequisites

Before each test, lutf.sh will provide functions to perform the following checks:

If the master hasn't started, start it.
If the agents on the specified nodes haven't started, start them.
Verify the system is ready to start, i.e. the master and all agents are started.

Test Post-requisites

Provide test results in YAML format.

It's the responsibility of the test scripts to ensure that the system is left in an expected state, e.g. file system unmounted, modules unloaded, etc.

LUTF Threading Overview

[Gliffy diagram: Threading Overview]

Thread Description

Listener: Listens for connections from LUTF Agents and for heartbeats used to monitor the liveness of the Agents.
HeartBeat: Sends a periodic heartbeat to the LUTF Master to inform it that the Agent is still alive.
Python Interpreter: Executes Python test scripts, which can call into one of the C/Python APIs provided.
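
The relationship between the Listener and HeartBeat threads can be sketched as a small liveness tracker on the Master side. This is an assumption about how liveness could be tracked, not the LUTF implementation: the class name, the timeout value, and the injectable clock are all hypothetical.

```python
import time


class HeartbeatMonitor:
    """Track the last heartbeat seen from each agent."""

    def __init__(self, timeout_seconds=10.0, clock=time.monotonic):
        self._timeout = timeout_seconds   # assumed value, not from the design
        self._clock = clock               # injectable so tests can fake time
        self._last_seen = {}

    def record_heartbeat(self, agent_name):
        """Called whenever the listener receives a heartbeat."""
        self._last_seen[agent_name] = self._clock()

    def is_alive(self, agent_name):
        """An agent is alive if its last heartbeat is within the timeout."""
        last = self._last_seen.get(agent_name)
        if last is None:
            return False
        return (self._clock() - last) <= self._timeout

    def alive_agents(self):
        """Return the names of all agents currently considered alive."""
        return sorted(name for name in self._last_seen if self.is_alive(name))
```

An Agent that has never sent a heartbeat is reported as not alive, which matches the prerequisite check that all agents are started before a test begins.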

C/Python APIs

C/Python Management API

Parse configuration
Provide status on the LUTF Agents
Provide status on executing scripts
Store results

C/Python Synchronization APIs

Assign work to LUTF Agents from the LUTF Master
1. This will result in a YAML RPC block being sent to the LUTF Agent.
Wait for work completion events from LUTF Agents
Register for asynchronous events
1. Asynchronous events come in the form of YAML blocks.
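
The three synchronization operations listed above could look roughly like the following. This is a sketch under stated assumptions: local threads stand in for remote Agents, no YAML is exchanged, and the class and method names are hypothetical rather than the LUTF API.

```python
import queue
import threading


class WorkCoordinator:
    """Hypothetical Master-side helper; not the LUTF API."""

    def __init__(self):
        self._completions = queue.Queue()
        self._handlers = {}
        self._lock = threading.Lock()

    def assign_work(self, agent_name, work):
        """Run `work` on a thread standing in for a remote agent."""
        def runner():
            result = work()
            self._completions.put((agent_name, result))
        thread = threading.Thread(target=runner)
        thread.start()
        return thread

    def wait_for_completions(self, count, timeout=5.0):
        """Block until `count` completion events arrive, keyed by agent."""
        results = {}
        for _ in range(count):
            agent_name, result = self._completions.get(timeout=timeout)
            results[agent_name] = result
        return results

    def register_handler(self, event_type, handler):
        """Register a callback for one type of asynchronous event."""
        with self._lock:
            self._handlers.setdefault(event_type, []).append(handler)

    def deliver_event(self, event_type, payload):
        """Invoke every handler for the event; return how many ran."""
        with self._lock:
            handlers = list(self._handlers.get(event_type, []))
        for handler in handlers:
            handler(payload)
        return len(handlers)
```

Collecting completions through a queue lets the Master wait on several Agents at once without caring which one finishes first.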

C/Python liblnetconfig APIs

These are the configuration APIs in lnet/utils/lnetconfig/liblnetconfig.h.

Other APIs can be wrapped with SWIG and exposed for the LUTF Python test scripts to call.

LUTF Test Scripts Design Overview

The test scripts will be deployed on all nodes under test as well as on the test master.
Each test script will need to provide a run function.
- This function is intended to be executed by the test master.
The LUTF will provide at least one other function to perform the actual testing.
- This function will be called remotely and will execute on the test node.
Each test, which can be composed of arbitrary Python code, must return a YAML text block to the test master reporting the results of the operation.
Every function should take a dictionary as its input parameter and return a dictionary as its result.
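
A test script following these conventions might be shaped like the sketch below. Everything in it is illustrative: check_interfaces, its parameters, and the result keys are hypothetical, the remote call is replaced by a direct local call, and to_yaml_block is a minimal stand-in that handles only flat dictionaries of plain scalars (a YAML library would normally be used).

```python
def to_yaml_block(mapping):
    """Render a flat dictionary of plain scalars as YAML text."""
    return "\n".join(f"{key}: {value}" for key, value in mapping.items()) + "\n"


def check_interfaces(params):
    """Stand-in for the function that executes on the node under test."""
    expected = params["expected_interfaces"]
    actual = params["actual_interfaces"]
    missing = sorted(set(expected) - set(actual))
    return {
        "status": "PASS" if not missing else "FAIL",
        "missing": ", ".join(missing) if missing else "none",
    }


def run(params):
    """Entry point intended to be executed by the test master."""
    # In the LUTF this call would be dispatched to a remote agent; here it
    # is a direct local call so the sketch stays self-contained.
    result = check_interfaces(params)
    return {
        "name": params["name"],
        "result": result,
        "yaml": to_yaml_block({"test": params["name"], **result}),
    }
```

Both functions take a dictionary and return a dictionary, so the same calling convention works whether a function runs locally or on a remote node.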

LUTF Communication Protocol

rpc:
    target: agent_id
    type: function_call
    fname: function_name
    parameters:
        param0: value
        param1: value2
        param2: 1
        param3: [1, 2, 3]
        param4: 1.4
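
To show how such an RPC block could be produced from Python data, here is a minimal sketch that builds the message as a nested dictionary and renders it as YAML text. The helper names build_rpc and dump_yaml are hypothetical, and the hand-written emitter is an assumption made to keep the example dependency-free: it handles only nested dictionaries, lists of scalars, and plain scalars, with no quoting or escaping.

```python
def dump_yaml(value, indent=0):
    """Emit nested dicts as block YAML; lists in flow style, scalars as-is."""
    lines = []
    pad = " " * indent
    for key, item in value.items():
        if isinstance(item, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(dump_yaml(item, indent + 4))
        elif isinstance(item, list):
            rendered = ", ".join(str(element) for element in item)
            lines.append(f"{pad}{key}: [{rendered}]")
        else:
            lines.append(f"{pad}{key}: {item}")
    return lines


def build_rpc(target, fname, parameters):
    """Build the YAML text for a function_call RPC aimed at one agent."""
    message = {
        "rpc": {
            "target": target,
            "type": "function_call",
            "fname": fname,
            "parameters": parameters,
        }
    }
    return "\n".join(dump_yaml(message)) + "\n"
```

Because the parameters travel as a dictionary, this lines up with the convention that every test function takes a dictionary as its input.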

Test Environment Set-Up

Each node that runs the LUTF will need to have the following installed:

...
