...
The LUTF is meant to cover the following test use cases:
Use Case | Description |
---|---|
Single node configuration | All tests are run on one node. |
Multi-node/no File system testing | These tests require node synchronization. For example, if a script is configuring node A, node B cannot start traffic until node A has finished configuration. |
Multi-node/File system testing | These tests require node synchronization. |
Error Injection testing | These tests require node synchronization. |
LUTF Design Overview
The LUTF is designed with a Master-Agent approach to test LNet. The Master and Agent LUTF instances use a Python telnet module to communicate with each other, and more than one Agent can communicate with a single Master instance at the same time. The Master instance controls the execution of the Python test scripts that test LNet on the Agent instances. It collects the results of all the tests run on the Agents and writes them to a YAML file. It also controls the synchronization mechanism between test scripts running on different Agents.
The diagram below shows how the LUTF interacts with LNet.
[Gliffy diagram: system-level view of the LUTF and LNet]
Figure 1: System Level Diagram
LUTF Data Flow
[Gliffy diagram: LUTF data flow]
LUTF Deployment
The LUTF will provide a dependency script, lutf_dep.py, which will download and install all the necessary elements defined above.
The LUTF will integrate with auster and should run like any other Lustre test. A bash wrapper script, lutf.sh, will be created to execute the LUTF.
SIDE NOTE: Since the LUTF simply runs Python scripts, it can run any test, including Lustre tests.
Auster
auster configuration scripts set up the environment variables required for the tests to run. These environment variables include:
- The nodes involved in the tests
- The devices to use for storage
- The clients
- The PDSH command to use
It also sets a host of specific Lustre environment variables.
It then executes the test scripts, for example sanity.sh. sanity.sh can then run scripts using the information provided in the environment variables.
LUTF and Auster
The LUTF will build on the existing test infrastructure.
An lutf.sh script will be created, which will be executed from auster.
auster will continue to set up the same environment variables it does as of the time of this writing, and lutf.sh will run the LUTF. Since the LUTF runs within the auster context, the Python test scripts will have access to these environment variables and can use them the same way the bash test scripts do. If LUTF Python scripts are executed on a remote node, the necessary information from the environment variables is delivered to these scripts.
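As a rough illustration of how a Python test script might pick up the auster-provided environment and package it for a remote node, consider the sketch below. The variable names in FORWARDED_VARS and the helper collect_test_env are assumptions made for this example; the actual set of variables is whatever auster exports.

```python
import os

# Hypothetical selection of variables to forward; the real list depends on
# what the auster configuration scripts export.
FORWARDED_VARS = ("PDSH", "CLIENTS", "RCLIENTS")


def collect_test_env(names=FORWARDED_VARS, environ=None):
    """Return the subset of environment variables a test script needs.

    Variables that are not set are skipped instead of raising an error,
    so the resulting dictionary can be handed to a remote script as-is.
    """
    environ = os.environ if environ is None else environ
    return {name: environ[name] for name in names if name in environ}
```

A script running on the master could call collect_test_env() and include the returned dictionary in the parameters it sends to an Agent.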
Test Prerequisites
Before each test, lutf.sh will provide functions to perform the following checks:
- If the master hasn't started, start it.
- If the agents on the specified nodes haven't started, start them.
- Verify the system is ready to start, i.e., the master and agents are all started.
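The checks above are intended to live in lutf.sh, but their logic can be sketched in Python for clarity. The is_running() and start() methods, and the FakeProcess class, are hypothetical stand-ins and not part of the LUTF.

```python
def ensure_ready(master, agents):
    """Start anything that is not running, then report overall readiness.

    `master` and each entry in `agents` are hypothetical objects exposing
    is_running() and start().
    """
    if not master.is_running():
        master.start()
    for agent in agents:
        if not agent.is_running():
            agent.start()
    # The system is ready only when the master and every agent are up.
    return master.is_running() and all(a.is_running() for a in agents)


class FakeProcess:
    """In-memory stand-in used only to exercise the sketch."""

    def __init__(self, running=False):
        self.running = running
        self.start_calls = 0

    def is_running(self):
        return self.running

    def start(self):
        self.start_calls += 1
        self.running = True
```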
Test Post-requisites
- Provide test results in YAML format.
It's the responsibility of the test scripts to ensure that the system is in an expected state, e.g., file system unmounted and modules unloaded.
LUTF Threading Overview
[Gliffy diagram: LUTF threading overview]
Thread Description
- Listener: Listens for connections from LUTF Agents and for heartbeats used to monitor whether the Agents are alive.
- HeartBeat: Sends a periodic heartbeat to the LUTF Master to inform it that the Agent is still alive.
- Python Interpreter: Executes Python test scripts, which can call into one of the C/Python APIs provided.
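The interplay between the Listener and HeartBeat threads can be sketched as follows. This is a minimal illustration using Python's threading module, not the LUTF implementation; AgentMonitor, heartbeat_loop, the timeout value, and the send callable are all assumptions made for the example.

```python
import threading
import time


class AgentMonitor:
    """Master side: tracks the last heartbeat seen from each agent."""

    def __init__(self, timeout, clock=time.monotonic):
        self._timeout = timeout
        self._clock = clock          # injectable so the logic can be tested
        self._last_seen = {}
        self._lock = threading.Lock()

    def record_heartbeat(self, agent_id):
        with self._lock:
            self._last_seen[agent_id] = self._clock()

    def alive_agents(self):
        """Return agents whose last heartbeat is within the timeout."""
        now = self._clock()
        with self._lock:
            return sorted(agent for agent, seen in self._last_seen.items()
                          if now - seen <= self._timeout)


def heartbeat_loop(send, interval, stop_event):
    """Agent side: call send() once per interval until stop_event is set."""
    while not stop_event.is_set():
        send()
        # wait() returns early if the event is set, so shutdown is prompt.
        stop_event.wait(interval)
```

On an Agent, heartbeat_loop would typically run in its own threading.Thread, with send writing to the connection to the Master.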
C/Python APIs
C/Python Management API
- Parse configuration
- Provide status on the LUTF Agents
- Provide status on executing scripts
- Store results
C/Python Synchronization APIs
- Assign work to LUTF Agents from the LUTF Master
  - This will result in a YAML rpc block being sent to the LUTF Agent
- Wait for work completion events from LUTF Agents
- Register for asynchronous events
  - Asynchronous events come in the form of YAML blocks.
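One way to picture the assign-then-wait pattern on the Master is the sketch below. WorkTracker and its methods are hypothetical names chosen for illustration; the real synchronization API is exposed through the C/Python bindings, and the transport is abstracted here as a send callable.

```python
import queue
import threading


class WorkTracker:
    """Master-side bookkeeping for work handed to agents."""

    def __init__(self):
        self._events = queue.Queue()
        self._pending = set()
        self._lock = threading.Lock()

    def assign(self, agent_id, work, send):
        """Mark the agent as busy and hand the work to the transport."""
        with self._lock:
            self._pending.add(agent_id)
        send(agent_id, work)

    def notify_done(self, agent_id, result):
        """Called when a completion event arrives from an agent."""
        self._events.put((agent_id, result))

    def wait_all(self, timeout=None):
        """Block until every assigned agent has reported; return results.

        Raises queue.Empty if no event arrives within `timeout` seconds.
        """
        results = {}
        while True:
            with self._lock:
                if not self._pending:
                    return results
            agent_id, result = self._events.get(timeout=timeout)
            with self._lock:
                self._pending.discard(agent_id)
            results[agent_id] = result
```

A test that configures node A before starting traffic on node B could assign the configuration step, call wait_all(), and only then assign the traffic step.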
C/Python liblnetconfig APIs
- These are the configuration APIs in lnet/utils/lnetconfig/liblnetconfig.h
Other APIs can be wrapped in SWIG and exposed for the LUTF Python test scripts to call.
LUTF Test Scripts Design Overview
- The test scripts will be deployed on all nodes under test as well as the test master.
- Each test script will need to provide a run function.
  - This function is intended to be executed by the test master.
- The LUTF will provide at least one other function to perform the actual testing.
  - This function will be called remotely and will execute on the test node.
- Each test, which can be composed of arbitrary Python code, must return a YAML text block to the test master reporting the results of the operation.
- Every function should take a dictionary as its input parameter and return a dictionary as its result.
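A test script following these conventions might look roughly like the sketch below. The function configure_node, the remote_call hook, and the result keys are assumptions for illustration only; serializing the result to YAML for the test master is left out.

```python
def configure_node(params):
    """Runs on the test node and performs the actual test step.

    Takes a dictionary and returns a dictionary, per the convention above.
    """
    nid = params.get("nid")
    if not nid:
        return {"status": "FAIL", "error": "missing nid"}
    # A real script would call into the LNet configuration APIs here.
    return {"status": "PASS", "nid": nid}


def _local_call(agent_id, function, params):
    """Fallback used when no remote dispatcher is supplied."""
    return function(params)


def run(params, remote_call=None):
    """Entry point executed by the test master.

    `remote_call(agent_id, function, params)` is a hypothetical hook for
    dispatching to an agent; it defaults to a local call so the sketch runs
    without any LUTF infrastructure.
    """
    dispatch = _local_call if remote_call is None else remote_call
    result = dispatch(params.get("agent", "local"), configure_node, params)
    return {"test": "configure_node", "result": result}
```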
LUTF Communication Protocol
    rpc:
      target: agent_id
      type: function_call
      fname: function_name
      parameters:
        param0: value
        param1: value2
        param2: 1
        param3: [1, 2, 3]
        param4: 1.4
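To make the message layout concrete, the sketch below builds such an rpc block from Python values. It uses a deliberately tiny hand-written emitter so that it has no dependencies; build_rpc and _emit are hypothetical helpers, and a real implementation would presumably rely on a proper YAML library, since this one does no quoting or escaping.

```python
def _emit(mapping, indent):
    """Render a nested dict as block-style YAML lines (simple values only)."""
    pad = " " * indent
    lines = []
    for key, item in mapping.items():
        if isinstance(item, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(_emit(item, indent + 2))
        elif isinstance(item, (list, tuple)):
            inline = ", ".join(str(x) for x in item)
            lines.append(f"{pad}{key}: [{inline}]")
        else:
            lines.append(f"{pad}{key}: {item}")
    return lines


def build_rpc(target, fname, parameters):
    """Return the text of a function_call rpc block for one agent."""
    message = {
        "rpc": {
            "target": target,
            "type": "function_call",
            "fname": fname,
            "parameters": parameters,
        }
    }
    return "\n".join(_emit(message, 0)) + "\n"
```

Calling build_rpc("agent_id", "function_name", {...}) with the parameters from the example above produces text with the same nesting as that block.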
Test Environment Set-Up
Each node which will run the LUTF will need to have the following installed
...