...

Gliffy Diagram: LUTF Data Flow

LUTF Threading Overview

Gliffy Diagram: Threading Overview

The LUTF is designed to allow master-agent, agent-agent or master-master communication. The first phase of the implementation will support only master-agent communication.

Thread Description

  • Listener: Listens for connections from LUTF Agents and for heartbeats to monitor the aliveness of the Agents.
  • HeartBeat: Sends a periodic heartbeat to the LUTF Master to inform it that the agent is still alive (a sketch follows this list).
  • Python Interpreter: Executes python test scripts, which can call into the provided C/Python APIs.
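
As an illustration of the HeartBeat thread's role, below is a minimal sketch of an agent-side heartbeat loop. The address, port, interval and message format are assumptions for illustration only, not the actual LUTF wire protocol.

Code Block
# Hypothetical sketch of an agent-side heartbeat loop; not the actual LUTF code.
import socket
import threading

MASTER_ADDRESS = ("192.168.1.10", 8494)  # assumed master listener address/port
HB_INTERVAL = 5                          # assumed heartbeat period in seconds

def heartbeat_loop(agent_id, stop_event):
    while not stop_event.is_set():
        try:
            with socket.create_connection(MASTER_ADDRESS, timeout=2) as sock:
                # trivial text message; the real framing is unspecified here
                sock.sendall(("heartbeat: %s\n" % agent_id).encode())
        except OSError:
            pass  # master unreachable; keep trying so it can mark us alive later
        stop_event.wait(HB_INTERVAL)

stop = threading.Event()
hb = threading.Thread(target=heartbeat_loop, args=("agent-01", stop), daemon=True)
hb.start()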

...

auster will continue to set up the same environment variables it does as of this writing. lutf.sh will run the LUTF. Since the LUTF runs within the auster context, the python test scripts will have access to these environment variables and can use them the same way the bash test scripts do. If LUTF python scripts are executed on a remote node, the necessary information from the environment variables is delivered to those scripts.
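
For example, a python test script could read those variables directly; the variable names below are examples of what auster exports, not a definitive list.

Code Block
# Sketch: a LUTF python test script reading auster-provided environment
# variables the same way the bash test scripts do.
import os

mds_host = os.environ.get("mds_HOST")   # example auster-exported variable
only = os.environ.get("ONLY")           # example: restrict which tests run

if mds_host is None:
    raise RuntimeError("expected auster environment is not present")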

Test Prerequisites

Before each test, lutf.sh will provide functions to perform the following checks:

  1. If the master hasn't started, start it.
  2. If the agents on the specified nodes haven't started, start them.
  3. Verify the system is ready to start, i.e. the master and all agents are started.

Test Post-requisites

  1. Provide test results in YAML format.

It's the responsibility of the test scripts to ensure that the system is in an expected state, e.g. file system unmounted, modules unloaded, etc.
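
As an illustration of post-requisite 1, a test script could emit its results with PyYAML; the schema below is illustrative, not mandated by the LUTF.

Code Block
# Sketch: emitting test results as a YAML block (illustrative schema).
import yaml

results = {
    "results": [
        {"name": "sample_02", "status": "PASS", "duration": 12.4},
    ]
}

with open("lutf_results.yaml", "w") as f:
    yaml.safe_dump(results, f, default_flow_style=False)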

Auster will run the LUTF as follows:

Code Block
./auster -f lutfcfg -rsv -d /opt/results/ lutf [--suite <test suite name>] [--only <test case name>]
# example:
./auster -f lutfcfg -rsv -d /opt/results/ lutf --suite samples --only sample_02



LUTF Test Scripts Design Overview

...

To execute a function call on a remote node, the following RPC YAML block is sent:

Code Block
rpc:
   dst: agent_id # name of the agent to execute the function on
   src: source_name # name of the originator of the rpc
   type: function_call # Type of the RPC
   script: script_path # Path to the script which includes the function to execute
   fname: function_name # Name of the function to execute
   parameters: # Parameters to pass to the function
      param0: value0 # parameters can be string, integer, float or list
      param1: value1
      paramN: valueN
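
For illustration, the python RPC layer could build this block from a dictionary and serialize it with PyYAML; the values below are placeholders.

Code Block
# Sketch: serializing a function_call RPC into the YAML block above.
import yaml

rpc = {
    "rpc": {
        "dst": "agent-01",                   # placeholder agent name
        "src": "master",                     # placeholder originator name
        "type": "function_call",
        "script": "/path/to/test_script.py", # placeholder script path
        "fname": "methodA",
        "parameters": {"param0": "value0", "param1": 1},
    }
}

yaml_block = yaml.safe_dump(rpc, default_flow_style=False)
# yaml_block is then handed to the C API which sends it to the target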

To return the results of the script execution, the following RPC YAML block is sent:

Code Block
rpc:
   dst: agent_id # name of the agent to execute the function on
   src: source_name # name of the originator of the rpc
   type: results # Type of the RPC
   results:
      script: script_path # Path to the script which was executed
      return_code: python_object # return code of function which is a python object
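
On the receiving side, decoding the results block back into a python dictionary is a single PyYAML call, as sketched below with example content.

Code Block
# Sketch: decoding a results RPC back into a python dictionary.
import yaml

# example text as it would arrive from the C layer
yaml_block_received = """
rpc:
   dst: master
   src: agent-01
   type: results
   results:
      script: /path/to/test_script.py
      return_code: 0
"""

reply = yaml.safe_load(yaml_block_received)["rpc"]
if reply["type"] == "results":
    return_code = reply["results"]["return_code"]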

...

Code Block
####### Part of the LUTF infrastructure ########
# The BaseTest class is provided by the LUTF infrastructure
# The rpc method of the BaseTest class will take the parameters,
# serialize them into a YAML block and send it to the target specified.
class BaseTest(object):
   def __init__(self, target=None):
      # lutfrpc used below is the LUTF-provided RPC module
      self.remote = False
      self.target = target
      if target:
         self.remote = True

   def __getattribute__(self,name):
        attr = object.__getattribute__(self, name)
        if hasattr(attr, '__call__'):
            def newfunc(*args, **kwargs):
                if self.remote:
                    # execute on the remote defined by:
                    #     self.target
                    #     attr.__name__ = name of function
                    #     type(self).__name__ = name of class  
                    result = lutfrpc.send_rpc(self.target, attr.__name__, type(self).__name__, *args, **kwargs)
                else:
                    result = attr(*args, **kwargs)
                return result
            return newfunc
        else:
            return attr

###### In the test script ######
# Each test case will inherit from the BaseTest class.
class Test_1a(BaseTest):
   def __init__(self, target=None):
      # call base constructor
      super(Test_1a, self).__init__(target)
   def methodA(self, parameters):
      # do some test logic
      pass
   def methodB(self, parameters):
      # do some more test logic
      pass

# The run function will be executed by the LUTF master.
# It will instantiate the Test, or the step of the test to run,
# then call the class' methods, providing them with a dictionary
# of parameters.
def run(dictionary, results):
   target = lutf.get_target('mds')
   # do some logic
   test_1a = Test_1a(target)
   result = test_1a.methodA(params)
   if result:  # check that methodA succeeded
       result2 = test_1a.methodB(more_params)
   # append the results_yaml to the global results

To simplify matters, test parameters take only a dictionary as input. The dictionary can include arbitrary data, which can eventually be encoded in YAML.


Communication Infrastructure

Gliffy Diagram: CallFlow

The LUTF-provided RPC communication relies on a simple socket implementation.

  1. The LUTF Python RPC call will package the following into a YAML block:
    1. absolute file path
    2. class name
    3. function name
    4. arguments passed to the function
  2. The LUTF Python RPC call will call into an LUTF-provided C API to send the rpc text block to the specified target and block for a response
  3. The LUTF agent listener will receive the rpc YAML text block and pass it up to the python layer
  4. The python layer will parse the rpc YAML text block into a python dictionary, then instantiate the class specified and call the method
  5. It will take the return values from the executed method, pack them into an RPC YAML block, and call the same C API to send the YAML block back to the waiting master (see the dispatch sketch after this list)
  6. The master will receive the RPC YAML text block and pass it up to the python RPC layer
  7. The python RPC layer will decode the YAML text block into a python dictionary and return the results
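
Steps 3-5 on the agent side could look like the sketch below. The class name is assumed to be carried in the RPC block (per step 1), and send_response() is a hypothetical stand-in for the LUTF C send API.

Code Block
# Sketch of agent-side dispatch (steps 3-5); helper names are hypothetical.
import importlib.util
import yaml

def send_response(yaml_text):
    # hypothetical stand-in for the LUTF C API that sends the block back
    print(yaml_text)

def handle_rpc(yaml_text):
    req = yaml.safe_load(yaml_text)["rpc"]

    # step 4: load the script which defines the test class
    spec = importlib.util.spec_from_file_location("lutf_script", req["script"])
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

    # instantiate the class specified and call the method
    cls = getattr(module, req["class"])   # class name carried in the RPC block
    instance = cls()                      # no target given, so it runs locally
    rc = getattr(instance, req["fname"])(**req.get("parameters", {}))

    # step 5: pack the return value into a results RPC and send it back
    reply = {"rpc": {"dst": req["src"], "src": req["dst"], "type": "results",
                     "results": {"script": req["script"], "return_code": rc}}}
    send_response(yaml.safe_dump(reply))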

This mechanism will also allow the test class methods to be executed locally by not providing a target.

The LUTF can read all the environment variables provided and encode them into the YAML being sent to the node under test. This way the node under test has all the information it needs to execute.
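
A sketch of that encoding, assuming the environment simply rides along in the RPC YAML under a hypothetical environment key:

Code Block
# Sketch: shipping the local environment to the node under test.
import os
import yaml

env_block = yaml.safe_dump({"environment": dict(os.environ)})
# on the remote side, the script can restore the variables with:
#    os.environ.update(yaml.safe_load(env_block)["environment"])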

Test Environment Set-Up

Each node which will run the LUTF will need to have the following installed

  1. ncurses library
    1. yum install ncurses-devel
  2. readline library
    1. yum install readline-devel
  3. rlwrap: used when telnetting into the LUTF telnet server. Allows using up/down arrows and other readline features
    1. yum install rlwrap 
  4. python 3.6+
    1. yum install python3
  5. paramiko
    1. pip3 install paramiko 
  6. netifaces
    1. pip3 install netifaces 
  7. PyYAML
    1. pip3 install pyyaml 

The LUTF will also require that passwordless ssh is set up for all the nodes which run the LUTF. This task is already done when the AT sets up the test cluster.

Building the LUTF

The LUTF shall be integrated with the Lustre tests under lustre/tests/lutf. The LUTF will be built and packaged with the standard Lustre build process:

Code Block
sh ./autogen.sh
./configure --with-linux=<kernel path>
make
# optionally
make rpms
# optionally
make install

The make system will build the following items:

  1. lutf binary
  2. liblutf_agent.so: shared library to communicate with the LUTF backend.
  3. clutf_agent.py and _clutf_agent.so: glue code that allows python to call functions in liblutf_agent.so
  4. clutf_global.py and _clutf_global.so: glue code that allows python to call functions in liblutf_global.so
  5. lnetconfig.py and _lnetconfig.so: glue code that allows python test scripts to utilize the DLC interface.

The build process will check that python 3.6 and SWIG 3.0 or higher are installed before building. If these requirements are not met, the LUTF will not be built.

If the LUTF is built it will be packaged in the lustre-tests rpm and installed in /usr/lib64/lustre/tests/lutf.

Tasks

C infrastructure
  • lutf binary
  • listener thread
  • Heart beat
  • python integration
    • Look into having a choice between python 3.x and python 2.7.x
  • IPC
    • Manage connections between the master and the agents
    • Track the agents
    • Provide APIs for Request/Response Pair
      • These APIs will block in the calling thread until a response is received
      • TODO: What happens if we're calling these APIs from separate Python threads?
        • What I'm trying to get at is how a script can spawn python threads that do RPCs while the main test thread continues doing other test logic (see the sketch after this table).
  • API for managing and querying the state kept by the C infrastructure
    • agent information
SWIG
  • SWIG infrastructure to call C APIs
    • liblnetconfig
    • LUTF Agent Management
    • LUTF RPC
lutf.sh
  • Spawn the master and agents appropriately
  • Pass to the master the suite or specific test to run. If nothing is provided all suites are run.
  • Waits on the master until it exits after running the tests
lutf Python Library
  • Association between Agents and node roles (MGS/MDS/etc.)
    • i.e. build a view of the cluster as identified by the provided environment variables.
  • API for querying the Agents
  • Automatically loaded and initialized
  • API for suites and scripts management and execution
  • Use the lutf Provisioning Library to clean the cluster before running each test.
lutf Provisioning Library
  • API to provision LNet and lnet_selftest
  • API to provision the Lustre File System
    • The API should take a dictionary of the different nodes and, based on the node types, spawn a simple file system
  • Both APIs can be used together.
    • use the LNet provisioning API to provision and configure LNet
    • use the Lustre FS provisioning API to provision the file system on top of the configured LNet
  • API to un-provision a cluster described in a python dictionary
lutf logging infrastructure
  • Set lustre logging levels
  • Collect lustre logs
  • Collect syslogs
  • Provide debugging level infrastructure for the test scripts (probably just use the provided Python logging)
  • API for storing YAML results.
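
Regarding the threading TODO in the table above, the intended usage could look like the sketch below, where each worker thread issues a blocking send_rpc() (the call used by BaseTest earlier); lutfrpc is assumed to be importable from the LUTF infrastructure.

Code Block
# Sketch for the threading TODO: worker threads block on RPCs while the
# main test thread keeps running. Assumes lutfrpc from the LUTF infrastructure.
import threading

def worker(target, results, idx):
    # send_rpc blocks this thread until the response arrives
    results[idx] = lutfrpc.send_rpc(target, "methodA", "Test_1a", idx)

results = [None, None]
threads = [threading.Thread(target=worker, args=(tgt, results, i))
           for i, tgt in enumerate(["agent-01", "agent-02"])]
for t in threads:
    t.start()
# ... the main test thread can do other test logic here ...
for t in threads:
    t.join()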


OLD INFORMATION

TODO: Below is old information still being cleaned up

Test Environment Set-Up

Each node which will run the LUTF will need to have the following installed

  1. ncurses library
    1. yum install ncurses-devel
  2. readline library
    1. yum install readline-devel
  3. python 2.7.5
    1. https://www.python.org/download/releases/2.7.5/
    2. ./configure --prefix=<> --enable-shared # it is recommended to install in standard system path
    3. make; make install
  4. setuptools
    1. https://pypi.python.org/pypi/setuptools
    2. The way it worked for me:
      1. Download package and untar
      2. python2.7 setup.py install
  5. psutil
    1. https://pypi.python.org/pypi?:action=display&name=psutil
      1. untar
      2. cd to untarred directory
      3. python2.7 setup.py install
  6. netifaces
    1. https://pypi.python.org/pypi/netifaces
  7. Install PyYAML
    1. pip install pyyaml

The LUTF will also require that passwordless ssh is set up for all the nodes which run the LUTF. This task is already done when the AT sets up the test cluster.


LUTF Configuration Files

Setup YAML Configuration File

...