Reference

LU-18029

Objective

This document is the basis for the design and implementation of a more flexible and safer way to configure the LNet module.

Static configuration

Currently the system is configured through a static module configuration file. The lnetctl commands are another way to configure the LNet modules.

The network configuration is also done through this interface, but it limits flexibility and may pose a security and stability problem because the parsing is done in the kernel.

Framework

Static configuration

Basic module parameters (timeouts, retries, max, ...) shall still be configurable. There should be an option to enable and disable certain functions such as mrrouting, arp and any future functionality.

Dynamic configuration

It shall be possible to add, remove and modify scripts dynamically. Those changes take effect at the next startup.
The sequence of script execution shall be configurable. The scripts contain mainly, but not exclusively, network and parameter configuration that can be set with lnetctl.

User level configuration

The configuration shall happen at user level, running as root. The scripts will be executed at the next Lustre restart. This should ensure that a faulty configuration does not end in a kernel crash.

The correct interpretation of options that enable or disable functions (e.g. skip_mrrouting) in the static configuration is handled by the scripts.
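How a script might interpret such an option can be sketched as follows. This is only an illustration under stated assumptions: it assumes the static configuration uses modprobe-style `options <module> key=value` lines, the helper names `parse_module_options` and `is_enabled` are hypothetical, and `skip_mrrouting` is simply the example name used above.

```python
import shlex


def parse_module_options(text, module):
    """Collect key=value pairs from 'options <module> ...' lines.

    Assumes modprobe-style syntax; '#' starts a comment.
    """
    options = {}
    for line in text.splitlines():
        tokens = shlex.split(line, comments=True)
        if len(tokens) < 2 or tokens[0] != "options" or tokens[1] != module:
            continue
        for token in tokens[2:]:
            key, sep, value = token.partition("=")
            options[key] = value if sep else ""
    return options


def is_enabled(options, flag):
    """Treat '1', 'y' and 'Y' as true; a missing flag counts as disabled."""
    return options.get(flag, "0") in ("1", "y", "Y")
```

A script would call `is_enabled(opts, "skip_mrrouting")` and skip its routing setup when the result is true, so the decision stays in user space.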

Variable system parameters

These would be the responsibility of the scripts, but have to be supported by the framework.

EMF integration

EMF integration is also the responsibility of the scripts and should be supported.

Implementation

Creating new lustre rule and script for LNet configuration

A dynamic configuration could be implemented by a rule for the lustre module that is executed after loading the module. An additional rule would be necessary that executes a script.

The first step would be a test for the LNet network configuration.

Creating new conf directory

The next step would be to introduce a directory that contains the scripts. A control script has to be written that processes all configuration files with a specific extension in lexical order, similar to .conf.d directories in Linux.

The LNet network configuration will be moved into the directory.
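The control script described above could look roughly like the following sketch. The directory `/etc/lnet.conf.d`, the `.conf` extension, the use of `/bin/sh` as interpreter, and the function names are all assumptions for illustration; the design does not fix any of them yet.

```python
import subprocess
from pathlib import Path

# Hypothetical defaults; the design does not name the directory or extension.
CONF_DIR = Path("/etc/lnet.conf.d")
EXTENSION = ".conf"


def ordered_scripts(conf_dir, extension=EXTENSION):
    """Return regular files in conf_dir with the given extension, in lexical order."""
    conf_dir = Path(conf_dir)
    if not conf_dir.is_dir():
        return []
    candidates = (
        p for p in conf_dir.iterdir()
        if p.is_file() and p.name.endswith(extension)
    )
    return sorted(candidates, key=lambda p: p.name)


def run_all(conf_dir=CONF_DIR, runner=subprocess.run):
    """Run each script in order and return the names of those that failed.

    A failing script is recorded but does not stop the remaining scripts.
    """
    failed = []
    for script in ordered_scripts(conf_dir):
        result = runner(["/bin/sh", str(script)])
        if result.returncode != 0:
            failed.append(script.name)
    return failed
```

With numeric prefixes such as `10-network.conf` and `20-routing.conf`, the execution sequence can be changed by renaming files. Whether a failing script should abort the whole sequence is a design choice that is left open here.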

Adjusting existing scripts and moving them to the conf directory

Next, existing scripts such as the mrrouting and arp settings can be transitioned to the new structure, and the related rules and code in the lnetctl module can be removed.

Transitioning ip2net (optional)

Once this framework is implemented, ip2net can also be configured from user-level scripts. This makes it possible to remove ip2net parsing and execution from the kernel.
The ip2net parsing should be implemented in lnetctl, which would then use individual ioctl/netlink commands to add the interfaces.
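To illustrate what user-level ip2net matching involves, here is a simplified sketch. It handles only a subset of the syntax (per-octet `*`, plain numbers, and bracket lists such as `[2,4]` or `[2-8/2]`), the function names are hypothetical, and it does not describe the existing kernel parser or any planned lnetctl code.

```python
import re


def _octet_matches(field, value):
    """Match one octet against '*', a number, or a '[...]' list of numbers/ranges."""
    if field == "*":
        return True
    if field.startswith("[") and field.endswith("]"):
        for item in field[1:-1].split(","):
            m = re.fullmatch(r"(\d+)(?:-(\d+)(?:/(\d+))?)?", item.strip())
            if not m:
                raise ValueError(f"bad range item: {item!r}")
            lo = int(m.group(1))
            hi = int(m.group(2)) if m.group(2) else lo
            step = int(m.group(3)) if m.group(3) else 1
            if step < 1:
                raise ValueError(f"bad stride in: {item!r}")
            if lo <= value <= hi and (value - lo) % step == 0:
                return True
        return False
    return field.isdigit() and int(field) == value


def ip_matches(pattern, ip):
    """Return True if a dotted IPv4 address matches a four-field pattern."""
    fields = pattern.split(".")
    octets = [int(o) for o in ip.split(".")]
    if len(fields) != 4 or len(octets) != 4:
        raise ValueError("expected four dot-separated fields")
    return all(_octet_matches(f, o) for f, o in zip(fields, octets))


def networks_for(ip2nets, local_ips):
    """Return (net spec, local IP) pairs, using the first matching rule per network."""
    selected, seen = [], set()
    for rule in ip2nets.split(";"):
        parts = rule.split()
        if not parts:
            continue
        spec, patterns = parts[0], parts[1:]
        name = spec.split("(")[0]  # drop an optional "(iface)" suffix
        if name in seen:
            continue
        for ip in local_ips:
            if any(ip_matches(p, ip) for p in patterns):
                selected.append((spec, ip))
                seen.add(name)
                break
    return selected
```

Each selected pair would then be turned into a single add-interface request, so the kernel receives only already-validated, individual commands.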


A further step (down the road)

This step is optional and becomes possible once dynamic configuration is implemented and lnetctl is modified.
It consists of removing the ip2net configuration from the static configuration file and the ip2net parsing from the kernel module.


1 Comment

  1. In general, I don't see a clear description of the motivation for this.  Beyond "flexible" and "safety" it would be useful to reference specific examples of what is missing or limiting in the current implementation.  Not that I'm against this improvement, but any change brings bugs along with it, so I'd like to better understand how serious the issues are that are being faced by customers that this feature would resolve vs. other LNet improvements (e.g. LU-17515 or EX-9066) that will also improve usability and robustness of the system?