...
UDSPs are configured from lnetctl, either on the command line or through YAML config files, and then passed to the kernel. Policies are stored in the kernel and applied to all local networks and remote peers. During the selection process the policies are examined as part of the selection algorithm. Because policies are user defined, they take top priority in the selection process. The rest of the selection criteria are then applied to the subset of interfaces which match the policies.
UDSP Rule Types
Outlined below are the UDSP rule types:
- Network rules
- NID rules
- NID Pair rules
- Net Pair rules
- Router rules
Network Rules
These rules define the relative priority of the networks against each other. 0 is the highest priority. Networks with higher priority are preferred by the selection algorithm, unless the network has no healthy interfaces. If an interface on another network can be used and it is healthier than any interface available on the current network, then that interface will be used. Health always trumps all other criteria.
NID Rules
These rules define the relative priority of individual NIDs. 0 is the highest priority. Once a network is selected, the NID with the highest priority is preferred. Note that NID priority ranks below health. For example, given two NIDs, NID-A and NID-B, where NID-A has the higher priority but the lower health value, NID-B will still be selected. In that sense the policies act as a hint to guide the selection algorithm.
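The NID-A / NID-B example can be sketched as a two-level comparison. This is only an illustration, not LNet code: the `Nid` class, the `select_nid` helper and the numeric health scale (larger means healthier) are assumptions made for the example.

```python
class Nid:
    """Hypothetical stand-in for a NID with the two fields discussed here."""
    def __init__(self, name, health, priority):
        self.name = name
        self.health = health      # assumed scale: larger means healthier
        self.priority = priority  # 0 is the highest priority


def select_nid(candidates):
    # Health is compared first; priority only breaks ties between
    # equally healthy NIDs. Priority is negated because 0 is the best.
    return max(candidates, key=lambda n: (n.health, -n.priority))


nid_a = Nid("NID-A", health=900, priority=0)   # preferred by policy, less healthy
nid_b = Nid("NID-B", health=1000, priority=1)  # lower priority, fully healthy
print(select_nid([nid_a, nid_b]).name)  # NID-B
```

If both NIDs report the same health value, the same function returns NID-A, which is the case where the policy acts as the deciding hint.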
NID Pair Rules
Design Principle
Rules shall be defined through user space and passed to the LNet module. LNet shall store these rules. The net priority and NI priority are separate rules, stored in a separate data structure. Once they are configured they can be applied to the networks. The advantage is that rules are not strictly tied to the internal constructs: they can be applied whenever the internal constructs are created, and if the internal constructs are deleted the rules remain and can be automatically applied at a future time.
...
These rules define the relative priority of paths. 0 is the highest priority. Once a destination NID is selected the source NID with the highest priority is selected to send from.
Net Pair Rules
Net Pair Rules are a generalization of the NID Pair Rules. They assign a priority to all NIDs on two different networks. This can be done by using the ip2nets format when defining the NID Pair Rules, for example: *@tcp → *@o2ib
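A minimal sketch of how a wildcard pair rule such as *@tcp → *@o2ib could be matched against a concrete source and destination NID. Shell-style globbing from Python's `fnmatch` module is used here as a simplified stand-in for ip2nets matching; the rule table, the NID values and the `pair_priority` helper are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical rule table: (source pattern, destination pattern, priority).
# 0 is the highest priority.
PAIR_RULES = [
    ("*@tcp", "*@o2ib", 0),               # Net Pair rule: any tcp NID to any o2ib NID
    ("10.0.0.1@tcp", "10.0.1.5@tcp", 1),  # NID Pair rule: one specific path
]


def pair_priority(src_nid, dst_nid, rules=PAIR_RULES):
    """Return the best (lowest) priority among matching rules, or None."""
    matches = [prio for src_pat, dst_pat, prio in rules
               if fnmatchcase(src_nid, src_pat) and fnmatchcase(dst_nid, dst_pat)]
    return min(matches) if matches else None


print(pair_priority("10.0.0.7@tcp", "10.0.1.9@o2ib"))  # 0
print(pair_priority("10.0.0.7@o2ib", "10.0.1.9@tcp"))  # None
```

A NID Pair rule is then simply the case where both patterns contain no wildcard.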
Router Rules
Router Rules define which set of routers to use. When defining a network there could be paths which are better than others. To have more control over the path traffic takes, admins configure interfaces on different networks and split up the router pools among the networks. However, this results in complex configuration, which is hard to maintain and error prone. It is much more desirable to configure all interfaces on the same network, and then define which routers to use when sending to a remote peer. Router Rules allow this functionality.
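As a rough illustration of the idea, a router rule can be thought of as a mapping from a remote peer pattern to the routers that should carry traffic to it. Everything in this sketch is an assumption made for the example: the rule table, the NIDs, the `preferred_routers` helper, and in particular the fallback to all available routers when no rule matches or none of the preferred routers is available.

```python
from fnmatch import fnmatchcase

# Hypothetical rules: (remote peer pattern, routers to prefer for that peer).
ROUTER_RULES = [
    ("*@o2ib1", ["10.0.0.1@o2ib", "10.0.0.2@o2ib"]),
]


def preferred_routers(dst_nid, available, rules=ROUTER_RULES):
    """Narrow the available routers down to those a matching rule names."""
    for pattern, routers in rules:
        if fnmatchcase(dst_nid, pattern):
            chosen = [r for r in available if r in routers]
            if chosen:
                return chosen
    # Assumed fallback: with no usable rule, every available router is a candidate.
    return list(available)


available = ["10.0.0.1@o2ib", "10.0.0.3@o2ib"]
print(preferred_routers("10.1.0.5@o2ib1", available))  # ['10.0.0.1@o2ib']
```

All interfaces stay on the same network in this model; only the rule table decides which routers are candidates for a given remote peer.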
Design Principles
Rule Storage
Rules shall be defined through user space and passed to the LNet module. LNet shall store these rules on a policy list. Each rule type will have its own list. Once policies are added to LNet they will be applied to existing networks, NIDs and routers. The advantage of this approach is that rules are not strictly tied to the internal constructs, i.e. networks, NIDs or routers. Rules can be applied whenever the internal constructs are created, and if the internal constructs are deleted the rules remain and can be automatically applied at a future time.
This makes configuration easy since a set of rules can be defined, like "all IB networks priority 1", "all Gemini networks priority 2", etc, and when a network is added, it automatically inherits these rules.
Rule Application
Performance needs to be taken into account with this feature. It is not feasible to traverse the policy lists on every send operation, as this would add unnecessary overhead. Instead, when rules are applied they have to be "flattened" onto the constructs they impact. For example, a Network Rule is added as follows: o2ib priority 0. This rule gives priority to the o2ib network for sending. A priority field will be added to the network and set to 0 for the o2ib network. As the selection algorithm traverses the networks, which the current code already does, it compares the priority field. This is cheaper than examining the policies for every network to see whether they match.
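The storage and flattening behaviour described above can be sketched as follows. This is a user-space illustration only, not the kernel data structures: the `PolicyStore` class, its method names, the glob-style patterns and the use of `None` for "no rule applied" are all assumptions.

```python
from fnmatch import fnmatchcase


class PolicyStore:
    """Hypothetical model of a network policy list plus flattened priorities."""

    def __init__(self):
        self.net_rules = []  # policy list: (network pattern, priority)
        self.networks = {}   # network name -> flattened priority (None if no rule)

    def _lookup(self, name):
        matches = [prio for pat, prio in self.net_rules if fnmatchcase(name, pat)]
        return min(matches) if matches else None  # 0 is the highest priority

    def add_rule(self, pattern, priority):
        self.net_rules.append((pattern, priority))
        # Flatten onto networks that already exist.
        for name in self.networks:
            self.networks[name] = self._lookup(name)

    def add_network(self, name):
        # A new network inherits any stored rule at creation time.
        self.networks[name] = self._lookup(name)

    def del_network(self, name):
        # The network goes away; the policy list is left untouched.
        self.networks.pop(name, None)


store = PolicyStore()
store.add_rule("o2ib*", 1)   # "all IB networks priority 1"
store.add_rule("gni*", 2)    # "all Gemini networks priority 2"
store.add_network("o2ib0")
store.del_network("o2ib0")
store.add_network("o2ib0")   # the rule is re-applied automatically
print(store.networks["o2ib0"])  # 1
```

The policy list is only walked when a rule or a network is added. A send path would read the flattened value in `networks` and never touch `net_rules`.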
Rule Structure
Selection policy rules consist of two parts:
...
cfg-100, cfg-105, cfg-110, cfg-115, cfg-120, cfg-125, cfg-130, cfg-135, cfg-140, cfg-160, cfg-165
lnetctl Interface
YAML Syntax
Each selection rule will translate into a separate IOCTL to the kernel.
Flattening rules
Rules will have serialize and deserialize APIs. The serialize API will flatten the rules into a contiguous buffer that will be sent to the kernel. On the kernel side the rules will be deserialized to be stored and queried. When user space queries the rules, they are serialized and sent up to user space, which deserializes them and prints them in YAML format.
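The round trip through a contiguous buffer can be illustrated with a small sketch. The wire layout used here (a little-endian 32-bit priority, a 16-bit pattern length, then the pattern bytes) is purely hypothetical and is not the format LNet uses; the function names are likewise invented for the example.

```python
import struct

# Hypothetical record header: u32 priority, u16 pattern length, little-endian.
_HEADER = struct.Struct("<IH")


def serialize(rules):
    """Flatten (pattern, priority) rules into one contiguous bytes buffer."""
    buf = bytearray()
    for pattern, priority in rules:
        raw = pattern.encode("ascii")
        buf += _HEADER.pack(priority, len(raw))
        buf += raw
    return bytes(buf)


def deserialize(buf):
    """Rebuild the rule list by walking the buffer record by record."""
    rules, offset = [], 0
    while offset < len(buf):
        priority, length = _HEADER.unpack_from(buf, offset)
        offset += _HEADER.size
        pattern = buf[offset:offset + length].decode("ascii")
        offset += length
        rules.append((pattern, priority))
    return rules


rules = [("o2ib*", 0), ("*@tcp", 1)]
blob = serialize(rules)
print(deserialize(blob) == rules)  # True
```

Using length-prefixed records keeps the buffer free of pointers, so the same bytes can be copied across the user/kernel boundary and parsed on either side.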
...