...

UDSPs are configured from lnetctl, via either the command line or YAML config files, and then passed to the kernel. Policies are stored in the kernel and applied to all local networks and remote peers. During the selection process the policies are examined as part of the selection algorithm. Because they are user defined, policies take top priority in the selection process. The rest of the selection criteria are applied to the subset of interfaces which match the policies.

UDSP Rule Types

Outlined below are the UDSP rule types:

  1. Network rules
  2. NID rules
  3. NID Pair rules
  4. NET Pair rules
  5. Router rules

Network Rules

These rules define the relative priority of the networks against each other. 0 is the highest priority. Networks with higher priorities will be selected during the selection algorithm, unless the network has no healthy interfaces. If there is a usable interface on another network and it is healthier than any interface available on the current network, then that interface will be used. Health will always trump all other criteria.

NID Rules

These rules define the relative priority of individual NIDs. 0 is the highest priority. Once a network is selected the NID with the highest priority is preferred. Note that NID priority ranks below health. For example, suppose there are two NIDs, NID-A and NID-B. If NID-A has a higher priority but a lower health value, NID-B will still be selected. In that sense the policies are a hint that guides the selection algorithm.
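The NID-A and NID-B example can be sketched as a sort key in which health is compared first and priority only breaks ties. This is a minimal illustration, not the LNet implementation: the `Nid` class, the `select_nid` function, and the convention that a larger health value means a healthier interface are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Nid:
    name: str
    health: int    # assumption: larger value means healthier
    priority: int  # 0 is the highest priority


def select_nid(nids):
    """Pick the healthiest NID; priority only breaks ties in health."""
    # Health is negated so that min() prefers the largest health value,
    # then the smallest priority number (0 wins).
    return min(nids, key=lambda n: (-n.health, n.priority))


nid_a = Nid("NID-A", health=500, priority=0)   # preferred by policy, less healthy
nid_b = Nid("NID-B", health=1000, priority=1)  # lower priority, healthier
print(select_nid([nid_a, nid_b]).name)  # NID-B
```

With equal health values the same function returns NID-A, which is where the policy acts as a hint.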

NID Pair Rules

These rules define the relative priority of paths. 0 is the highest priority. Once a destination NID is selected the source NID with the highest priority is selected to send from.

Net Pair Rules

Net Pair Rules are a generalization of the NID Pair Rules. They assign a priority to all NIDs on two different networks. This can be done by using the ip2nets format when defining the NID Pair Rules. For example: *@tcp → *@o2ib
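As a rough illustration of how a pair rule such as *@tcp → *@o2ib could steer source selection, the sketch below matches local and remote NIDs against wildcard patterns. Several things here are assumptions: Python's `fnmatchcase` glob matching stands in for the real ip2nets syntax (which is richer), the function names are hypothetical, the best priority among matching rules is taken when several match, and source NIDs with no matching rule rank last. Health is left out to keep the example short.

```python
from fnmatch import fnmatchcase


def pair_priority(local, remote, rules):
    """Return the best (lowest) priority of any rule matching the pair, or None."""
    matches = [prio for lpat, rpat, prio in rules
               if fnmatchcase(local, lpat) and fnmatchcase(remote, rpat)]
    return min(matches) if matches else None


def pick_source(local_nids, remote, rules):
    """Choose the local NID to send from once the destination NID is known."""
    def key(local):
        prio = pair_priority(local, remote, rules)
        # Unmatched NIDs sort after every matched NID (assumption).
        return (prio is None, prio if prio is not None else 0)
    return min(local_nids, key=key)


# Hypothetical rule table: (local pattern, remote pattern, priority).
rules = [("*@tcp", "*@o2ib", 0)]
source = pick_source(["192.168.1.2@o2ib1", "10.0.0.1@tcp"], "10.10.0.7@o2ib", rules)
print(source)  # 10.0.0.1@tcp
```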

Router Rules

Router Rules define which set of routers to use. When defining a network there may be paths which are more optimal than others. To gain more control over the path that traffic takes, admins configure interfaces on different networks and split up the router pools among the networks. However, this results in a complex configuration which is hard to maintain and error prone. It is much more desirable to configure all interfaces on the same network, and then define which routers to use when sending to a remote peer. Router Rules allow this functionality.

Design Principles

Rule Storage

Rules shall be defined through user space and passed to the LNet module. LNet shall store these rules on a policy list. Each rule type will have its own list. Once policies are added to LNet they will be applied to existing networks, NIDs and routers. The advantage of this approach is that rules are not strictly tied to the internal constructs (i.e. networks, NIDs or routers), but can be applied whenever the internal constructs are created. If the internal constructs are deleted, the rules remain and can be automatically applied at a future time.

This makes configuration easy since a set of rules can be defined, like "all IB networks priority 1", "all Gemini networks priority 2", etc, and when a network is added, it automatically inherits these rules.

Rule Application

Performance needs to be taken into account with this feature. It is not feasible to traverse the policy lists on every send operation, since this would add unnecessary overhead. When rules are applied they have to be "flattened" to the constructs they impact. For example, a Network Rule is added as follows: o2ib priority 0. This rule gives priority to using the o2ib network for sending. A priority field will be added to the network, and it will be set to 0 for the o2ib network. As we traverse the networks in the selection algorithm, which is part of the current code, the priority field will be compared. This is more efficient than examining the policies for every network to see whether they match.
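The storage and flattening ideas above can be sketched together as follows. This is only an illustration under stated assumptions: `LNetState`, `Network`, and their methods are hypothetical names, glob matching via `fnmatchcase` stands in for the real network pattern syntax, the best matching priority wins when several rules match, and health is ignored. The point it shows is that rules live on their own list, are flattened into a per-network priority field when a rule or a network is added, and that the selection loop reads only that field.

```python
from fnmatch import fnmatchcase


class Network:
    def __init__(self, name):
        self.name = name
        self.priority = None  # flattened rule result; None means no rule matched


class LNetState:
    def __init__(self):
        self.net_rules = []  # policy list of (pattern, priority) tuples
        self.networks = {}   # network name -> Network

    def _apply_rules(self, net):
        # Slow path: walk the policy list once and cache the result on the network.
        matches = [prio for pat, prio in self.net_rules if fnmatchcase(net.name, pat)]
        net.priority = min(matches) if matches else None

    def add_net_rule(self, pattern, priority):
        self.net_rules.append((pattern, priority))
        for net in self.networks.values():  # apply to networks that already exist
            self._apply_rules(net)

    def add_network(self, name):
        net = Network(name)
        self._apply_rules(net)  # a new network inherits the stored rules
        self.networks[name] = net
        return net

    def del_network(self, name):
        del self.networks[name]  # the rules stay on the policy list

    def select_network(self):
        # Hot path: compares only the cached field, never walks net_rules.
        return min(self.networks.values(),
                   key=lambda n: (n.priority is None, n.priority or 0))


state = LNetState()
state.add_network("tcp")
state.add_net_rule("o2ib*", 0)
state.add_network("o2ib1")            # inherits priority 0 on creation
print(state.select_network().name)    # o2ib1
state.del_network("o2ib1")
state.add_network("o2ib1")            # rule is re-applied automatically
```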

Rule Structure

Selection policy rules are comprised of two parts:

...

cfg-100, cfg-105, cfg-110, cfg-115, cfg-120, cfg-125, cfg-130, cfg-135, cfg-140, cfg-160, cfg-165

lnetctl Interface

# Adding a network priority rule. If the NI under the network doesn't have
# an explicit priority set, it'll inherit the network priority:
lnetctl > selection net [add | del | show] -h
Usage: selection net add --net <network name> --priority <priority>
  
WHERE:
 
selection net add: add a selection rule based on the network priority
        --net: network string (e.g. o2ib or o2ib* or o2ib[1,2])
        --priority: Rule priority
 
Usage: selection net del --net <network name> [--id <rule id>]
  
WHERE:
 
selection net del: delete a selection rule given the network pattern or the id. If both
                   are provided they need to match or an error is returned.
        --net: network string (e.g. o2ib or o2ib* or o2ib[1,2])
        --id: ID assigned to the rule returned by the show command.
  
Usage: selection net show [--net <network name>]
 
WHERE:
 
selection net show: show selection rules and filter on network name if provided.
        --net: network string (e.g. o2ib or o2ib* or o2ib[1,2])
  
# Add a NID priority rule. All NIDs added that match this pattern shall be assigned
# the identified priority. When the selection algorithm runs it shall prefer NIDs with
# higher priority.
lnetctl > selection nid [add | del | show] -h
Usage: selection nid add --nid <NID> --priority <priority>
 
WHERE:
 
selection nid add: add a selection rule based on the nid pattern
        --nid: nid pattern which follows the same syntax as ip2net
        --priority: Rule priority
 
 
Usage: selection nid del --nid <NID> [--id <rule id>]
 
WHERE:
 
selection nid del: delete a selection rule given the nid pattern or the id. If both
                   are provided they need to match or an error is returned.
        --nid: nid pattern which follows the same syntax as ip2net
        --id: ID assigned to the rule returned by the show command.
 
 
Usage: selection nid show [--nid <NID>]
 
WHERE:
 
selection nid show: show selection rules and filter on NID pattern if provided.
        --nid: nid pattern which follows the same syntax as ip2net
 
# Adding a point-to-point rule. This creates an association between a local NI and a remote
# NID, and assigns a priority to this relationship so that it's preferred when selecting a pathway.
lnetctl > selection peer [add | del | show] -h
Usage: selection peer add --local <NID> --remote <NID> --priority <priority>
 
WHERE:
 
selection peer add: add a selection rule based on local to remote pathway
        --local: nid pattern which follows the same syntax as ip2net
        --remote: nid pattern which follows the same syntax as ip2net
        --priority: Rule priority
 
Usage: selection peer del --local <NID> --remote <NID> --id <ID>
 
WHERE:
 
selection peer del: delete a selection rule based on local to remote NID pattern or id
        --local: nid pattern which follows the same syntax as ip2net
        --remote: nid pattern which follows the same syntax as ip2net
        --id: ID of the rule as provided by the show command.
 
Usage: selection peer show [--local <NID>] [--remote <NID>]
 
WHERE:
 
selection peer show: show selection rules and filter on NID patterns if provided.
        --local: nid pattern which follows the same syntax as ip2net
        --remote: nid pattern which follows the same syntax as ip2net
 
# The output will be in the same YAML format as the input described below.

YAML Syntax

Each selection rule will translate into a separate IOCTL to the kernel.

# Configuring Network rules
selection:
    - type: net
      net: <net name or pattern. e.g. o2ib1, o2ib*, o2ib[1,2]>
      priority: <Unsigned integer where 0 is the highest priority>
 
# Configuring NID rules:
selection:
    - type: nid
      nid: <a NID pattern as described in the Lustre Manual ip2net syntax>
      priority: <Unsigned integer where 0 is the highest priority>
 
# Configuring Point-to-Point rules.
selection:
    - type: peer
      local: <a NID pattern as described in the Lustre Manual ip2net syntax>
      remote: <a NID pattern as described in the Lustre Manual ip2net syntax>
      priority: <Unsigned integer where 0 is the highest priority>
 
# To delete rules, there are two options:
# 1. Whenever a rule is added it is assigned a unique ID. The show command displays the
#    unique ID. The unique ID must be explicitly identified in the delete command.
# 2. The rule is matched in the kernel based on its matching criteria, which act as a
#    unique identifier. This means that no two rules can have exactly the same matching criteria.
# Both options shall be supported.
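
The two delete options can be illustrated with a small rule table that indexes rules both by ID and by matching criteria. This is a sketch only: `RuleTable` and its methods are hypothetical names, IDs starting at 1 is an assumption, and the choice of exceptions is arbitrary. It follows the behaviour described above: IDs are assigned on add, duplicate matching criteria are rejected, and when both a pattern and an ID are given they must refer to the same rule.

```python
import itertools


class RuleTable:
    def __init__(self):
        self._next_id = itertools.count(1)  # assumption: IDs start at 1
        self._by_id = {}     # rule id -> (match, priority)
        self._by_match = {}  # match -> rule id

    def add(self, match, priority):
        if match in self._by_match:
            raise ValueError("a rule with the same matching criteria exists")
        rule_id = next(self._next_id)
        self._by_id[rule_id] = (match, priority)
        self._by_match[match] = rule_id
        return rule_id

    def delete(self, match=None, rule_id=None):
        if match is None and rule_id is None:
            raise ValueError("need a match pattern or a rule id")
        if rule_id is None:
            rule_id = self._by_match.get(match)  # option 2: look up by criteria
        if rule_id not in self._by_id:
            raise KeyError("no such rule")
        stored_match, _ = self._by_id[rule_id]
        if match is not None and match != stored_match:
            raise ValueError("pattern and id refer to different rules")
        del self._by_id[rule_id]
        del self._by_match[stored_match]

    def show(self):
        return [{"id": i, "match": m, "priority": p}
                for i, (m, p) in sorted(self._by_id.items())]


table = RuleTable()
first = table.add("o2ib*", 0)
second = table.add("tcp", 1)
table.delete(rule_id=first)   # option 1: delete by unique ID
table.delete(match="tcp")     # option 2: delete by matching criteria
```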

Flattening rules

Rules will have serialize and deserialize APIs. The serialize API will flatten the rules into a contiguous buffer that will be sent to the kernel. On the kernel side the rules will be deserialized to be stored and queried. When user space queries the rules, they are serialized and sent up to user space, which deserializes them and prints them in YAML format.
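
A round trip through a contiguous buffer might look like the sketch below. The wire layout is entirely hypothetical (a rule count followed by fixed-size headers and variable-length patterns) and is not the layout used by LNet; the function names and the numeric type codes are also assumptions. Only net and nid rules are covered; a peer rule would need two patterns per record.

```python
import struct

# Hypothetical wire layout, little-endian, no padding:
#   header:   u32 rule count
#   per rule: u8 type, u32 priority, u16 pattern length, pattern bytes (UTF-8)
RULE_TYPES = {"net": 0, "nid": 1}
TYPE_NAMES = {code: name for name, code in RULE_TYPES.items()}
RULE_HEADER = "<BIH"


def serialize(rules):
    """Flatten a list of rule dicts into one contiguous bytes buffer."""
    buf = bytearray(struct.pack("<I", len(rules)))
    for rule in rules:
        pattern = rule["pattern"].encode("utf-8")
        buf += struct.pack(RULE_HEADER, RULE_TYPES[rule["type"]],
                           rule["priority"], len(pattern))
        buf += pattern
    return bytes(buf)


def deserialize(buf):
    """Rebuild the list of rule dicts from a buffer produced by serialize()."""
    (count,) = struct.unpack_from("<I", buf, 0)
    offset = struct.calcsize("<I")
    rules = []
    for _ in range(count):
        rtype, priority, plen = struct.unpack_from(RULE_HEADER, buf, offset)
        offset += struct.calcsize(RULE_HEADER)
        pattern = buf[offset:offset + plen].decode("utf-8")
        offset += plen
        rules.append({"type": TYPE_NAMES[rtype], "pattern": pattern,
                      "priority": priority})
    return rules


rules = [
    {"type": "net", "pattern": "o2ib*", "priority": 0},
    {"type": "nid", "pattern": "10.0.0.[1-8]@tcp", "priority": 1},
]
wire = serialize(rules)
restored = deserialize(wire)
```

The same pair of functions serves both directions described above: user space serializes rules for the kernel, and the kernel serializes stored rules when user space queries them.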

...