...
It is sometimes desirable to fine-tune the selection of local and remote NIs used for communication. For example, if there are currently two networks, an o2ib OPA network and a tcp MLX network, both will be used. When the traffic volume is low in particular, the credits criteria will be equivalent between the nodes, and both networks will be used in round robin. However, the user might want to use one network for all traffic and keep the other network free unless the first network goes down.
...
UDSPs are configured from lnetctl, via either the command line or YAML config files, and then passed to the kernel. Policies are applied to all local networks and remote peers and then stored in the kernel. During the selection process the policies are examined as part of the selection algorithm. Whenever new peers, peer NIs, local networks, or local NIs are added, they are matched against the rules. The user interface is recorded here. Policies will be the topmost priority in the selection process since they are user defined. The rest of the selection criteria will be applied to the subset of interfaces that match the policies.
UDSP Rule Types
There are three UDSP rule types:
- Network rules
- NID rules
- Pair rules
Network Rules
These rules define the relative priority of the networks against each other. 0 is the highest priority. Networks with higher priority are preferred by the selection algorithm.
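As a rough illustration of the rule above, the following sketch picks the usable network with the lowest priority value and falls back to the next one when the preferred network is down. The `net_sketch` structure, its `up` flag, and `pick_net()` are hypothetical names used only for this example; they are not LNet structures or APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified view of a local network (not an LNet struct). */
struct net_sketch {
	const char  *name;      /* e.g. "o2ib" or "tcp" */
	unsigned int priority;  /* 0 is the highest priority */
	int          up;        /* non-zero if the network is usable */
};

/*
 * Return the usable network with the lowest priority value,
 * or NULL if no network is up.
 */
static const struct net_sketch *
pick_net(const struct net_sketch *nets, size_t count)
{
	const struct net_sketch *best = NULL;
	size_t i;

	assert(nets != NULL || count == 0);

	for (i = 0; i < count; i++) {
		if (!nets[i].up)
			continue;
		if (best == NULL || nets[i].priority < best->priority)
			best = &nets[i];
	}
	return best;
}
```

With an o2ib network at priority 0 and a tcp network at priority 1, all traffic would go to o2ib, and tcp would only be returned once o2ib is marked down.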
NID Rules
These rules define the relative priority of individual NIDs. 0 is the highest priority. Once a network is selected, the NID with the highest priority is preferred. Note that NID priority ranks below health. For example, given two NIDs, NID-A and NID-B, where NID-A has the higher priority but the lower health value, NID-B will still be selected. In that sense the policies are a hint that guides the selection algorithm.
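The ordering described above (health first, then NID priority) can be sketched as a comparison function. The `ni_sketch` structure and `ni_is_better()` are hypothetical, and using credits as the final tie-breaker is an assumption made for this example rather than something this section specifies.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified view of an NI (not an LNet struct). */
struct ni_sketch {
	int          health;    /* higher is healthier */
	unsigned int priority;  /* 0 is the highest priority */
	int          credits;   /* available credits; assumed tie-breaker */
};

/* Return non-zero if 'a' should be preferred over 'b'. */
static int ni_is_better(const struct ni_sketch *a, const struct ni_sketch *b)
{
	assert(a != NULL && b != NULL);

	/* Health outranks the user-defined priority. */
	if (a->health != b->health)
		return a->health > b->health;
	/* Among equally healthy NIs, the lower priority value wins. */
	if (a->priority != b->priority)
		return a->priority < b->priority;
	/* Assumption: fall back to credits when everything else is equal. */
	return a->credits > b->credits;
}
```

In the NID-A/NID-B example, NID-A (priority 0, lower health) loses to NID-B (priority 1, higher health); once both have the same health, NID-A is preferred.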
Pair Rules
Design Principle
Rules shall be defined through user space and passed to the LNet module. LNet shall store these rules. The net priority and NI priority are defined as separate rules and stored in a separate data structure. Once they are configured they can be applied to the networks. The advantage is that rules are not strictly tied to the internal constructs: they can be applied whenever the internal constructs are created, and if the internal constructs are deleted the rules remain and can be applied automatically at a future time.
This makes configuration easy since a set of rules can be defined, like "all IB networks priority 1", "all Gemini networks priority 2", etc, and when a network is added, it automatically inherits these rules.
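A minimal sketch of this idea follows: the rules live in their own table, and a network picks up its priority at the moment it is added. All names here (`net_rule_sketch`, `net_entry_sketch`, `net_added()`, `PRIO_UNSET`) are hypothetical, and matching on a network type prefix followed by digits is a simplification chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define PRIO_UNSET 0xffffffffu	/* hypothetical "no rule matched" marker */

/* A stored rule: "all networks of this type get this priority". */
struct net_rule_sketch {
	const char  *net_type;	/* e.g. "o2ib" or "gni" */
	unsigned int priority;
};

/* A configured network, created independently of the rules. */
struct net_entry_sketch {
	char         name[16];	/* e.g. "o2ib0" */
	unsigned int priority;
};

/* Match when the name is the rule's type followed only by digits. */
static int rule_matches(const struct net_rule_sketch *r, const char *name)
{
	size_t len = strlen(r->net_type);

	if (strncmp(name, r->net_type, len) != 0)
		return 0;
	return strspn(name + len, "0123456789") == strlen(name + len);
}

/* Called when a network is added: it inherits the first matching rule. */
static void net_added(struct net_entry_sketch *net, const char *name,
		      const struct net_rule_sketch *rules, size_t nrules)
{
	size_t i;

	assert(net != NULL && name != NULL);

	snprintf(net->name, sizeof(net->name), "%s", name);
	net->priority = PRIO_UNSET;
	for (i = 0; i < nrules; i++) {
		if (rule_matches(&rules[i], net->name)) {
			net->priority = rules[i].priority;
			break;
		}
	}
}
```

Because the rule table is not owned by any network, deleting and re-adding a network simply runs `net_added()` again and the same priority is applied.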
Selection policy rules are comprised of two parts:
- The matching rule
- The rule action
The matching rule is what's used to match a NID or a network. The action is what's applied when the rule is matched.
A rule can be uniquely identified by its matching rule or by an internal ID, which is assigned by the LNet module when the rule is added and returned to user space in the output of a show command.
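The two-part rule and its two identifiers can be sketched as below. This is an illustration only: `rule_sketch`, `rule_table_sketch`, `rule_add()`, and `rule_find()` are hypothetical names, the fixed-size table is a simplification, and the only action modeled is a priority.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_RULES 8	/* arbitrary limit for the sketch */

struct rule_sketch {
	unsigned int id;	/* assigned on add, always > 0 */
	char         match[32];	/* the matching rule, e.g. "o2ib*" */
	unsigned int priority;	/* the rule action */
};

struct rule_table_sketch {
	struct rule_sketch rules[MAX_RULES];
	size_t             count;
	unsigned int       next_id;
};

/* Store a rule and return its newly assigned ID, or 0 on failure. */
static unsigned int rule_add(struct rule_table_sketch *t, const char *match,
			     unsigned int priority)
{
	struct rule_sketch *r;

	assert(t != NULL && match != NULL);

	if (t->count == MAX_RULES ||
	    strlen(match) >= sizeof(t->rules[0].match))
		return 0;

	r = &t->rules[t->count++];
	r->id = ++t->next_id;
	strcpy(r->match, match);
	r->priority = priority;
	return r->id;
}

/* Look a rule up by ID (if id != 0) or by matching rule (if match != NULL). */
static const struct rule_sketch *
rule_find(const struct rule_table_sketch *t, unsigned int id, const char *match)
{
	size_t i;

	for (i = 0; i < t->count; i++) {
		if (id != 0 && t->rules[i].id == id)
			return &t->rules[i];
		if (match != NULL && strcmp(t->rules[i].match, match) == 0)
			return &t->rules[i];
	}
	return NULL;
}
```

A show command would report the `id` alongside each rule, so that a later delete could name the rule by either its ID or its matching rule.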
cfg-100, cfg-105, cfg-110, cfg-115, cfg-120, cfg-125, cfg-130, cfg-135, cfg-140, cfg-160, cfg-165
lnetctl Interface
YAML Syntax
Each selection rule will translate into a separate IOCTL to the kernel.
Flattening rules
Rules will have serialize and deserialize APIs. The serialize API will flatten the rules into a contiguous buffer that will be sent to the kernel. On the kernel side the rules will be deserialized to be stored and queried. When user space queries the rules, they are serialized and sent up to user space, which deserializes them and prints them in YAML format.
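The flattening round trip can be sketched as follows. This is not the proposed wire format: the `flat_rule_sketch` layout, the leading rule count, and the fixed-size pattern field are assumptions made to keep the example short, and both helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-size rule record used only for this sketch. */
struct flat_rule_sketch {
	uint32_t id;
	uint32_t priority;
	char     pattern[24];
};

/*
 * Flatten 'count' rules into 'buf' as: [count][rule 0][rule 1]...
 * Returns the number of bytes written, or 0 if 'buf' is too small.
 */
static size_t rules_serialize(const struct flat_rule_sketch *rules,
			      uint32_t count, void *buf, size_t buf_size)
{
	size_t need = sizeof(count) + (size_t)count * sizeof(*rules);

	assert(buf != NULL);

	if (need > buf_size)
		return 0;
	memcpy(buf, &count, sizeof(count));
	if (count > 0)
		memcpy((char *)buf + sizeof(count), rules,
		       (size_t)count * sizeof(*rules));
	return need;
}

/*
 * Rebuild rules from a flattened buffer into 'out'.
 * Returns the number of rules read, or -1 if the buffer is malformed
 * or holds more than 'max_out' rules.
 */
static int rules_deserialize(const void *buf, size_t buf_size,
			     struct flat_rule_sketch *out, uint32_t max_out)
{
	uint32_t count;

	if (buf_size < sizeof(count))
		return -1;
	memcpy(&count, buf, sizeof(count));
	if (count > max_out ||
	    buf_size - sizeof(count) < (size_t)count * sizeof(*out))
		return -1;
	if (count > 0)
		memcpy(out, (const char *)buf + sizeof(count),
		       (size_t)count * sizeof(*out));
	return (int)count;
}
```

The same pair of helpers would serve both directions: user space serializes and the kernel deserializes on add, and the roles swap when the rules are queried.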
DLC API
```c
/* This is a common structure which describes an expression */
struct lnet_match_expr {
	__u32 lme_start;
	__u32 lme_end;
	__u32 lme_incr;
	char lme_r_expr[0];
};

struct lnet_selection_descriptor {
	enum selection_type lsd_type;
	char *lsd_pattern1;
	char *lsd_pattern2;
	union {
		__u32 lsda_priority;
	} lsd_action_u;
};

/*
 * lustre_lnet_add_selection
 *   Add a selection policy rule.
 *
 *   selection - describes the selection policy rule
 *   seq_no - sequence number of the command
 *   err_rc - YAML structure of the resultant return code
 */
int lustre_lnet_add_selection(struct lnet_selection_descriptor *selection,
			      int seq_no, struct cYAML **err_rc);
```
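To illustrate how the `lme_start`, `lme_end`, and `lme_incr` fields of `lnet_match_expr` might be evaluated, the sketch below tests a number against a stepped range such as `[1-10/2]`. The interpretation of the fields as an inclusive range with a stride is an assumption, and `match_expr_sketch` and `match_expr_hit()` are hypothetical names, not part of the DLC API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical copy of the numeric fields of lnet_match_expr. */
struct match_expr_sketch {
	uint32_t lme_start;	/* first value in the range */
	uint32_t lme_end;	/* last value in the range (inclusive) */
	uint32_t lme_incr;	/* stride between matching values */
};

/*
 * Return non-zero if 'value' lies in [lme_start, lme_end] and is a
 * whole number of strides away from lme_start.
 */
static int match_expr_hit(const struct match_expr_sketch *e, uint32_t value)
{
	assert(e != NULL);

	/* A zero stride is treated as an invalid expression in this sketch. */
	if (e->lme_incr == 0 || value < e->lme_start || value > e->lme_end)
		return 0;
	return (value - e->lme_start) % e->lme_incr == 0;
}
```

Under this reading, an expression with start 1, end 10, and increment 2 matches 1, 3, 5, 7, and 9.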
Selection Policies
There are four different types of rules that this HLD will address:
...
[Gliffy Diagram]
Preferred local/remote NID pairs
...
[Gliffy Diagram]
Refer to Olaf's LUG 2016/LAD 2016 PPT for more context.
...
[Gliffy Diagram]
The rest of the rules will look very similar to the above, except that the list of rules in the memory pointed to by rule_bulk will contain the pertinent structure format.
...
```c
/*
 * lnet_sel_rule_serialize()
 *   Serialize the rules pointed to by rules into the memory block that is
 *   provided. In order for this API to work in both kernel and user space,
 *   the bulk pointer needs to be passed in. When this API is called in the
 *   kernel, it is expected that the bulk memory is allocated in user space.
 *   This API is intended to be called from the kernel to serialize the
 *   rules before sending them to user space.
 *
 *   rules [IN] - rules to be serialized
 *   rule_type [IN] - rule type to be serialized
 *   bulk_size [IN] - size of memory allocated
 *   bulk [OUT] - allocated block of memory where the serialized rules
 *                are stored
 */
int lnet_sel_rule_serialize(struct list_head *rules,
			    enum lnet_sel_rule_type rule_type,
			    __u32 *bulk_size, void __user *bulk);

/*
 * lnet_sel_rule_deserialize()
 *   Given a bulk of rule_type rules, deserialize and append the rules to
 *   the linked list passed in. Each rule is assigned an ID > 0 if an ID is
 *   not already assigned.
 *
 *   bulk [IN] - memory block containing serialized rules
 *   bulk_size [IN] - size of bulk memory block
 *   rule_type [IN] - type of rule to deserialize
 *   rules [OUT] - linked list to append the deserialized rules to
 */
int lnet_sel_rule_deserialize(void __user *bulk, __u32 bulk_size,
			      enum lnet_sel_rule_type rule_type,
			      struct list_head *rules);
```
...
Policy IOCTL Handling
Three new IOCTLs will need to be added: IOC_LIBCFS_ADD_RULES, IOC_LIBCFS_DEL_RULES, and IOC_LIBCFS_GET_RULES.
...
[Gliffy Diagram]
The diagram above was inspired by: https://www.ece.tufts.edu/~karen/classes/final_presentation/Dragonfly_Topology_Long.pptx
...