...
User Interface
Command Line Syntax
As illustrated in the example above, all policies can be specified using the following syntax:
Below is the command-line syntax for managing UDSPs:
| Code Block |
|---|
# Adding a network rule
lnetctl policy add --src
lnetctl policy |
| Code Block |
lnetctl policy <add | del | show>
--src: ip2nets syntax specifying the local NID to match
--dst: ip2nets syntax specifying the remote NID to match
--rte: ip2nets syntax specifying the router NID to match
--priority: Priority to apply to rule matches
--idx: Index of where to insert the rule. By default it appends to
the end of the rule list |
At the time of writing, only the "priority" action is planned for implementation. However, other actions could be added in the future to be taken when a rule matches. For example, a "redirect" action could redirect traffic to another destination. Another example is a "lawful intercept" or "mirror" action, which mirrors messages to a different destination. This might be useful for keeping a standby server updated with all information going to the primary server. A lawful intercept action allows personnel authorized by a Law Enforcement Agency (LEA) to intercept file operations from targeted clients and send them to an LI Mediation Device.
...
| Code Block |
|---|
udsp:
- idx: <unsigned int>
src: <ip>@<net type>
dst: <ip>@<net type>
rte: <ip>@<net type>
action:
- priority: <unsigned int> |
Overview of Operations
There are three main operations which can be carried out on UDSPs either from the command line or YAML configuration: add, delete, show.
Add
The UI allows adding a new rule. With the optional idx parameter, the admin can specify where in the rule chain the new rule should be added. By default the rule is appended to the list; if idx is given, the rule is inserted at that position.
When a new UDSP is added, the entire UDSP set is re-evaluated: all Nets, NIs and peer NIs in the system are traversed and the rules re-applied. This is an expensive operation, but since UDSP management should be rare, it is not expected to be a problem.
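The insertion behaviour described above can be sketched as follows. This is a minimal illustration, not LNet code: `struct rule`, `rule_insert`, `rule_at` and the `RULE_IDX_APPEND` sentinel are hypothetical names, and the sketch assumes that an idx beyond the end of the chain also appends.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified rule; not the real struct lnet_udsp. */
struct rule {
	unsigned int priority;
	struct rule *next;
};

/* Assumed sentinel meaning "no idx was given". */
#define RULE_IDX_APPEND (-1)

/*
 * Insert 'r' at position 'idx' in the chain starting at *head.
 * RULE_IDX_APPEND, or an idx past the end, appends the rule.
 */
static void rule_insert(struct rule **head, struct rule *r, int idx)
{
	struct rule **pp = head;
	int pos = 0;

	assert(head != NULL && r != NULL);
	while (*pp != NULL && (idx == RULE_IDX_APPEND || pos < idx)) {
		pp = &(*pp)->next;
		pos++;
	}
	r->next = *pp;
	*pp = r;
}

/* Return the rule at position 'idx', or NULL if idx is out of range. */
static struct rule *rule_at(struct rule *head, int idx)
{
	while (head != NULL && idx-- > 0)
		head = head->next;
	return head;
}
```

Appending two rules and then inserting a third with idx 0 leaves the third rule at the head of the chain, followed by the first two in their original order.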
Delete
The UI allows deleting an existing UDSP using its index. The index can be displayed using the show command. When a UDSP is deleted, the entire UDSP set is re-evaluated: the Nets, NIs and peer NIs are traversed and the rules re-applied.
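A rough sketch of delete followed by re-evaluation is shown below. All names (`struct policy`, `policy_del`, `policy_reapply`, `DEFAULT_PRIO`) are hypothetical, rules match on an exact net name only, and the sketch assumes that a later rule overrides an earlier one when both match; the design text does not specify that.

```c
#include <assert.h>
#include <string.h>

#define MAX_RULES 8
#define DEFAULT_PRIO 0xffffffffu	/* assumed "no priority set" value */

struct net { const char *name; unsigned int priority; };
struct net_rule { const char *net_name; unsigned int priority; };
struct policy { struct net_rule rules[MAX_RULES]; int count; };

/* Reset every net, then re-apply all rules in list order. */
static void policy_reapply(const struct policy *p, struct net *nets, int nnets)
{
	assert(p->count >= 0 && p->count <= MAX_RULES);
	for (int i = 0; i < nnets; i++) {
		nets[i].priority = DEFAULT_PRIO;
		for (int j = 0; j < p->count; j++)
			if (strcmp(nets[i].name, p->rules[j].net_name) == 0)
				nets[i].priority = p->rules[j].priority;
	}
}

/* Delete the rule at 'idx' and re-evaluate; returns -1 if idx is invalid. */
static int policy_del(struct policy *p, int idx, struct net *nets, int nnets)
{
	if (idx < 0 || idx >= p->count)
		return -1;
	/* close the gap left by the deleted rule */
	memmove(&p->rules[idx], &p->rules[idx + 1],
		(size_t)(p->count - idx - 1) * sizeof(p->rules[0]));
	p->count--;
	policy_reapply(p, nets, nnets);
	return 0;
}
```

Resetting before re-applying is what makes the deleted rule's effect disappear: a net that only the deleted rule matched falls back to the default value.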
Show
The UI allows showing existing UDSPs. The format of the YAML output is as follows:
| Code Block |
|---|
udsp:
- idx: <unsigned int>
src: <ip>@<net type>
dst: <ip>@<net type>
rte: <ip>@<net type>
action:
- priority: <unsigned int> |
Design
All policies are stored in kernel space, and all logic to add, delete and match policies will be implemented there. This complicates kernel-space processing, and arguably policy maintenance logic is not core to LNet functionality; what is core is the ability to select source and destination networks and NIDs in accordance with user definitions. However, the kernel can manage policies much more easily, and with fewer potential race conditions, than user space.
Design Principles
A UDSP consists of two parts:
- The matching rule
- The rule action
The matching rule is what's used to match a NID or a network. The action is what's applied when the rule is matched.
A rule can be uniquely identified using an internal ID which is assigned by the LNet module when a rule is added and returned to the user space when the UDSPs are shown.
UDSP Storage
UDSPs shall be defined by administrators either via the LNet command line utility, lnetctl, or via a YAML configuration file. lnetctl parses the UDSP and stores it in an intermediary format, which will be flattened and passed down to the kernel LNet module. LNet shall store these UDSPs on a policy list. Once policies are added to LNet they will be applied to existing networks, NIDs and routers. The advantage of this approach is that UDSPs are not strictly tied to the internal constructs (i.e. networks, NIDs or routers). They can be applied whenever the internal constructs are created, and if the internal constructs are deleted the UDSPs remain and can be automatically applied at a future time.
This makes configuration easy since a set of UDSPs can be defined, like "all IB networks priority 1", "all Gemini networks priority 2", etc, and when a network is added, it automatically inherits these rules.
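As an illustration of a network inheriting stored rules when it is created, consider the sketch below. It is not LNet code: `stored_rules`, `net_init` and `NO_PRIO` are hypothetical, rules are matched by a simple net-type prefix, the first matching rule wins, and "gni" is used as the example net type for Gemini.

```c
#include <assert.h>
#include <string.h>

#define NO_PRIO 0xffffffffu	/* assumed "no priority set" value */

struct type_rule { const char *net_type; unsigned int priority; };

/* Stored rules exist independently of the nets they apply to. */
static const struct type_rule stored_rules[] = {
	{ "o2ib", 1 },	/* "all IB networks priority 1" */
	{ "gni", 2 },	/* "all Gemini networks priority 2" */
};

struct net { char name[16]; unsigned int priority; };

/*
 * Called when a net is created: the net inherits the priority of the
 * first stored rule whose net type is a prefix of the net name.
 */
static void net_init(struct net *net, const char *name)
{
	size_t i, n = sizeof(stored_rules) / sizeof(stored_rules[0]);

	assert(net != NULL && name != NULL);
	strncpy(net->name, name, sizeof(net->name) - 1);
	net->name[sizeof(net->name) - 1] = '\0';
	net->priority = NO_PRIO;
	for (i = 0; i < n; i++) {
		size_t len = strlen(stored_rules[i].net_type);

		if (strncmp(net->name, stored_rules[i].net_type, len) == 0) {
			net->priority = stored_rules[i].priority;
			break;
		}
	}
}
```

Because the rules are consulted at creation time, a net such as o2ib0 picks up priority 1 without any further configuration, while a net with no matching rule keeps the default.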
Peers are normally not created explicitly by administrators. A peer construct is created in LNet when the ULP requests to send a message to a peer, or when the node receives an unsolicited message from a peer. It is feasible, especially for router policies, to have a UDSP which associates a set of clients within a specific range with a set of optimal routers. Having the policies stored and matched in the kernel helps fulfil this requirement.
UDSP Application
Performance needs to be taken into account with this feature. It is not feasible to traverse the policy lists on every send operation, as this would add unnecessary overhead. Instead, when rules are applied they are "flattened" onto the constructs they affect. For example, suppose a Network Rule is added as follows: o2ib priority 0. This rule gives priority to the o2ib network for sending. A priority field will be added to the network structure and set to 0 for the o2ib network. As the selection algorithm (part of the current code) traverses the networks, it compares the priority field. This is more efficient than examining the policies on every send to see whether any match.
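The benefit of flattening can be seen in a small sketch: once the priority lives in the net itself, selection is a plain field comparison with no rule matching on the send path. The `struct net` and `best_net` below are simplified stand-ins rather than LNet code, and ties are assumed to go to the first net encountered. A lower value means a higher priority, as in the o2ib priority 0 example.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for struct lnet_net with its flattened priority. */
struct net {
	const char *name;
	unsigned int net_priority;
};

/*
 * Return the net with the lowest priority value (0 is the most
 * preferred), or NULL if there are no nets. Ties keep the first match.
 */
static const struct net *best_net(const struct net *nets, size_t n)
{
	const struct net *best = NULL;

	assert(nets != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (best == NULL || nets[i].net_priority < best->net_priority)
			best = &nets[i];
	return best;
}
```

With nets tcp (priority 1) and o2ib (priority 0), the function returns o2ib without consulting the policy list at all.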
...
| Code Block |
|---|
/* the lnet structure will keep a list of UDSPs */
struct lnet {
	...
	struct list_head ln_udsp_list;
	...
};
/* each NID range is defined as a net_id and an IP range */
struct lnet_ud_nid_descr {
	__u32 ud_net_id;
	struct list_head ud_ip_range;
};
/* UDSP action types */
enum lnet_udsp_action_type {
	EN_LNET_UDSP_ACTION_PRIORITY = 0,
	EN_LNET_UDSP_ACTION_NONE = 1,
};
/*
 * a UDSP rule can have up to three user defined NID descriptors
 * - src: defines the local NID range for the rule
 * - dst: defines the peer NID range for the rule
 * - rte: defines the router NID range for the rule
 *
 * An action union defines the action to take when the rule
 * is matched
 */
struct lnet_udsp {
	struct list_head udsp_on_list;
	__u32 idx;
	struct lnet_ud_nid_descr *udsp_src;
	struct lnet_ud_nid_descr *udsp_dst;
	struct lnet_ud_nid_descr *udsp_rte;
	enum lnet_udsp_action_type udsp_action_type;
	union {
		__u32 udsp_priority;
	} udsp_action;
};
/* The rules are flattened in the LNet structures as shown below */
struct lnet_net {
	...
	/* defines the relative priority of this net compared to others in the system */
	__u32 net_priority;
	...
};
struct lnet_ni {
	...
	/* defines the relative priority of this NI compared to other NIs in the net */
	__u32 ni_priority;
	...
};
struct lnet_peer_ni {
	...
	/* defines the relative peer_ni priority compared to other peer_nis in the peer */
	__u32 lpni_priority;
	/* defines the list of local NID(s) (>=1) which should be used as the source */
	union lpni_pref {
		lnet_nid_t nid;
		lnet_nid_t *nids;
	} lpni_pref;
	/* defines the list of router NID(s) to be used when sending to this peer NI */
	lnet_nid_t *lpni_rte_nids;
	...
};
/* UDSPs will be passed to the kernel via IOCTL */
#define IOC_LIBCFS_ADD_UDSP _IOWR(IOC_LIBCFS_TYPE, 106, IOCTL_CONFIG_SIZE)
/*
 * UDSPs will be retrieved from the kernel via IOCTL. The command number
 * must differ from IOC_LIBCFS_ADD_UDSP; 107 is a placeholder.
 */
#define IOC_LIBCFS_GET_UDSP _IOWR(IOC_LIBCFS_TYPE, 107, IOCTL_CONFIG_SIZE) |
Kernel IOCTL Handling
| Code Block |
|---|
/* api-ni.c will be modified to handle adding and getting a UDSP */
int
LNetCtl(unsigned int cmd, void *arg)
{
	...
	case IOC_LIBCFS_ADD_UDSP: {
		struct lnet_ioctl_config_udsp *config_udsp = arg;
		mutex_lock(&the_lnet.ln_api_mutex);
		/*
		 * add and do initial flattening of the UDSP into
		 * internal structures
		 */
		rc = lnet_add_and_flatten_udsp(config_udsp);
		mutex_unlock(&the_lnet.ln_api_mutex);
		return rc;
	}
	case IOC_LIBCFS_GET_UDSP: {
		struct lnet_ioctl_config_udsp *get_udsp = arg;
		mutex_lock(&the_lnet.ln_api_mutex);
		/*
		 * get the UDSP at the index provided. Return -ENOENT if
		 * there are no more UDSPs to get
		 */
		rc = lnet_get_udsp(get_udsp, get_udsp->idx);
		mutex_unlock(&the_lnet.ln_api_mutex);
		return rc;
	}
	...
} |
Kernel Selection Algorithm Modifications
| Code Block |
|---|
/*
* select an NI from the Nets with highest priority
*/
struct lnet_ni *
lnet_find_best_ni_on_local_net(struct lnet_peer *peer, int md_cpt)
{
...
list_for_each_entry(peer_net, &peer->lp_peer_nets, lpn_peer_nets) {
...
struct lnet_net *net;
net = lnet_get_net_locked(peer_net->lpn_net_id);
if (!net)
continue;
/*
* look only at the NIs with the highest priority and disregard
* nets which have lower priority. Nets with equal priority are
* examined and the best_ni is selected from amongst them.
*/
net_prio = net->net_priority;
if (net_prio > best_net_prio)
continue;
else if (net_prio < best_net_prio) {
best_net_prio = net_prio;
best_ni = NULL;
}
best_ni = lnet_find_best_ni_on_spec_net(best_ni, peer,
best_peer_net, md_cpt, false);
...
}
...
}
/*
* select the NI with the highest priority
*/
static struct lnet_ni *
lnet_get_best_ni(struct lnet_net *local_net, struct lnet_ni *best_ni,
struct lnet_peer *peer, struct lnet_peer_net *peer_net,
int md_cpt)
{
...
ni_prio = ni->ni_priority;
if (ni_fatal) {
continue;
} else if (ni_healthv < best_healthv) {
continue;
} else if (ni_healthv > best_healthv) {
best_healthv = ni_healthv;
if (distance < shortest_distance)
shortest_distance = distance;
/*
* if this NI is lower in priority (a larger value) than the best one
* found so far, discard it; otherwise use it and record its priority
* as the best so far.
*/
} else if (ni_prio > best_ni_prio) {
continue;
} else if (ni_prio < best_ni_prio) {
best_ni_prio = ni_prio;
}
...
}
/*
* When a UDSP rule associates local NIs with remote NIs, the list of local NIs NIDs
* is flattened to a list in the associated peer_NI. When selecting a peer NI, the
* peer NI with the corresponding preferred local NI is selected.
*/
bool
lnet_peer_is_pref_nid_locked(struct lnet_peer_ni *lpni, lnet_nid_t nid)
{
...
}
/*
* select the peer NI with the highest priority first and then the
* preferred one
*/
static struct lnet_peer_ni *
lnet_select_peer_ni(struct lnet_send_data *sd, struct lnet_peer *peer,
struct lnet_peer_net *peer_net)
{
...
ni_is_pref = lnet_peer_is_pref_nid_locked(lpni, best_ni->ni_nid);
lpni_prio = lpni->lpni_priority;
if (lpni_healthv < best_lpni_healthv)
continue;
/*
* select the peer NI with the highest priority (the lowest value).
*/
else if (lpni_prio > best_lpni_prio)
continue;
else if (lpni_prio < best_lpni_prio)
best_lpni_prio = lpni_prio;
/*
* select the NI which has the best_ni's NID in its preferred list
*/
else if (!preferred && ni_is_pref)
preferred = true;
...
} |
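The comparison order used for peer NI selection in the excerpt above (health first, then priority, then the preferred flag) can be condensed into a single comparator. The sketch below is a simplification under stated assumptions: `struct cand`, `cand_better` and `select_cand` are hypothetical names, and the criteria elided from the excerpt are not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical candidate holding only the three criteria shown above. */
struct cand {
	int healthv;		/* higher is healthier */
	unsigned int priority;	/* lower value is preferred */
	bool preferred;		/* best_ni's NID is in the preferred list */
};

/* Return true if 'c' should replace the current 'best'. */
static bool cand_better(const struct cand *c, const struct cand *best)
{
	if (c->healthv != best->healthv)
		return c->healthv > best->healthv;
	if (c->priority != best->priority)
		return c->priority < best->priority;
	return c->preferred && !best->preferred;
}

/* Pick the best candidate from an array, or NULL if it is empty. */
static const struct cand *select_cand(const struct cand *v, size_t n)
{
	const struct cand *best = NULL;

	assert(v != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (best == NULL || cand_better(&v[i], best))
			best = &v[i];
	return best;
}
```

Each criterion only breaks ties left by the one before it, so a preferred but less healthy peer NI never beats a healthier one.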
UDSP Marshaling
After a UDSP is parsed in user space it needs to be marshaled and sent to the kernel. The kernel will de-marshal the data and store it in its own data structures. The UDSP is formed of the following pieces of information:
- Index: The index of the UDSP to insert or delete
- Source Address expression: A dot expression describing the source address range
- Net of the Source: A net id of the source
- Destination Address expression: A dot expression describing the destination address range
- Net of the Destination: A net id of the destination
- Router Address expression: A dot expression describing the router address range
- Net of the Router: A net id of the router
- Action Type: An enumeration describing the action type.
- Action: A structure describing the action if the UDSP is matched.
The data flow of a UDSP looks as follows:
[Gliffy diagram: DataFlow]
YAML Syntax
Defined here.
Userspace Structures
| Code Block |
|---|
/* each NID range is defined as a net_id and an IP range */
struct lnet_ud_nid_descr {
	__u32 ud_net_id;
	struct list_head ud_ip_range;
};
/* UDSP action types */
enum lnet_udsp_action_type {
	EN_LNET_UDSP_ACTION_PRIORITY = 0,
	EN_LNET_UDSP_ACTION_NONE = 1,
};
/*
 * a UDSP rule can have up to three user defined NID descriptors
 * - src: defines the local NID range for the rule
 * - dst: defines the peer NID range for the rule
 * - rte: defines the router NID range for the rule
 *
 * An action union defines the action to take when the rule
 * is matched
 */
struct lnet_udsp {
	struct list_head udsp_on_list;
	__u32 idx;
	struct lnet_ud_nid_descr *udsp_src;
	struct lnet_ud_nid_descr *udsp_dst;
	struct lnet_ud_nid_descr *udsp_rte;
	enum lnet_udsp_action_type udsp_action_type;
	union {
		__u32 udsp_priority;
	} udsp_action;
}; |
Marshaled Structures
| Code Block |
|---|
struct cfs_range_expr {
	struct list_head re_link;
	__u32 re_lo;
	__u32 re_hi;
	__u32 re_stride;
};
struct lnet_ioctl_udsp {
	__u32 iou_idx;
	enum lnet_udsp_action_type iou_action_type;
	union {
		__u32 priority;
	} iou_action;
	__u32 iou_src_dot_expr_count;
	__u32 iou_dst_dot_expr_count;
	__u32 iou_rte_dot_expr_count;
	char iou_bulk[0];
}; |
The address is expressed as a list of cfs_range_expr structures, which need to be marshaled. An IP address uses four of these structures. Other address types can use a different number; Gemini, for example, uses only one. The corresponding iou_[src|dst|rte]_dot_expr_count is set to the number of expressions describing the address. Each expression is then flattened into the structure, in the order SRC, DST, RTE.
The kernel will receive the marshaled data and build its internal structures from it. The functions to marshal and de-marshal should be straightforward. Note that user space and kernel space use the same structures, which will be defined in a common location. For this reason the marshal and de-marshal functions will be shared.
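A minimal sketch of this layout is given below. It is not the proposed implementation: `struct range_wire`, `struct udsp_wire`, `udsp_marshal` and `udsp_section` are hypothetical names, the net IDs and action fields are omitted, and the sketch assumes each range expression travels as three 32-bit values without its list link.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Wire form of one range expression: cfs_range_expr minus re_link. */
struct range_wire { uint32_t lo, hi, stride; };

/* Simplified stand-in for struct lnet_ioctl_udsp. */
struct udsp_wire {
	uint32_t idx;
	uint32_t src_count, dst_count, rte_count;
	struct range_wire bulk[];	/* SRC, then DST, then RTE */
};

/* Flatten the three expression lists into one buffer; caller frees it. */
static struct udsp_wire *udsp_marshal(uint32_t idx,
	const struct range_wire *src, uint32_t nsrc,
	const struct range_wire *dst, uint32_t ndst,
	const struct range_wire *rte, uint32_t nrte)
{
	size_t total = (size_t)nsrc + ndst + nrte;
	struct udsp_wire *w;

	assert((src || !nsrc) && (dst || !ndst) && (rte || !nrte));
	w = malloc(sizeof(*w) + total * sizeof(w->bulk[0]));
	if (w == NULL)
		return NULL;
	w->idx = idx;
	w->src_count = nsrc;
	w->dst_count = ndst;
	w->rte_count = nrte;
	if (nsrc)
		memcpy(w->bulk, src, nsrc * sizeof(*src));
	if (ndst)
		memcpy(w->bulk + nsrc, dst, ndst * sizeof(*dst));
	if (nrte)
		memcpy(w->bulk + nsrc + ndst, rte, nrte * sizeof(*rte));
	return w;
}

/* De-marshal helper: locate one section (0 = SRC, 1 = DST, 2 = RTE). */
static const struct range_wire *udsp_section(const struct udsp_wire *w,
					     int which, uint32_t *count)
{
	switch (which) {
	case 0: *count = w->src_count; return w->bulk;
	case 1: *count = w->dst_count; return w->bulk + w->src_count;
	case 2: *count = w->rte_count;
		return w->bulk + w->src_count + w->dst_count;
	default: *count = 0; return NULL;
	}
}
```

Because the counts fix the offset of each section, the receiver needs no delimiters in the bulk area: an IP source contributes four entries, a Gemini address one, and an absent router expression zero.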
Kernel Structure
Defined here.
Structures
| Code Block |
|---|
/* This is a common structure which describes an expression */
struct lnet_match_expr {
};
struct lnet_selection_descriptor {
	enum selection_type lsd_type;
	char *lsd_pattern1;
	char *lsd_pattern2;
	union {
		__u32 lsda_priority;
	} lsd_action_u;
};
/*
 * lustre_lnet_add_selection
 * Add a selection policy rule.
 *
 * selection - describes the selection policy rule
 * seq_no - sequence number of the command
 * err_rc - YAML structure of the resultant return code
 */
int lustre_lnet_add_selection(struct lnet_selection_descriptor *selection, int seq_no, struct cYAML **err_rc); |
cfg-100, cfg-105, cfg-110, cfg-115, cfg-120, cfg-125, cfg-130, cfg-135, cfg-140, cfg-160, cfg-165
lnetctl Interface
...
# Adding a network priority rule. If the NI under the network doesn't have
# an explicit priority set, it'll inherit the network priority:
lnetctl > selection net [add | del | show] -h
Usage: selection net add --net <network name> --priority <priority>
WHERE:
selection net add: add a selection rule based on the network priority
--net: network string (e.g. o2ib or o2ib* or o2ib[1,2])
--priority: Rule priority
Usage: selection net del --net <network name> [--id <rule id>]
WHERE:
selection net del: delete a selection rule given the network pattern or the id. If both
are provided they need to match or an error is returned.
--net: network string (e.g. o2ib or o2ib* or o2ib[1,2])
--id: ID assigned to the rule returned by the show command.
Usage: selection net show [--net <network name>]
WHERE:
selection net show: show selection rules and filter on network name if provided.
--net: network string (e.g. o2ib or o2ib* or o2ib[1,2])
# Add a NID priority rule. All NIDs added that match this pattern shall be assigned
# the identified priority. When the selection algorithm runs it shall prefer NIDs with
# higher priority.
lnetctl > selection nid [add | del | show] -h
Usage: selection nid add --nid <NID> --priority <priority>
WHERE:
selection nid add: add a selection rule based on the nid pattern
--nid: nid pattern which follows the same syntax as ip2net
--priority: Rule priority
Usage: selection nid del --nid <NID> [--id <rule id>]
WHERE:
selection nid del: delete a selection rule given the nid pattern or the id. If both
are provided they need to match or an error is returned.
--nid: nid pattern which follows the same syntax as ip2net
--id: ID assigned to the rule returned by the show command.
Usage: selection nid show [--nid <NID>]
WHERE:
selection nid show: show selection rules and filter on NID pattern if provided.
--nid: nid pattern which follows the same syntax as ip2net
# Adding point to point rule. This creates an association between a local NI and a remote
# NID, and assigns a priority to this relationship so that it's preferred when selecting a pathway.
lnetctl > selection peer [add | del | show] -h
Usage: selection peer add --local <NID> --remote <NID> --priority <priority>
WHERE:
selection peer add: add a selection rule based on local to remote pathway
--local: nid pattern which follows the same syntax as ip2net
--remote: nid pattern which follows the same syntax as ip2net
--priority: Rule priority
Usage: selection peer del --local <NID> --remote <NID> --id <ID>
WHERE:
selection peer del: delete a selection rule based on local to remote NID pattern or id
--local: nid pattern which follows the same syntax as ip2net
--remote: nid pattern which follows the same syntax as ip2net
--id: ID of the rule as provided by the show command.
Usage: selection peer show [--local <NID>] [--remote <NID>]
WHERE:
selection peer show: show selection rules and filter on NID patterns if provided.
--local: nid pattern which follows the same syntax as ip2net
--remote: nid pattern which follows the same syntax as ip2net
# the output will be of the same YAML format as the input described below. |
YAML Syntax
| Code Block |
|---|
udsp:
- idx: <unsigned int>
src: <ip>@<net type>
dst: <ip>@<net type>
rte: <ip>@<net type>
action:
- priority: <unsigned int> |
YAML Syntax
Each selection rule will translate into a separate IOCTL to the kernel.
...