Introduction

It is sometimes desirable to fine-tune the selection of local/remote NIs used for communication. For example, if a node currently has two networks, an o2ib network and a tcp network, both will be used. Especially when the traffic volume is low, the credits criteria will be equivalent for both networks, so both will be used in round-robin fashion. However, the user might want to send all traffic over one network and keep the other network free unless the first network goes down.

User Defined Selection Policies (UDSP) will allow this type of control. 

UDSPs are configured from lnetctl, via either the command line or YAML config files, and then passed to the kernel. Policies are applied to all local networks and remote peers and then stored in the kernel. Whenever new peers, peer NIs, local networks, or local NIs are added, they are matched against the stored rules.
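The matching step described above can be sketched as follows. This is a minimal illustration, not the kernel implementation: the `sketch_net_rule` structure, the `sketch_match_nid` function, and the `SKETCH_NO_PRIORITY` value are hypothetical names, and the sketch assumes a rule matches when its network name is exactly equal to the part of the NID after the `@`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stored rule: a network name and the priority assigned to it. */
struct sketch_net_rule {
	const char *net;   /* e.g. "o2ib" or "tcp" */
	int priority;      /* priority carried by the rule */
};

/* Hypothetical marker meaning "no rule matched this NID". */
#define SKETCH_NO_PRIORITY (-1)

/*
 * Match a newly added NID (format "address@net") against the stored
 * rules and return the priority of the first matching rule, or
 * SKETCH_NO_PRIORITY if the NID is malformed or nothing matches.
 */
int sketch_match_nid(const struct sketch_net_rule *rules, size_t count,
		     const char *nid)
{
	const char *net = strchr(nid, '@');
	size_t i;

	if (net == NULL)
		return SKETCH_NO_PRIORITY;
	net++; /* skip the '@' separator */

	for (i = 0; i < count; i++) {
		if (strcmp(rules[i].net, net) == 0)
			return rules[i].priority;
	}
	return SKETCH_NO_PRIORITY;
}
```

Running the same lookup each time a peer NI or local NI is added keeps late arrivals consistent with the rules that were configured earlier.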

The user interface is recorded here.

Use Cases

Preferred Network

If a node can be reached on two networks, it is sometimes desirable to designate a fail-over network. Lustre currently has the concept of High Availability (HA), which allows servicenode NIDs to be defined as described in section 11.2 of the Lustre manual. Using the syntax described in that section, two NIDs to the same peer can also be defined. However, this approach suffers from a current limitation in the Lustre software: the NIDs are exposed to layers above LNet. Ideally, network failure handling is contained within LNet, and Lustre only needs to worry about defining HA.

Given this, it is desirable to have two LNet networks defined on a node, each of which could have multiple interfaces, and then have a way to tell LNet to always use one network until it is no longer available, i.e. until all interfaces in that network are down.

In this manner we separate the functionality of defining fail-over pairs from defining fail-over networks.
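A minimal sketch of the fail-over behaviour described above: always use the preferred network and fall back only when every interface on it is down. The `sketch_net` structure and `sketch_pick_net` function are hypothetical, and the sketch assumes that a lower priority value means a more preferred network, which this page does not specify.

```c
#include <stddef.h>

/* Hypothetical, simplified view of an LNet network for illustration only. */
struct sketch_net {
	const char *name;   /* e.g. "o2ib" or "tcp" */
	int priority;       /* assumption: lower value = more preferred */
	int healthy_nis;    /* number of interfaces currently up */
};

/*
 * Return the index of the most preferred network that still has at
 * least one healthy interface, or -1 if every network is down.
 */
int sketch_pick_net(const struct sketch_net *nets, size_t count)
{
	int best = -1;
	size_t i;

	for (i = 0; i < count; i++) {
		if (nets[i].healthy_nis <= 0)
			continue; /* network unavailable: all interfaces down */
		if (best < 0 || nets[i].priority < nets[best].priority)
			best = (int)i;
	}
	return best;
}
```

With an o2ib network at priority 0 and a tcp network at priority 1, the function keeps returning o2ib until its healthy interface count reaches zero, at which point tcp is chosen.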

Preferred NIDs

Depending on the network topology on which the Lustre network is built, it might be necessary to assign priorities to specific interfaces that are connected to optimized paths. In this way, messages don't take more hops than necessary to reach the destination. For example, in a dragonfly topology as diagrammed below, a node can have multiple interfaces on the same network, but some interfaces are not optimized to go directly to the destination group. If the selection algorithm operates without any rules, it could select a local interface that is less than optimal.

Therefore, giving priority for a local NID within a network is a way to ensure that messages always prefer the optimized paths.

DragonFly Topology
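One way to picture NID priority within a single network is shown below. This is only a sketch: `sketch_ni` and `sketch_pick_ni` are hypothetical names, lower priority values are assumed to be more preferred, and using available credits as the tie-breaker is an assumption based on the credits criteria mentioned in the introduction.

```c
#include <stddef.h>

/* Hypothetical, simplified local interface on one network. */
struct sketch_ni {
	const char *nid;   /* e.g. "10.0.0.1@o2ib" */
	int priority;      /* assumption: lower value = more preferred */
	int credits;       /* currently available credits */
};

/*
 * Return the index of the interface with the best priority. Interfaces
 * with equal priority are ranked by available credits (more is better).
 * Returns -1 when the list is empty.
 */
int sketch_pick_ni(const struct sketch_ni *nis, size_t count)
{
	int best = -1;
	size_t i;

	for (i = 0; i < count; i++) {
		if (best < 0 ||
		    nis[i].priority < nis[best].priority ||
		    (nis[i].priority == nis[best].priority &&
		     nis[i].credits > nis[best].credits))
			best = (int)i;
	}
	return best;
}
```

Because priority is compared before credits, an interface on an optimized path is chosen even when a non-optimized interface has more credits available.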

Preferred local/remote NID pairs

This is a more fine-grained method of specifying an exact path: rather than only assigning a priority to a local interface or a remote interface, it specifies concrete pairs of interfaces that are most preferred. A peer interface can be associated with multiple local interfaces if necessary, giving an N:1 relationship between local interfaces and remote interfaces.
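The N:1 relationship can be illustrated with a small rule table in which several local NIDs are paired with the same remote NID. The `sketch_pair_rule` structure and `sketch_preferred_src` function are hypothetical, and lower priority values are again assumed to be more preferred.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical pair rule: prefer src_nid when sending to dst_nid. */
struct sketch_pair_rule {
	const char *src_nid;   /* local interface */
	const char *dst_nid;   /* remote (peer) interface */
	int priority;          /* assumption: lower value = more preferred */
};

/*
 * Return the local NID of the most preferred rule for dst_nid, or NULL
 * if no rule mentions that destination. Several rules may share one
 * dst_nid, which gives the N:1 local-to-remote relationship.
 */
const char *sketch_preferred_src(const struct sketch_pair_rule *rules,
				 size_t count, const char *dst_nid)
{
	const struct sketch_pair_rule *best = NULL;
	size_t i;

	for (i = 0; i < count; i++) {
		if (strcmp(rules[i].dst_nid, dst_nid) != 0)
			continue;
		if (best == NULL || rules[i].priority < best->priority)
			best = &rules[i];
	}
	return best != NULL ? best->src_nid : NULL;
}
```

A NULL result means no pair rule applies to that destination, in which case the caller would fall back to whatever selection it would otherwise perform.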

DLC APIs

The DLC library will provide the outlined APIs to expose a way to create, delete and show rules.

/*
 * lustre_lnet_add_net_sel_pol
 *   Add a net selection policy. If there already exists a 
 *   policy for this net it will be updated.
 *      net - Network for the selection policy
 *      priority - priority of the rule
 */
int lustre_lnet_add_net_sel_pol(char *net, int priority);
 
/*
 * lustre_lnet_del_net_sel_pol
 *   Delete a net selection policy.
 *      net - Network for the selection policy
 *      id - [OPTIONAL] ID of the policy. This can be retrieved via a show command.
 */
int lustre_lnet_del_net_sel_pol(char *net, int id);
 
/*
 * lustre_lnet_show_net_sel_pol
 *   Show configured net selection policies.
 *      net - filter on the net provided.
 */
int lustre_lnet_show_net_sel_pol(char *net);
 
/*
 * lustre_lnet_add_nid_sel_pol
 *   Add a nid selection policy. If there already exists a 
 *   policy for this nid it will be updated. NIDs can be either
 *   local NIDs or remote NIDs.
 *      nid - NID for the selection policy
 *      priority - priority of the rule
 */
int lustre_lnet_add_nid_sel_pol(char *nid, int priority);
 
/*
 * lustre_lnet_del_nid_sel_pol
 *   Delete a nid selection policy.
 *      nid - NID for the selection policy
 *      id - [OPTIONAL] ID of the policy. This can be retrieved via a show command.
 */
int lustre_lnet_del_nid_sel_pol(char *nid, int id);
 
/*
 * lustre_lnet_show_nid_sel_pol
 *   Show configured nid selection policies.
 *      nid - filter on the NID provided.
 */
int lustre_lnet_show_nid_sel_pol(char *nid);
 
/*
 * lustre_lnet_add_peer_sel_pol
 *   Add a peer to peer selection policy. If there already exists a 
 *   policy for the pair it will be updated.
 *      src_nid - source NID
 *      dst_nid - destination NID
 *      priority - priority of the rule
 */
int lustre_lnet_add_peer_sel_pol(char *src_nid, char *dst_nid, int priority);
 
/*
 * lustre_lnet_del_peer_sel_pol
 *   Delete a peer to peer selection policy.
 *      src_nid - source NID
 *      dst_nid - destination NID
 *      id - [OPTIONAL] ID of the policy. This can be retrieved via a show command.
 */
int lustre_lnet_del_peer_sel_pol(char *src_nid, char *dst_nid, int id);


/*
 * lustre_lnet_show_peer_sel_pol
 *   Show peer to peer selection policies.
 *      src_nid - [OPTIONAL] source NID. If provided the output will be filtered
 *                on this value.
 *      dst_nid - [OPTIONAL] destination NID. If provided the output will be filtered
 *                on this value.
 */
int lustre_lnet_show_peer_sel_pol(char *src_nid, char *dst_nid);
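As a usage sketch, the add calls above might be combined as shown below. The three function bodies are stand-in stubs so that the example compiles on its own; they are not the DLC implementation. The convention that 0 means success, the priority value 0, the example NIDs, and the helper `sketch_configure_policies` are all assumptions made for illustration.

```c
#include <stdio.h>

/*
 * Stand-in stubs for the proposed DLC calls. They only print the rule;
 * returning 0 for success is an assumption, not a documented contract.
 */
int lustre_lnet_add_net_sel_pol(char *net, int priority)
{
	printf("net rule: %s priority %d\n", net, priority);
	return 0;
}

int lustre_lnet_add_nid_sel_pol(char *nid, int priority)
{
	printf("nid rule: %s priority %d\n", nid, priority);
	return 0;
}

int lustre_lnet_add_peer_sel_pol(char *src_nid, char *dst_nid, int priority)
{
	printf("peer rule: %s -> %s priority %d\n", src_nid, dst_nid, priority);
	return 0;
}

/* Hypothetical helper: add one rule of each kind, stop at the first failure. */
int sketch_configure_policies(void)
{
	/* The APIs take non-const char *, so use writable arrays. */
	char net[] = "o2ib";
	char local_nid[] = "10.0.0.1@o2ib";
	char remote_nid[] = "10.0.1.5@o2ib";
	int rc;

	rc = lustre_lnet_add_net_sel_pol(net, 0);
	if (rc != 0)
		return rc;

	rc = lustre_lnet_add_nid_sel_pol(local_nid, 0);
	if (rc != 0)
		return rc;

	return lustre_lnet_add_peer_sel_pol(local_nid, remote_nid, 0);
}
```

The three calls correspond to the three use cases: a preferred network, a preferred NID, and a preferred local/remote NID pair.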

 

Data structures

IOCTL

Serialization/Deserialization

Policy Application

Selection Algorithm Integration
