A router is a node which has the routing feature turned on using lnetctl set routing 1
or the equivalent modprobe configuration.
router_ping_timeout + MAX(live_router_check_interval, dead_router_check_interval)
then it's marked down
avoid_asym_router_failure is set to 1.
A gateway in this context is the peer NI created when adding a route on a node. For example: lnetctl route add --net tcp --gateway <gateway-NID>.
Dealing with that peer-NI is somewhat of a special case.
avoid_asym_router_failure is set to 1.
The routing infrastructure currently performs the following functions:
lpni_last_alive
lpni_timestamp
dead
if there was an error
lnet_parse()
router_ping_timeout + MAX(live_router_check_interval, dead_router_check_interval)
then it's marked down.
avoid_asym_router_failure is set to 1, which it is by default.
Multi-Rail introduced the concept of a peer and a peer NI. A peer can have multiple peer NIs. This changes the semantics of route configuration. Currently a route can be configured as:
lnetctl route add --net <remote net> --gateway <gateway-peer-NID>
The gateway-peer-NID refers to a specific interface on the router. However with MR enabled on the router, multiple interfaces can be configured on the same network. Therefore, the configuration semantics should be as follows:
lnetctl route add --net <remote net> --gateway <gateway-primary-NID>
A router should be discovered on first use. The discovery process will determine all the interfaces available on the router. There could be multiple interfaces on the same network.
A route should only be marked down if it cannot route messages, which means that the router has no active interface on the route's remote net.
Nodes on different networks will use different primary NIDs to refer to the same router, i.e. a primary NID is only a representation of the router on the node that has the route configured.
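The "no active interface on the remote net" rule above can be sketched roughly as follows. This is an illustration only: struct gw_iface and route_is_alive() are hypothetical names that do not exist in LNet, and the network is reduced to a plain integer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one discovered gateway interface. */
struct gw_iface {
	unsigned int net;	/* network the interface is on */
	bool up;		/* status reported for this interface */
};

/*
 * A route stays usable as long as the gateway has at least one
 * active interface on the route's remote network.
 */
bool route_is_alive(const struct gw_iface *ifaces, size_t n,
		    unsigned int remote_net)
{
	size_t i;

	assert(ifaces != NULL || n == 0);

	for (i = 0; i < n; i++)
		if (ifaces[i].net == remote_net && ifaces[i].up)
			return true;

	return false;
}
```

With this rule, one failed interface does not take the route down while another interface on the same remote net is still up.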
Currently a route is selected based on the priority and hops value given to it; after that the credits for the peer NI are evaluated. With Multi-Rail there should be two evaluation factors in the selection process.
A mechanism should be created to restrict the selection to a group of peer NIs that could belong to different gateways that can reach the same remote network.
In this way the Multi-Rail aspect of the gateway is considered.
Furthermore, with the LNet Resiliency feature the healthiest interface of the router or set of routers is selected.
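As a rough sketch of the selection described above, the candidate set below holds peer NIs that may belong to different gateways but all reach the same remote network. The types and function names are hypothetical, and the tie-break order (priority, then hops, then health, then credits) is an assumption for illustration rather than the order LNet actually uses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified candidate: one gateway peer NI. */
struct gw_ni {
	int gateway_id;	/* gateway (peer) that owns this NI */
	int health;	/* higher is healthier */
	int credits;	/* available credits, higher is better */
	int priority;	/* route priority, lower is preferred */
	int hops;	/* route hop count, lower is preferred */
};

/* Return nonzero if a is a better choice than b. */
static int gw_ni_better(const struct gw_ni *a, const struct gw_ni *b)
{
	if (a->priority != b->priority)
		return a->priority < b->priority;
	if (a->hops != b->hops)
		return a->hops < b->hops;
	if (a->health != b->health)
		return a->health > b->health;
	return a->credits > b->credits;
}

/* Pick the best peer NI from the group, or NULL if the group is empty. */
const struct gw_ni *select_gw_ni(const struct gw_ni *cand, size_t n)
{
	const struct gw_ni *best = NULL;
	size_t i;

	assert(cand != NULL || n == 0);

	for (i = 0; i < n; i++)
		if (best == NULL || gw_ni_better(&cand[i], best))
			best = &cand[i];

	return best;
}
```

Because the candidates are compared as one group, an interface on a second gateway can win over a less healthy interface on the first.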
The LNet Health/Resiliency feature has added the following features:
The original route code, which implements the requirements outlined above, is no longer in line with the new mechanisms. An effort is needed to bring the router code in line with the new features.
Some details were documented here: Routing and MR integration
There are two ways to discover a router:
Different routes can be added using different NIDs of the same gateway. When the gateway is discovered on first use there will be a need to consolidate the routing information.
For example, let's take the scenario where SET-A of routes were entered through the gateway using GW-NID-A and SET-B of routes were entered through the gateway using GW-NID-B. This will create the following structure:
When the GW is discovered three scenarios are possible:
In all these cases we will need to consolidate the routing information as follows:
In the discovery code the consolidation of the peer information is driven from: lnet_peer_data_present()
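A minimal sketch of the consolidation idea: once discovery shows that two gateway entries refer to the same router, the routes hanging off one entry are moved to the other. The struct and function names here are hypothetical and are not taken from the LNet source. Duplicate routes to the same remote net are not handled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified route entry. */
struct route {
	unsigned int remote_net;
	struct route *next;
};

/* Hypothetical gateway entry, created per NID when a route is added. */
struct gateway {
	struct route *routes;
	int nroutes;
};

/* Move every route from src to dst, leaving src empty. */
void gateway_merge_routes(struct gateway *dst, struct gateway *src)
{
	assert(dst != NULL && src != NULL && dst != src);

	while (src->routes != NULL) {
		struct route *r = src->routes;

		src->routes = r->next;
		r->next = dst->routes;
		dst->routes = r;
		dst->nroutes++;
		src->nroutes--;
	}
}
```

After the merge, the emptied entry no longer carries any routes and can be released by the caller.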
The change is tracked under:
Multi-Rail considers that a peer can have multiple interfaces, i.e. peer NIs, on different networks.
Currently when a route is added the gateway is one of the leaves of the tree, i.e. a peer NI. To fully integrate routing with MR the gateway should be considered the tip of the tree, i.e. the peer.
This change will overhaul the current routing code. There are several fields in struct lnet_peer_ni which are used for routing:
/* messages blocking for router credits */
struct list_head lpni_rtrq;
/* chain on router list */
struct list_head lpni_rtr_list;
/* # times router went dead<->alive. Protected with lpni_lock */
int lpni_alive_count;
/* # refs from lnet_route_t::lr_gateway */
int lpni_rtr_refcount;
/* routes on this peer */
struct list_head lpni_routes;
/* router checker state */
struct lnet_rc_data *lpni_rcd;
These fields will need to be transitioned to the peer, and all areas of the code which use them will need to be modified.
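One possible shape for the transition is sketched below. The lp_ field names and the gw_peer type are hypothetical, chosen by replacing the lpni_ prefix. struct list_head is a local stand-in so the fragment compiles on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel list head, so the sketch is self-contained. */
struct list_head {
	struct list_head *next, *prev;
};

struct lnet_rc_data;	/* opaque here: router checker state */

/* Hypothetical peer-level home for the routing fields. */
struct gw_peer {
	struct list_head lp_rtrq;	/* messages blocking for router credits */
	struct list_head lp_rtr_list;	/* chain on router list */
	int lp_alive_count;		/* # times router went dead<->alive */
	int lp_rtr_refcount;		/* # routes referencing this gateway */
	struct list_head lp_routes;	/* routes on this peer */
	struct lnet_rc_data *lp_rcd;	/* router checker state */
};

/* A peer acts as a gateway while at least one route references it. */
bool gw_peer_is_gateway(const struct gw_peer *lp)
{
	assert(lp != NULL);
	return lp->lp_rtr_refcount > 0;
}
```

With the fields on the peer, a check like gw_peer_is_gateway() no longer depends on which peer NI was named when the route was added.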
Tracked under:
Gateways are pinged on a configured interval. If a gateway is dead, a separate configuration parameter governs how often it is pinged to determine whether it is back up.
The ping requires a fairly extensive infrastructure, including MD/EQ, an event handler and router state management. Much of that can be consolidated with the Discovery code, which already implements all the infrastructure required to manage sending pings and receiving responses. The intent of this proposal is to use this existing infrastructure in place of the router pinger.
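The two intervals can be sketched as a single "is a ping due" check. The struct and function below are hypothetical. The interval arguments stand for live_router_check_interval and dead_router_check_interval, and treating a non-positive interval as "checking disabled" is an assumption made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-gateway ping state; times are in seconds. */
struct gw_ping_state {
	bool alive;	/* last known state of the gateway */
	long last_ping;	/* time the last ping was sent */
};

/*
 * Decide whether a gateway should be pinged now. Live gateways use
 * live_interval, dead gateways use dead_interval.
 */
bool gw_ping_due(const struct gw_ping_state *gw, long now,
		 int live_interval, int dead_interval)
{
	int interval;

	assert(gw != NULL);

	interval = gw->alive ? live_interval : dead_interval;
	if (interval <= 0)	/* assumption: non-positive disables checks */
		return false;

	return now - gw->last_ping >= interval;
}
```

A periodic caller would evaluate this for each gateway and send a ping only when it returns true.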
The monitor thread calls a function to check if the gateways should be pinged.
The function traverses the list of gateways and sends out a ping if it is required.
Code exists to handle receiving a reply for the ping. When a REPLY is received a function is called to analyze the NIs in the REPLY. This analysis revolves around checking the status of each of the peer interfaces. If avoid_asym_router_failure is set then the gateway is marked down.
There are a couple of significant improvements/simplifications that can be done in this area:
Tracked under
Currently, there is some tricky code to determine the aliveness of a peer. The intent of the code seems to be divided into two categories: Router and Gateway
A router is a node which has routing feature enabled
A gateway is a peer_ni which references a router
Most of these requirements can be consolidated with the Health feature. Here is the proposed solution:
Under this proposal the router and gateway requirements continue to be met, while much of the code currently used to keep track of the aliveness of the peer NIs can be removed.
Tracked Under: