Original Pre-Health Requirements
Router Requirements
A router is a node which has the routing feature turned on using lnetctl set routing 1
or the equivalent modprobe configuration.
- Track the last time stamp any message was received on a local NI
  - If the NI hasn't received any traffic for a period of router_ping_timeout + MAX(live_router_check_interval, dead_router_check_interval), then it's marked down
    - This is done so that other nodes using the gateway can mark the route down, given that avoid_asym_router_failure is set to 1.
- Do not send messages to a peer NI which is marked down.
- Set the peer status to up when messages are received
- For each peer NI that is marked down, when there are messages to forward to it, query it at least once per second to check if it is back up. If the query result determines that the peer NI is reachable, the peer NI state is set to UP. Messages can then be sent to that peer NI.
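The local NI timeout rule above can be sketched as a small Python model. This is only an illustration of the arithmetic, not the LNet implementation: the function names are hypothetical, the numeric values are made up for the example, and treating "exactly at the timeout" as down is an assumption.

```python
def ni_down_timeout(router_ping_timeout, live_router_check_interval,
                    dead_router_check_interval):
    """Seconds of silence after which a router's local NI is marked down."""
    return router_ping_timeout + max(live_router_check_interval,
                                     dead_router_check_interval)


def local_ni_is_down(now, last_rx_time, router_ping_timeout,
                     live_router_check_interval, dead_router_check_interval):
    """True when no message has arrived on the NI within the timeout window."""
    timeout = ni_down_timeout(router_ping_timeout, live_router_check_interval,
                              dead_router_check_interval)
    # Assumption: reaching the timeout exactly already counts as down.
    return now - last_rx_time >= timeout


# Illustrative values only (seconds); they are not taken from this document.
print(ni_down_timeout(50, 60, 30))             # 50 + max(60, 30)
print(local_ni_is_down(200, 100, 50, 60, 30))  # silent for 100s
print(local_ni_is_down(215, 100, 50, 60, 30))  # silent for 115s
```

With these example values the window is 110 seconds, so an NI that has been silent for 100 seconds is still considered up, while one that has been silent for 115 seconds is marked down.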
Gateway Requirements
A gateway in this context is the peer NI created when adding a route on a node. For example: lnetctl route add --net tcp --gateway <gateway-NID>.
Dealing with that peer-NI is somewhat of a special case.
- Mark the gateway peer NI as down when the LND fails to send a message
  - Note: although the LND notifications happen for all peer NIs, they are only pertinent on routers or for gateways.
- Mark the gateway peer NI as up when we receive an unsolicited message or when we receive a REPLY for a PING sent from the router checker.
- Mark the route as down if one of the gateway's interfaces, identified by the gateway peer NI, is down, provided avoid_asym_router_failure is set to 1.
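The gateway rules above can be modelled roughly as follows. This is a sketch under stated assumptions rather than the actual LNet code: GatewayPeerNI, route_is_down and the gateway_interface_up map are hypothetical names, and treating a route as down whenever its gateway peer NI is down is an assumption drawn from the peer requirement to pick only routes whose gateway is up.

```python
from dataclasses import dataclass


@dataclass
class GatewayPeerNI:
    nid: str
    alive: bool = True

    def on_lnd_send_failure(self):
        # The LND failed to send a message to this gateway peer NI.
        self.alive = False

    def on_message_received(self):
        # Covers an unsolicited message and a REPLY to a router checker PING.
        self.alive = True


def route_is_down(gateway, gateway_interface_up, avoid_asym_router_failure=1):
    """gateway_interface_up: hypothetical map of gateway interface -> up flag."""
    if not gateway.alive:
        # Assumption: a route through a down gateway peer NI is unusable.
        return True
    if avoid_asym_router_failure == 1:
        # Any down interface on the gateway brings the route down.
        return not all(gateway_interface_up.values())
    return False


gw = GatewayPeerNI("10.10.10.3@tcp")
print(route_is_down(gw, {"tcp": True, "tcp2": False}))     # asym check on
print(route_is_down(gw, {"tcp": True, "tcp2": False}, 0))  # asym check off
```

In the usage lines, the same interface report marks the route down only when avoid_asym_router_failure is 1.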
Peer Requirements
- Do not check for peer aliveness when sending a message to a peer.
- Pick a route which has its gateway peer NI marked as up.
Implementation Details
The routing infrastructure currently performs the following functions:
- Keep track of the last time the peer was alive, lpni_last_alive
- Keep track of the last time the peer was notified that its state has changed, lpni_timestamp
- The peer can change state under the following conditions:
  - The LND notifies that the peer is down when it fails to send a message to the peer.
    - As an example in o2iblnd:
      - kiblnd_peer_connect_failed() and kiblnd_disconnect_conn() call kiblnd_peer_notify() which calls lnet_notify() to set the peer to dead if there was an error
  - A message is received in lnet_parse()
    - In this case the peer state is set to alive only for gateway peer NIs
  - When the router checker ping is responded to or it fails.
  - If the router checker ping times out.
- This step only concerns routers: Only send the message if the peer is alive, determined as outlined above.
- On the router, if the NI hasn't received any traffic for a period of router_ping_timeout + MAX(live_router_check_interval, dead_router_check_interval), then it's marked down.
  - This is done in order for the peers using the router to mark the peer down when avoid_asym_router_failure is set to 1, which it is by default.
LNet Multi-Rail Routing
Multi-Rail introduced the concept of a peer and a peer NI. A peer can have multiple peer NIs. This changes the semantics of route configuration. Currently a route can be configured as:
...
Nodes on different networks will use different primary NIDs to refer to the same router. That is, a primary NID is only a representation of the router on the peer with the route configured.
Multi-Rail Router Requirements
- Configure a percentage of the maximum health below which an interface will not be selected for use. This percentage value will be referred to as router_sensitivity_percentage
  - 100% means that an interface which has less than MAX_HEALTH will not be selected for use
  - 0% means that an interface will be selected for use as long as it has the best health value among the available interfaces.
- Do not put a message on the wire if the health of a peer_ni is below MAX_HEALTH * router_sensitivity_percentage
- Attempt to recover an unhealthy peer_ni once per second by pinging it
- LND shall notify LNet whenever it determines a peer_ni is alive or dead. The API will provide a parameter which will force LNet to fully recover the peer_ni's health.
  - Currently the gnilnd is aware of the gni network health and therefore it can inform the LNet layer when a peer is alive or dead. In fact the gnilnd is the only LND which informs LNet when the peer is alive. All the other LNDs only tell the LNet when the peer is disconnected. Therefore the gnilnd can set the fully_recover parameter to true, while the other LNDs can set it to false.
- LNet shall call an LND API to notify that a peer_ni is dead whenever the peer_ni's health goes below MAX_HEALTH * router_sensitivity_percentage
  - This is derived from the current code and is only applicable to socklnd.
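The sensitivity threshold described in these requirements can be illustrated with a short Python sketch. It is a model of the rule only: MAX_HEALTH = 1000 is an assumed value for illustration, and health_threshold, selectable and pick_peer_ni are hypothetical helper names.

```python
MAX_HEALTH = 1000  # assumed value, for illustration only


def health_threshold(router_sensitivity_percentage):
    """Health value below which a peer_ni is not used."""
    return MAX_HEALTH * router_sensitivity_percentage // 100


def selectable(health, router_sensitivity_percentage):
    return health >= health_threshold(router_sensitivity_percentage)


def pick_peer_ni(health_by_nid, router_sensitivity_percentage):
    """Return the healthiest peer_ni at or above the threshold, or None."""
    candidates = {nid: h for nid, h in health_by_nid.items()
                  if selectable(h, router_sensitivity_percentage)}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


nis = {"10.10.10.3@tcp": 500, "10.10.10.4@tcp": 999}
print(pick_peer_ni(nis, 100))  # nothing is at MAX_HEALTH
print(pick_peer_ni(nis, 0))    # best available health wins
```

At 100% neither interface qualifies because both are below MAX_HEALTH; at 0% the interface with the best health value is chosen.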
Multi-Rail Route Requirements
- A route is considered down if there are no viable peer_nis on the remote net of the gateway
  - EX: if a route is defined as: lnetctl route add --net tcp2 --gateway 10.10.10.3@tcp, then if the gateway defined as 10.10.10.3@tcp has no healthy peer_nis on tcp2, then that route is dead
- A gateway is considered down under two circumstances:
  - All remote nets reported in the REPLY to the PING are down
  - All local representations of the peer_nis on the remote net have a health value below: MAX_HEALTH * router_sensitivity_percentage
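The route and gateway rules above might be expressed as follows. This is a hedged sketch: MAX_HEALTH = 1000 is assumed, the data shapes (a map of remote net to peer_ni health values, and a map of remote net to an up flag taken from the PING REPLY) are hypothetical, and treating either circumstance as sufficient to mark the gateway down is an interpretation of the list above.

```python
MAX_HEALTH = 1000  # assumed value, for illustration only


def is_viable(health, router_sensitivity_percentage):
    return health * 100 >= MAX_HEALTH * router_sensitivity_percentage


def route_is_down(peer_ni_health_by_net, remote_net,
                  router_sensitivity_percentage):
    """Down when the gateway has no viable peer_ni on the route's remote net."""
    healths = peer_ni_health_by_net.get(remote_net, [])
    return not any(is_viable(h, router_sensitivity_percentage)
                   for h in healths)


def gateway_is_down(remote_net_up, peer_ni_health_by_net, remote_net,
                    router_sensitivity_percentage):
    # Circumstance 1: every remote net in the PING REPLY is down.
    # An empty report counts as "all down" in this sketch.
    all_remote_nets_down = not any(remote_net_up.values())
    # Circumstance 2: no local peer_ni representation on the remote net
    # reaches the health threshold.
    no_viable_peer_ni = route_is_down(peer_ni_health_by_net, remote_net,
                                      router_sensitivity_percentage)
    return all_remote_nets_down or no_viable_peer_ni


health = {"tcp2": [900, 1000]}
print(route_is_down(health, "tcp2", 100))
print(gateway_is_down({"tcp2": False}, health, "tcp2", 100))
```

Here the route over tcp2 stays up because one peer_ni is at full health, while the gateway is still reported down because the PING REPLY lists every remote net as down.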
Configuration
A router can be configured as follows to utilize the new health infrastructure
lnet_health_sensitivity >= 1 ## this will decrement the health of the NI by the value specified every time there is a failure to send to that interface
router_sensitivity_percentage = 100 ## this will consider the route down if there is no NI on the remote net of the gateway with health == LNET_MAX_HEALTH_VALUE
- Optionally we can set
retry_count > 0 ## this will attempt to resend a message on a different NI if one is available
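As a worked illustration of how these settings interact, the following Python sketch models the health decrement and the resulting usability check. It is not the LNet implementation: LNET_MAX_HEALTH_VALUE = 1000 is an assumed value, and health_after_failures and ni_usable are hypothetical helpers.

```python
LNET_MAX_HEALTH_VALUE = 1000  # assumed value, for illustration only


def health_after_failures(failures, lnet_health_sensitivity,
                          start=LNET_MAX_HEALTH_VALUE):
    """Each send failure decrements health by lnet_health_sensitivity."""
    return max(0, start - failures * lnet_health_sensitivity)


def ni_usable(health, router_sensitivity_percentage):
    """Usable when health is at or above the configured percentage of max."""
    return health * 100 >= LNET_MAX_HEALTH_VALUE * router_sensitivity_percentage


h = health_after_failures(failures=1, lnet_health_sensitivity=1)
print(h)                  # one failure, sensitivity 1
print(ni_usable(h, 100))  # strictest setting
print(ni_usable(h, 99))   # slightly relaxed setting
```

With router_sensitivity_percentage = 100, a single send failure is enough to take the NI out of consideration until its health recovers; a lower percentage tolerates some failures.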
Route Selection
Currently a route is selected based on the priority and hops values given to it; after that, the credits for the peer NI are evaluated. With Multi-Rail there should be two evaluation factors in the selection process.
...
Furthermore, with the LNet Resiliency feature the healthiest interface of the router or set of routers is selected.
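The existing ordering (priority and hops first, then peer NI credits) can be sketched as a sort key. This is a simplified model, not the LNet selection code: the Route type is hypothetical, and the directions assumed here (a lower priority value and fewer hops are preferred, more credits are preferred) are assumptions that this section does not state.

```python
from typing import NamedTuple


class Route(NamedTuple):
    gateway: str
    priority: int
    hops: int
    credits: int


def select_route(routes):
    """Pick by priority, then hops, then available credits."""
    # Assumption: lower priority value and fewer hops win; more credits win.
    return min(routes,
               key=lambda r: (r.priority, r.hops, -r.credits),
               default=None)


routes = [
    Route("10.10.10.3@tcp", priority=0, hops=2, credits=8),
    Route("10.10.10.4@tcp", priority=0, hops=1, credits=2),
    Route("10.10.10.5@tcp", priority=0, hops=1, credits=6),
]
print(select_route(routes).gateway)
```

In this example all routes share the same priority, two tie on hops, and the one with more credits is chosen.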
LNet Resiliency
The LNet Health/Resiliency feature has added the following features:
...
Some details were documented here: Routing and MR integration
Proposed Changes
Router Discovery
There are two ways to discover a router:
...
Router Peer instead of Peer NI
Multi-Rail considers that a Peer can have multiple interfaces, i.e. Peer NIs, on different networks.
...
Router Ping
Gateways are pinged on a configured interval. If the gateway is dead, a separate configuration parameter governs how often it is pinged to determine whether it is back up.
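A minimal sketch of this two-interval schedule is shown below. It assumes that the two intervals correspond to the live_router_check_interval and dead_router_check_interval parameters named earlier in this document; next_ping_delay and ping_times are hypothetical helpers and the interval values are made up.

```python
def next_ping_delay(gateway_alive, live_router_check_interval,
                    dead_router_check_interval):
    """Seconds to wait before pinging the gateway again."""
    return (live_router_check_interval if gateway_alive
            else dead_router_check_interval)


def ping_times(alive_after_each_ping, live_interval, dead_interval, start=0):
    """Return the times at which pings are sent, given each ping's outcome."""
    times, now = [], start
    for alive in alive_after_each_ping:
        times.append(now)
        now += next_ping_delay(alive, live_interval, dead_interval)
    return times


# Gateway answers, then fails twice, then answers again.
print(ping_times([True, False, False, True], live_interval=60,
                 dead_interval=10))
```

With these example values the gateway is pinged at 0, 60, 70 and 80 seconds: the shorter interval applies while it is considered dead.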
...
Router Aliveness and Health
Currently, there is some tricky code to determine the aliveness of a peer. The intent of the code seems to be divided into two categories: Router and Gateway
...
A gateway is a peer_ni which references a router
Aliveness on a Router
- If a peer NI is dead do not send messages to it
- If a peer NI is dead query it every 1 second to see if it's back up when there is traffic.
- If messages are received on a peer NI set its aliveness to up
- If a local NI does not receive a message for a configured period of time, then bring down the status of the local NI. That will be discovered when the router is pinged.
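The peer NI rules in this list can be modelled roughly as follows. This is an illustrative sketch only: the PeerNI class, its method names and the query callback are hypothetical, and the model queries a dead peer NI only when a send is attempted, spacing queries one second apart.

```python
class PeerNI:
    QUERY_INTERVAL = 1.0  # seconds between queries of a dead peer NI

    def __init__(self, nid, alive=True):
        self.nid = nid
        self.alive = alive
        self.last_query = None

    def on_message_received(self):
        # Receiving traffic from the peer NI marks it up.
        self.alive = True

    def can_send(self, now, query):
        """Called when there is a message to forward to this peer NI.

        query is a hypothetical callable returning True if the NI is reachable.
        """
        if self.alive:
            return True
        if (self.last_query is not None
                and now - self.last_query < self.QUERY_INTERVAL):
            return False  # queried too recently; do not send
        self.last_query = now
        if query(self.nid):
            self.alive = True
        return self.alive


peer = PeerNI("10.10.10.7@tcp", alive=False)
print(peer.can_send(0.0, lambda nid: False))  # queried, still dead
print(peer.can_send(0.5, lambda nid: True))   # within 1s, not queried
print(peer.can_send(1.0, lambda nid: True))   # queried again, back up
```

The second call does not query the peer NI even though it would have answered, because less than one second has passed since the previous query.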
Aliveness for a gateway
- Set the gateway peer NI to down if there is a failure to send a message
- The gateway peer NI is pinged every configured period of time.
- Set the gateway peer_ni to down if the routing is not possible through it
- Mark the gateway peer NI as up when it receives a message
...