...

  • On transmit timeout kiblnd notifies LNet that the peer has closed due to an error. This goes through the lnet_notify path.
  • The peer aliveness at the LNet layer is set to 0 (dead), and the last alive
  • In IBLND, whenever a message is received successfully, transmitted successfully, or a connection is completed (whether it succeeded or was rejected), the last alive time of the peer is set.
  • At the LNet layer, whenever a message is sent to a peer, check whether that peer is alive. For a non-router node, lnet_peer_aliveness_enabled() will always return 0:
    • If the peer is marked dead and the LND notified you of its death at time X, which is after the last known alive time, then consider the peer currently dead.
    • Otherwise, consider the peer alive if peer_timeout seconds have not passed since the last time it was alive.
    • If the peer_timeout has elapsed, then consider the peer dead.
      • The issue with this is that the peer is never retried again once the peer_timeout has elapsed.
    • If the node is a router, router_ping_timeout defaults to 50, which is less than 
    • The macro is defined as:
      #define lnet_peer_aliveness_enabled(lp) (the_lnet.ln_routing != 0 && \
      	((lp)->lpni_net) && \
      	(lp)->lpni_net->net_tunables.lct_peer_timeout > 0)
    • In effect, the aliveness of the peer is not considered at all if the node is not a router. 

      • This can remain the same since the health of the peer will be considered in lnet_select_pathway() before this is considered.
      • In fact, if the logic for the health of the peer is done in lnet_select_pathway(), then the logic in lnet_post_send_locked() can be removed. A peer will always be as healthy as possible by the time the flow hits lnet_post_send_locked().
  • If the node is not a router, then a peer will always be tried regardless of its health. If it is a router then 
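
The aliveness decision described in the list above can be sketched as follows. This is a minimal illustration, not the LNet implementation: struct peer_state, its fields, and the functions aliveness_enabled(), peer_is_alive() and should_send() are hypothetical names introduced here, times are plain seconds, and the exact boundary at which peer_timeout counts as "elapsed" is an assumption.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the per-peer state discussed above.
 * These are NOT the real LNet structure members. */
struct peer_state {
	bool alive;          /* aliveness flag; false corresponds to 0 (dead) */
	long last_alive;     /* seconds: last time the LND recorded activity */
	long death_notified; /* seconds: when the LND reported the peer dead */
};

/* Mirrors the intent of lnet_peer_aliveness_enabled(): aliveness is only
 * tracked on a routing node with a positive peer timeout. */
static bool aliveness_enabled(bool routing, int peer_timeout)
{
	return routing && peer_timeout > 0;
}

/* Applies the three rules from the list, in order. */
static bool peer_is_alive(const struct peer_state *p, long now, int peer_timeout)
{
	/* Marked dead, and the death notification is newer than the last
	 * known alive time: treat the peer as currently dead. */
	if (!p->alive && p->death_notified > p->last_alive)
		return false;

	/* Otherwise alive only while peer_timeout seconds have not passed
	 * since last_alive (assumed boundary: elapsed means >=). */
	return (now - p->last_alive) < peer_timeout;
}

/* Send-time check: a non-router skips the aliveness test entirely. */
static bool should_send(const struct peer_state *p, bool routing,
			long now, int peer_timeout)
{
	if (!aliveness_enabled(routing, peer_timeout))
		return true;
	return peer_is_alive(p, now, peer_timeout);
}
```

With this sketch, a non-router always attempts the send, while a router refuses once the timeout has elapsed and nothing refreshes last_alive, which is the "never retried" problem noted in the list.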

Health Revisited

There are different scenarios to consider with Health:

...