...
- LND Timeout: LND declares that a message won't arrive.
- IB timeout is (default?) slightly less than 4 seconds
- LND timeout is the timeout module parameter for o2ib and gni, and the sock_timeout module parameter for sock?
- LNet Reply Timeout: LNet declares an Ack/Reply won't arrive. > 2 * LND Timeout * (max hops - 1)
- Depends on the route!
- LNet Retry Timeout: LNet gives up on retries. > LNet Reply Timeout * max LNet retries
- Depends on the route!
- peer_timeout module parameter: peer is declared dead. Either use it for the LNet Retry Timeout, or set it > LNet Retry Timeout.
It is not completely obvious how this scheme interacts with Lustre's timeout parameter (the Lustre RPC timeout, from which a number of timeouts appear to be derived, but I think it isn't the LND timeout), or with the per-LND peer_timeout module parameter.
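The two inequalities in the list above can be turned into a small calculation. The following is only a sketch of that arithmetic, not LNet code: the function name, its arguments, and the sample values (a 5 second LND timeout, 3 hops, 2 retries) are hypothetical and chosen purely for illustration. Because both bounds depend on the route, the calculation would have to be repeated per route with that route's hop count.

```python
def lnet_timeout_bounds(lnd_timeout, max_hops, max_retries):
    """Return the lower bounds (reply_bound, retry_bound) in seconds.

    Hypothetical helper: it only restates the inequalities from the notes.
      LNet Reply Timeout > 2 * LND Timeout * (max hops - 1)
      LNet Retry Timeout > LNet Reply Timeout * max LNet retries
    The configured timeouts must be strictly greater than these values.
    """
    reply_bound = 2 * lnd_timeout * (max_hops - 1)
    # The retry bound is computed from the reply bound itself, so it is
    # the smallest value the inequality chain allows.
    retry_bound = reply_bound * max_retries
    return reply_bound, retry_bound


# Illustrative values only, not defaults of any LND.
reply_bound, retry_bound = lnet_timeout_bounds(lnd_timeout=5, max_hops=3, max_retries=2)
print(f"LNet Reply Timeout must exceed {reply_bound}s")
print(f"LNet Retry Timeout must exceed {retry_bound}s")
```

If peer_timeout is used as the LNet Retry Timeout, it would need to exceed the second value; otherwise it would be set above whatever LNet Retry Timeout is chosen.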
O2IBLND
Overview
There are two types of events to account for:
...