Test # | Tag | Procedure | Script | Result |
---|---|---|---|---|
1 | LNET_MSG_STATUS_LOCAL_INTERRUPT, LNET_MSG_STATUS_LOCAL_DROPPED, LNET_MSG_STATUS_LOCAL_ABORTED, LNET_MSG_STATUS_LOCAL_NO_ROUTE, LNET_MSG_STATUS_LOCAL_TIMEOUT | - MR node with multiple interfaces
- Send a PING
- Simulate an <error>
- PING msg should be queued on resend queue
- PING msg will be resent on a different interface
- Failed interface's health value will be decremented
- Failed interface will be placed on the recovery queue
|
|
|
2 | Sensitivity == 0 | - Same setup as 1
- NI is not placed on the recovery queue
|
|
|
3 | Sensitivity > 0 | - Same setup as 1
- NI is placed on the recovery queue
- Monitor network activity as NI is pinged until health is back to maximum
|
|
|
4 | Sensitivity > 0, buggy interface | - Same setup as 1
- NI is placed on recovery queue
- NI is pinged every 1 second
- Simulate ping failure every other ping
- NI's health should be decremented on failure
- NI should remain on the recovery queue
|
|
|
5 | Retry count == 0 | - Same setup as 1
- Message will not be retried and the message will be finalized immediately
|
|
|
6 | Retry count > 0 | - Same setup as 1
- Message will be retransmitted up to retry count times or until the message expires
|
|
|
7 | REPLY timeout | - Same setup as 1
- Except use LNet selftest
- Simulate a local timeout
- Re-transmit
- No REPLY received
- Message is finalized and TIMEOUT event is propagated
|
|
|
8 | ACK timeout | - Same setup as 7, except simulate an ACK timeout
|
|
|
9 | LNET_MSG_STATUS_LOCAL_ERROR | - Same setup as 1
- Message is finalized immediately (not resent)
- Local NI is placed on the recovery queue
- Same procedure to recover the local NI
|
|
|
10 | LNET_MSG_STATUS_REMOTE_DROPPED | - Same setup as 1
- Message is queued for resend depending on retry_count
- peer_ni is placed on the recovery queue (unless sensitivity == 0)
- peer_ni is pinged every 1 second
|
|
|
11 | LNET_MSG_STATUS_REMOTE_ERROR, LNET_MSG_STATUS_REMOTE_TIMEOUT, LNET_MSG_STATUS_NETWORK_TIMEOUT | - Same setup as 1
- Message is not resent
- peer_ni recovery happens as outlined in previous cases
|
|
|
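
The expected outcomes in the table above can be summarized as a small decision model. The sketch below is a toy Python model for reasoning about the test cases, not LNet code. The names `Interface` and `handle_failure`, the `MAX_HEALTH` value, and the choice to decrement health by the sensitivity value are all assumptions made for illustration; the table only states that health is decremented and that an interface is not queued for recovery when sensitivity is 0.

```python
from dataclasses import dataclass

MAX_HEALTH = 1000  # assumed maximum health value, for illustration only

# Statuses the table expects to be queued for resend (tests 1 and 10).
RESEND = {
    "LNET_MSG_STATUS_LOCAL_INTERRUPT",
    "LNET_MSG_STATUS_LOCAL_DROPPED",
    "LNET_MSG_STATUS_LOCAL_ABORTED",
    "LNET_MSG_STATUS_LOCAL_NO_ROUTE",
    "LNET_MSG_STATUS_LOCAL_TIMEOUT",
    "LNET_MSG_STATUS_REMOTE_DROPPED",
}

# Statuses the table expects to be finalized without a resend (tests 9 and 11).
NO_RESEND = {
    "LNET_MSG_STATUS_LOCAL_ERROR",
    "LNET_MSG_STATUS_REMOTE_ERROR",
    "LNET_MSG_STATUS_REMOTE_TIMEOUT",
    "LNET_MSG_STATUS_NETWORK_TIMEOUT",
}


@dataclass
class Interface:
    """Hypothetical stand-in for a local NI or a peer_ni."""
    name: str
    health: int = MAX_HEALTH
    on_recovery_queue: bool = False


def handle_failure(status, ni, sensitivity, retry_count, retries_so_far):
    """Return "resend" or "finalize" and update the failed interface."""
    if status not in RESEND | NO_RESEND:
        raise ValueError(f"unknown status: {status}")

    # Tests 2 and 3: the interface is queued for recovery only when
    # sensitivity > 0. Decrementing by `sensitivity` is an assumption.
    if sensitivity > 0:
        ni.health = max(0, ni.health - sensitivity)
        ni.on_recovery_queue = True

    # Tests 5 and 6: resend only while retries remain.
    if status in RESEND and retries_so_far < retry_count:
        return "resend"
    return "finalize"
```

For example, under these assumptions `handle_failure("LNET_MSG_STATUS_LOCAL_TIMEOUT", Interface("ni0"), sensitivity=100, retry_count=2, retries_so_far=0)` returns `"resend"` and leaves the interface on the recovery queue, which corresponds to test 1. Message expiry (test 6) and the recovery pings (tests 3 and 4) are not modeled.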