...

Test # | Tag | Procedure | Script | Result
1 | Immediate Failure
  • Send a PING
  • Simulate an immediate LND failure (e.g., NOMEM)
  • Message should not be resent

Script:

lnetctl discover <nid>

lctl net_drop_add with "-e local_error"

lnetctl discover <nid>

Result: pass
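
The script steps for this test can be collected into a small helper so a test driver runs them in a fixed order. This is a minimal sketch, not part of the original plan: the function name `immediate_failure_commands` and the `rule_match_args` parameter are hypothetical, and the only drop-rule option taken from the plan is `-e local_error`. Whatever arguments select the traffic to drop are left to the caller. The helper only builds the command lines; it does not execute them.

```python
def immediate_failure_commands(nid, rule_match_args=()):
    """Return the three command lines of the immediate-failure test, in order.

    nid             -- NID of the peer to discover (supplied by the tester)
    rule_match_args -- hypothetical placeholder for the options that select
                       which messages the drop rule applies to; the test plan
                       does not state them
    """
    discover = ["lnetctl", "discover", nid]
    # "-e local_error" is the only drop-rule option named in the test plan.
    add_rule = ["lctl", "net_drop_add", *rule_match_args, "-e", "local_error"]
    # Discover once before the rule is added and once after, as in the plan.
    return [list(discover), add_rule, list(discover)]


if __name__ == "__main__":
    # "10.0.0.2@tcp" is an example NID, not taken from the plan.
    for cmd in immediate_failure_commands("10.0.0.2@tcp"):
        print(" ".join(cmd))
```

Each returned list can be passed to `subprocess.run` on a node where LNet is configured.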

Asynchronous Errors

Test # | Tag | Procedure | Script | Result
1

LNET_MSG_STATUS_LOCAL_INTERRUPT

LNET_MSG_STATUS_LOCAL_DROPPED

LNET_MSG_STATUS_LOCAL_ABORTED

LNET_MSG_STATUS_LOCAL_NO_ROUTE

LNET_MSG_STATUS_LOCAL_TIMEOUT

  • MR Node with Multiple interfaces
  • Send a PING
  • Simulate an <error>
  • PING msg should be queued on resend queue
  • PING msg will be resent on a different interface
  • Failed interface's health value will be decremented
  • Failed interface will be placed on the recovery queue
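
The expected behaviour in the list above can be written down as a small model for checking test observations. This is a sketch under stated assumptions, not LNet's implementation. The names `on_local_failure` and `pick_resend_interface` are hypothetical. The maximum health of 1000 and the decrement of 100 are assumed values. Choosing the healthiest remaining interface is an assumption too, since the plan only says "a different interface".

```python
MAX_HEALTH = 1000  # assumed ceiling for an interface's health value


def pick_resend_interface(health, failed):
    """Return the healthiest interface other than the failed one, or None."""
    candidates = {name: h for name, h in health.items() if name != failed}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


def on_local_failure(health, failed, msg, resend_queue, recovery_queue,
                     sensitivity=100):
    """Apply the expected effects of one local send failure.

    health is a dict mapping interface name to health value; sensitivity
    is the assumed amount subtracted from the failed interface's health.
    """
    # Failed interface's health value is decremented (never below zero).
    health[failed] = max(0, health[failed] - sensitivity)
    # The message is queued for resend.
    resend_queue.append(msg)
    # The failed interface is placed on the recovery queue once.
    if failed not in recovery_queue:
        recovery_queue.append(failed)
    # The resend goes out on a different interface.
    return pick_resend_interface(health, failed)


if __name__ == "__main__":
    health = {"ni0": MAX_HEALTH, "ni1": MAX_HEALTH}
    resend, recovery = [], []
    print(on_local_failure(health, "ni0", "PING", resend, recovery))
    print(health, resend, recovery)
```

On a node with a single interface the helper returns `None`, which corresponds to having no alternative path for the resend.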


2 | Sensitivity == 0
  • Same setup as 1
  • NI is not placed on the recovery queue


3 | Sensitivity > 0
  • Same setup as 1
  • NI is placed on the recovery queue
  • Monitor network activity as NI is pinged until health is back to maximum


4

Sensitivity > 0

Buggy interface

  • Same setup as 1
  • NI is placed on recovery queue
  • NI is pinged every 1 second
  • Simulate ping failure every other ping
  • NI's health should be decremented on failure
  • NI should remain on the recovery queue
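
Tests 2 through 4 describe how the sensitivity setting and the recovery pings interact. The sketch below simulates that interaction so the expected outcome of each test can be worked out in advance. It is a simplified model, not LNet's code: the function name `simulate_recovery`, the maximum health of 1000, and the `recovery_step` added per successful ping are all assumptions. Each entry in `ping_results` stands for one recovery ping (one per second in the plan).

```python
def simulate_recovery(sensitivity, ping_results, recovery_step=10,
                      max_health=1000):
    """Simulate one initial failure followed by periodic recovery pings.

    Returns (final_health, still_on_recovery_queue, pings_sent).
    """
    # Effect of the initial failure: health drops by the sensitivity.
    health = max(0, max_health - sensitivity)
    # With sensitivity == 0 health never drops, so the NI is never queued.
    on_queue = health < max_health
    pings = 0
    for ok in ping_results:
        if not on_queue:
            break  # NI is fully healthy again, no more recovery pings
        pings += 1
        if ok:
            health = min(max_health, health + recovery_step)
        else:
            # A failed recovery ping decrements health like any failure.
            health = max(0, health - sensitivity)
        on_queue = health < max_health
    return health, on_queue, pings


if __name__ == "__main__":
    print(simulate_recovery(0, [True] * 5))          # test 2
    print(simulate_recovery(100, [True] * 20))       # test 3
    print(simulate_recovery(100, [True, False] * 3)) # test 4
```

With these assumed values, the buggy-interface case loses more health on each failed ping than it regains on each successful one, so the NI never leaves the recovery queue.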


5 | Retry count == 0
  • Same setup as 1
  • Message is not retried; it is finalized immediately


6 | Retry count > 0
  • Same setup as 1
  • Message is retransmitted at most retry count times, or until the message expires
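
The two retry-count cases can be expressed as one loop, which is useful for predicting how many transmissions a test should observe. This is a sketch of one reading of the plan, not LNet's code: it assumes one initial transmission plus up to retry count resends, and the names `send_with_retries`, `attempt_send`, `clock` and `deadline` are hypothetical.

```python
def send_with_retries(attempt_send, retry_count, deadline, clock):
    """Send a message, resending on failure within the retry and time limits.

    attempt_send -- callable returning True if the transmission succeeded
    retry_count  -- maximum number of resends after the first transmission
    deadline     -- time at which the message expires
    clock        -- callable returning the current time
    Returns (delivered, transmissions_made).
    """
    attempts = 0
    while True:
        attempts += 1
        if attempt_send():
            return True, attempts
        retries_used = attempts - 1
        # Finalize when the retries are exhausted or the message has expired.
        if retries_used >= retry_count or clock() >= deadline:
            return False, attempts


if __name__ == "__main__":
    always_fail = lambda: False
    frozen_clock = lambda: 0.0
    # Retry count == 0: a single transmission, then finalized.
    print(send_with_retries(always_fail, 0, 10.0, frozen_clock))
    # Retry count == 3: the first transmission plus three resends.
    print(send_with_retries(always_fail, 3, 10.0, frozen_clock))
```

With a retry count of 0 the loop exits after the first failed transmission, which matches test 5.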


7 | REPLY timeout
  • Same setup as 1, except use LNet selftest
  • Simulate a local timeout
  • Re-transmit
  • No REPLY received
  • Message is finalized and TIMEOUT event is propagated.


8 | ACK timeout
  • Same setup as 7 except simulate ACK timeout


9 | LNET_MSG_STATUS_LOCAL_ERROR
  • Same setup as 1
  • Message is finalized immediately (not resent)
  • Local NI is placed on the recovery queue
  • Same procedure to recover the local NI


10 | LNET_MSG_STATUS_REMOTE_DROPPED
  • Same setup as 1
  • Message is queued for resend depending on retry_count
  • peer_ni is placed on the recovery queue (unless sensitivity == 0)
  • peer_ni is pinged every 1 second


11

LNET_MSG_STATUS_REMOTE_ERROR

LNET_MSG_STATUS_REMOTE_TIMEOUT

LNET_MSG_STATUS_NETWORK_TIMEOUT

  • Same setup as 1
  • Message is not resent
  • peer_ni recovery happens as outlined in previous cases
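
Taken together, tests 1 and 9 through 11 amount to a lookup from message status to expected behaviour. The table below restates them in one place so a test harness could compare observed results against it. It is a summary sketch of this plan, not a copy of LNet's internal logic: the names `HEALTH_ACTIONS` and `expected_outcome` are hypothetical, and applying the "sensitivity == 0 means no recovery queue" rule to every status is an assumption drawn from tests 2 and 10.

```python
LOCAL_NI = "local NI"
PEER_NI = "peer_ni"

# status -> (message may be resent, which interface is recovered)
HEALTH_ACTIONS = {
    "LNET_MSG_STATUS_LOCAL_INTERRUPT": (True, LOCAL_NI),
    "LNET_MSG_STATUS_LOCAL_DROPPED": (True, LOCAL_NI),
    "LNET_MSG_STATUS_LOCAL_ABORTED": (True, LOCAL_NI),
    "LNET_MSG_STATUS_LOCAL_NO_ROUTE": (True, LOCAL_NI),
    "LNET_MSG_STATUS_LOCAL_TIMEOUT": (True, LOCAL_NI),
    "LNET_MSG_STATUS_LOCAL_ERROR": (False, LOCAL_NI),
    "LNET_MSG_STATUS_REMOTE_DROPPED": (True, PEER_NI),
    "LNET_MSG_STATUS_REMOTE_ERROR": (False, PEER_NI),
    "LNET_MSG_STATUS_REMOTE_TIMEOUT": (False, PEER_NI),
    "LNET_MSG_STATUS_NETWORK_TIMEOUT": (False, PEER_NI),
}


def expected_outcome(status, retry_count, sensitivity):
    """Return the expected resend and recovery behaviour for one failure."""
    resend_allowed, target = HEALTH_ACTIONS[status]
    return {
        # A resend needs both a resendable status and a non-zero retry count.
        "resend": resend_allowed and retry_count > 0,
        # Assumed: nothing is queued for recovery when sensitivity is 0.
        "recover": target if sensitivity > 0 else None,
    }


if __name__ == "__main__":
    for status in HEALTH_ACTIONS:
        print(status, expected_outcome(status, retry_count=3, sensitivity=100))
```

Keeping the expectations in a single dictionary makes it easy to spot a status that the plan does not yet cover.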


...