
Selection Algorithm Scenarios

Test # | Tag | Procedure | Script | Result
1 | SRC_SPEC_LOCAL_MR_DST
  • MR Node
  • MR Peer
  • Send a ping
  • REPLY for PING should always come on the same interface that PING was sent on.
  • Check the TRACE in the logs to verify
  • Repeat the test. A different local NI should be used for each new PING.
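The two pass criteria above can be checked mechanically once the TRACE lines have been reduced to NID pairs. The following is a minimal sketch, not part of LNet: `PingTrace`, `check_src_spec_pings`, and the record layout are hypothetical names introduced for illustration, and "a different local NI for each new PING" is assumed to mean "different from the NI used by the previous PING".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PingTrace:
    """One PING/REPLY exchange extracted from the logs (hypothetical layout)."""
    local_nid: str  # local NI the PING was sent on
    reply_nid: str  # local NI the REPLY arrived on


def check_src_spec_pings(traces):
    """Return a list of human-readable violations; empty means the run passed."""
    problems = []
    for i, trace in enumerate(traces):
        # Criterion 1: the REPLY must come back on the interface the PING used.
        if trace.reply_nid != trace.local_nid:
            problems.append(
                f"ping {i}: sent on {trace.local_nid}, REPLY on {trace.reply_nid}"
            )
        # Criterion 2 (assumed reading): consecutive PINGs use different local NIs.
        if i > 0 and trace.local_nid == traces[i - 1].local_nid:
            problems.append(f"ping {i}: local NI {trace.local_nid} was reused")
    return problems
```

An empty return value corresponds to a passing run; each string in a non-empty result names the PING that broke one of the criteria.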


2 | SRC_SPEC_LOCAL_MR_DST
  • MR Node
  • MR Peer
  • Initiate discovery
  • Node → PING → Peer
  • Node ← PUSH ← Peer
  • Node should respond with an ACK to the same interface as the one it received the PUSH on
  • Check the TRACE in the logs to verify
  • Repeat the test
    • Peer's local_ni when sending the PUSH should be different.


3 | SRC_SPEC_ROUTER_MR_DST
  • MR Node
  • NMR Router
  • MR Peer
  • Send a ping
  • REPLY for PING should always come on the same interface that PING was sent on.
  • Check the TRACE in the logs to verify
  • Router should be used
  • Repeat the test. A different local NI should be used for each new PING.


4 | SRC_SPEC_ROUTER_MR_DST
  • MR Node
  • NMR Router
  • MR Peer
  • Initiate discovery
  • Node → PING → Peer
  • Node ← PUSH ← Peer
  • Node should respond with an ACK to the same interface as the one it received the PUSH on
  • Check the TRACE in the logs to verify
  • Router should be used
  • Repeat the test. Peer's local_ni when sending the PUSH should be different.



5 | SRC_SPEC_ROUTER_MR_DST
  • MR Node
  • MR Router
  • MR Peer
  • Send a ping
  • REPLY for PING should always come on the same interface that PING was sent on.
  • Check the TRACE in the logs to verify
  • Repeat sending
  • Router interfaces should be used in round robin, while the peer destination should remain constant.
  • Repeat the test. A different local NI should be used for each new PING.


6 | SRC_SPEC_ROUTER_MR_DST
  • MR Node
  • MR Router
  • MR Peer
  • Initiate discovery
  • Node → PING → Peer
  • Node ← PUSH ← Peer
  • Node should respond with an ACK to the same interface as the one it received the PUSH on
  • Check the TRACE in the logs to verify
  • Router interfaces should be used in round robin, while the peer destination should remain constant.
  • Repeat the test. Peer's local_ni when sending the PUSH should be different.



7 | SRC_SPEC_LOCAL_NMR_DST
  • Same as 1 and 2
  • Except that repeating the test will not result in a different local_ni being used.


8 | SRC_SPEC_ROUTER_NMR_DST
  • Same as 3 - 6
  • Except that repeating the test will not result in different local NIs being used.


9 | SRC_ANY_LOCAL_MR_DST
  • MR Node
  • MR Peer
  • Send multiple PINGs
  • Each PING REPLY should arrive on the same interface its PING was sent on
  • Every PING will select a new local/remote NI pair


10 | SRC_ANY_ROUTER_MR_DST
  • MR Node
  • NMR Router
  • MR Peer
  • Send Multiple PINGs
  • Node will cycle over local_NIs
  • Node will use the same destination NID as final destination
  • Node will use the NMR Router


11 | SRC_ANY_ROUTER_MR_DST
  • MR Node
  • MR Router
  • MR Peer
  • Send Multiple PINGs
  • Node will cycle over local_NIs
  • Node will use the same destination NID as final destination
  • Node will use the different interfaces of the MR Router
  • MR Router will cycle over the interfaces of the Final destination.


12 | SRC_ANY_LOCAL_NMR_DST
  • MR Node
  • NMR Peer
  • Send multiple PINGs
  • Node will use same source/dst NID for all PINGs


13 | SRC_ANY_ROUTER_NMR_DST
  • MR Node
  • NMR Router
  • NMR Peer
  • Send multiple PINGs
  • Node will use the same source/dst NIDs for all PINGs
  • Node will use the router interface


14 | SRC_ANY_ROUTER_NMR_DST
  • MR Node
  • MR Router
  • NMR Peer
  • Send multiple PINGs
  • Node will use the same source/dst NIDs for all PINGs
  • Node will cycle through the Router's interfaces



Error Scenarios

Synchronous Errors

Test # | Tag | Procedure | Script | Result
1 | Immediate Failure
  • Send a PING
  • Simulate an immediate LND failure (e.g., NOMEM)
  • Message should not be resent


Asynchronous Errors

Test # | Tag | Procedure | Script | Result
1 | LNET_MSG_STATUS_LOCAL_INTERRUPT, LNET_MSG_STATUS_LOCAL_DROPPED, LNET_MSG_STATUS_LOCAL_ABORTED, LNET_MSG_STATUS_LOCAL_NO_ROUTE, LNET_MSG_STATUS_LOCAL_TIMEOUT

  • MR Node with multiple interfaces
  • Send a PING
  • Simulate an <error>
  • PING msg should be queued on resend queue
  • PING msg will be resent on a different interface
  • Failed interface's health value will be decremented
  • Failed interface will be placed on the recovery queue


2 | Sensitivity == 0
  • Same setup as 1
  • NI is not placed on the recovery queue


3 | Sensitivity > 0
  • Same setup as 1
  • NI is placed on the recovery queue
  • Monitor network activity as NI is pinged until health is back to maximum


4 | Sensitivity > 0, buggy interface

  • Same setup as 1
  • NI is placed on recovery queue
  • NI is pinged every 1 second
  • Simulate ping failure every other ping
  • NI's health should be decremented on failure
  • NI should remain on the recovery queue
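The expectations in tests 1 through 4 can be summarized as a small state model, which may help when deciding what to assert in a script. This is a sketch under stated assumptions, not LNet's implementation: the class name `NiHealthModel`, the default maximum of 1000, the amount subtracted on failure (taken here to be the sensitivity value), and the `recovery_step` added per successful recovery ping are all assumptions made for illustration.

```python
class NiHealthModel:
    """Toy model of NI health and the recovery queue (illustrative only)."""

    def __init__(self, nids, sensitivity, max_health=1000, recovery_step=1):
        self.sensitivity = sensitivity      # 0 disables health tracking
        self.max_health = max_health        # assumed maximum health value
        self.recovery_step = recovery_step  # assumed gain per successful ping
        self.health = {nid: max_health for nid in nids}
        self.recovery_queue = []

    def record_failure(self, nid):
        """A send (or recovery ping) on this NI failed."""
        if self.sensitivity == 0:
            # Test 2: with sensitivity 0 the NI is never queued for recovery.
            return
        self.health[nid] = max(0, self.health[nid] - self.sensitivity)
        if nid not in self.recovery_queue:
            self.recovery_queue.append(nid)

    def recovery_ping(self, nid, succeeded):
        """Result of one periodic recovery ping for a queued NI."""
        if nid not in self.recovery_queue:
            return
        if not succeeded:
            # Test 4: a failed recovery ping decrements health again.
            self.record_failure(nid)
            return
        self.health[nid] = min(self.max_health, self.health[nid] + self.recovery_step)
        if self.health[nid] == self.max_health:
            # Test 3: the NI leaves the queue once health is back to maximum.
            self.recovery_queue.remove(nid)
```

With these assumed parameters, an interface whose recovery ping fails every other time loses more health than it regains, so it stays on the recovery queue, which is the outcome test 4 expects.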


5 | Retry count == 0
  • Same setup as 1
  • Message will not be retried and the message will be finalized immediately


6 | Retry count > 0
  • Same setup as 1
  • Message will be retransmitted up to retry count times, or until the message expires, whichever comes first
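The retry behavior expected by tests 5 and 6 can be sketched as a bounded loop. This is a minimal illustration under assumptions, not LNet code: `send_with_retries`, its parameters, and the status strings are hypothetical, and the retry count is assumed to limit retransmissions after the initial send.

```python
def send_with_retries(send_once, retry_count, expires_at, clock):
    """Try to deliver a message, retransmitting on failure.

    send_once   -- callable returning True on success (hypothetical hook)
    retry_count -- maximum number of retransmissions after the first send
    expires_at  -- time at which the message expires
    clock       -- callable returning the current time (injected for testing)

    Returns (status, attempts) where status is "delivered",
    "retries_exhausted", or "expired".
    """
    attempts = 0
    while True:
        attempts += 1
        if send_once():
            return "delivered", attempts
        retries_used = attempts - 1
        if retries_used >= retry_count:
            # Retry count == 0 lands here after the very first failure,
            # so the message is finalized immediately.
            return "retries_exhausted", attempts
        if clock() >= expires_at:
            # The message expired before the retry budget was used up.
            return "expired", attempts
```

Injecting `clock` keeps the sketch deterministic, so a script can assert on the attempt count without real delays.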


7 | REPLY timeout
  • Same setup as 1
  • Except use LNet selftest
  • Simulate a local timeout
  • Re-transmit
  • No REPLY received
  • Message is finalized and TIMEOUT event is propagated.


8 | ACK timeout
  • Same setup as 7 except simulate ACK timeout


9 | LNET_MSG_STATUS_LOCAL_ERROR
  • Same setup as 1
  • Message is finalized immediately (not resent)
  • Local NI is placed on the recovery queue
  • Same procedure to recover the local NI


10 | LNET_MSG_STATUS_REMOTE_DROPPED
  • Same setup as 1
  • Message is queued for resend depending on retry_count
  • peer_ni is placed on the recovery queue (not if sensitivity == 0)
  • peer_ni is pinged every 1 second


11 | LNET_MSG_STATUS_REMOTE_ERROR, LNET_MSG_STATUS_REMOTE_TIMEOUT, LNET_MSG_STATUS_NETWORK_TIMEOUT

  • Same setup as 1
  • Message is not resent
  • peer_ni recovery happens as outlined in previous cases


Random Failures

User Interface

Test # | Tag | Procedure | Script | Result

lnet_transaction_timeout
  • Set lnet_transaction_timeout to a value < retry_count via lnetctl and YAML
    • This should lead to a failure to set
  • Set lnet_transaction_timeout to a value > retry_count via lnetctl and YAML
    • lnet_lnd_timeout value should == lnet_transaction_timeout / retry_count
  • Show value via "lnetctl global show"



lnet_retry_count
  • Set the lnet_retry_count to a value > lnet_transaction_timeout via lnetctl and YAML
    • This should lead to a failure to set
  • Set the lnet_retry_count to a value < lnet_transaction_timeout via lnetctl and YAML
    • lnet_lnd_timeout value should == lnet_transaction_timeout / retry_count
  • Show value via "lnetctl global show"
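The relationship the two rows above expect between the three values can be written down as a small helper for computing the expected lnet_lnd_timeout in a script. This is a sketch of the rule as stated in this plan, not LNet's code: the function name `expected_lnd_timeout` is hypothetical, integer division is an assumption, and the handling of a retry count of 0 (which the plan does not specify here) is also an assumption.

```python
def expected_lnd_timeout(transaction_timeout, retry_count):
    """Return the lnet_lnd_timeout value this plan expects.

    Raises ValueError for the combination the plan says must be rejected
    (transaction timeout smaller than retry count).
    """
    if transaction_timeout < retry_count:
        raise ValueError(
            "transaction timeout must not be smaller than the retry count"
        )
    if retry_count == 0:
        # Assumption: with no retries, the whole transaction timeout applies.
        return transaction_timeout
    # Assumption: the division is an integer division.
    return transaction_timeout // retry_count
```

For example, a transaction timeout of 50 with a retry count of 5 gives an expected lnet_lnd_timeout of 10, which can then be compared against the output of "lnetctl global show".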



lnet_health_sensitivity
  • Set lnet_health_sensitivity from lnetctl and from YAML
  • Show value via "lnetctl global show"



NI statistics
  • Verify LNet health statistics for local NIs
    • lnetctl net show -v 3



Peer NI statistics
  • Verify LNet health statistics for peer NIs
    • lnetctl peer show -v 3



NI Health value
  • Verify setting the local NI health value
    • lnetctl net set --nid <nid> --health <value>
  • Redo from YAML



Peer NI Health value
  • Verify setting the peer NI health value
    • lnetctl peer set --nid <nid> --health <value>
  • Redo from YAML

