Test # | Tag | Procedure | Script | Result
1 | self test
  • MR Node
  • NMR Peer
  • Self-test
  • Randomize local NI failure
  • Randomize Remote NI failure


2 | self test
  • MR Node
  • MR Peer
  • Self-test
  • Randomize local NI failure
  • Randomize Remote NI failure


3 | self test
  • MR Node 
  • MR Router
  • NMR Peer
  • Self-test
  • Randomize local NI failure
  • Randomize Remote NI failure


4 | self test
  • MR Node 
  • MR Router
  • MR Peer
  • Self-test
  • Randomize local NI failure
  • Randomize Remote NI failure


5 | self test
  • MR Node 
  • NMR Router
  • NMR Peer
  • Self-test
  • Randomize local NI failure
  • Randomize Remote NI failure


6 | self test
  • MR Node 
  • NMR Router
  • MR Peer
  • Self-test
  • Randomize local NI failure
  • Randomize Remote NI failure
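
The six self-test cases above follow a regular pattern: an MR node, an optional router (none, MR, NMR), and a peer (NMR, MR). A minimal sketch that enumerates them in the same order, assuming MR means multi-rail and NMR means non-multi-rail; the function and field names are illustrative and not part of any existing test framework:

```python
from itertools import product

# None means there is no router between the node and the peer.
ROUTERS = [None, "MR", "NMR"]
PEERS = ["NMR", "MR"]


def selftest_matrix():
    """Return one dict per self-test case, in the order listed in the plan."""
    cases = []
    for number, (router, peer) in enumerate(product(ROUTERS, PEERS), start=1):
        cases.append({"test": number, "node": "MR", "router": router, "peer": peer})
    return cases


if __name__ == "__main__":
    for case in selftest_matrix():
        print(case)
```

Each case would then run the same three steps: self-test, randomized local NI failure, randomized remote NI failure.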


MR Router Testing

Test # | Tag | Procedure | Script | Result

Discovery triggered on route add
  • Bring up Router A with two interfaces
    • tcp0
    • tcp1
  • Bring up Peer A and add network on tcp0
  • Add a route to tcp1 via Router A on Peer A
  • Observe that a discovery occurs from Peer A → Router A



Discovery triggered on interval
  • Bring up Router A with two interfaces
    • tcp0
    • tcp1
  • Bring up Peer A and add network on tcp0
  • Add a route to tcp1 via Router A on Peer A
  • Observe that a discovery occurs from Peer A → Router A
  • Keep the two nodes up for 4 minutes
  • Every router_interval_timeout a discovery should occur from Peer A → Router A
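
The interval check above reduces to simple arithmetic. A minimal sketch, assuming the interval is known in seconds; the 60-second value in the usage line is a placeholder and not a value taken from this plan:

```python
def expected_interval_discoveries(duration_s, interval_s):
    """Number of interval-driven discoveries expected within duration_s.

    The initial discovery triggered by adding the route is not counted.
    """
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    return duration_s // interval_s


# 4 minutes of uptime with a hypothetical 60-second interval
print(expected_interval_discoveries(4 * 60, 60))
```

The observed count may be off by one depending on where in the interval the 4-minute window starts.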



Router tcp1 down due to no traffic
  • Bring up Router A with two interfaces
    • tcp0
    • tcp1
  • Bring up Peer A and add network on tcp0
  • Add a route to tcp1 via Router A on Peer A
  • Observe that a discovery occurs from Peer A → Router A
  • Keep the two nodes up for 4 minutes
  • Every router_interval_timeout a discovery should occur from Peer A → Router A
  • Since there is no traffic on tcp1, Router A's tcp1 should be down
    • verify via: lnetctl net show -v



Router tcp1 comes up when peerB is brought up
  • Bring up Router A with two interfaces
    • tcp0
    • tcp1
  • Bring up Peer A and add network on tcp0
  • Add a route to tcp1 via Router A on Peer A
  • Observe that a discovery occurs from Peer A → Router A
  • Keep the two nodes up for 4 minutes
  • Every router_interval_timeout a discovery should occur from Peer A → Router A
  • Since there is no traffic on tcp1, Router A's tcp1 should be down
    • verify via: lnetctl net show -v
  • Bring up Peer B and add network on tcp1
  • Add a route to tcp0 via Router A on Peer B
  • Observe that a discovery occurs from Peer B → Router A
  • Observe that Router A's tcp1 is now up



Add route without router there
  • Bring up Peer A and add network on tcp0
  • Add route to tcp1 on peerA
  • Observe that a discovery occurs but no response since router is not up
  • lnetctl route show -v # shows that router is down
  • lnetctl peer show -v # shows the peer is down
  • Bring up Router A with two interfaces: tcp0, tcp1
  • After router_interval_timeout a discovery should verify that Router A is up
  • lnetctl route show -v # shows that the route is still down, because Router A's tcp1 network should be down
  • lnetctl peer show -v # shows the peer is up
  • Bring up Peer B and add network on tcp1
  • lnetctl route show -v # shows that router is up
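
The expected states in the steps above can be written down as a small model. This is only a sketch of the expectations stated in this test, not of LNet internals; the function name and the "up"/"down" strings are illustrative:

```python
def expected_states(router_up, far_peer_up):
    """Expected states reported by 'lnetctl peer show -v' and 'lnetctl route show -v'.

    router_up:   Router A is running with tcp0 and tcp1 configured.
    far_peer_up: Peer B is up on tcp1, so Router A's tcp1 side is alive.
    """
    peer = "up" if router_up else "down"
    # The plan expects the route to stay down until Router A's tcp1 is up.
    route = "up" if (router_up and far_peer_up) else "down"
    return {"peer": peer, "route": route}


STEPS = [
    ("route added, router absent", expected_states(False, False)),
    ("Router A brought up", expected_states(True, False)),
    ("Peer B brought up on tcp1", expected_states(True, True)),
]

if __name__ == "__main__":
    for label, state in STEPS:
        print(label, state)
```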



Traffic should trigger an attempt at router discovery
  • Bring up Peer A and add network on tcp0
  • Add route to tcp1 on peerA
  • Observe that a discovery occurs but no response since router is not up
  • lnetctl route show -v # shows that router is down
  • lnetctl peer show -v # shows the router is down
  • Bring up Router A with two interfaces: tcp0, tcp1
  • Bring up PeerB and add network on tcp1
  • Before the router_interval_timeout expires, run:
    • lnetctl discover PeerB@tcp1
    • This should trigger a discovery of router A
    • lnetctl peer show -v # shows the peer is up and multi-rail
    • lnetctl route show -v # shows the route up



Ping should not trigger discovery of router
  • Bring up Peer A and add network on tcp0
  • Add a route to tcp1 via Router A on Peer A
  • Observe that a discovery occurs but no response since router is not up
  • lnetctl route show -v # shows that router is down
  • lnetctl peer show -v # shows the router is down
  • Bring up Router A with two interfaces: tcp0, tcp1
  • Bring up PeerB and add network on tcp1
  • Before the router_interval_timeout expires, run:
    • lnetctl ping PeerB@tcp1
    • This should NOT trigger a discovery of router A
    • ping should fail
    • lnetctl peer show -v # shows the peer is down
    • lnetctl route show -v # shows the route down



Multi-interface router even traffic distribution
  • Bring up Router A with 4 interfaces: 2 on tcp0 and 2 on tcp1
  • Bring up Peer A with interface on tcp0
  • Bring up Peer B with interface on tcp1
  • Run traffic using selftest
  • Observe that traffic is distributed on all router interfaces evenly
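
"Evenly" needs a concrete pass criterion. A minimal sketch of one possible check, assuming per-interface message counts have already been collected (for example from the statistics shown by lnetctl net show -v); the 10% tolerance and the interface labels are assumptions, not values from this plan:

```python
def is_evenly_distributed(counts, tolerance=0.10):
    """True if every interface's count is within `tolerance` of the mean."""
    if not counts:
        raise ValueError("no interfaces given")
    mean = sum(counts.values()) / len(counts)
    if mean == 0:
        return False  # no traffic at all does not count as an even spread
    return all(abs(c - mean) <= tolerance * mean for c in counts.values())


# Hypothetical counters for the four router interfaces
sample = {"tcp0-if0": 1000, "tcp0-if1": 1020, "tcp1-if0": 980, "tcp1-if1": 1005}
print(is_evenly_distributed(sample))
```

A relative tolerance is used so the same check works for short and long selftest runs.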



Multi-interface router with one bad interface
  • Bring up Router A with 4 interfaces: 2 on tcp0 and 2 on tcp1
  • Bring up Peer A with interface on tcp0
  • Bring up Peer B with interface on tcp1
  • Run traffic using selftest
  • Observe that traffic is distributed on all router interfaces evenly
  • Enable health (set health sensitivity and retry count)
  • Add a PUT drop rule on the router to drop traffic on one of the interfaces in tcp0
  • Observe that traffic goes to the other interfaces. There shouldn't be any drop in traffic.
  • As long as the interface has less than optimal health, it should never be used for routing.



Multi-interface router with a bad interface that recovers
  • Bring up Router A with 4 interfaces: 2 on tcp0 and 2 on tcp1
  • Bring up Peer A with interface on tcp0
  • Bring up Peer B with interface on tcp1
  • Run traffic using selftest
  • Observe that traffic is distributed on all router interfaces evenly
  • Enable health (set health sensitivity and retry count)
  • Add a PUT drop rule on the router to drop traffic on one of the interfaces in tcp0
  • Observe that traffic goes to the other interfaces. There shouldn't be any drop in traffic.
  • As long as the interface has less than optimal health, it should never be used for routing.
  • Remove the PUT drop rule from the router
  • Eventually that interface should be healthy again
  • Traffic should resume using that interface
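
The rule that an interface with less than optimal health is never used for routing can be sketched as a filter over health values. The value 1000 for optimal health and the NIDs below are assumptions made for illustration; adjust them to the build and setup under test:

```python
MAX_HEALTH = 1000  # assumed "optimal" health value


def usable_interfaces(health, max_health=MAX_HEALTH):
    """Return the interfaces expected to carry routed traffic, sorted by name."""
    return sorted(nid for nid, value in health.items() if value >= max_health)


# Hypothetical NIDs for Router A's two tcp0 interfaces
before = {"10.0.0.1@tcp0": 1000, "10.0.0.2@tcp0": 1000}
during = {"10.0.0.1@tcp0": 1000, "10.0.0.2@tcp0": 900}  # drop rule active
after = {"10.0.0.1@tcp0": 1000, "10.0.0.2@tcp0": 1000}  # rule removed, recovered

for label, health in (("before", before), ("during", during), ("after", after)):
    print(label, usable_interfaces(health))
```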


User Interface

Test # | Tag | Procedure | Script | Result
1 | lnet_transaction_timeout
  • Set lnet_transaction_timeout to a value < retry_count via lnetctl and YAML
    • This should lead to a failure to set
  • Set lnet_transaction_timeout to a value > retry_count via lnetctl and YAML
    • lnet_lnd_timeout should equal lnet_transaction_timeout / retry_count
  • Show value via "lnetctl global show"


2 | lnet_retry_count
  • Set the lnet_retry_count to a value > lnet_transaction_timeout via lnetctl and YAML
    • This should lead to a failure to set
  • Set the lnet_retry_count to a value < lnet_transaction_timeout via lnetctl and YAML
    • lnet_lnd_timeout should equal lnet_transaction_timeout / retry_count
  • Show value via "lnetctl global show"
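
The relationship checked in tests 1 and 2 can be worked through with a short helper. This is a sketch of the rule as stated in this plan only; integer division and the rejection of a retry count below 1 are assumptions, and the case where both values are equal is not covered by the plan:

```python
def lnd_timeout(transaction_timeout, retry_count):
    """Expected lnet_lnd_timeout per this plan: transaction_timeout / retry_count."""
    if retry_count < 1:
        raise ValueError("formula is undefined for retry_count < 1")
    if transaction_timeout < retry_count:
        # Tests 1 and 2 both expect the set operation to fail in this case.
        raise ValueError("transaction_timeout must not be less than retry_count")
    return transaction_timeout // retry_count


# Example: a transaction timeout of 50 with 5 retries
print(lnd_timeout(50, 5))
```

The computed value can be compared against what "lnetctl global show" reports after each successful set.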


3 | lnet_health_sensitivity
  • Set the lnet_health_sensitivity from lnetctl and from YAML
  • Show value via "lnetctl global show"


4 | NI statistics
  • verify LNet health statistics
    • lnetctl net show -v 3


5 | Peer NI statistics
  • verify LNet health statistics for peer NIs
    • lnetctl peer show -v 3


6 | NI Health value
  • verify setting the local NI health value
    • lnetctl net set --nid <nid> --health <value>
  • Redo from YAML


7 | Peer NI Health value
  • verify setting the peer NI health value
    • lnetctl peer set --nid <nid> --health <value>
  • Redo from YAML

