2.12 Discovery of Non-MR Peer May Yield Unreachable NID

If a non-MR (non-Multi-Rail) peer running 2.10.8 is discovered by a 2.12 MR peer, the following problem may occur: if the non-MR peer has LNets that are not defined on the MR peer, a NID on one of those undefined LNets may be listed as the peer's primary NID. This later causes communication problems when mounting.

Steps To Reproduce

Configuration

Prepare two nodes, one running a 2.10.8 build and the other a 2.12.4 build. Configure LNet on them similarly to the following:

PeerA (2.10.8 non-MR)

lnetctl net show
net:
    - net type: lo
      local NI(s):
        - nid: 0@lo
          status: up
    - net type: o2ib
      local NI(s):
        - nid: 192.168.1.123@o2ib
          status: up
          interfaces:
              0: ib0
    - net type: o2ib4
      local NI(s):
        - nid: 192.168.1.123@o2ib4
          status: up
          interfaces:
              0: ib0

PeerB (2.12.4 MR)

lnetctl net show
net:
    - net type: lo
      local NI(s):
        - nid: 0@lo
          status: up
    - net type: o2ib4
      local NI(s):
        - nid: 192.168.1.105@o2ib4
          status: up
          interfaces:
              0: ib0
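The mismatch that triggers the problem is that PeerA defines an LNet that PeerB does not. As a minimal sketch (the helper name `lnets` and the abbreviated sample strings are illustrative assumptions, not part of any Lustre tooling), the difference can be computed from saved `lnetctl net show` output:

```python
import re

def lnets(net_show_output):
    """Collect LNet names from the 'net type:' lines of `lnetctl net show` output."""
    return set(re.findall(r"net type:\s*(\S+)", net_show_output))

# Abbreviated copies of the outputs shown above (only the 'net type' lines matter here).
PEER_A = """\
net:
    - net type: lo
    - net type: o2ib
    - net type: o2ib4
"""
PEER_B = """\
net:
    - net type: lo
    - net type: o2ib4
"""

# LNets defined on PeerA but not on PeerB
only_on_a = lnets(PEER_A) - lnets(PEER_B)
print(sorted(only_on_a))  # ['o2ib']
```

Any LNet reported here is a candidate for the unreachable primary NID described in this page.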


Procedure

Run discovery of PeerA from PeerB and check the results:

Problem Behaviour

lnetctl discover 192.168.1.123@o2ib4
discover:
    - primary nid: 192.168.1.123@o2ib
      Multi-Rail: False
      peer ni:
        - nid: 192.168.1.123@o2ib4
        - nid: 192.168.1.123@o2ib

lnetctl peer show
peer:
    - primary nid: 192.168.1.123@o2ib
      Multi-Rail: False
      peer ni:
        - nid: 192.168.1.123@o2ib4
          state: NA
        - nid: 192.168.1.123@o2ib
          state: NA

Expected Behaviour

lnetctl discover 192.168.1.123@o2ib4
discover:
    - primary nid: 192.168.1.123@o2ib4
      Multi-Rail: False
      peer ni:
        - nid: 192.168.1.123@o2ib4

lnetctl peer show
peer:
    - primary nid: 192.168.1.123@o2ib4
      Multi-Rail: False
      peer ni:
        - nid: 192.168.1.123@o2ib4
          state: NA

Note that in the "problem" scenario, PeerA's primary NID is on the o2ib net, which is not accessible from PeerB because PeerB has no interface configured on o2ib.
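A quick way to spot this condition on the MR node is to compare each peer's primary NID against the locally configured LNets. The following is a rough sketch only: the function names are hypothetical, the sample strings are abbreviated from the outputs above, and it assumes no LNet routes are configured (with routing, a NID on a remote net may still be reachable).

```python
import re

def net_of(nid):
    """Return the LNet part of a NID, e.g. 'o2ib4' for '192.168.1.123@o2ib4'."""
    return nid.rsplit("@", 1)[1]

def unreachable_primaries(peer_show_output, local_nets):
    """List primary NIDs whose LNet is not among the locally configured nets.

    Assumption: no LNet routing, so only locally defined nets are reachable.
    """
    primaries = re.findall(r"primary nid:\s*(\S+)", peer_show_output)
    return [nid for nid in primaries if net_of(nid) not in local_nets]

# Abbreviated `lnetctl peer show` output from PeerB in both scenarios.
PROBLEM = """\
peer:
    - primary nid: 192.168.1.123@o2ib
      Multi-Rail: False
      peer ni:
        - nid: 192.168.1.123@o2ib4
        - nid: 192.168.1.123@o2ib
"""
EXPECTED = """\
peer:
    - primary nid: 192.168.1.123@o2ib4
      Multi-Rail: False
      peer ni:
        - nid: 192.168.1.123@o2ib4
"""

LOCAL_NETS = {"lo", "o2ib4"}  # LNets defined on PeerB

print(unreachable_primaries(PROBLEM, LOCAL_NETS))   # ['192.168.1.123@o2ib']
print(unreachable_primaries(EXPECTED, LOCAL_NETS))  # []
```

A non-empty result corresponds to the problem behaviour; an empty list corresponds to the expected behaviour.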

References

DDN-1228, LU-13548
