THIS PAGE IS ARCHIVED REFERENCE MATERIAL - PLEASE DO NOT EDIT DIRECTLY. THE LIVING VERSION OF THIS MATERIAL IS MAINTAINED IN THE LUSTRE OPERATIONS MANUAL, AVAILABLE AT http://lustre.org/documentation/.
Update
This page shows how to combine Multi-Rail (MR) with routing in releases that predate the Multi-Rail Routing feature, which landed in 2.13. The routing code has always monitored the state of each route in order to avoid using unavailable ones. This page describes how to configure multiple interfaces on the same gateway node as separate routes, so that the existing route monitoring algorithm guards against individual interfaces going down.
With the Multi-Rail Routing feature, which landed in 2.13, the new algorithm uses the health feature to monitor the different interfaces of the gateway and always uses the healthiest interface. The configuration trick described on this wiki page is therefore no longer needed, although it can still work.
Overview
The MR algorithm allows the usage of multiple interfaces configured on the same LNet network. It also allows simultaneous usage of interfaces configured on different LNet networks.
MR configuration can be applied on the router to aggregate the performance of its interfaces.
MR Cluster Example
The example below outlines a simple system where all the Lustre nodes are MR capable. Each client and server has two interfaces, and each router has two interfaces on each of the two networks.
The routers can aggregate the interfaces on each side of the network by configuring them on the appropriate network.
An example configuration:
Routers
    lnetctl net add --net o2ib0 --if ib0,ib1
    lnetctl net add --net o2ib1 --if ib2,ib3
    lnetctl peer add --nid <peer1-nidA>@o2ib,<peer1-nidB>@o2ib,...
    lnetctl peer add --nid <peer2-nidA>@o2ib1,<peer2-nidB>@o2ib1,...
    lnetctl set routing 1

Clients
    lnetctl net add --net o2ib0 --if ib0,ib1
    lnetctl route add --net o2ib1 --gateway <rtrX-nidA>@o2ib
    lnetctl peer add --nid <rtrX-nidA>@o2ib,<rtrX-nidB>@o2ib

Servers
    lnetctl net add --net o2ib1 --if ib0,ib1
    lnetctl route add --net o2ib0 --gateway <rtrX-nidA>@o2ib1
    lnetctl peer add --nid <rtrX-nidA>@o2ib1,<rtrX-nidB>@o2ib1
In the above configuration, the clients and the servers are configured with only one route entry per router. This works because the routers are MR capable. Because the routers are added to the clients and the servers as peers with multiple interfaces, the MR algorithm ensures that both interfaces of a router are used when sending to it.
However, as of the 2.10 release, LNet Resiliency is still under development, and a single interface failure will still cause the entire router to be treated as down.
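The client-side (or server-side) half of the MR configuration above can be sketched as a small generator. This is only an illustration: the function name `mr_router_cmds` and the example NIDs are hypothetical, and the helper prints the commands rather than running them, so it can be tried on a machine without LNet. It assumes, as in the configuration above, that the route uses the first NID of the router as its gateway.

```shell
#!/bin/sh
# Hypothetical helper: print the commands a client or server needs for one
# MR-capable router. It only prints; nothing is executed.
#   $1    = remote LNet network reached through the router (e.g. o2ib1)
#   $2... = all NIDs of that router on the local network (placeholders here)
mr_router_cmds() {
    remote_net=$1
    shift
    first_nid=$1
    # Join all router NIDs with commas for the single "peer add" call.
    nid_list=$(printf '%s,' "$@")
    nid_list=${nid_list%,}
    # One route entry per router, through its first NID.
    echo "lnetctl route add --net $remote_net --gateway $first_nid"
    # Declare every interface of the router as part of one MR peer.
    echo "lnetctl peer add --nid $nid_list"
}

# Example NIDs are made up for illustration.
mr_router_cmds o2ib1 10.0.0.1@o2ib 10.0.0.2@o2ib
```

The single `peer add` line is what tells the client that both NIDs belong to the same router, which is why one route entry is enough in this variant.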
Utilizing Router Resiliency
Currently, LNet provides a mechanism to monitor each route entry. LNet pings each gateway identified in a route entry at a regular, configurable interval to ensure that it is alive. If sending over a specific route fails, or if the router pinger determines that the gateway is down, then the route is marked as down and is not used. It is subsequently pinged at regular, configurable intervals to determine when it becomes alive again.
This mechanism can be combined with the MR feature in 2.10 to add router resiliency to the configuration:
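The configurable intervals mentioned above are, to the best of my understanding of the 2.10-era releases, exposed as `lnet` kernel module parameters named `live_router_check_interval`, `dead_router_check_interval` and `router_ping_timeout`; check the Lustre Operations Manual for the release in use, since later releases changed the routing code. The sketch below uses a hypothetical helper, `pinger_options`, that only prints a modprobe `options` line. The values shown are illustrative, not tuning advice.

```shell
#!/bin/sh
# Hypothetical helper: print a modprobe "options" line for the router pinger.
#   $1 = seconds between pings of a gateway that is currently alive
#   $2 = seconds between pings of a gateway that is currently marked down
#   $3 = seconds to wait for a ping reply before treating the ping as failed
pinger_options() {
    echo "options lnet live_router_check_interval=$1 dead_router_check_interval=$2 router_ping_timeout=$3"
}

# Illustrative values only.
pinger_options 60 60 50
```

The printed line would typically go into a file under /etc/modprobe.d/ on the clients and servers (the exact file name is site specific) and take effect the next time the `lnet` module is loaded.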
Routers
    lnetctl net add --net o2ib0 --if ib0,ib1
    lnetctl net add --net o2ib1 --if ib2,ib3
    lnetctl peer add --nid <peer1-nidA>@o2ib,<peer1-nidB>@o2ib,...
    lnetctl peer add --nid <peer2-nidA>@o2ib1,<peer2-nidB>@o2ib1,...
    lnetctl set routing 1

Clients
    lnetctl net add --net o2ib0 --if ib0,ib1
    lnetctl route add --net o2ib1 --gateway <rtrX-nidA>@o2ib
    lnetctl route add --net o2ib1 --gateway <rtrX-nidB>@o2ib

Servers
    lnetctl net add --net o2ib1 --if ib0,ib1
    lnetctl route add --net o2ib0 --gateway <rtrX-nidA>@o2ib1
    lnetctl route add --net o2ib0 --gateway <rtrX-nidB>@o2ib1
There are a few things to note in the above configuration:
- The clients and the servers are now configured with two routes per router. The gateway of each route is one of the interfaces of the router.
- The clients and servers will view each interface of the same router as a separate gateway and will monitor them as described above.
- The clients and the servers are not configured to view the routers as MR capable. This is important because we want to treat each interface as a separate peer, not as one of several interfaces of the same peer.
- The routers are configured to view the peers as MR capable. This is an oddity in the configuration, but it is currently required in order to allow the routers to balance the traffic load evenly across their interfaces.
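The per-interface routes described in the list above can be sketched with a small generator. The function name `per_interface_routes` and the example NIDs are hypothetical, and the helper only prints the commands. Note that, unlike the MR variant, it deliberately emits no `peer add` line, so each router interface is seen as its own gateway.

```shell
#!/bin/sh
# Hypothetical helper: print one "route add" per router interface so that each
# interface is monitored as a separate gateway. It only prints; nothing runs.
#   $1    = remote LNet network reached through the router (e.g. o2ib1)
#   $2... = NIDs of the router interfaces on the local network (placeholders)
per_interface_routes() {
    remote_net=$1
    shift
    for nid in "$@"; do
        echo "lnetctl route add --net $remote_net --gateway $nid"
    done
}

# Example NIDs are made up for illustration.
per_interface_routes o2ib1 10.0.0.1@o2ib 10.0.0.2@o2ib
```

After applying such routes on a real client or server, `lnetctl route show --verbose` should list each gateway NID as its own route entry, which is one way to check that the interfaces are being tracked separately.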
Mixed MR/Non-MR cluster
The above principles can be applied to a mixed MR/Non-MR cluster. For example, the same configuration shown above can be applied if the clients and the servers are non-MR while the routers are MR capable. This appears to be a common cluster upgrade scenario. Both NASA and ANU have presented the idea of upgrading the routers that connect different areas of the cluster to MR in order to increase router bandwidth.
1 Comment
Joseph Gmitter
This material has been merged into the manual. Any further updates to this page need to be reflected in the manual.