More Details | |
---|---|
LU-11292 lnet: Discover routers on first use Discover routers on first use. This brings the behavior when interacting with routers in line with the behavior when dealing with normal peers. Test-Parameters: forbuildonly Signed-off-by: Amir Shehata <ashehata@whamcloud.com> Change-Id: I8527e41daf2f5f6ab5f04aac1285aaa6cc4ee594 |
|
https://review.whamcloud.com/#/c/33183/ | |
LU-11298 lnet: use peer for gateway The routing code uses peer_ni for a gateway. However, with Multi-Rail a gateway could have multiple interfaces on several different networks. Instead of using a single peer_ni as the gateway we should be using the peer and let the MR selection code select the best peer_ni to send to. This patch moves the gateway from peer_ni to peer. Much of the code needs to be rewritten in the following patches to account for that change. This patch disables the routing features by disabling the code to add/delete routes. Test-Parameters: forbuildonly Signed-off-by: Amir Shehata <ashehata@whamcloud.com> Change-Id: Ia7dab552268c4a7fbd7b88122b9a95363d155fd7 | The routing code will change quite a bit, so this patch removes most of the current routing code and later patches reintroduce it. This patch concentrates on switching the gateway from using peer_ni to using peer. The design decision here is that a gateway is a node where LNet is started with the routing feature enabled. A gateway node can have multiple interfaces. To align routing with Multi-Rail, the code should first select a gateway peer, then use Multi-Rail to select the best peer_ni on that gateway. 
The following functions are removed in this patch and will be introduced in later patches:
- lnet_is_route_alive()
- lnet_rtr_addref_locked()
- lnet_rtr_decref_locked()
- lnet_shuffle_seed()
- lnet_add_route_to_rnet()
- lnet_add_route() # the bulk of the code is removed
- lnet_check_routes() # the bulk of the code is removed
- lnet_del_route() # the bulk of the code is removed
- lnet_parse_rc_info() # the bulk of the code is removed
- lnet_destroy_rc_data()
- lnet_update_rc_data_locked()
- lnet_router_check_interval()
- lnet_ping_router_locked()
- lnet_prune_rc_data()
- lnet_compare_peers()

Key fields are moved from
- lpni_rtrq # moved
- lpni_rtr_list # moved
- lpni_ping_notsent # deleted
- lpni_ping_timestamp # deleted
- lpni_ping_deadline # deleted
- lpni_rtr_refcount # moved
- lpni_healthy # this is a remnant code which is cleaned up
- lpni_routes # moved

The lnet_route structure is changed in the following way:
- struct lnet_peer *lr_gateway # this is now lnet_peer instead of lnet_peer_ni
- __u32 lr_lnet # it is no longer possible to determine the local network of the route by simply looking at the gateway peer, since the peer can have multiple interfaces on different networks. Therefore the route now must define the local network and remote network. This way we are able to select and compare routes properly.

The rest of the changes concentrate on removing the use of
In lib-move.c there are changes in both
Routing is disabled with this patch. |
https://review.whamcloud.com/#/c/33184/ | |
LU-11299 lnet: lnet_add/del_route() Reimplemented lnet_add_route() and lnet_del_route() to use the peer instead of the peer_ni. Test-Parameters: forbuildonly Signed-off-by: Amir Shehata <ashehata@whamcloud.com> Change-Id: I3734098a81ab18d1d74220c691d96a9b9817e6da | NOTES: lnet_check_routes() is removed in this patch. We should move it into its own patch against ticket LU-10153, since the previous patch removes a bunch of functions. The reason for removing lnet_check_routes() is that we no longer restrict multiple routes on the same remote network. This patch re-implements the following functions, which now use the peer instead of the peer_ni: lnet_rtr_addref_locked(), lnet_rtr_decref_locked(), lnet_shuffle_seed(), lnet_add_route_to_rnet(), lnet_add_route(), lnet_del_route_from_rnet(), lnet_del_route() |
https://review.whamcloud.com/#/c/33185/ | |
LU-11300 lnet: router aliveness A route is considered alive if the gateway is able to route messages from the local to the remote net. That means that at least one of the network interfaces on the remote net of the gateway is viable. Introduced the concept of sensitivity percentage. This defaults to 100%. It holds a dual meaning: 1. A route is considered alive if the health of at least one of its interfaces is >= LNET_MAX_HEALTH_VALUE * router_sensitivity_percentage. 100 means at least one interface has to be 100% healthy. 2. On a router, consider a peer_ni dead if its health is not at least LNET_MAX_HEALTH_VALUE * router_sensitivity_percentage. 100% means the interface has to be 100% healthy. Re-implemented lnet_notify() to decrement the health of the peer interface if the LND reports a failure on that peer. Test-Parameters: forbuildonly Signed-off-by: Amir Shehata <ashehata@whamcloud.com> Change-Id: Ie97561fb70bf6a558bc90fa9266a6ba38fa3d293 |
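
The sensitivity-percentage rule in LU-11300 can be illustrated with a small standalone sketch. This is not the LNet implementation: the `sketch_*` names and the simplified interface struct are hypothetical, the value 1000 for `LNET_MAX_HEALTH_VALUE` is assumed for illustration, and dividing by 100 is one plausible reading of how the percentage is applied.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed value for illustration; the notes name the constant only. */
#define LNET_MAX_HEALTH_VALUE 1000

/* Hypothetical, simplified stand-in for one gateway interface (peer_ni). */
struct sketch_peer_ni {
	unsigned int net;	/* network the interface is on */
	int health;		/* 0 .. LNET_MAX_HEALTH_VALUE */
};

/* Minimum health an interface needs at a given sensitivity percentage. */
int sketch_health_threshold(int sensitivity_pct)
{
	return LNET_MAX_HEALTH_VALUE * sensitivity_pct / 100;
}

/* Meaning 2: on a router, a peer_ni below the threshold is treated as dead. */
bool sketch_peer_ni_alive(const struct sketch_peer_ni *ni, int sensitivity_pct)
{
	return ni->health >= sketch_health_threshold(sensitivity_pct);
}

/*
 * Meaning 1: a route is alive if at least one gateway interface on the
 * remote net meets the threshold.
 */
bool sketch_route_alive(const struct sketch_peer_ni *nis, size_t count,
			unsigned int remote_net, int sensitivity_pct)
{
	for (size_t i = 0; i < count; i++) {
		if (nis[i].net == remote_net &&
		    sketch_peer_ni_alive(&nis[i], sensitivity_pct))
			return true;
	}
	return false;
}
```

With the default of 100, a gateway whose only remote-net interfaces have health 900 and 400 would be treated as not alive in this sketch. Lowering the percentage to 90 would make the same route alive, because the interface with health 900 then meets the threshold.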