https://review.whamcloud.com/#/c/34445/

More Details
Code Block
LU-12080 lnet: recovery event handling broken
    
Don't increment health on unlink event.
If a SEND fails an unlink will follow so no need to do any
special processing on SEND event. If SEND succeeds then we
wait for the reply.
When queuing a message on the NI recovery queue only do so
if the MT thread is still running.
    
Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I4877caebcac5cdfc35a59a18a3e3451b1f23cb0d
This should be ported to b2_12
https://review.whamcloud.com/#/c/34477
Code Block
LU-12080 lnet: clean mt_eqh properly
    
There is a scenario where you have a peer on your recovery queue
that's down. So you keep pinging it, but every ping times out
after 10 seconds. In the middle of these 10 seconds you perform a
shutdown. First you try to do the rsp_tracker_clean. It goes through
and calls MDUnlink on the MD related to that ping. But because the
message has a ref count on the MD, it doesn't go away. The MD gets
zombied. And just waits for lnet_md_unlink to be called in
lnet_finalize(). Then you hit clean_peer_ni_recovery. We see the peer
on the queue, we try to call Unlink on it, but when we lookup the
MD using lnet_handle2md() we can't find it. Afterwards we try to clean
up the EQ and it asserts. Even if we remove the assert we end up with
a resource leak since the EQ is not actually freed since we won't call
LNetEQFree() again.
   
The solution is to pull the EQ create in the LNetNIInit() and deletion
happens in lnet_unprepare. By this point all the remaining messages
would've been finalized and all references on the EQ are gone,
allowing us to clean it up properly
    
Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I7fd6018ee2e57f82c649fc3658352e89a4309986
This should be ported to b2_12
https://review.whamcloud.com/#/c/34967
Code Block
LU-12344 lnet: handle remote health error
    
When a peer is dead set the health status to REMOTE_DROPPED
in order to handle health properly for the peer.
When dropping a routed message set REMOTE_ERROR. Routed messages
are dropped when the routing feature is turned off which could
be considered a configuration error if it happens in the middle
of traffic. Therefore, it's better to flag this issue at this
point without resending the message.
    
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I131263215a68fc8607582643a47007ce4d04abbc
This should be ported to b2_12
https://review.whamcloud.com/#/c/34252
Code Block
LU-11816 lnet: setup health timeout defaults
    
Enable health feature by default.
Setup transaction timeout to a default 10 seconds and
retry count to 3 when health is enabled. When health
is disabled set default transaction timeout to 50.
When toggling between health enabled/disabled the defaults
will always kick in.
    
Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I153c2822898b44e33871ec827de7e61f153bb1db

This patch turns on the health feature since in 2.12 it was off by default. The MR routing feature and related health went through significant testing on Cray HW, thanks to Chris Horn, and some fixes were made to the Health feature in the process.
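
The defaults described in the commit message can be sketched as follows. This is only an illustration, not the LNet implementation: the struct and function names are hypothetical, and the retry count used when health is disabled is an assumption, since the commit message does not state it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container for the tunables named in the commit message. */
struct health_defaults {
    bool health_enabled;
    unsigned int transaction_timeout; /* seconds */
    unsigned int retry_count;
};

/* Re-apply the defaults every time the health feature is toggled. */
static void health_apply_defaults(struct health_defaults *d, bool enable)
{
    assert(d != NULL);

    d->health_enabled = enable;
    if (enable) {
        d->transaction_timeout = 10;
        d->retry_count = 3;
    } else {
        d->transaction_timeout = 50;
        d->retry_count = 0; /* assumption: not stated in the source */
    }
}
```

Because the defaults are re-applied on every toggle, a manually tuned timeout would be overwritten when health is switched on or off, which matches "the defaults will always kick in".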

This should be ported to b2_12

https://review.whamcloud.com/#/c/34607
Code Block
LU-12163 lnet: fix cpt locking
    
In lnet_select_pathway() the call to lnet_handle_send_case_locked()
can result in sd_cpt being changed. If this function returns
REPEAT_SEND, we'll go back to the again label. It is possible at
this time to initiate discovery, which will unlock the cpt.
If the local cpt isn't updated we could potentially be manipulating
the wrong cpt resulting in some form of corruption or dead lock.
    
Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ifd39b0d84f8cce859151f7cc900a082481dd7218
This should be ported to b2_12
https://review.whamcloud.com/#/c/34770/
Code Block
LU-12201 lnet: detach response tracker
    
We need to unlink the response tracker from MDs even if the
corresponding message failed to send.
    
https://review.whamcloud.com/#/c/33183/
Code Block
LU-11298 lnet: use peer for gateway

The routing code uses peer_ni for a gateway. However with Mulit-Rail
a gateway could have multiple interfaces on several different
networks. Instead of using a single peer_ni as the gateway we should
be using the peer and let the MR selection code select the best
peer_ni to send to.

This patch moves the gateway from peer to peer_ni. Much of the
code needs to be rewritten in the following patches to account
for that change. This patch disables the routing features by
disabling the code to add/delete routes.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ia7dab552268c4a7fbd7b88122b9a95363d155fd7

The routing code will change quite a bit, so this patch removes most of the current routing code; later patches reintroduce it.

This patch concentrates on switching the gateway from using lnet_peer_ni to using lnet_peer.

The design decision here is that a gateway is a node where LNet is started with the routing feature enabled. A gateway node can have multiple interfaces. To align routing with Multi-Rail, the code should first select a gateway peer, then use Multi-Rail to select the best peer_ni on that gateway.
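
A minimal sketch of this two-step selection, under simplifying assumptions: the types are hypothetical stand-ins for lnet_route, lnet_peer and lnet_peer_ni, the gateway is taken from the first alive route, and "best" means highest health only. The real Multi-Rail selection weighs more criteria.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the LNet structures. */
struct peer_ni { int health; };
struct peer    { struct peer_ni *nis; size_t nnis; };
struct route   { struct peer *gateway; int alive; };

/* Step 2: pick the healthiest interface of the chosen gateway. */
static struct peer_ni *select_best_ni(struct peer *gw)
{
    struct peer_ni *best = NULL;
    size_t i;

    assert(gw != NULL);
    for (i = 0; i < gw->nnis; i++) {
        if (best == NULL || gw->nis[i].health > best->health)
            best = &gw->nis[i];
    }
    return best;
}

/* Step 1: pick a gateway peer first, then delegate to step 2. */
static struct peer_ni *select_gateway_ni(struct route *routes, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (routes[i].alive)
            return select_best_ni(routes[i].gateway);
    }
    return NULL; /* no usable route */
}
```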

The following functions are removed in this patch and will be reintroduced in later patches:

Code Block
lnet_is_route_alive()
lnet_rtr_addref_locked()
lnet_rtr_decref_locked()
lnet_shuffle_seed()
lnet_add_route_to_rnet()
lnet_add_route() # the bulk of the code is removed
lnet_check_routes() # the bulk of the code is removed
lnet_del_route() # the bulk of the code is removed
lnet_parse_rc_info() # the bulk of the code is removed
lnet_destroy_rc_data()
lnet_update_rc_data_locked()
lnet_router_check_interval()
lnet_ping_router_locked()
lnet_prune_rc_data()
lnet_compare_peers()

Key fields are moved from lnet_peer_ni to lnet_peer or deleted including:

Code Block
lpni_rtrq # moved
lpni_rtr_list # moved
lpni_ping_notsent # deleted
lpni_ping_timestamp # deleted
lpni_ping_deadline # deleted
lpni_rtr_refcount # moved
lpni_healthy # this is a remnant code which is cleaned up
lpni_routes # moved

The lnet_route structure is changed in the following way:

Code Block
struct lnet_peer *lr_gateway # this is now lnet_peer instead of lnet_peer_ni
__u32 lr_lnet

It is no longer possible to determine the local network of the route by simply looking at the gateway peer, since the peer can have multiple interfaces on different networks. Therefore the route now must define the local network and remote network. This way we are able to select and compare routes properly.

The rest of the changes concentrate on removing the use of lnet_peer_ni as the gateway and replacing it with lnet_peer

In lib-move.c there are changes in both lnet_post_routed_recv_locked() and lnet_return_rx_credits_locked()

lnet_find_route_locked() is marked as "to be implemented". As a result, lnet_handle_find_routed_path(), which calls lnet_find_route_locked(), is also incomplete because the routing functionality has been removed. The changes there are mainly to avoid compilation problems. It will be re-implemented in a later patch.

Routing is disabled with this patch.

https://review.whamcloud.com/#/c/33184/ Code BlockLU-11299 lnet: lnet_add/del_route() Reimplemented lnet_add_route() and lnet_del_route() to use the peer instead of the peer_ni.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: 
I3734098a81ab18d1d74220c691d96a9b9817e6da

This patch re-implements the following functions, which now use lnet_peer instead of lnet_peer_ni for the gateway

Code Block
lnet_rtr_addref_locked()
lnet_rtr_decref_locked()
lnet_shuffle_seed()
lnet_add_route_to_rnet()
lnet_add_route()
lnet_del_route_from_rnet()
lnet_del_route()
Prevent peer_ni deletion if it's being used as a router
Code Block
LU-11551 lnet: Do not allow deleting of router nis
    
Check the peer before deleting a peer_ni. If it's a router then do
not allow deletion of the peer-ni.
    
Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I372052b4e9b5af3a8f18a49676fc60b4c8077cbd
Add a check before deleting the peer_ni in lnet_del_peer_ni()
Router sensitivity introduction
I4f320274576790e3332f66f30aad5c2b3450b955
This should be ported to b2_12
https://review.whamcloud.com/#/c/34771
Code Block
LU-11297 lnet: invalidate recovery ping mdh
    
For cleanliness, ensure that recovery ping mdh is invalidated when
an peer ni or a local ni are allocated
    
Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: If06448b1602b3680831244923b6b982a555159ea
This should be ported to b2_12
https://review.whamcloud.com/#/c/34778
Code Block
LU-12249 lnet: fix list corruption
    
In shutdown the resend queues are cleared and freed. The monitor
thread state is set to shutdown. It is possible to get lnet_finalize()
called after the queues are freed. The code checks for ln_state to see
if we're shutting down. But in this case we should really be checking
ln_mt_state. The monitor thread is the one that matters in this case,
because it's the one which allocates and frees the resend queues.
    
Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ia077cec7a52ef5cd2e1b231437c6265ba9416b1b
This should be ported to b2_12
https://review.whamcloud.com/#/c/34796
Code Block
LU-12254 lnet: correct discovery LNetEQFree()
    
The EQ needs to be freed after all the queues are cleaned to avoid
having non-processed events on the event queue on free. This will
prevent the memory from being freed.
    
Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ie38ec25e09bf6d7cf2aadc30edd91d298897c51b
This should be ported to b2_12
https://review.whamcloud.com/#/c/34798
Code Block
LU-12264 lnet: Protect lp_dc_pendq manipulation with lp_lock
    
Protect the peer discovery queue from concurrent manipulation by
acquiring the lp_lock.
    
Test-Parameters: forbuildonly
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: If43b877c1c7ea203f346a3d6ea846f00b8f9661f
This should be ported to b2_12
https://review.whamcloud.com/#/c/34885/
Code Block
LU-12199 lnet: Ensure md is detached when msg is not committed
    
It's possible for lnet_is_health_check() to return "true" when the
message has not hit the network. In this situation the message is
freed without detaching the MD. As a result, requests do not receive
their unlink events and these requests are stuck forever.
    
A little cleanup is included here:
 - The value of lnet_is_health_check() is only used in one place, so
   we don't need to save the result of it in a variable.
 - We don't need separate logic to detach the md when the send was
   successful. We'll fall through to the finalizing code after
   incrementing the health counters
    
Test-Parameters: forbuildonly
Cray-bug-id: LUS-7239
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I6301d491090b862d016eed3aac8afd7be8685e57
This should be ported to b2_12
https://review.whamcloud.com/#/c/34797/
Code Block
LU-12199 lnet: verify msg is commited for send/recv
    
Before performing a health check make sure the message
is committed for either send or receive. Otherwise we
can just finalize it.
    
https://review.whamcloud.com/#/c/33303
Code Block
LU-11300 lnet: start with peer down

When creating an peer_ni call lnet_peers_start_down to check if we
should set the peer_ni's status as up or down.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I05005f10ca4b1b11f93e57c304052e155679304a

This patch ensures we maintain current behavior. We start with peer down depending on the tunable checked via lnet_peers_start_down.

  • This is achieved by setting the lpni_ns_status = LNET_NI_STATUS[UP|DOWN]
Cache the routing feature status reported in the ping REPLY
Code Block
LU-11300 lnet: Cache the routing feature
    
When processing a REPLY or a PUSH for a discovery cache the
whether the routing feature is enabled or disabled as
reported by the peer.
    
Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I69bd41fade196773af0e1004c2e7fff2fb91392d

This patch caches the routing feature status (enabled or disabled) in the REPLY or the PUSH processed by lnet_peer_merge_data().

This is important for later patches which check if the peer is a router or not.

https://review.whamcloud.com/#/c/33186/
Code Block
LU-11300 lnet: peer aliveness

Peer NI aliveness is now solely dependent on the health
infrastructure. With the addition of router_sensitivity_percentage,
peer NI is considered dead if its health drops below the percentage
specified of the total health. Setting the percentage to 100% means
that a peer_ni is considered dead if it's interface is less than
fully healthy.

Removed obsolete code that queries the peer NI every second since
the health infrastructure introduces the recovery mechanism which
is designed to recover the health of peer NIs.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I506060fbb66c74295808891b689d7d634dc69284

******(CONTINUE FROM HERE)******* NOTE: This patch needs cleaning up to make it match the final form.

  • lnet_is_peer_ni_alive()
    • The peer NI is considered alive if its health value >= MAX * router sensitivity percentage
    • and
    • the status reported in the ping is UP
  • lnet_peer_alive_locked() - changes
    • Set lpni_ns_status = LNET_NI_STATUS_UP when we receive a message from that peer.
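
The aliveness rule above can be sketched as a single predicate. This is an illustration only: the function name, the enum, and the numeric value given to LNET_MAX_HEALTH_VALUE are assumptions, and the sensitivity is taken as an integer percentage.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: placeholder value standing in for the real LNet constant. */
#define LNET_MAX_HEALTH_VALUE 1000

/* Hypothetical stand-in for the status cached from the ping reply. */
enum ni_status { NI_STATUS_DOWN, NI_STATUS_UP };

/*
 * A peer NI is alive when both conditions hold:
 *   1. health >= MAX * sensitivity percentage
 *   2. the status reported in the ping is UP
 */
static bool peer_ni_is_alive(int health, unsigned int sensitivity_pct,
                             enum ni_status reported)
{
    int threshold;

    assert(sensitivity_pct <= 100);
    threshold = (int)(LNET_MAX_HEALTH_VALUE * sensitivity_pct / 100);

    return reported == NI_STATUS_UP && health >= threshold;
}
```

With the default of 100%, any drop in health below the maximum marks the interface as dead; lowering the percentage makes the check more tolerant.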

Code which calls this function is updated.

router_proc.c cleanup

Merge the code from https://review.whamcloud.com/#/c/33302/3 to this patch.

Id7bd956f8e81e60a2d63059730973f851d4c7abe
This should be ported to b2_12
https://review.whamcloud.com/#/c/34957
Code Block
LU-12339 lnet: select LO interface for sending
    
In the following scenario
    
Lustre->LNetPrimaryNID with 0@lo
Discover is initiated on 0@lo
The peer is created with 0@lo and <addr>@<net>
The interface health of the peer's <addr>@<net> is decremented
LNetPut() to self on <addr>@<net>
selection algorithm selects 0@lo to send to
    
This exposes an issue where we try and go through the peer credit
management algorithm, but because there are no credits associated with
0@lo we end up indefinitely queuing the message. ptlrpc will then get
stuck waiting for send completion on the message.
    
This was exposed via conf-sanity 32a
    
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I98e9d3428b594a0d041d27d8e8d8de7596825edc
This should be ported to b2_12
https://review.whamcloud.com/#/c/33447/
Code Block
LU-10153 lnet: remove route add restriction
    
Remove restriction with adding routes to the same remote network
via two different gateways.
    
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Iefc5aa10f73e9e7bdd283f5e933fbb8ee819df50

There is no need to restrict the addition of routes to the same remote network via two different gateways.

This change is simple. Just remove lnet_check_routes() and its callers.

https://review.whamcloud.com/#/c/33185/
Code Block
LU-11300 lnet: router aliveness

A route is considered alive if the gateway is able to route
messages from the local to the remote net. That means that
at least one of the network interfaces on the remote net of
the gateway is viable.

Introduced the concept of sensitivity percentage. This defaults
to 100%. It holds a dual meaning:
1. A route is considered alive if at least one of the its interfaces'
   health is >= LNET_MAX_HEALTH_VALUE * router_sensitivity_percentage
   100 means at least one interface has to be 100% healthy
2. On a router consider a peer_ni dead if its health is not at least
   LNET_MAX_HEALTH_VALUE * router_sensitivity_percentage.
   100% means the interface has to be 100% healthy.

Re-implemented lnet_notify() to decrement the health of the
peer interface if the LND reports a failure on that peer.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ie97561fb70bf6a558bc90fa9266a6ba38fa3d293

NOTE: break up this patch into a patch which introduces the router_sensitivity_percentage and a patch which uses the value. The changes which are in lib-msg.c do not belong to this patch. Need a separate patch for it.

NOTE: This should be titled "route aliveness"

This patch introduces a new concept on how to determine that a route is alive. A route is alive if the following two conditions are met:

  1. At least one local interface to the route is healthy
  2. At least one remote interface of the route is up

The health value of a router remote interface will always be set to MAX because we do not send to it directly, therefore we never decrement its health value. We learn whether it is up or down when we discover the router: the router responds with the status of the interface, which we cache and use to determine the status of the remote interface.

  • lnet_is_gateway_net_alive()
    • A net of the gateway is alive if it has at least one alive ni
Cleanup all rcd code
Cleans up the legacy code which handled router pinging
Update LND notify mechanism

https://review.whamcloud.com/#/c/33182
Code Block
LU-11292 lnet: Discover routers on first use

Discover routers on first use. This brings the behavior when
interacting with routers inline with when dealing with normal
peers.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I8527e41daf2f5f6ab5f04aac1285aaa6cc4ee594
  • lnet_initiate_peer_discovery()
    • This function initiates peer discovery for the passed-in lpni and returns LNET_DC_WAIT
    • It is called when we want to discover a peer on first use.
    • It is called when we want to discover a gateway on first use.
  • lnet_handle_find_routed_path()
    • Call lnet_find_route_locked() to find a gateway
    • If the gateway has not been discovered yet, discover it.
    • Increment the sequence number on the route only if the route is going to be used.
      • This helps in ensuring that the route sequence numbers remain sane.
  • lnet_find_route_locked()
    • returns the route to use (use_route) and the previous route (prev_route)
    • It no longer increments the sequence number of the route, since finding the route doesn't equate to using the route
      • Incrementing the route sequence number is delegated to the calling function.
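
The split between finding a route and using it can be sketched as below. The types and function names are hypothetical, and the selection rule (prefer the alive route with the lowest sequence number) is an assumption made for illustration; the point is only that the lookup has no side effects and the caller bumps the sequence number.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified route: only the fields needed here. */
struct route {
    unsigned int seq; /* bumped each time the route is actually used */
    int alive;
};

/* Lookup only: choose a route but leave its sequence number alone. */
static struct route *find_route(struct route *routes, size_t n)
{
    struct route *best = NULL;
    size_t i;

    for (i = 0; i < n; i++) {
        if (!routes[i].alive)
            continue;
        if (best == NULL || routes[i].seq < best->seq)
            best = &routes[i];
    }
    return best;
}

/* Caller: increments the sequence number only when the route is used. */
static struct route *use_route(struct route *routes, size_t n)
{
    struct route *r;

    assert(routes != NULL || n == 0);
    r = find_route(routes, n);
    if (r != NULL)
        r->seq++;
    return r;
}
```

Calling find_route() repeatedly, for example while waiting for gateway discovery to finish, does not skew the sequence numbers; only use_route() advances them.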

Updates

lnet_notify()

lnet_set_healthv()

lnet_notify_peer_down()

gni changes

o2iblnd changes

socklnd changes
https://review.whamcloud.com/#/c/33187/5/lnet/lnet/router.c (end of the file changes)

Use discovery for router checking

lnet_consolidate_routes_locked()

lnet_peer_get_ni_locked()

lnet_check_routers()

https://review.whamcloud.com/#/c/33183/
Code Block
LU-11298 lnet: use peer for gateway

The routing code uses peer_ni for a gateway. However with Mulit-Rail
a gateway could have multiple interfaces on several different
networks. Instead of using a single peer_ni as the gateway we should
be using the peer and let the MR selection code select the best
peer_ni to send to.

This patch moves the gateway from peer to peer_ni. Much of the
code needs to be rewritten in the following patches to account
for that change. This patch disables the routing features by
disabling the code to add/delete routes.

https://review.whamcloud.com/#/c/33188/
Code Block
LU-11378 lnet: MR aware gateway selection

When selecting a route use the Multi-Rail Selection algorithm to
select the best available peer_ni of the best route. The selected
peer_ni can then be used to send the message or to discover it if
the gateway peer needs discovering.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I376af57611591eed2eb1edb80a1b3a68b5aefd19

This patch modifies lib-move.c to properly select the gateway and then the gateway peer_ni to send to.

https://review.whamcloud.com/#/c/33298/
Code Block
LU-11300 lnet: consider router_check_interval

Consider router_check_interval when waking up the monitor thread,
to make sure you wakeup the monitor thread at the earliest possible
time.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ibc4b53886b59a9bc174a29d0da711ac77db3a62c

The monitor thread wakes up after the minimum of the following intervals:

  • lnet_recovery_interval
  • lnet_transaction_timeout / 2
  • router_check_interval

This patch introduces the router_check_interval for consideration in the monitor thread wake-up algorithm.
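
The wake-up calculation amounts to taking the minimum of three values. A minimal sketch, assuming all three tunables are in seconds; the struct and function names are hypothetical and do not come from the LNet source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bundle of the three tunables, all in seconds. */
struct mt_tunables {
    unsigned int recovery_interval;
    unsigned int transaction_timeout;
    unsigned int router_check_interval;
};

static unsigned int min_uint(unsigned int a, unsigned int b)
{
    return a < b ? a : b;
}

/* Earliest time (in seconds) at which the monitor thread must wake up. */
static unsigned int monitor_wake_interval(const struct mt_tunables *t)
{
    unsigned int interval;

    assert(t != NULL);
    interval = min_uint(t->recovery_interval, t->transaction_timeout / 2);
    return min_uint(interval, t->router_check_interval);
}
```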

https://review.whamcloud.com/#/c/33299/
Code Block
LU-11299 lnet: router discovery complete callback

Added a discovery complete callback which is called when a
router has completed it's discovery process. If the router
failed discovery then the status of each lpni is set to
down. This is necessary because lpnis on remote networks
are never communicated with. So their health remains at max.
However, if we can't discover the router, then we have to
assume that the whole router and all its NIs are down.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I8d77b78b20ac555bc3afabd9404ca4b0fd19bd2d

Introduce:

  • lnet_router_discovery_complete()
    • Called from lnet_peer_discovery_complete() for peers which are routers
    • It checks the discovery completion code and if it's an error sets all the lpni→lpni_ns_status = LNET_NI_STATUS_DOWN
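
A sketch of the callback's effect, using hypothetical simplified types in place of lnet_peer and lnet_peer_ni; the dc_error field name and the convention that zero means success are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for LNET_NI_STATUS_DOWN / LNET_NI_STATUS_UP. */
enum ni_status { NI_STATUS_DOWN, NI_STATUS_UP };

struct peer_ni { enum ni_status ns_status; };

struct peer {
    struct peer_ni *nis;
    size_t nnis;
    int dc_error; /* assumption: 0 on success, non-zero on failure */
};

/*
 * If discovery of the router failed, mark every interface down.
 * Interfaces on remote networks are never sent to directly, so their
 * health stays at max and cannot be used to detect the failure.
 */
static void router_discovery_complete(struct peer *rtr)
{
    size_t i;

    assert(rtr != NULL);
    if (rtr->dc_error == 0)
        return;

    for (i = 0; i < rtr->nnis; i++)
        rtr->nis[i].ns_status = NI_STATUS_DOWN;
}
```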
Ia7dab552268c4a7fbd7b88122b9a95363d155fd7

The routing code will change quite a bit, so this patch removes most of the current routing code; later patches reintroduce it.

This patch concentrates on switching the gateway from using lnet_peer_ni to using lnet_peer.

The design decision here is that a gateway is a node where LNet is started with the routing feature enabled. A gateway node can have multiple interfaces. To align routing with Multi-Rail, the code should first select a gateway peer, then use Multi-Rail to select the best peer_ni on that gateway.

The following functions are removed in this patch and will be reintroduced in later patches:

Code Block
lnet_is_route_alive()
lnet_rtr_addref_locked()
lnet_rtr_decref_locked()
lnet_shuffle_seed()
lnet_add_route_to_rnet()
lnet_add_route() # the bulk of the code is removed
lnet_check_routes() # the bulk of the code is removed
lnet_del_route() # the bulk of the code is removed
lnet_parse_rc_info() # the bulk of the code is removed
lnet_destroy_rc_data()
lnet_update_rc_data_locked()
lnet_router_check_interval()
lnet_ping_router_locked()
lnet_prune_rc_data()
lnet_compare_peers()

Key fields are moved from lnet_peer_ni to lnet_peer or deleted including:

Code Block
lpni_rtrq # moved
lpni_rtr_list # moved
lpni_ping_notsent # deleted
lpni_ping_timestamp # deleted
lpni_ping_deadline # deleted
lpni_rtr_refcount # moved
lpni_healthy # this is a remnant code which is cleaned up
lpni_routes # moved

The lnet_route structure is changed in the following way:

Code Block
struct lnet_peer *lr_gateway # this is now lnet_peer instead of lnet_peer_ni
__u32 lr_lnet

It is no longer possible to determine the local network of the route by simply looking at the gateway peer, since the peer can have multiple interfaces on different networks. Therefore the route now must define the local network and remote network. This way we are able to select and compare routes properly.

The rest of the changes concentrate on removing the use of lnet_peer_ni as the gateway and replacing it with lnet_peer

In lib-move.c there are changes in both lnet_post_routed_recv_locked() and lnet_return_rx_credits_locked()

lnet_find_route_locked() is marked as "to be implemented". As a result, lnet_handle_find_routed_path(), which calls lnet_find_route_locked(), is also incomplete because the routing functionality has been removed. The changes there are mainly to avoid compilation problems. It will be re-implemented in a later patch.

Routing is disabled with this patch.

https://review.whamcloud.com/#/c/33184/
Code Block
LU-11299 lnet: lnet_add/del_route()

Reimplemented lnet_add_route() and lnet_del_route() to use
the peer instead of the peer_ni.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I3734098a81ab18d1d74220c691d96a9b9817e6da

This patch re-implements the following functions, which now use lnet_peer instead of lnet_peer_ni for the gateway

Code Block
lnet_rtr_addref_locked()
lnet_rtr_decref_locked()
lnet_shuffle_seed()
lnet_add_route_to_rnet()
lnet_add_route()
lnet_del_route_from_rnet()
lnet_del_route()

https://review.whamcloud.com/#/c/33300/
Code Block
LU-11475 lnet: allow deleting router primary_nid

Discovery doesn't allow deleting a primary_nid of a peer. This
is necessary because upper layers only know to reach the peer by
using the primary_nid. For routers this is not the case. So if
a router changes its interfaces and comes back up again, the
peer_ni should be adjusted.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I9da056172f35a5f15eed5ba0e02fcb37ac414c54

  • Add a new bit to the peer state: LNET_PEER_RTR_NI_FORCE_DEL
  • Add a force parameter to lnet_peer_ni_del_locked(). This will be set to true in lnet_peer_merge_data(). For routers it's okay to delete the primary_nid because the upper layers don't really rely on it. So if we're being told that the router changed its primary_nid then it's okay to delete it.
    https://review.whamcloud.com/#/c/33301
    Code Block
    LU-11477 lnet: handle health for incoming messages

    In case of routers (as well as for the general case) it's
    important to update the health of the ni/lpni for incoming
    messages. For an lpni specifically when we receive a message is
    when we know that the lpni is up.
    A percentage router health is required in order to send a message
    to a gateway. That defaults to 100, meaning that a router interface
    has to be absolutely healthy in order to send to it. This matches
    the current behavior. So if a router interface goes down an its
    health goes down significantly, but then it comes back up again;
    either we receive a message from it or we discover it and get a
    reply, then in order to start using that router interface again we
    have to boost its health all the way up to maximum. This behavior
    is special cased for routers.

    Test-Parameters: forbuildonly
    Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
    Change-Id: Ida6c23f95dbef56c2e6ed7b6d03743939d8b30a0

    Most of the modifications are in lnet_health_check()

  • handle incoming messages
  • Only increment the health of a peer_ni if we receive a message from it. Sending a message successfully doesn't mean the peer_ni is actually alive.
  • For router case boost the health of the peer_ni of the router to max if we receive a message from it.
    https://review.whamcloud.com/#/c/33448/
    Code Block
    LU-11551 lnet: Do not allow deleting of router nis
        
    Check the peer before deleting a peer_ni. If it's a router then do
    not allow deletion of the peer-ni.
        
    Test-Parameters: forbuildonly
    Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
    Change-Id: I372052b4e9b5af3a8f18a49676fc60b4c8077cbd
    Add a check before deleting the peer_ni in lnet_del_peer_ni()
    https://review.whamcloud.com/#/c/33449/
    Code Block
    LU-11300 lnet: router sensitivity
        
    Introduce the lnet_router_sensitivity module parameter to control
    the sensitivity of routers to failures. It defaults to 100% which
    means a router interface needs to be fully healthy in order to be
    used.
        
    Test-Parameters: forbuildonly
    Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
    Change-Id: I3e9333033f049918c1cdca58a72604c71884acbe
    This patch introduces the router_sensitivity_percentage module parameter
    https://review.whamcloud.com/#/c/33455/
    Code Block
    LU-11300 lnet: configure lnet_router_sensitivity
        
    Allow the configuration of lnet_router_sensitivity from the user space utility lnetctl
        
    Test-Parameters: forbuildonly
    Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
    Change-Id: I715059580d3d2d443432a8b550d4cdafc9f9f632
    This patch allows setting the router sensitivity from lnetctl
    https://review.whamcloud.com/#/c/33450/
    Code Block
    LU-11300 lnet: cache ni status
    
    When processing the data in the PUSH or the REPLY make sure to cache
    the ns_status. This is the status of the peer_ni as reported by the
    peer itself.
        
    Test-Parameters: forbuildonly
    Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
    Change-Id: I14de2460f578fb7f47d329a97b8833f49c569b74

    This patch caches the ns_status reported in the PING reply for a GET sent as part of discovery

    • lnet_peer_merge_data()
      • When processing the data in the PUSH or the REPLY make sure to cache the ns_status. This is the status of the peer_ni as reported by the peer itself.
    https://review.whamcloud.com/#/c/33451/
    Code Block
    LU-11300 lnet: Cache the routing feature
        
    When processing a REPLY or a PUSH for discovery, cache
    whether the routing feature is enabled or disabled as
    reported by the peer.
        
    Test-Parameters: forbuildonly
    Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
    Change-Id: I69bd41fade196773af0e1004c2e7fff2fb91392d

    This patch caches the routing feature status (enabled or disabled) in the REPLY or the PUSH processed by lnet_peer_merge_data().

    This is important for later patches which check if the peer is a router or not.

    https://review.whamcloud.com/#/c/33186/
    Code Block
    LU-11300 lnet: peer aliveness
    
    Peer NI aliveness is now solely dependent on the health
    infrastructure. With the addition of router_sensitivity_percentage,
    peer NI is considered dead if its health drops below the percentage
    specified of the total health. Setting the percentage to 100% means
    that a peer_ni is considered dead if its interface is less than
    fully healthy.
    
    Removed obsolete code that queries the peer NI every second since
    the health infrastructure introduces the recovery mechanism which
    is designed to recover the health of peer NIs.
    
    Test-Parameters: forbuildonly
    Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
    Change-Id: I506060fbb66c74295808891b689d7d634dc69284

    • lnet_is_peer_ni_alive()
      • The peer NI is considered alive if its health value >= MAX * router sensitivity percentage and the status reported in the ping is UP.
    • lnet_peer_alive_locked() - changes
    • Set lpni_ns_status = LNET_NI_STATUS_UP when we receive a message from that peer.
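    The aliveness rule above can be sketched as follows. This is only an illustration under stated assumptions: the struct, the enum, the function name, and the maximum health value of 1000 are hypothetical stand-ins, not the actual LNet definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not the real LNet definitions. */
#define SKETCH_MAX_HEALTH 1000 /* assumed maximum health value */

enum sketch_ni_status {
	SKETCH_NI_STATUS_DOWN = 0,
	SKETCH_NI_STATUS_UP = 1,
};

struct sketch_peer_ni {
	int healthv;                     /* current health value */
	enum sketch_ni_status ns_status; /* status cached from the ping reply */
};

/*
 * A peer NI is treated as alive when its health is at least
 * MAX * sensitivity percentage and its cached status is UP.
 */
static bool sketch_peer_ni_alive(const struct sketch_peer_ni *lpni,
				 int sensitivity_pct)
{
	int threshold = SKETCH_MAX_HEALTH * sensitivity_pct / 100;

	return lpni->healthv >= threshold &&
	       lpni->ns_status == SKETCH_NI_STATUS_UP;
}
```

    With a sensitivity of 100, a health value of 999 is already below the threshold, which matches the commit message: the interface must be fully healthy.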

    Code which calls this function is updated.

    router_proc.c cleanup

    https://review.whamcloud.com/#/c/33185/
    Code Block
    LU-11300 lnet: router aliveness
    
    A route is considered alive if the gateway is able to route
    messages from the local to the remote net. That means that
    at least one of the network interfaces on the remote net of
    the gateway is viable.
    
    Introduced the concept of sensitivity percentage. This defaults
    to 100%. It holds a dual meaning:
    1. A route is considered alive if at least one of its interfaces'
    health is >= LNET_MAX_HEALTH_VALUE * router_sensitivity_percentage
    100 means at least one interface has to be 100% healthy
    2. On a router consider a peer_ni dead if its health is not at least
    LNET_MAX_HEALTH_VALUE * router_sensitivity_percentage.
    100% means the interface has to be 100% healthy.
    
    Re-implemented lnet_notify() to decrement the health of the
    peer interface if the LND reports a failure on that peer.
    
    Test-Parameters: forbuildonly
    Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
    Change-Id: Ie97561fb70bf6a558bc90fa9266a6ba38fa3d293

    NOTE: Break up this patch into a patch which introduces the router_sensitivity_percentage and a patch which uses the value. The changes in lib-msg.c do not belong to this patch and need a separate patch.

    NOTE: This should be titled "route aliveness"

    This patch introduces a new concept on how to determine that a route is alive. A route is alive if the following two conditions are met:

    1. At least one local interface to the route is healthy
    2. At least one remote interface of the route is up

    The health value of a router's remote interface will always be set to MAX because we do not send to it directly, and therefore we never decrement its health value. We learn whether it is up or down when we discover the router: the router responds with the status of the interface, which we cache and use to determine the status of the remote interface.

    • lnet_is_gateway_net_alive()
      • A net of the gateway is alive if it has at least one alive ni
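    A minimal sketch of the two route aliveness conditions, assuming a gateway is represented as a flat array of interfaces. The types and function names here are hypothetical and chosen only for illustration; the real lnet_is_gateway_net_alive() operates on LNet's own peer structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of one gateway interface. */
struct sketch_gw_ni {
	unsigned int net; /* network this gateway interface is on */
	bool alive;       /* healthy (local side) or reported UP (remote side) */
};

/* A net of the gateway is alive if it has at least one alive NI. */
static bool sketch_gw_net_alive(const struct sketch_gw_ni *nis, size_t n,
				unsigned int net)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (nis[i].net == net && nis[i].alive)
			return true;
	return false;
}

/* A route is alive if both the local and the remote net of the gateway are. */
static bool sketch_route_alive(const struct sketch_gw_ni *nis, size_t n,
			       unsigned int local_net, unsigned int remote_net)
{
	return sketch_gw_net_alive(nis, n, local_net) &&
	       sketch_gw_net_alive(nis, n, remote_net);
}
```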
    https://review.whamcloud.com/#/c/33452
    Code Block
    LU-11300 lnet: simplify lnet_handle_local_failure()
        
    Pass the struct lnet_ni to lnet_handle_local_failure() instead of the
    message structure, since nothing else from the message is being
    used. This also makes it symmetrical with lnet_handle_remote_failure()
        
    Test-Parameters: forbuildonly
    Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
    Change-Id: I10146ec5bf5f378e28a7725382f00132ada32c6e

    https://review.whamcloud.com/#/c/33187/
    Code Block
    LU-11299 lnet: Cleanup rcd
        
    Cleanup all code pertaining to rcd, as routing code will use
    discovery going forward and there will be no need to keep its own
    pinging code.
        
    Test-Parameters: forbuildonly
    Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
    Change-Id: If31caa3b5703df40b6ae0f758f2fe764991aa4f3
    Cleans up the legacy code which handled router pinging
    https://review.whamcloud.com/#/c/33453/
    Code Block
    LU-11299 lnet: modify lnd notification mechanism
        
    LND notifies when a peer is up or down. If it notifies
    LNet that the peer is up and sets the "reset" flag to true
    then this indicates to LNet that the LND knows about the health
    of the peer and is telling LNet that the peer is fully healthy.
    LNet will set the health value of the peer to maximum, otherwise
    it will increment the health by one.
        
    If the LND notifies the LNet that the peer is down, LNet will
    decrement the health of the peer by sensitivity value configured.
        
    LNet then turns around and rechecks the peer aliveness and if it's
    dead it'll notify the LND. This code is only used by the socklnd
    because it needs to teardown connections. This is in keeping with
    the original functionality.
        
    Test-Parameters: forbuildonly
    Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
    Change-Id: Ifa614405fb0c2cd4f6bcb1a2a97e856320eb6cbe
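    The health adjustment described in this commit message can be sketched as a pure function. The function name, the maximum of 1000, and the clamping at zero are assumptions made for illustration, not the actual lnet_notify() code.

```c
#include <stdbool.h>

#define SKETCH_NOTIFY_MAX_HEALTH 1000 /* assumed maximum health value */

/*
 * Return the new health value after an LND notification.
 *  - up with reset: the LND vouches for the peer, jump to maximum
 *  - up without reset: increment by one, capped at maximum
 *  - down: decrement by the configured sensitivity (clamping at 0 is
 *    an assumption of this sketch)
 */
static int sketch_notify_health(int healthv, bool alive, bool reset,
				int sensitivity)
{
	if (alive) {
		if (reset)
			return SKETCH_NOTIFY_MAX_HEALTH;
		return healthv < SKETCH_NOTIFY_MAX_HEALTH ?
		       healthv + 1 : SKETCH_NOTIFY_MAX_HEALTH;
	}

	healthv -= sensitivity;
	return healthv < 0 ? 0 : healthv;
}
```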

    Updates

    lnet_notify()

    lnet_set_healthv()

    lnet_notify_peer_down()

    gni changes

    o2iblnd changes

    socklnd changes

    https://review.whamcloud.com/#/c/33454/
    Code Block
    LU-11299 lnet: use discovery for routing
    
    Instead of re-inventing the wheel, routing now uses discovery.
    Every router interval the router is discovered. This will
    update the router information locally and will serve to let the 
    router know that the peer is alive.
    
    Test-Parameters: forbuildonly
    Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
    Change-Id: I211bf15af0b0a5d50f9e2a69a385419a1dd5096

    lnet_consolidate_routes_locked()

    lnet_peer_get_ni_locked()

    lnet_check_routers()

    lnet_router_discovery_complete()

    • Called from lnet_peer_discovery_complete() for peers which are routers
    • It checks the discovery completion code and, if it indicates an error, sets every lpni->lpni_ns_status to LNET_NI_STATUS_DOWN
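    As a rough sketch of the completion handling described above, assuming a router's interfaces are held in a plain array and a nonzero completion code means failure. All names below are hypothetical stand-ins for the real LNet structures.

```c
#include <stddef.h>

/* Hypothetical status values and interface record. */
enum { SKETCH_RTR_NI_DOWN = 0, SKETCH_RTR_NI_UP = 1 };

struct sketch_rtr_ni {
	int ns_status; /* cached status of this router interface */
};

/* On a failed discovery, mark every interface of the router as down. */
static void sketch_router_discovery_complete(struct sketch_rtr_ni *nis,
					     size_t n, int rc)
{
	size_t i;

	if (rc == 0)
		return; /* success: cached statuses stay as reported */

	for (i = 0; i < n; i++)
		nis[i].ns_status = SKETCH_RTR_NI_DOWN;
}
```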
    https://review.whamcloud.com/#/c/33188/
    Code Block
    LU-11378 lnet: MR aware gateway selection
    
    When selecting a route use the Multi-Rail Selection algorithm to
    select the best available peer_ni of the best route. The selected
    peer_ni can then be used to send the message or to discover it
    if the gateway peer needs discovering.
    
    Test-Parameters: forbuildonly
    Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
    Change-Id: I376af57611591eed2eb1edb80a1b3a68b5aefd19
    This patch modifies lib-move.c to properly select the gateway and then the gateway peer_ni to send to.
    https://review.whamcloud.com/#/c/33298/
    Code Block
    LU-11300 lnet: consider router_check_interval
    
    Consider router_check_interval when waking up the monitor thread,
    to make sure you wakeup the monitor thread at the earliest possible
    time.
    
    Test-Parameters: forbuildonly
    Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
    Change-Id: Ibc4b53886b59a9bc174a29d0da711ac77db3a62c

    The monitor thread wakes up after the minimum of:

    • lnet_recovery_interval
    • lnet_transaction_timeout / 2
    • router_check_interval

    This patch adds router_check_interval to the monitor thread wake-up calculation.
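    The wake-up calculation amounts to taking a minimum of three values. A small sketch, assuming all three are expressed in seconds as plain integers (the function name is hypothetical):

```c
/*
 * Return how long the monitor thread may sleep: the smallest of the
 * recovery interval, half the transaction timeout, and the router
 * check interval.
 */
static int sketch_monitor_wake_interval(int recovery_interval,
					int transaction_timeout,
					int router_check_interval)
{
	int interval = recovery_interval;

	if (transaction_timeout / 2 < interval)
		interval = transaction_timeout / 2;
	if (router_check_interval < interval)
		interval = router_check_interval;

	return interval;
}
```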

    https://review.whamcloud.com/#/c/33300/
    Code Block
    LU-11475 lnet: allow deleting router primary_nid
    
    Discovery doesn't allow deleting a primary_nid of a peer. This
    is necessary because upper layers only know to reach the peer by
    using the primary_nid. For routers this is not the case. So
    if a router changes its interfaces and comes back up again, the
    peer_ni should be adjusted.
    
    Test-Parameters: forbuildonly
    Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
    Change-Id: I9da056172f35a5f15eed5ba0e02fcb37ac414c54
    • Add a new bit to the peer state: LNET_PEER_RTR_NI_FORCE_DEL
    • Add a force parameter to lnet_peer_ni_del_locked(); this will be set to true in lnet_peer_merge_data()
      • for routers it's okay to delete the primary_nid because the upper layers don't really rely on it. So if we're being told that the router changed its primary_nid then it's okay to delete it.

    https://review.whamcloud.com/#/c/34539
    Code Block
    LU-11475 lnet: transfer routers
        
    When a primary NID of a peer is about to be deleted because
    it's being transferred to another peer, if that peer is a gateway
    then transfer all gateway properties to the new peer.
        
    Test-Parameters: forbuildonly
    Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
    Change-Id: Ib475c389ca5630906416a5112b3088f6f5d03950

    https://review.whamcloud.com/#/c/33301/
    Code Block
    LU-11477 lnet: handle health for incoming messages
    
    In case of routers (as well as for the general case) it's important to
    update the health of the ni/lpni for incoming messages. For an lpni
    specifically when we receive a message is when we know that the lpni
    is up.
    
    A percentage router health is required in order to send a message to a
    gateway. That defaults to 100, meaning that a router interface has to
    be absolutely healthy in order to send to it. This matches the current
    behavior. So if a router interface goes down and its health goes down
    significantly, but then it comes back up again; either we receive a
    message from it or we discover it and get a reply, then in order to
    start using that router interface again we have to boost its health
    all the way up to maximum.
    
    This behavior is special cased for routers.
    
    Test-Parameters: forbuildonly
    Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
    Change-Id: Ida6c23f95dbef56c2e6ed7b6d03743939d8b30a0

    Most of the modifications are in lnet_health_check()

    • handle incoming messages
    • Only increment the health of a peer_ni if we receive a message from it. Sending a message successfully doesn't mean the peer_ni is actually alive.
    • For router case boost the health of the peer_ni of the router to max if we receive a message from it.
      • We need to do that because if we increment it by 1 only and the percentage sensitivity is set to 100 we will not use that router interface for routing. This is to keep current routing behavior.
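    To illustrate why a router interface is boosted to maximum rather than incremented, here is a sketch under the assumption of a maximum health of 1000. The helper names are hypothetical and do not reflect the actual lnet_health_check() code.

```c
#include <stdbool.h>

#define SKETCH_RX_MAX_HEALTH 1000 /* assumed maximum health value */

/* New health after receiving a message from the peer NI. */
static int sketch_health_on_receive(int healthv, bool from_router)
{
	if (from_router)
		return SKETCH_RX_MAX_HEALTH; /* special case: boost to max */
	return healthv < SKETCH_RX_MAX_HEALTH ? healthv + 1 : healthv;
}

/* Whether an interface with this health may be used for routing. */
static bool sketch_router_usable(int healthv, int sensitivity_pct)
{
	return healthv >= SKETCH_RX_MAX_HEALTH * sensitivity_pct / 100;
}
```

    With the default sensitivity of 100, an interface at health 500 that is only incremented to 501 stays unusable, while the boosted value passes the check.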
    https://review.whamcloud.com/#/c/33304/
    Code Block
    LU-11478 lnet: misleading discovery seqno.
    
    There is a sequence number used when sending discovery messages. This
    sequence number is intended to detect stale messages. However it
    could be misleading if the peer reboots. In this case the peer's
    sequence number will reset. The node will think that all information
    being sent to it is stale, while in reality the peer might've
    changed configuration.
    
    There is no reliable way to know whether a peer rebooted, so we'll
    always assume that the messages we're receiving are valid. So we'll
    operate on a first come, first served basis.
    
    Test-Parameters: forbuildonly
    Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
    Change-Id: I421a00e47bc93ee60fa37c648d6d9a726d9def9c

    Need to pass this by Olaf Weber

    https://review.whamcloud.com/#/c/33305/
    Code Block
    LU-11470 lnet: drop all rule
    
    Add a rule to drop all messages arriving on a specific interface.
    This is useful for simulating failures on a specific router interface.
    
    Test-Parameters: forbuildonly
    Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
    Change-Id: Ic69f683fb2caf7a69a1d85428878c89b7b1ee3ad

    For testing routers we want to be able to add a rule on the router to drop all messages arriving on a given router interface from anywhere. This way we can simulate a router interface down scenario. The problem is that in the router case neither the source nor the destination is the router NID, so the rule specifies the local NID of the router.

    If the local NID is not specified, it defaults to LNET_NID_ANY. Unlike source and destination, it is not mandatory. Defaulting to LNET_NID_ANY allows the drop rule to match messages in the absence of a specified local_nid.

    A drop-all field is added, which can be set from the command line.
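    A sketch of how a rule with an optional local NID might match a message, assuming NIDs are 64-bit integers and the wildcard is the all-ones value. The struct, constant, and function names are hypothetical, and the semantics of the drop-all field are not modeled here.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NID_ANY UINT64_MAX /* hypothetical wildcard NID */

/* Hypothetical drop rule: each field may be a wildcard. */
struct sketch_drop_rule {
	uint64_t src;
	uint64_t dst;
	uint64_t local_nid; /* NID of the interface the message arrived on */
};

static bool sketch_nid_matches(uint64_t rule_nid, uint64_t nid)
{
	return rule_nid == SKETCH_NID_ANY || rule_nid == nid;
}

/* A message is dropped when source, destination and local NID all match. */
static bool sketch_rule_matches(const struct sketch_drop_rule *rule,
				uint64_t src, uint64_t dst, uint64_t local_nid)
{
	return sketch_nid_matches(rule->src, src) &&
	       sketch_nid_matches(rule->dst, dst) &&
	       sketch_nid_matches(rule->local_nid, local_nid);
}
```

    A rule with wildcard source and destination and a specific local_nid drops everything arriving on that one router interface, which is the interface-down scenario described above.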



    https://review.whamcloud.com/#/c/33620/3
    Code Block
    LU-11641 lnet: handle discovery off
    
    When discovery is turned off locally or when the peer either has
    discovery off or doesn't support MR at all then degrade discovery
    behavior to a standard ping. This will allow routers to continue
    using discovery mechanism even if it's turned off.
    
    Test-Parameters: forbuildonly
    Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
    Change-Id: I7f0829d37cbff2bf9e41de251efa715fc4c97e5d

    The original discovery behavior was that if discovery is turned off, neither a PING nor a PUSH is sent. However, this causes a problem for routers. For example, Cray needs to turn off discovery because it interferes with their routing setup. To handle this, we changed the behavior for the discovery-off case: discovery will always PING when requested. When the PING REPLY arrives, it updates the existing peer_nis but does not add or move peer_nis among peers. This is needed to update the router peer_ni statuses on the nodes.

    Most of the changes are spread through peer.c to change what we do when discovery on the peer is off or when it is turned off locally.
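    The discovery-off merge rule (update known peer_nis, never add new ones) could look roughly like this. The record layout and function name are hypothetical; the real logic is spread through peer.c as noted above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one peer NI and its reported status. */
struct sketch_ni_entry {
	uint64_t nid;
	int status;
};

/*
 * Apply a PING REPLY while discovery is off: refresh the status of
 * NIDs we already know about and ignore any NID we do not know.
 * Returns the number of entries updated.
 */
static size_t sketch_merge_discovery_off(struct sketch_ni_entry *known,
					 size_t n_known,
					 const struct sketch_ni_entry *reply,
					 size_t n_reply)
{
	size_t updated = 0;
	size_t i, j;

	for (i = 0; i < n_reply; i++) {
		for (j = 0; j < n_known; j++) {
			if (known[j].nid == reply[i].nid) {
				known[j].status = reply[i].status;
				updated++;
				break;
			}
		}
	}
	return updated;
}
```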

    https://review.whamcloud.com/#/c/33634/2
    Code Block
    LU-11297 lnet: handle router health off
    
    Routing infrastructure depends on health infrastructure to manage
    route status. However, health can be turned off. Therefore, we need
    to enable health for gateways in order to monitor them properly.
    Each peer now has its own health sensitivity. When adding a route
    the gateway's health sensitivity can be explicitly set from lnetctl
    or if not specified then it'll default to 1, thereby turning health
    on for that gateway, allowing peer NI recovery if there is a failure.
    
    Test-Parameters: forbuildonly
    Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
    Change-Id: Ibae33d595e97d0eec432ae8f5d51898ce0776f01
    
    The way router health is decided is via the health mechanism. This presents a problem since health is turned off by default. This patch enables it for all routers.
    https://review.whamcloud.com/#/c/33635/2
    Code Block
    LU-11297 lnet: set gw sensitivity from lnetctl
    
    Allow an optional parameter from the:
    lnetctl route add
    command to set the health sensitivity of the gateway
    lnetctl route add --net <net> --gateway <gw> --sensitivity <value>
    
    Test-Parameters: forbuildonly
    Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
    Change-Id: Iee120c78a41b79da6ab6bdf1560f558df89233e2
    Allow configuring router sensitivity value when adding a route. It's an optional parameter and defaults to 1 if not specified.
    https://review.whamcloud.com/#/c/33651/
    Code Block
    LU-11664 lnet: push router interface updates
    
    A router can bring up/down its interfaces if it hasn't received any
    messages on that interface for a configurable period
    (alive_router_ping_timeout). When this event occurs the router can now
    push its status change to the peers it's talking to in order to inform
    them of the change in its status. This will allow the router users to
    handle asym router failures quicker.
    
    Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
    Change-Id: I9530ed7d9bc0a86edc43e3f610cc943f1732dcfd
    To allow for faster recovery when a router's interfaces go down because it hasn't received a ping, a push with the updated information is sent to all the peers. This allows peers to stop using these routers if asym router failure is enabled (it is by default).
    https://review.whamcloud.com/#/c/34510
    Code Block
    LU-11299 lnet: net aliveness
        
    If a router is discovered on any interface on the network, then
    update the network last alive time and the NI's status to UP.
    If a router isn't discovered on any interface on a network,
    then change the status of all the interfaces on that network to down.
        
    Test-Parameters: forbuildonly
    Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
    Change-Id: I1d67eb4b3284ccb8306ad4c877a2fcbdf4958d8c

    https://review.whamcloud.com/#/c/34511/
    Code Block
    LU-11299 lnet: discover each gateway Net
        
    Wakeup every gateway aliveness interval / number of local networks.
    Discover each local gateway network in round robin.
        
    This is done to make sure the gateway keeps its networks up.
        
    Test-Parameters: forbuildonly
    Signed-off-by: Amir Shehat <ashehata@whamcloud.com>
    Change-Id: I4035e39c286cb599d4eb8f9df7ed5d278e6d744a
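    The scheduling described in this commit message can be sketched as below, assuming the aliveness interval is an integer number of seconds and local networks are identified by index. The struct and function names are hypothetical.

```c
/* Hypothetical round-robin state over the local gateway networks. */
struct sketch_gw_disc {
	int n_local_nets; /* number of local networks, must be > 0 */
	int next_net;     /* index of the network to discover next */
};

/* Wake up every (aliveness interval / number of local networks). */
static int sketch_gw_wake_interval(int alive_interval, int n_local_nets)
{
	if (n_local_nets <= 0)
		return alive_interval;
	return alive_interval / n_local_nets;
}

/* Return the network to discover on this wake-up and advance the cursor. */
static int sketch_gw_next_net(struct sketch_gw_disc *d)
{
	int cur = d->next_net;

	d->next_net = (d->next_net + 1) % d->n_local_nets;
	return cur;
}
```

    Dividing the interval by the number of networks means each individual network is still discovered once per aliveness interval.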

    https://review.whamcloud.com/#/c/34625/
    Code Block
    LU-12053 lnet: look up MR peers routes
        
    An MR peer can have multiple interfaces some of which we might
    have a route to. The primary NID of the peer might not necessarily
    specify a NID we have a route to. When looking up a route, we must
    iterate over all the nets the peer is on and select the one which
    we can route to. Taking into consideration the peer can exist on
    multiple routed networks we also have a simple round robin algorithm
    to iterate over all the networks we can reach the peer on.
        
    Test-Parameters: forbuildonly
    Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
    Change-Id: I0651dd4f732c8b71872f73cf2512b08f34129bd9
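    A sketch of the round-robin lookup over a Multi-Rail peer's networks, assuming networks are plain integer identifiers and the set of routable networks is given as an array. The names are hypothetical and the real lookup works on LNet's route tables.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of an MR peer: the nets it is on plus a cursor. */
struct sketch_mr_peer {
	const unsigned int *nets;
	size_t n_nets;
	size_t rr_next; /* where the next lookup starts */
};

static bool sketch_have_route(unsigned int net, const unsigned int *routed,
			      size_t n_routed)
{
	size_t i;

	for (i = 0; i < n_routed; i++)
		if (routed[i] == net)
			return true;
	return false;
}

/*
 * Pick the next peer net we have a route to, starting from the
 * round-robin cursor. Returns false if none of the nets is routable.
 */
static bool sketch_pick_routed_net(struct sketch_mr_peer *p,
				   const unsigned int *routed,
				   size_t n_routed, unsigned int *out)
{
	size_t i;

	for (i = 0; i < p->n_nets; i++) {
		size_t idx = (p->rr_next + i) % p->n_nets;

		if (sketch_have_route(p->nets[idx], routed, n_routed)) {
			*out = p->nets[idx];
			p->rr_next = (idx + 1) % p->n_nets;
			return true;
		}
	}
	return false;
}
```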

    https://review.whamcloud.com/#/c/34772/
    Code Block
    LU-12200 lnet: check peer timeout on a router
    
    On a router assume that a peer is alive and attempt to send it
    messages as long as the peer_timeout hasn't expired.
        
    Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
    Change-Id: I0806a52c8ad7acc1c93dcf32353f1c4467c618b1
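    The peer timeout check can be pictured as a simple time comparison. This sketch assumes timestamps in seconds and treats the timeout as expired once the elapsed time reaches peer_timeout; the function name and the exact boundary are assumptions.

```c
#include <stdbool.h>

/*
 * On a router, keep treating the peer as alive (and keep sending to it)
 * until peer_timeout seconds have passed since it was last known alive.
 */
static bool sketch_router_peer_assumed_alive(long now, long last_alive,
					     long peer_timeout)
{
	return now - last_alive < peer_timeout;
}
```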