...

  • LNet shall not be responsible for end-to-end message reliability
  • This feature will not add any health functionality to LNDs other than o2iblnd and socklnd. Other LNDs, such as gnilnd, will not be modified.

Sign-off


Key Milestones and Deliverables

...

Each requirement will be in one of the following statuses:

Status | Description
ACCEPTED | Requirement has been reviewed and accepted for implementation.
IN-PROGRESS | Requirement is being reviewed.
REJECTED | Requirement has been reviewed and rejected for implementation. It will not be covered in the HLD or the implementation.


Terms

Term | Description
SHALL | This word, or the terms "REQUIRED" or "MUST", mean that the definition is an absolute requirement of the specification.
SHALL NOT | This phrase, or the phrase "MUST NOT", mean that the definition is an absolute prohibition of the specification.
SHOULD | This word, or the adjective "RECOMMENDED", mean that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course.
SHOULD NOT | This phrase, or the phrase "NOT RECOMMENDED", mean that there may exist valid reasons in particular circumstances when the particular behavior is acceptable or even useful, but the full implications should be understood and the case carefully weighed before implementing any behavior described with this label.

...

ID | Class | Version | Status | Description

lnt-005 | REQUIRED | 1.0 | ACCEPTED | LNet shall maintain a health value per local NI.

lnt-010 | REQUIRED | 1.0 | ACCEPTED | LNet shall maintain a health value per peer NI.

lnt-015 | REQUIRED | 1.0 | ACCEPTED | The health value is a non-negative number in the range 0 to 1000, inclusive.

This range is chosen to allow enough granularity for decrementing and incrementing the health value.

lnt-020 | REQUIRED | 1.0 | ACCEPTED | LNet shall decrement the health value of an NI by the configured health sensitivity value whenever there is an error sending a message over or to the NI. The health value shall not be less than 0.

lnt-025 | REQUIRED | 1.0 | ACCEPTED | LNet shall increment the health value, but not beyond 1000, which represents a healthy NI.
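The bounds in lnt-015 through lnt-025 amount to a clamped counter. The following is a minimal sketch, not the LNet implementation: the function names, the constant names, and the default increment step of 1 (taken from the recovery example in lnt-035) are assumptions made for illustration.

```python
# Hypothetical names; only the 0..1000 range comes from the requirements.
LNET_MIN_HEALTH = 0
LNET_MAX_HEALTH = 1000  # represents a fully healthy NI


def decrement_health(health: int, sensitivity: int) -> int:
    """Lower the health value after a send error, never below 0."""
    return max(LNET_MIN_HEALTH, health - sensitivity)


def increment_health(health: int, step: int = 1) -> int:
    """Raise the health value after a success, never above 1000."""
    return min(LNET_MAX_HEALTH, health + step)
```

For example, with a health sensitivity of 100, a healthy NI drops to 900 after one error, and an NI already at 50 drops to 0 rather than going negative.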

lnt-030 | REQUIRED | 1.0 | ACCEPTED | LNet shall determine the NI to select based on the following ordered criteria:

  1. NI health
  2. NUMA closeness
  3. NI available credits
  4. Round Robin
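One way to read the ordered criteria above is as a lexicographic comparison, where a later criterion only breaks ties left by the earlier ones. The sketch below assumes that reading; the `NI` record, its field names, and the use of a sequence number for round robin are all hypothetical and are not taken from the LNet source.

```python
from dataclasses import dataclass


@dataclass
class NI:
    name: str
    health: int         # 0..1000, higher is better
    numa_distance: int  # lower means closer to the caller's NUMA node
    credits: int        # available credits, higher is better
    seq: int = 0        # round-robin sequence, bumped on each selection


def select_ni(candidates):
    """Pick the best NI: health, then NUMA closeness, then credits, then round robin."""
    best = min(
        candidates,
        key=lambda ni: (-ni.health, ni.numa_distance, -ni.credits, ni.seq),
    )
    # Bump the sequence so an otherwise equal NI is preferred next time.
    best.seq = max(ni.seq for ni in candidates) + 1
    return best
```

With two otherwise identical NIs, successive calls alternate between them; an NI with lower health loses even if it is NUMA-closer.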

lnt-035 | REQUIRED | 1.0 | ACCEPTED | When LNet fails to send on a local NI or to a remote NI, it shall place that NI on a recovery queue. The NI shall be pinged, or used for a ping, periodically to determine whether it has recovered. For an interface to be considered fully healthy, a number of pings equal to 1000 minus the current health value must pass sequentially.

Example: if the NI's health value is 900, then 100 pings using that NI must be successful in order for the NI to be considered fully healthy. Each successful send will increment the NI's health value by 1.
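The arithmetic in the example can be sketched as follows. The function names are hypothetical, and the handling of a failed recovery ping (the health value is simply left unchanged here) is an assumption, since the requirement does not state what a failed ping does.

```python
MAX_HEALTH = 1000


def pings_to_full_health(health: int) -> int:
    """Number of successful pings still needed to reach full health."""
    return MAX_HEALTH - health


def run_recovery(health: int, ping_results) -> int:
    """Apply a sequence of ping outcomes (True = success) to an NI on the recovery queue."""
    for ok in ping_results:
        if health >= MAX_HEALTH:
            break  # fully healthy: the NI would leave the recovery queue
        if ok:
            health += 1  # each successful send adds 1
        # Assumption: a failed ping leaves the health value unchanged.
    return health
```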

lnt-040 | REQUIRED | 1.0 | ACCEPTED | On a local, remote, or network timeout, LNet shall reselect a pair of local and peer NIs to resend the message.

  • Another option is to be more granular when selecting the interfaces, depending on the timeout that occurred: on a local timeout, re-select only a different local NI; on a remote timeout, re-select a peer NI; on a network timeout, select a new pair of local and peer NIs.
  • This has the disadvantage of a more complicated implementation.
  • There isn't a clear advantage to making the selection more granular.
  • Since the health value of the NI in question is decremented based on the error encountered, the selection algorithm will already be less likely to pick the NI with poor health.

lnt-045 | REQUIRED | 1.0 | ACCEPTED | On any type of timeout, if the peer is non-MR capable, LNet shall retransmit the message from the same local NI.

lnt-050 | DESIRED | 1.0 | ACCEPTED | On routers, LNet shall re-transmit a message over any of the router's local NIs.

Routers are a special case because non-MR peers expect the same source NID of the final destination but do not care about the router NIDs. The router NIDs are not passed up to ptlrpc or other ULPs.

lnt-055 | REQUIRED | 1.0 | ACCEPTED | LNet shall not attempt to resend a message on the following failure types:

  1. Shutdown in progress
  2. Out of memory
  3. Discovery errors out with one of the errors on this list
  4. An MD bind failure
    1. -EINVAL
    2. -EHOSTUNREACH
  5. Invalid information given
  6. Internal failure

The assumption is that any resend would encounter the same failure again, so the failure is left to the upper layers to handle.
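The list above can be read as a classification of failures into retryable and non-retryable. The sketch below is one possible encoding under that reading. The enum, its member names, and the three timeout members are hypothetical labels for illustration and do not correspond to identifiers in the LNet source.

```python
from enum import Enum, auto


class SendFailure(Enum):
    SHUTDOWN = auto()
    OUT_OF_MEMORY = auto()
    DISCOVERY_ERROR = auto()
    MD_BIND_FAILURE = auto()
    INVALID_INFO = auto()
    INTERNAL_FAILURE = auto()
    LOCAL_TIMEOUT = auto()
    REMOTE_TIMEOUT = auto()
    NETWORK_TIMEOUT = auto()


# Failures for which a resend is assumed to hit the same error again.
NON_RETRYABLE = frozenset({
    SendFailure.SHUTDOWN,
    SendFailure.OUT_OF_MEMORY,
    SendFailure.MD_BIND_FAILURE,
    SendFailure.INVALID_INFO,
    SendFailure.INTERNAL_FAILURE,
})


def should_resend(failure, discovery_cause=None):
    """Return True if a resend may be attempted for this failure."""
    if failure is SendFailure.DISCOVERY_ERROR:
        # Item 3: a discovery error blocks resends only when its
        # underlying cause is itself one of the listed failures.
        return discovery_cause not in NON_RETRYABLE
    return failure not in NON_RETRYABLE
```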

lnt-060 | REQUIRED | 1.0 | ACCEPTED | LNet shall re-transmit messages no more than the retry count specified by the user.

lnt-065 | REQUIRED | 1.0 | ACCEPTED | LNet shall stop re-transmitting when one of the following criteria is satisfied:

  1. Message is sent successfully
  2. Retry count is reached
  3. Transaction timeout expires
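The three stop criteria can be sketched as a loop. This is only an illustration: the function name, the `send_once` callback, the returned labels, and the injectable `clock` parameter are assumptions, and the sketch treats the retry count as the number of re-transmissions allowed after the initial send.

```python
import time


def send_with_retries(send_once, retry_count, transaction_timeout,
                      clock=time.monotonic):
    """Send, then re-transmit until success, retry exhaustion, or timeout.

    Returns (outcome, attempts), where attempts counts the initial send.
    """
    deadline = clock() + transaction_timeout
    attempts = 0
    while True:
        attempts += 1
        if send_once():
            return "sent", attempts                 # criterion 1
        if attempts > retry_count:
            return "retry_count_reached", attempts  # criterion 2
        if clock() >= deadline:
            return "timeout", attempts              # criterion 3
```

With a retry count of 0 the loop performs only the initial send, which matches the requirement that no re-transmission is attempted in that case.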

lnt-070 | REQUIRED | 1.0 | ACCEPTED | LNet shall default the transaction timeout to 5 seconds.

lnt-075 | REQUIRED | 1.0 | ACCEPTED | LNet shall time out a message and send a failure event to the ULP if the message is not re-transmitted successfully.

lnt-080 | REQUIRED | 1.0 | ACCEPTED | LNet shall calculate the message timeout based on the ULP-provided timeout if one is provided, or on the configured transaction timeout otherwise.

message timeout = transaction timeout / number of retries
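A small worked version of this formula, as a sketch: the function and parameter names are hypothetical, and the handling of a retry count of 0 (returning the whole transaction timeout, to avoid dividing by zero) is an assumption that the requirement does not spell out.

```python
DEFAULT_TRANSACTION_TIMEOUT = 5.0  # seconds, the default from lnt-070


def message_timeout(retry_count, ulp_timeout=None,
                    configured_timeout=DEFAULT_TRANSACTION_TIMEOUT):
    """Per-message timeout derived from the transaction timeout."""
    # Prefer the ULP-provided timeout when one is given.
    transaction_timeout = (
        ulp_timeout if ulp_timeout is not None else configured_timeout
    )
    if retry_count <= 0:
        # Assumption: with no retries, one message gets the whole budget.
        return transaction_timeout
    return transaction_timeout / retry_count
```

For example, with the default 5-second transaction timeout and 5 retries, each message gets 1 second.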

lnt-085 | REQUIRED | 1.0 | ACCEPTED | LNet shall pass the message timeout to the LND and will rely on the LND to enforce the timeout. If the LND times out the message, it will notify the LNet layer, which will attempt to re-transmit the message.

lnt-090 | REQUIRED | 1.0 | ACCEPTED | LNet shall not attempt to re-transmit if the retry count is set to 0.

lnt-095 | REQUIRED | 1.0 | ACCEPTED | LNet shall monitor the ACK/REPLY for a PUT/GET. It will send a timeout event for a PUT or a GET if the respective ACK/REPLY is not received within the transaction timeout.

lnt-100 | REQUIRED | 1.0 | ACCEPTED | LNet shall allow the callers of LNetGet() or LNetPut() to specify a transaction timeout different from the one configured in the system.

  • Example: lnetctl ping can specify a shorter timeout than ptlrpc.

lnt-105 | REQUIRED | 1.0 | ACCEPTED | LNet shall activate the transaction timeout only after a PUT which requires an ACK, or a GET which requires a REPLY, is successfully passed to the LND (i.e., lnd_send() returns successfully).

For a PUT which requires no ACK, no timeout will be activated.
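The activation rule can be sketched as a single predicate. The function name, the string message types, and the convention that a return code of 0 means lnd_send() succeeded are assumptions made for this illustration.

```python
def needs_transaction_timer(msg_type, wants_ack, lnd_send_rc):
    """Decide whether to arm the transaction timeout for a message."""
    if lnd_send_rc != 0:
        # The message was never handed to the LND successfully.
        return False
    if msg_type == "GET":
        return True       # a GET waits for its REPLY
    if msg_type == "PUT":
        return wants_ack  # only a PUT that requires an ACK is monitored
    return False
```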

lnt-110 | DESIRED | 1.0 | IN-PROGRESS | LNet shall use UDEV events to propagate errors detected on a local or peer NI.

lnt-115 | DESIRED | 1.0 | IN-PROGRESS | LNet shall handle flapping of interfaces and will favor a flapping interface less during selection.

Statistics Requirements

ID | Class | Version | Status | Description

stt-005 | REQUIRED | 1.0 | ACCEPTED | LNet shall maintain the number of resends due to a local timeout per local NI.

stt-010 | REQUIRED | 1.0 | ACCEPTED | LNet shall maintain the number of resends due to a remote timeout per peer NI.

stt-015 | REQUIRED | 1.0 | ACCEPTED | LNet shall maintain the number of resends due to a network timeout per local and peer NI.

stt-020 | DESIRED | 1.0 | ACCEPTED | LNet shall maintain the number of local interface down events.

stt-025 | DESIRED | 1.0 | ACCEPTED | LNet shall maintain the number of local interface up events.

stt-030 | DESIRED | 1.0 | ACCEPTED | LNet shall maintain the average time it takes to successfully send a message per peer NI.

stt-035 | DESIRED | 1.0 | ACCEPTED | LNet shall maintain the average time it takes to successfully complete a transaction per peer NI.

stt-040 | DESIRED | 1.0 | IN-PROGRESS | LNet shall provide a method to reset statistics.
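As an illustration of the per-peer-NI statistics listed above, the sketch below keeps resend counters and a running average send time that can be updated without storing every sample. The class, its field names, and the `reset` method are hypothetical; the requirements do not prescribe a data layout or how the average is computed.

```python
from dataclasses import dataclass


@dataclass
class PeerNIStats:
    remote_timeout_resends: int = 0   # stt-010
    network_timeout_resends: int = 0  # stt-015
    sends: int = 0
    avg_send_time: float = 0.0        # stt-030, in seconds

    def record_send(self, elapsed: float) -> None:
        """Fold one successful send time into the running average."""
        self.sends += 1
        self.avg_send_time += (elapsed - self.avg_send_time) / self.sends

    def reset(self) -> None:
        """Clear all counters (stt-040)."""
        self.remote_timeout_resends = 0
        self.network_timeout_resends = 0
        self.sends = 0
        self.avg_send_time = 0.0
```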

...