Overview

LNet is a virtual networking layer which allows Lustre nodes to communicate with each other.

System Level Overview

System Diagram

LNetSystemDiagram

  • lnetctl: User space utility used to configure and query the LNet kernel module
  • DLC Library: User space library which communicates with the LNet kernel module, primarily via IOCTL
  • LNet IOCTL: Module which handles the IOCTLs and calls the appropriate callbacks in the LNet kernel module
  • PTLRPC: Kernel module which implements an RPC protocol. It's the primary user of LNet.
  • LNet: Kernel module which implements the Lustre Networking communication protocol
  • o2iblnd: Verbs driver. It executes RDMA operations via verbs
  • socklnd: TCP/IP driver. It sends/receives TCP messages
  • gnilnd: Cray/HPE driver not maintained by us

Block Level Overview

LNet Block Level Diagram

LNetBlockDiagram

The diagram above represents the different functional blocks in LNet. A quick overview will help in understanding the code:

  • When LNet starts up, it reads various module parameters and configures itself based on these values.
  • Further configuration can be added dynamically via the lnetctl utility.
  • The main APIs to request LNet to send messages are LNetPut() and LNetGet().
    • When a message is sent or received, a peer block is created to track messages to and from that peer.
  • When sending messages, LNet has to select the local and remote interfaces (i.e., the path the message will traverse to reach its destination). It does so through the selection algorithm.
    • In that process it selects the local network interfaces and remote network interfaces for the destination peer.
  • Each peer has its own set of credits used to rate limit messages to it. LNet checks and manages these credits before sending the message.
    • When a message is sent a credit is consumed.
    • When a message is received a credit is returned.
  • If the destination peer is not on the same network as the node, LNet looks up a route to the final destination. If no route is present, the message cannot be sent.
  • If a node is acting as a router, it can receive messages for which it is not the final destination. It can then forward these messages to the final destination.
    • When a received message is to be forwarded, a router buffer is used to receive the message data. Router buffers have their own credits.
  • A fault injection module can be activated for testing. That module will simulate message send/receive failures.
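
The peer tracking, credit accounting, and route lookup described above can be illustrated with a small toy model. This is only a sketch, not LNet code: the names ToyPeer, ToyNode, net_of, and credit_returned are hypothetical, the starting credit count of 8 is an arbitrary value chosen for the example, and queuing a message when no credits are left is an assumption of this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


def net_of(nid: str) -> str:
    """Return the network part of a NID such as '10.0.0.2@tcp'."""
    return nid.split("@", 1)[1]


@dataclass
class ToyPeer:
    """Hypothetical stand-in for a peer block."""
    nid: str
    credits: int = 8  # arbitrary value for this sketch, not an LNet default
    queued: List[str] = field(default_factory=list)


class ToyNode:
    """Toy model of the send path: route lookup, then credit check."""

    def __init__(self, local_nets, routes=None):
        self.local_nets = set(local_nets)
        # Maps a remote network name to the NID of a gateway (router).
        self.routes: Dict[str, str] = dict(routes or {})
        self.peers: Dict[str, ToyPeer] = {}

    def _peer(self, nid: str) -> ToyPeer:
        # A peer block is created the first time a peer is seen.
        return self.peers.setdefault(nid, ToyPeer(nid))

    def next_hop(self, dest_nid: str) -> Optional[str]:
        # Same network: send directly. Otherwise a route is required.
        net = net_of(dest_nid)
        if net in self.local_nets:
            return dest_nid
        return self.routes.get(net)

    def send(self, dest_nid: str, msg: str) -> str:
        hop = self.next_hop(dest_nid)
        if hop is None:
            return "no-route"  # no route present: message cannot be sent
        peer = self._peer(hop)
        if peer.credits == 0:
            peer.queued.append(msg)  # assumption: wait for a credit
            return "queued"
        peer.credits -= 1  # sending consumes a credit
        return "sent"

    def credit_returned(self, nid: str) -> Optional[str]:
        """Return a credit; a queued message, if any, reuses it at once."""
        peer = self._peer(nid)
        if peer.queued:
            return peer.queued.pop(0)
        peer.credits += 1
        return None
```

For example, a node built as ToyNode(["tcp"], routes={"o2ib": "10.0.0.1@tcp"}) sends directly to peers on tcp, sends traffic for o2ib destinations to the gateway 10.0.0.1@tcp, and reports "no-route" for any other network.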

Useful Documentation


LNet Source Directory
  • lustre-release/lnet 
    • Top LNet directory
  • include 
    • lnet
      • Internal includes
    • uapi 
      • Includes used by user space and other kernel modules
  • klnds 
    • gnilnd 
      • Cray LNet Driver (LND). Developed and tested by Cray/HPE
    • o2iblnd 
      • IB LND used by Mellanox and Intel OmniPath. It uses the Verbs API
      • Only uses IPoIB for connection establishment
    • socklnd
      • Socket LND used for Ethernet interfaces. It uses TCP/IP
  • lnet 
    • LNet kernel source directory
  • selftest 
    • LNet selftest tool. Generates RDMA traffic. Runs in the kernel
  • utils 
    • User space tools including lnetctl and liblnetconfig.
    • lnetctl is a CLI used to configure LNet
    • liblnetconfig is the library used by lnetctl to communicate with the LNet kernel module

Presentations


Tasks


Medium
  • LNet Router Testing
    • We need to expand our testing of LNet. The link above lists a set of routing tests. We need to write LUTF scripts for them.
      • Benefits:
        • Learn how to configure LNet routers
        • Learn how to use the LUTF
        • Learn how to test LNet
        • Learn the code
  • LU-12041