Purpose
This document provides guidelines for troubleshooting common LNet issues seen in the field. The goal is that non-developers should be able to diagnose most such problems and, when a solution cannot be found on the spot, give developers enough detail for a timely investigation.
Overview
From Lustre's point of view, LNet is the layer that abstracts the physical network. The LNet module is an "umbrella" module over a number of LNDs (Lustre Network Drivers), each of which handles a specific network type.
Because Lustre is a distributed file system, many errors in the system can look like a "network error", and it is not always trivial to determine whether LNet is actually at fault.
Topology
It is important to know how the customer is using the network with Lustre.