...

Code Block
#ifdef HAVE_IB_CQ_INIT_ATTR
	cq_attr.cqe = IBLND_CQ_ENTRIES(conn);
	cq_attr.comp_vector = kiblnd_get_completion_vector(conn, cpt);
	cq = ib_create_cq(cmid->device,
			  kiblnd_cq_completion, kiblnd_cq_event, conn,
			  &cq_attr);
#else
	cq = ib_create_cq(cmid->device,
			  kiblnd_cq_completion, kiblnd_cq_event, conn,
			  IBLND_CQ_ENTRIES(conn),
			  kiblnd_get_completion_vector(conn, cpt));
#endif

	/* ... */

	rc = ib_req_notify_cq(cq, IB_CQ_NEXT_COMP);

	/* ... */

	init_qp_attr->event_handler = kiblnd_qp_event;
	init_qp_attr->qp_context = conn;
	init_qp_attr->cap.max_send_sge = *kiblnd_tunables.kib_wrq_sge;
	init_qp_attr->cap.max_recv_sge = 1;
	init_qp_attr->sq_sig_type = IB_SIGNAL_REQ_WR;
	init_qp_attr->qp_type = IB_QPT_RC;
	init_qp_attr->send_cq = cq;
	init_qp_attr->recv_cq = cq;

	conn->ibc_sched = sched;

	do {
		init_qp_attr->cap.max_send_wr = kiblnd_send_wrs(conn);
		init_qp_attr->cap.max_recv_wr = IBLND_RECV_WRS(conn);

		rc = rdma_create_qp(cmid, conn->ibc_hdev->ibh_pd, init_qp_attr);
		if (!rc || conn->ibc_queue_depth < 2)
			break;

		conn->ibc_queue_depth--;
	} while (rc);
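
The retry loop at the end of the excerpt above shrinks the queue depth by one each time QP creation fails, and gives up once the depth drops below 2. The following is a minimal user-space sketch of that back-off pattern only. `fake_conn`, `fake_create_qp()`, `FAKE_WRS_PER_SLOT` and the device limit are hypothetical stand-ins introduced for illustration; they are not the o2iblnd types, and the real work-request sizing is done by `kiblnd_send_wrs()`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the connection; only the queue depth matters here. */
struct fake_conn {
	int queue_depth;	/* plays the role of conn->ibc_queue_depth */
};

/* Assumed sizing rule for this sketch: a fixed number of send WRs per queue slot. */
enum { FAKE_WRS_PER_SLOT = 4 };

/* Mock QP creation: fails when the request exceeds the (made-up) device limit. */
static int fake_create_qp(int max_send_wr, int device_max_wr)
{
	return max_send_wr <= device_max_wr ? 0 : -ENOMEM;
}

/*
 * Retry QP creation, reducing the queue depth by one after each failure.
 * Returns 0 on success, or the last error once the depth cannot shrink further.
 */
static int create_qp_with_backoff(struct fake_conn *conn, int device_max_wr)
{
	int rc;

	assert(conn != NULL);

	do {
		int max_send_wr = conn->queue_depth * FAKE_WRS_PER_SLOT;

		rc = fake_create_qp(max_send_wr, device_max_wr);
		if (!rc || conn->queue_depth < 2)
			break;

		conn->queue_depth--;
	} while (rc);

	return rc;
}
```

With a starting depth of 8 and a device limit of 20 work requests, this sketch settles on a depth of 5. The caller has to read the depth back from the connection afterwards, since it may be lower than what was originally requested.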

The LND has its own protocol, in which messages are exchanged to determine the size of the RDMA that is about to happen. Once the size is determined, we initialize the RDMA operation.

Code Block
wrq->wr.next       = &(wrq + 1)->wr;
wrq->wr.wr_id      = kiblnd_ptr2wreqid(tx, IBLND_WID_RDMA);
wrq->wr.sg_list    = sge;
wrq->wr.opcode     = IB_WR_RDMA_WRITE;
wrq->wr.send_flags = 0;

/* See kiblnd_init_rdma() for more details. */
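
The assignment `wrq->wr.next = &(wrq + 1)->wr` links each work request to the next slot of an array, so the whole transfer can be posted as one chain. Below is a rough, self-contained sketch of that chaining idea. The `fake_send_wr` structure and `chain_rdma_writes()` are hypothetical simplifications, not the verbs or o2iblnd definitions, and terminating the chain with a single send is an assumption made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the verbs work-request types. */
enum fake_opcode { FAKE_WR_RDMA_WRITE, FAKE_WR_SEND };

struct fake_send_wr {
	struct fake_send_wr *next;
	uint64_t wr_id;
	enum fake_opcode opcode;
	int send_flags;
};

/*
 * Link n RDMA write requests, each pointing at the next array slot, and
 * terminate the chain with one final send (assumed here to be the
 * completion message). The array must hold n + 1 entries.
 * Returns the head of the chain.
 */
static struct fake_send_wr *chain_rdma_writes(struct fake_send_wr *wrs,
					      size_t n, uint64_t id)
{
	size_t i;

	assert(wrs != NULL);

	for (i = 0; i < n; i++) {
		wrs[i].next = &wrs[i + 1];	/* same idea as &(wrq + 1)->wr */
		wrs[i].wr_id = id;
		wrs[i].opcode = FAKE_WR_RDMA_WRITE;
		wrs[i].send_flags = 0;
	}

	/* The last entry ends the chain. */
	wrs[n].next = NULL;
	wrs[n].wr_id = id;
	wrs[n].opcode = FAKE_WR_SEND;
	wrs[n].send_flags = 0;

	return &wrs[0];
}

/* Walk the chain and count its entries. */
static size_t chain_length(const struct fake_send_wr *wr)
{
	size_t len = 0;

	for (; wr != NULL; wr = wr->next)
		len++;
	return len;
}
```

Because every entry sits in one contiguous array, the chain needs no separate allocation per request, and only the head pointer has to be handed to the post call.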

We then post the RDMA work request on the QP.

Code Block
rc = ib_post_send(conn->ibc_cmid->qp, wr, &bad);

Note that the LND never does an RDMA read. It only does RDMA writes. This is due to historical limitations, which might no longer apply with the latest technology.

Passive Connection Establishment

...

When an LND connection is created, a number of buffers, each 4K in size, are posted to the QP to receive incoming RDMAs. Receiving and sending messages in the LND is governed by a credit system to ensure the peers don't overflow the buffers on the QP.
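
The credit idea can be illustrated with a small, generic sketch of credit-based flow control. This is not the o2iblnd implementation: the `credit_conn` structure, its field names, and the rule that returned credits ride along with the next outgoing message are all assumptions chosen for illustration. Each credit stands for one receive buffer posted by the peer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-connection credit state; names are illustrative only. */
struct credit_conn {
	int credits;			/* sends we may still post to the peer */
	int outstanding_credits;	/* buffers reposted, not yet advertised to the peer */
};

/*
 * Try to send one message. Consumes one credit and hands every credit we
 * owe the peer back through *returned. Fails without side effects when no
 * credit is left, so the message must wait.
 */
static bool credit_try_send(struct credit_conn *c, int *returned)
{
	assert(c != NULL && returned != NULL);

	if (c->credits == 0)
		return false;

	c->credits--;
	*returned = c->outstanding_credits;
	c->outstanding_credits = 0;
	return true;
}

/*
 * Handle one incoming message: collect the credits the peer returned, and
 * note that we now owe the peer one credit for the buffer we repost.
 */
static void credit_on_receive(struct credit_conn *c, int returned)
{
	assert(c != NULL);

	c->credits += returned;
	c->outstanding_credits++;
}
```

In this sketch a sender that starts with two credits can post two messages and is then blocked until the peer sends something back that carries the returned credits, so it can never have more messages in flight than the peer has buffers posted.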

Notes on RDMA and QP Timeouts

Attached file: RDMA_timeouts_last.pdf