This page documents reference counter strategy used with o2iblnd connections:
- In kiblnd_create_conn()
- atomic_set(&conn->ibc_refcount, 1 + IBLND_RX_MSGS(conn));
- 1 ref for caller and 1 ref for each rxmsg
- The 1 ref for caller is decremented when the connection fails:
- kiblnd_passive_connect() if rdma_accept() fails
- kiblnd_active_connect() if rdma_connect() fails
- kiblnd_cm_callback() in most of the failure cases:
- RDMA_CM_EVENT_UNREACHABLE
- RDMA_CM_EVENT_CONNECT_ERROR
- RDMA_CM_EVENT_REJECTED
- RDMA_CM_EVENT_DISCONNECTED
- atomic_set(&conn->ibc_refcount, 1 + IBLND_RX_MSGS(conn));
- In kiblnd_post_rx(). When an rx buffer is posted using ib_post_recv()
- This is needed because another thread could destroy the connection.
kiblnd_conn_decref()
is called at the end of the function.- kiblnd_drop_rx() also calls kiblnd_conn_decref() if the ib_post_recv() fails
- In kiblnd_queue_tx_locked(): When a new tx is queued on the connection. That means each tx on that connection grabs a refcount.
- This refcount remains until the call to kiblnd_tx_done() which finalizes the tx.
- Any txs that are queued but not posted can be removed when the connection is completed/aborted.
- kiblnd_conn_decref() is called to remove the refcount used by the tx.
- txs which are sent (ibc_sent > 0) the code waits for those to complete before completing the tx.
- This means that a connection can hang around for a while longer (not sure how long, maybe indefinitely) if the communication stack has a bug.
- Any txs that are queued but not posted can be removed when the connection is completed/aborted.
- This refcount remains until the call to kiblnd_tx_done() which finalizes the tx.
- In kiblnd_launch_tx()
- function grabs the refcount to be able to unlock the g_lock and queue the tx using kiblnd_queue_tx()
- It calls kiblnd_conn_decref() after it's done with the connection.
- This kiblnd_conn_addref()/kiblnd_conn_decref() are paired in this function.
- In kiblnd_connreq_done()
- When the connection is established and added to the peer's list of connections (peer_ni→ibp_conns)
- This is decremented in kiblnd_connd().
- When a connection is being closed (kiblnd_close_conn_locked()) it's added to the kiblnd_data.kib_connd_conns.
- The kiblnd_connd() wakes up periodically and disconnects the connection, after which it decrements the refcount.
- Once the connection is added to ibp_conns then it is available to be picked up by scheduler threads or other sending threads. Any further access to the connection needs to grab a refcount. This happens in this function to:
- kiblnd_check_sends_locked(conn);
- kiblnd_handle_early_rxs(conn);
- In kiblnd_check_conns()
- When connections timeout or if we need to check that we have to see if there are enough credits to send further txs, then we add those connections to a local closes and checksends lists. These are processed locally in the function. So a refcount is grabbed and released after the connections on the list are processed.
- In kiblnd_cq_completion()
- IB stack calls this function whenever there is a complete even on the completion queue (cq)
- refcount is acquired when the connection is added on the scheduler list: sched→ibs_conns
- The refcount is decremented when that conn is processed by the kiblnd_scheduler() thread. See below.
- In kiblnd_scheduler()
- The cq is polled and if there is >=1 completions two things are done:
- We add the connection back to the sched→ibs_conns list and grab a refcount.
- We also process the completion using kiblnd_complete()
- kiblnd_conn_decref() is called to release the refcount grabbed in kiblnd_cq_completion()
- The cq is polled and if there is >=1 completions two things are done:
- In kiblnd_rx_complete()
- This function is called when an rx completion event is processed.
- If the connection is closing it'll call kiblnd_drop_rx() which decrements the refcount.
- rdma_disconnect() called from kiblnd_finalise_conn() triggers all posted rxs to be completed. An rx completion event is generated for each.
- Each one of these events are hanlded by kiblnd_rx_complete() and kiblnd_drop_rx() is called for each therefore taking away the reference counts added for each rx posted when the connection is created.
- In the normal case, it calls kiblnd_handle_rx()
- This function re-posts the rx again inline or calls lnet_parse() which eventually calls kiblnd_recv() which then re-posts the rx; thereby maintaining the correct refcount on the connection.