rxe: Fix dma.length computation in wr_set_sge_list

[ Upstream commit 406cd2ad08cd852647414cfbf0f2de7ba6517ec1 ]

wr_set_sge_list() summed the SGE lengths with a loop that never
advanced sg_list:

	while (num_sge--)
		tot_length += sg_list->length;

so tot_length ended up as num_sge * sg_list[0].length instead of the
true sum, and wqe->dma.length / wqe->dma.resid were written with that
wrong value. The per-SGE entries themselves were unaffected because
they are populated by the preceding memcpy().
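
As an illustration only, the difference between the two loops can be
reproduced outside the provider with the sketch below. struct demo_sge,
sum_buggy() and sum_fixed() are hypothetical names made up for this
example; demo_sge merely stands in for struct ibv_sge, and nothing here
is code from rxe.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct ibv_sge; only length matters here. */
struct demo_sge {
	uint64_t addr;
	uint32_t length;
	uint32_t lkey;
};

/* Mirrors the old loop: sg_list is never advanced, so only entry 0 is read. */
static size_t sum_buggy(const struct demo_sge *sg_list, size_t num_sge)
{
	size_t tot_length = 0;

	while (num_sge--)
		tot_length += sg_list->length;

	return tot_length;
}

/* Mirrors the fixed loop: every entry contributes its own length. */
static size_t sum_fixed(const struct demo_sge *sg_list, size_t num_sge)
{
	size_t tot_length = 0;
	size_t i;

	for (i = 0; i < num_sge; i++)
		tot_length += sg_list[i].length;

	return tot_length;
}
```

For an SGE list with lengths 100, 20 and 8, sum_buggy() returns 300
(3 * 100) while sum_fixed() returns the true total of 128.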

The kernel rxe driver requires dma.length == sum(sge[i].length) and
enforces it in rxe_mr.c:copy_data(). When the miscomputed dma.length
exceeds the true total, a multi-SGE WR posted through the ibv_qp_ex
builder API (ibv_wr_set_sge_list) on rxe completes with
IB_WC_LOC_PROT_ERR once finish_packet()/copy_data() runs off the end
of the SGE list.

The legacy ibv_post_send path (init_send_wqe) is unaffected; it sums
the lengths with an indexed for loop.

Fix by computing the total with an indexed loop, matching the style
already used in rxe_post_one_recv() and init_send_wqe() in this file.

Fixes: 1a894ca10105 ("Providers/rxe: Implement ibv_create_qp_ex verb")
Signed-off-by: Jared Holzman <jholzman@nvidia.com>
Signed-off-by: Nicolas Morey <nmorey@suse.com>
diff --git a/providers/rxe/rxe.c b/providers/rxe/rxe.c
index 541f1c4..fc91223 100644
--- a/providers/rxe/rxe.c
+++ b/providers/rxe/rxe.c
@@ -1139,6 +1139,7 @@
 	struct rxe_send_wqe *wqe = addr_from_index(qp->sq.queue,
 						   qp->cur_index - 1);
 	size_t tot_length = 0;
+	size_t i;
 
 	if (qp->err)
 		return;
@@ -1151,8 +1152,8 @@
 	wqe->dma.num_sge = num_sge;
 	memcpy(wqe->dma.sge, sg_list, num_sge*sizeof(*sg_list));
 
-	while (num_sge--)
-		tot_length += sg_list->length;
+	for (i = 0; i < num_sge; i++)
+		tot_length += sg_list[i].length;
 
 	wqe->dma.length = tot_length;
 	wqe->dma.resid = tot_length;