scst/iscsi-scst/README.iser - backupdr - Git at Google

 iSCSI extensions for RDMA driver
 ================================

 Installation & Configuration:
 ---------------------------
 For installation and configuration, see iscsi README.
 There are no specific configuration options for iSER.
 See below for performance optimizations as well as troubleshooting.
 There is also a HOWTO on http://community.mellanox.com/docs/DOC-1479

 Performance considerations:
 ---------------------------

 In order to achieve better performance, it is recommended to specify
 "QueuedCommands 128" parameter per iSER target, since the transport
 is very fast and you usually want to connect it to fast backstorage.

 For performance tuning of initiator and target machines, see
 http://community.mellanox.com/docs/DOC-1483

 Note that if you have an SSD controller that is close to a particular
 NUMA node, you want the HCA to be close to the same node.

 Limitations:
 -------------
 * Bidirectional commands are not supported
 * Maximum number of concurent login requests that can be handled is 127 by default.
   Note that there may be more connections, but only up to 127 login requests
   can be handled at the same time. If you wish to increase this, load isert_scst with
   module parameter isert_nr_devs set to the number of login requests you need to handle.


 Troubleshooting:
 -----------------
 * Initiator fails to connect to target. The following message is seen in dmesg:
   Failed to accept conn request, err: -22
   The cause of this is often compilation issues if you have OFED or MLNX_OFED installed:
   If you are compiling for OFED/MLNX_OFED, make sure OFED is installed for
   the kernel you are running. Also, make sure you followed ALL steps described
   in README.iser_ofed.
   If you are compiling for non-OFED kernel, make sure you don't have
   OFED/MLNX_OFED installed.


 * Discovery of iSER targets takes a long time or login to all discovered targets fails.
   iSCSI discovery does not have a way to determine between iSCSI and iSER
   enabled portals. Thus, initiator tries to connect to all interfaces it
   discovered (by default discovery is done over iSCSI TCP).
   In order to prevent this behaviour, you should specify
   "allowed_portal <target interface IP>" parameter for each target you want
   to export through specific RDMA capable adapters.


 * Initiator keeps connecting and disconnecting from target in a loop
   with constant interval after target reboot.
   The problem may be that connection requests from initiator are received
   on wrong port/HCA. This can be one due to one (or both) of the following issues:
     1) net.ipv4.conf.all.arp_ignore sysclt is not set to 2
        rdma-cm relies on ARP responses being received on the same interface
        that sent the request. Linux default does not do that.
        In order to make Linux behave good for rdma-cm, you _MUST_ add
        "net.ipv4.conf.all.arp_ignore = 2" to /etc/sysctl.conf
     2) You have more than 1 HCA and PCI mappings to netdev devices is not
        persistent between reboots. Possible solution is to have udev rules
        for mapping the ibX devices in persistent way.
        See below for udev scripts example:

 /lib/udev/net.sh
 -------------------
 #!/bin/sh

 . /etc/sysconfig/net.conf

 type_fd="/sys/${DEVPATH}/type"
 if [ ! -f $type_fd ]; then
         exit
 fi
 type=`cat /sys/${DEVPATH}/type`

 if [ "$type" = "32" ]; then # IPoIB interface
         i=0
         CONFDEV="DEV${i}"
         CONFPCI=${!CONFDEV}
         PCI=`basename $PHYSDEVPATH`
         while [ -n "$CONFPCI" ]; do
                 if [ "$CONFPCI" = "$PCI" ]; then
                         devid=$(printf "%d\n" `cat /sys/$DEVPATH/dev_id`)
                         let id=$i*2+$devid
                         DEV="ib$id"
                         echo "$DEV"
                         exit
                 fi
                 let i=i+1
                 CONFDEV="DEV$i"
                 CONFPCI=${!CONFDEV}
         done
 fi

 /etc/sysconfig/net.conf
 -----------------------
 DEV0="0000:01:00.0"
 DEV1="0000:02:00.0"

 /etc/udev/rules.d/90-network.rules
 -------------------------------------
 ACTION=="add", SUBSYSTEM=="net", PROGRAM="/lib/udev/net.sh", RESULT=="?*", NAME="$result"


 * Login to all targets from initiator sometimes times out.
   It may be a network problem (try running tools like ibdiagnet
   and rping between target and initiator hosts). The description of those tools
   is beyond the scope of this readme.
   Another issue may be that you failed to set net.ipv4.conf.all.arp_ignore sysctl
   to the value of 2 (see above problem for more detailed explanation).


 * When running IO, latency is getting higher and higher all the time.
   If you have enabled intel_iommu either in kernel command line or in
   kernel config (it may be enabled by default), you should specify
   iommu=pt on kernel command line to avoid the latency issue.
	iSCSI extensions for RDMA driver
	================================

	Installation & Configuration:
	---------------------------
	For installation and configuration, see iscsi README.
	There are no specific configuration options for iSER.
	See below for performance optimizations as well as troubleshooting.
	There is also a HOWTO on http://community.mellanox.com/docs/DOC-1479

	Performance considerations:
	---------------------------

	In order to achieve better performance, it is recommended to specify
	"QueuedCommands 128" parameter per iSER target, since the transport
	is very fast and you usually want to connect it to fast backstorage.

	For performance tuning of initiator and target machines, see
	http://community.mellanox.com/docs/DOC-1483

	Note that if you have an SSD controller that is close to a particular
	NUMA node, you want the HCA to be close to the same node.

	Limitations:
	-------------
	* Bidirectional commands are not supported
	* Maximum number of concurent login requests that can be handled is 127 by default.
	Note that there may be more connections, but only up to 127 login requests
	can be handled at the same time. If you wish to increase this, load isert_scst with
	module parameter isert_nr_devs set to the number of login requests you need to handle.


	Troubleshooting:
	-----------------
	* Initiator fails to connect to target. The following message is seen in dmesg:
	Failed to accept conn request, err: -22
	The cause of this is often compilation issues if you have OFED or MLNX_OFED installed:
	If you are compiling for OFED/MLNX_OFED, make sure OFED is installed for
	the kernel you are running. Also, make sure you followed ALL steps described
	in README.iser_ofed.
	If you are compiling for non-OFED kernel, make sure you don't have
	OFED/MLNX_OFED installed.


	* Discovery of iSER targets takes a long time or login to all discovered targets fails.
	iSCSI discovery does not have a way to determine between iSCSI and iSER
	enabled portals. Thus, initiator tries to connect to all interfaces it
	discovered (by default discovery is done over iSCSI TCP).
	In order to prevent this behaviour, you should specify
	"allowed_portal <target interface IP>" parameter for each target you want
	to export through specific RDMA capable adapters.


	* Initiator keeps connecting and disconnecting from target in a loop
	with constant interval after target reboot.
	The problem may be that connection requests from initiator are received
	on wrong port/HCA. This can be one due to one (or both) of the following issues:
	1) net.ipv4.conf.all.arp_ignore sysclt is not set to 2
	rdma-cm relies on ARP responses being received on the same interface
	that sent the request. Linux default does not do that.
	In order to make Linux behave good for rdma-cm, you _MUST_ add
	"net.ipv4.conf.all.arp_ignore = 2" to /etc/sysctl.conf
	2) You have more than 1 HCA and PCI mappings to netdev devices is not
	persistent between reboots. Possible solution is to have udev rules
	for mapping the ibX devices in persistent way.
	See below for udev scripts example:

	/lib/udev/net.sh
	-------------------
	#!/bin/sh

	. /etc/sysconfig/net.conf

	type_fd="/sys/${DEVPATH}/type"
	if [ ! -f $type_fd ]; then
	exit
	fi
	type=`cat /sys/${DEVPATH}/type`

	if [ "$type" = "32" ]; then # IPoIB interface
	i=0
	CONFDEV="DEV${i}"
	CONFPCI=${!CONFDEV}
	PCI=`basename $PHYSDEVPATH`
	while [ -n "$CONFPCI" ]; do
	if [ "$CONFPCI" = "$PCI" ]; then
	devid=$(printf "%d\n" `cat /sys/$DEVPATH/dev_id`)
	let id=$i*2+$devid
	DEV="ib$id"
	echo "$DEV"
	exit
	fi
	let i=i+1
	CONFDEV="DEV$i"
	CONFPCI=${!CONFDEV}
	done
	fi

	/etc/sysconfig/net.conf
	-----------------------
	DEV0="0000:01:00.0"
	DEV1="0000:02:00.0"

	/etc/udev/rules.d/90-network.rules
	-------------------------------------
	ACTION=="add", SUBSYSTEM=="net", PROGRAM="/lib/udev/net.sh", RESULT=="?*", NAME="$result"


	* Login to all targets from initiator sometimes times out.
	It may be a network problem (try running tools like ibdiagnet
	and rping between target and initiator hosts). The description of those tools
	is beyond the scope of this readme.
	Another issue may be that you failed to set net.ipv4.conf.all.arp_ignore sysctl
	to the value of 2 (see above problem for more detailed explanation).


	* When running IO, latency is getting higher and higher all the time.
	If you have enabled intel_iommu either in kernel command line or in
	kernel config (it may be enabled by default), you should specify
	iommu=pt on kernel command line to avoid the latency issue.