As some of you may know, I have a very over-engineered lab, and as part of that I run an Infiniband SAN. When I went for Infiniband there really wasn’t much information about what it is or how to use it, only that it’s a cheap way to get a crazy fast SAN connection, which isn’t the whole story. Over the next few weeks I will cover what Infiniband is and how to make an informed decision about whether it’s right for you.
I will start by going over some of the terminology that goes with Infiniband. This will lay the groundwork for the next few posts. One of the mistakes I made with Infiniband was not understanding the terminology; I just made the false assumption that it’s just like Ethernet, only faster.
- SDR: Single Data Rate, this is referred to as 10Gbps, although with the overhead from encoding you’ll only see about 8Gbps. This is the cheapest speed to start with Infiniband, switches run about $200. SDR connections use a CX4 connector.
- DDR: Double Data Rate, this is referred to as 20Gbps when it’s really only 16Gbps with the encoding overhead. DDR switches start around $400-$500. DDR connections still use a CX4 connector.
- QDR: Quad Data Rate, this is referred to as 40Gbps, when it’s really only 32Gbps with the encoding overhead. QDR switches start around $1000. QDR and above connections use a QSFP Connector.
- FDR-10, FDR & EDR: these run at 40Gbps, 56Gbps and 100Gbps respectively. The encoding overhead on these is only about 3%. These are beyond the scope of this article.
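To make the "advertised versus usable" numbers above concrete, here is a small Python sketch of the arithmetic. It assumes the usual encodings, which the list above does not name: 8b/10b for SDR, DDR and QDR (8 data bits for every 10 bits on the wire) and 64b/66b for the newer speeds. The function and dictionary names are made up for illustration.

```python
# Rough usable data rates after line-encoding overhead.
# Assumption: SDR/DDR/QDR use 8b/10b encoding, newer speeds use 64b/66b.
ENCODING_EFFICIENCY = {"8b/10b": 8 / 10, "64b/66b": 64 / 66}

# Advertised rates in Gbps for a standard 4-lane link.
ADVERTISED_GBPS = {"SDR": 10, "DDR": 20, "QDR": 40}


def effective_gbps(speed: str, encoding: str = "8b/10b") -> float:
    """Advertised rate multiplied by the share of bits that carry data."""
    return ADVERTISED_GBPS[speed] * ENCODING_EFFICIENCY[encoding]


def overhead_percent(encoding: str) -> float:
    """Share of the raw bit rate lost to encoding, in percent."""
    return (1 - ENCODING_EFFICIENCY[encoding]) * 100


for speed in ADVERTISED_GBPS:
    print(f"{speed}: about {effective_gbps(speed):.0f} Gbps usable")
print(f"64b/66b overhead: about {overhead_percent('64b/66b'):.0f}%")
```

This only covers encoding overhead. Protocol overhead on top of it (IPoIB, for example) reduces throughput further.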
Host Channel Adapter (HCA): this is what the Infiniband PCI adapter is called.
Infiniband Cable Types:
Infiniband cables are expensive: SDR / DDR cables run around $30-$50 each, and QSFP cables go for at least $70-$100. You have to be very careful when shopping for cables, as Infiniband CX4 cables look very similar to SAS SFF-8470 cables. If the price seems too good to be true, then it’s not an Infiniband cable. Cables come in both copper and fiber varieties; the optical cables are lighter and come in longer lengths, but tend to be more expensive.
- CX4: This is the cable type used for both SDR and DDR Infiniband. The problem is that while DDR CX4 is backward compatible with SDR, the reverse is not true. Cables for sale on the internet don’t always specify which type of CX4 they are, so if a listing doesn’t say “DDR”, make sure to ask. These connectors tend to be made of steel and are very heavy duty. They come in either “pinch” style or “latch” style. Either works on adapters and switches.
- QSFP: This is used for QDR and above connections, but it is backward compatible, so you can get a QSFP to CX4 cable adapter. QSFP cables have a 2-3 inch plug that is inserted into the switch and adapter for a more secure connection than a CX4 connector. They usually have a pull tab dangling off the end to remove the cable after it’s inserted.
Each Infiniband speed has a base signalling rate per lane: for SDR it’s 2.5Gbps, for DDR it’s 5Gbps and for QDR it’s 10Gbps. Sometimes you will see “CX1”, “CX4” or “CX12” on cables, adapters and switches (the link width is also commonly written as 1X, 4X or 12X); this refers to how many lanes of traffic are supported by that link. Mostly you will see “CX4” connections, which give you a 10Gbps SDR connection or a 20Gbps DDR connection. Some of the higher-end switches offer “CX12” connections, which give you a 30Gbps SDR connection or a 60Gbps DDR connection. CX12 uses a different type of connector and cabling, so if you buy a switch with CX12 connections, make sure you have CX12 connectors on the cards and CX12 cables to connect them.
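The lane arithmetic is just "per-lane rate times number of lanes". Here is a minimal Python sketch of that, using per-lane signalling rates of 2.5, 5 and 10 Gbps for SDR, DDR and QDR (so four QDR lanes give the advertised 40Gbps). The names are hypothetical, and the result is the raw signalling rate before encoding overhead.

```python
# Per-lane signalling rate in Gbps for each Infiniband speed.
LANE_GBPS = {"SDR": 2.5, "DDR": 5.0, "QDR": 10.0}

# Link widths you will see on hardware: 1, 4 or 12 lanes.
VALID_LANES = (1, 4, 12)


def link_gbps(speed: str, lanes: int) -> float:
    """Raw signalling rate of a link: per-lane rate times lane count."""
    if lanes not in VALID_LANES:
        raise ValueError(f"unsupported link width: {lanes}")
    return LANE_GBPS[speed] * lanes


for speed in LANE_GBPS:
    for lanes in VALID_LANES:
        print(f"{speed} x{lanes}: {link_gbps(speed, lanes):g} Gbps")
```

For example, `link_gbps("DDR", 12)` gives the 60Gbps figure for a 12-lane DDR link.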
Infiniband, like Ethernet, has switches, and they come as either managed or unmanaged. Unless the switch specifically says “managed”, it’s not. Infiniband is different in that you can daisy chain the adapters together to avoid using a switch, but you will experience some performance loss. Obviously, daisy chaining them requires dual-port cards.
For an Infiniband fabric to be fully functional, you must have at least one subnet manager running. A subnet manager assigns a unique identifier to each adapter and builds a routing table. You can have multiple subnet managers running for failover, but only one can be active at a time. The second one you add will detect the first one running and switch to a passive mode. Most managed switches include a subnet manager, but if the switch is not managed, you’ll have to run the subnet manager on one of the connected nodes. OpenSM is included with most drivers.
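The active/passive behaviour can be pictured with a toy model. This Python sketch is not how OpenSM is implemented: it assumes the first subnet manager to start becomes active and that the oldest remaining one takes over when it disappears, whereas real subnet managers also negotiate using a configured priority. All class and node names are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SubnetManager:
    name: str
    state: str = "standby"  # either "master" or "standby"


@dataclass
class Fabric:
    managers: List[SubnetManager] = field(default_factory=list)

    def add(self, name: str) -> SubnetManager:
        """Start a subnet manager; it goes passive if one is already active."""
        sm = SubnetManager(name)
        if self.master() is None:
            sm.state = "master"
        self.managers.append(sm)
        return sm

    def fail(self, name: str) -> None:
        """Remove a subnet manager and promote a standby if the master is gone."""
        self.managers = [m for m in self.managers if m.name != name]
        if self.managers and self.master() is None:
            # Simplification: promote whichever standby has been running longest.
            self.managers[0].state = "master"

    def master(self) -> Optional[str]:
        """Name of the active subnet manager, or None if there is none."""
        for m in self.managers:
            if m.state == "master":
                return m.name
        return None


fabric = Fabric()
fabric.add("switch-sm")      # hypothetical SM on a managed switch
fabric.add("node1-opensm")   # hypothetical OpenSM instance on a host
print(fabric.master())
fabric.fail("switch-sm")
print(fabric.master())
```

The point of the model is only that there is never more than one active subnet manager, and that a standby is what keeps the fabric managed when the active one goes away.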
Infiniband switches and adapters come in many shapes and sizes. I try to stay with switches from Mellanox and Voltaire, as they are still in the Infiniband business and have a very active support community. You will see “Infinihost III” cards for about $30 each, but I would go for at least ConnectX cards, or if you’re running Windows, ConnectX-2 cards, as there is better driver support.
IP over Infiniband:
IPoIB is used to run IP over Infiniband, so you can assign IP addresses and ping other machines. Depending on your OS, it’s either installed by default (Windows) or has to be enabled (Linux). IPoIB lets you use iSCSI and other IP-based applications over your fast Infiniband fabric. IPoIB does add some overhead to the fabric; how much depends on what you’re running, but it’s usually around 25%. IPoIB is also not bridgeable, so even though Windows sees it as just another network adapter, you will be unable to share your Infiniband network with your virtual machines. The only way to do that is with network virtualization.
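To get a feel for what that overhead costs, here is a back-of-the-envelope Python sketch. It takes the post-encoding data rates from earlier in this post (8, 16 and 32 Gbps) and applies the rough 25% figure. These are ballpark estimates, not measurements, and the names are made up for illustration.

```python
# Usable data rates in Gbps after encoding overhead, per the speeds listed earlier.
DATA_GBPS = {"SDR": 8, "DDR": 16, "QDR": 32}

# Rough IPoIB overhead from the text; the real figure varies with the workload.
IPOIB_OVERHEAD = 0.25


def ipoib_gbps(speed: str, overhead: float = IPOIB_OVERHEAD) -> float:
    """Ballpark throughput left for IP traffic after IPoIB overhead."""
    return DATA_GBPS[speed] * (1 - overhead)


for speed in DATA_GBPS:
    print(f"{speed}: roughly {ipoib_gbps(speed):g} Gbps over IPoIB")
```

Pass a different `overhead` value to see how sensitive the estimate is to that 25% assumption.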
SRP: SCSI Remote Protocol (also called SCSI RDMA Protocol), this is the alternative to using IPoIB and iSCSI. SRP basically runs SCSI commands directly over the Infiniband fabric, giving you a very low-latency, high-speed connection without the overhead from IPoIB. This is only available on Infiniband and certain 10Gb Ethernet adapters, and for SRP to be available, the drivers must support it.
SCST: Generic SCSI target subsystem. This is software used on Linux to give you Infiniband or FC targets. There are different ways to get SCST: either add it to an existing Linux install or use a distribution that comes with it, such as ESOS or Openfiler. SCST is not currently supported under Windows, so if you want a lower-priced or free Infiniband target server, you will use SCST on Linux. There is no way to cheaply or easily run an Infiniband target server under Windows unless you use IPoIB and iSCSI.
This is the first part of my Infiniband introduction. Next I will explain some setup issues and what makes it good and bad.
Links for further Research:
http://www.openfabrics.org Openfabrics is the home of open-source driver software, mostly for Mellanox adapters. While the site is still running, the forums are dead and the website hasn’t shown activity in a while. The driver still does not support Windows Server 2012 or 2012 R2.
http://community.mellanox.com/welcome Active support site for Mellanox Infiniband products. Help there is generally useful and you’ll usually get a reply to help requests.
http://www.servethehome.com/ A website with good information and many people using Infiniband that are active on their forums.