ISL Trunking?


Chris Wilson

Mar 11, 2014, 9:21:54 PM
to esos-...@googlegroups.com
Hello,

Getting more comfortable with the OS as I go - I'm loving it!  One quick question - can we set up ISL Trunking on multiple ports?

Thanks in advance.

Martino Io

Mar 12, 2014, 6:21:35 AM
to esos-...@googlegroups.com
Hi Chris,
what is the correlation between ISL trunks and ESOS?
Normally you use ISLs (Inter-Switch Links) between switches, and if you would like to "aggregate" bandwidth from multiple target FC ports (on the storage), you simply use multipath on the initiator side (as long as all the ports are zoned correctly).
Or maybe I'm just missing something...

Marcin



Chris Wilson

Mar 12, 2014, 9:57:44 AM
to esos-...@googlegroups.com
I'll have to research further - it's probably me missing something.

Thanks though. :)

Martino Io

Mar 15, 2014, 11:31:21 PM
to esos-...@googlegroups.com
OK, so maybe a little FC class would help here.

FC is a complex protocol, and its way of operating is quite different from Ethernet. Most people approaching FC for the first time tend to see it in a similar light, because it is used to interconnect different devices and, like Ethernet, it is made up of switches with ports and cables or fibers. :)
Underneath, however, lies totally different signaling, framing, logic, and a subset of other protocols (for the adventurous, check T11 document T11/13-113v1), which tends to confuse most network admins. So let's look at some simple examples and how they can be described technically.

SANs (Storage Area Networks) were born around the mid '90s. During the '80s, storage boxes were normally attached locally via a dedicated bus (SCSI) to the server/mainframe accessing the data. Over the years, with the creation of more complex and capable storage solutions, a single server/mainframe was under-utilizing these resources, so the SCSI committee decided to develop a switched network protocol that would be SCSI-capable and combine the benefits of dedicated channels (superior to a shared SCSI bus) with some of the goodies of a network (not considered reliable enough at the time). While FC was around before being standardized (1988), it was ratified as an ANSI and ISO/IEC standard in 1994, thus becoming the standard protocol employed in SANs. As a layered protocol, FC was able to transport most existing upper protocols (SCSI, ATM, FICON, IP, to mention a few) without problems, but it lost its battle as a generic networking protocol to the cheaper Ethernet, so today it is employed mainly in the storage business (apart from FICON in mainframes).

Skipping most of the low level technicalities, interconnecting a server with a storage box can be done in several ways:
- directly connecting the ports with a fiber
- using a hub
- using a switch

In all cases the "topology" and port types will change. When you connect two devices directly, it is called a P2P (point-to-point) topology, the simplest you can have. Both ports will be of the PN_Port type, and from a physical point of view the TX of the server is connected to the RX of the storage and vice versa.
[inline image: point-to-point topology]
Hubs, contrary to Ethernet, are still employed today (although very rarely) to form an Arbitrated Loop topology (used internally in some storage boxes to interconnect expansion shelves). In this case the device port type will be L_Port (Loop Port); logically it behaves similarly to Token Ring, except that it allows bi-directional communication. An Arbitrated Loop can also be formed without a hub, as long as you have more than one port on a single device.
[inline image: arbitrated loop topology]

In a switched network a new concept is introduced: the fabric. Speaking plainly, a fabric is not necessarily built from many switches (as many think); it is simply a model with a set of functions (receive frames from a source port and route them to the destination port), so you have a fabric in a single switch. Interconnecting switches together simply expands the same fabric (more on this later).
For our example we will connect the storage to port 0 on the switch and the server to port 1. Initially the enabled switch port 0 is a U_Port (Universal Port, on Brocade switches); when the storage is turned on, the port type will change and become an F_Port (Fabric Port; T11 uses Fx_Port to cover the fabric-side port variants).
When the FC HBA on the storage performs its boot/initialization process, it will try to detect a signal on the fiber, and if all goes well it will perform a Fabric Login (FLOGI) to establish service parameters with the fabric, negotiate capabilities, and so on. Once the HBA performs the fabric login, it obtains the port address which identifies that port uniquely in the fabric (used for N_Port to N_Port communication); it is a 24-bit address, represented for simplicity as six hex digits (e.g. 0xFFFFFF).
The first two digits identify the Domain ID (the switch), the next two the Area ID (the physical port), and the last two the Node Address (special). There are some reserved addresses in every fabric, used for essential services (e.g. the Name Server at 0xFFFFFC or the Fabric Controller at 0xFFFFFD), and each port must be able to reach them without prior knowledge of the address.
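The split of the 24-bit address into its three one-byte fields can be sketched in a few lines (illustrative only - `parse_fcid` is not part of any FC stack):

```python
def parse_fcid(fcid: int):
    """Split a 24-bit FC port address into its three one-byte fields."""
    domain = (fcid >> 16) & 0xFF  # Domain ID: the switch
    area = (fcid >> 8) & 0xFF     # Area ID: the physical port
    node = fcid & 0xFF            # Node address (special uses)
    return domain, area, node

# Well-known addresses every port can reach without prior discovery
NAME_SERVER = 0xFFFFFC
FABRIC_CONTROLLER = 0xFFFFFD

# A hypothetical port address on Domain 0x59 (decimal 89)
print(parse_fcid(0x590100))  # -> (89, 1, 0)
```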
Why do you need a Name Server in a fabric? The answer is quite complex, as the Name Server performs a variety of functions, such as some zoning checks and acting as a port information repository. To best summarize it: the fabric addresses are assigned dynamically by the switch, so they will change every time you make certain changes (e.g. plug the storage into port 3 instead of 1, or change the Domain ID of the switch). You therefore need a permanent way of tracking connected devices in a fabric, and that is what WWNs (World Wide Names) are used for: more or less guaranteed to be unique across ports (like a MAC in Ethernet). The main difference, however, is that they are not essential for low-level fabric operations, as everything works on the 24-bit switch-assigned address.

Turning on the server, the same process is repeated: a unique address is assigned by the switch to the port, and a Name Server entry is stored in the switch, mapping the WWN (and some other data) to the current address.
Let's assume that our storage box is already well configured and we are only using a new switch, which by default comes with a policy forbidding communication between N_Ports unless explicitly allowed. So we will need to zone the two N_Ports into a "partition" that allows data to be sent between them; zoning internally operates on the 24-bit addresses, but to make the SAN admin happy, port WWNs or node WWNs can be used (one of the reasons a Name Server exists on FC fabrics).
Once the two ports are in the same zone, they can try to send data to each other. At a low level, the first time either port tries to do so, a Port Login (PLOGI) must be performed, and it is during that phase that the zoning is verified, the NS entry created, and a list of available devices presented to the port.

Now let's assume that our infrastructure grew past the capabilities of a single switch and we want to expand the fabric, we can:
- buy a bigger switch
- add another switch to the same fabric

We bought a second switch; what has to be done in order to interlink devices from switch 1 to switch 2?
Again, putting it simply: not much. Just take a cable and connect two ports together. The second switch will "join" the fabric of switch 1, download the zoning database, and synchronize Name Server data and other relevant state/service information from the first one, all automatically and without creating downtime. This operation is called "merging of fabrics", because, as I said at the beginning, each individual switch, if not connected to another one, is an individual fabric. With the merge, one of the two will accept the other as the "master" and the fabric will be reconfigured to be aware of the new switch.

Example of "fabricshow" on a Brocade switch (member of a 5-switch fabric); the principal ("master") switch is indicated by > before its name.

Switch ID   Worldwide Name           Enet IP Addr    FC IP Addr      Name
-------------------------------------------------------------------------
 89: fffc59 10:00:00:05:1e:xx:xx:xx xx.xx.xx.xx    0.0.0.0        >"Switch 1"
 91: fffc5b 10:00:00:05:1e:xx:xx:xx xx.xx.xx.xx    0.0.0.0         "Switch 2"
 93: fffc5d 10:00:00:05:1e:xx:xx:xx xx.xx.xx.xx    0.0.0.0         "Switch 3"
 95: fffc5f 10:00:00:05:1e:xx:xx:xx xx.xx.xx.xx    0.0.0.0         "Switch 4"
 97: fffc61 10:00:00:05:1e:xx:xx:xx xx.xx.xx.xx    0.0.0.0         "Switch 5"

The technical details are much more complex and many new concepts are introduced by this operation: the two ports used to interconnect the switches will not be F_Ports but E_Ports, and this special connection is called an ISL (Inter-Switch Link); but overall the concepts explained before still apply.
If we require more bandwidth between the switches, we can create a trunk. While trunks are not part of the T11 specification, each vendor implements them with some differences. On Brocade, trunking is a separately licensed feature and trunks are limited to 8 ISLs each (obviously all with the same destination); however, you can create a trunk group out of multiple trunks to further increase the available bandwidth.
There are important low-level considerations when creating trunk groups, as they imply specific ASIC mappings (trunk to ASIC). Example for a Brocade 6510:
http://community.brocade.com/legacyfs/online/12493_port%20groups.png
The possibility of interconnecting multiple switches into a single fabric implies some routing protocol (not to be confused with Inter-Fabric Routing), so FC comes equipped with a dynamic link-state routing protocol, FSPF (Fabric Shortest Path First), able to automatically compute paths and their associated costs based on hop counts and bandwidth considerations.
Many people ask why you need trunking when you can simply create multiple ISLs to the same destination.
The simple answer:
you can have multiple ISLs to the same Domain (switch), but only one will be used for data transmission between domains, as all of them will have the same metric; the others essentially provide redundancy (if one link fails, another will be used).
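The path selection described above can be sketched as a plain shortest-path computation (a toy model only - real FSPF floods link-state records between switches; the domain numbers and link costs below are hypothetical):

```python
import heapq

def fspf_best_path(links, src, dst):
    """Pick the lowest-cost path between two domains, Dijkstra-style.

    Sketch only: 'links' maps (domain_a, domain_b) -> link cost, and the
    function returns (total_cost, [domains along the path]).
    """
    graph = {}
    for (a, b), cost in links.items():  # links are bidirectional
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))
    heap = [(0, src, [src])]
    seen = set()
    while heap:
        cost, dom, path = heapq.heappop(heap)
        if dom == dst:
            return cost, path
        if dom in seen:
            continue
        seen.add(dom)
        for nxt, c in graph.get(dom, []):
            if nxt not in seen:
                heapq.heappush(heap, (cost + c, nxt, path + [nxt]))
    return None

# Hypothetical 3-switch fabric: a direct 500-cost link between domains
# 89 and 91 beats the 2-hop path through domain 93.
links = {(89, 91): 500, (89, 93): 500, (93, 91): 500}
print(fspf_best_path(links, 89, 91))  # -> (500, [89, 91])
```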

Example of ISL layout and routing information on a Brocade switch:


trunkshow
  1:  0-> 48 10:00:00:05:1e:xx:xx:cc  91 deskew 16 MASTER
      1-> 49 10:00:00:05:1e:xx:xx:cc  91 deskew 15 
      3-> 51 10:00:00:05:1e:xx:xx:cc  91 deskew 16 
      2-> 50 10:00:00:05:1e:xx:xx:cc  91 deskew 16 

topologyshow
Domain:         91
Metric:         500
Name:           Switch 2
Path Count:     2

        Hops:                   1
        Out Port:               1/0
        In Ports:               1/8 1/9 1/12 1/14 1/15 2/0 2/8 2/9 2/13 2/14 2/15 3/0 3/8 3/9 3/10 3/11 
                                4/0 4/8 4/9 4/10 4/11 9/8 9/9 9/12 9/14 9/15 10/0 10/8 10/9 10/13 10/14 10/15 
                                11/0 11/8 11/9 11/10 11/11 12/0 12/8 12/9 12/10 12/11 1/16 2/28 3/28 4/24 9/24 10/28 
                                11/28 12/16 1/34 1/35 1/36 1/37 1/38 2/34 2/35 2/36 2/37 2/39 3/32 3/33 3/34 3/35 
                                4/32 4/33 4/34 4/35 9/34 9/35 9/36 9/37 9/38 10/34 10/35 10/36 10/37 10/39 11/32 11/33 
                                11/34 11/35 12/32 12/33 12/34 12/35 
        Total Bandwidth:        16.000 Gbps 
        Bandwidth Demand:       1350 %
        Flags:                  D 


As you can see in the trunkshow output, it is a trunk made from 4 ISL links with a total of 16Gbps of bandwidth; the deskew value is used by low-level functions (related to the way data is encoded and transmitted, and it varies according to the difference in length between the fibers that form the trunk), and the 91 in the third column identifies the destination Domain.
In the second command you can see the destination ports reachable through that trunk, and another interesting value, the Bandwidth Demand (oversubscription).
This is another concept introduced by interconnecting switches. Ideally a switch should have enough bandwidth to link all of its ports together at the maximum possible speed; for example, a 16-port 4Gbps switch must be able to process a global flow of 16 x 4 x 2 = 128Gbps of N_Port traffic (counting both directions), representing no oversubscription (1:1). But when you start connecting two switches together, you force the traffic to flow through a port or trunk of reduced capacity (compared to the whole switch), and this generates oversubscription.
Take as an example an 8-port 4Gbps switch connected to another 8-port switch via a 2-port ISL trunk, so the available bandwidth between the two switches is 8Gbps. Now let's put a 4-port storage box on switch A and a 4-port initiator on switch B and run some tests: 4 x 4Gbps = 16Gbps of bandwidth is available at the endpoints, but we have a bottleneck on the ISL trunk, which can only carry 8Gbps, thus limiting the transfer to 8Gbps and generating congestion on the trunk, probably disrupting other traffic that might flow through it.
Ideally the ratio should be 1:1, but it is not practical to create such ISLs, as they would simply consume half of the available ports on a switch; so storage locality (putting storage boxes and initiators on the same switch) and careful planning are very important. Oversubscription happens in big modular switches as well, because normally the slots hosting the switch blades have less bandwidth than the total of the ports on the blade; oversubscription is also possible within a single switch if it employs more than one ASIC, as the ASIC-to-ASIC circuitry might not be able to carry all of the data at the same speed as within the ASIC itself.
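The arithmetic in the 8-port example above is simple enough to check programmatically (a toy helper, not a real sizing tool; it ignores duplex and protocol overhead):

```python
def oversubscription(n_ports: int, port_gbps: float,
                     isl_count: int, isl_gbps: float) -> float:
    """Ratio of the bandwidth the endpoint ports can demand across an
    ISL trunk to the bandwidth the trunk can actually carry."""
    demand = n_ports * port_gbps   # what the N_Ports can generate
    supply = isl_count * isl_gbps  # what the trunk can carry
    return demand / supply

# The example above: 4 x 4Gbps endpoint ports crossing a 2 x 4Gbps trunk
print(oversubscription(4, 4, 2, 4))  # -> 2.0, i.e. 2:1 oversubscription
```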


Now that we have a 2-switch fabric and our SAN is working properly, we could be satisfied with the results, but to make our SAN enterprise-ready we would like a second site used for disaster recovery, in case something really bad happens to the first location.
After careful consideration, we choose a location 50 km from our main site. We can either extend our fabric by using long-distance ISLs (they can go up to 100 km), or establish a second, unrelated fabric at the new site and employ FC Inter-Fabric Routing so devices can reach each other without actually merging the fabrics.
The decision here is not an easy one, but in the end we choose to create a second fabric and employ routing; we bought the necessary license and connected the long-distance links over single-mode fiber using long-range SFPs.
After the initial setup, we perform a fabricshow on the fabric at our main site and discover several strange new switches:

Switch ID   Worldwide Name           Enet IP Addr    FC IP Addr      Name
-------------------------------------------------------------------------
  1: fffc01 50:00:51:ea:cd:xx:xx:xx 0.0.0.0         0.0.0.0         "fcr_xd_1_43"
  2: fffc02 50:00:51:ea:cd:xx:xx:xx 0.0.0.0         0.0.0.0         "fcr_xd_2_52"
  3: fffc03 50:00:51:ea:cd:xx:xx:xx 0.0.0.0         0.0.0.0         "fcr_xd_3_45"
 49: fffc31 50:00:51:ea:cd:xx:xx:xx 0.0.0.0         0.0.0.0         "fcr_fd_49"
 89: fffc59 10:00:00:05:1e:xx:xx:xx xx.xx.xx.xx     0.0.0.0        >"Switch 1"
 91: fffc5b 10:00:00:05:1e:xx:xx:xx xx.xx.xx.xx     0.0.0.0         "Switch 2"
 93: fffc5d 10:00:00:05:1e:xx:xx:xx xx.xx.xx.xx     0.0.0.0         "Switch 3"
 95: fffc5f 10:00:00:05:1e:xx:xx:xx xx.xx.xx.xx     0.0.0.0         "Switch 4"
 97: fffc61 10:00:00:05:1e:xx:xx:xx xx.xx.xx.xx     0.0.0.0         "Switch 5"

These are special phantom switches used to translate addresses (remember: the 24-bit address is unique only within a fabric) from one fabric to the other, as the two fabrics are totally unaware of each other's presence. Technically you will have a Front Domain ("fcr_fd_49"), tied to the Domain ID of the routing switch, and Translate Domains ("fcr_xd_1_43"); the translation table resides on the FC router.
When performing zoning, it is essential to have the aliases and zones created on both fabrics (this is not done automatically) in order to be able to send data between two N_Ports in different fabrics; the router will "publish" the updated information to both Name Servers, and the devices can then safely establish transmission channels across fabrics.

Finally we have an Enterprise ready Fabric!

To be clear, I've omitted a lot of information, and some things might be more or less wrong, so don't take this as a tutorial. I wrote this essentially to introduce some of the more complex concepts in a practical way and to encourage people to read more about FC, because it is really a nice protocol, having nearly everything you need to survive even the strictest SAN requirements.

--
Marcin

Chris Wilson

Mar 19, 2014, 1:56:12 AM
to esos-...@googlegroups.com
Wow! Brilliant! Thank you Marcin. I appreciate you taking the time to write all this out. I still have a lot to learn.

Expanding on what you wrote: how difficult is it to create an aggregation of sorts? For example, each of my hosts has a dual-port 4Gbps card in it. Can I bring those together for a theoretical 8Gbps? Can this also be configured on the ESOS side?

Thanks again for the lesson. Just shows how green I am with the FC environment as a whole.

Martino Io

Mar 19, 2014, 4:55:11 AM
to esos-...@googlegroups.com
Normally you would always have a minimum of two fabrics, for obvious redundancy reasons. Having two fabrics (so a minimum of two switches) also implies using two HBAs (preferred) or a dual-port HBA; under normal operating conditions this means you have 2 x 4Gb links active. If your storage has at least one FC port per fabric (which it must), then you already have 8Gb of total bandwidth available, combined across the two fabrics, between the storage and the initiator.

When you export the LUN on ESOS, make sure to include both pWWNs of the initiator in the config (one for Fabric A and one for Fabric B). On the initiator you will need the multipath driver configured properly; if you're using ESX, the appropriate driver is selected automatically and you only have to change the path selection policy. If you're using a single-box setup, configure it to Round-Robin; this will spread SCSI requests across the two links, effectively using all of the available bandwidth for data transfers.
Adding more HBAs on the initiator will further increase the bandwidth available for I/O operations (as long as the storage box is able to supply the necessary I/O); today it is quite common to see 4 x 8Gb configs in hosts dedicated to virtualization.
My ESOS boxes have at least 6 FC ports, and some initiators have 4 FC ports (especially the ones used for Virtualization).
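For what it's worth, on a Linux initiator the counterpart of ESX's Round-Robin policy is dm-multipath. A minimal `/etc/multipath.conf` sketch (defaults vary by distribution, so treat this as a starting point, not a drop-in config):

```
defaults {
    path_grouping_policy    multibus        # put all paths in one group
    path_selector           "round-robin 0" # alternate I/O between them
}
```

After reloading the configuration, `multipath -ll` should show both paths active under a single multipath map.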

-
Marcin


Chris Wilson

Mar 19, 2014, 10:41:43 AM
to esos-...@googlegroups.com
One last question - for now at least :)

In the interest of budget - since I'm setting this up at home, I can't justify purchasing a second fabric switch just for the sake of redundancy, at least for now (until I can get a second one). Would configuring multiple zones for the connections ultimately accomplish the same thing as having each port or HBA on a different switch?

Thanks in advance,

Steve Jones

Mar 19, 2014, 10:46:52 AM
to esos-...@googlegroups.com
On my home lab, which was ESOS plus 3 ESXi systems, I originally bought two 2Gb FC switches for about $50 each and had it set up like it "should" be, but then later I found that the 4Gb cards weren't that much more expensive. I only bought one switch, though, and did exactly what you're talking about - dual-homed to two different ports on the same switch. It seemed to work fine; ESX saw the multiple paths. I can't say that FC was the bottleneck, so I don't know if it really used the paths efficiently, but it definitely seemed to work.
-Steve J - Another FC Newbie trying to learn!


Martino Io

Mar 25, 2014, 6:28:22 AM
to esos-...@googlegroups.com
One switch is perfectly fine for testing. To check whether data is actually being transferred over the multiple paths, just check the port statistics: if there isn't much difference between the two ports, then your setup is working fine.

In the case of Brocade switches, just check "porterrshow" and compare the tx/rx frame counts of both ports.