
Re: Does Ten-Gigabit Ethernet need fault tolerance?




Roy,

I apologize for not including any FT restoration times in my note. However, my point
is that restoration time is typically zero in a mainframe channel environment.
Furthermore, I see very little difference today between the directions mainframe
channel networks are taking and where Ethernet already is. The basic components
required to construct an FT network are already present in Ethernet.

Best Regards,
Rich

--

Roy Bynum wrote:

> Rich,
>
> I hate to burst your bubble.  I would not call traffic restoration measured in
> minutes "fault tolerance".  If you look at the functionality of an L3-only traffic
> restoration service, you will find it unable to do better in complex
> architectures.  Even on OC48 POS systems that use the SONET overhead fault
> detection to trigger an L3 restoration event, restoration time still greatly
> exceeds that of an L1-only (SONET/SDH) restoration process.  Even the fault
> restoration within ATM greatly exceeds an L1-only restoration process.  This
> kind of traffic restoration is NOT fault tolerance.  At present, only the tightly
> coupled remote error indication process of SONET/SDH can be considered fault
> tolerance; it is an L1-only process, and even then it is greatly affected by the
> implementation architecture.
>
> Thank you,
> Roy Bynum
> MCI WorldCom
>
> Rich Taborek wrote:
>
> > Joe,
> >
> > I heartily agree with Mick's point. Fault Tolerance in Ethernet, if implemented
> > at all, is generally implemented at L3 or above, i.e., above the MAC. I believe this
> > to be appropriate for Ethernet networks. It seems to me that FT requirements are
> > orthogonal to the data transport requirements, which are the purview of Ethernet.
> > Here are some examples of networks which support FT:
> >
> > 1) WANs implementing SONET FT at level 1: Very, very expensive but capable of
> > reliably transporting any type of data upon protocol conversion. FT at
> > level 1; FT at other levels is essentially ancillary.
> >
> > 2) Mainframe channels: Fairly fault tolerant individually, and virtually
> > fault-free considering multi-path configurations and multiple hosts controlled
> > by a single operations point. I would put these in the very expensive category.
> > FT at levels 1, 2, 3 and above.
> >
> > 3) Commodity servers, Ethernet and FT extensions such as Link Aggregation and
> > Rapid Reconfiguration: Fault tolerant as a system, expensive only in comparison
> > to non-fault tolerant systems (i.e. dirt cheap compared to other FT
> > alternatives). FT at levels 3 and above.
> >
> > I also agree that sufficient failure detection is already built into Ethernet.
> >
> > I'd put my money at the threshold of door #3.
> >
> > Best Regards,
> > Rich
> >
> > --
> >
> > "Mick Seaman" wrote:
> >
> > > What needs to be built in is the detection of failure. What we don't need to
> > > do is to build everything into the MAC. I suggest you look at the fault
> > > tolerant capabilities provided by P802.3ad and the work on Rapid
> > > Reconfiguration starting in 802.1.
> > >
> > > Both of these (will) provide a degree of fault tolerance, based on protocols
> > > that are independent of MAC details and that allow network nodes to
> > > precalculate their response to a low-level indication of failure. There is
> > > really no need to build these protocols into the MAC.
> > >
> > > Mick
> > >
> > > -----Original Message-----
> > > From: owner-stds-802-3-hssg@xxxxxxxxxxxxxxxxxx
> > > [mailto:owner-stds-802-3-hssg@xxxxxxxxxxxxxxxxxx] On Behalf Of Joe Gwinn
> > > Sent: Friday, July 16, 1999 3:15 PM
> > > To: stds-802-3-hssg@xxxxxxxx
> > > Subject: Does Ten-Gigabit Ethernet need fault tolerance?
> > >
> > > The purpose of this note is to present a case for inclusion of fault
> > > tolerance in 10GbE, and to offer a suitable proven technology for
> > > consideration.  However, no salesman will call.
> > >
> > > Why Fault Tolerance?  Ten-Gigabit Ethernet is going to be a relatively
> > > expensive, high-performance technology intended for major backbones,
> > > perhaps even nibbling at the bottom end of the wide-area network (WAN)
> > > market.  In such applications, high availability is very much desired; loss
> > > of such a backbone or WAN is much too disruptive (and therefore expensive)
> > > to be tolerated, and this kind of market will gladly pay a
> > > reasonable premium to achieve the needed fault tolerance.
> > >
> > > Why add Fault Tolerance now?  Because it's easiest (and thus cheapest) if
> > > done from the start, and because having FT built in and therefore becoming
> > > ubiquitous will be a competitive discriminator, neutralizing one of the
> > > remaining claimed advantages of ATM.
> > >
> > > Isn't Fault Tolerance difficult?  In hub-and-spoke (logical star, physical
> > > loop) topologies such as GbE and 10GbE, it's not hard to achieve both fault
> > > tolerance (FT) and military-level damage tolerance (DT).  In networks of
> > > unrestricted topology, it's a lot harder.  The presence of bridges does not
> > > affect this conclusion.
> > >
> > > How do I know that FT is so easily achieved?  Because it's already been
> > > done, can be bought commercially, is in use on one military system, and
> > > is proposed for others.  The FT/DT technology mentioned here was developed
> > > on a US Navy project, and is publicly available without intellectual
> > > property restrictions.  Why was the technology made public?  To encourage
> > > its adoption and use in COTS products, so that defense contractors can buy
> > > FT/DT LANs from catalogs, rather than having to develop them again and
> > > again, at great risk and expense.
> > >
> > > What is the difference between Fault Tolerance and Damage Tolerance?  In
> > > fault tolerance, faults are rare and do not correlate in either time or
> > > place. The classic example is the random failure of hardware components.
> > > (Small acts of damage, such as somebody tripping over a wire or breaking a
> > > connector somewhere, are treated as faults here because they are also rare
> > > and uncorrelated.) In damage tolerance, the individual faults are sharply
> > > correlated in time and place, and are often massive in number. The classic
> > > military example is a weapon strike. In the commercial world, a major power
> > > failure is a good example. Damage tolerance is considered much harder to
> > > accomplish than fault tolerance. If you have damage tolerance, you also
> > > have fault tolerance, but fault tolerance does not by itself confer damage
> > > tolerance.
> > >
> > > How is this Damage Tolerance achieved?  All changes in LAN segment topology
> > > (the loss or gain of nodes (NICs), hubs, or fibers) are detected in MAC
> > > hardware by the many link receivers, which report both loss and acquisition
> > > of modulated light. This surveillance occurs all the time on all links, and
> > > is independent of data traffic. Any change in topology provokes the
> > > hardware into "rostering mode", the automatic exploration of the segment
> > > using a flood of special "roster" packets to find the best path, where
> > > "best" is defined as that path which includes the maximum number of nodes
> > > (NICs).
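> > >
> > > (A purely illustrative sketch, not RTFC code: if each candidate path found by
> > > the roster flood is modeled as the set of node IDs it reaches, the "best path"
> > > rule above reduces to a maximum-cardinality choice. The Python below is a
> > > hypothetical illustration of that rule only; the names are made up.)
> > >
> > >     # Hypothetical sketch of the selection rule: "best" = the candidate
> > >     # path that includes the most nodes (NICs).
> > >     def choose_best_path(candidate_paths):
> > >         """candidate_paths: list of sets of node IDs reached by each path."""
> > >         return max(candidate_paths, key=len) if candidate_paths else None
> > >
> > >     # Example: a path reaching nodes {1, 2, 3, 5} wins over one reaching {1, 2}.
> > >     best = choose_best_path([{1, 2}, {1, 2, 3, 5}])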
> > >
> > > Just how fault tolerant and damage tolerant is this scheme?  A segment will
> > > work properly with any number of nodes and hubs, if sufficient fibers
> > > survive to connect them together, and will automatically configure itself
> > > into a working segment within a millisecond of the last fault. If the
> > > number of broken fibers is less than the number of hubs, all surviving
> > > nodes will remain accessible, regardless of the fault pattern. If the
> > > number of fiber breaks is equal to or greater than the number of hubs,
> > > there is a simple equation to predict the probability of loss of access to
> > > a typical node due to loss of hubs and/or fibers, given only the number of
> > > hubs and the probability of any fiber breaking: Pnd[p,r]= ((2p)(1-p))^r,
> > > where p is the probability of fiber breakage and r is the number of
> > > surviving hubs (which ranges from zero to four in a quad system). This
> > > equation is accurate to within 1% for fiber breakage probabilities of 33% or
> > > less, and applies for any number of hubs.
> > >
> > > The simplicity of this equation is a consequence of the simplicity of this
> > > protocol, which is currently implemented in standard-issue FPGAs (not
> > > ASICs), and works without software intervention.  It can also be
> > > implemented in firmware.
> > >
> > > To give a numerical example, in a 33-node 4-hub segment, loss of 42 fibers
> > > (16% of the segment's 264 fibers) would lead to only 0.5% of the nodes
> > > becoming inaccessible, on average. Said another way, after 42 fiber breaks,
> > > there are only five chances out of a thousand that a node will become
> > > inaccessible. This is very heavy damage, with one fiber in six broken. To
> > > take a more likely example, with three broken fibers, all nodes will be
> > > accessible, and with four broken fibers, there is less than one chance in a
> > > million that a node will become inaccessible. Recovery takes two ring tour
> > > times plus settling time (electrical plus mechanical), typically less than
> > > one millisecond in ship-size networks, measured from the last fault.
> > > Chattering and/or intermittent faults can be handled by a number of
> > > mechanisms, including delaying node entry by up to one second. Few current
> > > LAN technologies approach this degree of resilience, or speed of recovery.
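> > >
> > > (To check the arithmetic above, here is a minimal Python sketch of the
> > > Pnd[p,r] equation; the function name is an illustrative assumption, and the
> > > fewer-breaks-than-hubs shortcut follows the statement two paragraphs back.)
> > >
> > >     # Pnd[p, r] = ((2p)(1 - p))^r, with p = probability a given fiber is broken
> > >     # and r = number of surviving hubs.
> > >     def p_node_inaccessible(broken_fibers, total_fibers, surviving_hubs):
> > >         if broken_fibers < surviving_hubs:
> > >             return 0.0  # fewer breaks than hubs: all surviving nodes stay reachable
> > >         p = broken_fibers / total_fibers
> > >         return (2 * p * (1 - p)) ** surviving_hubs
> > >
> > >     # 33-node, 4-hub segment with 264 fibers, as in the example above:
> > >     print(p_node_inaccessible(42, 264, 4))  # ~0.005, about 5 chances in 1000
> > >     print(p_node_inaccessible(4, 264, 4))   # ~8e-7, less than one in a million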
> > >
> > > In commercial systems and some military systems, a dual-ring solution is
> > > sufficient.  Up to quad-ring solutions are commercially available, as needed
> > > for some military systems.  However, the ability to support up to quad
> > > redundant systems should be provided in 10GbE, for two reasons.  First,
> > > quad is needed for the military market, which may be economically
> > > significant in the early years of 10GbE.  Second, quad provides a clear
> > > growth path and a way to reassure non-military customers that their most
> > > stringent problems can be solved: One can ask them if their needs really
> > > exceed those of warships duelling with supersonic missiles.
> > >
> > > The basic technical document, the RTFC Principles of Operation, is on the
> > > GbE website as
> > > "http://grouper.ieee.org/groups/802/3/10G_study/public/email_attach/gwinn_1_0699.pdf" and
> > > "http://grouper.ieee.org/groups/802/3/10G_study/public/email_attach/gwinn_2_0699.pdf".  I was a
> > > member of the team that developed the technology, and am the author of
> > > these documents.
> > >
> > > Although these documents assume RTFC, a form of distributed shared memory,
> > > the basic rostering technology can easily be adapted for Gigabit and
> > > Ten-Gigabit Ethernet as well.  For nontechnical reasons, RTFC originally
> > > favored smart nodes connected via dumb hubs.  However, the overall design
> > > can be somewhat simplified if one goes the other way, to dumb nodes and
> > > smart hubs.  This also allows the same dumb nodes to be used in both non-FT
> > > and FT networks, increasing node production volume, and does not force
> > > users to throw nodes away to upgrade to FT.
> > >
> > > I therefore would submit that 10GbE would greatly benefit from fault
> > > tolerance, and also that it's very easily achieved if included in the
> > > original design of 10GbE.
> > >
> > > Joe Gwinn

-------------------------------------------------------------
Richard Taborek Sr.    Tel: 650 210 8800 x101 or 408 370 9233
Principal Architect         Fax: 650 940 1898 or 408 374 3645
Transcendata, Inc.           Email: rtaborek@xxxxxxxxxxxxxxxx
1029 Corporation Way              http://www.transcendata.com
Palo Alto, CA 94303-4305    Alt email: rtaborek@xxxxxxxxxxxxx