Re: Does Ten-Gigabit Ethernet need fault tolerance? (nonredundant NICs)
Roy,
At 7:53 PM 99/7/27, Roy Bynum wrote:
>
>You use the term "rostering algorithm". Does this mean that using P802.3ad
>would not be a simple binary decision circuit built into the chip? Very few
>fault tolerance systems have more than a binary structure, those are tertiary
>with simple "lockstep" hardware logic.
I don't understand 802.3ad well enough to answer the question. Could you
expand the question a bit?
>Over the years, I have learned that the closer that you get to the level that
>is being "protected" the faster and more reliable fault tolerance is. In this
>case, it is the optical transport that is being "protected". I am all for the
>use of link aggregation for existing 802.3 interfaces, primarily because fault
>allowance technology does not exist for them otherwise. Simple hardware fault
>tolerance technology does exist for 10gb interfaces today.
I agree that closer is better, and that RTFC protects the optical
transport. RTFC also protects against the loss of nodes (containing NICs)
and hubs. However, RTFC is not link aggregation; with RTFC, the capacity
of the segment is identical to the capacity of a non-redundant segment of
the same topology, regardless of how many added hubs and links have been
provided. The whole point of RTFC is fault and damage tolerance, not
capacity enhancement.
>10GbE will most likely not be implemented over BLSR rings in its early stages
>of deployment. This is because of the massive amount of fiber transport
>facilities that are being deployed today. I do think that any WAN
>implementation will use the 2km interface directly into what is called a
>"lite LTE". This is an LTE that has line/segment SONET/SDH OAM&P
>functionality, without the TDM multiplexing of a standard LTE. The 10GbE will
>have path overhead functionality only. This type of interface will need very
>simple fiber maintenance functionality, the kind that is resolved by a simple
>binary hardware solution.
What's a "BLSR ring"? A SONET component? If so, SONET already has its
own fault tolerance provisions. One may debate its adequacy, but I would
doubt that anybody will wish to layer RTFC on top of SONET. Nor do I
propose any such thing. That leaves pure LAN implementations of 10GbE
needing fault and damage tolerance.
In any case, RTFC works in ring topologies.
>The 40km implementation will be used over metropolitan, leased fiber systems.
>These will be, for the most part, diverse path 1+1 systems. This kind of
>deployment will need very robust, tightly coupled fault tolerance
>functionality. Without the ability to control fiber breaks, fiber degradation,
>and other fiber related issues, the ability to switch to alternate receiver
>with minimum loss of data traffic will be paramount. I have a hard time
>believing that any upper layer functionality can accomplish this with 100%
>reliability.
RTFC is designed to handle just such systems, although the recovery times
will be slowed by pesky speed-of-light delays. Assuming that 40 kilometers
is the network diameter, and that there are 100 nodes in the 10 gigabit
backbone, with a hub in the center, a ring tour time would be 20
milliseconds, so recovery would be two or three times that, call it 50
milliseconds. Nobody can do better, as this is 99% speed of light delay.
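
The 20 millisecond figure can be checked with a back-of-the-envelope
calculation. The sketch below is one reading that reproduces it, not
necessarily how the number was derived: it assumes light travels in fiber at
roughly 2e8 m/s (about 5 microseconds per km), that every node sits at the
full 20 km radius from the central hub, and that one tour visits each node
with a hub-to-node-and-back trip. The function name ring_tour_ms and the
constant FIBER_US_PER_KM are made up for illustration.

```python
# Propagation-only estimate of ring tour time for a hub-centered segment.
# Assumptions (not stated in the message): ~5 us/km in fiber, and all
# nodes placed at the maximum radius (half the network diameter).

FIBER_US_PER_KM = 5.0  # assumed propagation delay in optical fiber


def ring_tour_ms(diameter_km, nodes, us_per_km=FIBER_US_PER_KM):
    """Time in ms for one tour that reaches every node through the hub."""
    radius_km = diameter_km / 2.0
    # Visiting each node costs one out-and-back trip from the hub.
    path_km = nodes * 2.0 * radius_km
    return path_km * us_per_km / 1000.0


tour = ring_tour_ms(diameter_km=40, nodes=100)
print(f"ring tour: {tour:.0f} ms")
print(f"recovery at 2x to 3x: {2 * tour:.0f} to {3 * tour:.0f} ms")
```

Under these assumptions the tour covers 4000 km of fiber, which gives 20 ms
per tour and 40 to 60 ms for recovery, in line with the "call it 50" estimate.
Node and hub processing time is ignored here, so the result is almost entirely
propagation delay.
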
If I understand your nomenclature, RTFC and rostering are lower-layer
functions, and I agree that recovery is easier the lower it's done. And,
the faster it's done, the less impact on upper-layer software and users.
Joe
>Joe Gwinn wrote:
>
>> Roy,
>>
>> At 9:12 PM 99/7/24, Roy Bynum wrote:
>> >
>> >Does RTFC allow a minimally trained individual to simply plug two fiber T/R
>> >pairs into the 10GbE interface to implement fault tolerance and if a second
>> >T/R pair, parallel to the first, is not plugged in the fault tolerance is not
>> >implemented? This will be the simplest and most common implementation
>> >process.
>>
>> Yes, this will work, by design. The rostering algorithm will just treat
>> the missing path as broken, and press on. There is no problem with parts
>> of the segment having non-redundant NICs, although those NICs will be cut
>> out of the segment if those NICs or their links fail.
>>
>> Joe
[snip]