Hari Byte vs. Word Striping
Dear HSSG'ers,
If you're a fan of Hari and interested in the Byte vs. Word Striping
issue please read on. If not... click here [ ]
Still here? Get a large turkey leg and a Martinelli's in front of you.
Save the Guinness for later for this one :-)
I'd like to take this opportunity to present my view of this issue and
clearly show why Byte Striping is the optimum engineered and open system
solution for 10 GbE as well as other standards and industry interfaces
which may use Hari. My decision is based on the development of criteria
for evaluating Hari striping granularity and the rating of byte and word
striping against these criteria. Criteria development will consider Hari
functionality within a 10 GbE PHY.
Hari Definition:
Hari is proposed as a 4-lane, 3.125 Gbaud, 8B/10B encoded, short
distance, protocol independent chip-to-chip interface capable of
transporting 10 Gbps Ethernet data. Hari was proposed in Kauai by
several individuals for use as an interface between the PCS/PMA and PMD
sublayers. Hari is the same interface as the Serial 10 GMII proposed by
Howard Frazier of Cisco in Montreal. One slightly confusing attribute of
Hari is that the corresponding function in GbE, the Ten-Bit Interface
(TBI) WAS the PMA sublayer and the TBI serial interface was propagated
across the medium. For a 10 GbE PMD either Hari coding and signaling is
propagated across the medium or a coding and/or signaling translation is
performed within the PMD prior to signaling over the medium.
An illustration of the location of Hari within a 10 GbE PHY is shown in
figure 1. For reasons of simplification only the unidirectional transmit
path is shown:
+---------+ Hari +---------+
| +----------> | Medium
| 10 GbE +----------> 10 GbE +------------>
| PCS/PMA +----------> PMD | 1-4 fibers
| +----------> |
+---------+ 4 serial +---------+
8B/10B
3.125 Gbps
Figure 1 - Location of Hari within a 10 GbE PHY
For the purposes of this thread, I'd like to use as an example
application of Hari in 10 GbE the interconnect of a PCS/PMA chip
(integration with a MAC is optional) and a 10 GbE Transceiver module
(PMD) via standard FR-4 traces. Any extensions of Hari beyond this
example are deemed to be outside the scope of the striping issue.
However, medium skew MUST be considered since support for all 10 GbE
PMDs, and therefore skew compensation, is a Hari requirement. An
illustration of the location of Hari within a possible 10 GbE device
implementation is shown in figure 2. For reasons of simplification only
the unidirectional transmit path is shown:
+--------------+
| | +-----------------+
| E S| Hari |S E |
| 10 GbE n e+--------------->e n Transceiver | Medium
| D r+--------------->r D Module +------------>
| (MAC)/ e D+--------------->D e (PMD) | 1-4 fibers
| PHY/PMA c e+--------------->e c |
| s| FR-4 PCB |s |
| | Traces +-----------------+
+--------------+ <= 20-24"
Figure 2 - Location of Hari within a 10 GbE Device
Two basic methods are being proposed for striping data across Hari: byte
and word.
- Byte striping implies the striping of a single byte (more accurately a
10-bit code-group) on each lane in rotating order.
- Word striping implies the striping of 40-bit words on each lane in
rotating order.
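The two definitions above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: octets stand in for 10-bit code-groups, no 8B/10B encoding or control code-group insertion is modeled, and the function names byte_stripe and word_stripe are made up for this example.

```python
from typing import List

LANES = 4  # Hari lane count, per the Hari definition above


def byte_stripe(octets: List[int]) -> List[List[int]]:
    """Deal octets to the lanes one at a time, in rotating order."""
    lanes: List[List[int]] = [[] for _ in range(LANES)]
    for i, octet in enumerate(octets):
        lanes[i % LANES].append(octet)
    return lanes


def word_stripe(octets: List[int], word_len: int = 4) -> List[List[int]]:
    """Deal whole words (4 code-groups, i.e. 40 encoded bits) to the lanes
    in rotating order."""
    lanes: List[List[int]] = [[] for _ in range(LANES)]
    for start in range(0, len(octets), word_len):
        lane = (start // word_len) % LANES
        lanes[lane].extend(octets[start:start + word_len])
    return lanes


stream = list(range(16))  # 16 consecutive octets, numbered 0..15
print(byte_stripe(stream))
print(word_stripe(stream))
```

With byte striping, lane 0 carries octets 0, 4, 8, 12; with word striping, lane 0 carries octets 0, 1, 2, 3 and lane 3 does not see its first octet until octet 12.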
The proposed source Parallel 10 GMII stream, assumed to be common to both
striping methods, is shown in figure 3 using the mapping proposed by
Howard Frazier, Cisco, per
http://grouper.ieee.org/groups/802/3/10G_study/public/nov99/frazier_1_1199.pdf
page 8
D<0:7> wwww...IISddddd/dddddIIISddd/ddddIIISdddd... Legend: I=Idle
D<8:15> rrrr...IIdddddd/ddddTIIIdddd/ddddIIIddddd... S=SOP
D<16:23> dddd...IIdddddd/ddddIIIIdddd/ddddIIIddddd... T=EOP
D<24:31> 3210...IIdddddd/ddddIIIIdddd/dddTIIIddddd... d=data
Figure 3 - Parallel 10 GMII stream
Proposed Byte striping for 10 GbE is shown in figure 4 using Howard
Frazier, Cisco, mapping per
http://grouper.ieee.org/groups/802/3/10G_study/public/nov99/frazier_1_1199.pdf
page 15
Lane 0 wwww...KRSddddd/dddddKRKSddd/ddddKRKSdddd... Legend: K=Comma/Idle
Lane 1 rrrr...KRdddddd/ddddTKRKdddd/ddddKRKddddd... R=Idle
Lane 2 dddd...KRdddddd/ddddRKRKdddd/ddddKRKddddd... S=SOP
Lane 3 3210...KRdddddd/ddddRKRKdddd/dddTKRKddddd... d=data T=EOP
Figure 4 - Byte striping for 10 GbE
Proposed Word striping for 10 GbE is shown in figure 5 using Mark
Ritter, IBM, mapping per
http://grouper.ieee.org/groups/802/3/10G_study/public/nov99/ritter_1_1199.pdf
pages 15 and 16
Lane 0 wrd0...Kidldddd/ddddKRTdKddS/ddddKidlKddd... Legend: K=Comma Kidl=Idle
Lane 1 wrd1...Kidldddd/ddddKidlKddd/ddddKidldddd... R=Idle
Lane 2 wrd2...KddSdddd/ddddKidldddd/ddddKidldddd... S=SOP
Lane 3 wrd3...Kddddddd/ddddKidldddd/TdddKddSdddd... d=data T=EOP
Figure 5 - Word striping for 10 GbE
Hari functions common to Byte- and Word-Striping:
---------------------------------------------------
Hari consists of 4 individual 8B/10B encoded serial links (lanes). The
structure of each link is essentially equivalent to the GbE TBI. Each
link contains a Serializer/Deserializer which converts 10-bit 8B/10B
encoded code-groups to a serial stream of bits and vice-versa. Each of
Hari's 4 SerDes must operate at the bit rate of 3.125 Gbaud or 1.5625
GHz. This is clearly the highest speed logic within Hari. Several
prominent SerDes vendors have publicly stated in HSSG forums that a
Hari implementation is within the limits of 0.25 micron CMOS. Therefore,
any notion that support of clock rates of 312 MHz "causes considerable
pain" is without basis.
Bit Sync: Each lane must clock its 320 psec duration serial data at a
1.5625 GHz clock rate. In addition, each Hari lane is likely to be out
of phase with respect to each other lane requiring multiple 1.5625 GHz
clock phases in order to clock all lanes reliably. Furthermore, dynamic
skew and/or lane jitter complicate reliable bit processing. Bit
processing logic is the highest speed, highest power and most demanding
logic within Hari.
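As a quick check of the per-lane numbers quoted above, the arithmetic can be written out as follows. The variable names are only illustrative; the 1.5625 GHz figure is taken here to be half the baud rate, which is my reading of the text.

```python
LANES = 4
BAUD_PER_LANE = 3.125e9   # 3.125 Gbaud per lane
CODE_EFFICIENCY = 8 / 10  # 8B/10B carries 8 data bits in every 10 line bits

# Duration of one serial bit (one unit interval, UI) in picoseconds.
unit_interval_ps = 1e12 / BAUD_PER_LANE

# Unencoded payload carried by all four lanes together, in Gbps.
payload_gbps = LANES * BAUD_PER_LANE * CODE_EFFICIENCY / 1e9

# Assumption: the quoted clock rate is half the baud rate (two bits per cycle).
fundamental_ghz = BAUD_PER_LANE / 2 / 1e9

print(f"UI = {unit_interval_ps:.0f} ps")
print(f"payload = {payload_gbps:.1f} Gbps")
print(f"clock = {fundamental_ghz:.4f} GHz")
```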
Code-Group Sync: The Deserializer of each lane delineates code-groups by
aligning to comma boundaries. This process typically requires the
stretching of the code-group (byte) clock to adjust to the new comma
boundary. Regardless of striping granularity, the deserializer is
responsible for delivering code-groups and a code-group (byte) clock to
the receiver. Individual bits are clocked at a 1.5625 GHz rate.
Adjustments to the byte clock due to the detection of comma boundaries
must be performed in one-bit increments.
Data Processing: The Parallel 10 GMII interface from the PCS to the MAC
is currently proposed to be no wider than 32-bits and operate at a 312
MHz rate (156 MHz if both clock edges are used). It is assumed that the
PMA and Hari will likely be implemented in the same or superior
technology to that of the MAC. In order to keep up with a 10 Gbps data
rate, both the MAC and Hari must process 32-bits of (unencoded) data at
312 MHz data rate, regardless of striping granularity. Once again, any
assertion that Hari data path operation at a 312 MHz data rate is
difficult is unfounded as this rate corresponds to the MAC's data path
rate. The ONLY relief would be provided by converting to 64-bit data
paths quickly once inside the chip. This is clearly an implementation
tradeoff and not a standards issue.
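The width versus clock tradeoff described above is simple arithmetic. The sketch below assumes a 10 Gbps unencoded data rate and uses a hypothetical helper, clock_mhz, to show the clock needed for a given data path width. Note that the exact figure for a 32-bit path is 312.5 MHz, which this note rounds to 312 MHz.

```python
DATA_RATE_BPS = 10e9  # unencoded MAC data rate


def clock_mhz(width_bits: int, both_edges: bool = False) -> float:
    """Clock (in MHz) needed to move DATA_RATE_BPS over a bus of this width."""
    transfers_per_second = DATA_RATE_BPS / width_bits
    if both_edges:
        # Two transfers per clock cycle halve the required clock frequency.
        transfers_per_second /= 2
    return transfers_per_second / 1e6


for width in (32, 64):
    print(width, clock_mhz(width), clock_mhz(width, both_edges=True))
```

Going from 32 to 64 bits inside the chip halves the clock in the same way that using both clock edges does on the 32-bit interface.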
Lane Frequency Sync: Multi-lane SerDes operation is simplified by
ensuring that the same clock is used to drive all Hari lanes.
Striping Evaluation Criteria:
---------------------------
1) MAC stream mapping: The 10 GbE MAC stream illustrated in figure 3
above is an octet (byte) stream passed across a proposed parallel bus
called Parallel 10 GMII (XGMII) interface which is 32-bits wide. 4 MAC
octets are passed across the Parallel 10 GMII every 3.2 ns corresponding
to a 312 MHz rate. The 4 MAC octets are arranged on the parallel bus in
the following order:
Octet 0 - D<0:7>
Octet 1 - D<8:15>
Octet 2 - D<16:23>
Octet 3 - D<24:31>
In essence the MAC stream is Byte-Striped across the Parallel 10 GMII
resulting in 32-bit vertical MAC Words. The MAC stream and MAC Words
correspond directly to a Byte-Striped Hari approach exemplified in
figure 4 above and are orthogonal to a Word-Striped Hari approach
exemplified in figure 5 above.
Additional logic including high-speed buffering would be required to
properly map the MAC stream to Hari employing Word-Striping. 16-octet
granularity processing would be required since no data can be forwarded
across Hari until 4 MAC Words are completely received.
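The buffering requirement can be worked out with a small sketch. It assumes, as figures 4 and 5 suggest, that all four lanes start transmitting together, so nothing can be sent until the last lane has its first code-group. The function name is hypothetical.

```python
import math

LANES = 4
OCTETS_PER_MAC_WORD = 4  # the 32-bit Parallel 10 GMII delivers 4 octets at a time


def mac_words_before_first_column(octets_per_lane_turn: int) -> int:
    """MAC Words that must arrive before every lane has a code-group to send.

    octets_per_lane_turn is 1 for byte striping and 4 for word striping.
    """
    # Index of the first octet destined for the last lane.
    first_octet_on_last_lane = (LANES - 1) * octets_per_lane_turn
    octets_needed = first_octet_on_last_lane + 1
    return math.ceil(octets_needed / OCTETS_PER_MAC_WORD)


print(mac_words_before_first_column(1))  # byte striping
print(mac_words_before_first_column(4))  # word striping
```

Byte striping can forward after a single MAC Word, while word striping has to hold data until the fourth MAC Word (16 octets) has arrived.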
Advantage: Byte-Striping
2) Striping Latency: The orthogonal nature of Word-Striping at each
boundary where the contents of the data stream must be processed or
forwarded results in a striping latency penalty equivalent to the
difference in striping granularity. The striping granularity of a
Byte-Striped approach is 10-bits, corresponding to one encoded 8B/10B
code-group. The resultant latency can be quantified as (40-10) x (bit
time) = (30 x 320ps) = 9.6 ns.
The number of times this latency penalty must be paid in an Ethernet
link is equivalent to the total number of times that the data stream
must be processed to interpret its contents or converted from a
Byte-Striped bus to Word-Striping at the same point. For example,
consider a 10 GbE LAN PHY employing a Serial PMD with a medium line rate
of 10.3125 Gbps. The structure of this link is shown in figure 6 below:
+-----------+ +-----------+
| b +---word---> b 64B/66B | Medium
| 10 GbE y +---word---> y Serial +------------>
| PCS/PMA t +---word---> t PMD | 1 fiber
| e +---word---> e | 10.3125 Gbps
+-----------+ +-----------+
Hari
Figure 6 - 10.3125 Gbps Serial PMD LAN PHY
For the 10.3125 Gbps Serial PMD LAN PHY Word-Striping must be converted
to a parallel bus for processing at the MAC and for conversion to
64B/66B Frames for serial transmission resulting in two latency
penalties at each link end and four latency penalties (38.4 nsec) for
the link.
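The latency figures above follow directly from the striping granularities. A minimal sketch, with an illustrative function name and the 320 ps bit time used throughout this note:

```python
UI_PS = 320  # one bit time at 3.125 Gbaud, in picoseconds


def striping_latency_ns(word_bits: int = 40, byte_bits: int = 10,
                        conversions: int = 1) -> float:
    """Latency penalty of word striping relative to byte striping.

    conversions is the number of points in the link where the stream is
    converted between word-striped and byte-striped (parallel) form.
    """
    return (word_bits - byte_bits) * UI_PS * conversions / 1000


print(striping_latency_ns())               # one conversion point
print(striping_latency_ns(conversions=4))  # two per link end, both ends
```

This gives 9.6 ns for a single conversion and 38.4 ns for the four conversions in the Serial PMD LAN PHY example.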
An additional latency penalty due to skew is incurred independent of
striping granularity. This penalty is described further under criteria
#3 (Skew Compensation).
Latency penalty effects are application dependent and significantly
affect performance for low-latency Hari applications such as
chip-to-chip interconnects, backplanes, clustering, System Area
Networks, etc. Latency effects from word striping are acknowledged to be
largely negligible for most Ethernet and Fibre Channel applications.
However, additional logic including high-speed buffering would be
required to convert from Byte- to Word-Striping at each conversion point
if Hari Word-Striping is employed.
Advantage: Byte-Striping
3) Skew Compensation: Hari is a parallel arrangement of signals and as
such is subject to skew between the signals. Furthermore, it is a system
objective of Hari to accommodate PMD skew. Skew may be static or
dynamic. For the purpose of this discussion, dynamic skew (e.g. skew
jitter) will be ignored since its effects are considered to be
negligible relative to static skew. Maximum static skew can be specified
as follows:
PCB lane-to-lane skew: <1 UI
SerDes lane-to-lane skew: <1 UI
- E.g. 320 ps at 3.125 GBaud
Medium lane-to-lane skew: <16 UI
- Sufficient for 40 km WWDM links @1300 nm (14.4 UI)
Total maximum lane-to-lane skew: <20 UI
- E.g. 6.4 ns at 3.125 GBaud
- Total = 2 x PCB + 2 x SerDes + Medium skew
.: 20 UI deskew pattern needs to be 40-bits
Note that all above skew numbers are very liberal since the maximum
proposed WWDM link is 10 km, not 40 km, resulting in a medium skew of
3.6 UI, significantly less than the budgeted 16 UI and resulting in a
more practical total skew of <8 UI, which is less than one 8B/10B
code-group. Furthermore, Single-Channel PMDs including Serial and MAS
have zero medium skew resulting in a total skew of <4 UI.
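The skew budget above can be recomputed for the three cases mentioned. The sketch follows the formula in the list (Total = 2 x PCB + 2 x SerDes + Medium skew) and uses the 1 UI upper bounds for PCB and SerDes skew; the function name and case labels are only illustrative.

```python
UI_PS = 320  # one unit interval at 3.125 Gbaud, in picoseconds


def total_skew_ui(medium_ui: float, pcb_ui: float = 1.0,
                  serdes_ui: float = 1.0) -> float:
    """Total lane-to-lane skew: two PCB runs, two SerDes and the medium."""
    return 2 * pcb_ui + 2 * serdes_ui + medium_ui


cases = [
    ("budgeted medium skew (16 UI)", 16.0),
    ("10 km WWDM (3.6 UI)", 3.6),
    ("single-channel PMD (0 UI)", 0.0),
]
for label, medium_ui in cases:
    ui = total_skew_ui(medium_ui)
    print(f"{label}: {ui:.1f} UI = {ui * UI_PS / 1000:.2f} ns")
```

The three totals come out at 20 UI (6.4 ns), 7.6 UI and 4 UI, consistent with the bounds of <20 UI, <8 UI and <4 UI stated above.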
Skew compensation can be performed in a variety of ways. The serial
8B/10B encoded nature of the individual Hari signals has led to a
common deskew methodology independent of striping granularity. This
methodology is to use the 8B/10B code-group alignment properties of the
"comma" to align multiple serial lanes. Both Byte- and Word-Striping
make use of the 8B/10B comma to perform Hari deskew. Figure 7 below
exemplifies a lane skew of a repeating /K/R/ Idle pattern proposed for
10 GbE.
Note that the use of alternate patterns such as those shown in figure 5
above are considered to be equivalent for the purpose of illustrating
Hari and medium skew. Note also that skew between lanes is typically not
in integer UI increments. However, fractional skew is assumed to be
handled in bit-level Deserializer logic, and a Deserializer is
assumed to be equally capable of presenting either individual or
multiple code-groups for skew compensation. Therefore, fractional bit
skew is assumed to be independent of the striping granularity and both
Byte and Word-Striping mechanisms need only consider skew to be in terms
of integer bit values.
Transmitter Media Receiver
Lane 1 kkrrrrrrrrrrkkkkkkkkkkrr-----------------------kkrrrrrrrrrrkkkkkkkkkkrr
Lane 2 kkrrrrrrrrrrkkkkkkkkkkrr-------------------------kkrrrrrrrrrrkkkkkkkkkkrr
Lane 3 kkrrrrrrrrrrkkkkkkkkkkrr---------------------kkrrrrrrrrrrkkkkkkkkkkrr
Lane 4 kkrrrrrrrrrrkkkkkkkkkkrr--------------------------kkrrrrrrrrrrkkkkkkkkkkrr
Legend: k = individual bits of /K28.5/; r = individual bits of /K28.0/
Figure 7 - Example Hari and medium skew Tx to Rx, 5 UI
Figure 8 below is the 8B/10B serial stream corresponding to the pattern
in figure 7 above as well as the position of the /comma/ bit pattern.
Transmitter Media Receiver
+K28.5 // +K28.0 // -K28.5 // -K28.0 /
Lane 1 111101000011010/comma/00-----------------------111101000011010/comma/00
Lane 2 111101000011010111110000-------------------------111101000011010/comma/00
Lane 3 111101000011010111110000---------------------111101000011010/comma/00
Lane 4 111101000011010111110000--------------------------111101000011010/comma/00
Figure 8 - 8B/10B serial stream (Idle)
Figure 9 below shows the compensation of skew at the receiver by the
alignment of like comma bit patterns such as the comma+ or comma-. Note
that the cost of deskew through comma alignment is a latency penalty in
terms of the actual total skew between lanes rounded up to the next
integer UI value. For example, the latency penalty shown in figure 9 is
5 UI or 5 x 320 ps = 1600 ps (1.6 ns). This latency penalty is assumed to be
independent of the striping granularity. However, the skew latency
penalty must be added to striping granularity latency penalty as well as
standard logic delay penalties (e.g. encoding/decoding, single
code-group serialization/deserialization, etc.) to arrive at a total
Hari + medium latency penalty.
Receiver Input Skewed Receiver Input Deskewed
Lane 1 --111101000011010/comma/00 -----111101000011010/comma/00
Lane 2 ----111101000011010/comma/00 -----111101000011010111110000
Lane 3 111101000011010/comma/00 -----111101000011010111110000
Lane 4 -----111101000011010/comma/00 -----111101000011010111110000
Figure 9 - Receiver skew compensation via comma alignment
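Deskew by comma alignment can be illustrated with a toy model. This is a sketch only: lanes are modeled as strings of '0'/'1' with '-' for bit times before data arrives, skew is a whole number of UI, and the skew is assumed to be small enough that the first comma found on each lane is the same transmitted comma. The 7-bit pattern 0011111 is one of the two 8B/10B comma polarities (the other is its complement); the function names are made up for this example.

```python
from typing import List

COMMA = "0011111"  # one polarity of the 8B/10B comma bit pattern


def deskew(lanes: List[str]) -> List[str]:
    """Delay the early lanes so that the first comma lines up on all lanes."""
    offsets = [lane.index(COMMA) for lane in lanes]
    latest = max(offsets)
    return ["-" * (latest - off) + lane for off, lane in zip(offsets, lanes)]


def deskew_latency_ps(lanes: List[str], ui_ps: int = 320) -> int:
    """Latency added to the earliest lane, i.e. the total lane-to-lane skew."""
    offsets = [lane.index(COMMA) for lane in lanes]
    return (max(offsets) - min(offsets)) * ui_ps


# A K28.5 code-group in each of its two encodings; only the first holds COMMA.
pattern = "0011111010" + "1100000101"
# Per-lane skews of 2, 4, 0 and 5 UI, as drawn in figure 9.
skewed = ["-" * s + pattern for s in (2, 4, 0, 5)]

for lane in deskew(skewed):
    print(lane)
print(deskew_latency_ps(skewed), "ps")
```

After alignment every lane starts its comma at the same bit position, at a cost of 5 UI (1600 ps) for the earliest lane.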
It has been claimed by some Word-Striping proponents that there is no
need to deskew when Word-Striping. It should be noted that the
Word-Striping approach merely establishes fixed 40-bit word comma
boundaries subject to typically protocol dependent word encoding rules
and that deskew is a REQUIRED process for any parallel interface. The
proposed 10 GbE Ethernet Idle pattern /K/R/ issued during link
initialization and amenable to Byte-Striping is defined as a 40-bit
repeating pattern of /-K/+R/+K/-R/ synchronized across all lanes at the
transmitter. This pattern is deemed to have identical skew compensation
qualities to any 40-bit pattern required for Word-Striping.
Thus far, deskew seems to be independent of striping granularity. I will
now consider the issue of deskew implementation, which I admit should
not be a standards issue. However, the Word-Striping approach requires
that lane deskew is performed subsequent to data deserialization (past
the SerDes) while the Byte-Striping approach allows the simultaneous
alignment and deskew of multiple serial lanes within the Deserializer
itself.
It is assumed that the Hari SerDes used for each lane is similar to the
GbE Ten-Bit Interface (TBI) and that the striping granularity is
independent of SerDes design. Note that the Deserializer of each lane
already delineates code-groups by aligning to comma boundaries, the same
basic process used by the Word-Striping approach to perform skew
compensation. The key to the Byte-Striping approach is the direct use of
the Deserializer comma alignment information across lanes to output
code-group aligned, deskewed multilane information. The output of a
Byte-Striped Deserializer is a stream of 40-bit words at the same rate as the MAC
(312 MHz) representing the encoded version of the 32-bit parallel input
(e.g. PCS) to Hari. The Byte-Striped approach eliminates extraneous
high-speed logic required to perform the deskew function past the
SerDes.
Advantage: Byte-Striping
4) Train-Up Sequence: A train-up sequence, for the purpose of this
discussion, is a sequence different from the standard Idle pattern to
"train-up" skew compensation logic.
The proposed 10 GbE Ethernet Idle pattern /K/R/ issued during link
initialization and amenable to Byte-Striping is defined as a 40-bit
repeating pattern of /-K/+R/+K/-R/ synchronized across all lanes at the
transmitter. This pattern is deemed to have identical skew compensation
qualities to any 40-bit Idle pattern required for Word-Striping.
It is assumed that link initialization resulting from actions such as
power-on, detection of signal after Loss_of_Signal, etc. will ensure
that a synchronized pattern enabling a Hari receiver to perform deskew
will be transmitted. Therefore, no train-up sequences are required for
either Byte- or Word-Striping.
Advantage: None. Equivalent for Byte- or Word-Striping.
5) Data Processing Rate: Hari is located within a 10 GbE link and must
process information at a rate equal to or higher than the MAC rate.
Since Hari is 8B/10B encoded, Hari information must be wider in order to
support processing at the MAC rate. The rate of processing of Hari
40-bit words must be at a 312 MHz rate. No lower rate is possible unless
the word width is extended. The data processing rate is independent of
striping granularity.
Advantage: None. Equivalent for Byte- or Word-Striping.
6) SerDes Width: Hari SerDes width for each lane must be a minimum of
one 8B/10B code-group (10-bits). The SerDes may be wider. However,
faster logic, specifically in CMOS, carries with it significantly higher
power penalties than slower logic. Several proponents of Word-Striping
have indicated a preference for 20-bit SerDes. I can't tell for sure
from the proposal Word Striping on Multiple Serial Lanes, Mark Ritter,
IBM:
http://grouper.ieee.org/groups/802/3/10G_study/public/nov99/ritter_1_1199.pdf
, but page 8 seems to show a (40-bit) Word Clock (WC) emanating from the
CDR (Clock and Data Recovery unit). The only apparent reason for a wider
SerDes to accommodate Word-Striping is the detection of comma boundaries
in only one of the code-group positions. This carries with it a power
penalty. In summary, Word-Striping is at a disadvantage to Byte-Striping
with respect to SerDes width if a wider SerDes is required.
Advantage: Byte-Striping.
7) Even/Odd Alignment: The only reason that GbE observed Even/Odd
alignment was to leverage some early SerDes parts which employed 20-bit
interfaces. I had always viewed this as a poor implementation dependent
tradeoff which significantly complicated GbE PCS architecture. Once
aligned, it is very difficult for an 8B/10B stream to violate Even/Odd
alignment. The error checking capabilities of 8B/10B are very robust if
used properly (e.g. by maintaining running disparity rules, etc.).
Even/Odd alignment checking is a poor tradeoff with the requirement to
align on Even/Odd boundaries and the negligible additional checking that
it provides. Even/Odd and even word alignment is irrelevant where it
counts most, within a frame or packet.
A Byte-Striping approach does not require nor benefit from Even/Odd
alignment, its implementation complexity or its impact on SerDes width
and associated power consumption.
Advantage: Byte-Striping.
8) PMD Clock Tolerance Compensation: Multiple clock domains within a PMD
may be required in some implementations in order to meet Hari or medium
interface jitter specifications. Multiple clock domains result whenever
an external clock source is used to reclock or retime the incoming
signal. Since clock frequencies will likely differ between the incoming
signal and the local source, within their tolerances, clock tolerance
compensation is required.
PMDs supporting Hari may support many different Opto-Electronic elements
and support various line rates and modulation methods. In addition at
the "magic" convergence data rate of 10 Gbps, and given prior generation
similarities in PMDs (e.g. GbE, FC), there is a strong desire for 10
Gbps PMDs to be protocol independent. Therefore, there is a strong
desire for all 10 Gbps PMDs to support a common clock tolerance
compensation method.
The PMD clock tolerance compensation method selected by Hari developers
as the most straightforward and amenable to all protocols is the "Skip"
method initially proposed for use for System I/O (a.k.a. SIO and now
called InfiniBand). The Skip method employs the insertion/removal of a
column of special code-groups during the IPG. This method is detailed in
the proposed Hari Coding Objectives presentation, Rich Taborek:
http://grouper.ieee.org/groups/802/3/10G_study/public/nov99/taborek_1_1199.pdf
This clock tolerance compensation method is equally applicable to Byte-
and Word-Striping.
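The idea behind the Skip method can be sketched as follows. This is not the mechanism from the referenced presentation, only a minimal model under stated assumptions: the stream is a list of 4-wide columns (one code-group per lane), the symbol "Skip" and the function compensate are hypothetical names, skip columns occur only in the IPG, and an inserted column is placed next to an existing one.

```python
from typing import List, Tuple

Column = Tuple[str, str, str, str]  # one code-group per lane
SKIP: Column = ("Skip",) * 4        # hypothetical skip column symbol


def compensate(columns: List[Column], adjust: int) -> List[Column]:
    """Insert (adjust > 0) or remove (adjust < 0) one skip column.

    Only an existing skip column is touched, so data columns and the
    rest of the IPG pass through unchanged.
    """
    out = list(columns)
    if adjust == 0:
        return out
    for i, col in enumerate(out):
        if col == SKIP:
            if adjust > 0:
                out.insert(i, SKIP)  # local clock is faster: pad the stream
            else:
                del out[i]           # local clock is slower: drop the column
            return out
    return out  # no skip column in this window; nothing to adjust


DATA: Column = ("d", "d", "d", "d")
IDLE: Column = ("K", "K", "K", "K")
stream = [DATA, IDLE, SKIP, IDLE, DATA]
print(len(compensate(stream, -1)), len(compensate(stream, +1)))
```

The point of the sketch is that the PMD only needs to recognize one special column, independent of the protocol carried in the data columns.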
However, Word-Striping proponents have recently proposed an alternate
PMD clock tolerance compensation method which is dramatically more
complex than the Skip method. This method requires the insertion/removal
of potentially protocol dependent ordered-sets, which are specifically
defined combinations of special and data code-groups. Specifically for
Fibre Channel, and without a change to the mapping of existing Fibre
Channel ordered-sets, a PMD may be required to recognize hundreds of
ordered-sets for removal purposes, complicating PMD implementation and
increasing power consumption. Furthermore, different protocols may
choose different ordered-sets for insertion/deletion further
complicating PMD implementation and increasing power consumption. This
method is detailed in the proposed Word Striping on Multiple Serial
Lanes, Mark Ritter, IBM:
http://grouper.ieee.org/groups/802/3/10G_study/public/nov99/ritter_1_1199.pdf
A Byte-Striping Skip method better addresses the requirements of PMD
clock tolerance compensation than does a Word-Striping Idle
insert/delete method.
Advantage: Byte-Striping.
9) Running Disparity Processing: Disparity generation and running
disparity (RD) checking is a principal part of 8B/10B error control.
This processing is significantly more difficult as data rates increase.
Coding proposals which compromise the RD error control feature of 8B/10B
code are likely to be better served by a code which is more efficient.
The simplest RD processing methodology is one which performs that
processing separately for each lane without any correction or
cross-lane processing. The Byte-Striping approach supports this
methodology and enables RD processing at virtually any reasonable rate
including in units of 40-bit words per lane (i.e. once each 12.8 ns or
78 MHz).
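Per-lane RD checking can be sketched as below. This is a simplified model, not a full 8B/10B checker: a real decoder evaluates disparity per 6-bit and 4-bit sub-block, whereas this sketch only looks at whole 10-bit code-groups. The function names are illustrative. Each lane is checked on its own, with no cross-lane state.

```python
from typing import List, Optional, Tuple


def disparity(code_group: str) -> int:
    """Number of ones minus number of zeros in a 10-bit code-group."""
    return code_group.count("1") - code_group.count("0")


def check_lane_rd(code_groups: List[str],
                  rd: int = -1) -> Tuple[int, Optional[int]]:
    """Track running disparity on one lane.

    Returns (final_rd, index_of_first_violation or None). rd is -1 or +1.
    """
    for i, cg in enumerate(code_groups):
        d = disparity(cg)
        if d == 0:
            continue  # neutral code-group: RD is unchanged
        # A non-neutral code-group must be +/-2 and must flip the current RD.
        if abs(d) != 2 or (d > 0) == (rd > 0):
            return rd, i
        rd = -rd
    return rd, None


K28_5_NEG = "0011111010"  # K28.5 as sent when RD is negative
K28_5_POS = "1100000101"  # K28.5 as sent when RD is positive

print(check_lane_rd([K28_5_NEG, K28_5_POS, K28_5_NEG]))
print(check_lane_rd([K28_5_NEG, K28_5_NEG]))
```

The first lane passes and ends with positive RD; the second is flagged at index 1 because two positive-disparity code-groups arrive back to back.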
Word-Striping proponents advocate the complex recalculation of disparity
for 12.5 Gbps Serial LAN PHY PMDs. RD recalculation is required since
words from each lane are multiplexed in rotating order to form a single
serial 12.5 Gbaud stream. Since the beginning and ending RD's of the
multiplexed words may not match (~50% of the time), RD recalculation via
methods such as complementing disparity dependent 6B and 4B vectors must
be employed where necessary. This method is detailed in the proposed
Word Striping on Multiple Serial Lanes, Mark Ritter, IBM:
http://grouper.ieee.org/groups/802/3/10G_study/public/nov99/ritter_1_1199.pdf
page 11.
An alternate Serial PMD employing Byte-Striping that has been proposed
by Rick Walker and Richard Dugan of Agilent in Kauai
http://grouper.ieee.org/groups/802/3/10G_study/public/nov99/walker_1_1199.pdf,
supports Hari and its 8B/10B and then recodes the data with a Low
overhead coding I refer to as 64B/66B, resulting in the transport of 10
Gbps of Ethernet data at the more reasonable line rate of 10.3125 Gbps,
allowing the use of virtually all existing OC-192 class optoelectronic
components.
In summary Byte-Striping enables the simplest RD processing for all LAN
PHY PMDs.
Advantage: Byte-Striping.
10) Preservation of Ordered-Sets: Some protocols heavily utilize
ordered-sets. Most notable is Fibre Channel. As a result, it is
advantageous to preserve those ordered-set definitions to enable the
smoothest migration to 10 Gbps products. However, this requirement
conflicts with a desire to specify protocol independent PMDs, especially
since significant differences exist between protocols at 1 Gbps and
migration to 10 Gbps with the fewest protocol changes is desirable for
each.
Byte-Striping and other objectives specified in the proposed Hari Coding
Objectives presentation, Rich Taborek:
http://grouper.ieee.org/groups/802/3/10G_study/public/nov99/taborek_1_1199.pdf
suggest minor changes to 1 GbE coding to simultaneously arrive at a 10
GbE and protocol independent PMD. InfiniBand coding objectives are in
line with the latter proposal.
The proposed Word Striping on Multiple Serial Lanes, Mark Ritter, IBM:
http://grouper.ieee.org/groups/802/3/10G_study/public/nov99/ritter_1_1199.pdf
suggests a significant departure from 1 GbE coding to 10 GbE. In
addition, this proposal drastically violates Fibre Channel ordered-set
rules in its 10 GbE proposed encodings by not observing the
/K28.5/Dxx.y/Dxx.y/Dxx.y/ format. Proposed 10 GbE mappings include all
of the following formats. Some of the mappings have apparent
inconsistencies. All violate FC ordered-set rules:
/Dxx.y/Dxx.y/Dxx.y/K28.5/ - Idle Comma position reversed
/K27.7/Dxx.y/Dxx.y/K28.5/ - Start of Packet, Comma during Ethernet Preamble
/K29.7/K28.0/K28.0/K28.5/ - End of Packet in byte 0
/Dxx.y/K29.7/K28.0/K28.5/ - End of Packet in byte 1
/Dxx.y/Dxx.y/K29.7/K28.5/ - End of Packet in byte 2
/Dxx.y/Dxx.y/Dxx.y/K29.7/ - End of Packet in byte 3
By making the assumption that the proposed 10 GbE mappings above or
other changes will be suggested for FC (the Ritter proposal has been
agendized for the Fibre Channel meetings to be held in Reno, NV the week of
December 6, 1999), FC mappings will be drastically changed. Alternative
remappings more efficiently supporting 10 GbE, InfiniBand and Fibre
Channel and Byte-Striping have been proposed during Hari development.
In summary, some compromise must be made in all protocols in order to
have protocol independent PMDs. Byte-Striping is at the heart of the
compromise between 10 GbE and InfiniBand. Fibre Channel would do well to
follow this compromise.
Advantage: Byte-Striping.
11) Preservation of Existing SerDes Designs: Byte-Striping requires
that multiple lanes and skew be considered in the SerDes design, whereas
Word-Striping does not and can reuse existing 3.125 Gbps SerDes, if any.
Not much to be said here. I view this Word-Striping advantage as similar
to the preservation of separately discharging hot and cold water faucets
(remember York :-)
Advantage: Word-Striping.
12) Logic Complexity:
Word-Striping: Additional logic including high-speed buffering would be
required to properly map the MAC stream to Hari employing Word-Striping.
16-octet granularity processing would be required since no data can be
forwarded across Hari until 4 MAC Words are completely received.
Extraneous high-speed logic is required to perform the deskew function
past the SerDes (the Byte-Striped approach eliminates it). PMDs may be required to
recognize hundreds of ordered-sets for clock tolerance compensation
purposes. Complex running disparity recalculation is required to support
12.5 Gbps Serial LAN PHY PMDs.
Byte-Striping: Additional logic is required between all lane
Deserializers to utilize comma alignment information in order to deskew
the multiple lanes and forward 40-bit words to the next stage.
Advantage: Byte-Striping.
13) Power Consumption:
Word-Striping: Additional logic including high-speed buffering to store
portions of deserialized words, post deserializer deskew logic,
ordered-set compare logic, running disparity recalculation logic and
SerDes designs greater than 10-bits wide significantly increase power
consumption.
Byte-Striping: Additional logic is required between all lane
Deserializers to utilize comma alignment information in order to deskew
the multiple lanes and forward 40-bit words to the next stage. However,
implementations which optimize SerDes designs for multi-lane
applications with a focus on reducing power consumption should be
explored in detail for 10 Gbps applications.
Advantage: Byte-Striping.
14) Patent Protection:
Word-Striping: IBM patent
Byte-Striping: None known.
Advantage: Byte-Striping.
Final Results
-------------
An X in a column indicates an advantage. Ties are blank.
Striping Evaluation Criteria | Byte | Word |
---------------------------------------+------+------+
1) MAC stream mapping | X | |
2) Striping Latency | X | |
3) Skew compensation | X | |
4) Train-up sequences | | |
5) Data processing rate | | |
6) SerDes width | X | |
7) Even/Odd alignment | X | |
 8) PMD clock tolerance compensation | X | |
 9) Running disparity adjustment | X | |
10) Preservation of ordered-sets | X | |
11) Preservation of existing SerDes | | X |
12) Logic complexity | X | |
13) Power consumption | X | |
14) Patent protection | X | |
---------------------------------------+------+------+
Total Advantages | 11 | 1 |
--
Best regards,
Your friendly Systems Architect,
Rich
----------------------------------------------------------
Richard Taborek Sr. 1441 Walnut Dr. Campbell, CA 95008 USA
Tel: 408-330-0488 or 408-370-9233 Cell: 408-832-3957
Email: rtaborek@xxxxxxxxxx or rtaborek@xxxxxxxxxxxxx