
RE: XAUI/XGXS protocol




Kamran,

I too am having difficulty putting the pieces together for the SLP proposal.
I would like to focus on just one of Rich's comments below, but with a
swizzle. In short, I simply cannot see how the parallel (4 lane)
implementation works.

Before I go into any details, I would like to make the following comments:
1. I cannot see how an SLP serial solution could meet all of the objectives
with the serial PMD proposals on the table (distance over MMF).
2. It is therefore required that SLP support at least one of the WWDM, PAM,
or Parallel proposals.
3. I simply can't make the jump from SLP to PAM, so I won't try.
4. SLP must therefore support either 4, 8, or 12 lanes for either the WWDM
or Parallel proposals OR SLP will be used only with serial and something
else would be used for the lane'd versions.
5. Since the committee seems to be most interested in a Unified Phy
solution, I don't like the second option of (4).
6. Therefore, SLP must support multiple lanes to meet the objectives.

Rich makes a number of assumptions in his analysis of this case related to
the XAUI/XGXS interface and the PCS supporting SLP. While these may be
dismissed because XAUI is not officially adopted, I don't consider the PMA
interface (also four lanes) optional (based on the comments above). 

I therefore have a number of things that I would like clarifications on:
1. Am I correct that in a 4-wide WWDM PMA interface (for instance) we would
see the T-Flag (12 bytes) split between the 4 lanes, becoming 3 bytes on
each lane? (Reference: pages 4 and 13 of your New Mexico presentation,
http://grouper.ieee.org/groups/802/3/ae/public/mar00/azadet_1_0300.pdf)
2. If so, how is the synchronization and deskew accomplished on this set of
3 byte fields (continued in questions below)?
3. I assume that the matching to the T-Flag must occur across all four
lanes. Otherwise the calculations on page 7 would use 24 instead of 96 and
go to you know where. What is the proposed scheme for keeping this robust
without misaligning the columns, and making the synchronization complicated?
According to my brief calculations, a match of the 24 bits in random data
(assuming an idle byte is never dropped) occurs on average 60 times per
second, and that assumes you never match across byte boundaries (a rather
bad assumption, no?).
4. Can I assume that the EOP (byte 1 of the T-Flag) always exists on the
same lane? If so, what are the assumptions about maintaining lane integrity?
Is the scheme consistent with the serial assumptions?
5. If you scramble the control codes (page 9), how do you bit sync prior to
the demux since you can't descramble until you have alignment, or, is the
intent to descramble in the serial mode? What is the logic for
simultaneously bit syncing, descrambling, and detection of the alignment? If
it isn't simultaneous, what is the "process?"
6. What is the math that shows the time to resync (bit to byte to lane) all
lanes under worst case assumptions about detection errors (as in question
number 3 above)?
7. If you don't scramble, does the idle stream on one of the lanes become
something like: 1011,0001 (randomly chosen from page 11) repeated
indefinitely? Three of the four lanes would have this exact pattern
simultaneously?
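
As a rough check on the arithmetic behind question 3, the expected rate of
accidental matches of a fixed pattern can be sketched as below. This is only
an illustration: it assumes uniformly random data, matching at byte-aligned
positions only, and a rate of byte positions examined per second that I have
picked myself (1.25e9 is 10 Gb/s divided by 8; 1.0e9 is shown for
comparison). The function name is hypothetical and nothing here comes from
the SLP proposal.

```python
def false_matches_per_second(pattern_bits: int, positions_per_s: float) -> float:
    """Expected accidental matches per second of one fixed pattern in random data."""
    # Each examined position matches with probability 2**-pattern_bits.
    return positions_per_s / 2 ** pattern_bits


# Assumed rates of byte-aligned positions examined per second (not from the proposal).
for rate in (1.25e9, 1.0e9):
    per_lane_24 = false_matches_per_second(24, rate)   # 3 bytes seen on one lane
    all_lanes_96 = false_matches_per_second(96, rate)  # full 12-byte T-Flag
    print(f"{rate:.2e} positions/s: 24-bit {per_lane_24:.1f}/s, "
          f"96-bit {all_lanes_96:.2e}/s")
```

Under these assumptions a 24-bit match turns up tens of times per second
(about 60 per second at 1.0e9 positions per second), while a 96-bit match is
vanishingly rare, which is why matching per lane instead of across all four
lanes changes the picture so much.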

I have more questions, but these, coupled with Rich's below, seem adequate as
a starting point.

jonathan

> -----Original Message-----
> From: Rich Taborek [mailto:rtaborek@xxxxxxxxxxx]
> Sent: Thursday, March 16, 2000 7:34 PM
> To: HSSG
> Subject: XAUI/XGXS protocol
> 
> 
> 
> Kamran,
> 
> I've initiated a new thread since it's changed drastically. Please see my
> comments below:
> 
> Kameran Azadet wrote:
> > 
> > Hi Rich,
> > 
> >         here are answers to your questions:
> > 
> > >Take SLP as an example. SLP mapped to XAUI would logically be mapped in
> > >one of two ways. Theoretically, many other striping granularities, such
> > >as word-striping, are possible but I fail to see any benefits in doing so:
> > >
> > >1) Stripe SLP frames one-by-one on each of 4 lanes;
> > >2) Stripe SLP frames byte-by-byte on 4 lanes.
> > 
> > I don't know where "SLP frames" came from? SLP is a byte by byte mapping
> > of packets (for instance ethernet packets). We never suggested to send
> > frames on each lane. For a XAUI type interface the striping could be
> > anything, byte or word striping or other schemes.
> 
> Your response is in line with my evaluation. I'm sorry about posing the
> question this way, but your proposal left me no choice but to ask since you
> have mentioned SLP as being a candidate for the XAUI/XGXS but there is
> nothing in your Albuquerque proposal that suggests how this may work. Your
> response suggesting XAUI SLP word striping leaves me a bit confused though.
> I'll assume byte-striping as suggested in your Albuquerque proposal and your
> first sentence in the above paragraph until I'm corrected.
> 
> > >Obviously method (1) striping whole SLP frames on each lane does not lend
> > >itself to chip-to-chip interconnect or backplane applications since the
> > >buffer sizes and control logic required to buffer variable sized SLP
> > >frames encapsulating
> > 
> > Yes, but no-one ever proposed this...
> 
> I agree. It's dropped.
> 
> > >Ethernet packets would be unwieldy. Method (2) is fatally flawed for the
> > >following set of reasons:
> > >
> > >a) Individual Lane as well as 4-lane Link synchronization is not
> > >achievable with the present SLP proposal since no deskew methodology has
> > >been proposed. Note that SLP non-scrambled /A/K/R/ sequences as used by
> > >8B/10B would create an EMI exposure similar to that of 8B/10B and the
> > >insertion/removal of /R/s would invalidate the SLP T-flag (EOP
> > >indication).
> > 
> > Regardless of whether a packet delineation scheme is scrambled or
> > un-scrambled, bytes sent on four lanes can be scrambled, using
> > EMI-optimized scrambling. In their SUPI proposal, Nortel presented an
> > elegant way to do deskewing in a "protocol independent" fashion. For
> > instance by removing one byte of idle per IPG, one can insert un-scrambled
> > columns of control characters such as:
> > A K R
> > A K R
> > A K R
> > A K R
> > (to be consistent with your byte striped HARI proposal). This requires an
> > insert/delete FIFO. A, K and R are bytes instead of 10b words. One can
> > remove a whole column of Rs without affecting the T-flag. Please note the
> > line rate on each lane can be exactly 2.500 Gb/s. No rate conversion
> > circuitry is necessary.
> 
> For one to remove a whole column of R's one must first locate the column of
> R's. For an 8B/10B based XAUI/XGXS it's quite simple and low power:
> 
> 1) A XAUI receiver takes in four incoming bit streams which are frequency
> but not phase locked;
> 2) 4 separate synchronization circuits lock to either one of two 7-bit comma
> patterns to achieve lane synchronization and byte alignment. Note that this
> can happen on the first comma detected;
> 3) Parallel logic at 312.5 MHz aligns /A/ codes on all 4 lanes to deskew the
> byte-striped information.
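
The three 8B/10B steps above can be modelled in software roughly as follows.
This is a minimal sketch, not a description of any real XAUI receiver: bit
streams are plain strings of '0' and '1', the function names are
hypothetical, and the set of code-groups accepted as /A/ is passed in by the
caller as an assumption. The two comma values are the standard 7-bit 8B/10B
comma patterns.

```python
COMMAS = ("0011111", "1100000")  # the two 7-bit 8B/10B comma patterns


def find_comma(bits: str):
    """Return the bit offset of the first comma in the stream, or None."""
    hits = [i for i in (bits.find(c) for c in COMMAS) if i >= 0]
    return min(hits) if hits else None


def align_code_groups(bits: str):
    """Step 2: cut one lane into 10-bit code-groups starting at the first comma."""
    start = find_comma(bits)
    if start is None:
        return []
    return [bits[i:i + 10] for i in range(start, len(bits) - 9, 10)]


def deskew(lanes, a_codes):
    """Step 3: drop leading code-groups so the first /A/ lines up on every lane.

    Raises StopIteration if a lane contains no /A/ code-group.
    """
    trimmed = []
    for groups in lanes:
        first_a = next(i for i, g in enumerate(groups) if g in a_codes)
        trimmed.append(groups[first_a:])
    return trimmed


# Example: 3 stray bits followed by K28.5 in both running disparities.
lane = "101" + "0011111010" + "1100000101"
print(find_comma(lane))
print(align_code_groups(lane))
```

The point of the sketch is that each lane aligns on its own with a 7-bit
match, and only the already byte-aligned code-groups need to be compared
across lanes for deskew.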
> 
> The synchronization process for SLP is significantly more complex. Please
> correct me if I'm wrong. I'm going to try and design the simplest SLP
> synchronization process.
> 
> ASIDE: Please note that I'm having a very difficult time comparing SLP and
> SUPI to 8B/10B as XAUI/XGXS alternatives because there are no complete SLP
> and SUPI XAUI/XGXS alternative proposals on the table. This is especially
> true for SLP. Your statements that SLP could be byte or word striped and
> that the control codes may or may not be scrambled make a world of
> difference when I try to architect a working solution, never mind trying to
> figure out how to optimize an implementation. Therefore: I'll make another
> assumption here and assume that SLP control codes are NOT scrambled.
> 
> 1) A XAUI receiver takes in four incoming bit streams which are frequency
> but not phase locked;
> 2) Since there is no way to reliably determine byte boundaries nor infer
> whether the bit stream is scrambled or not at this point, synchronization
> must occur across 4 lanes simultaneously;
> 3) Synchronization involves the detection of a possible match of a 16-bit
> pattern of non-scrambled data in the presence of up to 37 bits of skew = 53
> bit/lane pattern match assuming an 8-bit deserializer width and
> implementation. "Possible" refers to the relatively high likelihood that
> the AK pattern will appear in a random data stream in any of the 4 lanes,
> producing false synchronization;
> 4) Deskew requires both bit and byte alignment across 4 lanes within the
> deskew window.
> 
> Once the receiver is in sync, /R/ columns can be inserted or removed.
> 
> I am concerned about both the size of the pattern match and the rate at
> which this match must be performed. For example, is a serial shift register
> used to perform this pattern match at the bit rate? If not, the multiplexing
> logic used to ensure a 4-lane simultaneous pattern match across a 53-bit
> space is very unwieldy compared to the 8B/10B 7-bit comma match.
> 
> I believe that the 25% overhead of 8B/10B at this level is an excellent
> tradeoff against the complex and statistical nature of SLP. Note that 8B/10B
> requires absolutely no rate matching since 8-bit data is transported at
> exactly the same rate as 10-bit data. Both move at 312.5 MHz in chunks of 4
> bytes/code-groups.
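
The 25% overhead and the common 312.5 MHz clock mentioned above follow from
simple arithmetic, sketched here for one lane. The variable names are my
own; the only inputs are the 2.5 Gb/s per-lane data rate and the 10-bits-per-
byte expansion of 8B/10B.

```python
DATA_RATE_PER_LANE = 2.5e9  # payload bits per second on each lane
LANES = 4

# 8B/10B sends a 10-bit code-group for every 8-bit byte.
line_rate = DATA_RATE_PER_LANE * 10 / 8
overhead = line_rate / DATA_RATE_PER_LANE - 1

byte_clock = DATA_RATE_PER_LANE / 8   # bytes per second per lane
code_group_clock = line_rate / 10     # code-groups per second per lane

print(f"line rate per lane : {line_rate / 1e9} Gbaud")
print(f"coding overhead    : {overhead:.0%}")
print(f"byte clock         : {byte_clock / 1e6} MHz")
print(f"code-group clock   : {code_group_clock / 1e6} MHz")
print(f"aggregate payload  : {LANES * DATA_RATE_PER_LANE / 1e9} Gb/s")
```

Because the byte clock and the code-group clock come out identical, a
4-byte column and a 4-code-group column move at the same rate, which is the
sense in which no rate matching is needed.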
> 
> > >b) The SLP proposal does not support clock tolerance compensation as does
> > >8B/10B by the insertion/removal of /R/ columns since doing so would
> > >compromise the SLP T-flag. Such a compromise significantly increases the
> > >probability of false match as well as undetected packet error, not to
> > >mention the complexity of the synchronization logic.
> > 
> > T-flag is not affected. Probability of false match is, if I remember
> > well, on the order of once every 2000 billion years for perfect match, and
> > once every ~14 million years for match with 3 bit error tolerance.
> 
> Your statement above greatly concerns me: "One can remove a whole column of
> Rs without affecting the T-flag". It is clear that removing a column of /R/s
> significantly increases the probability of false EOP, and worse yet,
> undetected error since the T-flag may be reduced from 12-bytes to 8-bytes at
> the receiver upon /R/ column removal. This problem does not exist with an
> 8B/10B-based XAUI/XGXS. Please check your calculations.
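
The false-match figures quoted above can be reproduced approximately with the
sketch below, and the same function shows how sensitive the result is to the
flag length. The assumptions are mine and are not taken from the SLP
presentation: uniformly random data, one byte-aligned match attempt per byte
at 1.25e9 bytes per second, and a match accepted whenever the received bits
are within a given Hamming distance of the flag. The 64-bit case is purely
illustrative of an 8-byte flag.

```python
from math import comb

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_between_false_matches(pattern_bits: int, max_bit_errors: int,
                                positions_per_s: float) -> float:
    """Mean time in years between accidental matches in random data."""
    # Count the bit patterns within max_bit_errors of the flag (Hamming ball).
    accepted = sum(comb(pattern_bits, k) for k in range(max_bit_errors + 1))
    p_match = accepted / 2 ** pattern_bits
    return 1.0 / (p_match * positions_per_s) / SECONDS_PER_YEAR


RATE = 1.25e9  # assumed byte-aligned positions per second (10 Gb/s / 8)

for bits in (96, 64):
    for tolerance in (0, 3):
        years = years_between_false_matches(bits, tolerance, RATE)
        print(f"{bits}-bit flag, {tolerance}-bit tolerance: {years:.3g} years")
```

With these assumptions the 96-bit flag gives roughly 2e12 years for an exact
match and roughly 1.4e7 years with a 3-bit tolerance, in line with the
numbers quoted. Shortening the matched pattern to 64 bits reduces both
intervals by a factor of 2**32 or more, which is the concern raised about
/R/ column removal.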
> 
> > >c) Synchronization is slow and complex since it is based on the detection
> > >of multiple possible control patterns, as is the case when supporting
> > >other 10 GbE features such as Busy Idle, or any similar special control
> > >codes which have been proposed for supporting all-optical switches (NTT
> > >Albuquerque proposal) or Fibre Channel/InfiniBand specific control codes.
> > 
> > Synchronization is not slow and complex. I also disagree with your
> > statement that SLP is specific to ethernet, and can not support a large
> > set of control codes. We showed that with a 4b Hamming distance one can
> > support up to 16 control characters. By using a control character such as
> > "O", one can create ordered set control for fibre channel (ODDDODDD...).
> > It can support other control words too (up to 16).
> 
> Additional control codes such as Busy Idle or Fibre Channel Ordered Sets
> would interrupt the /A/K/R/ Idle pattern. This is not a problem for an
> 8B/10B-based XAUI/XGXS in terms of synchronization time or complexity. They
> are essentially passed straight through. SLP on the other hand would have to
> deal with multiple 4-lane T-Flags unless you impose additional rules
> requiring a longer minimum IPG for SLP or switch to a shorter T-Flag, which
> greatly increases the probability of false EOP and the undetected error
> rate. Note that multiple T-Flags dramatically increase the complexity of
> synchronization pattern match logic since the T-Flag pattern is essentially
> the synchronization pattern.
> 
> > >d) Single lane synchronization, as simple as 8B/10B comma detection, is
> > >not possible since SLP requires the detection of multiple long patterns
> > >across all lanes to acquire link sync. Link and system problem
> > >determination as well as scalability to higher speeds are all
> > >significantly compromised.
> > 
> > The scheme described above is scalable to other speeds.
> 
> The probability of false sync increases with an increase in lanes since SLP
> link synchronization requires a pattern match across all lanes in the
> presence of skew. Any 16-bit match of multiple possible T-Flag lane patterns
> in the data in any lane would result in false sync.
>  
> > >e) SLP frames provide no error detection or containment capabilities for
> > >transported data thereby increasing the undetected error rate of the
> > >link.
> > 
> > 64b/66b does not provide data error detection capability. How come
> > ethernet works well with 64b/66b that does not support error detection?
> 
> This entire thread does not concern 64B/66B since 64B/66B has never been
> formally proposed as a coding for XAUI/XGXS. I'd like to limit this thread
> to XAUI/XGXS protocol only. My point (e) is that 8B/10B provides very robust
> data error detection capability and SLP provides essentially "negative"
> error detection capability since false T-Flags can indeed create false
> "short" packets.
>  
> > >To a lesser extent (not much less) the same general argument above
> > >applies to the use of SUPI or 64B/66B as a XAUI encoding.
> > 
> > SUPI was proposed by Nortel more as a PMD interface, not for XAUI, but I
> > don't see why it couldn't work for XAUI.
> 
> For many of the same reasons I've given above comparing an 8B/10B-based
> XAUI/XGXS to an SLP-based XAUI/XGXS.
>  
> > on the other hand:
> > 
> > 64b/66b does not work for XAUI. The encoding does not allow mapping things
> > like
> > ------------------------------
> > D D D T I I I S, which you can observe on a lane of HARI. Therefore 64b/66b
> > is not a candidate. And I don't recall Rick Walker or Rich Dungan ever
> > saying that it could be used in HARI.
> 
> Rick Walker, Don Alderrou (nSerial) and I discussed a 64B/66B-based
> XAUI/XGXS in Albuquerque over lunch and concluded that it wasn't such a
> great idea for many of the reasons I presented above. It is clearly more
> reliable than SLP since 66B frames are very well defined and easy to check.
> In that sense it does offer excellent error detection for control
> information including robust EOP and SOP as well. The down sides are that
> 66B frames would have to be striped across each lane, increasing latency,
> and synchronization of 66B frames on each lane is a statistical (read: slow)
> process.
> 
> Note that "D D D T I I I S" is NOT a possible XAUI pattern if I interpret it
> correctly. I assume that the first "D" corresponds to lane 0 and the "S"
> corresponds to lane 3. To date, the only proposals on the table for 10 GbE,
> 10 GFC and InfiniBand all have "S" in lane 0 only. The Albuquerque 64B/66B
> proposal supports all proposed XGMII and XAUI/XGXS patterns.
> 
> > Hope this clarifies a bit
> 
> To clarify things, I think I'd like to see a concise and complete SLP, SUPI
> or 64B/66B alternative coding proposal for XAUI/XGXS. Right now, no such
> animal exists. It would be much easier for the 802.3ae committee and
> reflector participants to compare such a proposal to the strongly supported
> 8B/10B-based XAUI/XGXS proposal.
> 
> In closing, one related issue that you did not touch on is EMI. Reflector
> traffic recently has been very high in the EMI area and I agree the concern
> is well placed. 8B/10B codes, especially repetitive codes sent during IPG,
> tend to have high EMI. Don and I have come up with some very simple
> solutions which I'll share in another thread. However, in this note you
> proposed an SLP /A/K/R/K/... repeating pattern which is non-scrambled. You
> might have Joel Goergen take a look at this pattern at the SLP 2.5 Gbps with
> respect to its current 8B/10B equivalent at 3.125 Gbps on his current test
> setup to see how much worse the SLP Idle pattern is in terms of EMI. I
> haven't run it, but I'd be interested in seeing the results. The exact
> repeating patterns for both are as follows. Note that I simply picked the
> first three control codes from slide 11 of your presentation to represent
> SLP /A/K/R/. Please feel free to optimize this code choice for EMI
> yourself:
> 
> SLP repeating Idle pattern: 
> 00010111 01001110 01110100 01001110 01110100 01001110 
> 01110100 01001110 01110100
> 01001110 01110100 01001110 01110100 01001110 01110100 01001110
> 
> 8B/10B repeating Idle pattern: 
> 001111 0011 110000 0101 001111 0101 001111 1010 110000 1011 
> 110000 0101 001111
> 0101 001111 1010 110000 1011 110000 0101 001111 0101 001111 
> 1010 110000 1011
> 110000 0101 001111 0101 001111 1010 110000 1100 001111 1010 
> 110000 1011 110000
> 0101 001111 0101 001111 1010 110000 1011 110000 0101 001111 
> 0101 001111 1010
> 110000 1011 110000 0101 001111 0101 001111 1010 110000 1011 
> 110000 0101
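
Before any bench measurement, a few crude properties of a repeating idle
pattern can be computed directly from its bits, as sketched below for the
SLP bytes listed above. These are only rough proxies (ones density, longest
run, repeat period); they say nothing definitive about EMI, which needs a
spectral measurement on real hardware. The helper names are hypothetical.

```python
from itertools import groupby


def ones_fraction(bits: str) -> float:
    """Fraction of '1' bits; 0.5 means the pattern is DC balanced."""
    return bits.count("1") / len(bits)


def longest_run(bits: str) -> int:
    """Length of the longest run of identical bits."""
    return max(len(list(group)) for _, group in groupby(bits))


def shortest_period(bits: str) -> int:
    """Smallest shift p at which the sequence repeats itself exactly."""
    for p in range(1, len(bits)):
        if bits[p:] == bits[:-p]:
            return p
    return len(bits)


# The three SLP control bytes used in the idle pattern quoted above.
A, K, R = "00010111", "01001110", "01110100"
slp_idle = A + (K + R) * 7 + K   # the 16 bytes as listed: A K R K R ... K
kr_loop = (K + R) * 4            # steady-state K/R alternation

print(ones_fraction(slp_idle), longest_run(slp_idle), shortest_period(kr_loop))
```

A short repeat period means the energy of an unscrambled idle stream is
concentrated in a small number of spectral lines, which is the usual reason
repetitive idle patterns raise EMI concerns.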
>    
> > Regards,
> > 
> > Kamran
> 
> -- 
> 
> Best Regards,
> Rich
>                                       
> ------------------------------------------------------- 
> Richard Taborek Sr.                 Phone: 408-845-6102       
> Chief Technology Officer             Cell: 408-832-3957
> nSerial Corporation                   Fax: 408-845-6114
> 2500-5 Augustine Dr.        mailto:rtaborek@xxxxxxxxxxx
> Santa Clara, CA 95054            http://www.nSerial.com
>