
Re: Possible problem with deskew in byte striping




Rich,

Thanks for the response.  A couple of follow-up questions
and comments are embedded below.

I'll put one comment here, since it's of wider interest.
The skew sources identified in Joel Dedrick's paper are
not present at the input to a HARI interface.  They arise
inside the receiver implementation.  Hence, they might
influence the initialization patterns, but the HARI
skew spec (i.e., at the HARI inputs) should not change.
Do you agree?

Regards,
Mike

Rich Taborek wrote:
> 
> Mike,
> 
> My responses to your queries are interspersed below:
> 
> Mike Jenkins wrote:
> >
> > Rich,
> >
> > I believe I can see some difficulties predicted in Joel Dedrick's
> > presentation, "New Sources of Lane to Lane Skew, and a Proposal
> > for Alignment", given at the recent HSSG meeting.
> > The reference is:
> >
> > http://grouper.ieee.org/groups/802/3/10G_study/public/jan00/dedrick_1_0100.pdf
> >
> > I noted in your presentation that you were modifying the byte
> > stripe coding proposal as suggested by Joel to accommodate the
> > extra skew he predicts.  The component of extra skew identified
> > as due to quantization to the "read clock" period is an extra
> > 10 UI (3.2 ns).  Here are the implications I see of this effect:
> >
> > 1) As the presentation states: "Deskew circuits must be in a
> >    common clock domain, so operate on the data after this
> >    additional skew is inserted."
> >
> >    As described in Mark Ritter's November 99 presentation,
> >
> > http://grouper.ieee.org/groups/802/3/10G_study/public/nov99/ritter_1_1199.pdf
> >
> >    word striping can establish a common clock domain BEFORE the
> >    FIFO since the word 'data valid' time exceeds the skew.  (I.e.,
> >    a clock from any lane can latch all four lanes.)  Therefore,
> >    word striping does not incur this extra skew.
> 
> The additional skew incurred by Joel's implementation is the result of
> performing comma detection in parallel logic in order to reduce the function and
> power consumption of the baud rate SerDes. This is explained on page 3 of Joel
> Dedrick's Dallas '00 presentation:
> 
> http://grouper.ieee.org/groups/802/3/10G_study/public/jan00/dedrick_1_0100.pdf
> 
> A Word-Striping implementation with the same power reduction goals may encounter
> the same additional skew, identified as SerDes(Rx) skew on page 6 of Joel's
> presentation.
> 
> The revised Hari skew budget proposed on page 6 of Joel's presentation is 37 UI,
> up from 20 UI. Neither the 40-bit proposed Hari KR Idle pattern nor the fixed
> 40-bit Word-Striped Idle is capable of performing deskew in excess of +/- 20
> bits. Joel's proposed solution is to simply replace the first and every 16th
> column of R's in the KR Idle with a column of A's. This solution provides a +/-
> 80-bit deskew.

	The effect I was discussing is not the byte alignment issue
	on page 3 of Joel's presentation, but the clock period
	quantization on page 4. 
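
For reference, the numbers quoted above can be checked with a few lines of
Python. This is only a sketch under two assumptions that are not stated in
the thread: that each lane runs at 3.125 Gbaud (which matches the quoted
"10 UI (3.2 ns)"), and that an alignment marker repeating every N bits can
resolve skew of at most +/- N/2 bits. The function names are hypothetical.

```python
# Assumption: each HARI lane runs at 3.125 Gbaud, so 1 UI = 0.32 ns.
BAUD_RATE_HZ = 3.125e9
UI_NS = 1e9 / BAUD_RATE_HZ

BITS_PER_COLUMN = 10  # one 10-bit code group per lane per column


def ui_to_ns(ui):
    """Convert a duration in unit intervals to nanoseconds."""
    return ui * UI_NS


def deskew_range_bits(marker_period_bits):
    """Assumed model: a marker repeating every N bits resolves +/- N/2 bits."""
    return marker_period_bits // 2


def can_deskew(marker_period_bits, skew_budget_ui):
    """True if the assumed deskew range covers the skew budget."""
    return deskew_range_bits(marker_period_bits) >= skew_budget_ui


kr_idle_period = 40                      # 40-bit KR Idle pattern
a_column_period = 16 * BITS_PER_COLUMN   # an A column every 16th column

for name, period in [("KR Idle", kr_idle_period), ("A every 16", a_column_period)]:
    print(name, "range +/-", deskew_range_bits(period), "bits;",
          "covers 20 UI:", can_deskew(period, 20),
          "covers 37 UI:", can_deskew(period, 37))
```

Under this model the 40-bit pattern covers the original 20 UI budget but not
the revised 37 UI budget, while an A column every 16 columns (160 bits) gives
the +/- 80 bits Rich cites.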
> 
> > 2) I believe this extra skew adds directly to latency of the
> >    byte stripe scheme.
> 
> I fully agree that this additional skew adds latency. However, the additional
> skew is implementation dependent and independent of striping methodology.

	
	Since word striping can latch all lanes with a common clock 
	before the FIFO, this problem DOES NOT EXIST in word striping.
	[Joel, if I am misusing your results, please correct me.]
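
To make the quantization effect concrete, here is a rough sketch. It assumes
each lane's data is retimed to the next edge of a read clock with a 10 UI
period, which is my simplified reading of the effect and not a model taken
from Joel's slides. The lane arrival times are made up for illustration.

```python
import math

READ_CLOCK_PERIOD_UI = 10  # "byte" clock at 1/10 the bit rate


def quantized_arrival(arrival_ui, period_ui=READ_CLOCK_PERIOD_UI):
    """Round an arrival time up to the next read-clock edge (assumed model)."""
    return math.ceil(arrival_ui / period_ui) * period_ui


def skew(arrivals):
    """Lane-to-lane skew: latest arrival minus earliest arrival."""
    return max(arrivals) - min(arrivals)


# Hypothetical arrival times (in UI) for four lanes.
lanes = [0.0, 3.0, 9.5, 10.5]
quantized = [quantized_arrival(t) for t in lanes]

print("skew before quantization:", skew(lanes), "UI")
print("skew after quantization: ", skew(quantized), "UI")

# A small drift across a clock edge moves the lane by a whole period.
print("lane at  9.9 UI is read at", quantized_arrival(9.9), "UI")
print("lane at 10.1 UI is read at", quantized_arrival(10.1), "UI")
```

In this model the skew seen after retiming can grow by almost one read-clock
period, and a lane that drifts by a fraction of a UI across an edge jumps by
a full 10 UI.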

> 
> > 3) (Most importantly) I confess that I do not clearly see how
> >    the byte stripe deskew would be accomplished since no one
> >    has outlined any algorithm for doing so.  But, I think I see
> >    in this effect the sort of thing that has always worried me
> >    about implementing byte striping....After a byte-striped HARI
> >    has deskewed, what happens when the dynamic component of skew
> >    for one of the lanes drifts across one of the read clock
> >    quantization thresholds as implied in Joel's presentation?
> 
> An example Column-Striped implementation which easily handles dynamic skew can
> operate as follows: A receive PLL tracks dynamic skew at the bit rate for each
> lane. "Byte" clocks running at 1/10 the bit rate pass deserialized data to
> slower parallel logic. A "byte" clock corresponding to the slowest lane +
> dynamic skew + design margin is used to deskew all lanes and pass aligned 40-bit
> words to decode logic.

	The "slower parallel logic" is still running at 312.5 MHz!

	Also, how do you determine which "byte" clock corresponds
	to the slowest lane, and so should be used to deskew all
	lanes, until you have first identified the slowest lane
	by deskewing the data?
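
As an illustration of the ordering problem, below is a minimal sketch of
marker-based deskew. It is not an algorithm anyone in this thread has
proposed. It assumes each lane carries an "A" align code group at the same
point in the transmitted stream, with "R" as idle fill, and the helper names
and lane contents are hypothetical.

```python
def delayed(delay, payload):
    """Build a hypothetical received lane: idle fill, A marker, idles, data."""
    return ["R"] * delay + ["A", "R", "R"] + payload


def find_marker_offsets(lanes, marker="A"):
    """Return the position of the first alignment marker in each lane."""
    return [lane.index(marker) for lane in lanes]


def deskew(lanes, marker="A"):
    """Align all lanes on their markers; also report the slowest lane."""
    offsets = find_marker_offsets(lanes, marker)
    slowest = max(range(len(lanes)), key=lambda i: offsets[i])
    # Drop the leading code groups so every lane starts at its marker.
    aligned = [lane[off:] for lane, off in zip(lanes, offsets)]
    length = min(len(lane) for lane in aligned)
    columns = list(zip(*(lane[:length] for lane in aligned)))
    return slowest, columns


# Four lanes with made-up skews of 1, 0, 3 and 2 code groups.
lanes = [
    delayed(1, ["a0", "a1"]),
    delayed(0, ["b0", "b1"]),
    delayed(3, ["c0", "c1"]),
    delayed(2, ["d0", "d1"]),
]

slowest, columns = deskew(lanes)
print("slowest lane:", slowest)
for column in columns:
    print(column)
```

Note that the slowest lane only becomes known after the markers have been
located in the received data, which is the ordering question raised above.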
> 
> > I would appreciate hearing your views.
> >
> > Regards,
> > Mike
> > --
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >  Mike Jenkins               Phone: 408.433.7901            _____
> >  LSI Logic Corp, ms/G715      Fax: 408.433.7461        LSI|LOGIC| (R)
> >  1525 McCarthy Blvd.       mailto:Jenkins@xxxxxxxx        |     |
> >  Milpitas, CA  95035         http://www.lsilogic.com      |_____|
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 
> --
> 
> Best Regards,
> Rich
> 
> -------------------------------------------------------------
> Richard Taborek Sr.         Tel: 408-845-6102 or 408-370-9233
> Chief Technology Officer                   Cell: 408-832-3957
> nSerial Corporation                         Fax: 408-845-6114
> 2500-5 Augustine Dr.            Email: rtaborek@xxxxxxxxxxxxx
> Santa Clara, CA 95054          Alt email: rtaborek@xxxxxxxxxx

-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Mike Jenkins               Phone: 408.433.7901            _____     
 LSI Logic Corp, ms/G715      Fax: 408.433.7461        LSI|LOGIC| (R)   
 1525 McCarthy Blvd.       mailto:Jenkins@xxxxxxxx        |     |     
 Milpitas, CA  95035         http://www.lsilogic.com      |_____|    
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~