# Clause 184 functions (supporting comments 243-247, 249, 250, 252)

Tom Huber, Nokia


# **Supporters**

- Steve Gorshe, Microchip
- Matt Brown, Alphawave
- Arnon Loewenthal, Alphawave
- Gary Nicholl, Cisco
- Eugene Opsasnick, Broadcom
- Leon Bruckman, Nvidia

# General problem statement

- Clause 184 defines the inner FEC and PMA for 800GBASE-LR1 using pseudocode to describe the processes in 184.4 and 184.5
- Multiple comments were submitted against D1.0 with the intention to simplify the pseudocode; these were rejected on the basis that:
	- The text was technically correct, unambiguous, and came from the baseline slides
	- It would be better to have a more complete proposal before making any changes (the output of each subclause is the input to the subsequent subclause)
- This presentation provides a more complete proposal in support of comments 243-247, 249, 250, and 252

# Structure of the inner FEC functions

Each of these functions is specified with pseudocode, with the output of each function providing the input to the next one



# Clarifying the pseudocode fragments

- The pseudocode fragments in clause 184.4.[1-7] and 184.5.8 are unnecessarily complex, which hinders understanding for readers that don't already understand the processing that is occurring
	- The complexity largely comes from the inclusion of extra iterators for lanes (when functions are performed per-lane) and/or bits within the symbols that are being manipulated
	- In addition, pseudocode for the PCS lane alignment and reordering is introduced, even though these processes are well-specified already with state diagrams
- It would also be beneficial to have English descriptions of what some of these functions are doing at the beginning of each subclause so the reader can better understand the detailed manipulations that are being specified via the pseudocode

# Lane alignment and reordering (Subclauses 184.4.1-2, comments 243-245)

- Clauses 184.4.1 and 184.4.2 provide pseudocode that is intended to explain how to lock to the AMs and reorder the PCS lanes; there is no need for this level of specification in clause 184
	- These functions are essentially unchanged since 802.3ba, other than the number of PCS lanes and the values of the AMs
	- The functions are specified using state diagrams in clause 119 (referenced by clause 172)
	- The functions are used not only in the x00GBASE-R PCS (x=1,2,4,8), but also in the x00GXS (x=2,4,8)
- Aligning to AMs inherently aligns to 10-bit symbols because the RS FEC frame is based on 10-bit symbols and is delimited by the AMs

# PCS receive example (clause 119, to which clause 172 refers)

#### • 119.2.5.1 Alignment lock and deskew

The receive PCS forms *n* separate bit streams by concatenating the bits from each of the *n* PMA:IS\_UNITDATA\_indication primitives in the order they are received (where *n* = 8 for a 200GBASE-R PCS and *n* = 16 for a 400GBASE-R PCS). It obtains lock to the alignment markers as specified by the alignment marker lock state diagram shown in Figure 119–12. Note that alignment marker lock is achieved before FEC codewords are processed and therefore the alignment markers are processed in a high error probability environment.

After alignment marker lock is achieved on each of the *n* lanes (bit streams), all inter-lane Skew is removed as specified by the PCS synchronization state diagram shown in Figure 119–13. The PCS receive function shall support a maximum Skew of 180 ns, and maximum Skew Variation of 4 ns, between PCS lanes.

Slide annotation on the paragraph above: not required (the inner FEC does not need full deskew).

#### • 119.2.5.2 Lane reorder and de-interleave

PCS lanes can be received on different lanes of the service interface from which they were originally transmitted. The PCS receive function shall order the PCS lanes according to the PCS lane number. The PCS lane number is defined by the unique portion (UM0 to UM5) of the alignment marker that is mapped to each PCS lane (see 119.2.4.4).
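The reordering described above can be pictured with a minimal Python sketch. It assumes alignment marker lock has already identified the PCS lane number carried on each received lane; the function name `reorder_lanes` and its arguments are hypothetical and are not taken from any clause.

```python
def reorder_lanes(received, lane_numbers):
    """Return the received streams ordered by PCS lane number.

    received[k]     -- stream seen on service-interface lane k
    lane_numbers[k] -- PCS lane number decoded from the alignment
                       marker on lane k (assumed already known)
    """
    n = len(received)
    if len(lane_numbers) != n or sorted(lane_numbers) != list(range(n)):
        raise ValueError("each PCS lane number must appear exactly once")
    ordered = [None] * n
    for k, q in enumerate(lane_numbers):
        ordered[q] = received[k]  # the stream carrying PCS lane q goes to position q
    return ordered
```

For example, `reorder_lanes(["s2", "s0", "s1"], [2, 0, 1])` places the stream carrying PCS lane 0 first.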


# PHY\_XS examples (clauses 118 and 171)

#### • 118.1.2 200GXS/400GXS Sublayer

The 200GXS, if implemented, shall be identical in function to the 200GBASE-R PCS in Clause 119 with the addition of the functions defined in 118.2. A single device may be configured as either a 200GXS or the 200GBASE-R PCS and may be managed through different optional management registers.

The 400GXS, if implemented, shall be identical in function to the 400GBASE-R PCS in Clause 119 with the addition of the functions defined in 118.2. A single device may be configured as either a 400GXS or the 400GBASE-R PCS and may be managed through different optional management registers.

#### • 171.3 PHY 800GXS

The PHY 800GXS shall be identical in function to an 800GBASE-R PCS (see Clause 172) with the following exceptions:

# Pseudocode from 184.4.1-2

#### • Alignment (184.4.1)

For each i

For m = 0 to 31

pcsli[m, i] = FEC:IS\_UNITDATA\_m.request(tx\_symbol)

End for

End for

RS-FEC symbol alignment shall be achieved on the 32 pcsli[m, i] lanes (m = 0 to 31) as follows:

— j mod 10 = 0 when pcsli[m, j] corresponds to the first bit of an RS-FEC symbol

#### • Reordering (184.4.2)

The 32 pcsli $[m]$  lanes ( $m = 0$  to 31) are rearranged to 32 pcsla $[q]$ lanes ( $q = 0$  to 31) where q corresponds to PCSL q

While this correctly describes alignment to RS-FEC symbols (which is what was While this correctly describes alignment<br>to RS-FEC symbols (which is what was<br>intended – full deskew is not required), it<br>isn't adding any new information beyond<br>what is in the state diagram in figure 119-<br>12 isn't adding any new information beyond what is in the state diagram in figure 119- 12

This is more complex than what is in clause 119 and isn't really adding any additional information

# Proposal for subclauses 184.4.1-2

- The main value of the pseudocode in these clauses is that it ultimately defines the vector pcsla[], which is the input to the next function
	- pcsla[] can be defined directly and more clearly without pseudocode
- The alignment lock and lane reordering functions are clearly defined in 172.2.5.1-2 (which point to 119.2.5.1-2)
- Suggested changes:
	- Define pcsla[q, i] such that index q indicates the PCS lane number (0 to 31) and index *i* represents the sequence of 10-bit RS FEC symbols.

# Lane permutation (Subclause 184.4.3, comments 245-246)

- The 800GBASE-R PCS has two flows, with two FEC encoders each
	- PCS lanes 0-15 come from flow0, lanes 16-31 from flow1
	- Within each PCS lane, the FEC symbols from the two encoders for that flow alternate
		- Lanes in flow0 have symbols A, B, A, B; those in flow1 have C, D, C, D
- The purpose of the lane permutation function is to create a set of 32 output lanes (inner FEC flows) that all have the symbol pattern A, B, C, D
- This is accomplished by swapping the assignment of PCS lanes of flow0 and flow1 to the corresponding sets of 16 output lanes every two symbols
	- In other words, output lane 0 takes two symbols from PCS lane 0, then two from PCS lane 16; output lane 1 takes two symbols from PCS lane 1, then two from PCS lane 17; etc.

# Pseudocode from 184.4.3



The permutation function does not change bit positions within the symbols, so it is simpler to describe the operation on symbols (i.e., replace 10*i+j* with *i* and eliminate the *j* loop).

# Proposal for subclause 184.4.3 (1)

#### • Introductory text:

This function rearranges the RS FEC symbols of the PCS lanes to create 32 output inner FEC lanes such that each group of four consecutive symbols on each output lane contains one symbol from each of the four RS FEC encoders in the 800GBASE-R PCS.

#### • Pseudocode

Define pcsla[q, i] to be the 10-bit symbol in PCS lane q at time i (after lane alignment and reordering)

Define permo[q, i] to be the 10-bit symbol in output lane q at time i at the output of the permutation function

The permutation function is defined by the following pseudocode:

```
For each i
 For each q = 0 to 31
  permo[q, i] = pcsla[(q + 16×floor(i/2)) mod 32, i]
 End for
End for
```
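The effect of this permutation can be checked with a short Python sketch. It is illustrative only: symbols are modeled as the letters A to D naming their source encoder, and the helper names `example_input` and `permute_lanes` are hypothetical.

```python
NUM_LANES = 32

def example_input(num_symbols=8):
    """Model pcsla: lanes 0-15 (flow0) alternate A, B; lanes 16-31 (flow1) alternate C, D."""
    lanes = []
    for q in range(NUM_LANES):
        pair = "AB" if q < 16 else "CD"
        lanes.append([pair[i % 2] for i in range(num_symbols)])
    return lanes

def permute_lanes(pcsla):
    """permo[q][i] = pcsla[(q + 16*floor(i/2)) mod 32][i], operating on whole symbols."""
    num_symbols = len(pcsla[0])
    return [
        [pcsla[(q + 16 * (i // 2)) % NUM_LANES][i] for i in range(num_symbols)]
        for q in range(NUM_LANES)
    ]
```

With this model, output lane 0 starts A, B, C, D and output lane 16 starts C, D, A, B, so every aligned group of four symbols on every output lane holds one symbol from each encoder.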
# Proposal for subclause 184.4.3 (2)





• Modify figure 184-3 to more clearly show the detail of the 40-bit symbols on which the lane permutation is operating and how the function produces groups of 4 symbols in each lane that come from four different FEC codewords.

# Remaining functions are per-lane

- Per figure 184-2, the functions between lane permutation and interleaving into the BCH FEC frame are performed separately on each lane
- Corresponding pseudocode should describe the processing for a single lane, not the set of 32 lanes



# Convolutional interleaver (Subclause 184.4.4, comment 247)

- The purpose of the convolutional interleaver is to rearrange the time order of the RS FEC symbols for each lane such that each BCH FEC symbol (which has a payload of 11 RS FEC symbols) has no more than one RS FEC symbol from any RS FEC codeword
- The (preceding) lane permutation function has created lanes where each group of 4 consecutive RS FEC (10-bit) symbols comes from four separate RS FEC codewords
- The interleaver reorders the 40-bit symbols within each lane such that consecutive symbols in the output stream were separated by 17 symbols in the input stream
	- RS FEC codewords are 5440 bits (136 40-bit symbols)
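The 17-symbol separation can be illustrated with a small Python sketch operating on one lane of 40-bit symbols. The index mapping used here, output[i + 18×(i mod 3)] = input[i], is an assumption: it is inferred as the inverse of the 184.5.8 de-interleaver pseudocode quoted later in this deck, not copied from 184.4.4. The function name `interleave` is hypothetical.

```python
def interleave(symbols):
    """Assumed per-lane interleaver on 40-bit symbols.

    Symbol i is written to position i + 18*(i mod 3), i.e. three
    branches with delays of 0, 18 and 36 symbols. Positions that no
    input symbol maps to (start-up and tail) are left as None.
    """
    out = [None] * (len(symbols) + 36)
    for i, s in enumerate(symbols):
        out[i + 18 * (i % 3)] = s
    return out

# Use the input index as the symbol value so separations are visible.
stream = interleave(list(range(300)))
steady = stream[36:300]  # region where every position is filled
gaps = {abs(b - a) for a, b in zip(steady, steady[1:])}
```

Under this assumed mapping, neighbouring output symbols come from input positions 17 apart within each group of three, and 37 apart across group boundaries.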

# Pseudocode from 184.4.4



This function operates independently on each lane, on 40-bit symbols (i.e. groups of four 10-bit RS FEC symbols). It rearranges the symbols such that the BCH FEC codewords (which carry 110 payload bits) contain no more than one symbol from any RS FEC codeword.

The operation does not move symbols between lanes, so it can be specified more simply as operating on an individual lane, in which case the index p and associated for loop can be removed.

The operation does not change bit positions within the 40-bit symbols; it simply changes the position of the symbols. The multiplier 40, index j and associated for loop can be removed so the operation is described based on 40-bit symbols.

# Proposal for subclause 184.4.4

- Introductory text is fine as written (from the start of the clause through the first paragraph after the figure)
- Replace the text on page 479, starting at the second paragraph below figure 184-4:

The following is performed individually on each of the q lanes of permo, operating on 40-bit symbols (consisting of four RS-FEC symbols) j:

For each j

End for

# BCH encoder (Subclause 184.4.5, comment 249)

- The BCH encoder adds the inner FEC code
- It operates on 110-bit symbols (11 RS-FEC symbols) and adds 16 parity bits to create 126-bit symbols
	- This can be described more clearly in pseudocode using range notation rather than a bit-level iterator
- Since the function operates on each lane individually, there is no need to include a per-lane index in the description of the encoder

# Pseudocode from 184.4.5

This function operates independently on each lane, on 110-bit symbols (i.e. groups of eleven 10-bit symbols). It computes the 16 parity bits and adds them to the end of the 110 input bits to create the 126-bit BCH codeword.

The operation could be described more simply, as is done in clause 91, as appending 16 bits to each group of 110 bits rather than copying 110 bits and then adding 16 bits.

This  $p$  is not referring to the lane number iterator, but to the parity bits computed by the FEC.

The operation does not change lane numbers, so the algorithm can be specified more simply as operating on an individual lane, and the index $p$ and associated for loop can be removed.

# Proposal for subclause 184.4.5

On page 480:

- Delete the dashed items defining the indexes  $q$ , i, and j
- Delete the text describing how  $u$  and  $v$  are related to  $i$  and  $j$
- Remove the references to  $q$  in the description of the message polynomial  $m(x)$
- Define parity[15:0] to be the coefficients of the computed parity polynomial
- Replace the pseudocode with this:

```
For each u
  encodeo[126u:126u+109] = convio[110u:110u+109]
  encodeo[126u+110:126u+125] = parity[15:0]
End for
```

(i.e., after each 110 bits, add the 16 computed parity bits for those 110 bits)
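The framing in this proposal can be sketched in Python as follows. The sketch covers only the append step for a single lane; the parity computation itself is deliberately left to a caller-supplied function, `compute_parity`, which is hypothetical and is assumed to return the 16 parity bits already ordered parity[15] first. The name `bch_frame` is also hypothetical.

```python
def bch_frame(convio, compute_parity):
    """Append 16 parity bits after every 110 message bits (one lane).

    convio         -- list of bits, length a multiple of 110
    compute_parity -- hypothetical callable: 110 bits -> 16 bits,
                      returned in the order parity[15] ... parity[0]
    """
    if len(convio) % 110 != 0:
        raise ValueError("input must be a whole number of 110-bit blocks")
    encodeo = []
    for u in range(len(convio) // 110):
        message = convio[110 * u:110 * u + 110]
        parity = compute_parity(message)
        if len(parity) != 16:
            raise ValueError("parity function must return 16 bits")
        encodeo.extend(message)  # encodeo[126u : 126u+109]
        encodeo.extend(parity)   # encodeo[126u+110 : 126u+125]
    return encodeo
```

Passing a stand-in such as `lambda message: [1] * 16` is enough to check the layout; a real encoder would derive the bits from the generator polynomial given in the clause.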

# Circular shift (Subclause 184.4.6, comment 250)

- The circular shift rotates the bits within the 110 payload bits of the BCH codewords to further improve burst tolerance (leaving the 16 parity bits unchanged)
- The amount of shift depends on the lane number

# Pseudocode from 184.4.6



This function operates independently on each lane, on the first 110 bits of each BCH FEC codeword, reorganizing the bits. The amount of shifting is different for each lane, so the index $p$ is needed, but there is no need to include an iterator since the function is applied to each lane.

Since bits 110-125 do not change, there is no need to explain that; the second for j loop can be removed.

(The value of $p$ is still important, however, since the shift does depend on the lane number.)

# Proposal for subclause 184.4.6

Replace the entire clause with this:

The circular shift function is applied to each lane. It rearranges the 110 payload bits of each BCH FEC codeword to further increase robustness to burst errors. Consider each 126-bit BCH FEC codeword as a vector of bits, encodeo[j], and apply the following process:

```
For j = 0 to 109
```
Where q corresponds to the lane number (0 to 31)
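The general shape of the operation (rotate the 110 payload bits, leave the 16 parity bits in place) can be sketched in Python. Both the shift amount and the rotation direction are assumptions here: `shift` is a caller-supplied value standing in for whatever lane-dependent amount the clause specifies, and the function name `circular_shift` is hypothetical.

```python
def circular_shift(codeword, shift):
    """Rotate the 110 payload bits of one 126-bit codeword.

    shift -- hypothetical per-lane shift amount; the real value as a
             function of the lane number q is defined by the clause
    The rotation direction used here (toward lower indices) is an
    assumption made for illustration.
    """
    if len(codeword) != 126:
        raise ValueError("expected a 126-bit BCH codeword")
    payload, parity = codeword[:110], codeword[110:]
    s = shift % 110
    return payload[s:] + payload[:s] + parity  # parity bits 110-125 unchanged
```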

# Convolutional de-interleaver (Subclause 184.5.8, comment 252)

- The purpose of the convolutional de-interleaver is to undo the manipulation performed by the convolutional interleaver

# Pseudocode from 184.5.8

For each i (40-bit symbol iterator)

For p = 0 to 31 (lane iterator)

For j = 0 to 39 (iterator for bits within a symbol)

output[p, 40×(i + 18×(2 − i mod 3)) + j] = input[p, 40i + j]

End for

End for

End for

This function operates independently on each lane, on 40-bit symbols (i.e. groups of four 10-bit RS FEC symbols). It restores the original order of 40-bit symbols prior to convolutional interleaving in the transmitter.

The operation does not move symbols between lanes, so it can be specified more simply as operating on an individual lane, in which case the index p and associated for loop can be removed.

The operation does not change bit positions within the 40-bit symbols; it simply changes the position of the symbols. The multiplier 40, index j and associated for loop can be removed so the operation is described based on 40-bit symbols.
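A per-lane, symbol-level version of the 184.5.8 mapping can be sketched in Python and checked with a round trip. The de-interleaver index mapping follows the pseudocode quoted above with the lane and bit iterators removed. The matching interleaver is an assumption (taken to be the inverse mapping with branch delays of 0, 18 and 36 symbols), and both function names are hypothetical.

```python
def interleave(symbols):
    """Assumed transmit-side mapping: symbol i goes to i + 18*(i mod 3)."""
    out = [None] * (len(symbols) + 36)
    for i, s in enumerate(symbols):
        out[i + 18 * (i % 3)] = s
    return out

def deinterleave(symbols):
    """Receive-side mapping from 184.5.8: symbol i goes to i + 18*(2 - i mod 3)."""
    out = [None] * (len(symbols) + 36)
    for i, s in enumerate(symbols):
        out[i + 18 * (2 - i % 3)] = s
    return out

data = list(range(120))  # stand-ins for 40-bit symbols on one lane
restored = deinterleave(interleave(data))
```

Under these assumptions the original order reappears with a fixed delay of 36 symbols, i.e. `restored[36:36 + len(data)]` equals `data`.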

# Proposal for subclause 184.5.8

- Introductory text is fine as written (up through the first paragraph after the figure)
- Replace the text on page 490 with this:

The following is performed individually on each of the q lanes:

Denote the input and output of the convolutional de-interleaver as input[j] and output[j], where the index j identifies 40-bit symbols.

For each j

End for

Note: output[j] is undefined when the index is negative.