# **Receiver performance evaluation**

(comments #324, #325, and #326)

Adam Healey, Broadcom Inc. IEEE P802.3dj Task Force, September 2024 (r1)

AUI = attachment unit interface; C2C = chip-to-chip; C2M = chip-to-module; FEC = forward error correction

#### Introduction



- Block error ratio is a new metric defined in Annex 174A
- It is a measure of the performance of an inter-sublayer link (ISL) between two Physical Medium Attachment (PMA) sublayers
- It is intended to ensure that errors on an ISL can be corrected by the Physical Coding Sublayer (PCS) receive function as needed to meet frame loss ratio (FLR) targets
- It is intended to provide information similar to PCS error counters when the PCS is not included in the test

# **Block error ratio measurement requirements**

#### IEEE P802.3dj/D1.1 174A.6 item b)

*b)* At the output of the receiving PMA PAM4 decoder, insert random bit errors with a BER of BER<sub>added</sub>.

- The added errors account for the errors allocated to other ISLs that may exist in the network path
- This suggests that hardware needs to be added to implementations and/or test equipment to insert the additional errors
- Comment #324 suggests that the term "random" is not sufficiently defined for a hardware implementation
- Comment #325 suggests that, if the goal is to add truly random errors, then the impact of BER<sub>added</sub> can be assessed with calculations
- Alternative methods to ensure an ISL adheres to its error ratio allocation will be offered in this contribution

### **Error ratio allocations**

| ISL            | FLR [1] | CER [2]  | BER (total) | BER (per ISL) |
|----------------|---------|----------|---------|---------|
| AUI C2C        |         |          |         | 0.08e-4 |
| AUI C2M        |         |          |         | 0.24e-4 |
| PMD-to-PMD [3] | 6e-11   | 1.45e-11 | 2.92e-4 | 2.28e-4 |
| AUI C2M        |         |          |         | 0.24e-4 |
| AUI C2C        |         |          |         | 0.08e-4 |

[1] One extender is allowed per host and each extender is allocated FLR = 0.1e-11.

[2] Based on 800 Gb/s and 1.6 Tb/s Ethernet. CER allowances for 200 and 400 Gb/s Ethernet are somewhat larger.

[3] May include inner FEC.

- The bit error ratio (BER) allocations in this table are for example only (and are the subject of other comments)
- *BER<sub>added</sub>* is the sum of the BER allocations for ISLs other than the ISL under test
- Conversions between codeword error ratio (CER) and BER assume only random errors, i.e., there is no error extension
- Allocations are combined via simple addition under the assumption that the occurrence of errors on any ISL is independent of the occurrence of errors on the other ISLs
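The bookkeeping described above can be sketched in a few lines. This is a minimal illustration using the example allocations from the table; the names `ALLOCATIONS` and `ber_added` are hypothetical and not taken from the draft.

```python
# Example per-ISL BER allocations, in the order they appear in the table.
# These values are illustrative only, as noted in the text.
ALLOCATIONS = [
    ("AUI C2C", 0.08e-4),
    ("AUI C2M", 0.24e-4),
    ("PMD-to-PMD", 2.28e-4),
    ("AUI C2M", 0.24e-4),
    ("AUI C2C", 0.08e-4),
]


def ber_added(allocations, index_under_test):
    """Sum the BER allocations of every ISL except the one under test.

    Simple addition relies on the stated assumption that errors on
    different ISLs occur independently.
    """
    return sum(ber for k, (_, ber) in enumerate(allocations)
               if k != index_under_test)


# PMD-to-PMD link under test (index 2): the other four ISLs contribute.
added = ber_added(ALLOCATIONS, 2)
total = sum(ber for _, ber in ALLOCATIONS)
print(f"BER_added = {added:.3g}, BER_total = {total:.3g}")
```

With the example values, the per-ISL allocations add up to the total BER listed in the table.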

# **Definition of a block**

#### IEEE P802.3dj/D1.1 174A.6 items d) and e)

d) Divide the stream into a series of 10-bit symbols.

e) Within each block of 544 symbols count the number of errored symbols, where errored symbols are symbols with one or more bit errors.

- This is not consistent with the structure of the data on an ISL lane (comment #326)
- 10-bit symbols from 4 different codewords are interleaved on each 200G lane
- 10-bit symbols from a codeword are separated by 3 10-bit symbols from other codewords
- Each codeword spans 4 x 544 / p symbols on a given lane for a *p*-lane ISL
- Therefore, for the block error ratio to reflect the impact of errors on encoded data, a block should consist of every fourth 10-bit symbol
- A block should consist of 544 / p symbols spanning 4 x 544 / p symbols (with every fourth symbol being considered in the count)
- Multiple blocks may be measured in parallel to reduce measurement time
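A minimal sketch of the proposed block definition follows. It assumes that the checker already knows, for each 10-bit symbol on the lane, whether it is errored, and that the first symbol in the list is aligned to the start of the 4-way interleave; both are assumptions of this illustration, and the function name is hypothetical.

```python
def errored_symbols_per_block(symbol_errors, p, interleave=4,
                              codeword_symbols=544):
    """Count errored 10-bit symbols per block on one lane.

    symbol_errors: sequence of booleans, one per 10-bit symbol on the
                   lane (True means the symbol has one or more bit errors).
    p:             number of lanes in the ISL.

    A block holds 544 / p symbols taken from every fourth position, so it
    spans 4 x 544 / p consecutive lane symbols. The four interleaved
    blocks within each span are counted in parallel.
    """
    block_len = codeword_symbols // p      # symbols counted per block
    span = interleave * block_len          # lane symbols spanned by a block
    counts = []
    for start in range(0, len(symbol_errors) - span + 1, span):
        for phase in range(interleave):
            block = symbol_errors[start + phase:start + span:interleave]
            counts.append(sum(block))
    return counts


# One span of a 4-lane ISL: 4 x 136 = 544 lane symbols.
errors = [False] * 544
for position in (0, 4, 8):   # three errors in the same interleaved block
    errors[position] = True
errors[1] = True             # one error in the neighbouring block
print(errored_symbols_per_block(errors, p=4))
```

Counting all four interleaved blocks in one pass corresponds to the remark that multiple blocks may be measured in parallel.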

# **Example for 800 Gb/s Ethernet**



- Structure of the encoded data on the ISL (or at the input / output of the inner FEC when included)
- Shaded squares represent 10-bit symbols within the same codeword

- Single-lane test using a PMA pattern generator and a pattern checker that counts errors in a way that is consistent with how encoded data will be processed

# Histogram of errors

#### IEEE P802.3dj/D1.1 174A.6 items f) and g)

f) Count the number of blocks with 16 or more errored symbols (NE).

g) Count the total number of blocks analyzed (NT). The value of NT should be sufficiently large to reliably verify that block error ratio requirements are met, either by direct measurement or statistical projection.

- It will take a multiple of 30 minutes to directly verify that the block error ratio is less than 1.45e-11 (the multiple depends on the desired confidence level)
- Therefore, it seems that statistical projection will be a popular choice
- Projections are likely to be done from a histogram of the number of errors per block
- For example, the number of blocks with *i* 10-bit symbol errors for *i* from 1 to 15, plus the number of blocks with *i* > 15
- This is similar to the FEC\_codeword\_error\_bin\_i counters provided by the PCS (except those counters may span multiple PMA lanes)
- Histogram measurement is a useful feature that can be defined for the PMA error counter
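The "multiple of 30 minutes" figure can be reproduced with simple arithmetic. The sketch below assumes a lane signaling rate of 212.5 Gb/s and one 544-symbol block (5440 bits) per codeword period on the lane; neither value is stated in this contribution, so treat both as assumptions. The result is the time needed to observe, on average, one errored block at the limit; the confidence level sets the multiple.

```python
def direct_measurement_seconds(target_ratio, lane_bit_rate,
                               bits_per_block=5440):
    """Time to accumulate 1 / target_ratio blocks on a single lane."""
    blocks_per_second = lane_bit_rate / bits_per_block
    return 1.0 / (target_ratio * blocks_per_second)


# Assumed 212.5 Gb/s lane rate (not taken from the source text).
minutes = direct_measurement_seconds(1.45e-11, 212.5e9) / 60.0
print(f"{minutes:.1f} minutes per expected block error")
```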

#### **Expected histogram of errors**

- Given the BER allocated to an ISL, an expected histogram of errors can be computed
- Computation assumes that errors are independent and identically distributed ("random")
- If each PAM-4 symbol error corresponds to only one bit error, then the PAM-4 symbol error ratio is SER = 2BER
- A 10-bit Reed-Solomon symbol is mapped to five PAM-4 symbols and the probability of a 10-bit Reed-Solomon symbol error is $RSSER = 1 - (1 - SER)^5$
- The probability that *i* 10-bit symbol errors occur in a group of *n* 10-bit symbols is defined by the following equation

$$H(i) = \binom{n}{i} RSSER^{i} (1 - RSSER)^{n-i}$$
 Equation (1)

- where $n = 544 / p$ for a *p*-lane ISL and $\binom{n}{i} = \frac{n!}{i! (n-i)!}$ is the binomial coefficient
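Equation (1) is straightforward to evaluate numerically. The sketch below is one possible implementation; the function names are hypothetical, and the final bin collects every outcome with more than 15 errored symbols, which is a convention chosen here to match the "*i* > 15" bin discussed earlier.

```python
from math import comb


def rs_symbol_error_ratio(ber):
    """10-bit RS symbol error ratio for random errors.

    Assumes one bit error per PAM-4 symbol error (SER = 2 * BER) and
    five PAM-4 symbols per 10-bit symbol: RSSER = 1 - (1 - SER)^5.
    """
    ser = 2.0 * ber
    return 1.0 - (1.0 - ser) ** 5


def expected_histogram(ber, p, top=16):
    """Equation (1) for i = 0 .. top - 1, with bin `top` holding i >= top."""
    n = 544 // p
    q = rs_symbol_error_ratio(ber)
    pmf = [comb(n, i) * q**i * (1.0 - q) ** (n - i) for i in range(n + 1)]
    # Sum the tail directly instead of using 1 - sum(...), which would
    # lose the very small tail values to floating-point cancellation.
    return pmf[:top] + [sum(pmf[top:])]


h_200g = expected_histogram(2.28e-4, p=1)   # n = 544
h_800g = expected_histogram(2.28e-4, p=4)   # n = 136
print(f"H(1): {h_200g[1]:.2g} (p = 1), {h_800g[1]:.2g} (p = 4)")
```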

# **Combinations of independent events**

- Consider two ISLs in the network path x and y with corresponding error histograms  $H_x(i)$ and  $H_y(i)$
- If the occurrence of errors on one ISL is independent from the occurrence of errors on the other ISL, the combined error histogram is defined by the following equation

$$H(i) = \sum_{j=0}^{i} H_x(j)H_y(i-j)$$
 Equation (2)

- It can be applied recursively to compute the error histogram for the entire network path
- When the error histograms represent random errors with corresponding bit error ratios  $BER_x$  and  $BER_y$ , H(i) may also be derived from Equation (1) using  $BER = BER_x + BER_y$
- This equation can also be used to combine error histograms from multiple lanes within an ISL (under the same assumption of independence)
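Equation (2) is a discrete convolution of the two histograms. The sketch below assumes both histograms have the same length and that the last bin means "this many errors or more", so any combination that reaches or exceeds the last index is folded into it. That saturating behaviour is a choice made for this illustration, and the function name is hypothetical.

```python
from functools import reduce


def combine(hx, hy):
    """Combine two independent error histograms per Equation (2).

    The last bin saturates: any pair (j, k) with j + k at or beyond the
    last index contributes to the last bin.
    """
    top = len(hx) - 1
    h = [0.0] * len(hx)
    for j, px in enumerate(hx):
        for k, py in enumerate(hy):
            h[min(j + k, top)] += px * py
    return h


# Toy three-bin histograms: bins are 0 errors, 1 error, 2 or more errors.
hx = [0.9, 0.1, 0.0]
hy = [0.8, 0.2, 0.0]
print(combine(hx, hy))

# Recursive application over several ISLs (or lanes) is a simple fold.
path = reduce(combine, [hx, hy, hx])
```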


### **Proposed changes to the measurement**

- Remove the insertion of "random" errors
- Measure the histogram of 10-bit symbol errors per block for the receiver under test
- A block is defined to consist of 544 / p symbols spanning 4 x 544 / p symbols (every fourth symbol being considered in the count) where p is the number of PMA lanes
- Let $H_m^{(k)}(i)$ be the measured histogram of *i* errors in a block for lane *k*, where $i \le 16$, $i = 16$ corresponds to more than 15 10-bit symbol errors in a block, and $k = 0$ to $p - 1$
- $H_m^{(k)}(i)$  is normalized to the total number of blocks included in the measurement
- Note that  $H_m^{(k)}(16)$  corresponds to "NE" as currently defined in Annex 174A

- Note that it should be permissible to reduce test time by projecting the values for larger *i*
- The projection should provide an accurate prediction of the value of  $H_m^{(k)}(i)$  that would be observed over longer-term testing or at least provide an upper bound on the value

# Method of verification 1: Mask

- Define *BER<sub>total</sub>* to be the total BER allocation for the PHY-to-PHY link
- It is derived from the CER allocated to the PHY-to-PHY link assuming random errors
- Compute H(i) using Equation (1) with $BER = BER_{total} - BER_{added}$ (which is the BER allocated to the device under test)
- Compare  $H_m^{(k)}(i)$ , the measured error histogram for lane k, to H(i)
- $H_m^{(k)}(i)$ must be less than or equal to H(i) for all *i* and for all lanes $k = 0$ to $p - 1$

- Inspection of Equation (2) suggests that, if  $H_m^{(k)}(i) \le H(i)$  for all *i* and *k*, the CER from the combined contributions of all lanes and  $BER_{added}$  will not exceed the limit
- The converse is not true and the CER could also be explicitly calculated to check that the limit is met despite any mask violations
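The mask comparison can be sketched as follows. The helper names are hypothetical. The comparison here starts at *i* = 1, since a good receiver will have more error-free blocks than the mask value at *i* = 0; that starting index is an assumption of this sketch.

```python
from math import comb


def mask(ber, p, top=16):
    """Equation (1) evaluated for the BER allocated to the device under test.

    Bin `top` collects all outcomes with `top` or more errored symbols.
    """
    n = 544 // p
    q = 1.0 - (1.0 - 2.0 * ber) ** 5
    pmf = [comb(n, i) * q**i * (1.0 - q) ** (n - i) for i in range(n + 1)]
    return pmf[:top] + [sum(pmf[top:])]


def passes_mask(measured_by_lane, ber, p):
    """True if every lane histogram is at or below the mask for i >= 1."""
    limit = mask(ber, p)
    return all(hm[i] <= limit[i]
               for hm in measured_by_lane
               for i in range(1, len(limit)))


# Stand-in "measurements": random-error histograms at half and at roughly
# twice the allocation, used only to exercise the check.
good_lanes = [mask(1.14e-4, 4)] * 4
bad_lanes = [mask(4.0e-4, 4)] * 4
print(passes_mask(good_lanes, 2.28e-4, 4), passes_mask(bad_lanes, 2.28e-4, 4))
```

A failed mask check does not by itself mean that the CER limit is exceeded, as noted above; the explicit CER calculation can then be used instead.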

### Example masks for a BER allocation of 2.28e-4



| i  | 200G ( <i>p</i> = 1, <i>n</i> = 544) | 800G ( <i>p</i> = 4, <i>n</i> = 136) |
|----|--------------------------------------|--------------------------------------|
| 1  | 3.6e-01                              | 2.3e-01                              |
| 2  | 2.2e-01                              | 3.5e-02                              |
| 3  | 9.2e-02                              | 3.6e-03                              |
| 4  | 2.8e-02                              | 2.7e-04                              |
| 5  | 7.0e-03                              | 1.6e-05                              |
| 6  | 1.4e-03                              | 8.2e-07                              |
| 7  | 2.5e-04                              | 3.5e-08                              |
| 8  | 3.9e-05                              | 1.3e-09                              |
| 9  | 5.2e-06                              | 4.1e-11                              |
| 10 | 6.4e-07                              | 1.2e-12                              |
| 11 | 7.1e-08                              | 3.1e-14                              |
| 12 | 7.2e-09                              | 7.5e-16                              |
| 13 | 6.7e-10                              | 1.6e-17                              |
| 14 | 5.8e-11                              | 3.3e-19                              |
| 15 | 4.7e-12                              | 6.1e-21                              |
| 16 | 3.8e-13                              | 1.1e-22                              |

# Method of verification 2: CER calculation

- Initialize  $H_m(i)$  to  $H_m^{(0)}(i)$
- For $k = 1$ to $p - 1$, iteratively assign $H_m(i)$ the result of Equation (2) substituting $H_m(i)$ for $H_x(i)$ and $H_m^{(k)}(i)$ for $H_y(i)$
- The result is the combined contribution of all lanes from the *p*-lane receiver under test
- Assign the result of Equation (1) with  $BER = BER_{added}$  to  $H_a(i)$
- Compute H(i) using Equation (2) substituting  $H_m(i)$  for  $H_x(i)$  and  $H_a(i)$  for  $H_y(i)$

- Compute the codeword error ratio

$$CER = \sum_{i>15} H(i)$$

- The computed CER must be less than the limit
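The steps of this method can be sketched end to end. The function names are hypothetical. One interpretation is made here that the text does not state explicitly: the histogram for $BER_{added}$ is evaluated over the full 544-symbol codeword (n = 544), since it is combined with the already-merged result of all *p* lanes.

```python
from functools import reduce
from math import comb


def binomial_histogram(ber, n, top=16):
    """Equation (1) for n symbols; bin `top` holds `top` or more errors."""
    q = 1.0 - (1.0 - 2.0 * ber) ** 5
    pmf = [comb(n, i) * q**i * (1.0 - q) ** (n - i) for i in range(n + 1)]
    return pmf[:top] + [sum(pmf[top:])]


def combine(hx, hy):
    """Equation (2) with a saturating last bin."""
    top = len(hx) - 1
    h = [0.0] * len(hx)
    for j, px in enumerate(hx):
        for k, py in enumerate(hy):
            h[min(j + k, top)] += px * py
    return h


def codeword_error_ratio(measured_by_lane, ber_added, top=16):
    """Combine all lanes, fold in BER_added, and return the i > 15 bin."""
    h_m = reduce(combine, measured_by_lane)          # all lanes under test
    h_a = binomial_histogram(ber_added, 544, top)    # assumed n = 544
    return combine(h_m, h_a)[top]


# Stand-in for measured data: four lanes with random errors at 2.28e-4.
lanes = [binomial_histogram(2.28e-4, 136)] * 4
cer = codeword_error_ratio(lanes, ber_added=0.64e-4)
print(f"CER = {cer:.3g}")
```

With the example allocations, the result lands close to the 1.45e-11 CER allocation in the table, which is expected because the stand-in lanes sit exactly at their BER allocation.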

# **Combination of results from individual lanes**



- Illustration of the process to combine measured histograms from individual lanes
- BER allocation in this example is 2.28e-4
- The allocation for each lane of the 800G interface is the same
- This also illustrates that, for random errors, the total allocation for a 4-lane 800G interface and a 1-lane 200G interface agree
- Actual measurements may differ between lanes and may include error extension
- The same process can be used with measured histograms assuming that the error events are independent
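The agreement between the 4-lane and 1-lane cases can be checked numerically. The sketch below uses random-error histograms as stand-ins for measurements; the helper names are hypothetical and the last bin is treated as "16 or more errors".

```python
from functools import reduce
from math import comb, isclose


def binomial_histogram(ber, n, top=16):
    """Equation (1) for n symbols; bin `top` holds `top` or more errors."""
    q = 1.0 - (1.0 - 2.0 * ber) ** 5
    pmf = [comb(n, i) * q**i * (1.0 - q) ** (n - i) for i in range(n + 1)]
    return pmf[:top] + [sum(pmf[top:])]


def combine(hx, hy):
    """Equation (2) with a saturating last bin."""
    top = len(hx) - 1
    h = [0.0] * len(hx)
    for j, px in enumerate(hx):
        for k, py in enumerate(hy):
            h[min(j + k, top)] += px * py
    return h


ber = 2.28e-4
four_lanes = reduce(combine, [binomial_histogram(ber, 136)] * 4)  # 800G
one_lane = binomial_histogram(ber, 544)                           # 200G

for i in (1, 8, 16):
    print(i, f"{four_lanes[i]:.3g}", f"{one_lane[i]:.3g}")
```

For identical, independent lanes the two results coincide, because the sum of four binomial counts over 136 symbols each is a binomial count over 544 symbols.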

# Support for multiple data rates

- The definition of a block varies with the aggregate data rate of the receiver under test (e.g., p = 1 for 200 Gb/s, p = 4 for 800 Gb/s)
- Devices that support multiple aggregate rates must verify compliance for each rate
- This can be done via a separate test for each rate or a single test that represents the superset of requirements for all supported rates

# **Summary and conclusions**

- The process of "random" error injection suggested in 174A.6 is not sufficiently defined for hardware implementation and it seems unnecessary
- Truly "random" errors can be accounted for in off-line calculations
- It is proposed that the error injection block be removed
- It is also proposed that PMA-based error counting be defined to include the accumulation of an error histogram (number of 10-bit symbol errors in a "block" of 10-bit symbols)
- Further, it is proposed that a "block" be defined to reflect the structure of encoded data on the ISL
- Given the measured error histogram, compliance can be verified via a comparison of the error histogram to a mask or by explicit calculation of the expected CER
- Either method of verification should be allowed to demonstrate compliance
- The required calculations are straightforward
- The results of the proposed method are expected to be similar to what would be observed by a PCS if it were to be included in the test