
# What Is Important for A Reference Receiver

IEEE P802.3ck Task Force Phil Sun, Yasuo Hidaka, Credo Semiconductor



#### Introduction

- The DFE reference model has been used in multiple IEEE 802.3 projects to qualify channels. It has also been a tool to
  - Check equalizer parameters, e.g., TX FIR range and resolution.
  - Capture and model real system behaviors, e.g., burst errors.
  - Calibrate noise.

- There are discussions on whether the DFE model needs to be replaced by FFE-based models. One major reason was that earlier simulations showed lower performance for the DFE model because of the lack of TX FIR pre3, a small b1max range, coarse TX FIR resolution, etc.
  - Recently, proper TX FIR and DFE parameters for 100G have been identified, and the COM tool shows very similar performance between the DFE and FFE models. [1]
  - FFE model performance is significantly lower if ADC/AFE ENOB is considered. [4]
  - Performance may not be a real concern if the DFE and FFE models can track each other.

This contribution discusses the desired functions of a good reference model:

- how it can support standard development.
- how it captures/reflects important needs of real systems.
- how its performance is correlated to a real system.



#### Simulation Setup

- All simulations in this contribution are based on COM tool v2.57. The modifications are:
  - To guarantee a full grid search, "break" is changed to "continue" on line 2642 per discussion with Rich Mellitz.
  - The number of equalizer post taps is changed from 16 to 24. Shorter equalizer results have already been covered by [1].
  - Other experimental modifications are noted on each slide.
- All 115 LR/CR channels contributed to the ck project are simulated.
- Terminology is the same as in sun\_3ck\_02\_0119 [4]. For example, MFFE0.85 and MDFE0.85 mean an FFE and a DFE receiver with 2.0% TX FIR resolution and 0.85 b1max.
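
The effect of changing "break" to "continue" can be illustrated with a toy grid search. This is a minimal sketch, not the COM tool's MATLAB code: the `is_valid` predicate, the candidate list, and the figure of merit are hypothetical stand-ins for whatever condition guards line 2642.

```python
def grid_search(candidates, figure_of_merit, is_valid, stop_on_invalid):
    """Return (best_setting, best_fom, evaluated_count) over the candidates.

    stop_on_invalid=True mimics a `break`; False mimics a `continue`.
    """
    best_setting, best_fom, evaluated = None, float("-inf"), 0
    for setting in candidates:
        if not is_valid(setting):
            if stop_on_invalid:
                break      # abandons every remaining candidate
            continue       # skips only this candidate
        evaluated += 1
        fom = figure_of_merit(setting)
        if fom > best_fom:
            best_setting, best_fom = setting, fom
    return best_setting, best_fom, evaluated


# Hypothetical example: candidate 2 is rejected and the best candidate comes after it.
candidates = [0, 1, 2, 3, 4]
with_break = grid_search(candidates, lambda s: s, lambda s: s != 2, True)
with_continue = grid_search(candidates, lambda s: s, lambda s: s != 2, False)
print(with_break)     # (1, 1, 2)
print(with_continue)  # (4, 4, 4)
```

With the early exit, the search never reaches the candidates behind the rejected one, so the reported optimum can be worse than the true grid optimum.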



#### What is Missing in Ideal Models and Does it Matter?

- The COM tool does not include implementation details, for simplicity and to avoid stressing implementation-specific penalties.
- For DFE-based models, implementation imperfections such as analog front end distortion are included in the 3 dB COM threshold.
- The FFE-based model has FFE taps that overlap with all TX FIR taps and DFE taps. Its optimal solution is highly implementation dependent.
  - For example, ADC effective number of bits (ENOB) is one of the major implementation penalties for ADC-based SERDES. It changes the behavior of an ADC-based equalizer. Can ENOB be ignored in the FFE-based reference model?
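
To give a feel for the size of the ENOB penalty, the sketch below uses the textbook relation between ENOB and signal-to-noise-and-distortion ratio for a full-scale sine wave, SINAD = 6.02 × ENOB + 1.76 dB. It is an illustration only: how ENOB is injected into the modified COM runs of [4] is not described here, and the 1 V full-scale range is a hypothetical value.

```python
import math


def sinad_db_from_enob(enob):
    """Textbook ENOB definition, valid for a full-scale sine wave input."""
    return 6.02 * enob + 1.76


def quantization_noise_rms(full_scale_pp, enob):
    """RMS noise of an ideal uniform quantizer with 2**enob levels."""
    lsb = full_scale_pp / (2.0 ** enob)
    return lsb / math.sqrt(12.0)


sinad_5p2 = sinad_db_from_enob(5.2)              # about 33.06 dB
noise_5p2 = quantization_noise_rms(1.0, 5.2)     # hypothetical 1 V peak-to-peak range
noise_6p2 = quantization_noise_rms(1.0, 6.2)
print(round(sinad_5p2, 2), noise_5p2 / noise_6p2)
```

Each lost bit of ENOB doubles the RMS noise seen by the equalizer, which is why an ideal FFE solution can differ from one adapted with this noise present.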



#### **Overlapped Taps of FFE Receiver**

- RX FFE taps overlap with the TX FIR and RX DFE taps; PDFE precursors overlap with the TX FIR.
- This makes FFE and PDFE adaptation highly implementation/noise dependent.
  - A real RX does not behave like the adaptation in the FFE model or the PDFE model.
- It also makes COM insensitive to TX FIR specs (e.g., number of taps, resolution).
  - However, both FFE-based and DFE-based real receivers require a proper TX FIR.
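
The overlap can be seen in a noiseless linear model, where the TX FIR, the channel, and the RX FFE are simply convolved. In the minimal sketch below (the pulse response and tap values are hypothetical), moving the same taps from the TX to the RX leaves the end-to-end response unchanged, so an ideal model cannot tell the two apart.

```python
def convolve(a, b):
    """Full discrete convolution of two tap lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def link_response(tx_fir, channel, rx_ffe):
    """End-to-end symbol-spaced response of TX FIR -> channel -> RX FFE."""
    return convolve(convolve(tx_fir, channel), rx_ffe)


channel = [0.1, 0.6, 0.3]   # hypothetical symbol-spaced pulse response
taps = [-0.1, 0.9]          # one precursor tap plus main cursor

eq_in_tx = link_response(taps, channel, [1.0])
eq_in_rx = link_response([1.0], channel, taps)
max_difference = max(abs(p - q) for p, q in zip(eq_in_tx, eq_in_rx))
print(max_difference)
```

In a real link the two placements are not equivalent: noise and distortion enter between the TX and the RX, and the TX FIR is bounded by the transmitter swing.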





#### Correlation of Ideal Models to Realistic Implementations



- Big mismatches are observed between the ideal FFE/PDFE models and a more realistic FFE model with 5.2b ENOB.
- The DFE model correlates better (than the ideal FFE model) to the more realistic FFE model with limited ENOB, i.e., even for FFE implementations a DFE model predicts performance better than an ideal FFE model!
- The channels in red circles passed with FFE and PDFE (about 3.5 dB and 4 dB COM), but have only 1 dB COM after considering ENOB. [4]
  - FFE implementations are likely not able to support these channels. These channels also have less than 3 dB COM in the DFE model.

#### FFE and PDFE "Better Performance" is Real, or A Problem?



- The FFE and PDFE models appear to have "better" performance than the DFE model on some channels. Most of these cases are due to misbehavior of the ideal FFE/PDFE model when front-end noise/distortion is not properly considered.
- PDFE with b1max=0.55 gets close to DFE performance on average. However, its correlation to the realistic FFE model is still much worse than that of the DFE model.



## TX FIR Needed by Ideal Model and Real Systems

- The DFE reference receiver does not have precursors. It is sensitive to TX FIR settings, the same as a typical SERDES architecture with direct-feedback DFE taps.
- The ideal FFE receiver is insensitive to TX FIR settings. However, a real SERDES with a long FFE implementation will need TX FIR support due to AFE imperfection.
- With a weak 2-tap TX FIR, which reduces the average COM of DFE by 1.62 dB, ideal FFE and PDFE performance is almost unaffected (0.02 and 0.18 dB). However, the performance of the FFE model with 5.2-bit ENOB is reduced by 0.42 dB.
- TX FIR will have more impact on FFE-based receivers if other amplitude-modulated distortion is considered.



## TX FIR Standard Development and Interop

- TX FIR resolution, range, and number of taps have been refined at each SERDES generation to meet performance targets.
- Ideal FFE and PDFE models are insensitive to TX FIR settings. However, this does not reflect the behavior of real systems, neither DFE-based nor FFE-based receivers.
- The DFE reference model has been a great tool to capture TX FIR impact on system performance and to ensure smooth interop of different SERDES architectures in the past years.
- Relying on an FFE or PDFE reference model could lead to a poor TX FIR spec and trap real systems, both DFE-based and FFE-based implementations, in suboptimal performance. Interop is a big concern if the TX FIR spec is not sufficient.

## FFE Algorithm Discussion and Project Schedule

- MMSE is widely used for FFE adaptation, which requires proper modeling of noise. The FFE in the COM tool uses something different for simplicity, but it does not always converge.
  - Patches such as "FFE back-off" are used to cover the problems found, but no evidence shows they always work. The number of back-off taps is a puzzle to users.
  - How should noise/distortion be modeled in COM to make the FFE model work properly?
- The DFE-based model works well with a simple zero-forcing algorithm.
- The MM phase detector in the COM tool works well for DFE. Which phase detection algorithm should be used for the FFE model?
- DFE algorithms are well documented in Annex 93A. It will be a long debate how noise should be modeled for FFE and which optimization algorithm to use.
  - Channel qualification results will change during this process.
  - Project schedule may be impacted.
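
The simplicity of the zero-forcing DFE can be sketched in a few lines: each tap is the post-cursor sample normalized by the main cursor, clipped to its limit. This is a minimal illustration in the spirit of Annex 93A, not the COM tool's code; the pulse response values are hypothetical, while the limits 0.85, 0.7, and 0.3 are values discussed in this contribution.

```python
def zero_forcing_dfe_taps(pulse, cursor, b_max):
    """Normalized DFE taps b(n) = h(n)/h(0), each clipped to +/- b_max[n-1]."""
    h0 = pulse[cursor]
    taps = []
    for n, limit in enumerate(b_max, start=1):
        index = cursor + n
        raw = pulse[index] / h0 if index < len(pulse) else 0.0
        taps.append(max(-limit, min(limit, raw)))
    return taps


def residual_postcursor_isi(pulse, cursor, taps):
    """Post-cursor ISI left over after the DFE subtracts b(n) * h(0)."""
    h0 = pulse[cursor]
    residual = []
    for n, tap in enumerate(taps, start=1):
        index = cursor + n
        h_n = pulse[index] if index < len(pulse) else 0.0
        residual.append(h_n - tap * h0)
    return residual


pulse = [0.02, 0.50, 0.45, 0.10, -0.05]   # hypothetical; main cursor at index 1
cursor = 1
taps_085 = zero_forcing_dfe_taps(pulse, cursor, [0.85, 0.3, 0.3])
taps_070 = zero_forcing_dfe_taps(pulse, cursor, [0.70, 0.3, 0.3])
print(taps_085, residual_postcursor_isi(pulse, cursor, taps_085))
print(taps_070, residual_postcursor_isi(pulse, cursor, taps_070))
```

No noise model is needed to compute these taps. In this example the first post-cursor would need b1 = 0.9, so the lower b1max leaves more uncancelled ISI.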

## b1max and Burst Error Analysis

**DFE Tap Weights** 



b1max=0.85 is identified to improve DFE receiver performance.

- b1max allows higher b2. Simulation shows post-FEC performance is better than with b1max=0.7. [4]
- High b1 is a common practice. Implementations can also choose to lower b1 by CTLE/FFE without degrading performance.
- Precoding is less effective for some burst errors. Bursts caused by a heavy DFE tail are one of them. FFE implementations have their own sources.
  - The standard will be too optimistic about precoding performance if none of these types of burst errors are considered.
  - The DFE model provides a tool to check this type of error and makes standard development more consistent. Burst error analysis is based on the DFE tap shapes from massive channel simulations.
  - There are discussions to create rules on DFE tails to better analyze burst errors.
  - Discarding the DFE-based reference model will result in the loss of this important tool for burst error analysis.

DFE tap weights of a large number of channels are being used for burst error analysis, for example in anslow\_3ck\_01\_0918 and zhang\_3ck\_01a\_0918.
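
Why precoding helps with some bursts and not others can be shown with a small sketch of 1/(1+D) modulo-4 precoding. It treats slicer errors as additive modulo 4, which is a simplifying assumption, and the two error patterns are hypothetical illustrations rather than patterns extracted from channel simulations.

```python
def precode(symbols, m=4):
    """1/(1+D) mod-m precoder: p[k] = (s[k] - p[k-1]) mod m."""
    out, prev = [], 0
    for s in symbols:
        prev = (s - prev) % m
        out.append(prev)
    return out


def decode(received, m=4):
    """(1+D) mod-m decoder: s_hat[k] = (r[k] + r[k-1]) mod m."""
    out, prev = [], 0
    for r in received:
        out.append((r + prev) % m)
        prev = r
    return out


def symbol_errors_after_decoding(symbols, error_pattern, m=4):
    sent = precode(symbols, m)
    received = [(p + e) % m for p, e in zip(sent, error_pattern)]
    decoded = decode(received, m)
    return sum(1 for s, d in zip(symbols, decoded) if s != d)


def burst(values, start=5, length=20):
    pattern = [0] * length
    pattern[start:start + len(values)] = values
    return pattern


symbols = [(3 * k + 1) % 4 for k in range(20)]   # arbitrary PAM4 data
alternating = symbol_errors_after_decoding(symbols, burst([1, -1, 1, -1]))
same_sign = symbol_errors_after_decoding(symbols, burst([1, 1, 1, 1]))
print(alternating, same_sign)   # 2 5
```

A 4-symbol burst with alternating error signs collapses to 2 symbol errors after decoding, while a 4-symbol burst whose errors all have the same sign becomes 5. Burst shapes of the second kind are the ones for which precoding is less effective.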

## Example of DFE Tap b1 Weight Control



In real systems and simulation tools, DFE tap b1 can be lowered without much performance impact. A simple example is to add FFE post 1.
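
A minimal sketch of this idea, with a hypothetical pulse response and a hypothetical FFE post-cursor tap of -0.3: the FFE subtracts part of the first post-cursor, so the DFE needs a smaller b1, while some ISI is moved to later taps.

```python
def convolve(a, b):
    """Full discrete convolution of two tap lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def normalized_postcursors(pulse, cursor):
    """Post-cursor samples divided by the main cursor (unclipped DFE taps)."""
    return [h / pulse[cursor] for h in pulse[cursor + 1:]]


pulse = [0.50, 0.45, 0.10]   # hypothetical; main cursor at index 0
ffe = [1.0, -0.3]            # main tap plus FFE post 1 (hypothetical weight)

b_before = normalized_postcursors(pulse, 0)
b_after = normalized_postcursors(convolve(pulse, ffe), 0)
print(b_before)   # b1 is about 0.9
print(b_after)    # b1 is about 0.6, later taps change slightly
```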



IEEE P802.3ck 100Gb/s, 200Gb/s, and 400 Gb/s Electrical Interface Task Force


## TX FIR Resolution and Number of Taps



| Reference            | [6] Upadhyaya<br>ISSCC 2018 | [7] Wang<br>ISSCC 2018    | [8] Depaoli<br>ISSCC 2018 | [2] Menolfi<br>ISSCC 2018            | [9] Kim<br>ISSCC 2018 |
|----------------------|-----------------------------|---------------------------|---------------------------|--------------------------------------|-----------------------|
| Technology           | 16nm                        | 16nm                      | 28nm                      | 14nm                                 | Intel 10nm            |
| Data Rate [Gb/s]     | 56                          | 63.375                    | 64                        | 112                                  | 112                   |
| TX driver            | voltage mode                | voltage mode              | voltage mode              | voltage mode DAC                     | current mode          |
| FFE taps             | 4                           | 3                         | 4                         | 8                                    | 3                     |
| Resolution           | 78-90 slices                | 33 slices with half cells | 72 slices                 | 8 bit DAC                            | 5b for each tap       |
| Output Swing<br>(mV) | 1000                        | 1000                      | 1000                      | 920                                  | 750                   |
| TX Power (mW)        | -                           | 89.7                      | 135                       | 264<br>Including 34 for 8-tap<br>FIR | 232                   |

ISSCC 2018 50G and 100G SERDES Papers

TX FIR Precursor 3 Impact on VEC at C2M TP1a. Simulation setting is the same as in sun\_3ck\_01a\_0119.

- Performance:
  - 2% or better TX FIR resolution improves the performance of both DFE and FFE receivers (more for DFE). [4]
  - TX FIR pre3 with 2% resolution helps difficult C2M channels by up to 1 dB VEC at TP1a.
- Cost:
  - Precursor/postcursor resolution finer than 1% has been reported for 112G SERDES. Precursor resolution finer than 1.5% has been implemented for 50G SERDES.
  - An 8-tap TX FIR has been implemented for a 112 Gb/s SERDES, with about 4 mW of power per FFE tap. [2] For non-DAC-based TX, smaller-weight taps cost less power.
  - Latency of one extra tap is about 1 symbol time, 18.8 ps. It is negligible compared to the total latency of the link (FEC latency is about 150 ns).
  - Voltage-mode DAC is commonly used for power efficiency.
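
The cost figures above follow from simple arithmetic, sketched below. The only inputs are numbers already quoted in this contribution: the 53.125 GBd symbol rate, the 150 ns FEC latency, and the 34 mW out of 264 mW reported for the 8-tap FIR.

```python
BAUD_RATE_GBD = 53.125       # 100G PAM4 symbol rate
FEC_LATENCY_NS = 150.0       # approximate FEC latency quoted above
FIR_POWER_MW = 34.0          # 8-tap FIR share of TX power in the table above
TX_POWER_MW = 264.0
FIR_TAPS = 8

symbol_time_ps = 1e3 / BAUD_RATE_GBD                 # one unit interval in ps
latency_fraction = (symbol_time_ps / 1e3) / FEC_LATENCY_NS
power_per_tap_mw = FIR_POWER_MW / FIR_TAPS
fir_power_share = FIR_POWER_MW / TX_POWER_MW

print(f"symbol time: {symbol_time_ps:.1f} ps")
print(f"extra tap latency vs FEC latency: {100 * latency_fraction:.3f} %")
print(f"power per tap: {power_per_tap_mw:.2f} mW ({100 * fir_power_share:.1f} % of TX power for all taps)")
```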



#### Implementation Variations

DFE-based direct feedback and long FFE with DFE tap 1 are typical implementations [5].

- Analog receivers have been emerging later than ADC-based receivers at each generation, with lower power.
- Real implementations may be a mix of both. For example:
  - FFE + (floating) DFE tails.
  - Fewer RX precursors to save power.
  - Low-resolution ADC model to save power.
- COM DFE is useful to analyze TX FIR impact and burst error penalty, and helps realize smooth interop of different implementation choices.




#### Straw Poll Results on KR/CR Reference Model

Straw Poll #9:

Do you support a reference receiver for copper cable and backplane COM to be...

- (A) DFE as is in past COM (i.e. Annex 93A)
- (B) ZF/MMSE FFE + DFE
- (C) ZF/MMSE FFE + DFE ADC/DSP model
- (D) Something else
- (E) Need more information
- Results (pick 1): A: 18, B: 13, C: 4, D: 0, E: 14
- The straw poll at Spokane shows that the existing DFE reference model is preferred.
- Engineers' instinct: why change something that works well to something unknown?
  - This contribution shows that blindly changing the DFE model to an FFE or PDFE model will cause big problems in standard development and real system support.





## Conclusions

- The existing COM DFE model has been working well to support standard development.
- There are both ADC-based and analog-based SERDES. Without proper noise/distortion modeling, ideal FFE and PDFE models do not capture the behavior of real systems, neither DFE nor FFE based.
  - An ideal FFE/PDFE model will pass channels that cannot be supported by real implementations.
  - It might be a long debate to refine distortion/noise modeling to make FFE/PDFE work properly.
  - The DFE model shows the best correlation to a more realistic FFE model.
- FFE/PDFE has similar performance to the DFE model on average. FFE/PDFE appears to have "better" performance for some channels. This is caused by misbehavior of the ideal FFE/PDFE model and will result in passing bad channels.
- The existing DFE model is useful for TX FIR specification and burst error analysis.
  - Relying on an FFE model may result in an unchecked TX FIR spec and cause interop issues in the field.
- The DFE model achieves good performance after fixes to TX FIR resolution, number of taps, and b1max. These fixes are important for 100G SERDES performance. They are examples showing that the DFE model is more sensitive in catching problems for standard development.
  - Higher b1max is a common practice. Simulations also confirm that burst error penalty is not a problem. The value of b1 can also be lowered without much performance impact.
  - 8-tap TX FIR with resolution finer than 1% has been reported for 112 Gb/s SERDES. Each tap consumes about 4 mW [2]. Latency change is negligible.
  - 2% TX resolution and 5 TX taps help to support difficult C2M channels.
- Recommend the existing DFE model as the KR/CR reference model.
  - b1max and COM threshold can be easily tuned to match the performance of DFE-based and FFE-based implementations.



#### References

- [1] P. Sun, Y. Hidaka, "Comparison of KR/CR Reference Receivers," IEEE 802.3ck Task Force Ad Hoc, December 5, 2018.
- [2] C. Menolfi, et al., "A 112Gb/s 2.6pJ/b 8-Tap FFE PAM-4 SST TX in 14nm CMOS", ISSCC, pp. 103-104, Feb. 2018.
- [3] Y. Hidaka, P. Sun, "COM Simulation for 100G KR/CR Channels, update," IEEE 802.3ck Task Force Ad Hoc, December 12, 2018.
- [4] P. Sun, Y. Hidaka, "KR/CR Simulation Results with COM Tool 2.57," IEEE 802.3ck Task Force Meeting, January 2019.
- [5] P. Sun, "100G SERDES Power Study," IEEE 802.3ck Task Force Meeting, September 2018.
- [6] P. Upadhyaya, et al., "A Fully Adaptive 19-to-56Gb/s PAM-4 Wireline Transceiver with a Configurable ADC in 16nm FinFET", ISSCC, pp. 108-109, Feb. 2018.
- [7] L. Wang, et al., "A 64Gb/s PAM-4 Transceiver Utilizing an Adaptive Threshold ADC in 16nm FinFET", ISSCC, pp. 110-111, Feb. 2018.
- [8] E. Depaoli, et al., "A 4.9pJ/b 16-to-64Gb/s PAM-4 VSR Transceiver in 28nm FDSOI CMOS", ISSCC, pp. 112-113, Feb. 2018.
- [9] J. Kim, et al., "A 112Gb/s PAM-4 Transmitter with 3-Tap FFE in 10nm CMOS", ISSCC, pp. 112-113, Feb. 2018.



# **Backup Slides**




## COM Spread Sheet

| Table 93A-1 parameters |                   |       |                     | I/O control         |                  |           |
|------------------------|-------------------|-------|---------------------|---------------------|------------------|-----------|
| Parameter              | Setting           | Units | Information         | DIAGNOSTICS         | 0                | logical   |
| f_b                    | 53.125            | GBd   |                     | DISPLAY_WINDOW      | 0                | logical   |
| f_min                  | 0.05              | GHz   |                     | CSV_REPORT          | 1                | logical   |
| Delta_f                | 0.01              | GHz   |                     | RESULT_DIR          | 100GEL_WG_{date} |           |
| C_d                    | [1.1e-4 1.1e-4]   | nF    | [TX RX]             | SAVE_FIGURES        | 0                | logical   |
| z_p select             | [2]               |       | [test cases to run] | Port Order          | [1 3 2 4]        |           |
| z_p (TX)               | [12 30; 1.8 1.8]  | mm    | [test cases]        | RUNTAG              | CR_eval_         |           |
| z_p (NEXT)             | [12 30; 1.8 1.8]  | mm    | [test cases]        | COM_CONTRIBUTION    | 0                | logical   |
| z_p (FEXT)             | [12 30; 1.8 1.8]  | mm    | [test cases]        | Operational         |                  |           |
| z_p (RX)               | [12 30; 1.8 1.8]  | mm    | [test cases]        | COM Pass threshold  | 3                | dB        |
| C_p                    | [0.87e-4 0.87e-4] | nF    | [TX RX]             | ERL Pass threshold  | 10.5             | dB        |
| R_0                    | 50                | Ohm   |                     | DER_0               | 1.00E-04         |           |
| R_d                    | [50 50]           | Ohm   | [TX RX]             | T_r                 | 6.16E-03         | ns        |
| A_v                    | 0.413             | V     | vp/vf=.694          | FORCE_TR            | 1                | logical   |
| A_fe                   | 0.413             | V     | vp/vf=.694          | Include PCB         | 0                | logical   |
| A_ne                   | 0.608             | V     | 1                   | TDR and ERL options |                  |           |
| L                      | 4                 |       |                     | TDR                 | 1                | logical   |
| M                      | 32                |       |                     | ERL                 | 1                | logical   |
| filter and Eq          |                   |       |                     | ERL_ONLY            | 0                | logical   |
| f_r                    | 0.75              | *fb   |                     | TR_TDR              | 0.01             | ns        |
| c(0)                   | 0.54              |       | min                 | N                   | 1000             |           |
| c(-1)                  | [-0.34:0.02:0]    |       | [min:step:max]      | TDR_Butterworth     | 1                | logical   |
| c(-2)                  | [0:0.02:0.12]     |       | [min:step:max]      | beta_x              | 1.70E+09         |           |
| c(-3)                  | [-0.06:0.02:0]    |       | [min:step:max]      | rho_x               | 0.25             |           |
| c(1)                   | [-0.1:0.05:0]     |       | [min:step:max]      | fixture delay time  | 0                | enter sec |
| N_b                    | 24                | UI    |                     | Receiver testing    |                  |           |
| b_max(1)               | 0.85              |       |                     | RX_CALIBRATION      | 0                | logical   |
| b_max(2..N_b)          | 0.3               |       |                     | Sigma BBN step      | 5.00E-03         | V         |
| g_DC                   | [-20:1:0]         | dB    | [min:step:max]      | Noise, jitter       |                  |           |
| f_z                    | 21.25             | GHz   |                     | sigma_RJ            | 0.01             | UI        |
| f_p1                   | 21.25             | GHz   |                     | A_DD                | 0.02             | UI        |
| f_p2                   | 53.125            | GHz   |                     | eta_0               | 8.20E-09         | V^2/GHz   |
| g_DC_HP                | [-6:1:0]          |       | [min:step:max]      | SNR_TX              | 33               | dB        |
| f_HP_PZ                | 0.6640625         | GHz   | rll                 | R_LM                | 0.95             |           |
| ffe_pre_tap_len        | 0                 | UI    |                     |                     |                  |           |
| ffe_post_tap_len       | 0                 | UI    |                     |                     |                  |           |
| ffe_tap_step_size      | 0                 |       |                     |                     |                  |           |
| ffe_main_cursor_min    | 0.7               |       |                     |                     |                  |           |
| ffe_pre_tap1_max       | 0.3               |       |                     |                     |                  |           |
| ffe_post_tap1_max      | 0.3               |       |                     |                     |                  |           |
| ffe_tapn_max           | 0.125             |       |                     |                     |                  |           |
| ffe_backoff            | 0                 |       |                     |                     |                  |           |

| Table 93A–3 parameters  |                           |       |
|-------------------------|---------------------------|-------|
| Parameter               | Setting                   | Units |
| package_tl_gamma0_a1_a2 | [0 0.0009909 0.0002772]   |       |
| package_tl_tau          | 6.141E-03                 | ns/mm |
| package_Z_c             | [87.5 87.5 ; 92.5 92.5 ]  | Ohm   |
|                         |                           |       |
| Table 92–12 parameters  |                           |       |
| Parameter               | Setting                   |       |
| board_tl_gamma0_a1_a2   | [0 3.8206e-04 9.5909e-05] |       |
| board_tl_tau            | 5.790E-03                 | ns/mm |
| board_Z_c               | 90                        | Ohm   |
| z_bp (TX)               | 119                       | mm    |
| z_bp (NEXT)             | 119                       | mm    |
| z_bp (FEXT)             | 119                       | mm    |
| z_bp (RX)               | 119                       | mm    |

## **Channel Data for Simulation**

Simulation was done for the following 115 publicly available LR channels. Among them, 8 channels are marked with red dots in the plots.

| CH #   | Channels marked with red dots                            | Group | Description                                          | Reference Document                 |
|--------|----------------------------------------------------------|-------|------------------------------------------------------|------------------------------------|
| 1-2    |                                                          | RM1   | Two Very Good 28dB Loss Ideal Transmission Lines     | mellitz_3ck_adhoc_02_072518.pdf    |
| 3-8    | CH7 : CaBP_BGAVia_Opt2_28dB                              | RM2   | 24/28/32dB Cabled Backplane Channels including Via   | mellitz_3ck_adhoc_02_081518.pdf    |
| 9-10   |                                                          | RM3   | Synthesized CR Channels (2.0m and 2.5m 28AWG Cable)  | mellitz_100GEL_adhoc_01_021218.pdf |
| 11-13  |                                                          | RM4   | Best Case 3", 13", 18" Tachyon Backplane             | mellitz_100GEL_adhoc_01_010318.pdf |
| 14-15  |                                                          | NT1   | Orthogonal or Cabled Backplane Channels              | tracy_100GEL_03_0118.pdf           |
| 16     |                                                          | AZ1   | Orthogonal Backplane Channel                         | zambell_100GEL_01a_0318.pdf        |
| 17-19  |                                                          | HH1   | Initial Host 30dB Backplane Channel Models           | heck_100GEL_01_0118.pdf            |
| 20-35  | CH21 : 16dB 575mm high ISI<br>CH33 : 28dB 575mm high ISI | HH2   | 16/20/24/28dB Cabled Backplane Channels              | heck_3ck_01_1118.pdf               |
| 36-54  | CH36 : Bch1_3p5<br>CH46 : Bch2_a7p5_7                    | UK1   | Measured Traditional Backplane Channels              |                                    |
| 55-73  | CH68 : CAch3_b2                                          | UK2   | Measured Cabled Backplane Channels                   | kareti_3ck_01a_1118.pdf            |
| 74-88  | CH80 : OAch4<br>CH81 : Och4                              | UK3   | Measured Orthogonal Backplane Channels               |                                    |
| 89-115 |                                                          | AZ2   | Measured Orthogonal Backplane with Varied Impedances | zambell_3ck_01_1118.pdf            |

All channel data are taken from the IEEE 100GEL Study Group and P802.3ck Task Force "Tools and Channels" pages, i.e., http://www.ieee802.org/3/100GEL/public/tools/index.html and http://www.ieee802.org/3/ck/public/tools/index.html

## Mueller-Muller PD vs Modified PD



- MPDFE0.6 is almost unaffected if MM-PD is changed to Modified PD.
- MFFE0.7 is degraded a lot if MM-PD is changed to Modified PD.
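
For readers less familiar with the MM phase detector, the sketch below shows the textbook data-aided Mueller-Muller timing error, e = E[y(k)·d(k-1) - y(k-1)·d(k)], which settles where the first precursor and first post-cursor of the sampled pulse are equal. It is an illustration under simple assumptions (random 2-level data, a hypothetical 3-tap symbol-spaced pulse, no noise), not the COM tool's sampling-phase code.

```python
import random


def mm_timing_error(samples, decisions):
    """Average Mueller-Muller error: y[k]*d[k-1] - y[k-1]*d[k]."""
    total = 0.0
    for k in range(1, len(samples)):
        total += samples[k] * decisions[k - 1] - samples[k - 1] * decisions[k]
    return total / (len(samples) - 1)


def simulate(pre, main, post, n=50000, seed=1):
    """Send random +/-1 data through a 3-tap pulse and return the MM error."""
    rng = random.Random(seed)
    d = [rng.choice((-1.0, 1.0)) for _ in range(n)]
    y = []
    for k in range(n):
        nxt = d[k + 1] if k + 1 < n else 0.0
        prv = d[k - 1] if k > 0 else 0.0
        y.append(pre * nxt + main * d[k] + post * prv)
    return mm_timing_error(y, d)


balanced = simulate(pre=0.2, main=1.0, post=0.2)    # close to 0: phase is settled
late_heavy = simulate(pre=0.1, main=1.0, post=0.3)  # close to post - pre = 0.2
print(balanced, late_heavy)
```

Because the settling point depends on the balance between precursor and post-cursor, any equalizer that reshapes the precursor also moves the sampling phase. That is one reason the choice of phase detector can interact with the FFE and PDFE models.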



## Correlation to MDFE0.85 model (Y)

[Figure: correlation plots; Y axis: COM (MDFE0.85)]



Apart from variation, MDFE0.85 is similar to MFFE0.7 and MPDFE0.5~0.55.



#### Correlation to More Realistic FFE Model with Noise/Distortion



- Real FFE-based implementation is better predicted by DFE than FFE or PDFE
  - Front end noise/distortion degrades FFE performance by about 2dB.

[Figure: correlation plots; axes: COM (MFFE0.7) and COM (MFFE0.85\_ENOB)]



#### **FFE Noise Amplification**





| Channel | FFE Noise Amplification (dB) |
|---------|------------------------------|
| 37      | 0.1                          |
| 38      | 0.1                          |
| 39      | 0.2                          |
| 40      | 0.1                          |

- FFE noise amplification is about 0.2 dB or less.
- Ch37, 38, 74, 75 are all dominated by ISI. The impact of modeling jitter before or after the FFE is very small.

COM injects TX noise, eta\_0, and crosstalk (XTK) before the CTLE, and jitter after the CTLE.
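
One simple way to estimate FFE noise amplification is sketched below. It assumes white, uncorrelated noise at the FFE input and taps normalized to the main cursor, so the noise power gain is the sum of the squared taps. Both the method and the tap values are assumptions for illustration; they are not necessarily how the numbers in the table above were obtained.

```python
import math


def ffe_noise_amplification_db(taps, main_index):
    """Noise power gain of an FIR equalizer for white input noise, in dB,
    relative to the main-cursor gain."""
    main = taps[main_index]
    power_gain = sum((c / main) ** 2 for c in taps)
    return 10.0 * math.log10(power_gain)


# Hypothetical FFE with small pre- and post-cursor taps.
example_taps = [-0.05, 1.0, -0.1, 0.03]
amplification_db = ffe_noise_amplification_db(example_taps, main_index=1)
print(f"{amplification_db:.2f} dB")
```

Small side taps give an amplification well below 1 dB under these assumptions.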