

# Power and Latency Considerations, 802.3ch case study

KY-ANH TRAN 09/18/24

#### Agenda

- Motivation
- Assumptions
- Analysis
- Comparison with TDD approach
- Conclusion



## **Motivation**

- Power mentioned to be important for 802.3dm, especially with respect to thermal performance of the sensor.
  - "Sensor quality degrades exponentially with increased temperature." <u>https://www.ieee802.org/3/cfi/0723\_1/CFI\_01\_0723.pdf</u>
  - https://www.ieee802.org/3/ISAAC/public/091423/2023-09-18 Automotive%20camera%20PHY%20requirements%20study V2.3.pdf
- Previous contribution raised questions on more work needed to evaluate power/latency: "We propose that further analysis is needed to address the questions identified in this presentation."
  - https://www.ieee802.org/3/dm/public/0724/kang\_3dm\_01b\_2407.pdf
- We will show that traffic profiles can affect latency/power.
  - Use 802.3ch as example. There is considerable system simplification if existing automotive ethernet PHY can service 802.3dm objectives.
  - https://www.ieee802.org/3/dm/public/0524/Relative%20Cost%20Analysis%20of%20802.3ch%20as%20Asymmetri c%20PHY\_Huang\_05122024.pdf



#### Acronyms.

- US: Upstream
- DS: Downstream
- WS: Wake Sleep (event)
- ECU: Electronic Control Unit
- ACF: AVTP (Audio Video Transport Protocol) Control Format (IEEE1722)
- GBB: Generic Byte Bus (IEEE1722)
- IPG: Inter Packet Gap



#### System setup

- Consider 802.3ch sensor PHY inside camera module.
  - 802.3ch sensor PHY assumed to operate in 10Gbps mode, interleaving L = 1, slow\_wake = 0.
- No periodic schedule of EEE wake up needed.



5 🗹 AEONSEMI

# **Camera Uplink Traffic Profile**

- Previous study mention:
  - Intermittent transmission for tunneling I2C and GPIO (periodic shutter control) signal could bring challenges to existing camera system design" <u>https://www.ieee802.org/3/dm/public/0724/kang\_3dm\_01b\_2407.pdf</u>
- We will include both in our analysis:
  - I2C burst to control camera based on image received at most once per frame (e.g. exposure control)
  - Frame sync is also sent once per frame.





#### **I2C transaction**

- I2C CCI protocol commonly used, with 16bit reg address for camera sensors
- Single byte write at random location example below
  - 4 bytes total + ACK/Start[S]/Stop[P]



@1000kHz > 36us



#### **Traffic benchmark**

- I2C control burst uplink every frame.
- Typical number of I2C register writes for this is 1-20.
- Assume:
  - 10 x I2C byte reg write at random locations.
  - Update every single frame (worst case)
- This is an initial strawman to get ballpark numbers and may need to be refined!:
  - In real life, not every frame has control updates to camera, depends on image condition changes and tuning algorithm.
  - In real life, contiguous byte address writes can use I2C sequential write which improves tunneling efficiency.



## IEEE1722 I2C protocol

- I2C ACF (we will call **Byte mode**) tunnels each byte individually as an ethernet frame.
  - Stretching the clock and wait until the ACK packet comes back from the target is expected.
- I2C GBB, Annex XXX (we will call Bulk mode) tunnels a reg write transaction in an ethernet frame.
  - ECU side device can autonomously acknowledge the I2C bus (auto-ACK).
- **Byte mode** throughput for I2C write limited by the round-trip delay.



## IEEE1722 I2C data

- In Bulk mode, each byte write can fit 84 byte frame (including IPG).
- 10 byte writes = 840 bytes of data.
  - < 4 RS-FEC frames which is minimum 802.3ch PHY data transmission time.</p>





# IEEE1722 I2C Bulk timing

- Heuristic: PHY starts wake up after i2c transactions complete by "Twait"
  - ECU PHY automatically ACK's.
- Sensor PHY RX receives 8 RS-FEC frame WAKE before receiving the packetized payload.
- Full duplex time:

11

- Twake = 8 RS frame
- Tdata = 4 RS frame (minimum transmission)
- Tsleep = 8 RS frame / bit rate



# **IEEE1722 Byte timing**

- One can choose to use byte mode tunneling as well. No wait time needed. I2C clock is stretched until ACK is received.
- Each i2c byte is transmitted uplink individually.
- Assume worst case, each byte leads to individual WS (Wake-Sleep) event.
  - Worst case full duplex time due to maximum **WS** events.





# **IEEE1722 Frame sync tunneling uplink**

- Tunnel the timestamp of the frame sync edge signal with IEEE1722 (Annex N IEEE1722b)
  - 2 x 84 byte frame is sufficient (posedge/negedge) for 1 FSYNC pulse
- Single WS event used for each edge tunneled.





# EEE with no periodic schedule

- Note the PHY does not need to rely on pre-arranged periodic schedule
- Wake up behavior responds to data activity instead.

- Save power thanks to long inactive time on the order of the frame time
- Decouple latency with power consumption (Latency set by wake up time rather than artificial periodic schedule)
- Realistic data activity leads to realistic power/latency estimate.



# **Full duplex time**

- Power consumption of the Sensor PHY needs to account time spent in full duplex
- Fullduplex time = Refresh time (in LPI) + WAKE/DATA/SLEEP time





# Full duplex time, I2C Bulk mode

- 2 X FSYNC packets per frame
- 1 X I2C GBB packet per frame
- Add up all the full duplex time, including Refresh during LPI

| Frame rate | % time full duplex |
|------------|--------------------|
| 30fps      | < 1.10%            |
| 60fps      | < 1.16%            |
| 90fps      | < 1.21%            |

| Camera frame rate [fps]         | 60       |                           |                     |
|---------------------------------|----------|---------------------------|---------------------|
|                                 |          |                           |                     |
|                                 |          |                           |                     |
| i2c # of commands / frame       | 10       |                           |                     |
| i2c data [byte]                 | 840      |                           |                     |
| Frame sync packet size [byte]   | 84       |                           |                     |
|                                 |          |                           |                     |
| S (scaling factor)              | 1        |                           |                     |
| Data rate [Gb/s]                | 10.00    |                           |                     |
|                                 |          | Multiplicity per frame    | Time per frame [us] |
| Twake [us]                      | 2.56     | 3                         | 7.68                |
| Tsleep [us]                     | 2.56     | 3                         | 7.68                |
| Tdata mode, i2c bus [us]        | 1.28     | 1                         | 1.28                |
| Tdata mode, fsync transmit [us] | 1.28     | 2                         | 2.56                |
|                                 |          | Total time per frame [us] | 19.20               |
|                                 |          |                           |                     |
| Trefresh / Tqr                  | 1.04%    |                           |                     |
| Camera frame time [us]          | 16666.67 |                           |                     |
| Time full duplex per frame [us] | 192.61   |                           |                     |
| % time in full duplex           | 1.16%    |                           |                     |



# Full duplex time, I2C Byte mode

- 1 I2C packet / byte x 4 bytes / reg write X 10 reg write / frame
  = 40 I2C packets / frame.
- 2 FSYNC packets.

| Frame rate | % time full duplex |
|------------|--------------------|
| 30fps      | < 1.84%            |
| 60fps      | < 2.64%            |
| 90fps      | < 3.44%            |

| Camera frame rate [fps]         | 60       |                           |                     |
|---------------------------------|----------|---------------------------|---------------------|
|                                 |          |                           |                     |
|                                 |          |                           |                     |
| i2c # of commands / frame       | 10       |                           |                     |
| i2c data [byte]                 | 84       |                           |                     |
| Frame sync packet size [byte]   | 84       |                           |                     |
|                                 |          |                           |                     |
| S (scaling factor)              | 1        |                           |                     |
| Data rate [Gb/s]                | 10.00    |                           |                     |
|                                 |          | Multiplicity per frame    | Time per frame [us] |
| Twake [us]                      | 2.56     | 42                        | 107.52              |
| Tsleep [us]                     | 2.56     | 42                        | 107.52              |
| Tdata mode, i2c bus [us]        | 1.28     | 40                        | 51.20               |
| Tdata mode, fsync transmit [us] | 1.28     | 2                         | 2.56                |
|                                 | 1        | Total time per frame [us] | 268.80              |
|                                 |          |                           |                     |
| Trefresh / Tqr                  | 1.04%    |                           |                     |
| Camera frame time [us]          | 16666.67 |                           |                     |
| Time full duplex per frame [us] | 439.61   |                           |                     |
| % time in full duplex           | 2.64%    |                           |                     |

4 bytes / I2C trans x 10 transactions. Each byte cause wake up event (worst case)

#### **Sensor PHY Power discussion**

- 802.3ch sensor PHY Power includes:
  - In full duplex, ECHO + RX + TX + bias + clock
  - Else, TX + bias + clock
- In bulk mode, > 98.8% the time 802.3ch sensor PHY power is TX + bias + clock
  - PHY Power comparable to half duplex transmitter
- Explains how measured 802.3ch sensor PHY power is competitive to incumbent serializers (<u>https://www.ieee802.org/3/dm/public/0524/Evaluation%20of%20802.3ch</u> Tran 050142024a.pdf)
- Further power savings possible:
  - EEE allows going to LPI during frame blanking (5-30% TX power saving depend on frame blanking ratio).



#### Latency metric

- Use definition of "Frame Latency" in previous adhoc: <u>https://www.ieee802.org/3/dm/public/adhoc/080724/turner\_dm\_01\_system\_08072024.pdf</u>
- "Frame Latency" includes PHY/channel delay, any wait time for PHY II interface to be available for transmission, but also "packet latency" discussed in https://www.ieee802.org/3/dm/public/0724/matheus\_dm\_02b\_latency\_07152024.pdf



Analysis scope



## 802.3ch Latency, normal operation

- Time for XGMII to be available bounded by EEE spec: (T\_w\_sys\_tx).
  - Can use wake after sleep complete for I2C data.
- US latency = T\_w\_sys\_tx + packet delay + PHY delay
  - Example 84 byte packet latency = 0.068us @ 10G.
- DS latency = packet latency + PHY delay + channel delay (< 94ns from 802.3ch spec)</p>

| Latency US     |         | Later          | ncy DS  |
|----------------|---------|----------------|---------|
| Latency        | Value   | Latency        | Value   |
| T_w_sys_tx     | 6.400us | PHY delay      | 1.024us |
| PHY delay      | 1.024us | Channel delay  | 0.094us |
| Channel delay  | 0.094us | Packet latency | 0.068us |
| Packet latency | 0.068us | Total Latency  | 1.186us |
| Total Latency  | 7.586us |                |         |



## 802.3ch Latency, initialization

- Camera initialization speed mentioned to be important in previous contribution.
  - https://www.ieee802.org/3/dm/public/0724/houck fuller 3dm 01 0724.pdf
- 802.3ch has flexibility run full duplex during initialization.
  - Latency is for 84 byte frame 1.186us DS/US



# Latency with ASA-MLE TDD

- Latency US = US gap + resync header + [frame size / MII data rate](\*) + PHY delay + channel delay
- Latency DS = DS gap + resync header + [frame size / MII data rate](\*) + PHY delay + channel delay
- (\*) Under discussion.





# **Comparison with TDD ASA-MLE**

- Consider 60fps, 10G downstream use case, Byte mode.
- Consider small 84byte frame (typical for i2c byte or timestamp). Larger frames gives advantages to EEE for frame latency, wake up is amortized.
- We compare against ASA-MLE 10G/100M mode, which has comparable line rate to 802.3ch at 10G. Assume channel delay < 94 ns.</li>

| Comparison                      | 10G 802.3ch<br>(w/ EEE US) | 10G/100M ASA TDD<br>(w/ 10Gbps xMII) | 10G/100M ASA TDD<br>(w/ 100Mbps xMII) |
|---------------------------------|----------------------------|--------------------------------------|---------------------------------------|
| Latency US (initialization)     | 1.186us                    | 26.67us + PHY delay US               | 33.33us + PHY delay US                |
| Latency DS (initialization)     | 1.186us                    | 1.07us + PHY delay DS                | 7.73us + PHY delay DS                 |
| Latency US (normal mode)        | 7.586us                    | 26.67us + PHY delay US               | 33.33us + PHY delay US                |
| Latency DS (normal mode)        | 1.186us                    | 1.07us + PHY delay DS                | 7.73us + PHY delay DS                 |
| Line rate for 10Gbps DS payload | 11.25Gbps                  | 12Gbps                               | 12Gbps                                |



## Conclusion

- Traffic profiles can affect power/latency estimates. For realistic estimates, realistic traffic profiles are needed.
  - Recommend 802.3dm to benchmark solutions using realistic traffic profiles.
- 802.3ch sensor PHY power similar to TX only power with realistic traffic. Time spent in full duplex bounded by:
  - Ising I2C Bulk mode
  - < 3.4% using I2C Byte Mode.</p>
- 802.3ch sensor PHY frame latency can achieve <1.2us / 1.2us (US/DS) during initialization, and <7.6us / 1.2us (US/DS) after initialization.</li>

