Hi Kirsten, Thanks again for following up.

This brings me to my point about a real-time operating system. Ethernet has an abysmal track record here because no requirements were identified at the beginning to prioritize latency. CAN is widely used in braking and engine control applications because it has a predictable maximum latency. It also provides deterministic behavior through priority-based arbitration, ensuring high-priority messages are sent with minimal delay, which is critical for time-sensitive applications, as I mentioned. Ethernet, in contrast, needs to develop better methods to handle priority-based arbitration. While TSN functionality can be beneficial, it falls short when immediate responses are required, failing to provide the latencies customers require. I have had customers refuse to adopt Ethernet for engine, braking, and airbag applications purely because of the missed latency and determinism requirements I mentioned. They would very much like to connect these to a network. However, they cannot eliminate CAN as long as the critical need for low latency in these applications goes unaddressed, which is a crucial aspect of ensuring the safety and reliability of these systems. I dove into this example to help stress the importance customers place on latency in real-time safety-critical applications, which is what 802.3dm plans to address.
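As a rough, back-of-the-envelope illustration of that determinism (the bus speed and frame length below are assumed example values, not figures from any particular system): the highest-priority CAN message can be blocked by at most one frame already on the bus, so its worst-case latency is bounded regardless of load. A small Python sketch of that bound:

# Illustrative sketch only: worst-case latency bound for the highest-priority
# CAN message under priority-based (non-preemptive) arbitration.
# Assumed example values: classic CAN at 500 kbit/s, 8-byte frames,
# ~135 bits per frame including worst-case stuff bits.

BIT_RATE_BPS = 500_000          # assumed bus speed
WORST_CASE_FRAME_BITS = 135     # assumed maximal stuffed standard frame

def frame_time_s(bits: int, bit_rate: float = BIT_RATE_BPS) -> float:
    """Transmission time of one frame on the bus."""
    return bits / bit_rate

def worst_case_latency_highest_priority_s() -> float:
    """Blocked by at most one in-flight lower-priority frame, then wins
    arbitration and transmits its own frame."""
    blocking = frame_time_s(WORST_CASE_FRAME_BITS)
    own_tx = frame_time_s(WORST_CASE_FRAME_BITS)
    return blocking + own_tx

print(f"Worst-case latency bound: {worst_case_latency_highest_priority_s() * 1e6:.0f} us")
# -> 540 us with these assumed numbers, independent of bus load

The point is simply that the bound is closed-form and load-independent for the top-priority message; best-effort Ethernet offers no comparable guarantee without additional mechanisms.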
That is why I am bringing this lack of requirements to everyone's attention.

I outlined a scenario in the last email. Would you like me to provide more detail? Let's take one scenario into account to emphasize my point – I am sure many people could attest to it during their commute to work – traffic is heavy. Aggressive drivers try to get places on time when they're behind schedule. The system must respond to these changes in the environment as we do. Now add construction, meaning rapid changes to lanes, merging, and surroundings. What is the plan for scheduling tasks ahead of time for scenarios that change rapidly and were unaccounted for? My question was not answered in the last email.

Can you please explain why we are not trying to match, or at least get close to, the latencies used by the SerDes in production, which are critical for real-time operating systems? Why would it not be in the interest of the committee to push the latency to the lowest achievable number? As I outlined above, these are just a few of the scenarios in which a system is making critical decisions that determine the outcome of an accident, or worse, and why customers view them as critical for their systems.

Best Regards, TJ

From: Matheus Kirsten, EE-352 <Kirsten.Matheus@xxxxxx>
Hello TJ, thank you for your extensive response. There is no reason to do things the same way as you do them with (an FDD-based) SerDes (a TDD-based SerDes already organizes some aspects differently), as Ethernet offers different tools. That is like saying, why do you not use Ethernet like CAN? Yes, you might, but you would be wasting opportunities and making things unnecessarily complicated.
Could you please provide a scenario in which the environment changes so fast that you need all those commands (or at least many) at once? Again, we are not talking about the brake loop. We are talking about
the control loop of the camera, which will provide images also when the imager settings are very slightly suboptimal for a very short time.
As said, I will structure this in more detail in a presentation for September. That might be easier than text.
Kind regards, Kirsten

From: TJ Houck <thouck@xxxxxxxxxxx>
Hi Kirsten, The scenario you listed is simple and does not deal with the dynamic and rapidly changing environment that occurs in everyday road situations. Hence the reason customers have strict latency requirements for the rapid changes that can take place.
Imagers do not update with just one command, as you mentioned. It's also important to be aware that there are many other updates that could take place, beyond those you listed. Being prepared for these potential
changes is key.
These updates can take tens to hundreds of commands, meaning that a 100 us latency for a single command (if it is even that low) gets amplified 10x to 100x beyond the suggested requirement. So now multiply that 0.03/0.3/3 mm by hundreds. The customer's algorithms typically control these to help decide the best ML/NNU views. This adds processing on top of all these delays, for additional distance, and is why I have been adamant about the critical role latencies play. Once the data is received, the system does not instantaneously decide what to do. The processor has to process this data. By the time it realizes what needs to be adjusted, in some cases 1-100 milliseconds have passed, and now you're burdening the system with an additional request because it either did not get the appropriate updated commands through in time or did not read what was wrong in time. This means the processor will go through this cycle all over again, which could have been avoided if latencies had been tightly controlled.

On top of that, a sensor does not get its own private I2C channel; sensors share this bus. A system cannot have 12 I2C ports dedicated to each device. You can see that quad deserializers available on the public market usually have only 2 I2C ports, and some customers will share one of these I2C buses with all 4 sensors and possibly other devices such as PMICs, uC/uP, etc. The processors have a limited number of I2C ports available. So now multiply those hundreds of commands by 2, 4, or more if there is traffic on that bus – this could easily run well over a thousand transactions to get the necessary commands across in time.
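To illustrate the amplification arithmetic (the per-command latencies follow the 1/10/100 us figures discussed in this thread; the command counts and bus-sharing factors are assumed examples), a quick Python sketch:

# Back-of-the-envelope sketch: per-command latency amplified by the number of
# commands and by I2C bus sharing. Counts and sharing factors are assumptions.

SPEED_MPS = 110 / 3.6   # ~110 km/h (~70 mph), the speed used elsewhere in this thread

def amplified_latency_s(per_command_s, n_commands, bus_sharing):
    """Serialize n_commands over a shared control path."""
    return per_command_s * n_commands * bus_sharing

def travel_mm(total_latency_s, speed_mps=SPEED_MPS):
    """Distance the vehicle covers while the command burst is still in flight."""
    return total_latency_s * speed_mps * 1000.0

for per_cmd_us in (1, 10, 100):                    # per-command control-path latency
    for n_cmds, sharing in ((100, 1), (100, 4)):   # assumed burst size and bus sharing
        total = amplified_latency_s(per_cmd_us * 1e-6, n_cmds, sharing)
        print(f"{per_cmd_us:>3} us/cmd x {n_cmds} cmds x sharing {sharing}: "
              f"{total * 1e3:6.1f} ms -> {travel_mm(total):7.0f} mm of travel")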
Regarding the GPIO – as mentioned above, the 350 m vehicle distance is not the whole gamut of scenarios in the real world. Let's take one scenario into account to emphasize my point – I am sure many people could attest to it during their commute to work – traffic is heavy. Aggressive drivers try to get places on time when they're behind schedule. The system must respond to these changes in the environment as we do. Now add construction, meaning rapid changes to lanes, merging, and surroundings. What is the plan for scheduling tasks ahead of time for scenarios that change rapidly and were unaccounted for?

Can you please explain why we are not trying to match, or at least get close to, the latencies used by the SerDes in production, which are critical for real-time operating systems? Why would it not be in the interest of the committee to push the latency to the lowest achievable number? As I outlined above, these are just a few of the scenarios in which a system is making critical decisions that determine the outcome of an accident, or worse, and why customers view them as critical for their systems.

Best Regards, TJ

From: Matheus Kirsten, EE-352 <Kirsten.Matheus@xxxxxx>
Hello TJ, thank you for bringing up the example with the imager and pointing out the relevance of driving speed. First, I would suggest making a clear distinction between the camera control loop and the brake control loop.
The brake control loop takes (hopefully among other sensors) camera data, processes it in an SoC, and sends a brake command to the brakes. There is no loop back to the camera, so the delay on the low-speed return channel makes no difference. Only the high-speed direction counts. Whether in the high-speed direction the camera data needs 1 us overall to traverse from camera to SoC, or 10 us, or even 100 us, makes at 110 km/h (~70 mph) a difference of ~0.27 mm/2.97 mm that the car will have moved forward (if I have calculated that correctly). At the same time, the camera-based driver assist algorithms inside the SoC are trained to look up to 350 m ahead. I would like to understand better why 3 mm, or 0.3 mm versus 0.03 mm, matter in the high-speed direction. I intend to detail this more in my presentation at the September interim.

The camera control loop takes statistics data from the imager, processes it in an SoC, and sends control information back to the imager, which then applies the new parameters. Next to synchronizing various cameras (which in an Ethernet system can be done with the help of the 802.1AS-synchronized clocks/timestamps; using GPIO triggers for this is unnecessary), examples of items you would like to control inside the imager are the three control items that have to do with reacting to a change of light. What is the scenario in which the light changes so much within ~2 m of driving (the example of 67 ms travelled at ~70 mph/110 km/h) that the imager data is no longer useful? The uplink transmission time (10 us versus ~100 us) would only make a difference if the total control loop for exposure time exceeded the total budget of the blanking period of 4.3 ms by 100 us - 10 us = 90 us. With camera data used to look ahead 350 m, that means 11 s of travelling time at 70 mph to look ahead and prepare.
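For reference, a small Python sketch that reproduces the figures above (speed and look-ahead distance are taken from the text; the calculation itself is my restatement):

# Distance the car moves during a given camera-to-SoC latency at ~110 km/h (~70 mph),
# and the 350 m look-ahead horizon expressed as travel time.

SPEED_MPS = 110 / 3.6   # ~30.6 m/s

def travel_mm(latency_us):
    """Distance (mm) covered while the camera data is in flight."""
    return latency_us * 1e-6 * SPEED_MPS * 1000.0

baseline = travel_mm(1)   # the 1 us case as reference
for lat_us in (1, 10, 100):
    print(f"{lat_us:>3} us: {travel_mm(lat_us):5.2f} mm travelled "
          f"({travel_mm(lat_us) - baseline:5.2f} mm more than at 1 us)")

print(f"350 m look-ahead = {350 / SPEED_MPS:.1f} s of travel time")
# -> roughly 0.03/0.3/3 mm of travel, and about 11 s of look-ahead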
As said, more to come in September.
Kind regards, Kirsten

From: TJ Houck <thouck@xxxxxxxxxxx>
Hi Scott, Thanks for following up. If I didn't have the patience to respond to this, I would be very worried about handling my patience with my children, who test me 1000x more on the patience side at times 😊.

Your questions are valuable, and I encourage you to continue sharing your thoughts. Please don't feel the need to apologize. I am new, and this is my first reflector exchange, but in general, I would be surprised if these forums were not meant to prompt questioning and understanding, and that is not how I took your previous response.
The main objective of the presentation I gave on latency was to provide a high-level overview and stimulate your thinking about these parameters. I intentionally didn't delve deeper due to time constraints,
but I wanted to spark your interest in the importance of these latencies and their potential use cases. These systems rely on these latencies for critical decisions such as collision avoidance, object detection, and emergency braking at high speeds.
I will leave you with one crucial aspect to consider, and I hope this addresses some of the questions you raised.
This means you have now lost 33.33 ms for a 30 FPS system, which means you have an additional 33.3 ms until the next frame. This now totals 66.7 ms, meaning a vehicle moving at 70 mph (a standard interstate speed limit in many U.S. locations) travels an additional 6.8 feet versus 3.42 feet if those other commands had reached the sensor on time. I understand the different aspects of this decision, but the question would be directed to me as a system engineer after a severe accident: did you give the vehicle the best opportunity to respond over the communication link, or was the system burdened with unnecessary latencies that did not allow the processor to get the necessary information in time? Even worse, a simulation would show this latency difference was the determining factor in the accident's severity.
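The frame arithmetic above can be checked with a few lines of Python (the 30 FPS and 70 mph figures come from the text; nothing else is assumed):

# Missed-frame arithmetic: at 30 FPS, missing one frame costs one frame period
# (33.3 ms), and the update lands a frame later (66.7 ms total).

FPS = 30
SPEED_MPH = 70
FT_PER_S = SPEED_MPH * 5280 / 3600      # ~102.7 ft/s

frame_period_ms = 1000 / FPS            # 33.3 ms
one_frame_ft = FT_PER_S * frame_period_ms / 1000
two_frames_ft = FT_PER_S * 2 * frame_period_ms / 1000

print(f"One frame period ({frame_period_ms:.1f} ms): {one_frame_ft:.2f} ft travelled")
print(f"Two frame periods ({2 * frame_period_ms:.1f} ms): {two_frames_ft:.2f} ft travelled")
# -> roughly 3.4 ft vs. 6.8 ft, the figures quoted above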
I mentioned the audio shortcomings of Ethernet, as this hit home for me during my Tier 1 days. There are a lot of similarities here, and the latencies are needed for real-time applications. The only difference is that this system will make decisions that could determine a no/minor-injury, severe-injury, or loss-of-life scenario. As system architects, we have to ask ourselves: did I give the processor its best chance to process the information in time and allow the system to make the necessary updates, or did I burden the system with unnecessary latency and not give it the best chance? I appreciate your engagement and will address your other questions in a separate email. Your active participation is crucial for our collective understanding of this complex topic.

Best Regards, TJ

From: Scott Muma <00003414ca8b162c-dmarc-request@xxxxxxxxxxxxxxxxx>
Hi TJ, Thanks for the response, it's good to hear that I am likely misunderstanding the message you intended to convey. However, I likely remain confused, so I will explain my interpretation of some of the points and some questions they raised for me. Your continued patience is greatly appreciated.

First, in the processor to camera/sensor direction:
Slide 6 “Latency Requirements”
Slide 7 “Latency and Jitter Application Diagram”
Slide 9 “Summary”
Questions:
Second, in the camera/sensor to processor direction:
Slide 6 “Latency Requirements”
Slide 8 “Latency Requirements”
Slide 9 “Summary”
Questions:
I would not yet claim that 10us switch->camera and 1us camera->switch are not the true requirements, but these requirements will severely constrain the valid solutions and
I am concerned it will take a networked topology off the table. --
May I ask how you concluded that this is not a true requirement and how this would directly impact network solutions, as this is a different requirement than I have described?
Apologies if the wording was unclear. I was saying that I am not yet convinced one way or the other whether these are the requirements needed to meet the overall application requirements. Hopefully my questions above clarify why I'm concerned about the impact on a networked solution. I would be interested to understand how this is different from the requirement that you have described, given that Ragnar referred below to these as your “proposed requirements”.

Similarly, if we state that the requirements are precisely the observed behavior of a point-to-point connection, then connecting the camera to processor over a network may not be possible/economical. --
This sounds like you're describing two different requirements. The requirements I addressed directly reflect latency when communicating to sensors. The latency you're describing applies when this information is passed onto the network, which is a different requirement than I described. I would ask why we could not simply add other latency requirements for other network applications to cover what you're concerned about, and I encourage you to share information about these.
Perhaps this is getting to the root of our difference in understanding. I suspect that some of the group (at least me) understands the proposed requirements to be requirements derived from the processor<>camera interaction and overall application requirements, independent of the specific network topology/implementation. For example, if we have one requirement that says the GPIO input-output delay must be <10us when point-to-point and another requirement that says it must be <100us over a network, then it means we require a PHY that supports the <10us case. However, it seems unlikely and undesirable that the system would have different requirements based solely on network topology, and so this approach is likely to result in an overconstrained PHY. On the other hand, if we could say that the GPIO input-output delay can be up to 200us, but the skew/jitter at the output across multiple sensors is <1us for all topologies, then we can derive a looser PHY delay requirement and have much greater flexibility in making tradeoffs that can reduce the PHY complexity/cost/power, etc., which I understand to be some of the reasons for 802.3dm and what differentiates it from existing Ethernet PHYs.
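To make that derivation concrete, a small hypothetical Python sketch (the 200us end-to-end figure, the overhead allowance, and the hop counts are assumed example values, not proposed requirements):

# Hypothetical sketch: derive a per-PHY delay budget from an end-to-end
# application requirement, rather than fixing the PHY delay directly.
# All numbers are illustrative assumptions.

END_TO_END_BUDGET_US = 200.0   # assumed GPIO input-to-output delay allowance
OVERHEAD_US = 50.0             # assumed software/bridge/switch processing allowance

def per_phy_budget_us(phy_traversals, end_to_end_us=END_TO_END_BUDGET_US,
                      overhead_us=OVERHEAD_US):
    """Split the remaining end-to-end budget evenly across PHY traversals."""
    return (end_to_end_us - overhead_us) / phy_traversals

for hops, label in ((2, "point-to-point (2 PHY traversals)"),
                    (4, "one switch in the path (4 PHY traversals)")):
    print(f"{label}: <= {per_phy_budget_us(hops):.1f} us per PHY")

The topology with the most traversals would then set the PHY requirement, instead of the requirement changing with topology.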
Best Regards, Scott

From: TJ Houck <thouck@xxxxxxxxxxx>
Hi Scott, Thanks for the follow-up. However, I don't follow how this limits Ethernet functionality, nor did my presentation say this was the only requirement. The applications I brought up were meant to propose limits based on what the SerDes solutions address for their customers today. The presentation aimed to share how automotive ADAS systems connect sensor to bridge and switch to processor. The GPIOs are used as critical trigger events for various applications, and latency is a crucial reason why SerDes solutions are used today, since they address these needs desired by customers.

I would not yet claim that 10us switch->camera and 1us camera->switch are not the true requirements, but these requirements will severely constrain the valid solutions and
I am concerned it will take a networked topology off the table. --
May I ask how you concluded that this is not a true requirement and how this would directly impact network solutions, as this is a different requirement than I have described? Similarly, if we state that the requirements are precisely the observed behavior of a point-to-point connection, then connecting the camera to processor over a network may
not be possible/economical. --
This sounds like you're describing two different requirements. The requirements I addressed directly reflect latency when communicating to sensors. The latency you're describing applies when this information is passed onto the network, which is a different requirement than I described. I would ask why we could not simply add other latency requirements for other network applications to cover what you're concerned about, and I encourage you to share information about these.

I believe Kirsten tried to make this point in the call, --
I must’ve missed when this was brought up.
Best Regards, TJ From: Scott Muma <00003414ca8b162c-dmarc-request@xxxxxxxxxxxxxxxxx>
Hi Ragnar, Max, I would also like to have more use case discussions, and I appreciate your contributions so far. However, it would be useful to separate the behavior of specific implementations from the system/application requirements. TJ's presentation made latency/delay understandable through diagrams; however, I understood the presentation to be describing the behavior of a specific implementation.
To take this to an extreme, if we hypothetically connect a processor directly to an imager we could observe the behavior of that implementation and it might “require” even lower
latency because of decisions made by the implementer, even if the overall application has no direct requirement for such low latency. If we accept such requirements, then there is no possible alternative but a direct connection between camera and ECU.
Similarly, if we state that the requirements are precisely the observed behavior of a point-to-point connection, then connecting the camera to processor over a network may not
be possible/economical. I believe Kirsten tried to make this point in the call, and if there is no network possible then Ethernet may be burdening the solution to the point that it can’t even achieve the point-to-point case at similar cost/power/latency.
So to Max’s point on the call, I don’t expect anyone is against a solution that supports a networked topology (since that is the point of Ethernet), but overconstraining the valid solutions will prevent a networked topology. I would not yet claim that 10us switch->camera and 1us camera->switch are not the true requirements, but these requirements will severely constrain the valid solutions and I
am concerned it will take a networked topology off the table. Best Regards, Scott From: Ragnar Jonsson <rjonsson@xxxxxxxxxxx>
Hi Max and all, At the end of yesterday's meeting Max asked if we should have more Use-Case ad hoc meetings before the September meeting. There was a problem with my microphone, so you probably did not hear my comment. I think that we obviously need to have more Use-Case ad hoc meetings before the September meeting. While yesterday's ad hoc was a good start, we did not even have time to finish going over your proposed definitions of delay vs. latency. Kirsten has already sent a follow-up email highlighting the need to finish that discussion. I think that we need a deeper dive on the latency/delay requirements. There was a factor of 10 difference between the two proposed latency requirements presented in Montreal:

Kirsten presented https://www.ieee802.org/3/dm/public/0724/matheus_dm_02b_latency_07152024.pdf. On slide 3 it states “It provides concrete examples of latency and latency requirements in a camera system.” On slide 9 it states “Ethernet latencies of 10us in the DS and of 100us in the US are sufficiently small …”

TJ presented https://www.ieee802.org/3/dm/public/0724/houck_fuller_3dm_01_0724.pdf. On slide 9 it states “It is proposed to limit the latency to 10us worst case in the switch to camera direction and 1us worst case in the camera to switch direction.” TJ told us that these requirements are based on conversations with multiple OEMs and with the ADAS SoC vendors.

There are also other issues that were brought up in Montreal related to Use-Cases that need further discussion. In summary, we clearly need more Use-Case ad hoc meetings.

Ragnar