Re: [RE] Time sensitivities in ResE. Part 1: Synchronization
Michael,
I am glad that you have separated latency, jitter and synchronization as
these need to be treated as separate (albeit related) items.
Regarding the last of these, would you agree that synchronization is a
control plane issue? There is no specific requirement for the data
plane to be synchronized. This is a very important distinction as it
gives network designers much more freedom (and rules out the need for
ATM :-)
I will consider an implementation of synchronization that can achieve
~10us synchronization over an Ethernet network. Less precise
synchronization can be achieved by using simpler and cheaper approaches
in the slave device, but a proof of concept might as well aim high.
1. Assumptions
I am assuming that the master and slave devices both have 50ppm crystal
oscillators.
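To put a number on this assumption (my arithmetic, not a measurement):
if the two 50ppm oscillators err in opposite directions, the relative
error is up to 100ppm, so an uncorrected slave could drift by up to
100us between heartbeats that are one second apart. A minimal Python
sketch; the function name is made up for illustration:

```python
def worst_case_drift_us(ppm_master, ppm_slave, interval_s):
    """Worst-case relative drift, in microseconds, over interval_s seconds.

    Assumes the two oscillators err in opposite directions by their full
    rated tolerance, so the two errors add.
    """
    relative_ppm = ppm_master + ppm_slave
    # 1 ppm sustained for 1 second is 1 microsecond.
    return relative_ppm * interval_s


print(worst_case_drift_us(50, 50, 1.0))  # 100.0
```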
I am assuming that someone has defined or will define a session layer
protocol for electing a clock master. I am not an expert in this subject
but I know that there is a WG in IETF that covers network timing.
I am assuming that someone has defined or will define a transport layer
protocol for the clock master to periodically broadcast timestamp
packets (the "heartbeat" packets below). These packets are sent at the
highest network priority and are the only packets of that type on the
network. The packets are sent at approximately 1 second intervals and
they contain the current time in the master to a precision of 1ms.
2. In a perfect world...
The network has no traffic other than the heartbeat packets. Therefore
the delay to the slave is always constant.
The slave maintains a counter that counts up to one microsecond and uses
it to keep its current time in microseconds. Additionally, the slave has
an adjustment factor variable. Every 10ms it adds or subtracts the most
significant digit of the adjustment variable to or from the current
time. Every 100ms it adds or subtracts the next most significant digit
of the adjustment variable to or from the current time - etc. to the
precision of the adjustment variable.
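A rough Python sketch of this digit-by-digit adjustment, assuming each
digit is a count of microseconds. Under that assumption the first digit
is worth 100ppm per unit, the second 10ppm and the third 1ppm. The class
and function names are hypothetical, not from any specification:

```python
class SlaveClock:
    """Hypothetical model of the slave's microsecond clock."""

    def __init__(self, digits, sign=1):
        self.time_us = 0            # current time in microseconds
        self.digits = list(digits)  # adjustment digits, most significant first
        self.sign = sign            # +1 speeds the clock up, -1 slows it down
        self.steps = 0              # number of 10 ms steps taken so far

    def step_10ms(self):
        """Advance 10 ms of local oscillator time, then apply adjustments."""
        self.time_us += 10_000
        self.steps += 1
        period = 1  # in units of 10 ms: 1, 10, 100, ...
        for digit in self.digits:
            if self.steps % period == 0:
                self.time_us += self.sign * digit
            period *= 10


def adjustment_ppm(digits):
    """Rate correction implied by the digits, in parts per million."""
    return sum(d * 100 / 10 ** i for i, d in enumerate(digits))


clock = SlaveClock(digits=[0, 5, 0])
for _ in range(100):               # one second of local time
    clock.step_10ms()
print(clock.time_us)               # 1000050
print(adjustment_ppm([0, 5, 0]))   # 50.0
```

With digits [0, 5, 0] the slave adds 5us every 100ms, which is the
correction needed if its oscillator runs 50ppm slow.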
On receipt of a heartbeat packet the slave will notice any discrepancy
between its current time and the timestamp in the heartbeat packet. It
modifies its adjustment variable to compensate for this difference so
that most times it receives a heartbeat packet there is no discrepancy.
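One possible form of that feedback, as a sketch only. I am assuming an
idle network with zero delay, timestamps that are exact (ignoring the
1ms precision mentioned earlier), and an adjustment variable held as a
rate in ppm. The update rule here is my own simple choice and the names
are hypothetical:

```python
def simulate_lock(slave_error_ppm, heartbeats, interval_s=1.0):
    """Return the offset (us) the slave observes at each heartbeat."""
    correction_ppm = 0.0  # the "adjustment variable", held as a rate
    slave_us = 0.0
    master_us = 0.0
    offsets = []
    for _ in range(heartbeats):
        master_us += interval_s * 1e6
        # The slave oscillator is off by slave_error_ppm; the adjustment
        # variable removes correction_ppm of that error.
        slave_us += interval_s * (1e6 + slave_error_ppm - correction_ppm)
        offset = slave_us - master_us
        offsets.append(offset)
        # Fold the observed rate error into the adjustment variable and
        # step the clock back onto the master's time.
        correction_ppm += offset / interval_s
        slave_us = master_us
    return offsets


print(simulate_lock(37.0, 3))
```

In this idealized case the slave sees the full 37us discrepancy on the
first heartbeat and none on the following ones.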
Over time, the oscillators in both the master and slave will drift or
vary due to temperature changes etc. This mechanism will keep the two
clocks in synch with a frequency jitter component in the very high kHz -
MHz range.
3. In the real world
There is traffic running over the links that are also transmitting the
heartbeat packets. Because the heartbeat packet is higher priority than
any other packet, it will never be delayed by more than one other packet
on any given link. However, a packet in flight cannot be interrupted
(well, it can - but we won't go there for now), so the heartbeat packets
experience a variable increase in latency compared to the idle network
time. If the traffic in the network is at 100% utilization then the
average added delay on a link is 1/2 the transmission time of an
average-size packet. If the network is at less than 50% utilization then
most heartbeat packets experience no additional delay on a link.
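To give a feel for the magnitudes, here is a small Python sketch. The
100 Mbit/s link rate, the 1518-byte maximum frame and the 3-hop path are
my own illustrative numbers, and preamble and inter-frame gap are
ignored. The per-hop bound is my reading of "no more than one other
packet" applied to each link in turn:

```python
def frame_time_us(frame_bytes, link_bps):
    """Time to serialize one frame onto the link, in microseconds."""
    return frame_bytes * 8 * 1e6 / link_bps


def worst_case_added_delay_us(max_frame_bytes, link_bps, hops):
    """Upper bound on added delay: one maximum-size frame per hop."""
    return hops * frame_time_us(max_frame_bytes, link_bps)


def mean_added_delay_saturated_us(avg_frame_bytes, link_bps):
    """Mean added delay per link at 100% utilization, per the rule above."""
    return 0.5 * frame_time_us(avg_frame_bytes, link_bps)


print(frame_time_us(1518, 100e6))                  # about 121.44 us
print(worst_case_added_delay_us(1518, 100e6, 3))   # about 364.32 us
print(mean_added_delay_saturated_us(1518, 100e6))  # about 60.72 us
```

So on a loaded 100 Mbit/s network the raw jitter on a single heartbeat
is far larger than the ~10us target, which is why the slave has to
filter the heartbeats rather than trust each one.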
When the slave receives two consecutive heartbeat packets, there will
sometimes be a discrepancy between the times indicated - due to network
jitter. Because queuing can only add delay, the slave always knows that
the earlier of the two is closest to the correct time as indicated on
the idle network. Using this principle, the slave collects a sequence of
heartbeat packets and consistently takes the minimum indicated time from
the set as being correct and modifies its adjustment variable according
to that time.
Clearly, the length of the sequence will dictate the speed of reaction
to any real frequency changes and that, along with the precise mechanism
of modifying the adjustment variable make up the hysteresis in this
frequency lock mechanism.
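A minimal Python sketch of the minimum filter. I am assuming the slave
records its own clock value when each heartbeat arrives and works with
the apparent offset (local arrival time minus master timestamp), which
includes the constant idle-network delay. The class name and the window
length are hypothetical:

```python
from collections import deque


class MinDelayFilter:
    """Keep the last `window` apparent offsets and report the smallest."""

    def __init__(self, window):
        self.samples = deque(maxlen=window)  # oldest sample drops out first

    def update(self, local_rx_us, master_stamp_us):
        """Record one heartbeat and return the current minimum offset (us)."""
        self.samples.append(local_rx_us - master_stamp_us)
        return min(self.samples)


f = MinDelayFilter(window=3)
print(f.update(1_000_500, 1_000_000))  # 500
print(f.update(2_000_620, 2_000_000))  # 500 (second packet was queued)
print(f.update(3_000_480, 3_000_000))  # 480
```

A longer window is more likely to contain a heartbeat that was not
queued at all, at the cost of reacting more slowly to a real frequency
change.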
It is likely that a real implementation would benefit from dual
mechanisms - a rapid convergence to lock and then a very slow and stable
mechanism to track against frequency drift in the master & slave
oscillators.
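One way the dual mechanism might look, purely as a sketch: use a large
loop gain until recent offsets are small, then drop to a small gain for
slow tracking. The threshold, the two gain values and the function
names are invented for illustration and have not been tuned or analyzed:

```python
def choose_gain(recent_offsets_us, lock_threshold_us=10.0,
                fast_gain=1.0, slow_gain=0.05):
    """Pick the loop gain: fast until all recent offsets are small."""
    locked = bool(recent_offsets_us) and all(
        abs(o) <= lock_threshold_us for o in recent_offsets_us)
    return slow_gain if locked else fast_gain


def update_correction(correction_ppm, offset_us, interval_s, gain):
    """Move the rate correction by a fraction `gain` of the observed error."""
    return correction_ppm + gain * offset_us / interval_s


print(choose_gain([120.0, 40.0]))  # 1.0  (still converging)
print(choose_gain([3.0, -2.0]))    # 0.05 (locked, track slowly)
```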
4. Considerations
Anybody writing the definition for the session and transport protocols
needs to analyze this mechanism to decide whether 1ms-precision packets
once per second are the correct precision & frequency. I would recommend
discussing this with people in the IETF WG - although others may also
attempt the same job.
An implementer and/or an architecture requirement specification might
want to explore the hysteresis characteristics along with statistical
models of realistic networks to see how complex (or simple) a given
slave implementation would have to be. Clearly, a slave that only
requires 20ms precision could be much simpler than a slave that requires
<10us. It would also be feasible to explore the mechanism to see how
fine the resolution could go for a small, bounded network. My guess is
that it would reach sub-us as long as the frequency drift of the
oscillators is controlled reasonably (i.e. no large & rapid temperature
variations).
None of the above requires any changes to IEEE 802.3 or 802.1. Neither
is it related to my job, so I am limited to "coffee room" type discussion.
Hugh.
< not speaking for Cisco - in fact, no one on this forum is "speaking
for Cisco" >
Michael Johas Teener wrote:
>There seem to be some misunderstandings on what the SG is asking for in
>terms of time-sensitive data. Let me try to explain what I think is the
>intent. I will be doing this in two separate emails since I expect that we
>will end up with separate threads of discussion.
>
>Endpoint synchronization:
>
>There are two primary reasons that endpoints in a media-streaming
>interconnect need to be synchronized.
>
>1) At the source of the stream there frequently is a time base that is not
>under the immediate control of the local network. It could be the
>transmission of an MPEG stream from a broadcast source (terrestrial, cable,
>satellite) ... or most extreme, the sampling clock of an audio A/D. At an
>endpoint where the stream is being reconstructed this sampling clock must be
>reproduced very accurately. If the clock has long term drift (frequency
>mismatch) then you eventually have to drop/add data to get the buffers
>aligned. If the clock has short term jitter or modulation, then the
>reproduced data suffers from distortion. For lack of a better term, perhaps
>we could call this "sample clock synchronization". The general rule for this
>is to do as good as possible ... hence the rather fuzzy requirements in the
>SG objectives list.
>
>2) When a stream has multiple destination endpoints, and those endpoints
>reproduce the data together in a way to create a coherent environment, then
>requirements are placed on how accurately the timing of the reproduction of
>the stream on one endpoint is done with respect to the timing on the other
>endpoint(s). For example, an MPEG stream is sent to both a TV and an audio
>amplifier ... lip sync must be maintained, requiring synchronization on the
>order of 10-20ms (one field or less). For digital speaker systems, the
>synchronization between speakers must be on the order of 200us (some
>audiophiles argue for 10us, but the majority of the imaging information is
>below 3kHz). This kind of thing might be called "presentation time
>synchronization". This is one place where the SG received conflicting
>information, so the more restrictive number was used ... since existence
>proofs (IEEE 1588 and IEEE 1394) shows that this kind of synchronization can
>easily be done with very low complexity.
>
>The numbers I quote here are open for discussion. I think we should all do
>what we can to document the requirements further, with appropriate
>references.
>
>Comments? Additions? Corrections?
>
>
>