
Re: [RE] Time sensitivities in ResE. Part 1: Synchronization



Michael,

I am glad that you have separated latency, jitter and synchronization as 
these need to be treated as separate (albeit related) items.

Regarding the last of these, would you agree that synchronization is a 
control plane issue? There is no specific requirement for the data 
plane to be synchronized. This is a very important distinction as it 
gives network designers much more freedom (and rules out the need for 
ATM :-)

I will consider an implementation of synchronization that can achieve 
~10us accuracy over an Ethernet network. Less precise synchronization 
can be achieved with simpler and cheaper approaches in the slave 
device, but a proof of concept might as well aim high.

1. Assumptions

I am assuming that the master and slave devices both have 50ppm crystal 
oscillators.

I am assuming that someone has defined or will define a session layer 
protocol for electing a clock master. I am not an expert in this subject 
but I know that there is a WG in IETF that covers network timing.

I am assuming that someone has defined or will define a transport layer 
protocol for the clock master to periodically broadcast timestamp 
packets. These packets are sent at the highest network priority and are 
the only packets of that type on the network. The packets are sent at 
approximately 1 second intervals and they contain the current time in 
the master to a precision of 1ms.
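
To put a number on the oscillator assumption, here is a small Python 
sketch of the worst-case drift between two heartbeats. The function name 
and the assumption that the two crystals err in opposite directions are 
illustrative only, not part of any defined protocol.

```python
def worst_case_drift_us(master_ppm, slave_ppm, interval_s):
    """Worst-case drift in microseconds between two free-running clocks.

    Assumes the two crystals are off in opposite directions, so their
    tolerances add. 1 ppm over 1 second is 1 microsecond, so the result
    is simply (sum of ppm tolerances) * (interval in seconds).
    """
    return (master_ppm + slave_ppm) * interval_s


# Two 50ppm crystals, heartbeat roughly once per second (from the text).
drift_us = worst_case_drift_us(50, 50, 1.0)
print(f"{drift_us:.0f} us")  # prints "100 us"
```

So with no correction the two clocks could be up to 100us apart by the 
next heartbeat, which is why a ~10us target needs the slave to correct 
its rate and not just its offset.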

2. In a perfect world...

The network has no traffic other than the heartbeat packets (the 
periodic timestamp packets from the master). Therefore the delay to the 
slave is always constant.

The slave maintains a counter that counts up to microseconds. It then 
keeps its current time in microseconds. Additionally, the slave has an 
adjustment factor variable. Every 10ms it adds or subtracts the most 
significant digit of the adjustment variable to or from the current 
time. Every 100ms it adds or subtracts the next significant digit of the 
adjustment variable to or from the current time, and so on down to the 
precision of the adjustment variable.
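
One possible reading of this digit-by-digit adjustment, sketched in 
Python. The class name, the three-digit adjustment and the 1ms tick are 
my assumptions for illustration; a real slave would do this in hardware 
or firmware.

```python
class SlaveClock:
    """Free-running microsecond clock with a decimal-digit rate adjustment.

    digits[0] is applied every 10 ms, digits[1] every 100 ms,
    digits[2] every 1000 ms, and so on. sign is +1 to speed the clock
    up and -1 to slow it down.
    """

    def __init__(self, digits=(0, 0, 0), sign=+1):
        self.time_us = 0          # current time in microseconds
        self.ticks_ms = 0         # local milliseconds elapsed
        self.digits = list(digits)
        self.sign = sign

    def advance_ms(self, n=1):
        """Advance the local clock by n milliseconds of crystal time."""
        for _ in range(n):
            self.time_us += 1000
            self.ticks_ms += 1
            period_ms = 10
            for digit in self.digits:
                if self.ticks_ms % period_ms == 0:
                    self.time_us += self.sign * digit
                period_ms *= 10


clock = SlaveClock(digits=(1, 2, 3))
clock.advance_ms(1000)
print(clock.time_us)  # prints 1000123
```

With this reading the adjustment variable, read as a decimal number, is 
the rate correction in microseconds per second (i.e. ppm): digits 1, 2, 3 
add 100 + 20 + 3 = 123us over one second, spread out in small steps 
rather than applied as one jump.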

On receipt of a heartbeat packet the slave will notice any discrepancy 
between its current time and the timestamp in the heartbeat packet. It 
modifies its adjustment variable to compensate for this difference so 
that most times it receives a heartbeat packet there is no discrepancy.
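
A minimal sketch of that update step in Python, assuming the timestamp 
has been converted to microseconds, the adjustment is held as a rate in 
ppm, and the constant network delay has already been removed. The 
function name and the simple "apply the whole error" rule are 
illustrative; a real slave would probably damp the correction.

```python
def on_heartbeat(local_time_us, master_time_us, adjustment_ppm,
                 interval_s=1.0):
    """Return (corrected_time_us, new_adjustment_ppm) after one heartbeat.

    error_us > 0 means the slave is behind the master. An error of N us
    accumulated over interval_s seconds corresponds to a rate error of
    N / interval_s ppm, which is added to the existing adjustment.
    """
    error_us = master_time_us - local_time_us
    new_adjustment_ppm = adjustment_ppm + error_us / interval_s
    return master_time_us, new_adjustment_ppm


# Slave crystal runs 80ppm slow: after 1 s it is 80 us behind the master.
time_us, adj_ppm = on_heartbeat(999_920, 1_000_000, 0.0)
print(time_us, adj_ppm)  # prints 1000000 80.0
```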

Over time, the oscillators in both the master and slave will drift or 
vary due to temperature changes etc. This mechanism will keep the two 
clocks in synch with a frequency jitter component in the very high kHz - 
MHz range.

3. In the real world

There is traffic running over the links that are also transmitting the 
heartbeat packets. Because the heartbeat packet is higher priority than 
any other packet, it will never be delayed by more than one other packet 
on any given link. However, a packet in flight cannot be interrupted 
(well, it can - but we won't go there for now), therefore the heartbeat 
packets experience a variable increase in latency compared to the idle 
network time. If the traffic in the network is at 100% utilization then 
the average jitter per link is about 1/2 the transmission time of an 
average-sized packet. If the network is at less than 50% utilization 
then most heartbeat packets experience no additional delay on a link.
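
To get a feel for the size of this jitter, here is a rough Python 
calculation of how long one maximum-size frame occupies a link. The link 
speeds are my assumptions (the discussion above does not fix one); the 
1518 byte maximum untagged frame plus 20 bytes of preamble and 
inter-frame gap are the usual Ethernet figures.

```python
def frame_time_us(frame_bytes, link_bps, overhead_bytes=20):
    """Time in microseconds that one frame occupies the wire.

    overhead_bytes covers the 8 byte preamble/SFD and the 12 byte
    inter-frame gap that accompany every Ethernet frame.
    """
    bits = (frame_bytes + overhead_bytes) * 8
    return bits * 1e6 / link_bps


for name, bps in (("100 Mb/s", 100e6), ("1 Gb/s", 1e9)):
    worst = frame_time_us(1518, bps)
    # Half of the worst case is the 100%-utilization average only if
    # every frame is maximum size; it is shown just for scale.
    print(f"{name}: worst case {worst:.2f} us per hop, "
          f"half of that {worst / 2:.2f} us")
```

On a 100 Mb/s link a single blocking frame can cost roughly 123us per 
hop, well above a 10us target, so the slave cannot simply trust every 
heartbeat as it arrives.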

When the slave receives two consecutive heartbeat packets, there will 
sometimes be a discrepancy between the times indicated - due to network 
jitter. The slave will always know that the earlier of the two is 
closest to the correct time as indicated on the idle network. Using this 
principle, the slave collects a sequence of heartbeat packets and 
consistently takes the minimum indicated time from the set as being 
correct and modifies its adjustment variable according to that time. 
Clearly, the length of the sequence will dictate the speed of reaction 
to any real frequency changes and that, along with the precise mechanism 
of modifying the adjustment variable make up the hysteresis in this 
frequency lock mechanism.
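
The minimum-selection idea could be sketched like this in Python. The 
class name and the window length of 16 are illustrative assumptions, and 
the sketch ignores the slave's own drift within the window, which a real 
implementation would have to account for.

```python
from collections import deque


class MinDelayFilter:
    """Keep the last `window` heartbeats and report the smallest offset.

    Each sample is (local receive time - master timestamp), both in
    microseconds. Queuing can only add delay, so the smallest sample in
    the window is the one closest to the idle-network delay.
    """

    def __init__(self, window=16):
        self.samples = deque(maxlen=window)

    def add(self, master_time_us, local_rx_time_us):
        self.samples.append(local_rx_time_us - master_time_us)
        return min(self.samples)


# Fixed 5 us path delay plus varying queuing delay on each heartbeat.
filt = MinDelayFilter(window=4)
best = None
for i, queuing_us in enumerate([60, 0, 120, 30]):
    master_ts = i * 1_000_000
    best = filt.add(master_ts, master_ts + 5 + queuing_us)
print(best)  # prints 5
```

A longer window makes it more likely that at least one heartbeat got 
through undelayed, at the cost of reacting more slowly to real frequency 
changes.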

It is likely that a real implementation would benefit from dual 
mechanisms - a rapid convergence to lock and then a very slow and stable 
mechanism to track against frequency drift in the master & slave 
oscillators.
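
As a very rough illustration of the dual mechanism, a slave might switch 
its filter window and correction gain depending on whether it looks 
locked. All of the names and numbers below (10us threshold, windows of 4 
and 64, gains of 1.0 and 0.1) are hypothetical placeholders, not 
recommendations.

```python
def choose_loop_params(recent_errors_us, lock_threshold_us=10.0):
    """Pick filter window and correction gain from recent errors.

    If every recent error is within the threshold the slave is treated
    as locked and uses a long window and small gain (slow, stable
    tracking). Otherwise it uses a short window and full gain (rapid
    convergence).
    """
    locked = bool(recent_errors_us) and all(
        abs(e) <= lock_threshold_us for e in recent_errors_us
    )
    if locked:
        return {"window": 64, "gain": 0.1}
    return {"window": 4, "gain": 1.0}


print(choose_loop_params([250.0, 40.0]))     # acquiring lock
print(choose_loop_params([3.0, -2.0, 1.0]))  # locked, tracking drift
```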

4. Considerations

Anybody writing the definition for the session and transport protocols 
needs to analyze this mechanism to decide whether 1ms precision packets 
every second is the correct precision & frequency. I would recommend 
discussing this with people in the IETF WG - although others may also 
attempt the same job.

An implementer and/or an architecture requirement specification might 
want to explore the hysteresis characteristics along with statistical 
models of realistic networks to see how complex (or simple) a given 
slave implementation would have to be. Clearly, a slave that only 
requires 20ms precision could be much simpler than a slave that requires 
<10us. It would also be feasible to explore the mechanism to see how 
fine the resolution could go for a small, bounded network. My guess is 
that it would reach sub-microsecond levels as long as the frequency 
drift of the oscillators is controlled reasonably (i.e. no large & rapid 
temperature variations).

None of the above requires any changes to IEEE 802.3 or 802.1. Neither 
is it related to my job, so I am limited to "coffee room" type discussion.

Hugh.
< not speaking for Cisco - in fact, no one on this forum is "speaking 
for Cisco" >

Michael Johas Teener wrote:

>There seem to be some misunderstandings on what the SG is asking for in
>terms of time-sensitive data. Let me try to explain what I think is the
>intent. I will be doing this in two separate emails since I expect that we
>will end up with separate threads of discussion.
>
>Endpoint synchronization:
>
>There are two primary reasons that endpoints in a media-streaming
>interconnect need to be synchronized.
>
>1) At the source of the stream there frequently is a time base that is not
>under the immediate control of the local network. It could be the
>transmission of an MPEG stream from a broadcast source (terrestrial, cable,
>satellite) ... or most extreme, the sampling clock of an audio A/D. At an
>endpoint where the stream is being reconstructed this sampling clock must be
>reproduced very accurately. If the clock has long term drift (frequency
>mismatch) then you eventually have to drop/add data to get the buffers
>aligned. If the clock has short term jitter or modulation, then the
>reproduced data suffers from distortion. For lack of a better term, perhaps
>we could call this "sample clock synchronization". The general rule for this
>is to do as good as possible ... hence the rather fuzzy requirements in the
>SG objectives list.
>
>2) When a stream has multiple destination endpoints, and those endpoints
>reproduce the data together in a way to create a coherent environment, then
>requirements are placed on how accurately the timing of the reproduction of
>the stream on one endpoint is done with respect to the timing on the other
>endpoint(s). For example, an MPEG stream is sent to both a TV and an audio
>amplifier ... lip sync must be maintained, requiring synchronization on the
>order of 10-20ms (one field or less). For digital speaker systems, the
>synchronization between speakers must be on the order of 200us (some
>audiophiles argue for 10us, but the majority of the imaging information is
>below 3kHz). This kind of thing might be called "presentation time
>synchronization". This is one place where the SG received conflicting
>information, so the more restrictive number was used ... since existence
>proofs (IEEE 1588 and IEEE 1394) shows that this kind of synchronization can
>easily be done with very low complexity.
>
>The numbers I quote here are open for discussion. I think we should all do
>what we can to document the requirements further, with appropriate
>references.
>
>Comments? Additions? Corrections?
>