RE: Question about Link Fault Signalling
If the major point of the discussion next week is how many idles, we are in
a good position. I think, though, that we are at the point where we could
over-engineer this solution.
To me we are at a point of subjective decision on when is it good enough
(for most of us the natural tendency is to keep engineering until it is
perfect). In the imperfect solution world, I would choose oscillating out
and back into fault because of idle insertion rather than doing it because
of a false Start character.
I believe there is little difference between 4*4 and 8*4 idles since both
would require specific RS action to ensure that number of idles is
transmitted. To be robust to transmission errors, just sending the
requisite number of idles is not perfect either; the RS would have to send
enough idles that the probability of transmission noise creating the
undesired state is much smaller than the probability of a MAC transmitting
at maximum rate.
At the moment, I am willing to take an imperfect solution because it is
simple, and I SWAG the probability of Idle insertion falsely clearing the
fault condition to be acceptably small, with the symptoms of the rare
oscillation being innocuous.
--Bob
-----Original Message-----
From: Stephen.Finch@xxxxxx [mailto:Stephen.Finch@xxxxxx]
Sent: Thursday, January 04, 2001 7:53 AM
To: stds-802-3-hssg@xxxxxxxx
Subject: RE: Question about Link Fault Signalling
Bob,
3*I and 3*P have their own set of problems.
Consider the following connection:
MAC/RS -> PCS -> PMA/PMD -> media -> PMA/PMD -> PCS -> MAC/RS.
Consider that the clock at the first MAC is -50ppm, the clock at the
PMA-media-PMA is 0ppm, and at the last MAC +50 ppm.
In this case, both PCS's must insert Idles periodically.
If the LF sequence is LF-I-LF-I, and both PCS determine they must insert
Idles, they could do so at the same point in the stream. The result
would be LF-I-I-I-LF.
So the receiving RS sees a local fault, then clears it, then sets it.
Could it start a packet while it has the LF cleared?
A very small window, for sure, but....
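To make the window concrete, here is a minimal sketch of the LF-I-I-I-LF case in Python. It works at the level of 4-byte columns; the column labels ("LF", "I"), the function names, and the assumption that both insertions land next to the same original Idle are all hypothetical, chosen only to illustrate the point.

```python
# Hypothetical column-level model: "LF" is a Local Fault status column and
# "I" is an Idle column (each column is four bytes wide).

def insert_idle(stream, position):
    """Return a copy of stream with one Idle column inserted at position."""
    return stream[:position] + ["I"] + stream[position:]

def longest_idle_run(stream):
    """Length of the longest run of consecutive Idle columns."""
    best = run = 0
    for column in stream:
        run = run + 1 if column == "I" else 0
        best = max(best, run)
    return best

def clears_fault(stream, threshold):
    """True if the stream holds at least `threshold` consecutive Idles."""
    return longest_idle_run(stream) >= threshold

original = ["LF", "I", "LF", "I", "LF", "I", "LF"]
# Assumption: each PCS inserts one Idle next to the same original Idle.
after_first_pcs = insert_idle(original, 2)
after_second_pcs = insert_idle(after_first_pcs, 2)

print(after_second_pcs)
print("clears at 3 idles:", clears_fault(after_second_pcs, 3))
print("clears at 4 idles:", clears_fault(after_second_pcs, 4))
```

Under these assumptions a 3-Idle exit rule is tripped by the two insertions, while a 4-Idle rule is not.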
Counting 4 or 8 of either data or Idle sounds better. If one end
transitions from transmitting RF to transmitting a packet without the
number of Idles between the last RF and the Start needed to detect that we
have cleared the RF condition, then a packet gets lost. So be it. Packets
get lost occasionally in the Ethernet world. Higher level routines
should detect this and recover.
Another option is to clear the RF detection with either 4 or 8 idles, or
a Start.
Steve Finch
"Grow, Bob" <bob.grow@xxxxxxxxx> on 01/03/2001 06:00:25 PM
To: Stephen Finch/SSI1@SSI1
cc:
Subject: RE: Question about Link Fault Signalling
We don't need a new comment; we can discuss and fix it in responding to
Shimon's comment. The committee approves/generates the responses. The
diehards that support clause editors get to help accelerate the process by
developing proposed responses in smaller groups.
While most implementers will combine them, the MAC and RS are separate
entities in the standard. In this particular condition the MAC could be
generating frames which immediately go into the RS bit bucket. When the RS
switches the mux to the MAC service interface data, it is logically
instantaneous. We agree it should be fixed correctly.
As to your ideas:
1. This requires changes to clause 4 and 6. Yuch!
2. Too liberal, but I might go for 8*4 Data or 8*4 Idle
3. No, 3*4 is the magic number. 4*4 isn't guaranteed with minimum IPG.
4a. The Idle insertion must recognize the fault state and only insert Pulse
words. Yuch!
4b. The exit requirement counts Idle words and resets on a Pulse word, thus
ignoring frame words. (This would be one more reason to turn this into a
state machine, which will be strongly resisted by certain influential
people.)
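As a rough illustration of idea 4b, here is a minimal Python sketch of an exit counter that counts Idle words, resets on a Pulse word, and skips frame words. The labels "I", "P" and "D", the function name, and the default threshold of 3 are assumptions made for the example, not text from the draft.

```python
def exit_fault_4b(columns, idle_threshold=3):
    """Return the index at which the fault would clear, or None.

    columns is a list of hypothetical labels: "I" (Idle word),
    "P" (Pulse/status word) and "D" (frame data word).
    """
    idle_count = 0
    for index, column in enumerate(columns):
        if column == "I":
            idle_count += 1
            if idle_count >= idle_threshold:
                return index
        elif column == "P":
            idle_count = 0  # a Pulse word restarts the count
        # "D" words are ignored: they neither count nor reset
    return None

# Idles separated by frame words still accumulate.
print(exit_fault_4b(["I", "D", "D", "I", "D", "I"]))
# Alternating Pulse and Idle words never reach the threshold.
print(exit_fault_4b(["P", "I", "P", "I", "P", "I"]))
```

Because frame words do not reset the count, a MAC sending at maximum rate cannot hold the receiver in the fault state in this model; only continued Pulse words can.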
I'd favor either 8*4 Data or 8*4 Idle; or 3*4 Idle. If the latter, entry
should be 3 Pulse words also. I think I will re-edit the proposed response
for the 3 gets you in and 3 gets you out. (Why be so religious about powers
of 2?) For the meeting, I will generate a markup of Shimon's
SuggestedRemedy so it will be easy to see what I have changed.
--Bob
-----Original Message-----
From: Stephen.Finch@xxxxxx [mailto:Stephen.Finch@xxxxxx]
Sent: Wednesday, January 03, 2001 4:44 PM
To: Grow, Bob
Subject: RE: Question about Link Fault Signalling
By George I think you've found a problem I had overlooked!
While it is a problem, real systems might not have an issue anyway, as
the propagation time from the detection of Idles at MAC1 until it will
actually get a packet out will probably be longer than is needed.
But, with that being said, we should fix the problem correctly.
Some ideas (without much thought):
1. require a minimum number of Idles be sent before the first packet.
2. allow the RF fault to clear on any sequence of 8 non-RF groups to
enable the transmitter.
3. change the 2 or 8 requirement to 4 (minimize the problem?)
4. other ideas??
I just sent off my ballot, so I can't add this one. Can you or someone
else?
Regards,
Steve Finch
"Grow, Bob" <bob.grow@xxxxxxxxx> on 01/03/2001 04:44:34 PM
To: Stephen Finch/SSI1@SSI1
cc:
Subject: RE: Question about Link Fault Signalling
Your explanation makes me go back to our previous exchange on my edits to
Shimon's text.
Step 7 is where I am concerned if the LinkStatus change to OK requires 8 * 4
bytes of Idle, because that cannot be guaranteed by minimum IPG. If MAC 1
is spewing frames at maximum rate, MAC 2 will never complete step 8 because
the Idle is interrupted by frames. As you pointed out, if we leave it as
2 * 4 bytes of Idle, we would oscillate between OK and Down on Idle
insertion in the presence of a real fault.
Am I missing something? I don't see how either 2*4 Idle or 8*4 Idle is
right. (I haven't yet been able to break 3*4 Idle, since that will
eventually occur in the IPG of minimum spaced frames.)
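A small Python sketch of the argument, under a loud simplification: it assumes every minimum inter-packet gap shows up at the receiver as three full Idle columns (3*4 bytes). Real gaps depend on where the frame ends relative to the column boundary, so treat the numbers, labels and function names as hypothetical.

```python
def build_stream(frames, frame_columns, ipg_idle_columns):
    """Back-to-back frames ("D" columns) separated by Idle ("I") columns."""
    stream = []
    for _ in range(frames):
        stream += ["D"] * frame_columns + ["I"] * ipg_idle_columns
    return stream

def clears_with_consecutive_idles(stream, threshold):
    """True if `threshold` consecutive Idle columns ever appear."""
    run = 0
    for column in stream:
        run = run + 1 if column == "I" else 0
        if run >= threshold:
            return True
    return False

# Assumption: a MAC at maximum rate leaves exactly 3 Idle columns per gap.
traffic = build_stream(frames=100, frame_columns=16, ipg_idle_columns=3)

for threshold in (2, 3, 8):
    print(threshold, clears_with_consecutive_idles(traffic, threshold))
```

In this model thresholds of 2 and 3 columns are reached in the gap, while 8 never is, no matter how many frames are sent. The oscillation problem with 2 comes from Idle insertion during a real fault, which this sketch does not model.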
--Bob
-----Original Message-----
From: Stephen.Finch@xxxxxx [mailto:Stephen.Finch@xxxxxx]
Sent: Wednesday, January 03, 2001 3:53 PM
To: stds-802-3-hssg@xxxxxxxx
Subject: Re: Question about Link Fault Signalling
David,
let me take a swing at this ball....
Power up
1. Both sides send idles until idles are received; packets
can be sent only if Idles and/or packets are being received.
     |--(Idle)--> PHY            |
MAC1 |                           | MAC2
     |            PHY <--(Idle)--|
2. Since we're starting up, the links aren't locked yet, so
somewhere in the phy on the receive path in both directions
someone is generating LF. This probably happens in both
directions, so both sides are generating LFs somewhere in the
PHY layer.
     |--(Idle)--> PHY ----(LF)-->|
MAC1 |                           | MAC2
     |<--(LF)---- PHY <--(Idle)--|
3. Both MAC/RS's see LF and start sending RF (instead of just Idles). We
stay here until a link comes up.
     |----(RF)--> PHY ----(LF)-->|
MAC1 |                           | MAC2
     |<--(LF)---- PHY <--(RF)----|
4. One or the other of the links obtains sync and starts working (you
can work out what happens if both come up simultaneously). We'll assume
it's on the top path. When a link comes up, it stops generating LF and
passes through what is being sent, i.e., the RF that is being sent by
the MAC (with RS) in its direction.
     |----(RF)--> PHY ----(RF)-->|
MAC1 |                           | MAC2
     |<--(LF)---- PHY <--(RF)----|
5. MAC2 receives the RFs. This causes MAC2 to stop sending RF and
start sending Idles. (No packets yet)
     |----(RF)--> PHY ----(RF)-->|
MAC1 |                           | MAC2
     |<--(LF)---- PHY <--(Idle)--|
6. When the other link comes up, the PHY element that was generating
LFs stops generating LF and passes through what is being sent, i.e.,
Idles.
     |--(RF)----> PHY ----(RF)-->|
MAC1 |                           | MAC2
     |<--(Idle)-- PHY <--(Idle)--|
7. When MAC1 receives Idles (not LF and not RF), it can send packets or
Idles.
     |-(Idle/P)-> PHY -(Idle/P)->|
MAC1 |                           | MAC2
     |<--(Idle)-- PHY <--(Idle)--|
8. When MAC2 receives Idles or packets, it too can start sending packets
or continue to send idles.
AND WE'RE UP.
Hope this makes it clear.
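The eight steps above can be walked through with a small Python model. This is only a sketch: it reacts to a single received value instead of counting status messages or Idles, and the function names and string labels are hypothetical.

```python
def phy_output(link_up, transmitted):
    """An unlocked receive path substitutes LF; a locked one passes data."""
    return transmitted if link_up else "LF"

def rs_transmit(received):
    """What a MAC/RS sends in response to what it receives."""
    if received == "LF":
        return "RF"
    if received == "RF":
        return "Idle"
    return "Idle/P"  # Idles or packets received: may send either

def settle(top_up, bottom_up, mac1_tx="Idle", mac2_tx="Idle", rounds=4):
    """Iterate both directions a few rounds and report the settled state."""
    for _ in range(rounds):
        mac2_rx = phy_output(top_up, mac1_tx)      # top path: MAC1 -> MAC2
        mac1_rx = phy_output(bottom_up, mac2_tx)   # bottom path: MAC2 -> MAC1
        mac1_tx, mac2_tx = rs_transmit(mac1_rx), rs_transmit(mac2_rx)
    return {
        "mac1_tx": mac1_tx,
        "mac2_rx": phy_output(top_up, mac1_tx),
        "mac2_tx": mac2_tx,
        "mac1_rx": phy_output(bottom_up, mac2_tx),
    }

both_down = settle(False, False)                       # steps 1-3
top_only = settle(True, False,
                  both_down["mac1_tx"], both_down["mac2_tx"])  # steps 4-5
both_up = settle(True, True,
                 top_only["mac1_tx"], top_only["mac2_tx"])     # steps 6-8

for name, state in (("both down", both_down),
                    ("top up", top_only),
                    ("both up", both_up)):
    print(name, state)
```

Each call carries the previous transmit state forward, so the three results correspond to the diagrams in steps 3, 5 and 8.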
Steve Finch
"David Gross" <dgross@xxxxxxxxxxxxxxxxxx> on 01/03/2001 02:35:27 PM
To: "Grow, Bob" <bob.grow@xxxxxxxxx>
cc: rtaborek@xxxxxxxxxxxxx, HSSG <stds-802-3-hssg@xxxxxxxx>
Subject: Re: Question about Link Fault Signalling
Thanks for the response Bob,
I'd just like to make one clarification which I think might be
necessary. I'd like to see "In the case of a Local Fault condition..."
rather than "Upon detection of a Local Fault condition..." (and likewise
for Remote Fault). The reason is that, upon start-up, one can assume
that both devices will be in LF and transmitting RF. This implies that
once a device can start receiving data (i.e., no longer has a LF) it
will be receiving RF. As a result, as the definition seems to imply, the
Fault conditions won't be cleared (the IDLE control words won't be
detected for 2 clock edges), but instead Remote Fault will be detected.
Since RF and LF cannot be detected at the same time (LF prevents the
transmission of received RF), it is logical that LF will be cleared
while RF will be achieved. There should be something in there
which allows for the clearing of LF in such a case, and jumping from the
LF condition immediately to the RF condition. Let me know what you
think.
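A minimal Python sketch of the behaviour being asked for: a receiver that moves straight from the LF condition to the RF condition without first seeing Idles. The entry count of 4 and the exit on 2 consecutive Idle columns are taken loosely from the numbers in this thread; the counting rule (Idles between status messages do not reset the status count), the labels and the function name are assumptions.

```python
def track_fault(columns, enter_count=4, clear_idles=2):
    """Return the fault state ("OK", "LF" or "RF") after each column."""
    state = "OK"
    candidate, seen, idle_run = None, 0, 0
    history = []
    for column in columns:
        if column in ("LF", "RF"):
            idle_run = 0
            if column == candidate:
                seen += 1
            else:
                candidate, seen = column, 1  # new type: restart the count
            if seen >= enter_count:
                state = column  # may replace a different fault directly
        elif column == "I":
            idle_run += 1
            if idle_run >= clear_idles:
                state, candidate, seen = "OK", None, 0
        history.append(state)
    return history

# Start-up: LF alternating with Idle, then RF alternating with Idle,
# then a second consecutive Idle.
columns = ["LF", "I"] * 4 + ["RF", "I"] * 4 + ["I"]
history = track_fault(columns)
print(history)
```

In this model the state goes OK, LF, RF, OK and never passes through OK between LF and RF, because the alternating stream never contains two consecutive Idles.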
-Dave Gross
Grow, Bob wrote:
>
> Shimon submitted a comment proposing changing the entry to link down (either
> RF or LF) from 3 to 4 status messages, with exit on 8 consecutive idle
> bytes. While I am open to discussion on the numbers, I think his proposed
> text with improved description of the protocol is a great starting point for
> discussion. Since this has come up again, here is a slightly edited version
> of his proposed text.
>
> "46.2.6 Link fault signaling
>
> "Two link fault conditions are specified for 10Gb/s operation: Local Fault
> and Remote Fault. The Local Fault condition at the Reconciliation Sublayer
> indicates that a link failure has been detected on the receive path by a
> local DTE sublayer. The source of the failure could be at the remote
> transmitter, the interconnect between the two DTEs, at one of the local
> DTE's devices or the interconnect between the local DTE's devices. The
> Remote Fault condition is generated by the Reconciliation Sublayer, and when
> received by a Reconciliation sublayer indicates that a link failure has
> been detected by the remote DTE. The source of the failure could be at the
> local transmitter, the interconnect between the two DTEs, at one of the
> remote DTE's devices or the interconnect between the remote DTE's devices.
>
> "Fault conditions are conveyed over the XGMII using status messages. All
> status messages are four bytes in length, and are sent on a single XGMII
> clock edge. A status message is indicated by a Pulse control character
> aligned to lane 0, with the status condition encoded in the three data bytes
> of lanes 1, 2 and 3. The status encodings are shown in Table 46-4."
>
> <Table 46-4>
> <For the sake of completeness, also show Lane 0 encoding>
>
> "A PHY indicates Local Fault conditions to the Reconciliation sublayer by
> alternating the corresponding status message with Idle control characters on
> RXC<3:0> and RXD<31:0>. The Reconciliation sublayer sends the Remote Fault
> indication to the remote DTE by alternating the Remote Fault message with
> Idle control characters on TXC<3:0> and TXD<31:0>.
>
> "The PHY repeats a Remote Fault indication received from the remote DTE
> unless a Local Fault condition is detected, resulting in the PHY overwriting
> the received data with the Local Fault indication.
>
> "The Reconciliation sublayer continuously monitors RXC<3:0> and RXD<31:0>
> for status messages. The reception of four status messages of the same type
> shall indicate that the corresponding fault condition has occurred. The
> reception of four Idle control characters on successive RX_CLK edges (eight
> consecutive Idle control characters) shall clear all fault conditions.
>
> "Upon detection of a Local Fault condition, the Reconciliation sublayer
> shall:
> 1) Set the link_fail status indication.
> 2) Inhibit the transmission of MAC frames.
> 3) Continuously send alternating Remote Fault messages and Idle control
> characters.
>
> "Upon detection of a Remote Fault condition, the Reconciliation sublayer
> shall:
> 1) Set the link_fail status indication.
> 2) Inhibit the transmission of MAC frames.
> 3) Continuously send Idle characters.
>
> "After detecting that the Fault condition has cleared (both Local and
> Remote), the Reconciliation sublayer shall:
> 1) Clear the link_fail status indication.
> 2) Enable the transmission of MAC frames."
>
> --Bob Grow
>
> -----Original Message-----
> From: David Gross [mailto:dgross@xxxxxxxxxxxxxxxxxx]
> Sent: Wednesday, January 03, 2001 8:48 AM
> To: rtaborek@xxxxxxxxxxxxx
> Cc: HSSG
> Subject: Re: Question about Link Fault Signalling
>
> Hi Rich,
>
> I have a quick question about Remote Fault I was hoping you could
> answer. In 46.2.6, it says: "Reception of multiple local fault messages
> causes the Reconciliation Sublayer to inhibit the transmission of
> frames by MAC, and to encode remote fault status messages on TXC<3:0>
> and TXD<31:0>". It goes on to specify that reception of three LF messages
> sets link_fail to 1, and none in 6 clock periods clears link_fail.
>
> My question is this: I believe we said that upon receiving RF, the RS
> will output an IDLE stream until it no longer receives RF. If this is
> so, how many RF messages set this condition to be true, and in how many
> clocks do we say that this condition is cleared if no RFs are detected?
> Is it similar to LF, or do we only require that one RF be detected (and
> then for how long before we reset this IDLE output condition of the RS
> Tx?)
>
> Thanks in advance.
>
> -Dave Gross