
Remote Fault and Break Link




Howard,

On the Link Status bit, you are absolutely correct. I was looking at the same
document and made a transcription error with regard to the Link Status bit. It
looks like I copied partial information from the Remote Fault bit in the same
table.

Thanks also for adding your valuable insight to MAC state, Remote Fault and
Break Link.

On the issue of the choice of Break Link control, associating it with the
Control Register Reset bit (0.15) is perfectly reasonable. The larger issue may
be whether we re-use any of the existing registers for 10GBASE-X. However, I'll
defer this discussion to another thread.

As for any mention of Auto-Negotiation, I'm sorry for mentioning it. I DO NOT
support Auto-Negotiation or any protocol even remotely resembling it for
10GBASE-X. I simply mentioned the re-use of that Control Register bit as one of
the four possible choices for initiating Break Link. I don't believe that there
are any proposals for Auto-Negotiation on the table or lying in wait for
10GBASE-X. I believe that it's prudent to not bring Auto-Negotiation into any
10GBASE-X discussions.

I respectfully disagree that LSS looks anything like Auto-Negotiation. LSS is a
very simple, handshake-free mechanism virtually identical to your proposal but
much more flexible. I also disagree that Remote Fault and/or Break Link will
not be signaled during data transfer. Management registers control RF and BL,
and the MAC may be unaware of such signaling. Therefore, RF and BL must
function properly in the absence or presence of data. I have a couple of other
comments on your suggested Break Link protocol:

1) I assume that the dropping of frames is not an issue for either LSS or your
proposed protocol since either the occurrence of BL or RF will likely result in
some link maintenance action resulting in the link being unavailable for data
transport.

2) Both LSS and your proposal are low level signals. I don't believe that the
continuous aspect of the signaling has bearing on the robustness of the signal.
However, the maintenance of link synchronization provided by the Idle stream
probably gives the LSS protocol a robustness advantage in terms of receiver
message recognition as the receiver remains in its normal message recognition
mode (e.g. looking for data and control words, performing clock tolerance
compensation, attempting to synchronize).

3) Any word oriented code is as easily recognizable as another since the word
recognition logic is the same regardless of the interface (e.g. XAUI, 64B/66B,
XGMII). Therefore, there is no recognition advantage associated with the K28.7
8B/10B code across all four XAUI lanes when compared to any LSS code. Note that
the lanes at the receiver are effectively separate and that the receiver must
deskew the information. Once again, LSS has a slight receiver recognition
advantage since LSS words are interspersed in an Idle stream which guarantees
individual lane synchronization as well as 4-lane deskew. K28.7 across all 4
lanes has absolutely no magic associated with it.

4) Your proposed link initialization protocol seems to employ a 10 ms timer
following reset. Based on your comments about "Break Link = Reset" in this note,
including both sides issuing Break Link whenever one side does, it appears that
you are architecting an initialization protocol similar to Auto-Negotiation
which I thought we weren't supposed to talk about! The current LSS proposal does
not require any initialization timer and instead employs RF signaling until the
receiving link end is synchronized. I agree that the response to receipt of
Break Link should be to set Link Status to FAIL, and reset all lane and link
synchronization and deskew state machines. It should be apparent that the
receipt of either the LSS BL or your proposed BL message can effect this action.
However, your proposal requires a minimum 10 ms delay during initialization.
This sounds kind of like what my stupid PC does every time it restarts, but it
takes 10 minutes! 

5) A couple of comments on your definition of rx_in_sync: 
- Signal_detect is only relevant to a signal input which typically emanates from
an optical receiver's associated limiting/pre-amplifier. It is not relevant to
most copper link receivers such as XAUI since a) it's likely not present as a
signal input, b) it won't be possible to assess compliance with its performance,
c) it's really not needed. I suggest that this term is only required of optical
receivers and be set to 'one' otherwise.
- pll_lock: It won't be possible to assess compliance with its performance
either. This term should be deleted.
- lane_rx_in_sync[0:3] is required for 4 lane PMAs including XGXS. Are you
assuming that this term covers serial PMAs also since lane_rx_in_sync[0] is a
proper subset of lane_rx_in_sync[0:3]? If you are, then this term is OK.
- link_deskew[0:3] should be added. If the PMA is serial, this term is not
applicable and set to 'one'. The link is not considered usable at any point if
the deskew process is incomplete.  
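
As an illustration only, the revised rx_in_sync term suggested in the bullets
above could be sketched as follows. The function name and argument layout are
assumptions made for this example; they are not taken from either proposal.
signal_detect defaults to 'one' for non-optical receivers, pll_lock is dropped,
and link_deskew defaults to 'one' for a serial PMA.

```python
from typing import Sequence


def rx_in_sync(lane_rx_in_sync: Sequence[bool],
               link_deskew: Sequence[bool] = (True,),
               signal_detect: bool = True) -> bool:
    """Combine the receiver status terms into a single rx_in_sync flag.

    lane_rx_in_sync: one entry per lane (4 for XAUI/XGXS, 1 for a serial PMA).
    link_deskew:     deskew status; left at the default 'one' for a serial PMA.
    signal_detect:   only meaningful for optical receivers; 'one' otherwise.
    """
    return bool(signal_detect and all(lane_rx_in_sync) and all(link_deskew))
```

For a 4-lane receiver the call would be rx_in_sync([s0, s1, s2, s3],
[d0, d1, d2, d3]); a serial receiver would pass a single-entry list and leave
the other arguments at their defaults.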

6) The simplified initialization sequence starting from reset I see based on
your diagram is as follows:

   +->RESET -> Send Idle & Remote Fault until rx_in_sync 
   |    |
   |    V
   +--SYNC? 
   N    | Y
        V
     LINK_UP -> set link_up bit in status register, accept data from MAC

7) I agree that the same transition from LINK_UP to RESET would apply:

    break_link_received | rx_lost_sync

However, the rx_lost_sync term must be determined from an algorithm which is
transmission code dependent (e.g. 8B/10B, 64B/66B, SUPI). Each of these
codes typically utilizes some algorithm employing hysteresis which guarantees a
minimum BER prior to achieving sync and a proportionally higher BER before
losing sync. Is the purpose of your watchdog timer to qualify this process? If
that's the case, then I don't believe we need it; the loss-of-sync state
algorithm is adequate. If this is not the case, please explain your loss-of-sync
algorithm.
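
To make items 6 and 7 concrete, here is a minimal sketch of the timer-free
sequence: RESET sends Idle plus Remote Fault until rx_in_sync, and LINK_UP
falls back to RESET on break_link_received | rx_lost_sync. The names State,
next_state and tx_signal, and the string labels returned, are hypothetical and
exist only for this example. rx_lost_sync is taken as an input because its
derivation is code dependent.

```python
from enum import Enum


class State(Enum):
    RESET = "RESET"
    LINK_UP = "LINK_UP"


def next_state(state: State, rx_in_sync: bool,
               break_link_received: bool = False,
               rx_lost_sync: bool = False) -> State:
    """Return the next state of the simplified initialization sequence."""
    if state is State.RESET:
        # Stay in RESET (sending Idle and Remote Fault) until the receiver syncs.
        return State.LINK_UP if rx_in_sync else State.RESET
    # LINK_UP: either condition sends the link back to RESET.
    if break_link_received or rx_lost_sync:
        return State.RESET
    return State.LINK_UP


def tx_signal(state: State) -> str:
    """What the transmitter sends in each state (labels are illustrative)."""
    return "IDLE+REMOTE_FAULT" if state is State.RESET else "IDLE_OR_DATA"
```

Note that no timer appears anywhere in this sketch; the only inputs are the
receiver status and the received Break Link indication.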

8) I agree that the response to the receipt of a Remote Fault message is to set
the Remote Fault bit in the Status register.

Best Regards,
Rich
  
--

Corrections to Rich's corrections to Osamu's proposal:

>1) Link Status is already defined in clause 22 (table 22-8) as Status register
>bit 1.2. Its definition is:
>   1 = link is up
>   0 = link is down
>   Read only; Latching high

This is incorrect.  Bit 1.2 is clearly defined as a latching low (LL) bit.
See table 22-8 of IEEE Std 802.3-1998 edition.
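
For readers unfamiliar with the term, the following is a small sketch of how a
latching low status bit behaves when read through management: once the
underlying condition goes low, the bit reads low until it has been read, after
which it reflects the current state again. The class name LatchingLowBit and
its methods are hypothetical and are not part of any register interface.

```python
class LatchingLowBit:
    """Model of a latching low (LL) status bit such as Link Status."""

    def __init__(self, value: bool = False) -> None:
        self._current = value   # live state of the underlying condition
        self._latched = value   # value the next management read will return

    def update(self, value: bool) -> None:
        """Hardware side: report the live state of the condition."""
        self._current = value
        if not value:
            self._latched = False   # a low event is held until it is read

    def read(self) -> bool:
        """Management side: return the latched value, then release the latch."""
        result = self._latched
        self._latched = self._current
        return result
```

A brief link failure between two management reads is therefore still visible
on the first read after it, even if the link has already recovered.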

>I believe that this definition goes way back in Ethernet history. For
>compatibility reasons, I assume that your condition: "Duplex Link Up with valid
>MAC partner" is equivalent to "link is up". In general, this bit reflects PHY
>and not MAC state. I'd like to propose that we leave this bit defined as is and
>not add any related bits or sub states. I'd be very happy to listen to 
>proposals to augment this bit and state. However, starting out as simple 
>and compatible as possible seems to be appropriate.

This does reflect the state of the Physical link.  The MAC has no up/down
state, because the MAC receives bit by bit and transmits bit by bit to
the Physical layer.  There is no concept of link up/down at the MAC.

>Your note proposes that Remote Fault indicates "Local Sync Up/Down".
>However, I believe that the Remote Fault Status bit reflects the status
>of the remote end of the link, specifically a fault with the remote
>receiver. In your words, the Status bit corresponds more appropriately
>to "Remote Sync Up/Down". I would propose that we leave the current
>Remote Fault Status bit definition exactly as is.

Remote fault indicates that a problem has been DETECTED by the remote
receiver.  The fault could occur at the Local Transmitter, the interconnecting
channel, or the Remote Receiver.

>3) Break Link is a bit more interesting as I don't see a Control register bit
>already defined that maps exactly to this function. The ones that come close
>are:
>
>   0.15 Reset
>   0.12 Auto-Negotiation Enable
>   0.10 Isolate
>   0.5:0 Reserved
>
>I'd have to agree with you that the best choice seems to be to additionally
>define 0.10 as Break Link for 10GBASE-X.

Break Link = Reset.  You send Break Link when everything has gone south and
your only recourse is to Reset and try again from square zero.  You send
Break Link to let the remote end know that you have pushed the reset button,
and the remote end should do the same.

Let's banish all of the AN related bits from our
thoughts, and never mention them again.  Isolate has nothing to do with
the behavior at the MDI.  Don't overload this bit with new meaning.
All that Isolate does is to invoke electrical isolation from the XGMII.

>Your note also proposes a management register bit advertising Break Link for
>signaling by LSS. I agree with this. I propose that a Register 4, Advertisement
>bit 13 be additionally defined as Break Link.
>
>4) I agree that both RF and BL can be signaled simultaneously. My point about
>priority is that a recipient of an LSS message indicating both RF and BL should
>Break the Link rather than report Remote Fault and remain in operation. Perhaps
>this is too obvious and need not be stated.

There is no AN in 10 GigE, so there is no need to implement ANY of the
AN bits or registers in 10 GigE. 

>This is looking very solid, compatible and simple!

This is looking like anything but a solid, compatible and simple scheme.
Indeed, it is looking more like AutoNegotiation every day.

We need two and exactly two primitives:  Remote Fault, and Break Link.
These primitives WILL NOT be sent during normal data transfer, so there
is no need to use a signaling mechanism that slips them into the IPG.
These primitives indicate the occurrence of serious problems which 
totally preclude data exchange.  They should be low-level, continuous
signals.

I propose that we use K28.7 across all four XAUI lanes to signal Break
Link in 8B/10B land, and map this into a control frame in 64B/66B
land.  This is a simple, readily recognizable code.

Break Link is sent for time T (~10 mS) following reset.

The response to receipt of Break Link is to set the link status to
down, and reset the link synchronization state machine and deskew logic.

I propose that we use K28.1 across all four XAUI lanes to signal Remote
Fault in 8B/10B land, and map this into a control frame in 64B/66B
land.  This is a simple, readily recognizable code.

Remote Fault is sent whenever the Local Receiver status is not
rx_in_sync.  For XAUI I see this as a combination of:

   rx_in_sync = signal_detect & pll_lock & lane_rx_in_sync[0:3]

Starting from reset, the sequence is:

      RESET -> Send Break Link for T ~= 10 mS
        |
        V
   WAIT_FOR_SYNC -> Send Remote Fault until rx_in_sync
        |
        V
     IN_SYNC -> Send Idle for T ~= 10 mS
        |
        V
     LINK_UP -> set link_up bit in status register, accept data from MAC


The transition from LINK_UP to RESET would be a combination of

    break_link_received | rx_lost_sync

The term rx_lost_sync is derived from a timer that is retriggered by
rx_in_sync.  It's a watchdog timer that expires if the receiver loses
sync for greater than 5 mS.

The response to the receipt of Remote Fault is to set the Remote Fault
bit in the status register.
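
The timed sequence above can be sketched as a state machine stepped once per
millisecond. This is only an illustration under stated assumptions: the class
LinkInit, the 1 ms tick, and the returned signal labels are hypothetical, and
the constants simply take the approximate 10 mS and 5 mS figures at face value.
Behavior on Break Link or loss of sync outside LINK_UP is not described in the
note, so the sketch ignores those inputs in the earlier states.

```python
from enum import Enum, auto

BREAK_LINK_MS = 10  # RESET: send Break Link for T ~= 10 mS
IDLE_MS = 10        # IN_SYNC: send Idle for T ~= 10 mS
WATCHDOG_MS = 5     # LINK_UP: tolerate loss of sync for up to 5 mS


class State(Enum):
    RESET = auto()
    WAIT_FOR_SYNC = auto()
    IN_SYNC = auto()
    LINK_UP = auto()


class LinkInit:
    def __init__(self) -> None:
        self._enter(State.RESET)

    def _enter(self, state: State) -> None:
        self.state = state
        self.timer_ms = 0        # time spent in the current state
        self.out_of_sync_ms = 0  # watchdog, retriggered by rx_in_sync

    def tick(self, rx_in_sync: bool, break_link_received: bool = False) -> str:
        """Advance by 1 ms and return the signal sent during that ms."""
        self.timer_ms += 1
        if self.state is State.RESET:
            if self.timer_ms >= BREAK_LINK_MS:
                self._enter(State.WAIT_FOR_SYNC)
            return "BREAK_LINK"
        if self.state is State.WAIT_FOR_SYNC:
            if rx_in_sync:
                self._enter(State.IN_SYNC)
            return "REMOTE_FAULT"
        if self.state is State.IN_SYNC:
            if self.timer_ms >= IDLE_MS:
                self._enter(State.LINK_UP)
            return "IDLE"
        # LINK_UP: break_link_received | rx_lost_sync returns to RESET.
        self.out_of_sync_ms = 0 if rx_in_sync else self.out_of_sync_ms + 1
        if break_link_received or self.out_of_sync_ms > WATCHDOG_MS:
            self._enter(State.RESET)
        return "IDLE_OR_DATA"
```

With a receiver that is already in sync, this sketch spends 10 ticks sending
Break Link, one tick sending Remote Fault, and 10 ticks sending Idle before
reaching LINK_UP.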

Howard Frazier
Cisco Systems, Inc.
                                    
------------------------------------------------------- 
Richard Taborek Sr.                 Phone: 408-845-6102       
Chief Technology Officer             Cell: 408-832-3957
nSerial Corporation                   Fax: 408-845-6114
2500-5 Augustine Dr.        mailto:rtaborek@xxxxxxxxxxx
Santa Clara, CA 95054            http://www.nSerial.com