RE: [RPRWG] Wait To Restore
George,
I guess I don't see the difference. I think the main usage for WTR is to
avoid link flapping. (One would dampen by moving to WTR state, not by
remaining in SF state. See, for example, D0.3, page 115, line 3.) (Are there
any other uses for WTR?) The only new idea being introduced is possibly
changing the value for the WTR timer based on recent events on the link,
which I think we all agree should be possible, but handled outside of the
standard.
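To make the "handled outside of the standard" idea concrete, here is a minimal sketch of how a station might pick its own WTR value from recent events on the link. Everything in it is an assumption for illustration: the class name `WtrBackoff`, the doubling rule, and the values for the base, ceiling, and quiet period are hypothetical and do not come from the draft.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WtrBackoff:
    """Locally chosen WTR interval that grows while a span keeps failing.

    All defaults are hypothetical; the draft only says WTR is configured
    per station, not how the value is chosen.
    """

    base_s: float = 10.0      # assumed starting WTR value, in seconds
    max_s: float = 640.0      # assumed ceiling for the backoff
    quiet_s: float = 3600.0   # assumed stable period that resets the backoff
    current_s: float = 0.0
    last_failure_at: Optional[float] = None

    def __post_init__(self) -> None:
        self.current_s = self.base_s

    def on_signal_fail(self, now: float) -> None:
        """Record a failure; double WTR if the previous failure was recent."""
        recent = (
            self.last_failure_at is not None
            and now - self.last_failure_at < self.quiet_s
        )
        if recent:
            self.current_s = min(self.current_s * 2, self.max_s)
        else:
            self.current_s = self.base_s
        self.last_failure_at = now

    def wtr_seconds(self) -> float:
        """WTR value the station would use for the next restoration."""
        return self.current_s
```

With these assumed defaults, failures at 0 s, 30 s, and 60 s would give WTR values of 10, 20, and 40 seconds. The other side of the span sees only the WTR state, never the rule that produced the timer value.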
jl
-----Original Message-----
From: George Suwala [mailto:gsuwala@xxxxxxxxx]
Sent: Wednesday, July 31, 2002 3:52 PM
To: John Lemon; 'Leon Bruckman'; stds-802-17@xxxxxxxx
Cc: 'Manav Bhatia'
Subject: RE: [RPRWG] Wait To Restore
John,
I agree with you about leaving the current WTR text as is.
However Manav has brought up a good point and it may help
if we differentiate between 2 concepts:
1. WTR, which says to the other nodes: "the link is fine, but I'm choosing
not to use it yet"
2. Damping of flapping of a link by keeping it in a "down" state, which
says to the other nodes: "the link is down".
Point 1 and its dampening effect have already been covered in this thread
(I think :-)
Point 2. Such damping has only local significance and it should not
introduce interoperability issues as each node uniquely controls
its own interface state.
From the protocol perspective it is immaterial to this and other nodes
if a link is declared Signal Fail (SF) because a fiber has no signal,
or because fiber receives a perfect signal but interface circuitry
failed and is unable to interpret it correctly, or because the node
made an arbitrary decision to keep the interface in a SF state.
It is common for an SF condition to be detected
by software through an interrupt, which is subject to interrupt
throttling, which in fact dampens the SF state changes.
So perhaps, while keeping the WTR text unchanged, we could
add some text to the effect of: "The interface state changes
may also be dampened by a local station by keeping the interface
in a Signal Fail condition while the state changes frequently.
The definition of "frequently" and the dampening algorithm are outside
of the scope of this standard as they have no impact on the RPR protocol"
What do you think?
(I don't think that we should be trying to standardize the dampening
algorithm or parameters)
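Purely as an illustration of the kind of local algorithm that would stay out of scope, here is a sketch of one possible hold-down: each state change adds a penalty, the penalty decays exponentially, and the station keeps asserting SF while the penalty is high. The class name `SfHoldDown`, the thresholds, and the half-life are all hypothetical; the penalty-and-decay idea is borrowed from route flap damping, not from the draft.

```python
class SfHoldDown:
    """Local damping: keep asserting SF while the link has flapped recently.

    Every number here is an assumed, locally configured value with no
    meaning on the ring. Other stations only ever see SF or not-SF.
    """

    def __init__(self, penalty_per_flap=1000.0, suppress_at=2500.0,
                 reuse_at=750.0, half_life_s=60.0):
        self.penalty_per_flap = penalty_per_flap
        self.suppress_at = suppress_at    # start holding SF at this penalty
        self.reuse_at = reuse_at          # stop holding SF below this penalty
        self.half_life_s = half_life_s    # penalty halves every half_life_s
        self.penalty = 0.0
        self.updated_at = 0.0
        self.suppressed = False

    def _decay(self, now):
        """Apply exponential decay for the time elapsed since the last update."""
        elapsed = max(0.0, now - self.updated_at)
        self.penalty *= 0.5 ** (elapsed / self.half_life_s)
        self.updated_at = now

    def on_flap(self, now):
        """Call on every detected up/down transition of the interface."""
        self._decay(now)
        self.penalty += self.penalty_per_flap
        if self.penalty >= self.suppress_at:
            self.suppressed = True

    def sf_asserted(self, now, physical_ok):
        """True if the station should report Signal Fail at time `now`."""
        self._decay(now)
        if self.suppressed and self.penalty < self.reuse_at:
            self.suppressed = False
        return (not physical_ok) or self.suppressed
```

With these assumed numbers, three flaps within a couple of seconds push the penalty past the suppress threshold, and the interface is held in SF until the penalty decays below the reuse threshold. Once SF is released, the normal WTR handling would apply unchanged.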
thanks
George
At 08:33 AM 7/31/2002 -0700, John Lemon wrote:
>Leon,
>
>My understanding of the current text (which is hopefully correct since I
>helped write it) is that the value that can be set for WTR is configured on
>a per-station basis. The basis of choosing the local value is not specified.
>Therefore, it is entirely supported by the standard for a station to alter
>the configured WTR value on a semi-dynamic basis, such as in reaction to
>perceived link flapping. While I agree that this client behavior is out of
>scope for the standard, I wanted Manav to understand that it is supported by
>the standard.
>
>jl
>
>-----Original Message-----
>From: Leon Bruckman [mailto:leonb@xxxxxxxxxxxxx]
>Sent: Wednesday, July 31, 2002 12:55 AM
>To: 'John Lemon'; stds-802-17@xxxxxxxx
>Cc: 'Manav Bhatia'
>Subject: RE: [RPRWG] Wait To Restore
>
>
>John,
>My opinion is that we can leave the WTR as in the draft (single WTR value).
>As far as I know, many protection schemes operate in this way.
>On the other hand, if we accept that there is merit in performing some type
>of exponential backoff, or other method, it may not be only a matter of
>implementing it at the client level (out of the standard scope), but it may
>need some MIB support (within standard scope).
>
>Leon
>
>-----Original Message-----
>From: John Lemon [mailto:JLemon@xxxxxxxxxxxx]
>Sent: Tuesday, July 30, 2002 7:17 PM
>To: 'Manav Bhatia'; stds-802-17@xxxxxxxx
>Subject: RE: [RPRWG] Wait To Restore
>
>
>
>Manav,
>
>How you set your WTR timer is up to you. If you want to have an exponential
>backoff each time it goes off, you can do that. If you want it the same each
>time, that's fine. It doesn't matter. Regardless of what you choose, when
>you hold the span in WTR, the other side still "sees" it as being down (with
>a state of WTR). The fact that it is an independent decision for each
>station does not cause any problems for any other stations, any more than
>the fact that the span is down.
>
>jl
>
>-----Original Message-----
>From: Manav Bhatia [mailto:manav@xxxxxxxxxxx]
>Sent: Monday, July 29, 2002 9:07 PM
>To: John Lemon; stds-802-17@xxxxxxxx
>Subject: Re: [RPRWG] Wait To Restore
>
>
>John,
>After the link has flapped a certain number of times (or whatever the
>criterion is) we declare the link to be down, i.e., we damp its coming up,
>believing that it is unstable and can go down at any time. We then observe
>that the link has been stable for quite some time now, so when do we make it
>come up again, i.e., when will the other links start hearing from it?
>
>If it's left to the vendor to do that, then I am afraid we can have some
>problems.
>
>/Manav
>----- Original Message -----
>From: "John Lemon" <JLemon@xxxxxxxxxxxx>
>To: "'Manav Bhatia'" <manav@xxxxxxxxxxx>; <stds-802-17@xxxxxxxx>
>Sent: Tuesday, July 30, 2002 9:02 AM
>Subject: RE: [RPRWG] Wait To Restore
>
>
>| Manav,
>|
>| If one side of a span declares it down (due to flapping or any other reason)
>| the opposite side will learn both from the control message sent the other
>| way and also from the lack of keep alives. There is no interoperability
>| issue here.
>|
>| jl
>|
>| -----Original Message-----
>| From: Manav Bhatia [mailto:manav@xxxxxxxxxxx]
>| Sent: Monday, July 29, 2002 7:54 PM
>| To: John Lemon; stds-802-17@xxxxxxxx
>| Subject: Re: [RPRWG] Wait To Restore
>|
>|
>| Hi John,
>| If damping a flapping link is not standardized and if it is left for the
>| vendors to implement it the way they like, then we *can* experience a lot of
>| problems in interoperability between different vendors. This can create
>| havoc in the routing protocols. Suppose link A is flapping and vendor X
>| implements this link flap damp feature; then it may start damping it even
>| when it is UP (because it has already flapped many times and has crossed a
>| threshold). The other side knows that link A is UP but won't understand
>| why it is not responding to the HELLOs it is sending!
>|
>| Moreover, I feel that the various timers associated with damping a flapping
>| link should be standardized.
>|
>| As I mentioned before, WTR is just not good enough for this purpose.
>|
>| ~Manav
>| ----- Original Message -----
>| From: "John Lemon" <JLemon@xxxxxxxxxxxx>
>| To: "'Manav Bhatia'" <manav@xxxxxxxxxxx>; <stds-802-17@xxxxxxxx>
>| Sent: Monday, July 29, 2002 9:38 PM
>| Subject: RE: [RPRWG] Wait To Restore
>|
>|
>| | Manav,
>| |
>| | There is nothing in the standard that requires or prevents this. This is
>| | something a station could easily do, using the existing standard. The
>| | standard provides the basic enabling technology which can then be built
>| | upon by each implementer to provide many different implementations, each
>| | with their own unique values.
>| |
>| | jl
>| |
>| | -----Original Message-----
>| | From: Manav Bhatia [mailto:manav@xxxxxxxxxxx]
>| | Sent: Monday, July 29, 2002 7:10 AM
>| | To: stds-802-17@xxxxxxxx
>| | Subject: [RPRWG] Wait To Restore
>| |
>| |
>| |
>| | Hi,
>| | Is there any proposal to damp a flapping link if it flaps beyond some
>| | threshold value? I am looking for exponential decay wherein, if it remains
>| | stable for some length of time, it will again be considered fit for use.
>| | IMHO we must penalize a link that flaps severely or often more than a
>| | link which flaps fewer times and less vigorously. Using the WTR we don't
>| | differentiate between the two cases, since both of them are advertised if
>| | they remain stable for the time period which is specified in the WTR.
>| |
>| | What do others feel about this?
>| |
>| | Regards,
>| | Manav
>| | ----
>| | "When you are courting a nice girl an hour seems like a second. When you
>| | sit on a red-hot cinder a second seems like an hour. That's relativity."
>| |
>| | -Albert Einstein, on relativity
>| |