RE: [RPRWG] RE: Evaluation of timeToLive alternatives
Necdet,
The example is the one that I gave in response
to Mike's email and is contained in the thread below.
Sure, we can discuss after the RAH meeting on Tuesday.
-Anoop
> -----Original Message-----
> From: Necdet Uzun [mailto:nuzun@xxxxxxxxx]
> Sent: Thursday, June 27, 2002 6:45 PM
> To: Anoop Ghanwani
> Cc: 'Mike Takefman'; David V. James; stds-802-17@xxxxxxxx
> Subject: Re: [RPRWG] RE: Evaluation of timeToLive alternatives
>
>
> Anoop,
>
> Maybe you should explain to me how that happens (I am not aware
> of your example, as I seldom read the entire message sent to the
> BAH group unless I notice something interesting to me). I am sure
> that I will get you a solution. Do you want to do that in person
> (right after the next RAH meeting)?
>
> Thanks.
>
> Necdet
>
> Anoop Ghanwani wrote:
>
> > Necdet,
> >
> > In that case, with the information that we currently
> > have in the RPR MAC header, there is no way for us
> > to avoid duplication and incorrect learning per the
> > example that I provided (when doing wrapping). The
> > only way out is to add some kind of identification of
> > the node that was responsible for putting the frame
> > on the ring. This will allow it to reliably remove the
> > frame from the ring if it makes its way back to the
> > node.
> >
> > If you can think of a better solution to guarantee
> > that frames will be reliably removed under all
> > circumstances even when wrapping, please share it
> > with the group.
> >
> > -Anoop
> >
> > > -----Original Message-----
> > > From: Necdet Uzun [mailto:nuzun@xxxxxxxxx]
> > > Sent: Thursday, June 27, 2002 6:03 PM
> > > To: Anoop Ghanwani
> > > Cc: 'Mike Takefman'; David V. James; stds-802-17@xxxxxxxx
> > > Subject: Re: [RPRWG] RE: Evaluation of timeToLive alternatives
> > >
> > >
> > > Anoop,
> > >
> > > I am not saying that wrapping does not work with bridging.
> > > What I am saying is that you cannot require the same set of
> > > features that are composed for bridging when steering is used
> > > as a protection mechanism from an implementation that uses
> > > wrapping for the protection mechanism.
> > >
> > > Bridging should consider steering and wrapping separately, as
> > > they may have different sets of requirements and issues. The
> > > solution to a packet duplication problem may be different for
> > > steering than for wrapping.
> > >
> > > Thanks.
> > >
> > > Necdet
> > >
> > > Anoop Ghanwani wrote:
> > >
> > > > Necdet,
> > > >
> > > > I wasn't aware that wrapping was not intended to
> > > > work for bridged frames. All the folks in the BAH group
> > > > have been busy scratching their heads for about 4 months
> > > > now worrying about how to make this stuff work with both
> > > > wrapping and steering. Poor fellas (me included!).
> > > >
> > > > -Anoop
> > > >
> > > > > -----Original Message-----
> > > > > From: Necdet Uzun [mailto:nuzun@xxxxxxxxx]
> > > > > Sent: Thursday, June 27, 2002 4:34 PM
> > > > > To: Anoop Ghanwani
> > > > > Cc: 'Mike Takefman'; David V. James; stds-802-17@xxxxxxxx
> > > > > Subject: Re: [RPRWG] RE: Evaluation of timeToLive alternatives
> > > > >
> > > > >
> > > > > Anoop,
> > > > >
> > > > > We do have a do-not-wrap bit in the packet header. If a
> > > > > packet is a bridged packet, it should set the WE bit
> > > > > correctly (i.e., do not WRAP).
> > > > >
> > > > > Thanks.
> > > > >
> > > > > Necdet
> > > > >
> > > > > Anoop Ghanwani wrote:
> > > > >
> > > > > > Mike,
> > > > > >
> > > > > > Based on the flooding analysis that was presented
> > > > > > at the last meeting, source stripping is required
> > > > > > regardless of whether one does basic or enhanced
> > > > > > bridging.
> > > > > >
> > > > > > Consider the following bridging example. Without
> > > > > > source-stripping, if we're doing wrapping, and a node
> > > > > > on the ring dies, the packet will be able to make its
> > > > > > way back to the source with a valid TTL (because one
> > > > > > of the nodes that would have decremented TTL has
> > > > > > disappeared). If we don't have the source address
> > > > > > (or some other way to identify the source) in the
> > > > > > frame, then the source will pick up that frame as a
> > > > > > regular bridged frame, and will learn the source
> > > > > > address of the frame as being on the ring, which is
> > > > > > incorrect.
> > > > > >
> > > > > > If we're serious about any kind of bridging, I think
> > > > > > we will need a way to explicitly identify at least
> > > > > > the source node (even if we don't care about
> > > > > > explicitly identifying the destination node).
> > > > > >
> > > > > > -Anoop
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Mike Takefman [mailto:tak@xxxxxxxxx]
> > > > > > > Sent: Thursday, June 27, 2002 1:09 PM
> > > > > > > To: David V. James
> > > > > > > Cc: Anoop Ghanwani; stds-802-17@xxxxxxxx
> > > > > > > Subject: Re: Evaluation of timeToLive alternatives
> > > > > > >
> > > > > > >
> > > > > > > David,
> > > > > > >
> > > > > > > I hate to give a legalistic response, but here it goes.
> > > > > > >
> > > > > > > You are assuming facts not in evidence, specifically
> > > > > > > that enhanced bridging and not basic bridging is
> > > > > > > supported. I am a proponent of basic bridging.
> > > > > > >
> > > > > > > This algorithm does not work for basic bridging
> > > > > > > and hence is flawed. To be clear, the reason it
> > > > > > > does not work is that the SA of a basic bridge
> > > > > > > is not present in the frame.
> > > > > > >
> > > > > > > That being said, I will look at the algorithm's
> > > > > > > intent and evaluate it once the WG makes a decision
> > > > > > > to support enhanced bridging.
> > > > > > >
> > > > > > > As a side note: It's the Canada Day long weekend coming
> > > > > > > up, so response times will be even slower, and the
> > > > > > > US long weekend is following, so we might have to discuss
> > > > > > > further in a restaurant in Vancouver.
> > > > > > >
> > > > > > > cheers,
> > > > > > >
> > > > > > > mike
> > > > > > >
> > > > > > > "David V. James" wrote:
> > > > > > > >
> > > > > > > > Mike,
> > > > > > > >
> > > > > > > > I can see two ways of performing timeToLive timeouts,
> > > > > > > > listed below:
> > > > > > > > > 1) Source-specified.
> > > > > > > > >    a) Before being sent, the source sets:
> > > > > > > > >         frame.timeToLive=database.hopsToDestination
> > > > > > > > >    b) Special frame-dependent/topology-dependent
> > > > > > > > >       stuff happens at the wrap point to prevent
> > > > > > > > >       the estimate in (a) from becoming incorrect.
> > > > > > > > >    c) Potential multidrop endpoints check and
> > > > > > > > >       discard frames based on:
> > > > > > > > >         frame.timeToLive==1
> > > > > > > > >       No error is logged.
> > > > > > > > >    d) Potential duplicates are discarded based on:
> > > > > > > > >         frame.timeToLive==0
> > > > > > > > >       An error is logged.
> > > > > > > > > 2) Destination-specified.
> > > > > > > > >    a) Before being sent, the source sets:
> > > > > > > > >         frame.timeToLive=255
> > > > > > > > >    b) Potential multidrop endpoints check and
> > > > > > > > >       discard frames based on:
> > > > > > > > >         frame.DSID==myState.DSID
> > > > > > > > >    c) Potential duplicates are discarded based on:
> > > > > > > > >         frame.timeToLive>database.hopsFromSource&&
> > > > > > > > >         frame.timeToLive!=
> > > > > > > > >           database.hopsFromSource+database.stationsOnRing
> > > > > > > >
> > > > > > > > > I believe that option (2) has several benefits:
> > > > > > > > > i) The timeToLive field is processed in all frames.
> > > > > > > > > ii) The timeToLive always means distance-from-source.
> > > > > > > > > iii) Error logs are accurate, since timeToLive frames
> > > > > > > > > are never discarded during steady-state operations.
> > > > > > > > >
> > > > > > > > > Can you comment on your perception of these conclusions?
> > > > > > > >
> > > > > > > > DVJ
> > > > > > > >
> > > > > > > > David V. James, PhD
> > > > > > > > Chief Architect
> > > > > > > > Network Processing Solutions
> > > > > > > > Data Communications Division
> > > > > > > > Cypress Semiconductor, Bldg #3
> > > > > > > > 3901 North First Street
> > > > > > > > San Jose, CA 95134-1599
> > > > > > > > Work: +1.408.545.7560
> > > > > > > > Cell: +1.650.954.6906
> > > > > > > > Fax: +1.408.456.1962
> > > > > > > > Work: djz@xxxxxxxxxxx
> > > > > > > > Base: dvj@xxxxxxxxxxxx
> > > > > > > >
> > > > > > > > >>-----Original Message-----
> > > > > > > > > >>From: owner-stds-802-17@xxxxxxxxxxxxxxxxxx
> > > > > > > > > >>[mailto:owner-stds-802-17@xxxxxxxxxxxxxxxxxx] On Behalf Of Mike Takefman
> > > > > > > > >>Sent: Monday, June 24, 2002 8:17 PM
> > > > > > > > >>To: djz@xxxxxxxxxxx
> > > > > > > > >>Cc: Anoop Ghanwani; stds-802-17@xxxxxxxx
> > > > > > > > > >>Subject: Re: [RPRWG] control TTL (the 255-station and 2000-km issue)
> > > > > > > > >>
> > > > > > > > >>
> > > > > > > > >>
> > > > > > > > >>That is a fair request David, and I will do
> > > > > > > > > >>my best to accommodate it.
> > > > > > > > >>
> > > > > > > > >>Consider instead a host on your ring. Are you
> > > > > > > > >>suggesting that hosts have to behave as
> > > > > > > > >>bridges do in terms of learning the mappings
> > > > > > > > >>of MAC addresses to bridge IDs?
> > > > > > > > >>
> > > > > > > > >>If so, I believe you are placing an overly
> > > > > > > > >>large burden on hosts, one that no other 802
> > > > > > > > >>standard has done.
> > > > > > > > >>
> > > > > > > > >>Either way, the scenario I pointed out for bridges
> > > > > > > > >>(that you deflected by pointing out that I was
> > > > > > > > >>assuming a different scenario (which is fair
> > > > > > > > >>on your part)) is back. A bridge flooding a non local
> > > > > > > > >>packet for the first time, a host inserting a packet
> > > > > > > > >>that is off ring, but does not know which bridge
> > > > > > > > >>it exits (which in my world happens for every host
> > > > > > > > >>packet). If that station goes off the ring, ttl
> > > > > > > > >>will be the only mechanism to stop the double
> > > > > > > > >>delivery.
> > > > > > > > >>
> > > > > > > > >>Don't get me wrong, I do like your slick trick,
> > > > > > > > >>but your solution is to change the frame format
> > > > > > > > >>and add new features, whereas I am working to
> > > > > > > > >>fix the currently approved text.
> > > > > > > > >>
> > > > > > > > >>cheers,
> > > > > > > > >>
> > > > > > > > >>mike
> > > > > > > > >>
> > > > > > > > >>
> > > > > > > > >>
> > > > > > > > >>David James wrote:
> > > > > > > > >>>
> > > > > > > > >>> Mike,
> > > > > > > > >>>
> > > > > > > > > >>> I believe part of the problem is that you are claiming
> > > > > > > > > >>> that something that is not documented does not fail.
> > > > > > > > > >>> It's hard to argue, as the definition can change
> > > > > > > > > >>> whenever a failure is illustrated.
> > > > > > > > > >>>
> > > > > > > > > >>> Perhaps you should document the TTL stripping
> > > > > > > > > >>> protocols with some background text and illustrations?
> > > > > > > > > >>> And clearly define the synchronization points
> > > > > > > > > >>> with Discovery, which are often implied.
> > > > > > > > > >>>
> > > > > > > > > >>> In my comments to D0.3 I have done this for DSID scoping.
> > > > > > > > > >>> While this was much easier (since it doesn't have
> > > > > > > > > >>> all of the TTL stripping exceptions and problems), I
> > > > > > > > > >>> believe it's only fair to ask you to do the same.
> > > > > > > > > >>>
> > > > > > > > > >>> Then we will be able to analyze a problem
> > > > > > > > > >>> without the problem statement constantly changing.
> > > > > > > > > >>>
> > > > > > > > > >>> Even after that, there is the basic problem that:
> > > > > > > > > >>> 1) Bidirectional flooding is required for performance.
> > > > > > > > > >>> 2) A single-station failure of (1) generates a duplicate.
> > > > > > > > > >>> 3) A multi-station failure of DSID stripping
> > > > > > > > > >>>    generates no duplicates.
> > > > > > > > > >>>
> > > > > > > > > >>> Remember, 2/3 is the failure scenario you mentioned
> > > > > > > > > >>> in a previous email. Having agreed, I'm a bit surprised
> > > > > > > > > >>> this no longer appears to be a concern to you, just
> > > > > > > > > >>> because the repercussions of that statement changed...
> > > > > > > > >>>
> > > > > > > > >>> DVJ
> > > > > > > > >>>
> > > > > > > > >>> David V. James, PhD
> > > > > > > > >>> Chief Architect
> > > > > > > > >>> Network Processing Solutions
> > > > > > > > >>> Data Communications Division
> > > > > > > > >>> Cypress Semiconductor, Bldg #3
> > > > > > > > >>> 3901 North First Street
> > > > > > > > >>> San Jose, CA 95134-1599
> > > > > > > > >>> Work: +1.408.545.7560
> > > > > > > > >>> Cell: +1.650.954.6906
> > > > > > > > >>> Fax: +1.408.456.1962
> > > > > > > > >>> Work: djz@xxxxxxxxxxx
> > > > > > > > >>> Base: dvj@xxxxxxxxxxxx
> > > > > > > > >>>
> > > > > > > > >>> > -----Original Message-----
> > > > > > > > > >>> > From: owner-stds-802-17@xxxxxxxxxxxxxxxxxx
> > > > > > > > > >>> > [mailto:owner-stds-802-17@xxxxxxxxxxxxxxxxxx] On Behalf Of Mike Takefman
> > > > > > > > >>> > Sent: Monday, June 24, 2002 1:31 PM
> > > > > > > > >>> > To: Anoop Ghanwani
> > > > > > > > >>> > Cc: 'djz@xxxxxxxxxxx '; 'stds-802-17@xxxxxxxx '
> > > > > > > > > >>> > Subject: Re: [RPRWG] control TTL (the 255-station and 2000-km issue)
> > > > > > > > >>> >
> > > > > > > > >>> >
> > > > > > > > >>> >
> > > > > > > > >>> > Anoop,
> > > > > > > > >>> >
> > > > > > > > >>> > For a protection hierarchy to work all nodes
> > > > > > > > >>> > need to know about all failures.
> > > > > > > > >>> >
> > > > > > > > >>> > I agree with your comment that this needs to be
> > > > > > > > >>> > clearly documented as part of the standard.
> > > > > > > > > >>> > Furthermore, if the WG decides to accept this
> > > > > > > > >>> > TTL decrement algorithm it must be documented
> > > > > > > > >>> > properly.
> > > > > > > > >>> >
> > > > > > > > >>> > With regard to your last question / comment.
> > > > > > > > > >>> > In wrapping, the adjacent nodes can react immediately
> > > > > > > > > >>> > if they have the highest priority failure. Thereby
> > > > > > > > > >>> > wrapping will have quicker reaction times than
> > > > > > > > > >>> > steering. The need to broadcast in the wrapping
> > > > > > > > > >>> > case is to support the hierarchy. If no hierarchy
> > > > > > > > > >>> > was supported, then the decision could be completely
> > > > > > > > > >>> > local. In this case I would still argue that a
> > > > > > > > >>> > broadcast of the event was useful for 2 reasons.
> > > > > > > > >>> > 1) The same algorithm supports steering which is
> > > > > > > > >>> > the default mode
> > > > > > > > >>> > 2) The packets that are trapped on the wrong ring
> > > > > > > > >>> > will get killed.
> > > > > > > > >>> >
> > > > > > > > >>> > cheers,
> > > > > > > > >>> >
> > > > > > > > >>> > mike
> > > > > > > > >>> >
> > > > > > > > >>> >
> > > > > > > > >>> > Anoop Ghanwani wrote:
> > > > > > > > >>> > >
> > > > > > > > >>> > >
> > > > > > > > >>> > > Mike,
> > > > > > > > >>> > >
> > > > > > > > > >>> > > I was trying to say that when a node "unwraps" due
> > > > > > > > > >>> > > to the ring healing, it can't throw away packets
> > > > > > > > > >>> > > forever because the ring might wrap at some other
> > > > > > > > > >>> > > place, making it valid for this node to see packets
> > > > > > > > > >>> > > with the wrap bit set. Therefore a node would have
> > > > > > > > > >>> > > to set some kind of timer (on the order of RTT) and
> > > > > > > > > >>> > > only throw away packets for that duration.
> > > > > > > > >>> > >
> > > > > > > > > >>> > > The above discussion was trying to solve the problem
> > > > > > > > > >>> > > where all nodes do not know about protection events;
> > > > > > > > > >>> > > only those adjacent to the fault do. If all nodes do
> > > > > > > > > >>> > > know about protection events, the solution you mention
> > > > > > > > > >>> > > should work, but it does need to be documented in the
> > > > > > > > > >>> > > spec.
> > > > > > > > >>> > >
> > > > > > > > >>> > > [Off topic discussion]
> > > > > > > > > >>> > > To me, it seemed like the main argument for doing
> > > > > > > > > >>> > > wrapping is that only nodes adjacent to the fault need
> > > > > > > > > >>> > > to know about it and react to it. If all nodes do need
> > > > > > > > > >>> > > to know about a protection event, then it is probably
> > > > > > > > > >>> > > more efficient for them to use steering.
> > > > > > > > >>> > >
> > > > > > > > >>> > > -Anoop
> > > > > > > > >>> > >
> > > > > > > > >>> > > -----Original Message-----
> > > > > > > > >>> > > From: Mike Takefman
> > > > > > > > >>> > > To: Anoop Ghanwani
> > > > > > > > >>> > > Cc: djz@xxxxxxxxxxx; stds-802-17@xxxxxxxx
> > > > > > > > >>> > > Sent: 6/24/02 12:07 AM
> > > > > > > > > >>> > > Subject: Re: [RPRWG] control TTL (the 255-station and 2000-km issue)
> > > > > > > > >>> > >
> > > > > > > > >>> > > Anoop,
> > > > > > > > >>> > >
> > > > > > > > > >>> > > Wrapping nodes always communicate with every other
> > > > > > > > > >>> > > node anyway. This is necessary for protection
> > > > > > > > > >>> > > hierarchy to work. Also, given the broadcast
> > > > > > > > > >>> > > nature of messages to make steering work in under
> > > > > > > > > >>> > > 50 ms, I have no concern over all nodes knowing
> > > > > > > > > >>> > > that all protection events are done and the ringlets
> > > > > > > > > >>> > > are healed.
> > > > > > > > > >>> > >
> > > > > > > > > >>> > > If one waits for the ringlets to be healed
> > > > > > > > > >>> > > and then kills the packet, life is fine. Or
> > > > > > > > > >>> > > maybe I did not understand your comment.
> > > > > > > > >>> > >
> > > > > > > > >>> > > mike
> > > > > > > > >>> > >
> > > > > > > > >>> > > Anoop Ghanwani wrote:
> > > > > > > > >>> > > >
> > > > > > > > > >>> > > > > > The problem with (3), which you seem to advocate,
> > > > > > > > > >>> > > > > > is the time gap between the wrap action and
> > > > > > > > > >>> > > > > > the distribution/settling of the wrap state
> > > > > > > > > >>> > > > > > information in other stations. During this time
> > > > > > > > > >>> > > > > > difference, any and all TTL-strip based frames
> > > > > > > > > >>> > > > > > will be discarded.
> > > > > > > > > >>> > > > >
> > > > > > > > > >>> > > > > A good point, David. In response, please consider
> > > > > > > > > >>> > > > > the following.
> > > > > > > > > >>> > > > >
> > > > > > > > > >>> > > > > Never decrement when on the wrong ring. Once the wrap
> > > > > > > > > >>> > > > > state is left, kill the packet if the ring id
> > > > > > > > > >>> > > > > is wrong. Thus going into wrap does not cause the
> > > > > > > > > >>> > > > > packets to be prematurely lost. When leaving wrap
> > > > > > > > > >>> > > > > the packets will be killed once everyone knows
> > > > > > > > > >>> > > > > the wrap is over.
> > > > > > > > >>> > > >
> > > > > > > > >>> > > > Mike,
> > > > > > > > >>> > > >
> > > > > > > > > >>> > > > Does everyone on the ring know when a wrap has occurred
> > > > > > > > > >>> > > > and when it heals? I thought wrapping was a local issue
> > > > > > > > > >>> > > > and only nodes adjacent to the fault know about it.
> > > > > > > > > >>> > > > In that case, if the node at which wrapping occurs
> > > > > > > > > >>> > > > detects a heal, and for some reason doesn't pull a wrap
> > > > > > > > > >>> > > > packet off, it will continue to circulate forever.
> > > > > > > > > >>> > > > The node can't be dropping wrapped packets forever
> > > > > > > > > >>> > > > because the wrap could occur somewhere else, at
> > > > > > > > > >>> > > > which time it would be a legal packet for pass-through.
> > > > > > > > >>> > > >
> > > > > > > > >>> > > > -Anoop
> > > > > > > > >>> > >
> > > > > > > > >>> > > --
> > > > > > > > >>> > > Michael Takefman tak@xxxxxxxxx
> > > > > > > > >>> > > Manager of Engineering, Cisco Systems
> > > > > > > > >>> > > Chair IEEE 802.17 Stds WG
> > > > > > > > >>> > > 2000 Innovation Dr, Ottawa, Canada, K2K 3E8
> > > > > > > > >>> > > voice: 613-254-3399 fax: 613-254-4867
> > > > > > > > >>> >
> > > > > > > > >>> > --
> > > > > > > > >>> > Michael Takefman tak@xxxxxxxxx
> > > > > > > > >>> > Manager of Engineering, Cisco Systems
> > > > > > > > >>> > Chair IEEE 802.17 Stds WG
> > > > > > > > >>> > 2000 Innovation Dr, Ottawa, Canada, K2K 3E8
> > > > > > > > >>> > voice: 613-254-3399 fax: 613-254-4867
> > > > > > > > >>> >
> > > > > > > > >>
> > > > > > > > >>--
> > > > > > > > >>Michael Takefman tak@xxxxxxxxx
> > > > > > > > >>Manager of Engineering, Cisco Systems
> > > > > > > > >>Chair IEEE 802.17 Stds WG
> > > > > > > > >>2000 Innovation Dr, Ottawa, Canada, K2K 3E8
> > > > > > > > >>voice: 613-254-3399 fax: 613-254-4867
> > > > > > >
> > > > > > >
> > > > >
> > >
>