Re: [RPRWG] RE: Evaluation of timeToLive alternatives




David,

I am quite certain the BAH has not voted on forcing
a frame to have an SA that exists on the ring. 
Therefore, my objection to your algorithm stands.

Based on that fact, my email did not discuss your
algorithm aside from pointing out its reliance
on having a lookup based on SA. 

Therefore, I felt I was under no obligation
to correct a typographical error on your part
since I did not reference it in my rebuttal.

have a great weekend, 

mike

David James wrote:
> 
> Mike,
> 
> I'm surprised that you didn't include the typo fixes
> that I sent you this morning, listed below. I had
> hoped I could hide my coding incompetence with your
> redistribution of the bug-free version(:>).
>    2) Destination-specified.
>       a) Before being sent, the source sets:
>            frame.timeToLive=255
>       b) Potential multidrop endpoints check and
>          discard frames based on:
>            frame.DSID==myState.DSID
>       c) Potential duplicates are discarded based on:
>            (256-frame.timeToLive)>database.hopsFromSource&&
>             (256-frame.timeToLive)!=
>              database.hopsFromSource+database.stationsOnRing
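
Read as code, the typo-fixed option (2) might look like the Python sketch below. It transcribes conditions (b) and (c) only; how timeToLive changes from hop to hop is not stated in this message and is not modeled. The class and function names are illustrative assumptions, not taken from any draft.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    time_to_live: int  # frame.timeToLive; the source sets this to 255
    dsid: int          # frame.DSID


@dataclass
class StationDatabase:
    hops_from_source: int  # database.hopsFromSource
    stations_on_ring: int  # database.stationsOnRing


def is_multidrop_endpoint(frame: Frame, my_dsid: int) -> bool:
    # Step (b): the frame carries this station's DSID.
    return frame.dsid == my_dsid


def is_potential_duplicate(frame: Frame, db: StationDatabase) -> bool:
    # Step (c), transcribed literally from the typo-fixed text.
    distance = 256 - frame.time_to_live
    return (distance > db.hops_from_source
            and distance != db.hops_from_source + db.stations_on_ring)
```

Note that both steps depend on per-source state held in the station's database, which is the SA-based lookup objected to elsewhere in this thread.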
> 
> > You are assuming facts not in evidence, specifically
> > that enhanced bridging and not basic bridging is
> > supported. I am a proponent of basic bridging.
> The BAH has voted to support sufficient mechanisms for
> enhanced as well as basic bridging. Any proposed
> formats and protocols thus have to support both.
> 
> > This algorithm does not work for basic bridging
> > and hence is flawed.
> The first email with the typos did not work, was
> reported by you and Anoop, and was then corrected.
> The typo-fixed protocol is believed to work.
> 
> > To be clear, the reason it does not work is that the
> > SA of a basic bridge is not present in the frame.
> Within all three BAH proposals, the source address of
> the bridge is located within the frame.
> The BAH has been moving quickly recently, but I had
> thought this point was stable for quite some time, including
> the last RPR meeting presentation.
> 
> > US long weekend is following so we might have to discuss
> > further in a restaurant in Vancouver.
> A good idea. Until then,
> 
> DVJ
> 
> 
> David V. James, PhD
> Chief Architect
> Network Processing Solutions
> Data Communications Division
> Cypress Semiconductor, Bldg #3
> 3901 North First Street
> San Jose, CA 95134-1599
> Work: +1.408.545.7560
> Cell: +1.650.954.6906
> Fax:  +1.408.456.1962
> Work: djz@xxxxxxxxxxx
> Base: dvj@xxxxxxxxxxxx
> 
> > -----Original Message-----
> > From: Mike Takefman [mailto:tak@xxxxxxxxx]
> > Sent: Thursday, June 27, 2002 1:09 PM
> > To: David V. James
> > Cc: Anoop Ghanwani; stds-802-17@xxxxxxxx
> > Subject: Re: Evaluation of timeToLive alternatives
> >
> >
> > David,
> >
> > I hate to give a legalistic response, but here it goes.
> >
> > You are assuming facts not in evidence, specifically
> > that enhanced bridging and not basic bridging is
> > supported. I am a proponent of basic bridging.
> >
> > This algorithm does not work for basic bridging
> > and hence is flawed. To be clear, the reason it
> > does not work is that the SA of a basic bridge
> > is not present in the frame.
> >
> > That being said, I will look at the algorithm's
> > intent and evaluate it once the WG makes a decision
> > to support enhanced bridging.
> >
> > As a side note: It's the Canada Day long weekend coming
> > up, so response times will be even slower, and the
> > US long weekend is following so we might have to discuss
> > further in a restaurant in Vancouver.
> >
> > cheers,
> >
> > mike
> >
> > "David V. James" wrote:
> > >
> > > Mike,
> > >
> > > I can see two ways of performing timeToLive timeouts,
> > > listed below:
> > >   1) Source-specified.
> > >      a) Before being sent, the source sets:
> > >           frame.timeToLive=dataBase.hopsToDestination
> > >      b) Special frame-dependent/topology-dependent
> > >         stuff happens at the wrap point to prevent
> > >         the estimate in (a) from becoming incorrect.
> > >      c) Potential multidrop endpoints check and
> > >         discard frames based on:
> > >           frame.timeToLive==1
> > >         No error is logged.
> > >      d) Potential duplicates are discarded based on:
> > >           frame.timeToLive==0
> > >         An error is logged.
> > >   2) Destination-specified.
> > >      a) Before being sent, the source sets:
> > >           frame.timeToLive=255
> > >      b) Potential multidrop endpoints check and
> > >         discard frames based on:
> > >           frame.DSID==myState.DSID
> > >      c) Potential duplicates are discarded based on:
> > >           frame.timeToLive>database.hopsFromSource&&
> > >            frame.timeToLive!=
> > >             database.hopsFromSource+database.stationsOnRing
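
As a rough illustration of option (1), the source-specified rules can be sketched in Python as below. The wrap-point handling in step (b) is left out because the message does not define it, and treating the timeToLive==1 check as applying only at potential multidrop endpoints is one reading of step (c). The `Action` enum and function names are hypothetical.

```python
from enum import Enum


class Action(Enum):
    FORWARD = "forward"
    DISCARD_SILENT = "discard, no error logged"
    DISCARD_LOGGED = "discard, error logged"


def initial_ttl(hops_to_destination: int) -> int:
    # Step (a): the source sets timeToLive from its topology database.
    return hops_to_destination


def classify(ttl: int, is_potential_multidrop_endpoint: bool) -> Action:
    # Step (d): timeToLive == 0 marks a potential duplicate; an error is logged.
    if ttl == 0:
        return Action.DISCARD_LOGGED
    # Step (c): a potential multidrop endpoint discards at timeToLive == 1
    # without logging an error.
    if ttl == 1 and is_potential_multidrop_endpoint:
        return Action.DISCARD_SILENT
    return Action.FORWARD
```

Under this reading, only the timeToLive==0 case ever reaches the error log.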
> > >
> > > I believe that the option (2) has several benefits:
> > >   i)   The timeToLive field is processed in all frames.
> > >   ii)  The timeToLive always means distance-from-source.
> > >   iii) Error logs are accurate, since timeToLive frames
> > >        are never discarded during steady-state operations.
> > >
> > > > Can you comment on your perception of these conclusions?
> > >
> > > DVJ
> > >
> > > >>-----Original Message-----
> > > >>From: owner-stds-802-17@xxxxxxxxxxxxxxxxxx
> > > >>[mailto:owner-stds-802-17@xxxxxxxxxxxxxxxxxx]On Behalf Of Mike Takefman
> > > >>Sent: Monday, June 24, 2002 8:17 PM
> > > >>To: djz@xxxxxxxxxxx
> > > >>Cc: Anoop Ghanwani; stds-802-17@xxxxxxxx
> > > >>Subject: Re: [RPRWG] control TTL (the 255-station and 2000-km issue)
> > > >>
> > > >>
> > > >>
> > > >>That is a fair request David, and I will do
> > > >>my best to accommodate it.
> > > >>
> > > >>Consider instead a host on your ring. Are you
> > > >>suggesting that hosts have to behave as
> > > >>bridges do in terms of learning the mappings
> > > >>of MAC addresses to bridge IDs?
> > > >>
> > > >>If so, I believe you are placing an overly
> > > >>large burden on hosts, one that no other 802
> > > >>standard has done.
> > > >>
> > > >>Either way, the scenario I pointed out for bridges
> > > >>(that you deflected by pointing out that I was
> > > >>assuming a different scenario (which is fair
> > > >>on your part)) is back. A bridge flooding a non local
> > > >>packet for the first time, a host inserting a packet
> > > >>that is off ring, but does not know which bridge
> > > >>it exits (which in my world happens for every host
> > > >>packet). If that station goes off the ring, ttl
> > > >>will be the only mechanism to stop the double
> > > >>delivery.
> > > >>
> > > >>Don't get me wrong, I do like your slick trick,
> > > >>but your solution is to change the frame format
> > > >>and add new features, whereas I am working to
> > > >>fix the currently approved text.
> > > >>
> > > >>cheers,
> > > >>
> > > >>mike
> > > >>
> > > >>
> > > >>
> > > >>David James wrote:
> > > >>>
> > > >>> Mike,
> > > >>>
> > > >>> I believe part of the problem is that you are claiming
> > > >>> that something that is not documented does not fail.
> > > >>> It's hard to argue, as the definition can change
> > > >>> whenever a failure is illustrated.
> > > >>>
> > > >>> Perhaps you should document the TTL stripping
> > > >>> protocols with some background text and illustrations?
> > > >>> And, clearly define the synchronization points
> > > >>> with Discovery, that are often implied.
> > > >>>
> > > >>> In my comments to D0.3 I have done this for DSID scoping.
> > > >>> While this was much easier (since it doesn't have
> > > >>> all of TTL stripping exceptions and problems), I believe
> > > >>> it's only fair to ask for you to do the same.
> > > >>>
> > > >>> Then, we will be able to analyze a problem,
> > > >>> without the problem statement constantly changing.
> > > >>>
> > > >>> Even after that, there is the basic problem that:
> > > >>> 1) Bidirectional flooding is required for performance.
> > > >>> 2) A single-station failure of (1) generates a duplicate.
> > > >>> 3) A multi-station failure of DSID stripping
> > > >>>    generates no duplicates.
> > > >>>
> > > >>> Remember, 2/3 is the failure scenario you mentioned
> > > >>> in previous email. Having agreed, I'm a bit surprised
> > > >>> this no longer appears to be a concern to you, just
> > > >>> because the repercussions of that statement changed...
> > > >>>
> > > >>> DVJ
> > > >>>
> > > >>> > -----Original Message-----
> > > >>> > From: owner-stds-802-17@xxxxxxxxxxxxxxxxxx
> > > >>> > [mailto:owner-stds-802-17@xxxxxxxxxxxxxxxxxx]On Behalf Of
> > > >>Mike Takefman
> > > >>> > Sent: Monday, June 24, 2002 1:31 PM
> > > >>> > To: Anoop Ghanwani
> > > >>> > Cc: 'djz@xxxxxxxxxxx '; 'stds-802-17@xxxxxxxx '
> > > >>> > Subject: Re: [RPRWG] control TTL (the 255-station and 2000-km issue)
> > > >>> >
> > > >>> >
> > > >>> >
> > > >>> > Anoop,
> > > >>> >
> > > >>> > For a protection hierarchy to work all nodes
> > > >>> > need to know about all failures.
> > > >>> >
> > > >>> > I agree with your comment that this needs to be
> > > >>> > clearly documented as part of the standard.
> > > >>> > Furthermore, if the WG decides to accept this
> > > >>> > TTL decrement algorithm it must be documented
> > > >>> > properly.
> > > >>> >
> > > >>> > With regard to your last question / comment.
> > > >>> > In wrapping, the adjacent nodes can react immediately if
> > > >>> > they have the highest priority failure. Thereby
> > > >>> > wrapping will have quicker reaction times than
> > > >>> > steering. The need to broadcast in the wrapping
> > > >>> > case is to support the hierarchy. If no hierarchy
> > > >>> > was supported, then the decision could be completely
> > > >>> > local. In this case I would still argue that a
> > > >>> > broadcast of the event was useful for 2 reasons.
> > > >>> > 1) The same algorithm supports steering which is
> > > >>> >    the default mode
> > > >>> > 2) The packets that are trapped on the wrong ring
> > > >>> >    will get killed.
> > > >>> >
> > > >>> > cheers,
> > > >>> >
> > > >>> > mike
> > > >>> >
> > > >>> >
> > > >>> > Anoop Ghanwani wrote:
> > > >>> > >
> > > >>> > >
> > > >>> > > Mike,
> > > >>> > >
> > > >>> > > I was trying to say that when a node "unwraps" due
> > > >>> > > to the ring healing, it can't throw away packets
> > > >>> > > forever because the ring might wrap at some other
> > > >>> > > place making it valid for this node to see packets
> > > >>> > > with the wrap bit set.  Therefore a node would have
> > > >>> > > to set some kind of timer (on the order of RTT) and
> > > >>> > > only throw away packets for that duration.
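
The timer idea in the paragraph above could be sketched in Python as follows. This is only an illustration under assumed names: `UnwrapFilter`, `on_unwrap`, and `should_discard` are hypothetical, the window length is simply whatever RTT estimate the node has, and the clock value is passed in explicitly.

```python
from typing import Optional


class UnwrapFilter:
    """Discard wrap-bit frames only for a bounded window after unwrapping."""

    def __init__(self, rtt_seconds: float) -> None:
        self.rtt_seconds = rtt_seconds
        self.discard_until: Optional[float] = None  # no window open yet

    def on_unwrap(self, now: float) -> None:
        # The ring healed at this node: open a discard window of about one RTT.
        self.discard_until = now + self.rtt_seconds

    def should_discard(self, wrap_bit_set: bool, now: float) -> bool:
        # Outside the window a wrap-bit frame may be legitimate, because the
        # ring may have wrapped somewhere else in the meantime.
        if not wrap_bit_set or self.discard_until is None:
            return False
        return now < self.discard_until
```

Once the window closes, wrap-bit frames pass through again, which is the case where a wrap elsewhere makes them valid at this node.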
> > > >>> > >
> > > >>> > > The above discussion was trying to solve the problem
> > > >>> > > where all nodes do not know about protection events;
> > > >>> > > only those adjacent to the fault do.  If all nodes do
> > > >>> > > know about protection events, the solution you mention
> > > >>> > > should work, but it does need to be documented in the
> > > >>> > > spec.
> > > >>> > >
> > > >>> > > [Off topic discussion]
> > > >>> > > To me, it seemed like the main argument for doing wrapping
> > > >>> > > is that only nodes adjacent to the fault need to know about
> > > >>> > > it and react to it.  If all nodes do need to know about
> > > >>> > > a protection event, then it is probably more efficient
> > > >>> > > for them to use steering.
> > > >>> > >
> > > >>> > > -Anoop
> > > >>> > >
> > > >>> > > -----Original Message-----
> > > >>> > > From: Mike Takefman
> > > >>> > > To: Anoop Ghanwani
> > > >>> > > Cc: djz@xxxxxxxxxxx; stds-802-17@xxxxxxxx
> > > >>> > > Sent: 6/24/02 12:07 AM
> > > >>> > > Subject: Re: [RPRWG] control TTL (the 255-station and 2000-km issue)
> > > >>> > >
> > > >>> > > Anoop,
> > > >>> > >
> > > >>> > > wrapping nodes always communicate with every other
> > > >>> > > node anyway. This is necessary for protection
> > > >>> > > hierarchy to work. Also, given the broadcast
> > > >>> > > nature of messages to make steering work in under
> > > >>> > > 50 ms, I have no concern over all nodes knowing
> > > >>> > > that all protection events are done and the ringlets
> > > >>> > > are healed.
> > > >>> > >
> > > >>> > > If one waits for the ringlets to be healed
> > > >>> > > and then kills the packet, life is fine. Or
> > > >>> > > maybe I did not understand your comment.
> > > >>> > >
> > > >>> > > mike
> > > >>> > >
> > > >>> > > Anoop Ghanwani wrote:
> > > >>> > > >
> > > >>> > > > > > The problem with (3), which you seem to advocate,
> > > >>> > > > > > is the time gap between the wrap action and the
> > > >>> > > > > > distribution/settling of the wrap state information
> > > >>> > > > > > in other stations. During this time difference, any
> > > >>> > > > > > and all TTL-strip based frames will be discarded.
> > > >>> > > > >
> > > >>> > > > > A good point, David. In response, please consider the following:
> > > >>> > > > >
> > > >>> > > > > Never decrement when on the wrong ring. Once the wrap
> > > >>> > > > > state is left, kill the packet if the ring id
> > > >>> > > > > is wrong. Thus going into wrap does not cause the
> > > >>> > > > > packets to be prematurely lost. When leaving wrap
> > > >>> > > > > the packets will be killed once everyone knows
> > > >>> > > > > the wrap is over.
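
One way to read the rule proposed above is sketched below in Python. It assumes each station knows its own ring id and whether it currently believes a wrap is in effect; the function name, parameters, and return values are hypothetical, and ordinary TTL expiry is deliberately left out.

```python
from typing import Tuple


def handle_transit_frame(ttl: int, frame_ring_id: int, local_ring_id: int,
                         ring_is_wrapped: bool) -> Tuple[str, int]:
    """Return (action, ttl) for a frame passing through this station."""
    on_wrong_ring = frame_ring_id != local_ring_id
    if on_wrong_ring and ring_is_wrapped:
        # Never decrement on the wrong ring, so entering wrap
        # does not cause frames to be lost prematurely.
        return ("forward", ttl)
    if on_wrong_ring:
        # The wrap state has been left: a frame with the wrong ring id is stale.
        return ("kill", ttl)
    # Frame is on its own ring: normal per-hop decrement.
    return ("forward", ttl - 1)
```

The sketch makes the dependency visible: the "kill" branch is only safe once this station knows the wrap is over, which is the point raised in the reply that follows.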
> > > >>> > > >
> > > >>> > > > Mike,
> > > >>> > > >
> > > >>> > > > Does everyone on the ring know when a wrap has occurred
> > > >>> > > > and when it heals?  I thought wrapping was a local issue
> > > >>> > > > and only nodes adjacent to the fault know about it.
> > > >>> > > > In that case, if the node at which wrapping occurs
> > > >>> > > > detects a heal, and for some reason doesn't pull a wrap
> > > >>> > > > packet off, it will continue to circulate forever.
> > > >>> > > > The node can't be dropping wrapped packets forever
> > > >>> > > > because the wrap could occur somewhere else at
> > > >>> > > > which time it would be a legal packet for pass-through.
> > > >>> > > >
> > > >>> > > > -Anoop
> > > >>> > >
> > > >>> > > --
> > > >>> > > Michael Takefman              tak@xxxxxxxxx
> > > >>> > > Manager of Engineering,       Cisco Systems
> > > >>> > > Chair IEEE 802.17 Stds WG
> > > >>> > > 2000 Innovation Dr, Ottawa, Canada, K2K 3E8
> > > >>> > > voice: 613-254-3399       fax: 613-254-4867
> > > >>> >
> > > >>
> >

-- 
Michael Takefman              tak@xxxxxxxxx
Manager of Engineering,       Cisco Systems
Chair IEEE 802.17 Stds WG
2000 Innovation Dr, Ottawa, Canada, K2K 3E8
voice: 613-254-3399       fax: 613-254-4867