Re: [RPRWG] RE: Divergent Simulator Results Explained
- To: Mike Takefman <tak@xxxxxxxxx>
- Subject: Re: [RPRWG] RE: Divergent Simulator Results Explained
- From: Prasenjit Biswas <pbiswas@xxxxxxxxxxxxxxxxxxx>
- Date: Wed, 16 Jul 2003 11:10:08 -0700
- CC: David V James <dvj@xxxxxxxxxxxx>, Rpr GroupOf Ieee<stds-802-17@xxxxxxxx>, Jon Schuringa <jon.schuringa@xxxxxxxxxxxx>, bjornfd@xxxxxxxxx, bjoernal@xxxxxxxxxx, petterte@xxxxxxxxxx, nuzun@xxxxxxxxx, leonb@xxxxxxxxxxxxx, kkrama@xxxxxxxxxxxxxxxx, JLemon@xxxxxxxxxxxx, hpeng@xxxxxxxxxxxxxxxxxx, yan@xxxxxxxxxxxxxxx, huang@xxxxxxxxxxxxxxx, mei@xxxxxxxxxxxxxxxx, Stein Gjessing<steing@xxxxxxxxx>
- References: <FMEBLOEMFEFGGFLELMLNKEEDCLAA.dvj@xxxxxxxxxxxx> <3F155273.8020100@xxxxxxxxx>
- Sender: owner-stds-802-17@xxxxxxxxxxxxxxxxxx
- User-Agent: Mozilla/5.0 (X11; U; SunOS sun4u; en-US; rv:1.0.1) Gecko/20020920 Netscape/7.0
For Jon:
The condition
(lpNrXmitRate > unreservedRate) was eliminated a couple of meetings
back as a condition for congestion. It does not appear in draft 2.3
either (see page 225).
Prasenjit Biswas
Mike Takefman wrote:
>
> David,
>
> I agree SRP != RPR, but you missed my point.
>
> Real network deployments don't require solutions that have
> 0 corner cases. The evaluation is whether something is good
> enough to solve real world problems.
>
> mike
>
> David V James wrote:
>
>> Mike,
>>
>> With regards to:
>>
>>>> Frankly, certain
>>>> people can complain about SRP fairness, but it has been
>>>> deployed for years in REAL networks and we have not
>>>> had any complaints of performance issues.
>>>
>>>
>>
>> Yes, but:
>> assert(RPR!=SRP);
>> if (RPR!=SRP)
>> RPR_Behavior != SRP_Behavior;
>>
>> In particular, I believe:
>> 1) classA/classB/classC behaviors of RPR are different from SRP.
>> 2) the single-queue option is not supported in SRP.
>>
>> In any case, RPR will have to stand up to sponsor-ballot
>> analysis. I know several reviewers, including myself, who
>> will not buy the "it's like SRP and (trust us) that works"
>> type of arguments.
>>
>> Comprehensible text, specific equations, and proofs of worst-case
>> latencies are needed. I had to provide such things when writing
>> a Masters thesis; I hope we apply similar quality controls to
>> IEEE Standards.
>>
>> DVJ
>>
>> David V. James
>> 3180 South Ct
>> Palo Alto, CA 94306
>> Home: +1.650.494.0926
>> +1.650.856.9801
>> Cell: +1.650.954.6906
>> Fax: +1.360.242.5508
>> Base: dvj@xxxxxxxxxxxx
>>
>>
>>>> -----Original Message-----
>>>> From: Mike Takefman [mailto:tak@xxxxxxxxx]
>>>> Sent: Monday, July 14, 2003 8:53 AM
>>>> To: Jon Schuringa
>>>> Cc: Stein Gjessing; dvj@xxxxxxxxxxxx; mei@xxxxxxxxxxxxxxxx;
>>>> huang@xxxxxxxxxxxxxxx; yan@xxxxxxxxxxxxxxx; hpeng@xxxxxxxxxxxxxxxxxx;
>>>> JLemon@xxxxxxxxxxxx; kkrama@xxxxxxxxxxxxxxxx; leonb@xxxxxxxxxxxxx;
>>>> nuzun@xxxxxxxxx; petterte@xxxxxxxxxx; bjoernal@xxxxxxxxxx;
>>>> bjornfd@xxxxxxxxx
>>>> Subject: Re: Divergent Simulator Results Explained
>>>>
>>>>
>>>> All,
>>>>
>>>> I know that our implementation of SRP had fair
>>>> sharing of the BW between STQ and stage buffer
>>>> when the STQ threshold was below the low limit.
>>>> Necdet had assured me that this behavior made
>>>> it into the standard. If it did not, or got cut
>>>> somehow, we may want to put it back in.
>>>>
>>>> I am not sure why some of the changes from D2.2 to D2.3
>>>> were made, but they appear to have created this
>>>> problem where it did not exist before.
>>>>
>>>> We also have to take a careful look at these scenarios
>>>> and do a sanity check on them. Frankly, certain
>>>> people can complain about SRP fairness, but it has been
>>>> deployed for years in REAL networks and we have not
>>>> had any complaints of performance issues. We have
>>>> to be careful not to be optimizing for corner cases
>>>> that cause worse problems in the general case.
>>>>
>>>> cheers,
>>>>
>>>> mike
>>>>
>>>> Jon Schuringa wrote:
>>>>
>>>>> All,
>>>>>
>>>>> I agree with Mike that the implementation of the shapers could be the
>>>>> reason for the different simulation results. I also found some problems
>>>>> with my implementation some time ago.
>>>>>
>>>>> BUT, it is very clear to me now why shaperD has a serious design flaw. I
>>>>> will explain that here without "proof by simulation". Please take some
>>>>> time to read it.
>>>>>
>>>>>
>>>>>
>>>>> Consider the following very simple situation (like the one Wang Chao
>>>>> proposed some months ago):
>>>>>
>>>>> ------(A)-------------(B)-------------(C)----
>>>>>
>>>>> o-------------flow1------------->
>>>>> o------flow2---->
>>>>>
>>>>>
>>>>> Both flows are class C flows and have enough to send, say 100% of the
>>>>> lineRate. Furthermore we have 50% classA0, so unreservedRate is 50% of
>>>>> the lineRate. It does not matter if classA0 traffic is actually sent.
>>>>>
>>>>> What do we expect to happen?
>>>>> 1) STQ in station B grows
>>>>> 2) STQ in B reaches lowThreshold
>>>>> 3) Station B advertises a fair rate
>>>>> 4) Flow1 and flow2 both get unreservedRate/2 = 25% of the lineRate
>>>>>
>>>>> What will happen?
>>>>> 1) STQ in B will not fill up to lowThreshold:
>>>>> Each time that a small number of packets is in the STQ, station B will
>>>>> forward these OR add local packets. The output will not be idle until
>>>>> the STQ is empty. This is *very* important to understand.
>>>>> Now, all these packets decrement the shaperD credits, so:
>>>>> a) We are decrementing credits at lineRate, and
>>>>> b) Increasing at unreservedRate, which is less than lineRate.
>>>>>
>>>>> ShaperD will get below the loLimit, thus stopping add traffic. Station B
>>>>> now drains the STQ at lineRate, keeping the shaperD below loLimit. Thus
>>>>> the STQ cannot fill at all! Initial credits, or timing in simulation
>>>>> programs, are actually not important in this case.
>>>>>
>>>>> 2) Interestingly, both flows get 25% in the current RPR version. Station
>>>>> B is reporting congestion, but not because of STQ thresholds! A station
>>>>> is also reporting congestion if (lpNrXmitRate > unreservedRate). And
>>>>> that is exactly what happens, because station B is transmitting at
>>>>> lineRate as long as there is something in its STQ. The fact that the
>>>>> outcome of this experiment is right might be the reason why some people
>>>>> did not recognize this to be a problem.
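
A minimal sketch of Jon's argument for station B, not code from any of the simulators discussed in this thread. It assumes a slotted model in which one slot carries one packet at lineRate, add traffic wins whenever shaperD credits are at or above loLimit, and flow1 arrives from station A at a constant rate. The function name, the parameter names and all numeric values (credit limits, slot count) are illustrative assumptions, not values from the draft.

```python
def simulate_station_b(slots=1000, unreserved_rate=0.5, transit_rate=1.0,
                       hi_limit=10.0, lo_limit=0.0):
    """Toy slotted model of station B (all parameters are assumptions).

    Returns (max STQ depth in packets, packets added, packets forwarded).
    """
    credits = hi_limit      # shaperD credits, in packets
    stq = 0                 # STQ occupancy, in packets
    pending = 0.0           # fractional transit arrivals carried over
    max_stq = added = forwarded = 0

    for _ in range(slots):
        # Transit traffic (flow1) arrives from station A.
        pending += transit_rate
        while pending >= 1.0:
            stq += 1
            pending -= 1.0

        # Credits grow at unreservedRate, capped at the high limit.
        credits = min(credits + unreserved_rate, hi_limit)

        if credits >= lo_limit:
            # Add traffic (flow2) is allowed: send one local packet.
            added += 1
            credits -= 1.0
        elif stq > 0:
            # Add traffic is blocked: forward one transit packet.
            # Transit packets also decrement shaperD credits.
            stq -= 1
            forwarded += 1
            credits -= 1.0

        max_stq = max(max_stq, stq)

    return max_stq, added, forwarded


if __name__ == "__main__":
    depth, added, forwarded = simulate_station_b()
    print("max STQ depth:", depth, "added:", added, "forwarded:", forwarded)
```

In this toy model the output is busy in every slot, so credits fall by one packet per slot while rising by only unreserved_rate. Once they drop below lo_limit the station only forwards, and the STQ depth is bounded by roughly the credit range divided by (1 - unreserved_rate), independent of how long the run lasts. A lowThreshold far above that bound is never reached.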
>>>>>
>>>>>
>>>>> So, the externally observable behavior of RPR is actually ok in this
>>>>> case, but it is easy to come up with other scenarios where it is not.
>>>>> Why do we need an STQ of multiple MB if it cannot fill?
>>>>>
>>>>>
>>>>> I hope this was understandable,
>>>>> Jon
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> ----- Original Message -----
>>>>> From: "Stein Gjessing" <steing@xxxxxxxxx>
>>>>> To: <tak@xxxxxxxxx>
>>>>> Cc: <jon.schuringa@xxxxxxxxxxxx>; <dvj@xxxxxxxxxxxx>;
>>>>> <mei@xxxxxxxxxxxxxxxx>; <huang@xxxxxxxxxxxxxxx>;
>>>>> <yan@xxxxxxxxxxxxxxx>;
>>>>> <hpeng@xxxxxxxxxxxxxxxxxx>; <JLemon@xxxxxxxxxxxx>;
>>>>> <kkrama@xxxxxxxxxxxxxxxx>; <leonb@xxxxxxxxxxxxx>; <nuzun@xxxxxxxxx>;
>>>>> <petterte@xxxxxxxxxx>; <bjoernal@xxxxxxxxxx>; <bjornfd@xxxxxxxxx>;
>>>>> <steing@xxxxxxxxx>
>>>>> Sent: Thursday, July 10, 2003 10:33 PM
>>>>> Subject: Re: Divergent SImulator Results
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> (This is written in a login window, using a cell-phone as modem,
>>>>>> please excuse spelling errors etc.)
>>>>>>
>>>>>> All,
>>>>>> All our simulators contain errors, and will continue to do so
>>>>>> forever. What we can hope to achieve is to reduce the number of
>>>>>> errors that make our results very incorrect.
>>>>>> During the implementation of our simulator we also found errors in
>>>>>> the RPR draft. I believe one goal of the simulation activity is to
>>>>>> remove such errors.
>>>>>>
>>>>>> When we run simulations another source of error is configuration. I
>>>>>> recently found an error in our configuration of the 62-station
>>>>>> scenario from David and Harry, and I want to correct that before going
>>>>>> public with more results.
>>>>>>
>>>>>> Answering Mike's questions I also found an error in the Shaper class.
>>>>>> I am sure we will find many more before the Simula-RPR simulator is
>>>>>> relatively stable and accurate. Changes to the RPR draft and hence
>>>>>> the Java code will introduce more errors, but also reveal some, I hope.
>>>>>>
>>>>>> Now to Mike's questions:
>>>>>> 1. The Simula-RPR simulator is exact to the level of transmitting one
>>>>>> byte.
>>>>>> 2. The shapers are accurate to the level of transmitting one byte.
>>>>>> What we do (what the code is intended to do) is that whenever a
>>>>>> shaper is used, its value is calculated. This is done by saving the
>>>>>> value, the time this value was computed and the increment. Hence it is
>>>>>> easy to calculate a new value when the shaper is used the next time.
>>>>>> As I said above, I just found a bug in the Shaper class, and I
>>>>>> am sure there are more, but we have tested the simulator for a while
>>>>>> now, and the shapers seem to work pretty well.
>>>>>> I need to go into the code in more detail to answer all of a, b, c and
>>>>>> d, but in a simulator I do not think simultaneous or near-simultaneous
>>>>>> actions should be a problem (or more correctly: we must take that into
>>>>>> consideration all the time).
>>>>>>
>>>>>> 3. We decrement to max_credit when we go above, and we handle
>>>>>> negative credits (as we showed in the 62-station example).
>>>>>>
>>>>>> 4. We decrease the full amount of credits the moment we move a packet
>>>>>> to the stage buffer (i.e. 4b).
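
A minimal sketch of the shaper bookkeeping Stein describes in points 2 to 4, not the Simula-RPR Shaper class itself. It assumes credits are tracked in bytes and time in seconds, and that the shaper starts full. The class name LazyShaper and its method names are hypothetical, chosen only for illustration.

```python
class LazyShaper:
    """Credit shaper that is only recomputed when it is used.

    It stores the last computed value, the time of that computation and
    the increment (credits per unit time), as described in point 2.
    """

    def __init__(self, rate, max_credit, now=0.0):
        self.rate = rate              # credit increment per unit time
        self.max_credit = max_credit  # upper bound on accumulated credit
        self.value = max_credit       # assumption: the shaper starts full
        self.stamp = now              # time at which self.value was computed

    def credits(self, now):
        """Bring the saved value up to date and return it."""
        elapsed = now - self.stamp
        # Point 3: a value above max_credit is brought back down to it.
        self.value = min(self.value + elapsed * self.rate, self.max_credit)
        self.stamp = now
        return self.value

    def consume(self, now, size):
        """Charge a whole packet at once (point 4); may go negative."""
        self.credits(now)
        self.value -= size
        return self.value


if __name__ == "__main__":
    shaper = LazyShaper(rate=0.5, max_credit=100.0)
    print(shaper.consume(0.0, 150.0))   # charged in full, goes negative
    print(shaper.credits(40.0))         # partially refilled
    print(shaper.credits(1000.0))       # capped at max_credit
```

Because the value is derived from the elapsed time on each use, the result does not depend on a periodic update tick, which is one way to get the byte-level accuracy mentioned in point 2.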
>>>>>>
>>>>>> Finally I fully agree that we must have several simulators agree
>>>>>> before we make changes to the draft based on simulations. And we
>>>>>> cannot really make changes based on simulations, but on the new
>>>>>> understanding we get from the results of the simulations.
>>>>>>
>>>>>> Stein
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>> --
>>>> Michael Takefman tak@xxxxxxxxx
>>>> Manager of Engineering, Cisco Systems
>>>> Chair IEEE 802.17 Stds WG
>>>> 2000 Innovation Dr, Ottawa, Canada, K2K 3E8
>>>> voice: 613-254-3399 cell:613-220-6991
>>>>
>>>
>>
>>
>
>