[RPRWG] RE: Divergent Simulator Results Explained
- To: "Mike Takefman" <tak@xxxxxxxxx>
- Subject: [RPRWG] RE: Divergent Simulator Results Explained
- From: "David V James" <dvj@xxxxxxxxxxxx>
- Date: Tue, 15 Jul 2003 22:13:40 -0700
- Cc: "Rpr GroupOf Ieee" <stds-802-17@xxxxxxxx>, "Jon Schuringa" <jon.schuringa@xxxxxxxxxxxx>, <bjornfd@xxxxxxxxx>, <bjoernal@xxxxxxxxxx>, <petterte@xxxxxxxxxx>, <nuzun@xxxxxxxxx>, <leonb@xxxxxxxxxxxxx>, <kkrama@xxxxxxxxxxxxxxxx>, <JLemon@xxxxxxxxxxxx>, <hpeng@xxxxxxxxxxxxxxxxxx>, <yan@xxxxxxxxxxxxxxx>, <huang@xxxxxxxxxxxxxxx>, <mei@xxxxxxxxxxxxxxxx>, "Stein Gjessing" <steing@xxxxxxxxx>
- Importance: Normal
- In-Reply-To: <3F12D1DF.1000805@xxxxxxxxx>
- Sender: owner-stds-802-17@xxxxxxxxxxxxxxxxxx
Mike,
With regards to:
>> Frankly, certain
>> people can complain about SRP fairness, but it has been
>> deployed for years in REAL networks and we have not
>> had any complaints of performance issues.
Yes, but:
assert(RPR!=SRP);
if (RPR!=SRP)
RPR_Behavior != SRP_Behavior;
In particular, I believe:
1) classA/classB/classC behaviors of RPR are different from SRP.
2) the single-queue option is not supported in SRP.
In any case, RPR will have to stand up to sponsor-ballot
analysis. I know several reviewers, including myself, who
will not buy the "it's like SRP and (trust us) that works"
type of argument.
Comprehensible text, specific equations, and proofs of worst-case
latencies are needed. I had to provide such things when writing
a Master's thesis; I hope we apply similar quality controls to
IEEE Standards.
DVJ
David V. James
3180 South Ct
Palo Alto, CA 94306
Home: +1.650.494.0926
+1.650.856.9801
Cell: +1.650.954.6906
Fax: +1.360.242.5508
Base: dvj@xxxxxxxxxxxx
>> -----Original Message-----
>> From: Mike Takefman [mailto:tak@xxxxxxxxx]
>> Sent: Monday, July 14, 2003 8:53 AM
>> To: Jon Schuringa
>> Cc: Stein Gjessing; dvj@xxxxxxxxxxxx; mei@xxxxxxxxxxxxxxxx;
>> huang@xxxxxxxxxxxxxxx; yan@xxxxxxxxxxxxxxx; hpeng@xxxxxxxxxxxxxxxxxx;
>> JLemon@xxxxxxxxxxxx; kkrama@xxxxxxxxxxxxxxxx; leonb@xxxxxxxxxxxxx;
>> nuzun@xxxxxxxxx; petterte@xxxxxxxxxx; bjoernal@xxxxxxxxxx;
>> bjornfd@xxxxxxxxx
>> Subject: Re: Divergent Simulator Results Explained
>>
>>
>> All,
>>
>> I know that our implementation of SRP had fair
>> sharing of the BW between STQ and stage buffer
>> when the STQ threshold was below the low limit.
>> Necdet had assured me that this behavior made
>> it into the standard. If it did not, or got cut
>> somehow, we may want to put it back in.
>>
>> I am not sure why some of the changes from D2.2 to D2.3
>> were made, but they appear to have created this
>> problem where it did not exist before.
>>
>> We also have to take a careful look at these scenarios
>> and do a sanity check on them. Frankly, certain
>> people can complain about SRP fairness, but it has been
>> deployed for years in REAL networks and we have not
>> had any complaints of performance issues. We have
>> to be careful not to be optimizing for corner cases
>> that cause worse problems in the general case.
>>
>> cheers,
>>
>> mike
>>
>> Jon Schuringa wrote:
>> >
>> > All,
>> >
>> > I agree with Mike that the implementation of the shapers could be the
>> > reason for the different simulation results. I also found some problems
>> > with my implementation some time ago.
>> >
>> > BUT, it is very clear to me now why shaperD has a serious design flaw.
>> > I will explain that here without "proof by simulation". Please take
>> > some time to read it.
>> >
>> >
>> >
>> > Consider the following very simple situation (like the one Wang Chao
>> > proposed some months ago):
>> >
>> > ------(A)-------------(B)-------------(C)----
>> >
>> >        o-------------flow1------------->
>> >                        o------flow2---->
>> >
>> >
>> > Both flows are class C flows and have enough to send, say 100% of the
>> > lineRate. Furthermore we have 50% classA0, so unreservedRate is 50% of
>> > the lineRate. It does not matter whether classA0 traffic is actually sent.
>> >
>> > What do we expect to happen?
>> > 1) STQ in station B grows
>> > 2) STQ in B reaches lowThreshold
>> > 3) Station B advertises a fair rate
>> > 4) Flow1 and flow2 both get unreservedRate/2 = 25% of the lineRate
>> >
>> > What will happen?
>> > 1) STQ in B will not fill up to lowThreshold:
>> > Each time that a small number of packets is in the STQ, station B will
>> > forward these OR add local packets. The output will not be idle until
>> > the STQ is empty. This is *very* important to understand.
>> > Now, all these packets decrement the shaperD credits, so:
>> > a) We are decrementing credits at lineRate, and
>> > b) Increasing at unreservedRate, which is less than lineRate.
>> >
>> > ShaperD will get below the loLimit, thus stopping add traffic. Station B
>> > now drains the STQ at lineRate, keeping shaperD below loLimit. Thus the
>> > STQ cannot fill at all! Initial credits, or timing in simulation
>> > programs, are actually not important in this case.
>> >
>> > 2) Interestingly, both flows get 25% in the current RPR version. Station
>> > B is reporting congestion, but not because of STQ thresholds! A station
>> > also reports congestion if (lpNrXmitRate > unreservedRate). And that is
>> > exactly what happens, because station B is transmitting at lineRate as
>> > long as there is something in its STQ. The fact that the outcome of this
>> > experiment is right might be the reason why some people did not
>> > recognize this to be a problem.
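
A minimal sketch of Jon's credit argument, in Python rather than in any of the simulators discussed in this thread. It is an open-loop toy model under loud assumptions: time advances in ticks of one packet at lineRate, flow1 always arrives at lineRate (no fairness feedback reaches station A), add traffic has priority whenever shaperD is above loLimit, and the long-run transmit rate stands in for lpNrXmitRate. The names `simulate_station_b`, `Result`, `COST_PER_PACKET` and the default limits are hypothetical, not taken from the draft.

```python
from dataclasses import dataclass

# Hypothetical credit scale: sending one packet costs 100 credits, so a
# gain of `unreserved_pct` credits per tick refills at unreservedRate.
COST_PER_PACKET = 100


@dataclass
class Result:
    peak_stq: int            # largest STQ occupancy observed, in packets
    added: int               # flow2 packets added by station B
    forwarded: int           # flow1 packets forwarded out of the STQ
    final_credits: int       # shaperD credits when the run ends
    congested_by_rate: bool  # transmit rate exceeded unreservedRate


def simulate_station_b(ticks=1000, unreserved_pct=50, hi_limit=200, lo_limit=0):
    """Open-loop model of station B with both flows offering lineRate."""
    credits = hi_limit  # assumption: shaperD starts full
    stq = peak_stq = added = forwarded = 0
    for _ in range(ticks):
        stq += 1  # flow1 arrives from station A at lineRate
        if credits > lo_limit:
            added += 1      # shaperD allows local (flow2) traffic
        else:
            stq -= 1        # add traffic blocked: drain the STQ instead
            forwarded += 1
        # Every transmitted packet, added or forwarded, costs credits;
        # the refill happens only at unreservedRate.
        credits = min(hi_limit, credits - COST_PER_PACKET + unreserved_pct)
        peak_stq = max(peak_stq, stq)
    # Crude stand-in for lpNrXmitRate: average transmit rate over the run.
    xmit_pct = 100 * (added + forwarded) // ticks
    return Result(peak_stq, added, forwarded, credits, xmit_pct > unreserved_pct)


if __name__ == "__main__":
    print(simulate_station_b())
```

With the defaults, the STQ peaks at 4 packets, which is just the initial credit headroom divided by the net drain per tick, and never grows beyond that. Station B transmits at lineRate throughout, so the rate condition, not an STQ threshold, is what flags congestion. Because this model has no feedback to station A, flow2 is simply starved after the first few packets; the 25% split Jon describes depends on the fairness advertisement, which is not modelled here.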
>> >
>> >
>> > So, the externally observable behavior of RPR is actually ok in this
>> > case, but it is easy to come up with other scenarios where it is not.
>> > Why do we need an STQ of multiple MB if it cannot fill?
>> >
>> >
>> > I hope this was understandable,
>> > Jon
>> >
>> >
>> >
>> >
>> >
>> >
>> > ----- Original Message -----
>> > From: "Stein Gjessing" <steing@xxxxxxxxx>
>> > To: <tak@xxxxxxxxx>
>> > Cc: <jon.schuringa@xxxxxxxxxxxx>; <dvj@xxxxxxxxxxxx>;
>> > <mei@xxxxxxxxxxxxxxxx>; <huang@xxxxxxxxxxxxxxx>; <yan@xxxxxxxxxxxxxxx>;
>> > <hpeng@xxxxxxxxxxxxxxxxxx>; <JLemon@xxxxxxxxxxxx>;
>> > <kkrama@xxxxxxxxxxxxxxxx>; <leonb@xxxxxxxxxxxxx>; <nuzun@xxxxxxxxx>;
>> > <petterte@xxxxxxxxxx>; <bjoernal@xxxxxxxxxx>; <bjornfd@xxxxxxxxx>;
>> > <steing@xxxxxxxxx>
>> > Sent: Thursday, July 10, 2003 10:33 PM
>> > Subject: Re: Divergent SImulator Results
>> >
>> >
>> >
>> >>(This is written in a login window, using a cell-phone as modem,
>> >>please excuse spelling errors etc.)
>> >>
>> >>All,
>> >>All our simulators contain errors, and will continue to do so
>> >>forever. What we can hope to achieve is to reduce the number of
>> >>errors that make our results very incorrect.
>> >>During the implementation of our simulator we also found errors in the
>> >>RPR draft. I believe one goal of the simulation activity is to remove
>> >>such errors.
>> >>
>> >>When we run simulations another source of error is configuration. I
>> >>recently found an error in our configuration of the 62 station scenario
>> >>from David and Harry, and I want to correct that before going public
>> >>with more results.
>> >>
>> >>Answering Mike's questions I also found an error in the Shaper class.
>> >>I am sure we will find many more before the Simula-RPR simulator is
>> >>relatively stable and accurate. Changes to the RPR draft and hence the
>> >>Java code will introduce more errors, but also reveal some, I hope.
>> >>
>> >>Now to Mike's questions:
>> >>1. The Simula-RPR simulator is exact to the level of transmitting one
>> >>byte.
>> >>2. The shapers are accurate to the level of transmitting one byte.
>> >> What we do (what the code is intended to do) is that whenever a
>> >>shaper is used, its value is calculated. This is done by saving the
>> >>value, the time this value was computed and the increment. Hence it is
>> >>easy to calculate a new value when the shaper is used the next time.
>> >>As I said above, I just found a bug in the Shaper class, and I
>> >>am sure there are more, but we have tested the simulator for a while now,
>> >>and the shapers seem to work pretty well.
>> >>I need to go into the code in more detail to answer all of a, b, c and
>> >>d, but in a simulator I do not think simultaneous or near simultaneous
>> >>actions should be a problem (or more correctly: we must take that into
>> >>consideration all the time)
>> >>
>> >>3. We cap credits at max_credit when they go above it,
>> >> and we handle negative credits (as we showed in the 62 station
>> >> example)
>> >>
>> >>4. We decrease the full amount of credits the moment we move a packet
>> >> to the stage buffer (i.e. 4b)
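
The shaper bookkeeping Stein describes in answers 2 to 4 can be sketched as follows. This is an illustration in Python, not the Simula-RPR Shaper class (which is not shown in this thread): the class name `LazyShaper`, its method names, the integer byte/time units, and the choice to start with full credits are all assumptions.

```python
class LazyShaper:
    """Credit shaper that is updated only when it is used.

    It stores the last computed value, the time of that computation and
    the increment (rate), and derives the current value on demand.
    """

    def __init__(self, rate, max_credit, now=0):
        self.rate = rate              # credits (bytes) gained per time unit
        self.max_credit = max_credit  # upper bound on stored credits
        self.value = max_credit       # assumption: the shaper starts full
        self.stamp = now              # time at which `value` was computed

    def credits(self, now):
        """Bring the stored value up to date and return it."""
        elapsed = now - self.stamp
        # Accumulate since the last use, capped at max_credit (answer 3).
        self.value = min(self.max_credit, self.value + self.rate * elapsed)
        self.stamp = now
        return self.value

    def consume(self, now, nbytes):
        """Charge a whole packet at once, as in answer 4 (option 4b)."""
        self.credits(now)
        self.value -= nbytes  # may go negative; negative credits are kept
        return self.value


if __name__ == "__main__":
    shaper = LazyShaper(rate=5, max_credit=1000)
    print(shaper.consume(0, 1500))  # full packet charged immediately
    print(shaper.credits(40))       # partially recovered, still negative
    print(shaper.credits(1000))     # recovered and capped at max_credit
```

Computing the value from a timestamp avoids per-byte update events, which is one way to stay exact to the byte without stepping the shaper continuously. Whether the real simulator orders simultaneous events this way is not stated in the message.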
>> >>
>> >>Finally I fully agree that we must have several simulators agree
>> >>before we make changes to the draft based on simulations. And we can
>> >>not really make changes based on simulations, but on the new
>> >>understanding we get from the results of the simulations.
>> >>
>> >>Stein
>> >>
>> >>
>> >
>> >
>> >
>>
>>
>> --
>> Michael Takefman tak@xxxxxxxxx
>> Manager of Engineering, Cisco Systems
>> Chair IEEE 802.17 Stds WG
>> 2000 Innovation Dr, Ottawa, Canada, K2K 3E8
>> voice: 613-254-3399 cell:613-220-6991
>>