RE: [RPRWG] Why Data CRC Stomping is a BAD Idea?
Nirmal,
The STOMP is not a single value as you suggest below.
The proposed STOMP value in the frame is the inverse of the
expected CRC, so it should be as robust as the CRC itself. You
have to calculate the CRC check on the frame anyway, and I don't
see any added complexity in appending the inverted value to a
bad frame.
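
A minimal sketch of the inverted-CRC stomp described above. It
assumes Python's zlib.crc32 as a stand-in for the frame CRC and a
little-endian 4-byte check field; the function names, the byte
order, and the "good"/"stomped"/"bad" labels are illustrative
assumptions, not taken from the draft.

```python
import zlib

MASK = 0xFFFFFFFF  # 32-bit all-ones, used to invert the CRC


def fcs_bytes(value: int) -> bytes:
    # Byte order is an assumption made for this illustration only.
    return value.to_bytes(4, "little")


def append_crc(payload: bytes) -> bytes:
    """Append the expected CRC, as for a frame received without error."""
    return payload + fcs_bytes(zlib.crc32(payload) & MASK)


def append_stomp(payload: bytes) -> bytes:
    """Append the inverse of the expected CRC, marking the frame as stomped."""
    return payload + fcs_bytes((zlib.crc32(payload) & MASK) ^ MASK)


def classify(frame: bytes) -> str:
    """Recompute the CRC once and compare it against both candidate values."""
    payload, fcs = frame[:-4], int.from_bytes(frame[-4:], "little")
    expected = zlib.crc32(payload) & MASK
    if fcs == expected:
        return "good"
    if fcs == expected ^ MASK:
        return "stomped"
    return "bad"
```

Because the stomp value is derived from the CRC of each frame, it
changes with the payload; the receiver does one CRC computation
and two comparisons.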
I'm not sure if you are aware of the motivation for this
feature. Typically, frames with bad CRCs would be thrown away
by the MAC. In that case we would not have a statistics issue,
since frames with bad CRCs would not be seen by other stations
on the ring. We introduced the frame HEC requirement so that we
could deliver the payload when the RPR header is valid despite
errors in the payload. The problem then was that MAC statistics
would count the same bad-CRC frame in multiple places, which
makes problem isolation difficult.
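
The counting problem can be sketched with a toy model. It assumes
that each station on the path recomputes the CRC, that the first
station to see a bad CRC replaces it with the inverted CRC, and
that later stations record the frame as "stomped" rather than as a
new error. The station names, the forward() helper, and the
counter layout are hypothetical; zlib.crc32 again stands in for
the frame CRC.

```python
import zlib
from collections import Counter

MASK = 0xFFFFFFFF


def classify(payload: bytes, fcs: int) -> str:
    expected = zlib.crc32(payload) & MASK
    if fcs == expected:
        return "good"
    if fcs == expected ^ MASK:
        return "stomped"
    return "bad"


def forward(payload: bytes, fcs: int, stations, stomp_enabled: bool) -> Counter:
    """Pass one frame through the stations in order and collect their counts."""
    counts = Counter()
    for name in stations:
        verdict = classify(payload, fcs)
        if verdict != "good":
            counts[(name, verdict)] += 1
        if verdict == "bad" and stomp_enabled:
            # Replace the check field with the inverse of the expected CRC.
            fcs = (zlib.crc32(payload) & MASK) ^ MASK
    return counts
```

With a frame whose check field has one bit flipped and stations
B, C, D in sequence, the model reports "bad" at all three stations
when stomping is disabled, and "bad" only at B (with "stomped" at
C and D) when it is enabled, which points at the link into B.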
The RPR MAC also supports the Ethernet PHY in addition to the
SONET PHY. The motivation for putting OAM features in the
MAC is to be able to have OAM when operating over PHYs, such
as Ethernet, which do not support OAM.
As for your other comments, see below.
thanks,
robert
> -----Original Message-----
> From: Nirmal Saxena [mailto:nirmal@xxxxxxxxxxxxxxx]
> Sent: Wednesday, July 31, 2002 1:45 AM
> To: djz@xxxxxxxxxxx
> Cc: nirmal@xxxxxxxxxxxxxxx; stds-802-17@xxxxxxxx
> Subject: Re: [RPRWG] Why Data CRC Stomping is a BAD Idea?
>
>
>
> >
> > Nirmal,
> >
> > Would be nice to receive feedback on items earlier,
> > such as the comment phase. From what I understand,
> > some companies are interested in finishing this
> > standard(:>), so prompt review would be helpful!
>
> David I do recognize the importance of getting
> comments in a timely fashion. However, I hope
> this is not used as a deterrent to discourage
> valid concerns while the Standard is still
> in progress.
>
> I will try to be prompt next time. :-)
>
>
> Here are my comments:
>
> a) CRC stomping has very little to do with protocol
> compliance. For example, it is not on par with
> requirements such as frame format etc.
> I am at a loss to understand why this is being
> made a requirement. At a minimum, it must be
> a suggested hint with full disclosure of caveats
> (such as those I raised in my comments).
It does affect protocol compliance as it
affects the error counts reported by the MAC. MACs also
need to be consistent in terms of how they report errors.
>
> b) Location of failed links is best left to layer 1
> interfaces. For example, loss-of-link or BER in SONET.
>
> A distinguished architect such as yourself would agree
> with me that it is never a good idea to overload
> gratuitous functions on established methods when
> we are fully aware that we are actually reducing
> the probability of error logging and causing implementation
> problems.
>
> Even if we agree that layer 1 methods may not be sufficient
> to detect failed links; we have other established layer 2
> mechanisms like keep-alive messages to detect failed links/
> nodes.
We need to be able to isolate problems when the physical layer, such
as Ethernet, does not support OAM.
>
> c) The probability of correct CRC (undetected error) in the
> presence of error is irrelevant to the CRC stomping discussion
> because with or without stomping the probability of undetectable
> error is the same for both methods.
The relevance to the discussion is that the probability of an errored
frame resulting in a STOMPED CRC should not be any worse than the
probability of an undetected error.
>
> d) The real issue is the probability of not-logging given an error
> in data frame (i.e., probability of CRC being checkStomp).
> Your claim that the probability is 1/2**32 is not correct on two
> counts:
>
> 1) It assumes an equally likely error model with probability
> of bit error = 1/2. The actual error rate on the links
> is much lower and estimating this probability with a known
> bit error rate is NP-Hard.
>
> 2) Your expression assumes unconditional probability. What we
> are interested in is conditional probability. That is,
> given a data frame error what is the probability that
> the computed CRC is equal to checkStomp value. My claim
> is that it could be very HIGH. For example, consider a
> single bit error in a data frame of length L bits.
> Depending on the position of this single bit error and the
> chosen STOMP value the conditional probability could be
> ONE. I can generate a long list of STOMP values that are
> catastrophically BAD for the standard CRC-32 polynomial
> for single bit errors. The list for double-bit errors is
> quadratically longer. By the way, these catastrophic
> STOMP values are also functions of packet length. This
> compounds the problem of finding good STOMP values.
What is the probability that a single bit error results in an undetected
CRC error? I don't believe the undetected error rate of the CRC algorithm
is ever 1 for single bit errors in the frame. Even if it is, the average
undetected error rate should be substantially lower, and the error rate
for stomped frames would be of the same magnitude.
Your argument in 2) assumes that the STOMP value is a constant.
That is not the case. The stomp algorithm appends an inverted
version of the expected CRC to the frame.
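
One way to examine the single-bit case under the inverted-CRC
scheme is to look at the XOR of the CRCs before and after an
error (called the syndrome here). For a fixed frame length this
XOR depends only on which bits were flipped, not on the payload.
An error goes undetected when the syndrome is zero and would be
mistaken for a stomp when the syndrome is all ones. The sketch
below only counts both cases for single-bit payload errors; it
assumes zlib.crc32 as the CRC, a check field that arrives intact,
and hypothetical helper names.

```python
import zlib

MASK = 0xFFFFFFFF


def syndrome(payload: bytes, bit: int) -> int:
    """XOR of the CRCs before and after flipping one payload bit."""
    damaged = bytearray(payload)
    damaged[bit // 8] ^= 1 << (bit % 8)
    return (zlib.crc32(payload) ^ zlib.crc32(bytes(damaged))) & MASK


def scan_single_bit_errors(length_bytes: int):
    """Count single-bit payload errors that look good or look stomped."""
    payload = bytes(length_bytes)  # contents do not affect the syndrome
    undetected = 0
    false_stomp = 0
    for bit in range(length_bytes * 8):
        s = syndrome(payload, bit)
        if s == 0:
            undetected += 1   # recomputed CRC still matches the check field
        elif s == MASK:
            false_stomp += 1  # recomputed CRC is the inverse of the check field
    return undetected, false_stomp
```

The undetected count is always zero, since a CRC detects every
single-bit error. The false-stomp count for a given frame length
is whatever scan_single_bit_errors() reports; this sketch does not
assume a value for it.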
>
>
>