Hugh – I agree with you on the points below. Expanding the scope of EEE to include new PHYs (whether you call them “subset” or not, you still have to define framing, line codes, startup, link monitoring, and electrical specs, just as for a new PHY) would mean a large amount of standards work and market confusion, and would likely produce interoperability problems given the lack of broad PHY expertise in the group. I also believe that the complexity of expanding the scope of EEE to include “subset” PHYs has been understated, and the benefits overstated. Detail to support this is below.

First of all, what is being called a
subset is little more than using common PAM signaling and the same baud rate. Framing, coding, slicing, and decision protocols would all be changed. The power savings come from modifying or bypassing parts of the signal processing, and the parts that are bypassed will need some kind of modification to track or freeze their states while the PHY is active. It’s a bit of an overgeneralization to call this a “subset”; it’s certainly not a proper subset, since at least framing and slicing are different.

Furthermore, the benefits are overstated. In the revised presentation, “powell_2_0507”,
Slide 5 (Comparison of Possible Solutions, 1000BT-10GBT) the graphs
misrepresent the data and likely scenarios. At this point, I’ll
give the authors the benefit of the doubt and assume it’s a graphical
error. Note the graph on the left (the “Switching Time” chart): the note says that the blue “fast start” would be 20msec, but the bar (which is what most people look at to judge relative performance) extends all the way up OVER 100msec. Likewise, the “Subset PHY”, still TBD, has an
estimated time of less than 2 usec, even though the PHY has a latency up to
2.5usec, and a simple check of the link (PCS_TEST) requires looking at errors
over a 125usec interval. As discussed before, determining that you
have a valid connection before reestablishing the link is vital, or corrupted packets will go through. A PCS test state simply looks to see if the LDPC
decoder (which would have to be reengaged) is correctly synced and has a good
enough SNR to perform. In my technical opinion, a 2usec switching time would send garbage through the network; buffering for even 1msec would be preferable. I am running some lab experiments this
afternoon to check on adaptation time unrestricted by infofield signaling.
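To put rough numbers on the “less than 2 usec” claim, here is a back-of-envelope sketch in Python. Only the 2.5usec latency and the 125usec link-quality interval are figures from this discussion; the rest is simple arithmetic, not a measurement.

```python
# Back-of-envelope check of the "<2 usec" upshift claim, using only
# figures cited in this thread (a sketch, not a measurement).

LINK_LATENCY_US = 2.5      # one-way latency of the 10GBASE-T link
PCS_TEST_US = 125.0        # link-quality metric interval in 802.3an
CLAIMED_UPSHIFT_US = 2.0   # "Subset PHY" estimate on slide 5

# Any handshake needs at least a round trip, and a valid-link check
# needs at least one PCS_TEST interval before data is re-enabled.
lower_bound_us = 2 * LINK_LATENCY_US + PCS_TEST_US

print(f"round trip alone:        {2 * LINK_LATENCY_US} usec")  # 5.0 usec
print(f"minimum with link check: {lower_bound_us} usec")       # 130.0 usec
print(f"claimed upshift:         {CLAIMED_UPSHIFT_US} usec")
```

Even this generous floor, which ignores retraining entirely, is two orders of magnitude above the claimed figure.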
I expect results soon.

While these are fine technical points, I hope you can all keep your eyes on the big picture: this expansion of scope is vague and unwarranted. The points map directly onto the 5 criteria:

1) Broad Market Potential – proliferating new copper PHYs at these speeds would fracture the existing market and minimize the applicability. Since 802.3 doesn’t specify the power consumption of implementations, they would be difficult to distinguish.

2) Compatibility – the new types will present interoperability problems with existing copper PHYs, especially because they are so closely related. Beyond needing new management, they would inject a power management layer into the 802 hierarchy, and management by upper-layer protocols would break the current layering of the spec. The concept of subset PHYs would be difficult to envision within the current architecture of Auto-Neg without making each a new PHY type. By contrast, an EEE-enabled PHY (an existing PHY type with enhancements to PHY control) would easily be able to identify itself in Auto-Neg.

3) Distinct Identity – the new PHY types would differ only in the amount of time specified to transition. Power consumption is not a specified parameter, nor has the signaling line code ever been a determinant of distinct identity; media type or speed are what is usually used. The ability to transition without Auto-Neg would be the only identifying feature, and isn’t that a characteristic of whatever EEE switching protocol is used, not of the PHY type?

4) Technical Feasibility – I think I’ve stated the case above and in my prior email that this hasn’t been shown, or that the benefits have been overstated.

5) Economic Feasibility – replacing all PHY types with a noninteroperable new type, requiring new designs and a HUGE standards effort, is a tough pill to swallow economically. This has definitely NOT been shown for an EEE standard of expanded scope.

All of that said, I want to see EEE go
forward in July. I think it’s important. But, I think it’s important as a build-on or make-from of the existing Ethernet architecture, not as an opportunity to redo every PHY decision made in the past few decades.

-george

From: Hugh Barrass [mailto:hbarrass@CISCO.COM]

George,

First, let me express my regret at not being able to get to the meeting. Secondly, let me express my regret at not being able to hear and actively speak for or against the proposals for changing the objectives or considering “subset PHYs” at the meeting.

Based on my knowledge of PHYs, experience in
standards projects in IEEE and elsewhere, and with the limitation of not having heard these discussions directly, it is my belief that the proposal for expanding the objectives, coupled with a discussion of “subset PHYs”, would lead to a long-delayed and problematic EEE project, without any great benefits.

The discussion on changing the objectives seems to
make much hay of “vagueness”. However, in my opinion, the
proposed remedy of replacing the speed changes with “a lower power
mode” makes the result ambiguous, essentially opening the door to
anything that someone could argue is lower power (and 802.3 standards
don’t specify the power of devices either, and traditionally this is an area of much disagreement between vendors and their implementations). Furthermore, this change has the potential to make EEE as complicated as starting 4 new PHY projects at once – something it would be difficult to see the industry having the bandwidth for (even though it would be a great job-protection program for guys like me ;-) ).

While Pat has raised some important arguments about
required transition times that I think need to be addressed, I believe that we could incorporate a link to QoS or a shorter switching time with far less pain. The subset-PHY pain is considerable, and I don’t believe the authors have fully considered what training and transitions would need to occur with a subset PHY. Some more technical detail is below, but let me cut to the
chase – a new PHY is unnecessary if you solve the startup fine-training and link-test problem. Any subset PHY would have to solve this anyway on the upshift transition. As someone in this discussion who has brought up robust 10GBASE-T PHYs, I can tell you that whether you turn FEXT, NEXT and echo filters on and off, change PAM levels, or disable FFEs to save power – even if you cycle the pairs, which has its own problems – you will still have to occasionally fine-train and almost always re-check the link integrity. The easiest way to do that was to reuse the PHY control diagram in 10GBASE-T, and that was the source of my analysis. If 10-20msec is too long, as Pat
suggests, then we’d have to do something else. The main sources of the longer switching time come
from the use of the existing PHY control transition mechanism in 802.3an for
transition states in fine-retraining and testing the high-rate link on the
upshift transition. The lion’s share of this process is driven by the
infofield signaling in the 802.3an standard. Any faster transition would require a reworked signaling scheme anyway, so why not use it for upshifting from an existing, well-defined PHY, and have to deal with only about 10% of the standards work (which is still a lot, since it spans several PHYs)?

That’s the gist of it. What follows below
are some specific examples of where the “subset PHY” arguments go
awry. First, the discussion on complexity of shifting using
existing PHYs is overblown. Nobody in their right mind wanting to downshift from 10GBASE-T to 1000BASE-T would add a math coprocessor just for that purpose; it’s easier to simply decimate and train a
response. After all, we’ve all been told (and some have seen) that
1000BASE-T trains very rapidly, and beginning from a close target response should
get it there even faster. Keeping a prior stored state should be
sufficient. First-time training would take longer, of course, but that
will be true regardless. Speaking as someone who’s spent a lot of time
in the lab getting many different PHY types working, I’m afraid that the
thinking on the subset PHY proposals themselves seems a bit oversimplified, and
lacks practical experience. A simple example of this is the “pair
cycling”. The wire has memory, which is LONG in terms of symbol
times. Thus, you’d either have to wait for things to clear for up
to 1 usec or re-enable the echo and NEXT cancellers… Another example
is that the discussion seems to believe that you can switch on a frame
boundary. Frame switching would need to be synchronized at the PHY level,
which would require out-of-band signaling – something there isn’t
significant room for in the link. And then there’s the fact that the link has a latency of 2.5usec (hence a 5usec round trip), so single-digit microseconds are out. And then there’s the issue of
how the upshift is accommodated. Re-entry requires re-synchronization and
testing of the LDPC link. That was the MAIN reason for entering into the
10GBASE-T PHY control state machine and taking a 10 msec hit.

As mentioned before, the 10-20msec doesn’t come from the fact that you’re using 1000BASE-T as the low-speed state, but rather from the re-use of the 10GBASE-T PHY control to do 2 things: 1) fine-tuning the 10GBASE-T receive path from a state that isn’t sufficient for PAM16 transmission (this would have to be done when upshifting from a subset PHY as well – PAM4 training is different from PAM16 training, and simplex transmission, without FEXT, will lead to changes in the coefficients of the NEXT cancellers, because the receive filter response will drift). And, 2)
synchronizing and testing the LDPC decoder to ensure a good link before data is
re-engaged. Testing the link itself takes many frames of LDPC data, and,
if you use the link quality metric in the 802.3an standard, that in itself costs
125usec. However, the main
cost is really using Gottfried’s infofields and the protocol already
developed for managing the transitions, and the PHY control machine in the
standard. These require state transitions to take hundreds of usec.
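To illustrate where the milliseconds go, here is a rough model in Python. Only the “hundreds of usec per transition” figure comes from the discussion above; the transition count of 30 and the 50usec reworked-scheme figure are my own assumed round numbers, not from 802.3an.

```python
# Rough model of the 10-20 msec upshift budget (illustrative numbers:
# the per-transition cost of "hundreds of usec" is from this thread;
# the transition count is an assumed round figure, not from 802.3an).

INFOFIELD_TRANSITION_US = 500   # "hundreds of usec" per state transition
N_TRANSITIONS = 30              # assumed: fine-retrain + LDPC test states

total_ms = INFOFIELD_TRANSITION_US * N_TRANSITIONS / 1000.0
print(f"infofield-driven upshift estimate: {total_ms:.1f} msec")  # 15.0 msec

# A reworked (non-infofield) handshake at an assumed 50 usec per
# transition would cut the same state sequence by 10x:
REWORKED_TRANSITION_US = 50
reworked_ms = REWORKED_TRANSITION_US * N_TRANSITIONS / 1000.0
print(f"with faster signaling:             {reworked_ms:.1f} msec")  # 1.5 msec
```

The point of the sketch is that the state sequence is the same either way; only the per-transition signaling cost changes.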
If you simply develop a new transition scheme (which is much easier than defining an entirely new “subset PHY”), you can reduce the time substantially. This is the right place to focus if you really want to
change the upshift transition time.

In summary, changing the objective language to “a lower power mode” makes it VERY ambiguous and nebulous, and doesn’t define speeds. It broadens the scope of EEE, making it likely unmanageable and more likely to produce market confusion. I would not support that. I further do not think that including “subset PHYs” solves any problems; it creates a whole set of new ones, including a workload like 802.3an’s, but defining multiple PHY types at one time. What I DO support
is the effort to put some numbers on the switching time. You have made an
interesting case that the time has to be under 10msec, and probably near 1msec in the case of certain applications. I’m not sure that any rational switching algorithm would drop the speed if there were a constant video feed or a constant set of VoIP calls running. (And, if it’s just one call, does anyone really care if it warbles a bit…?)
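For the sake of discussion, a rate-control policy of that kind might look like the following sketch. All names and thresholds here are hypothetical, not taken from any standard or proposal.

```python
# Hypothetical rate-control policy sketch: hold the high-speed state
# while QoS-sensitive traffic (video, VoIP) is active, and downshift
# only after an idle period long enough to amortize the switching time.
# Names and thresholds are illustrative only.

def should_downshift(active_flows, idle_ms, idle_threshold_ms=100):
    """Return True only if no QoS-sensitive flow is up and the link
    has been idle long enough to justify the transition cost."""
    qos_sensitive = {"video", "voip"}
    if any(flow in qos_sensitive for flow in active_flows):
        return False
    return idle_ms >= idle_threshold_ms

print(should_downshift({"video"}, idle_ms=500))  # False: video feed running
print(should_downshift(set(), idle_ms=500))      # True: idle long enough
print(should_downshift(set(), idle_ms=10))       # False: not worth switching
```

Note that a policy like this lives above the PHY, which is exactly why the question of where it gets standardized matters.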
However, we do need to be talking with the appropriate application developers
to allow the EEE switching to be sensitive to QoS. This raises the entire question of WHERE, and in what body, the rate-control algorithms will be developed. That is where the application problems Pat raises should be solved – not by redefining entirely new PHYs, which would take years and end up with things that look nice from 40,000 feet but don’t look so great when you get close.

-george

George A. Zimmerman
Founder & CTO, PHY Technologies
Solarflare Communications
Tel: 949-581-6830 x2500
Cell: 310-920-3860