Hugh – I agree with you on the points below. Expanding the scope of EEE to include new PHYs (whether you call them “subset” or not, you still have to define framing, line codes, startup, link monitoring, and electrical specs, just as for a new PHY) would mean a large amount of standards work and market confusion, and would likely produce interoperability problems given the lack of broad PHY expertise in the group. I also believe that the complexity of expanding the scope of EEE to include “subset” PHYs has been understated, and the benefits overstated. Detail to support this is below.

First of all, what is being called a
subset is little more than using common PAM signaling and the same baud rate. Framing, coding, slicing, and decision protocols would all be changed. The power savings come from modifying or bypassing parts of the signal processing, and the parts that are bypassed will need some kind of modification to track or freeze their states while the PHY is active. It’s a bit of an overgeneralization to call this a “subset”; it’s certainly not a proper subset, since at least framing and slicing are different.

Furthermore, the benefits are overstated. In the revised presentation, “powell_2_0507”,
Slide 5 (Comparison of Possible Solutions, 1000BT-10GBT) the graphs
misrepresent the data and likely scenarios. At this point, I’ll
give the authors the benefit of the doubt and assume it’s a graphical
error. Note the graph on the left (the “Switching Time” chart): the note says that the blue “fast start” would be 20msec, but the bar (which is what most people look at to judge relative performance) extends all the way up OVER 100msec. Likewise, the “Subset PHY”, still TBD, has an
estimated time of less than 2 usec, even though the PHY has a latency up to
2.5usec, and a simple check of the link (PCS_TEST) requires looking at errors
over a 125usec interval. As discussed before, determining that you
have a valid connection before reestablishing the link is vital, or corrupted packets will go through. A PCS test state simply looks to see if the LDPC
decoder (which would have to be reengaged) is correctly synced and has a good
enough SNR to perform. In my technical opinion, a 2usec switching time would send garbage through the network; buffering for even 1msec would be preferable. I am running some lab experiments this
afternoon to check on adaptation time unrestricted by infofield signaling.
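To put rough numbers on the “less than 2 usec” claim, here is a back-of-envelope sketch in Python. Only the 2.5usec latency and the 125usec link-quality interval are figures from this discussion; the rest is simple arithmetic, not a measurement.

```python
# Back-of-envelope check of the "<2 usec" upshift claim, using only
# figures cited in this thread (a sketch, not a measurement).

LINK_LATENCY_US = 2.5      # one-way latency of the 10GBASE-T link
PCS_TEST_US = 125.0        # link-quality metric interval in 802.3an
CLAIMED_UPSHIFT_US = 2.0   # "Subset PHY" estimate on slide 5

# Any handshake needs at least a round trip, and a valid-link check
# needs at least one PCS_TEST interval before data is re-enabled.
lower_bound_us = 2 * LINK_LATENCY_US + PCS_TEST_US

print(f"round trip alone:        {2 * LINK_LATENCY_US} usec")  # 5.0 usec
print(f"minimum with link check: {lower_bound_us} usec")       # 130.0 usec
print(f"claimed upshift:         {CLAIMED_UPSHIFT_US} usec")
```

Even this generous floor, which ignores retraining entirely, is two orders of magnitude above the claimed figure.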
I expect results soon.

While these are fine technical points, I hope you can all keep your eyes on the big picture: this expansion of scope is vague and unwarranted. The points map directly onto the 5 criteria:

1) Broad Market Potential – proliferating new copper PHYs at these speeds would fracture the existing market and minimize the applicability. Since 802.3 doesn’t specify the power consumption of implementations, they would be difficult to distinguish.

2) Compatibility – the new types will present interoperability problems with existing copper PHYs, especially because they are so closely related. Beyond needing new management, they would inject a power management layer into the 802 hierarchy, and management by upper-layer protocols would break the current layering of the spec. The concept of subset PHYs would be difficult to envision within the current architecture of Auto-Neg without making each a new PHY type. By contrast, an EEE-enabled PHY (an existing PHY type with enhancements to PHY control) would easily be able to identify itself in Auto-Neg.

3) Distinct Identity – the new PHY types would differ only in the amount of time specified to transition. Power consumption is not a specified parameter, nor has the signaling line code ever been a determinant of distinct identity; media type or speed are what is usually used. The ability to transition without Auto-Neg would be the only identifying feature, and isn’t that a characteristic of whatever EEE switching protocol is used, not of the PHY type?

4) Technical Feasibility – I think I’ve stated the case above and in my prior email that this hasn’t been shown, or that the benefits have been overstated.

5) Economic Feasibility – replacing all PHY types with a noninteroperable new type, requiring new designs and a HUGE standards effort, is a tough pill to swallow economically. This has definitely NOT been shown for an EEE standard of expanded scope.

All of that said, I want to see EEE go
forward in July. I think it’s important. But, I think it’s important as a build-on or make-from of the existing Ethernet architecture, not as an opportunity to redo every PHY decision made in the past few decades.

-george

From: Hugh Barrass [mailto:hbarrass@CISCO.COM]

George,

First, let me express my regret at not being able to get to the meeting. Secondly, let me express my regret at not being able to hear and actively speak for or against the proposals for changing the objectives or considering “subset PHYs” at the meeting.

Based on my knowledge of PHYs, experience in
standards projects in IEEE and elsewhere, and with the limitation of not having heard these discussions directly, it is my belief that the proposal for expanding the objectives, coupled with a discussion of “subset PHYs”, would lead to a long-delayed and problematic EEE project, without any great benefits.

The discussion on changing the objectives seems to
make much hay of “vagueness”. However, in my opinion, the
proposed remedy of replacing the speed changes with “a lower power
mode” makes the result ambiguous, essentially opening the door to
anything that someone could argue is lower power (and 802.3 standards
don’t specify the power of devices either, and traditionally this is an area of much disagreement between vendors and their implementations). Furthermore, this change has the potential to make EEE as complicated as starting 4 new PHY projects at once – something it would be difficult to see the industry having the bandwidth for (even though it would be a great job-protection program for guys like me ;-) ).

While Pat has raised some important arguments about
required transition times that I think need to be addressed, I believe that we could incorporate a link to QoS or a shorter switching time with far less pain. The subset-PHY pain is considerable, and I don’t believe the authors have fully considered what training and transitions would need to occur with a subset PHY. Some more technical detail is below, but let me cut to the
chase – a new PHY is unnecessary if you solve the startup fine-training and link-test problem. Any subset PHY would have to solve this anyway on the upshift transition. As someone in this discussion who has brought up robust 10GBASE-T PHYs, I can tell you that whether you turn FEXT, NEXT and echo filters on and off, change PAM levels, or disable FFEs to save power – even if you cycle the pairs, which has its own problems – you will still have to occasionally fine-train and almost always re-check the link integrity. The easiest way to do that was to reuse the PHY control diagram in 10GBASE-T, and that was the source of my analysis. If 10-20msec is too long, as Pat
suggests, then we’d have to do something else. The main sources of the longer switching time come
from the use of the existing PHY control transition mechanism in 802.3an for
transition states in fine-retraining and testing the high-rate link on the
upshift transition. The lion’s share of this process is driven by the
infofield signaling in the 802.3an standard. Any faster transition would require a reworked signaling scheme anyway, so why not use it for upshifting from an existing, well-defined PHY, and have to deal with only about 10% of the standards work (which is still a lot, since it spans several PHYs)?

That’s the gist of it. What follows below
are some specific examples of where the “subset PHY” arguments go
awry. First, the discussion on complexity of shifting using
existing PHYs is overblown. Nobody in their right mind wanting to downshift from 10GBASE-T to 1000BASE-T would add a math coprocessor just for that purpose; it’s easier to simply decimate and train a
response. After all, we’ve all been told (and some have seen) that
1000BASE-T trains very rapidly, and beginning from a close target response should
get it there even faster. Keeping a prior stored state should be
sufficient. First-time training would take longer, of course, but that
will be true regardless. Speaking as someone who’s spent a lot of time
in the lab getting many different PHY types working, I’m afraid that the
thinking on the subset PHY proposals themselves seems a bit oversimplified, and
lacks practical experience. A simple example of this is the “pair
cycling”. The wire has memory, which is LONG in terms of symbol
times. Thus, you’d either have to wait for things to clear for up
to 1 usec or re-enable the echo and NEXT cancellers… Another example
is that the discussion seems to believe that you can switch on a frame
boundary. Frame switching would need to be synchronized at the PHY level,
which would require out-of-band signaling – something there isn’t
significant room for in the link. And then there’s the fact that the link has a latency of 2.5usec (hence a 5usec round trip), so single-digit microseconds are out. And then there’s the issue of
how the upshift is accommodated. Re-entry requires re-synchronization and
testing of the LDPC link. That was the MAIN reason for entering into the
10GBASE-T PHY control state machine and taking a 10 msec hit.

As mentioned before, the 10-20msec doesn’t come from the fact that you’re using 1000BASE-T as the low-speed state, but rather from the re-use of the 10GBASE-T PHY control to do 2 things: 1) fine-tuning the 10GBASE-T receive path from a state that isn’t sufficient for PAM16 transmission (this would have to be done when upshifting from a subset PHY as well – PAM4 training is different from PAM16 training, and simplex transmission, without FEXT, will lead to changes in the coefficients of the NEXT cancellers, because the receive filter response will drift). And, 2)
synchronizing and testing the LDPC decoder to ensure a good link before data is
re-engaged. Testing the link itself takes many frames of LDPC data, and,
if you use the link quality metric in the 802.3an standard, that in itself costs
125usec. However, the main
cost is really using Gottfried’s infofields and the protocol already
developed for managing the transitions, and the PHY control machine in the
standard. These require state transitions to take hundreds of usec.
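To illustrate where the milliseconds go, here is a rough model in Python. Only the “hundreds of usec per transition” figure comes from the discussion above; the transition count of 30 and the 50usec reworked-scheme figure are my own assumed round numbers, not from 802.3an.

```python
# Rough model of the 10-20 msec upshift budget (illustrative numbers:
# the per-transition cost of "hundreds of usec" is from this thread;
# the transition count is an assumed round figure, not from 802.3an).

INFOFIELD_TRANSITION_US = 500   # "hundreds of usec" per state transition
N_TRANSITIONS = 30              # assumed: fine-retrain + LDPC test states

total_ms = INFOFIELD_TRANSITION_US * N_TRANSITIONS / 1000.0
print(f"infofield-driven upshift estimate: {total_ms:.1f} msec")  # 15.0 msec

# A reworked (non-infofield) handshake at an assumed 50 usec per
# transition would cut the same state sequence by 10x:
REWORKED_TRANSITION_US = 50
reworked_ms = REWORKED_TRANSITION_US * N_TRANSITIONS / 1000.0
print(f"with faster signaling:             {reworked_ms:.1f} msec")  # 1.5 msec
```

The point of the sketch is that the state sequence is the same either way; only the per-transition signaling cost changes.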
If you simply develop a new transition scheme (which is much easier than defining an entirely new “subset PHY”), you can reduce the time substantially. This is the right place to focus if you really want to
change the upshift transition time.

In summary, changing the objective language to “a lower power mode” makes it VERY ambiguous and nebulous, and doesn’t define speeds. It broadens the scope of EEE, making it likely unmanageable and more likely to produce market confusion. I would not support that. I further do not think that including “subset PHYs” solves any problems; it creates a whole set of new ones, including a workload like 802.3an’s, but defining multiple PHY types at one time. What I DO support
is the effort to put some numbers on the switching time. You have made an
interesting case that the time has to be under 10msec, and probably near 1msec in the case of certain applications. I’m not sure that any rational switching algorithm would drop the speed if there were a constant video feed or a constant set of VoIP calls running. (And, if it’s just one call, does anyone really care if it warbles a bit…?)
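For the sake of discussion, a rate-control policy of that kind might look like the following sketch. All names and thresholds here are hypothetical, not taken from any standard or proposal.

```python
# Hypothetical rate-control policy sketch: hold the high-speed state
# while QoS-sensitive traffic (video, VoIP) is active, and downshift
# only after an idle period long enough to amortize the switching time.
# Names and thresholds are illustrative only.

def should_downshift(active_flows, idle_ms, idle_threshold_ms=100):
    """Return True only if no QoS-sensitive flow is up and the link
    has been idle long enough to justify the transition cost."""
    qos_sensitive = {"video", "voip"}
    if any(flow in qos_sensitive for flow in active_flows):
        return False
    return idle_ms >= idle_threshold_ms

print(should_downshift({"video"}, idle_ms=500))  # False: video feed running
print(should_downshift(set(), idle_ms=500))      # True: idle long enough
print(should_downshift(set(), idle_ms=10))       # False: not worth switching
```

Note that a policy like this lives above the PHY, which is exactly why the question of where it gets standardized matters.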
However, we do need to be talking with the appropriate application developers
to allow the EEE switching to be sensitive to QoS. This raises the entire question of WHERE, and in what body, the rate-control algorithms will be developed. That is where the application problems Pat raises should be solved – not by redefining entirely new PHYs, which would take years and end up with things that look nice from 40,000 feet but don’t look so great when you get close.

-george

George A. Zimmerman
Founder & CTO, PHY Technologies
Solarflare Communications
Tel: 949-581-6830 x2500
Cell: 310-920-3860