
Re: [STDS-802-3-400G] one comment about 4x100G breakout RE: IEEE P802.3bs 400 Gb/s Ethernet Task Force Logic Ad Hoc



Duwenhua,

 

100G uses KR4 FEC. When we write standards for 100GBASE-ER, 100GBASE-LR, 100GBASE-FR, and perhaps 100GBASE-SR, these will likely be the final standards for 100G Ethernet, as we have with 10GBASE-ER/LR/SR for 10G Ethernet. Are you sure 100G KP4 is sufficient, and that when we come to writing these standards we won't want something different?

 

I agree with the premise that 100G KP4 FEC should be sufficient for up to 2 km applications, at least with 100G PAM-4. Perhaps it is best to first prove that 100G KR4 FEC is insufficient, since we will have that available for potential use.
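For concreteness, the two codes being weighed in this thread are Reed-Solomon codes over 10-bit symbols: KR4 is RS(528,514) and KP4 is RS(544,514), as specified in IEEE 802.3bj. A minimal sketch of the parity overhead and correction strength each one buys:

```python
# Sketch comparing the two RS FECs discussed in this thread.
# KR4 = RS(528,514), KP4 = RS(544,514), both with 10-bit symbols (802.3bj).
def rs_summary(name, n, k):
    t = (n - k) // 2                  # correctable symbols per codeword
    overhead = n / k - 1              # rate overhead of the parity symbols
    return (name, t, round(overhead * 100, 2))

print(rs_summary("KR4 RS(528,514)", 528, 514))  # t=7,  ~2.72% overhead
print(rs_summary("KP4 RS(544,514)", 544, 514))  # t=15, ~5.84% overhead
```

KP4 roughly doubles the correctable symbols per codeword at a bit more than twice the parity overhead, which is the coding-gain-versus-rate tradeoff behind the question above.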

 

A successful architecture, including the PCS and FEC, should be able to go over different electrical or optical interfaces, such as 16 lanes wide, 8 wide, 4 wide, and maybe even a single lane, and at the other end back out to 4, 8, or 16 lanes wide; perhaps even more choices. When we added FEC to 100G Ethernet, the 100G KR4 FEC, we imposed a constraint that everything be only 4 lanes wide, and any use of the 802.3ba gearbox is no longer possible. This is the chief thing against which we wish the 400G architecture to be resilient.

 

Perhaps a focus on a new 100G FEC that likewise does not preclude such resilience should be highlighted.

 

Jeff

 

 

From: Duwenhua [mailto:duwenhua@xxxxxxxxxxxxx]
Sent: Tuesday, June 30, 2015 9:00 PM
To: Jeffery Maki; STDS-802-3-400G@xxxxxxxxxxxxxxxxx
Subject: RE: [STDS-802-3-400G] one comment about 4x100G breakout RE: IEEE P802.3bs 400 Gb/s Ethernet Task Force Logic Ad Hoc

 

Hi Jeff,

 

The FEC architecture for breakout into 100G KP4 FEC has already been discussed in the 802.3bs group. I think it is a potential trend to use KP4 FEC to cover future 100G PMDs, for example 100G PAM4 per lane.

From the implementation perspective of a 3.2T or 6.4T logic chip with many 400GbE ports, supporting 100G KP4 FEC will help protect investment.

4x100G breakout is very important at the spine switch and core switch positions (excluding ToR):

       Port density: it increases the number of 100GbE ports from 32 to 64, i.e., 16xCDFP2 breaking out to 64 ports of 100G

       Flexibility: ports can be dynamically changed between 400GbE and 4x100GbE, plug and play

 

The following document is from David and John (you are a supporter). I copy some statements here:

http://www.ieee802.org/3/bs/public/15_01/ofelt_3bs_01a_0115.pdf


 

Thank you!

Du Wenhua

 

From: Jeffery Maki [mailto:jmaki@xxxxxxxxxxx]
Sent: Wednesday, July 01, 2015 2:38 AM
To: Duwenhua; STDS-802-3-400G@xxxxxxxxxxxxxxxxx
Subject: RE: [STDS-802-3-400G] one comment about 4x100G breakout RE: IEEE P802.3bs 400 Gb/s Ethernet Task Force Logic Ad Hoc

 

Duwenhua,

 

The problem I see in moving forward with your proposal to define "4x100G FEC" for 400G Ethernet is that we do not have all of the 100G serial PMDs defined at this time to assess whether "100G KP4 FEC" meets the need. A single lane of 400GBASE-PSM4 would be supported as a form of 100G serial. Is that really all that will be needed, though? Does the 100G KP4 FEC coding gain meet all of the potential other needs? We may find ourselves with a FEC solution that, when the time comes, does not fully meet the need.

 

Jeff

 

From: Duwenhua [mailto:duwenhua@xxxxxxxxxxxxx]
Sent: Friday, June 26, 2015 8:25 PM
To: STDS-802-3-400G@xxxxxxxxxxxxxxxxx
Subject: Re: [STDS-802-3-400G] one comment about 4x100G breakout RE: IEEE P802.3bs 400 Gb/s Ethernet Task Force Logic Ad Hoc

 

Hi, Paul

       Thanks for your explanation.

1)       First, it is good news that you agreed with us: ToR does not need 16x25GE 400G-SR16 breakout.

--One reason you mentioned: ToR switches are more related to the use of copper cables (DACs)

--Another reason: ToR switches do not have 400GbE ports

2)       Then you talked about EoR and MoR, and you made a conclusion: CDFP 16x25G breakout is a good choice for EoR and MoR.

-          Your reason: the CDFP form factor offers the highest density for 25G ports (higher than QSFP28)

 

       But I do not agree with your conclusion about EoR and MoR, because:

1)       QSFP28 4x25G breakout is the winner for EoR and MoR: a) same density; b) lower cost (high volume); c) more mature

--a) Same density: both QSFP28 and CDFP can break out 128 ports of 25G. A 19-inch line card faceplate holds 8 CDFP, so you get 8*16 = 128 ports of 25G; it holds 32 QSFP28, so you get 32*4 = 128 ports of 25G.

--b) QSFP28 has lower cost: high volume leads to low cost. QSFP28's volume is 10 times bigger than CDFP's. Many other protocols choose QSFP28: FC, InfiniBand…

--c) More mature: some QSFP28 4x25G breakout products have already been released: http://investor.finisar.com/releasedetail.cfm?releaseid=903065

2)       EoR and MoR are not the mainstream in the data center

--ToR is the mainstream in the data center. ToR uses copper cable, and copper cable is cheaper than optical cable plus transceivers.

--Both Facebook and Google have papers discussing switches in the data center. They think ToR is less expensive per port than MoR and EoR, because EoR and MoR switches are very big and very complex while a ToR switch is a 1-RU pizza box.

--Facebook OCP:  http://www.opencompute.org/wiki/Networking    Google's networking:  http://cseweb.ucsd.edu/~vahdat/papers/hoti09.pdf

--Even if some other providers or OTT companies need EoR and MoR, they can choose QSFP28 4x25G breakout :-)
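The same-density claim in point a) above can be sketched as simple arithmetic; the per-line-card module counts (8 CDFP, 32 QSFP28 on a 19-inch faceplate) are the figures asserted in this email, not independently verified:

```python
# Faceplate arithmetic for 25G breakout density (module counts from this email).
def ports_25g(modules, lanes_per_module):
    # each 25G lane breaks out as one 25G port
    return modules * lanes_per_module

cdfp_ports   = ports_25g(8, 16)    # 8 CDFP x 16 lanes each
qsfp28_ports = ports_25g(32, 4)    # 32 QSFP28 x 4 lanes each
print(cdfp_ports, qsfp28_ports)    # 128 128
```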

 

So, after receiving so much valuable feedback, I think my initial 3 conclusions are still correct:
 1) For 4x100G breakout, the FEC should be able to change dynamically: 4x100G KR4 or 4x100G KP4.
 2) 4x100G breakout is very important in data center switches at the spine switch and core switch positions, because of faceplate port density and flexibility.
 3) 8x50G and 16x25G breakout is not important, because only ToR switches need 50GbE/25GbE, and ToR switches do not have 400GbE ports.

 

                                                                  Du Wenhua

 

From: Kolesar, Paul [mailto:PKOLESAR@xxxxxxxxxxxxx]
Sent: Saturday, June 27, 2015 12:48 AM
To: STDS-802-3-400G@xxxxxxxxxxxxxxxxx
Subject: Re: [STDS-802-3-400G] one comment about 4x100G breakout RE: IEEE P802.3bs 400 Gb/s Ethernet Task Force Logic Ad Hoc

 

Dan,

Thanks for explaining. 

 

Form factor is important for two main reasons:

1)      board real estate consumed per port, and

2)      commonality of form to support PMD change flexibility.

 

Reason 1) applies almost universally across functional platforms (switches, NICs, routers).  Reason 2) is important for switches offering port flexibility, for example the ability to change from DAC to AOC to pluggable fiber solutions.

 

The CDFP form factor offers the highest density for 25G ports, so the 400G-SR16 PMD which fits within the CDFP is the winner on reason 1). 

 

To your point on reason 2), the CDFP is not likely to be preferred over the more common forms like SFP or QSFP, which are established to accommodate many different PMDs.  But there are switch designs not requiring server-side port flexibility.  Switches that are purpose built for fiber include those with embedded optics and those designed for MoR and EoR placements.  

 

I now realize that I need to correct myself for using ToR incorrectly to refer to server (access) switches in general.  I should have used the term “access” which covers ToR, MoR (middle-of row) and EoR (end-of-row) switch placements.  This gets to Du’s point about ToR switches being more related to the use of copper cables (DACs), and also embraces the use of fiber for MoR and EoR.  However for MoR or EoR access switch server ports, the need to change the PMD is small because DACs cannot accommodate the reach, and SM PMDs are overkill.  Accommodation of AOCs would be lacking, but AOCs are generally not deployed for reaches longer than 20m, which is restrictive for EoR placements.  Thus the CDFP housing the 400G-SR16 PMD seems like a good choice.

 

That same form & PMD combination can also provide 4x100G-SR4 uplinks that allow an increase in radix, uplink redundancy, and/or higher uplink aggregate rates (by delivering 2x200G). 

 

For these reasons I continue to see good cause to employ 400G-SR16, even if it is not for true 400G interfaces.  That means we should also be considering accommodation of the FEC issues that allow it to be a fully functional solution in all of these scenarios.  Whether that impacts the 802.3bs project or the 802.3by project may be a matter of scope between them.  But we should not be casting those considerations aside.

 

Paul

 

From: Dan Dove [mailto:dan.dove@xxxxxxxxxxxxxxxxxx]
Sent: Thursday, June 25, 2015 8:24 AM
To: Kolesar, Paul; STDS-802-3-400G@xxxxxxxxxxxxxxxxx
Subject: Re: [STDS-802-3-400G] one comment about 4x100G breakout RE: IEEE P802.3bs 400 Gb/s Ethernet Task Force Logic Ad Hoc

 

Hi Paul,

I guess the question is form factor. Supporting 400G in a 16x25G breakout format will impact density.

For TOR where 25G serial is a likely speed for the highest density, with 100G being the likely uplink speed, we can expect QSFP-28 to dominate the volume. This enables a 25G down, 100G up implementation. It also allows Brad's 2x50G (50G being 2x25) per port and potentially 200G up or 2x100G up per QSFP when 50G line rates occur.

For enterprise, this is all over-kill for many years to come but will allow a smooth migration from 10G to 25G to 50G without ever getting out of SFP/QSFP form factors.

Dan Dove
Chief Consultant
Dove Networking Solutions
530-906-3683 - Mobile

On 6/25/15 5:57 AM, Kolesar, Paul wrote:

Dan, Brad, Ali, Du,

I think there may be a fundamental disconnect in our discussion.  Here you all seem to be viewing the introduction of 400G-SR16 PMD in ToR switches as simultaneously delivering 400G uplinks.  While it certainly is a 400G interface, it also has at least two additional utilities:  1) as a 4x100G-SR4 interface and 2) as a 16x25G interface.  It is not necessary for ToR switches to offer 400G uplinks in order for the 400G-SR16 transceiver to be useful for cost reduction of the 25G server-side ports.  This is a wider and faster extension of the already-proven-successful deployment of 40G-SR4 as 4x10G, or the use of 100G-SR4 for 4x25G.  I am sure this is already obvious to you, but I need to make this distinction for clarity. 

 

If 25G servers will exist side by side in the market with 10G servers and 50G servers, then switches optimized for each make sense.  Each of these rates will have its own significant lifespan, serving the various needs of the diverse market. It may be that 100G-SR4 is the go-to solution for increasing density and decreasing cost for 25G server-side switch ports because it is a mirror of 40G-SR4 deployments.  If that is the case, please explain why integration of 4 ports takes preference over integration of 16.  Does it come back to Joel’s implementation issues, or is it something else?

 

Regards,

Paul

 

From: Dan Dove [mailto:dan.dove@xxxxxxxxxxxxxxxxxx]
Sent: Wednesday, June 24, 2015 7:35 PM
To: Kolesar, Paul; STDS-802-3-400G@xxxxxxxxxxxxxxxxx
Subject: Re: [STDS-802-3-400G] one comment about 4x100G breakout RE: IEEE P802.3bs 400 Gb/s Ethernet Task Force Logic Ad Hoc

 

Hi Paul,

I agree with Brad, Ali on this. While "25G is an important and lasting lane rate", it will rapidly shift to 50G serial rates when they become available at an approximate cost (1.5x or less) for higher volume applications.

It's all about timeframes.

I agree with your point about Microsoft versus the broader market. It sounds like they are driving server I/O bandwidth faster than the enterprise and will be for a long time to come. So you aren't missing the target on volume demand, but as for timeframe, I think that by the time 50G I/O hits the enterprise, 50G serial will be mainstream.

That said, the enterprise market is not going to be putting 400G on ToR switches for a "long time". 4x50G QSFP-28 is a more likely solution in that market space and in that timeframe, and thus 16x25G for ToR applications doesn't make sense. And this 400G project is the first of more to come, so pushing a 400G breakout to servers makes no sense.

The original question of supporting KP4 FEC on a 400G port in case 100G PAM-4 needs the FEC is a good one IMO. It's a tough one as well. Bet on the come? Bet on the pass? I think we can assume that future projects for 400G can address it in higher-density, lower-cost timeframes. I would bet on the pass for this project. Leave it to the next-generation 400G project.

Dan Dove
Chief Consultant
Dove Networking Solutions
530-906-3683 - Mobile

On 6/24/15 4:44 PM, Kolesar, Paul wrote:

Brad,

I am not trying to dispute what you already know as an employee of Microsoft.  But the 50G you speak of is multi-lane, meaning 2x25G.  This only emphasizes my point that 25G is an important and lasting lane rate because it has a lot of utility.  Just because 50G servers may exist does not mean 25G servers disappear.  That certainly has not been the case with 1G servers in the wake of 10G servers.  The redesign of switches is not only driven by lane rate increases, but also by cost reductions.  There will remain a lot of customers interested in non-bleeding edge solutions even as hyperscale DCs move upward.  So while your view may be spot on for Microsoft, that does not mean it reflects the total market.  As a recurring theme it has become obvious that one size increasingly fails to fit all.

 

Regards,

Paul

 

From: Brad Booth [mailto:bbooth@xxxxxxxx]
Sent: Wednesday, June 24, 2015 5:52 PM
To: STDS-802-3-400G@xxxxxxxxxxxxxxxxx
Subject: Re: [STDS-802-3-400G] one comment about 4x100G breakout RE: IEEE P802.3bs 400 Gb/s Ethernet Task Force Logic Ad Hoc

 

There will be 50 GbE on the server before there is 400G on the TOR and even before 400G task force gets to working group ballot. :-) It won't be 50G on a single lane though, but it will be 50G Ethernet.

 

Also, by the time 400G is on the TOR, there will be end users considering 100G (again, not on a single lane) to the server.

 

Cheers,
Brad

 

On Wed, Jun 24, 2015 at 1:27 PM, Ali Ghiasi <aghiasi@xxxxxxxxx> wrote:

Paul 

 

I said "by the time server and silicon technology catch up to where 400 GbE is integrated on the ToR"!  In one or two silicon generations, when 400 GbE is integrated on the ToR, you will also see 50 GbE on the server.

 

We are developing 25GbE because 10G is not sufficient for a segment of the market and 25 GbE costs less than the alternative, 40 GbE.  The same economics will apply to 50 GbE later on!

 

Thanks,

Ali Ghiasi

Ghiasi Quantum LLC

 

 

 

On Jun 24, 2015, at 12:31 PM, Kolesar, Paul <PKOLESAR@xxxxxxxxxxxxx> wrote:

 

Ali,

In order for 50G lanes to make sense for ToR switches, servers must be able to use that rate.  I think the market will put 25G servers to several years of use before 50G servers overtake them.  Witness the time lag for 10G servers to overtake sub-10G servers.  10G is now in its prime with 25G up and coming.  If this is not true, why are we bothering to develop 25GE?

 

Paul

 

From: Ali Ghiasi [mailto:aghiasi@xxxxxxxxx]
Sent: Wednesday, June 24, 2015 1:59 PM
To: Kolesar, Paul
Cc: STDS-802-3-400G@xxxxxxxxxxxxxxxxx
Subject: Re: [STDS-802-3-400G] one comment about 4x100G breakout RE: IEEE P802.3bs 400 Gb/s Ethernet Task Force Logic Ad Hoc

 

Paul 

 

Initial applications of 400 GbE will follow the early 100 GbE deployments: SMF ports on routers and spine switches.  Du Wenhua's statement is correct that 400 GbE will not be integrated on the ToR initially.  By the time server and silicon technology catch up to where 400 GbE will be integrated on the ToR, we will have 50 GbE on a single lane of SMF/MMF. 

 

So 16x25G MMF has limited life/application and will soon be obsolete with the emergence of the 50G ecosystem.

 

Thanks,

Ali Ghiasi

Ghiasi Quantum LLC

 



 

On Jun 24, 2015, at 7:25 AM, Kolesar, Paul <PKOLESAR@xxxxxxxxxxxxx> wrote:

 

Du Wenhua,
Thanks for your thoughts.  This is a valuable discussion.  

I do not disagree with your points on KP4 and KR4 FEC at 100G, but I have a different view on your conclusion point 3). I think 16x25G and 8x50G breakouts can play a role in ToR switch ports facing the servers.  A port that delivers 16 lanes of 25G can provide transceiver cost advantages compared to discrete transceivers. It also provides electrical signal routing advantages, because all electrical lanes can be of nearly the same minimal length rather than spread out to reach ports across the entire switch faceplate; this alleviates the need for repeater chips, saving cost and power.

Regards,
Paul Kolesar

-----Original Message-----
From: Duwenhua [mailto:duwenhua@xxxxxxxxxxxxx]
Sent: Wednesday, June 24, 2015 3:36 AM
To: STDS-802-3-400G@xxxxxxxxxxxxxxxxx
Subject: [STDS-802-3-400G] one comment about 4x100G breakout RE: IEEE P802.3bs 400 Gb/s Ethernet Task Force Logic Ad Hoc

Hi, Martin

400GE FEC breakout (langhammer_02_0615_logic.pdf, page 5) said:
1x400G KP4 ASIC: 55645
1x400G KP4 & 4x100G KR4 ASIC: 61518
  11% area increase
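A quick check of the quoted figures (the area numbers are taken from the slide cited above; the units are whatever the slide used):

```python
# Sanity check of the ~11% area increase quoted from langhammer_02_0615_logic.pdf.
base  = 55645   # 1x400G KP4 ASIC
combo = 61518   # 1x400G KP4 & 4x100G KR4 ASIC
increase = (combo - base) / base
print(f"{increase:.1%}")   # 10.6%, consistent with the ~11% quoted
```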

My comments: 
 1) What is the size of a 1x400G KP4 & 4x100G KP4 FEC?
 2) Today 100GbE chooses KR4, but I think breakout 100GbE will need KP4 FEC if the 100GbE SMF PMD chooses 100G PAM4 in the future.
     --In the long term, serial is better than parallel. For 100GbE, a 100G PAM4 SMF PMD is better than a 4x25G SMF PMD.
 3) The same idea can be seen on page 29 of http://www.ieee802.org/3/bs/public/14_05/maki_3bs_01a_0514.pdf : a QSFP28 has a DSP & FEC which converts 4x25G to 1x100G.

My conclusions:
 1) For 4x100G breakout, the FEC should be able to change dynamically: 4x100G KR4 or 4x100G KP4.
 2) 4x100G breakout is very important in data center switches at the spine switch and core switch positions, because of faceplate port density and flexibility.
 3) 8x50G and 16x25G breakout is not important, because only ToR switches need 50GbE/25GbE, and ToR switches do not have 400GbE ports.

                         --Du Wenhua

-----Original Message-----
From: Mark Gustlin [mailto:mark.gustlin@xxxxxxxxxx]
Sent: Saturday, June 20, 2015 8:09 AM
To: STDS-802-3-400G@xxxxxxxxxxxxxxxxx
Subject: Re: [STDS-802-3-400G] IEEE P802.3bs 400 Gb/s Ethernet Task Force Logic Ad Hoc

All,

Presentations are now up on the logic ad hoc web site from today's ad hoc call:
http://www.ieee802.org/3/bs/public/adhoc/logic/index.shtml

Thanks, Mark

-----Original Message-----
From: Mark Gustlin
Sent: Monday, June 15, 2015 10:59 AM
To: STDS-802-3-400G@xxxxxxxxxxxxxxxxx
Subject: RE: IEEE P802.3bs 400 Gb/s Ethernet Task Force Logic Ad Hoc

All,

Just a reminder of the upcoming logic ad hoc call.
The current agenda is:
400GE FEC Implementation - Martin Langhammer
Considerations for breakout - Martin Langhammer
1x400G vs 4x100G FEC Implications - Bill Wilkie

Requests are due by end of day Wednesday. There is room for only one 
more presentation for this call, so the next request will get the last slot.
There is another opportunity on June 29th.

Thanks, Mark Gustlin


From: Mark Gustlin [mailto:mark.gustlin@xxxxxxxxxx]
Sent: Thursday, June 04, 2015 2:49 PM
To: STDS-802-3-400G@xxxxxxxxxxxxxxxxx
Subject: [STDS-802-3-400G] IEEE P802.3bs 400 Gb/s Ethernet Task Force Logic Ad Hoc

All,

This is an announcement of the next IEEE P802.3bs 400 Gb/s Ethernet 
Task Force Logic Ad Hoc conference call opportunity.

The next call will take place on Friday June 19th from 8am-10am PDT.

If you are interested in presenting, please request a timeslot by 
the end of Wednesday June 17th. A separate calendar invite will also follow.

In addition there will be another opportunity on Monday June 29th at
8am- 10am PDT.

Thanks, Mark Gustlin


-- Do not delete or change any of the following text. --

Join WebEx meeting
Meeting number: 492 414 237
Meeting password: four123!
Join by phone
Call-in toll-free number: 1-(877) 582-3182 (US)
Call-in number: 1-(770) 657-9791 (US)
Conference Code: 568 245 6486