
Re: [802.3BA] Longer OM3 Reach Objective




Brad,
I agree with you regarding the apples/oranges mix of the preceding argument.  To this I would add that if we had down-selected the 100BASE-T options to something other than TX, 100BASE-T may not have had the success it has enjoyed.  While we are good at defining the use of technology and writing rigorous specifications, our track record in choosing the optimal solution set is inconsistent.  Despite our best efforts, we can see instances of sub-optimal choices at every data rate.

10BASE- we have T, FP, FB, and FL, with great success for T and with FL taking the fiber market.  FP and FB died.

100BASE- we have T2, T4, and TX.  TX took the market, and T2 and T4 died.  We also have FX for multimode, but we did not standardize an SM interface (until EFM), forcing proprietary SM solutions to surface (some based on SM FDDI).

1000BASE- we have T, CX, SX, and LX.  T forms the biggest part of this market, scooping CX.  SX is the most successful fiber interface ever.  But LX only went to 5 km, forcing a number of longer-reach proprietary solutions, one of which is known as ZX.

10GBASE- we have the original four, some multiplied by two for LAN (R) and WAN (W): S, L, E, and LX4.  The WAN interfaces have not found much of a market and are probably dying or dead.  To these PMDs we later added CX4, T, and LRM.  LRM was touted as holding the key to opening up the installed-base cable plant, so it was permitted to be added despite the fact that it represented a second solution to the same problem LX4 was created to solve, in defiance of one of the five criteria that we hold so high.  As far as I can tell, LRM's promise has not materialized, and the market continues its shift to S over OM3 fiber.

This march toward S over OM3 is occurring despite the fact that customers are telling box vendors that they do not want to put in OM3 fiber.  They want a solution that allows them to reuse their old cable plant.  If I were a customer, that's what I'd be saying too.  But then reality hits when they are presented with the upgrade costs of the two scenarios.  If the expectation about cabling that was built in customers' minds, as they upgraded data rates from 10M to 100M to 1G using the same multimode cabling over and over again, held true for switches, routers, gateways, servers, and PCs, they would be asking the respective box vendors to increase their speed without any cost too.  But that expectation is borne only by the fiber cabling, because of its legacy of providing multiple generations of support in the past.  Some customers seeking that same upgradeability now look to singlemode fiber.  While this will provide an upgrade path, the reality is that it will not provide the lowest-cost system - by far.  Indeed, some customers seem to be coming to grips with this by building data centers in stages.

The point of this review of history and trends is that it reinforces my perspective that we cannot pick the optimal set with precision, because it is only through the rear-view mirror that we can resolve what is optimal.  The best we can do is choose what has a reasonable chance of providing value to the market.  Some will thrive and others wither.  But despite our failures, Ethernet dominates, ever gaining ground in multiple markets.  I am not saying that we should abandon our scrutiny, but rather that we should have the confidence that even if we pick sub-optimal PMD sets by selecting too many, the Ethernet ecosystem will flourish anyway, and the market will determine what it finds to be the best value.  This has been the case in Fibre Channel as well.

With this perspective, it would be foolish to select a set of PMDs that addresses only a portion of the data center environment cost-effectively.  Focusing on a distance short enough to be addressed with run-of-the-mill PMDs has its merits.  But that choice should not be held up as the only low-cost option if it fails to address the remainder of the space.  I and others have shown that a 100m PMD is inadequate to cover data center backbone lengths.  We already recognize this principle with singlemode by having separate objectives for 10km and 40km PMDs at 100G.  Vendors seem very willing to optimize the choices for the metro space.  Indeed, the analysis that we heard over the last few meetings, in which 3km cost and coverage were compared with 10km cost and coverage, went decidedly in favor of the 10km distance.  Here we chose performance over cost savings, because it would actually drive up cost if those needing reaches between 3km and 10km were forced to deploy the 40km PMD.  Yet for the data center, many seem to put on blinders to this same issue.  I don't disagree that we need a lowest-cost option.  But we also need one that, for slightly more cost, can cover nearly the entire data center extent; otherwise many customers, when faced with the cost of the 10km singlemode PMDs, will not deploy the speed upgrade we are standardizing.  They will instead deploy link-aggregated 10GBASE-S.

Some say that the 10GBASE-S specifications are overly stringent.  I must remind those with this mindset that it was envisioned during the height of the tech bubble to be deployed in LAN networks over building backbones, for which its 300m reach is appropriate.  And despite the fact that its deployment has been primarily in data centers, where the reach requirements are generally more modest, it is still the lowest-cost optical PMD by far, with an accelerating market share.  This is due in no small part to the multiple cost reductions brought by simplifying the device from its unnatural form as a XENPAK PHY to its purest state as an SFP+ PMD.  When we speak of low-cost optical PMDs for 40G and 100G, it is the parallel analog of the SFP+ that springs to mind.  The cost impact of remaining with this platform, but offering two performance grades differentiated by spectral-width requirements (i.e. requiring the higher grade to comply with the spectral width of 10GBASE-S to achieve more than double the distance), is minimal, because it delivers the aggregate volume of both performance grades to their common components.  These happen to represent 100% of the components, as the same design can yield both performance grades.  And they can be made interoperable to the capability of the lower grade.  In the singlemode PMD debates we have been hearing how sharing some of the same optics between the 10km and 40km PMDs will lower the cost of both.  This multimode PMD approach shares not just some of the components, but all of them.

I am not averse to other solutions, such as dispersion compensation, clock recovery, or forward error correction, to address nearly the full extent of the data center channel lengths, especially if a single low-cost PMD can be the result.  The dispersion compensation approach can open data eyes to be more resistant to jitter and dispersion.  Clock recovery resets jitter budgets, which are otherwise passed along from stage to stage.  And forward error correction can mitigate the effects of noise, such as that produced by mode partitioning, thereby reducing the effects of widened spectral width.  So it seems we have plenty of techniques at our disposal.  Certainly with all of these we can do better than to ignore providing our customers with low-cost solutions that adequately cover their distance needs in data centers.  Ignoring the customer's needs is the high-risk approach that holds the success of our efforts in the balance.  There can be no greater cost.
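As a rough, back-of-the-envelope illustration of the forward error correction point (a minimal sketch only, with generic placeholder code parameters rather than anything proposed to the task force), hard-decision FEC works by trading a worse raw error ratio for an acceptable post-correction error ratio, which is what lets it absorb extra penalties such as mode partition noise:

    from math import comb

    def uncorrectable_rate(p_sym, n, t):
        """Probability that a hard-decision block code of length n symbols,
        able to correct up to t symbol errors, sees more than t errors in a
        codeword, assuming independent errors at raw error ratio p_sym."""
        return sum(comb(n, i) * p_sym**i * (1 - p_sym)**(n - i)
                   for i in range(t + 1, n + 1))

    # Illustrative numbers only: a raw symbol error ratio of 1e-3 into an
    # (n = 255, t = 8) code leaves an uncorrectable-codeword probability on
    # the order of 1e-11 under this simple model, i.e. the link can tolerate
    # considerably more noise and still meet its error-ratio target.
    print(uncorrectable_rate(1e-3, 255, 8))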

Regards,
Paul Kolesar
CommScope Inc.
Enterprise Solutions
1300 East Lookout Drive
Richardson, TX 75082
Phone:  972.792.3155
Fax:      972.792.3111
eMail:   pkolesar@commscope.com



Ali Ghiasi <aghiasi@BROADCOM.COM>
03/25/2008 07:08 PM
Please respond to: Ali Ghiasi <aghiasi@BROADCOM.COM>
To: STDS-802-3-HSSG@LISTSERV.IEEE.ORG
cc:
Subject: Re: [802.3BA] Longer OM3 Reach Objective





Jonathan

I suggest you compare the PHY price premium for linear and limiting SFP+.
No one is proposing an LRM-like linear interface; we have proposed SLI, a "simple
linear interface".  Every 8 Gig FC port which supports copper cables will have
a 3-tap DFE.  Since most SerDes must have at least a 1-tap DFE to compensate for
8" of FR4 after the limiting receiver, going to a 3-tap DFE has not been a burden.

I agree that low cost is achieved by using a known low-cost technology, but lowest cost is
achieved only if you are open-minded and explore the possibilities rather than limiting yourself
to a 20-year-old interface concept.  1000BASE-T is a clear, vivid example of how complex
DSP was successfully applied to solve a very complex problem, currently at equal power
to the optics but with significantly lower cost than the optics.

Thanks,
Ali

Jonathan King wrote:

Sure, adding dollars and watts to the link can buy margin, but highest margin does not equate to lowest cost or best suitability for high-volume adoption (otherwise wouldn't we be using trans-oceanic terminal equipment for SR applications?).
Lowest cost is achieved by choosing known low-cost technologies to meet 90+% of the application market and ensuring any application spec doesn't try to squeeze the technology to its limits.
 
 
Jonathan King
Finisar Corp
1389 Moffet Park Drive
Sunnyvale, CA 94089
 
ph: 1 408 400 1057
cell: 1 408 368 3071
e-mail: jonathan.king@finisar.com
cube C127
 
-----Original Message-----
From: Brad Booth [mailto:bbooth@AMCC.COM]
Sent: Tuesday, March 25, 2008 7:28 AM
To: STDS-802-3-HSSG@LISTSERV.IEEE.ORG
Subject: Re: [802.3BA] Longer OM3 Reach Objective

 
That's definitely one way it can be interpreted.  There is some apples and oranges comparison going on there, and I'm not sure how the VG got added to the 802.3 mix. ;-)
 
But with optics, I would agree that the tougher choices haven't always been made, even when the writing was on the wall.  The biggest complaint with 10GbE was all the possible port types (thank you WAN interface sublayer).  The 10GBASE-SR PHY is doing well in the market though, and that's partially due to the fact that it is the only 850nm wavelength PHY for that space.  Interestingly enough though, implementations can be achieved with either linear or limiting components.  And, if you put a linear at one end and a limiting at the other, they will communicate.  
 
That's what I do like about Ali's proposal.  He has shown that it is possible to do 300m of MMF with a linear approach.  That indicates to the task force that there is more margin in a linear approach than in a limiting approach; therefore, having more margin to play with, the linear approach with a 100m MMF reach should be able to become the lowest-cost solution for the largest volume of the MMF market.  That's a huge benefit.  Rather than trying to push the limits and slowing the adoption curve, there is an implementation option which should make 100G MMF up to 100m a fiscally viable option.
 
Thanks,
Brad
 



From: Kevin Brown [mailto:kbrown@BROADCOM.COM]
Sent: Tuesday, March 25, 2008 12:25 AM
To: STDS-802-3-HSSG@LISTSERV.IEEE.ORG
Subject: Re: [802.3BA] Longer OM3 Reach Objective

"Let the market decide" was how we ended up with 100BASE-TX, instead of 100BASE-T4, 100BASE-T2, or 100BASE-VG.  The 802.3 working group did a poor job of making tough decisions and minimizing the number of options to be presented to the industry.
 
What a mess.
 
But I think 100BASE-TX is the most widely deployed of the various 802.3 interfaces.  There have been a few billion shipped so far.
 
KB
 
 



From: Brad Booth [mailto:bbooth@AMCC.COM]
Sent: Monday, March 24, 2008 3:19 PM
To: STDS-802-3-HSSG@LISTSERV.IEEE.ORG
Subject: Re: [802.3BA] Longer OM3 Reach Objective

"Let the market decide" is a really, really bad way to write a standard.  The IEEE 802.3 working group has done a very good job of making tough decisions and minimizing the number of options to be presented to the industry.  To create a reach objective that can only be satisfied by one implementation is a poor choice as it reduces the ability of component vendors to compete based upon their respective implementation strategies.  As the current objective is written, the reach is achievable with limiting and linear TIA's and may be achievable with lower cost components.
 
Just my 2 cents,
Brad
 
 



From: Ali Ghiasi [mailto:aghiasi@BROADCOM.COM]
Sent: Monday, March 24, 2008 4:58 PM
To: STDS-802-3-HSSG@LISTSERV.IEEE.ORG
Subject: Re: [802.3BA] Longer OM3 Reach Objective

Petar

Thanks for sending the pointer to the top 500 list and I do see the server at TJW.

In November 2007, 2 systems appeared in the TOP500 list.

Rank  System                            Procs   Memory(GB)  Rmax (GFlops)  Rpeak (GFlops)  Vendor
8     BGW eServer Blue Gene Solution    40960   N/A         91290          114688          IBM


They did not show a picture or how big the server is, but based on your remarks it is small enough to fit in a modest room.

I assume the intra-links within the Blue Gene might be proprietary or IB.  What do a clustering system's intra-links have to do with the Ethernet network connection?

I assume some of the users in the TJW lab may still want to connect to this server with higher-speed Ethernet, and very likely you will need links longer than 100 m.  In addition, higher-speed Ethernet may be used to cluster several Blue Gene systems for failover, redundancy, disaster tolerance, or higher performance, which will require links longer than 100 m.

We are both in agreement that parallel ribbon fiber will provide the highest density in the near future.  The module form factors with a gearbox will be 3-4x larger.  Here is a rough estimate of BW/mm (linear face plate) for several form factors (a sketch of the underlying arithmetic follows after the link below):

Speed      Media Sig.   Form Factor                          Bandwidth (Gb/mm)
10GbE      1x10G        SFP+ (SR/LR/LRM/Cu)                  1.52  (assumes stacked cages)
40GbE      4x10G        QSFP (SR or direct attach)           4.37  (assumes stacked cages)
40GbE      TBD          XENPAK-like (LR), if assumed         0.98
100GbE     10x10G       CSFP (SR or direct attach)           3.85  (proposed connector is already stacked)
100GbE     4x25G        CFP (LR)                             1.23

As you can see here, the form factors that allow you to go beyond 100 m will be several times larger and not compatible
with the higher-density solutions based on nx10G.  Linear nx10G, as given in

http://www.ieee802.org/3/ba/public/jan08/ghiasi_02_0108.pdf
can extend the reach to 300 m on OM3 fiber and relax the transmitter and jitter budget.
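To make the Gb/mm arithmetic reproducible, here is a minimal sketch.  The cage widths are my own rough assumptions (they are not stated anywhere in this thread) and are used only to show how figures of the kind in the table above can be derived:

    def faceplate_density(lanes, gbps_per_lane, width_mm, rows=1):
        """Aggregate bandwidth per mm of linear faceplate width;
        rows=2 models stacked (belly-to-belly) cages."""
        return lanes * gbps_per_lane * rows / width_mm

    # Approximate module/cage widths assumed purely for illustration:
    print(faceplate_density(1, 10, 13.6, rows=2))   # SFP+, stacked    -> ~1.5 Gb/mm
    print(faceplate_density(4, 10, 18.3, rows=2))   # QSFP, stacked    -> ~4.4 Gb/mm
    print(faceplate_density(4, 10, 41.0))           # XENPAK-like (LR) -> ~1.0 Gb/mm
    print(faceplate_density(10, 10, 26.0))          # CSFP-style       -> ~3.8 Gb/mm
    print(faceplate_density(4, 25, 82.0))           # CFP (LR)         -> ~1.2 Gb/mm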

You have stated strongly that you see no need for more than 100 m, but we have also heard from others who state
there is a need for MMF reaches beyond 100 m, especially if you have to change the form factor to go beyond
100 m!  Like FC and SFP+, we can define a limiting option for 100 m and a linear option for 300 m, and
let the market decide.

Thanks,
Ali

Petar Pepeljugoski wrote:


Frank,


You are missing my point. Even the best-case statistic, no matter how you twist it in your favor, is based on distances from yesterday. New servers are much smaller and require shorter interconnect distances. I wish you could come to see the room where the current #8 on the TOP500 list of supercomputers sits (Rpeak 114 TFlops); maybe you'll understand then.


Instead of trying to design something that uses more power and goes unnecessarily long distances, we should focus our effort on designing energy-efficient, small-footprint, cost-effective modules.


Regards,


Petar Pepeljugoski
IBM Research
P.O.Box 218 (mail)
1101 Kitchawan Road, Rte. 134 (shipping)
Yorktown Heights, NY 10598

e-mail:
petarp@us.ibm.com
phone: (914)-945-3761
fax:        (914)-945-4134


Frank Chang <ychang@VITESSE.COM>
03/14/2008 09:23 PM
Please respond to: Frank Chang <ychang@VITESSE.COM>
To: STDS-802-3-HSSG@LISTSERV.IEEE.ORG
cc:
Subject: Re: [802.3BA] Longer OM3 Reach Objective

Petar;

 

Depending on the source of link statistics, the 100m OM3 reach objective actually covers from 70% to 90% of the links, so 100m is not even close to 95% coverage.
   
 

Regards

Frank




From: Petar Pepeljugoski [mailto:petarp@US.IBM.COM]
Sent: Friday, March 14, 2008 5:09 PM
To: STDS-802-3-HSSG@listserv.ieee.org
Subject: Re: [802.3BA] Longer OM3 Reach Objective


Hello Jonathan,


While I am sympathetic with your view of the objectives, I disagree and oppose changing the current reach objective of 100m over OM3 fiber.


From my previous standards experience, I believe that all the difficulties arise in the last 0.5 dB or 1 dB of the power budget (as well as the jitter budget). It is worthwhile to ask module vendors how much their yield would improve if they were given 0.5 or 1 dB back. That last fraction of the budget is responsible for most yield hits, making products much more expensive.
I believe that selecting specifications that penalize 95% of the customers to benefit 5% is a wrong design point.
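To put the 0.5 dB or 1 dB comment in concrete terms, here is a minimal power-budget sketch with illustrative placeholder numbers (not values from any draft or proposal).  The unallocated margin left after channel loss and penalties is exactly what a tighter or looser specification takes from, or hands back to, the module vendor's yield:

    def link_margin_db(tx_oma_dbm, rx_sens_dbm, channel_loss_db, penalties_db):
        """Unallocated margin in a simple optical power budget: launch power
        (OMA) minus receiver sensitivity, less channel loss and the sum of
        the impairment penalties."""
        return (tx_oma_dbm - rx_sens_dbm) - channel_loss_db - penalties_db

    # Placeholder numbers: -1 dBm launch OMA, -9 dBm sensitivity, 2 dB of
    # fiber and connector loss, and 5 dB of dispersion/noise penalties leave
    # 1 dB of margin; an extra 0.5 dB anywhere in this chain is roughly 12%
    # more optical power for the implementer to work with.
    print(link_margin_db(-1.0, -9.0, 2.0, 5.0))   # -> 1.0 dB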

You make another point - that larger data centers have higher bandwidth needs. While it is true that bandwidth needs increase, what you fail to mention is that distance needs today are less than in previous server generations, since processing power today is much more densely packed than before.

I believe that 100m is more than sufficient to address our customers' needs.  

Sincerely.


Petar Pepeljugoski
IBM Research
P.O.Box 218 (mail)
1101 Kitchawan Road, Rte. 134 (shipping)
Yorktown Heights, NY 10598

e-mail:
petarp@us.ibm.com
phone: (914)-945-3761
fax:        (914)-945-4134

Jonathan Jew <jew@j-and-m.com>
03/14/2008 01:32 PM
Please respond to: jew@j-and-m.com
To: STDS-802-3-HSSG@LISTSERV.IEEE.ORG
cc:
Subject: [802.3BA] Longer OM3 Reach Objective



I am a consultant with over 25 years' experience in data center infrastructure
design and data center relocations, including in excess of 50 data centers
totaling 2 million+ sq ft.  I am currently engaged in data center projects for
one of the two top credit card processing firms and one of the two top
computer manufacturers.

I'm concerned about the 100m OM3 reach objective, as it does not cover an
adequate number (>95%) of backbone (access-to-distribution and
distribution-to-core switch) channels for most of my clients' data centers.


Based on a review of my current and past projects, I expect that a 150m or
larger reach objective would be more suitable.  It appears that some of the
data presented by others to the task force, such as Alan Flatman's Data
Centre Link Survey, supports my impression.

There is a pretty strong correlation between the size of my clients' data
centers and the early adoption of new technologies such as higher speed LAN
connectivity.   It also stands to reason that larger data centers have
higher bandwidth needs, particularly at the network core.

I strongly encourage you to consider a longer OM3 reach objective than 100m.

Jonathan Jew
President
J&M Consultants, Inc

jew@j-and-m.com

co-chair BICSI data center standards committee
vice-chair TIA TR-42.6 telecom administration subcommittee
vice-chair TIA TR-42.1.1 data center working group (during development of
TIA-942)
USTAG representative to ISO/IEC JTC 1 SC25 WG3 data center standard adhoc

 
