I recently ran into some internal buzz about Oracle’s 72-port ‘top-of-rack’ switch announcement and it piqued my interest, so I started taking a look. Oracle selling a switch is definitely interesting on the surface, but then again they did just purchase Sun for a bargain-basement price, and Sun does make hardware, pretty good hardware at that. Here is a quick breakdown of the switch:
Size | 1RU
Port Count | 72x 10GE or 16x 40GE
Oversubscription | None (fully non-blocking)
L3 Routing | Yes
DCB | No
FCoE | No
Price | $79,200 list
Two three-letter words came to mind when I saw this: wow, and why. Wow is definitely in order, I mean wow! Packing 72 non-blocking 10GE ports into a 1RU switch chassis is impressive, very impressive. I’m dying to get a look at the hardware. Now for the why:
Why does Oracle think they can call a 72-port switch a top-of-rack switch? A 1RU form factor doth not a ToR make. Do you have 72 10GE ports in a rack in your data center? This switch is really a middle-of-row or end-of-row switch. Once you move it into that position you’ve got some cabling to think about: roughly $1,000 or so times two per link for optics, another couple hundred for that nice long cable, multiplied across 72 links, plus the cost of running and maintaining those cables… think ‘Holy shit Batman, my $79,200 ToR switch just became a $200,000+ EoR switch with a different management model from the rest of my shop.’
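To put rough numbers on that, here’s a minimal back-of-the-envelope sketch using the figures from the paragraph above; the optic and cable prices are my rough assumptions, not vendor quotes.

```python
# Rough "ToR becomes EoR" cost math using the assumptions above (not vendor quotes).
SWITCH_LIST_PRICE = 79_200   # Oracle's 72-port switch, list price
LINKS = 72                   # one long run per access port
OPTIC_COST = 1_000           # assumed cost per optic; two optics per link
CABLE_COST = 200             # "another couple hundred" for the long cable run (assumed)

cabling = LINKS * (2 * OPTIC_COST + CABLE_COST)
total = SWITCH_LIST_PRICE + cabling

print(f"Optics and cabling for {LINKS} EoR runs: ${cabling:,}")  # $158,400
print(f"Switch plus cabling: ${total:,}")                        # $237,600 -> the '$200,000+' figure
```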
Why does Oracle think there is a need for full non-blocking bandwidth on every access layer port? Is anyone seriously driving sustained 10GE on multiple devices at once, anyone? You’ve got two options in switching and only one actually makes sense: you either reduce cost and implement oversubscription in hardware, or you pay for full-rate hardware that is still oversubscribed in your network designs because you aren’t using 1:1 server-to-inter-switch links. Before deciding how much line-rate bandwidth you really need, do yourself a favor and look at your I/O profile across a few servers for a week or two, as sketched below. If you’re like the majority of data centers you’ll find that you’ll be quite fine with 8:1 or even higher oversubscription with 10GE at the access layer.
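If you want to run that check, something like the following is enough. The function and the sample traffic numbers are hypothetical; the point is simply to compare measured sustained load against the uplink capacity you would actually buy.

```python
# Minimal sketch of the oversubscription sanity check: compare the hardware
# ratio you'd design for against what your measured server I/O really needs.
# The sample numbers below are made up for illustration.

def oversubscription_check(servers, port_gbps, sustained_gbps_per_server, uplink_gbps):
    hw_ratio = (servers * port_gbps) / uplink_gbps                  # designed oversubscription
    utilization = (servers * sustained_gbps_per_server) / uplink_gbps
    return hw_ratio, utilization

# 32 servers on 10GE access ports, 4 x 10GE uplinks, ~1.2 Gbps sustained each (hypothetical)
ratio, util = oversubscription_check(32, 10, 1.2, 40)
print(f"Designed oversubscription: {ratio:.0f}:1")         # 8:1
print(f"Uplink utilization at measured load: {util:.0%}")  # 96%
```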
Why would I want to buy a 10GE switch today that has no support for DCB or FCoE? Whether you like it or not, FCoE is here; both Cisco and HP are backing it strongly, with products shipping and more on the way. Emulex and QLogic are both on their second generation of Converged Network Adapters (CNAs); see my take on Emulex’s, known as the OneConnect adapter (http://www.definethecloud.net/?p=382). The standards are all ratified, and even TRILL is soon to be ratified to provide that beautiful Spanning-Tree-free utopian network you’ve dreamed of since childhood. If I’m an all-NFS or iSCSI shop maybe this doesn’t bother me, but if I’m running Fibre Channel there is no way I’m locking myself into 10GE at the access layer without IEEE-standard DCB and FCoE capabilities in the hardware.
What it really comes down to is that this switch is meaningless in the average enterprise data center. Where this switch fits and has purpose is in specialized multi-rack appliances and clusters. If you buy a multi-rack system or cluster from Oracle, this will be one option for connectivity. With any luck they won’t force you into this switch, because there are better options.
Thanks to my colleague for helping me out with some of this info.
Kudos: I do want to give Oracle kudos on the QSFP, which is the heart of how they were able to put 72 10GE ports into a 1RU design. The QSFP is a 40GE port that can optionally be split into four individual 10GE links. It’s definitely a very cool concept and will hopefully see greater industry adoption.
How to build the 10GE network of your dreams:
One of the things I love about the Oracle 10GE switch is that it highlights exactly what Cisco is working to fix in data center networking with the Nexus 5000 and 2000.
Note: Full disclosure and all that jazz, I work for a Cisco reseller and as part of my role I work closely with Cisco Nexus products. That being said I chose the role I’m in (and the role chose me) because I’m a big fan and endorser of those products not the other way around. To put it simply, I love the Nexus product line because I love the Nexus product line, I just so happen to be lucky enough to have a job doing what I love.
So now, stepping off my soapbox and out of disclosure mode, let’s get to the ‘what the hell is Joe talking about’ portion of this post.
In the diagram above I’m showing two Nexus 5020s in green at the top and 10 pairs of Nexus 2232s connected to them. What this creates is a redundant 320-port 10GE fabric with two points of management, because the Nexus 2000 is just a remote line card of the Nexus 5000. All of this comes with two other great features: latency under 5 microseconds and FCoE support. Additionally, this puts a 2K at the top of each rack, allowing ToR cabling while keeping all management and administration at the 5K in the middle of the row. Because the system also supports Twinax cabling, there is a cost savings of thousands of dollars per rack over fiber cabling to a ToR or EoR chassis. There is not another solution on the market that comes close to this today. All of this at a 4:1 oversubscription rate at the access layer. If you’re willing to oversubscribe a little more, you could actually add two more redundant Nexus 2000 pairs for another 64 ports, capping at 384 ports.
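Here’s the port math behind those numbers as a quick sketch; the 32 host ports and 8 fabric uplinks per Nexus 2232 are as described above, and the only price used is Oracle’s published list.

```python
# Port-count sketch for the Nexus 5020 + 2232 fabric described above.
HOST_PORTS_PER_2232 = 32    # 10GE server-facing ports on each Nexus 2232
UPLINKS_PER_2232 = 8        # 10GE fabric uplinks -> 32/8 = 4:1 at the access layer

def redundant_ports(pairs_of_2232):
    # Servers dual-home to a 2232 pair, so each pair yields 32 redundant ports.
    return pairs_of_2232 * HOST_PORTS_PER_2232

print(redundant_ports(10))  # 320 -> the design in the diagram
print(redundant_ports(12))  # 384 -> two more pairs, at a bit more oversubscription

# For comparison, two of Oracle's switches at list price:
print(f"${2 * 79_200:,}")   # $158,400
```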
This entire solution comes in at or below the price of two of Oracle’s switches, before you even consider the cost savings on cabling.
Summary:
I don’t believe Oracle’s 72-port switch has a market in the average data center. It will have specialized use cases, and it is quite an interesting play. The best thing it has to offer is the QSFP, which will hopefully gain some buzz and vendor support thanks to Oracle.
I guess I am shocked that it doesn’t support DCB/FCoE – perhaps this is how they are able to achieve non-blocking performance at such great density. Look at Arista: they have their ultra-dense DC switch but at least support DCB. These are clearly two companies who do not believe in FCoE. Their customers are the same folks who think they have a “SAN” and a “unified fabric” just because they are running multi-path NFS and iSCSI.
Does ultra-dense ToR/EoR switching matter? Maybe. To your point, Joe, show us the use cases. I can only think of one, and I think it is filled (nicely) by InfiniBand…
I definitely agree with you, Mike; this switch seems like a ‘what were they thinking’ type of device. Anything I can’t do with other 10GE network switches from proven vendors, I can do with InfiniBand.
Perhaps I am misunderstanding your figures, but 300+ 10Gb ports for around $150k is significantly cheaper than the Cisco quote I had for the same number of 1Gb ports last month. Ultimately they lost the bid for over 1k edge ports and a 10Gb core to Extreme Networks, and damn I’m glad they did, given the management gains the company has achieved (APIs really matter). Off topic though; mainly I’m just unbelievably surprised at your pricing compared to the pricing I received from Cisco, which claimed to have a 40%+ discount.
Scott,
I can definitely believe that, but it depends on the architecture you were looking at. I’m talking about the access layer only in the above post. If you were quoted Nexus 7000 or Catalyst you’d have been paying more, but also getting a great deal more for aggregation/distribution work.
As far as your pricing on Extreme goes, I don’t doubt it came out preferential. Cisco is not in the market of discount switching; they invest quite a bit into R&D and driving new standards and network advancements, and they build their own ASICs (hardware chips) and equipment. When you purchase Cisco networking gear there will typically be some additional up-front cost for the features, interoperability, and reliability you get.
To paraphrase a customer quote: ‘I’ve purchased non-Cisco devices in the past based on the price difference and learned my lesson the second I needed to utilize a feature that wasn’t available because of my decision.’
I’d love to understand the architecture you worked with if you’re interested; shoot me an email: joe (at) definethecloud.net.
In either event, I truly appreciate the feedback and the fact that someone is reading 😉
From the perspective of a hosting ISP, FCoE and DCB aren’t interesting features, but another thing is: due to the use of the QSFP connectors, this 1RU switch actually consumes an additional 2RU for “breakout panels”. In the end, you just need some space for a number of SFP+ ports. Regarding the device in general, it becomes interesting if we get server racks 72RU high 😉
P.S.: Truly, TRILL is a wet dream becoming real 😉
Malte,
Excellent points in regard to ISPs; in most cases they won’t have any need for FCoE and DCB. I was also unaware that this 1RU switch becomes 3RU due to QSFP breakout panels; at that point you could easily do 60 SFP+ ports in a 3RU form factor without the complexity and risk of 4:1 breakout optics.
I’ll definitely have to retract this post when we start using 72RU racks; that will make this ToR switch mighty appealing 😉 The biggest racks I’ve worked with were custom 12-footers with about 60U, so we still have a ways to go.
Thanks for reading and the input!
Joe