Infrastructure for Big Data

Notes from the Facebook Networking @Scale event at building 15, on the former Sun campus. 11 Feb 2015

Updated to add - Video for most of the talks is linked here. Slides are visible, but not published separately.


Omar Baldonado, with whom I worked at Netsys Technologies prior to their acquisition by Cisco Systems, was organizing. The event had excellent content, and was well worth the time.

Anees Shaikh, Google
Talked about what Google is doing with YANG. They need schemas for telemetry, and want a continuous real-time stream of network operational state changes. Working with IETF - the aim is to bring something that already works for operators to IETF, rather than specify it there. Want vendor-neutral models, with optional vendor additions. Building a model composition framework, from the point of view of 'what are all the components required to stand up a service' within a few hours, with very few people. This theme came up repeatedly in the FB talks later.


Michael Payne, R&D, Cloud Development, JPMorgan Chase

He showed an outline design for a network for low latency trading.
They run different networks for different types of trading transactions, and are seriously considering using microwave networks - 22 hops between Chicago and New York - to save 3 milliseconds round trip time (9 ms compared to 12 ms using fibre).
They care about and implement time stamps - NTP, IEEE 1588 Precision Time Protocol, and Pulse Per Second (PPS) for a tick of time distributed on coax, with a 10 MHz sine wave output.
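
The microwave versus fibre figures above can be sanity-checked with a rough propagation-delay calculation. This is only a sketch: the path length and refractive indices below are my assumptions, not numbers from the talk, and it ignores the extra route length and per-hop equipment delay that real links add.

```python
# Speed of light in vacuum, in km/s.
C_KM_PER_S = 299_792.458

# ASSUMPTIONS, not from the talk:
PATH_KM = 1150.0     # rough straight-line Chicago to New York distance
FIBRE_INDEX = 1.47   # typical refractive index of silica fibre
AIR_INDEX = 1.0003   # refractive index of air, for the microwave path


def round_trip_ms(path_km: float, refractive_index: float) -> float:
    """Round trip propagation delay in milliseconds over path_km."""
    one_way_s = path_km * refractive_index / C_KM_PER_S
    return 2 * one_way_s * 1000


fibre_rtt = round_trip_ms(PATH_KM, FIBRE_INDEX)
microwave_rtt = round_trip_ms(PATH_KM, AIR_INDEX)

print(f"fibre RTT     ~ {fibre_rtt:.1f} ms")
print(f"microwave RTT ~ {microwave_rtt:.1f} ms")
print(f"saving        ~ {fibre_rtt - microwave_rtt:.1f} ms")
```

Under these assumptions both results land a little below the 12 ms and 9 ms quoted in the talk, which is what you would expect once real route lengths and hop delays are included, and the saving is of the same order as the 3 ms quoted.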

JPMorgan is working on the next generation infrastructure design for IT services. They are particularly concerned about the ability to measure performance, and about a design that does so independently of the host systems, using packet brokers.

Najam Ahmad, Facebook
They redesigned their data centres for simplicity of implementation - 3 days, 2 people, added 120 boxes to an existing data centre with a design using the Wedge.

This is very close to the slide set from Yuval Bachar. Mentioned 3 months development time from design to power on for the Wedge, 4.5 months for the 6-pack (which is 12 Wedges, in 7 RU). 1.28 Tbps Broadcom switch chip (as confirmed by the Broadcom VP I talked to), 128 x 40 Gbps interfaces. This rapid turnaround, and the ability to implement new features incrementally on their own timescale, is one of the big drivers behind doing their own hardware and software development.

Artur Bergman, Fastly

Building a 'small object' CDN - so they use Akamai for big video streams. For high frequency requests, they've built out an infrastructure using Arista switches, doing BGP routing using BIRD. Uses Varnish.
Hardware per server (this was Aug 14): 2 x Intel 2690 v2 (Ivy Bridge) • 10 cores @ 3 GHz • 768 GB of RAM • 4 x 10 Gb Ethernet (Intel 82599EB) • 24 x 500 GB SSD (Intel 3500, Samsung 840 Pro)
Per rack: 16 servers • 12 TB RAM • 192 TB of SSD • 640 Gbit/sec
He "hates routers, hates F5"
Network designed with 4 times redundancy of server hardware, so that they don't have to debug on a server running live traffic - relocate the work, get to the failed thing "next Tuesday". Each pod has 4 x 10GE networks + IPMI + a management network (and a switch just for management traffic).
Relocatable MAC addresses on servers. See slide 29 onwards. Rack picture.
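
The hardware figures above are internally consistent if the first set is read as per-server and the second as per-rack. A quick check, as a sketch: the constant names are mine, and I am assuming RAM is counted in binary units (1 TB = 1024 GB) and SSD in decimal units (1 TB = 1000 GB).

```python
# Figures as quoted in the notes.
SERVERS_PER_RACK = 16
RAM_GB_PER_SERVER = 768
SSDS_PER_SERVER = 24
SSD_GB_EACH = 500
NICS_PER_SERVER = 4
NIC_GBIT_EACH = 10


def rack_totals() -> dict:
    """Aggregate the per-server figures up to one rack."""
    return {
        # Assumption: RAM quoted in binary terabytes.
        "ram_tb": SERVERS_PER_RACK * RAM_GB_PER_SERVER / 1024,
        # Assumption: SSD capacity quoted in decimal terabytes.
        "ssd_tb": SERVERS_PER_RACK * SSDS_PER_SERVER * SSD_GB_EACH / 1000,
        "network_gbit": SERVERS_PER_RACK * NICS_PER_SERVER * NIC_GBIT_EACH,
    }


totals = rack_totals()
print(totals)
```

The totals come out at 12 TB of RAM, 192 TB of SSD and 640 Gbit/sec, matching the rack figures in the notes.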

Albert Greenberg, Microsoft Research
Talked about and did a live demo (the only one at the event) of flow tables (Azure Tables) being movable - this talk had the least common context with the other talks. Azure is bringing on 10,000 customers a week. Has 19 regions, 600,000 servers or so per region.

Dave Temkin, Netflix

Talked about the physical infrastructure Netflix is using to build out its own CDN (not on AWS so much anymore). 3 Tbps in 6 racks. They were able to get 1 Tbps, 40 kW in Paris Telehouse, to the astonishment of everyone who said it'd been full for years - went there for the best crossconnect (and they are in the attic). Design to two power envelopes. Custom build 18 cable types. MTP cables. Have a 6.5 kW per rack footprint - would like to get to 9 kW per rack but most facilities can't support that. Using Arista 7500 switches, with one of 3 configurations depending on size at every PoP. They do no aggregation, and have no use for top of rack switches.

Peter Griess, Facebook
Talked about Mobile Proxygen
Monitoring and troubleshooting tools which are downloaded along with the FB app to the mobile client. Includes SPDY, Happy Eyeballs, DNS behaviour modifications.


Omar Baldonado, Networking Software Manager, Facebook

Extending SDN to the Management Plane
Anees Shaikh, Staff Network Architect, Google

Building Low-latency High Performance Trading Networks
Michael Payne, R&D, Cloud Development, JPMorgan Chase

Facebook Data Center Network
Najam Ahmad, VP Network Engineering, Facebook
Yuval Bachar, Hardware Engineer, Facebook
Alexey Andreyev, Network Engineer, Facebook

CDN Scaling Challenges
Artur Bergman, CEO, Fastly

Synchronous Geo-Replication over Azure Tables
Albert Greenberg, Distinguished Engineer and Director of Microsoft Azure Networking, Microsoft

Scaling the Netflix Global CDN, lessons learned from Terabit Zero
Dave Temkin, Director of Global Networks, Netflix

Mobile Networking Challenges
Peter Griess, Software Engineer, Facebook

Twitter @netatscale
There's a FB group, described as formed from the attendees.