Isn't this just a CDN? ISPs quite happily host Akamai and Google nodes -- are they being paid for maintenance of those? Empirically, from an old job, the answer is no (in fact, we had to petition for quite a while to get some Google boxen), but that's in a country at the tail end of a long, long pipe. Is it different in the US?
People don't know how CDNs work so it looks amazing.
Comcast and Verizon now claim that Akamai is indeed paying them. Perhaps the outcry over Netflix being extorted explains why these deals have remained secret for years.
What this article is describing sounds like a colocation agreement, which is a quite common service offered by ISPs and is typically paid for with a monthly rental fee calculated from storage size, dedicated bandwidth use, and power consumption.
It's not the same. If I want Verizon to host my server, they have nothing to gain, but I do. So, yes, they should charge me. If instead they host a CDN server for a big content provider, they save money on routing content within their network.
That's not always the case though. Sometimes rack space is more valuable to ISPs than improving bandwidth efficiency via CDN co-location deals. It really depends how much space you've got to spare and who else wants it.
Could someone explain to me (it's been a long day, and I'm no storage expert!) how these are useful? Wouldn't you need a whole bunch of these at each location where they're installed?
Just basing this on my own experience: I have 12 drives in RAID with a fairly substantial sequential throughput. Start multiple streams of high-bandwidth video and the maximum throughput from the drives drops sharply due to the random reads.
I would have assumed that having even 100 people streaming from a single box would be an interesting challenge.
These things can pump out 12 Gbit/s during peak viewing, which is enough for about 4,000 streams averaging 3 Mbit/s.
If they can cram in 256GB of DRAM, that's enough room for a 64MB buffer per stream, or about 170 seconds worth of streaming. Now you only need to be able to fill those buffers at a rate of about 24/second.
I'm assuming whatever file system is on the disks uses a massive block size, so the number of seeks you'd have to perform to pull 64MB off is probably pretty low. Eight? Sixteen? Even if it's the latter, that's only 384 seeks/second, which you could very plausibly do striped over only a half dozen disks, and the device presumably has many more.
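If my arithmetic holds up, the whole argument fits in a few lines (the 256GB of DRAM and the 16-seeks-per-refill figures are my assumptions from above, not published specs):

    # Napkin math for the buffering argument above.
    PEAK_OUT_BPS = 12e9        # 12 Gbit/s peak output
    STREAM_BPS = 3e6           # 3 Mbit/s average stream
    DRAM_BYTES = 256e9         # assumed 256 GB of DRAM
    SEEKS_PER_REFILL = 16      # pessimistic guess from above

    streams = PEAK_OUT_BPS / STREAM_BPS      # 4,000 concurrent streams
    buf_bytes = DRAM_BYTES / streams         # 64 MB buffer per stream
    buf_secs = buf_bytes * 8 / STREAM_BPS    # ~170 s of video per buffer
    refills = streams / buf_secs             # ~24 buffer refills/second
    print(f"{refills * SEEKS_PER_REFILL:.0f} seeks/s")  # ~375, spread over all disks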
I'm no expert either, but if my napkin math is right, fully utilized in a best-case scenario (no network overhead, etc.), those disks would each peak at around 30-40 MByte/s (36 disks, 10 Gbit/s NIC).
I'm not sure if the I/O is really all that random. Theoretically, it's all sequential, but might effectively hit the disk like random I/O because of concurrent streaming sessions, varying network speed and so on. My guess is you can get pretty close to sequential I/O with some simple means like using large block sizes in your RAID, well-tuned readahead and Native Command Queuing.
There's a bit of magic to it, tuning the OS to work well in conjunction with the hardware you have, but it's possible.
* Use 15k RPM SAS 12Gb/s drives instead of cheap consumer drives,
* Use 50 of them in a chassis instead of 12.
* Use RAID 0 instead of RAID 5 or 6 - I'll bet the quoted storage space is raw, not post-RAID.
* Have multiple copies of a show on disk, as the article states happens.
* Optimize for a read-heavy workload during peak hours (e.g., mount with noatime).
The devil is in the details, but what I listed above is probably a good starting point. The article states a data rate of 3GB/hour, which is only 0.83 megabytes/second; times 100 streams, that's only 83 megabytes/second, which is easy for the above configuration. Hell, I'll bet the above configuration could do 10,000 streams with no issues at that data rate, given Netflix's peak read-only workload.
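As a quick check of that arithmetic (the 3GB/hour figure is the article's; the rest follows from it):

    GB_PER_HOUR = 3                     # per-stream rate from the article
    mb_s = GB_PER_HOUR * 1000 / 3600    # ~0.83 MB/s per stream

    print(100 * mb_s)      # ~83 MB/s for 100 streams - trivial for 50 disks
    print(10_000 * mb_s)   # ~8,333 MB/s (~67 Gbit/s) for 10,000 streams -
                           # at that scale the NIC, not the disks, is the limit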
A 15000RPM SAS drive would be an _extremely_ poor choice for this type of workload. Power hungry, hot, extremely sensitive to vibration, very low data density, and, of course, ridiculously expensive.
What? While I can't argue with them being ridiculously expensive (because they very much are), why would their being power hungry, hot, and extremely sensitive to vibration even matter if they're going to be rack-mounted in an ISP's data center, with redundant power supplies and a data-center-grade cooling system to match?
The only possible issue of the ones you raised is low data density, but all engineering is a trade-off: 15k RPM drives get you better seek times, which would generally translate into supporting more viewers on a single box. Not working for Netflix, I don't know how many users they want to support per box.
It's totally possible that, between the number of streams they want to support and the total storage in the box, they've gone with SATA drives. But to pretend that 15k RPM SAS drives are "an _extremely_ poor choice" for an enterprise-grade storage system would be to ignore the fact that Netflix is making an enterprise-grade storage appliance - and that on the top end, those appliances commonly use SAS drives.
I will double down on my claim that 15k RPM SAS drives are a bad choice for any application; they are only used as a bandaid for marginal improvements to irredeemable system designs.
To address your points individually:
1) Power hungry. When you add in conversion and distribution and cooling, every watt consumed by the computer is consumed again by the datacenter infrastructure. Power costs money.
2) Hot is just the corollary of 1). Hot is also the enemy of density, and this box is very dense.
3) Sensitive to vibration. If you aren't intimately familiar with this fact, then you aren't getting the performance you paid for from your 15k disks. To achieve their spectacular claimed seek times, they need very careful mechanical design of their enclosure. Much more careful than racking up one of Netflix's boxes in a rack with other random vibrators.
4) Density. To get the space Netflix is using here, you need 5x as many of the more expensive 15k disks, because they top out at 600GB while the ones people actually use for these workloads hold 3TB.
A smart read-ahead strategy obviates the need for shiny seek time specs. For any given stream you could read ahead by 32MB or whatever. Now you've made seek time irrelevant. Put lots of RAM in the machine and you're done at a tiny fraction of the capital and operating costs of 15k disks.
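A minimal sketch of that read-ahead idea, assuming nothing about Netflix's actual implementation: keep a large per-stream window in RAM and refill it in one big sequential read, so seek time is amortized over the whole window.

    READ_AHEAD = 32 * 1024 * 1024   # 32 MB window per stream, per the comment

    class StreamBuffer:
        """Serve a viewer from RAM; touch the disk only in large sequential reads."""

        def __init__(self, path):
            self.f = open(path, "rb")
            self.buf = b""

        def read(self, n):
            if len(self.buf) < n:
                # One big read refills the whole window: the cost of a seek
                # is paid once per 32 MB, not once per client request.
                self.buf += self.f.read(READ_AHEAD)
            chunk, self.buf = self.buf[:n], self.buf[n:]
            return chunk

A real server would lean on the page cache, posix_fadvise, and sendfile rather than copying in userspace, but the trade is the same: RAM converts random reads into sequential ones.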
1) Power hungry: the total power envelope of a rack position is the limiting issue. You can fit 16-20 disks per RU, no problem. The constraint on total density is your 5kVA or 10kVA power budget. Watts matter.
4) Density: actually, 4TB is hitting the sweet spot for $ per byte, last time I looked. If you need absolute density, look for 5 and 6TB Real Soon Now. If you can tolerate some loss, variable-density disks around 5TB look quite a bit more cost-effective.
5) Read ahead: you only need to read ahead by a couple of chunks. For video, ballpark it at 2MB. You don't really need lots of RAM; think 32-64GB per chassis. Additionally, each 8GB DIMM costs about as much power as a disk. By going with 32GB instead of 64, I can fit another 2-3 disks into the power budget per chassis.
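The power trade-off in point 5, as napkin math (the per-device wattages and base overhead are my assumptions for illustration; only the 5kVA budget and the DIMM-equals-disk equivalence come from the comment):

    RACK_W = 5000    # 5kVA rack position, treating VA ~ W for napkin math
    DISK_W = 10      # assumed watts per nearline disk
    DIMM_W = 10      # per the comment: an 8GB DIMM ~ one disk
    BASE_W = 400     # assumed CPUs/NICs/fans overhead

    for dimms in (8, 4):                   # 64 GB vs 32 GB of RAM
        disks = (RACK_W - BASE_W - dimms * DIMM_W) // DISK_W
        print(f"{dimms * 8} GB RAM leaves power for {disks} disks")
    # Dropping from 64 GB to 32 GB frees power for ~4 more disks -
    # the same few-disks-per-chassis trade the comment describes.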
"Start multiple streams of high-bandwidth video and the maximum throughput from the drives drops sharply due to the random reads."
I think this can be fixed by prefetching harder (like 1-2 MB).
100 Netflix streams would be less than 100 MB/s which can be satisfied with a single SATA drive. And some of the Netflix boxes have dozens of SSDs that each do 500 MB/s.
Can anyone explain why an ISP would push back Netflix installing one of these boxes? Does the ISP want Netflix service congestion to better sell its own media assets/distribution (i.e. cable TV)?
Let's say I have something that costs me $100, and I know for a fact that (a) if I give it to you, it will make you $1000 richer and (b) you don't have access to any substitute, so you either pay the price I ask or go without.
This is what game theorists call a 'pure bargaining' [1] problem: reaching a deal creates $900 of surplus, but we have to decide how to split it between us. Both of us want as big a share as possible, but our main leverage is refusing to reach a deal.
In this situation, I am the ISP; installing a Netflix box costs me $100 for power, bandwidth, whatever. Netflix are the ones that make $1000, as installing the box lets them get more customers. ISPs refuse to install boxes because they want more money than Netflix is offering, and refusing to install is the only way to get it.
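As a toy model (assuming the symmetric case where both sides split the surplus evenly; real negotiations obviously need not land there):

    cost_to_isp = 100                # hosting the box: power, space, bandwidth
    value_to_netflix = 1000          # extra revenue from happier customers
    surplus = value_to_netflix - cost_to_isp       # $900 on the table

    # Symmetric Nash bargaining splits the surplus 50/50:
    isp_price = cost_to_isp + surplus / 2          # ISP charges $550
    netflix_keeps = value_to_netflix - isp_price   # Netflix nets $450

Each side's only way to move the split in its favor is the credible threat of walking away, which is exactly the standoff we're watching.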
Based on the blog posts by Level 3 -- how much internal capacity Comcast / Verizon say they have versus what Level 3 sees for utilization at their peering points with both ISPs -- I would have to assume "yes".
That's Brett Glass, though, who is … opinionated. It's a good idea to read the entire thread for responses from a number of other network operators as well as some Netflix technical staff:
It may very well be in their best interest to give Netflix free co-location space, but because of the PR battle this has become kind of a proxy war for bigger issues. ISPs now have a legitimate concern that if Netflix 'wins' the PR battle, they will have a long line of companies knocking on their door demanding peering arrangements and free co-location. It's also worth noting that ISPs could rent out 4RU of space in their facilities to other companies for way more than $0, and often do.
I'd be interested to find out more of hardware details of these devices. 100+ TB of storage in a 4U is pretty respectable. From the images in the article, it looks like the drives are not hot-swappable, so I'd guess Netflix is able to remotely track loss of redundancy and will just send out an entire replacement unit when needed.
For comparison, Supermicro makes one of the highest-density storage servers that I'm aware of[1]. 72 3.5" drives in 36 drive bays, so up to 288TB of raw storage, if you're brave enough to use 4TB drives.
From my understanding, these boxes are basically local copies of the entire Netflix online catalog - a catalog that frequently changes. How do these ~15,000 machines pick up new content? I assume they synchronize with the Netflix mothership every once in a while.
My question (without any intent to trivialize what Netflix is doing) is, isn't that just a good-old caching system?
But it makes me wonder if it would be possible to create a distributed, localized caching system? For example, if I watch The Avengers, my machine could be configured to cache a part (or all) of the movie. The more people watch it, the more it is distributed. Since Netflix can correlate location data with what movies are watched, they can target certain areas to cache certain movies more aggressively. When the next person in my neighborhood watches The Avengers, they would pull part of the movie from me (maybe the beginning until the rest buffers) and cache another part of it themselves for the next person in line. That way all the people in my neighborhood or adjacent neighborhoods don't have to pull the entire movie across the internet every time - instead, we help each other watch movies with higher quality.
On second thought, this may not be cheap considering the amount of development that may be required (and, at the end of the day, someone has to pay for this storage - although Netflix could, for example, offer personal mini-caching boxes in exchange for a monthly discount), but it's fun to dream about a sanctioned, decentralized, peer-to-peer movie service.
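A toy sketch of that neighborhood-cache idea (the peer/chunk model is entirely hypothetical): check nearby peers for a chunk before falling back to Netflix, and whoever fetches it becomes a source for the next viewer.

    ORIGIN = "netflix-origin"

    class Neighborhood:
        def __init__(self):
            self.peers = {}   # peer -> set of (title, chunk) cached locally

        def fetch(self, me, title, chunk):
            # Prefer a copy that's already across the street.
            for peer, held in self.peers.items():
                if peer != me and (title, chunk) in held:
                    source = peer
                    break
            else:
                source = ORIGIN   # first viewer pulls across the internet
            # The fetcher now caches the chunk for the next neighbor.
            self.peers.setdefault(me, set()).add((title, chunk))
            return source

    hood = Neighborhood()
    print(hood.fetch("alice", "The Avengers", 0))   # netflix-origin
    print(hood.fetch("bob", "The Avengers", 0))     # alice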
It _would_ be a perfect setup (much, much better than sending the same stream through the same lines just because two people are watching the same show).
The problem here is not technical (BitTorrent solves that); the problem is societal: for most of the content that is of interest to the population, you are not allowed to redistribute it. So Netflix has to DRM-protect the content for everyone, meaning that the actual content that transits on the line has to be different for each viewer, meaning that you can't redistribute it (since it's targeted to you, no one else could decode it).
If you want to see how well the technical side works, take a look at Popcorn Time.
As I wrote that comment, BitTorrent came to mind, with all the societal implications that carries (that's what I meant by "sanctioned" in the last sentence).
I haven't really explored Popcorn Time, but I'll take a look.
With regards to DRM, the content must be de-protected before you watch it - so if Netflix installed a mini-box on my network, they could decrypt it to let me watch the show, and encrypt it to store on the device before it is sent to someone else. The key itself could be retrieved from a centralized server.
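That's essentially envelope encryption. A sketch, using Fernet purely as a stand-in for whatever DRM scheme Netflix actually uses: encrypt the content once with a content key (so the cached bytes are shareable), and wrap only the small key per viewer.

    from cryptography.fernet import Fernet   # pip install cryptography

    # Encrypt the movie once; this blob is what neighbors cache and share.
    content_key = Fernet.generate_key()
    cached_blob = Fernet(content_key).encrypt(b"...movie bytes...")

    # Per viewer: wrap the content key under that viewer's own key,
    # handed out by the centralized licensing server at play time.
    alice_key = Fernet.generate_key()
    alice_license = Fernet(alice_key).encrypt(content_key)

    # Alice unwraps her license, then decrypts the shared cached blob.
    recovered_key = Fernet(alice_key).decrypt(alice_license)
    movie = Fernet(recovered_key).decrypt(cached_blob)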
If I think about it some more, there really is no need to keep an entire Netflix catalog in every location. As I mentioned, they have usage data and a strong prediction system too, so predicting the next show that will be watched in an area and even when would be "straight-forward" (In quotes because it wouldn't be easy for me to do it, but I think it would be easy for Netflix).
Another thing Netflix could try is pre-fetching shows during off-hours. That would work well for binge show watchers, and reduce peak time traffic.
This seems to be pretty close to what Popcorn Time does, and apparently does very well. Their site says they already select individual files, so I imagine a 'sanctioned' Netflix-style offering would be technically trivial to implement. This whole thing is just a reflection of the open-internet/what-went-wrong debacle, where all of this money that could be going to the content producers is instead spent on several layers of network, hardware, and technical service providers that a small group from Buenos Aires has demonstrated is unnecessary - if only we could actually use the network. But we can't.
A few weeks back, there was a job posting looking for someone with peer-to-peer experience to build such a prototype for Netflix, so it's definitely something they're considering. On the other hand, I don't know how people with data caps would feel about reseeding content for a corporation like Netflix.
If the ISPs are reclassified as common carriers, will they still be allowed to let Netflix install those boxes?
Presumably there is limited space in the ISP data center, so they cannot let everyone who wants to stick a box in there do so, so I'm wondering what methods a common carrier can use to decide which content providers can have access, and how this fundamentally differs from the "fast lanes" that common carrier status is supposed to prevent.
Maybe they can come up with a rule like the box must save 2 Gbps of peak hour bandwidth per U.
From the ISP's perspective they are hosting these boxes to save money, not improve performance. If performance happens to improve (ahem) that's a convenient side effect.
It's all one discussion in my mind. There are three ways to get Netflix: boxes, peering, or transit (from cheapest to most expensive). If peering saves the ISP money and the boxes save even more money, why are some ISPs refusing to use them?
Because some ISPs are trying to maximize their revenue by charging Netflix directly.
There are a few other legitimate reasons as well. Some ISPs dislike hosting hardware that they don't own in their data centre. Or it may be that an ISP has contracts in place that commit the ISP to certain bandwidth costs, so removing Netflix traffic may not actually save them money.
Why shouldn't an ISP charge Netflix to host a box in their datacenter? I'd charge you a fee if you wanted to store your DVDs in my house, even if you let me watch them.
On the other hand, if the ISP also runs a video service, they've got some very compelling reasons to do everything they can to not increase the quality of service for Netflix. They're green, and rectangular, and...
And the two are also not mutually exclusive. You can have an OpenConnect appliance that is populated via peering. Someone like Verizon/Comcast/etc. could add OpenConnect appliances and then peer with a single 10Gbps link vs. the multiple 10Gbps transit links they are using now.
" Each appliance stores a portion of the catalog, which in general is less than the complete content
library for a given region. Popularity changes, new titles being added to the service, and re-
encoded movies are all part of the up to 7.5TB of nightly updates each cache must download to
remain current. We recommend setting up a download window during off-peak hours when there is
sufficient free capacity."
"But wait, you say. That'd be all fine and dandy if Netflix was etched in stone, but it's not! They add and remove stuff all the time! That's right, which is why most OCAs take a 7.5TB update. Every. Freaking. Day.
"If you had to take a 7.5TB update on your home internet, you would be screwed. According to Ookla's Net Index, the average download speed in the United States is 18.6Mbps. So with an average connection, that 7.5TB would take you about 40 days to chew through.
"Fortunately, data centers tend to have a little more speed to work with. A firehose, compared to your bendy straw. So Netflix recommends a much shorter 10-hour span (from 2AM to noon, local time) where these boxes can soak up their updates with 1.2Gbps connections. That is to say, 'Google Fiber speeds.'"
I'm sure thought has been put into it. I just wonder why the OCAs aren't caches. I'd imagine a cache would be a lot more (mostly bandwidth) efficient for the ISPs and would scale much better.
I request a stream. It's proxied through the device. It's not present on the device. So the device requests the 1080p version from Netflix. It transcodes the stream it receives on the fly and sends me whatever quality I require, while writing both the original and my lower-quality version to disk.
I mean, maybe the limitation in that plan is the number of streams that could be transcoded simultaneously. So maybe the on-the-fly version isn't cached, because I may need to switch quality on the fly. Maybe jobs are just queued up for the 5 or so formats required, based on the 1080p download, and they can process on spare transcoders opportunistically.
Just interesting to think through it as a programming problem. I'm sure the people working on this are very smart. It just seems really heavy on the bandwidth utilization as is.
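A sketch of that queued-transcode design (every name here is hypothetical): serve a miss by transcoding the 1080p source live, and enqueue background jobs for the remaining renditions so later viewers hit disk.

    import queue

    QUALITIES = ["720p", "480p", "360p", "240p"]
    transcode_jobs = queue.Queue()   # drained opportunistically by spare transcoders
    cache = {}                       # (title, quality) -> bytes on local disk

    def fetch_1080p_from_origin(title):
        return b"...1080p bytes..."  # stand-in for the one WAN fetch

    def transcode_live(source, quality):
        return source                # stand-in for a real transcoder

    def serve(title, quality):
        if (title, quality) in cache:
            return cache[(title, quality)]       # warm hit: no WAN, no transcode
        source = fetch_1080p_from_origin(title)
        cache[(title, "1080p")] = source
        for q in QUALITIES:                      # canonical copies come later,
            transcode_jobs.put((title, q))       # from the background queue
        # Don't cache the live output: the viewer may switch quality
        # mid-stream, and the queued job produces the canonical copy anyway.
        return transcode_live(source, quality)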
I can think of reasons why a pure cache would impose costs that are too high -- namely, the device would then be writing during times of very heavy read load.
But: I agree with your bigger point, that it's an interesting programming problem.
We're all familiar with CPU caches, so it's an interesting mental exercise to think of this as the same system, but optimized for a different part of the design space.
"It's rarely—if ever—a full copy, but rather a massive chunk specifically designed for a specific country or region, depending on what's available and what's popular."
In addition, Netflix has multiple copies of the same content at different quality levels. So, there might be the 1080p, 720p, and SD quality levels of the popular content, each of them taking up space from other movies/shows.