IMHO it's pretty simple: low-level performance. Even 25% slower than the output of a good C++ compiler means a browser goes from first to last place in many benchmarks. People expect the page to scroll at 60 FPS, and that rules out GC pauses longer than 16 ms -- at 60 FPS, ~16.7 ms is the entire frame budget. Browsers are a mature market, and unless the proposed replacement has the performance of C++ it's not worth the risk. Server software can scale horizontally, so it has far more flexibility on performance; hence the explosion of languages on the server side while the client side remains largely confined to languages developed in the 80s.
Parallelism may be the key to disruption here however; if a new browser can scale better to the multi-core present and future it may be worth taking a small hit over C++ in the sequential case. That's Servo's bet (although Rust strives for C++-level performance even in the single-threaded case).
> How do people writing games in Java or C# deal with this? Gaming is an area where people expect smooth, high speed performance.
Short, mean answer: poorly.
Longer answer: they sacrifice the benefits of managed languages. They forego immutability and idiomatic, GC-managed object lifetimes in favor of object pooling, to keep the garbage collector from going nuts. And many still target 30 FPS in order to minimize the effects of variable latency.
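The pattern itself is simple; here's a minimal sketch in C++ for concreteness (Particle is a made-up type -- in Java or C# it's the same shape, with the pool standing in for `new` so the collector stays idle):

    #include <vector>

    // Hypothetical Particle type; stands in for any per-frame object.
    struct Particle { float x, y, vx, vy; bool alive; };

    class ParticlePool {
        std::vector<Particle> storage;   // allocated once, up front
        std::vector<Particle*> free_;    // recycled slots
    public:
        explicit ParticlePool(size_t n) : storage(n) {
            for (auto &p : storage) free_.push_back(&p);
        }
        Particle *acquire() {            // no allocation during gameplay
            if (free_.empty()) return nullptr; // pool exhausted: degrade gracefully
            Particle *p = free_.back();
            free_.pop_back();
            p->alive = true;
            return p;
        }
        void release(Particle *p) {      // return to the pool instead of freeing
            p->alive = false;
            free_.push_back(p);
        }
    };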
I used to do this; I'm porting my stuff from XNA/MonoGame to my own OpenGL/GLFW/OpenAL engine in C++ because I grew unsatisfied with the performance and the limitations inherent in the process. Many of the biggest benefits of managed languages are blunted with games in other ways, too. The CLR should be a huge boon for scripting--the DLR, for example, is a fantastic concept. But dynamic code requires codegen, and System.Dynamic doesn't work in Xamarin's iOS product. So if you want to bring a game with scripting from Windows/OS X to iOS, you're boned. (Java has the same problems; Groovy, while a great scripting language, doesn't run on Dalvik or on IKVM-over-MonoTouch. Rhino's LiveConnect doesn't work there either.)
On the other hand, I use AngelScript and expose C++ objects and methods to it: the bindings are statically registered at startup but callable from interpreted scripts, for the best of both worlds without breaking my back to do it. Memory management just isn't that hard, and the performance and compatibility of native code just can't be touched right now.
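For the curious, the binding boils down to something like this (Player and Move are made-up; the registration calls are AngelScript's real API, and asOBJ_NOCOUNT is what keeps its GC entirely out of the picture):

    #include <angelscript.h>

    // Hypothetical game object exposed to scripts.
    struct Player {
        float x = 0, y = 0;
        void Move(float dx, float dy) { x += dx; y += dy; }
    };

    Player gPlayer;

    void RegisterGameInterface(asIScriptEngine *engine) {
        // A reference type the script engine never allocates, frees, or
        // refcounts: every instance lives on the C++ side, so AngelScript's
        // garbage collector has nothing to track.
        engine->RegisterObjectType("Player", 0, asOBJ_REF | asOBJ_NOCOUNT);
        engine->RegisterObjectMethod("Player", "void Move(float dx, float dy)",
                                     asMETHOD(Player, Move), asCALL_THISCALL);
        engine->RegisterGlobalProperty("Player player", &gPlayer);
    }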
Every victim of a use-after-free just cried out in anguish. Your world weeps at 60 fps.
Keep in mind the original question: What's a good language for implementing a browser? The number one requirement for a browser today is not performance. It is security. A decade of effort has been put into the major browsing engines to snatch pseudo-victory from the clutches of determined hackers. A great deal of that effort has been necessary because of the desire for "speed", and the fallibility of those who believe memory management "just isn't that hard".
> Object pools were good JVM thinking 10 years ago. It's been a very long time since they made any sense for memory-based objects.
I guess that's why libgdx very recently added pooling, right? And (OK, it's Dalvik, but still) why Android uses view recycling all over the place? Why recommended practice for ListView is to re-use existing objects whenever possible?
I'm not pulling the use of pooling out of my ass here. This is what you do for high-performance managed stuff, in games and otherwise. Everybody I've ever worked with, and I've worked with some pretty heavy hitters (some of my former coworkers at TripAdvisor are insanely plugged into the Way Of The JVM), has kept pooling in the toolbox for latency-sensitive problems.
> Unity manages to do quite a bit with the CLR on a cross-platform basis, and that looks like Javascript to me.
Well, for one, it's not. It's statically typed sort-of-JScript. It bears little relation to JavaScript; it was originally called UnityScript, and the name was changed largely for buzzword-bingo purposes. (It's really very frustrating when working in Unity, both because it's not real JavaScript and because it doesn't play nicely with Boo and C# scripts a lot of the time.)
Unity also has notable performance problems for nontrivial scenes, due about 50/50 to expensive object creation on the front end and garbage collection on the back end. One of the most popular Asset Store items is--wait for it--an object pooling system.
> Horrors! AngelScript has Garbage Collection! But since it didn't come from the JVM, I guess it's OK.
It does, but only if you're allocating within it. I literally never allocate non-stack data in it. I use it as a logical glue layer within my application, which is not garbage collected.
> Every victim of a use-after-free just cried out in anguish. Your world weeps at 60 fps.
I literally can't remember the last time I hit a use-after-free in my own code. If you understand your object lifetimes, RAII will usually (almost always?) put you in the right place. In the worst of cases, where you've found yourself in a thoroughly opaque situation and can bear the perf hit, you have tools like boost::shared_ptr (though in my current project I don't use it anywhere).
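To illustrate what I mean by understanding your lifetimes -- a minimal sketch, types hypothetical:

    #include <memory>
    #include <vector>

    // Hypothetical GPU resource with exactly one owner.
    struct Texture { /* GPU handle, etc. */ };

    struct Sprite {
        // unique_ptr documents the single owner; destruction is deterministic,
        // happening exactly when the Sprite goes away -- no GC, no guesswork.
        std::unique_ptr<Texture> texture{new Texture};
    };

    void frame() {
        std::vector<Sprite> sprites(16); // each Sprite owns its Texture
        // ... update and render ...
    }   // scope ends: every Texture freed exactly once, in a predictable order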
It's certainly possible to make mistakes, but it is easier to do the right thing than most managed-language people are willing to admit. I used to be one of them; it's why I was pretty terrified of working in C++ for so long. As it happens, I think former-me was pretty wrong about that.
> What's a good language for implementing a browser? The number one requirement for a browser today is not performance. It is security.
I'm of two minds about this. Sure, security is a critical concern. But user experience isn't to be ignored, and I really do feel that user-detectable pauses are to be avoided at all costs. Writing better native code is, to me, a more feasible option than waiting for an imperceptibly fast general-purpose garbage collector.
(As an example, I grow frustrated when scrolling a big list on my Nexus 4 and watching the scrolling get stuttery. In my experience, it's almost always due to someone not following best practices and re-allocating instead of re-using existing objects.)
Use-after-free bugs that are so blatant that they occur in normal use are rare, but real UAF bugs are discovered adversarially, and it is absolutely not simple to test for them or, for that matter, to avoid them.
I'd certainly agree that they exist in the wild, but I'm less convinced that they're particularly hard to avoid if you understand the lifecycles of what you're working on.
That browsers are a tough problem, with a difficult-to-internalize object lifecycle and a bunch of edge cases, makes this tougher, for sure; my stuff is games-focused, so, yeah, I have a relatively simple problem set.
But, all things considered, I'd rather Somebody Else (because Somebody Else makes browsers; I don't) spend developer time making a fast system secure than spend it running a slightly more secure system up against the hard walls imposed by any managed language one would care to name.
Integer overflows, which require no mitigation other than getting a computer to count correctly, are extremely pernicious; they pop up even in systems that were deliberately coded defensively against them. There's an LP64 integer bug in qmail! And that's just for counting. Avoiding UAFs requires you to track and properly sequence every event in the lifecycle of an object -- and, in many C++ systems, it also requires you to count properly.
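The galling part is how trivial the mitigation looks in isolation -- a sketch, names hypothetical:

    #include <cstdint>
    #include <limits>

    // Checked addition: refuse to wrap instead of silently overflowing.
    bool checked_add(uint32_t a, uint32_t b, uint32_t &out) {
        if (a > std::numeric_limits<uint32_t>::max() - b)
            return false;                 // a + b would wrap around
        out = a + b;
        return true;
    }

    // Typical use: sizing a buffer from two untrusted lengths.
    //   uint32_t total;
    //   if (!checked_add(header_len, body_len, total)) reject_request();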
No, I don't think they're straightforward to avoid.
I think it seems straightforward to avoid memory lifecycle bugs because it's straightforward to avoid them under normal operating conditions. But attackers don't honor normal conditions; they deliberately force programs into states where events are happening in unexpected ordering and frequency.
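Concretely, here's the shape of the thing (hypothetical dispatch code) -- fine under normal conditions, exploitable the moment an attacker controls the ordering:

    #include <vector>

    // Under "normal" use no handler ever deletes another listener
    // mid-dispatch, so this code "works" for years.
    struct Listener {
        virtual ~Listener() {}
        virtual void onEvent() = 0;
    };

    std::vector<Listener*> listeners;

    void dispatch() {
        // Snapshot so handlers can add/remove listeners "safely"...
        std::vector<Listener*> snapshot = listeners;
        for (Listener *l : snapshot) {
            // ...but if an earlier handler deleted *l and removed it from
            // `listeners`, the snapshot still holds the dangling pointer:
            // this virtual call is a use-after-free an attacker can script.
            l->onEvent();
        }
    }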
I don't even know where to begin with the idea that it's acceptable to sacrifice a couple memory corruption vulnerabilities in the service of a faster browser.
> No, I don't think they're straightforward to avoid.
Fair enough, and you certainly have more experience on the topic than I do. Thanks for the explanation.
One quibble though, and maybe it was poor phrasing on my part:
> I don't even know where to begin with the idea that it's acceptable to sacrifice a couple memory corruption vulnerabilities in the service of a faster browser.
It's not really an either/or though, right? I mean, you can protect against some forms of attacks through a managed environment (though the environment itself adds its own attack surface). But you trade performance to get it. A managed language isn't going to solve all your problems, and it is going to mess with your performance in inconsistent and essentially intractable ways. I'm not saying "throw in a couple memory corruption vulnerabilities if it'll make it faster", I'm saying that you can fix memory corruption vulnerabilities. You can't fix inherently slow.
You'll notice that most games are not written in Java or C#; managed languages are much more common in small games that have a lot more cycles to spare.
And even still, it often becomes a problem. Generally it's solved by being very careful with object allocation so as not to trigger GC during gameplay, or by some other trick to force GC to take less time than budgeted.
Also keep in mind that a lot of games are designed to run much closer to 30 FPS these days, which gives you a lot more time to deal with GC.
Some advantages of memory safety remain. When using pools, you don't get protection against use-after-free or leaks. Memory safety does somewhat mitigate the consequences of those, however: it's much harder for a use-after-free within a pool to become a security exploit in a GC'd language, for example.
Sounds like you have a specific kind of object pool implementation in mind? Invalid object refs don't have to break memory safety; cf. weakrefs.
An object pool can also be implemented without unsafe building blocks or other language support, as a pool of objects that are manually recycled when their lifetime is up. Then a bug could let you confuse a previous-generation object with a current-generation one, but it would not break memory safety either.
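A sketch of what I mean, in C++ so it's comparable to the rest of the thread, though the point is that it needs nothing unsafe in a managed language either (names are made up): a stale handle is detected and refused rather than dereferencing recycled memory.

    #include <cstdint>
    #include <vector>

    struct Handle { uint32_t index; uint32_t gen; };

    template <typename T>
    class GenPool {
        std::vector<T> items;
        std::vector<uint32_t> gens;      // bumped on every recycle
        std::vector<uint32_t> freeList;
    public:
        Handle acquire() {
            uint32_t i;
            if (!freeList.empty()) {
                i = freeList.back();
                freeList.pop_back();
            } else {
                i = static_cast<uint32_t>(items.size());
                items.emplace_back();
                gens.push_back(0);
            }
            return Handle{i, gens[i]};
        }
        void release(Handle h) {
            if (gens[h.index] != h.gen) return; // stale free: ignore
            ++gens[h.index];                    // invalidate outstanding handles
            freeList.push_back(h.index);
        }
        T *get(Handle h) {
            // A previous-generation handle yields nullptr, not a dangling ref.
            return gens[h.index] == h.gen ? &items[h.index] : nullptr;
        }
    };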
In 2013 I don't think it has anything to do with the merits of C++ vs. other languages. You could surely write a better browser from scratch in a language like Haskell; a project this important could even implement its own JVM if it wanted to use Scala and needed to guarantee certain performance characteristics -- like how Facebook wrote its own PHP compiler/optimizer.
A browser that breaks on non-standard markup is worse than useless. Legacy compat is so critical and so complex that a rewrite is just not an option. Lots of money and time have been invested in battle-tested security, etc.; you can't just throw that investment away. Again, like how Facebook is still written in PHP.
Having written web browsers / browser engines and compilers for functional languages, I don't think you could do this with anything that exists today. And if you did it, the result would not be idiomatic Haskell; it would just have one huge monad threaded through the entire thing with no pure portion. You would also probably have to drop down to unsafe manual memory management for performance reasons.
> GC pauses wouldn't really be a problem if development in the past twenty five years had been focused on hardware assisted incremental garbage collection
Hardware assisted GC isn't free. It comes at a large cost of silicon* and power. And makes a lot of assumptions and puts a lot of restrictions on your GC. And it still doesn't solve the caching problems related to automatic memory management.
* The silicon has to come out of the die budget somewhere, likely decreasing performance on other operations, or removing other special-purpose instructions altogether (SIMD?).
The hardware features required for hardware-assisted GC are actually pretty simple. See https://www.usenix.org/legacy/events/vee05/full_papers/p46-c... for a description. Other features like hardware-assisted write barriers (simple) and transactional memory (much more complex) can be used to increase performance.
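For a sense of what the hardware would be absorbing: a software generational collector pays for something like this card-marking barrier on every single pointer store (heap layout and names hypothetical):

    #include <cstdint>

    static const uintptr_t kCardShift = 9;      // 512-byte cards
    static uint8_t cardTable[1u << 20];         // covers a 512 MB heap
    static uintptr_t heapBase;                  // set when the heap is mapped

    struct Obj { Obj *field; };

    inline void writeRef(Obj **slot, Obj *value) {
        *slot = value;
        // Dirty the card containing the written slot, so the collector only
        // rescans dirty cards when looking for old-to-young pointers.
        cardTable[(reinterpret_cast<uintptr_t>(slot) - heapBase) >> kCardShift] = 1;
    }

Hardware assistance moves exactly that bookkeeping off the hot path.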
Would you rip out the guts of a modern browser and replace them with brand-new code that you claim is 25% faster?
No way. You'd be laughed out of the room. How exactly would you get a radical expansion of the attack surface past a security audit?
You might do it in a game, where security is (usually) not a primary concern.
Anyone who would sacrifice browser security for a 25% increase in performance is a fool. Browsers need to be secure, and then performance needs to be "good enough".
> Even 25% slower than the output of a good C++ compiler means a browser goes from first to last place in many benchmarks.
Do most people care about that, though? Like, do they go to a page that runs browser benchmarks, then uninstall their current browser, download the best one, and install it? Or is the browser that came bundled with the OS fast enough for them? (Not to mention exporting and importing all their bookmarks.) Maybe I've been spoiled by having fast browsers, but I can imagine that I could live with that if I gained other benefits. Most of the websites I read are either very "static" pages for mostly reading stuff, or videos. I am pretty content as long as webpages load in around one second (I can probably live with more than that, too) and my videos buffer at a respectable rate.
If I got a browser that was 25% slower but much less prone to buggy behaviour, that sounds like a great deal to me as a user. Chrome is annoying me right now because of the relatively recent behaviour of freezing the UI frame when I make a new tab: I can type in text but it's effectively hidden from me, and I only get to press enter and see what I wrote after up to five seconds. Another problem (very common in Chromium) is the whole webpage being unresponsive until the whole page is loaded, instead of loading stuff like text first and letting me navigate what's already loaded. If these things could actually be fixed for me, the lazy user (i.e. without me changing browser or diddling around with stuff that I shouldn't need to diddle with in order to have a performant browser), then that sounds like a good deal to me.