Phoenix LiveDashboard

mikl · on April 16, 2020

LiveView is such a game changer for Phoenix (and thus Elixir). From being “just another web framework” (albeit a very nice one), LiveView enables a whole different paradigm for web applications. As a developer, it’s like having superpowers. What would have taken days to build with an SPA or similar “fat client” JavaScript can be done in hours with LiveView.

erichocean · on April 17, 2020

I built a pop-up store for Nike's NBA All Stars event that used this approach about 5 years ago (in Node.js).

Productivity was extremely high, and even better, you could restart clients and since all app state was on the server, they would start up wherever they had left off. I stored the state in LMDB transactionally, so I could also bounce the server any time. Clients would reconnected with their WebSocket while it was being used and no one was the wiser. Did that dozens and dozens of times during the event.

I used Blossom (SproutCore derivative) for the UI and statecharts to make that happen. All actions were sent to the server (written in Node.js, with LMDB as the persistent storage). We also had iOS apps for the back office and customer support reps that worked identically (but in Objective-C).

Entire project took 5 weeks with myself as the only developer, replicated a (beautiful) iPhone app Nike made called SNKRS entirely in Blossom and ran it on these huge 30inch touch screens inside a full screen Chrome instance.

Very fun project, and we sold ~1M in shoes (through Stripe) in four days. Freaking cold though, February in NY!

Link to the event: https://dunksandjordans.wordpress.com/2015/02/13/nike-pop-up...

elamje · on April 17, 2020

I think very highly of elixir and Jose. After 3 months of using .NET Blazor in production, I can say that LiveView and Blazor have some trade offs that people don’t talk about.

For those that don’t know what they are, they basically keep a websocket connection open between the client and server to allow the client to call Elixir and C# functions instead of javascript frameworks that call some rest api endpoint. You can write HTML click listeners that are in the server language rather than js.

It’s powerful, but there is a latency trade off since most UI events end up being handled by the server, the client might notice this latency in the UI. Blazor specifically has issues with connectivity, which means your client is completely locked out of the UI until a manual refresh.

It’s true that this is certainly a paradigm shift, and maybe just a stepping stone to web assembly web apps written in a server language.

I haven’t used LiveView, but I can imagine it’s better than .NET at scaling since it runs on BEAM rather than the CLR, which means it’s likely the better choice if you will need to maintain a lot of client connections in your app.

abrookewood · on April 17, 2020

A lot of these trade-offs are clearly explained in a Stack Overflow Blog post [0]. The issue of latency is highlighted as a key issue, as is increased server load and no offline support. I would assume these also apply to LiveView. Still, the both sound appealing in that you can do everything in a single language.

https://stackoverflow.blog/2020/02/26/whats-behind-the-hype-...

conradfr · on April 17, 2020

I made a stupid game as LiveView POC [1] which I posted the other day and tested it on different conditions and can comment a bit on the latency.

On a LAN you really don't feel it 99% of the time. However in the game I have a countdown that I initially updated each seconds with LiveView (probably with send_after/4) and I couldn't make it totally reliable, sometimes the updates would queued up, so in the end I "faked it" in javascript with hooks.

Over the internet there is an added latency obviously. I feel it most in the game part because the button has a instant feedback. On the other parts, forms etc, this does not perform worse than a typical SPA.

So I guess it would be very dependent on the kind of application you use it for. Nobody said it would replace all SPAs and javascript. Also the official demo runs an animation at 60 or 100fps IIRC, so there's that.

Fun anecdote, my devops colleagues tried to ddos my app, they failed and all along people were playing the game without any hiccups.

I wonder though how it performs on mobile networks. Aren't some of them terrible for websockets, when changing towers etc? Would the reconnect and session restoring be too much noticeable?

[1] Demo On a free Gigalixir instance: https://every-weak-tapaculo.gigalixirapp.com

On a Raspberry4 behind my FTTH: https://nameguess.funkybits.fr

elcritch · on April 17, 2020

Haven’t tried Blazor, but LiveView’s haven’t had too noticeable of a lag for me even doing basic UI stuff on a server. I wonder if .NET has a higher latency response times on average? Elixir/BEAM seem to consistently provide < 10 ms response times. Given a 30-60ms ping, that’s still provide a round trip of < 120 ms. Barely noticeable for many cases. Still for people with bad internet latencies either tech stack would be frustrating...

I’d guess that most any web service for less than, say, 10k total users LV would be fine. Given a user base that size or smaller, the productivity tradeoffs outweigh most user inconvenience. As in, I can make a SaaS/SPA to serve a small market significantly quicker than using a bifurcated server/js-frontend. Especially if the SPA requires or benefits from live streaming updates (graphs, dashboards, stats apps, etc). No contest in my opinion, except for maybe a Clojure stack (?).

fahrradflucht · on April 17, 2020

What I don't understand with these technologies, is how you do continuous deployment without breaking all running client sessions evertime.

josevalim · on April 17, 2020

LiveView does two things in particular:

1. It includes live navigation, which updates the URL as you interact with the app. For example, if you try the live scaffold generator in Phoenix v1.5.0-rc.0, you will see that once you click the "New Post" modal, the URL changes to "/posts/new". So if users reconnect by any reason (poor connection, new deployment, etc), they will go back to the same page with the same modal.

2. For forms, we automatically submit the current form state on reconnections, allowing the server and client to sync up.

For complex applications, think a chat app or a game, then you have to persist messages, the game state, and what not on the server. But that would be necessary regardless if you are using LiveView or not.

elcritch · on April 17, 2020

I didn't realize that killing the LV server process would act just like a reconnect.. But that makes sense! It's been really impressive how well LV reconnects even when a server has been down for a while (I use RPi3's and they loose WiFi or reboot sometimes).

It's good to note that using the live navigation you can pass parameters via the URL. That's my main method for graphing / data view pages.

dnsbty · on April 17, 2020

With Erlang releases (compiled code with the Erlang runtime included) you can do hot code swapping to change the code while it’s still running.

And iirc I believe LiveView will have all the state necessary client-side to be able to reconstruct the session when reconnecting the socket. I’m not near my development machine to make sure on that one though.

te_chris · on April 17, 2020

We store our sessions in Redis and don't do hot code swapping but deploys do still mostly just work and live channels normally reconnect without any noticeable downtime. We run everything on GKE so a deploy is just rolling container swaps.

Liveview is built on top of channels, so worth reading up about them: https://hexdocs.pm/phoenix/channels.html

out_of_protocol · on April 17, 2020

As far as i know it doesn't work this way (session restoration) as of now. Something could be done but requires a lot of effort

machiaweliczny · on April 17, 2020

I was just storing session state in Redis

theonething · on April 17, 2020

Yes! If I could use things like LiveView or TurboLinks exclusively, I wouldn't miss React and friends at all.

pzmarzly · on April 17, 2020

In Laravel (PHP) world, some people have tried building sites LiveWire + Blade X + alpinejs. LiveWire is LiveView but for PHP and using POST and GET requests only, Blade X is for server side React-like DOM building, alpinejs for JS stuff that doesn’t require data round trip (for example displaying a popup with known contents).

I haven’t seen any large projects built with it yet, though for small ones it sure feels like magic.

GiorgioG · on April 17, 2020

ASP.NET Core Blazor has the same ability, for those stuck in a corporate environment.

sl1ck731 · on April 17, 2020

Is Blazor still "experimental"? I looked at it a year ago I believe and most people didn't seem to recommend it at that time due to not being optimized.

GiorgioG · on April 17, 2020

Sadly Microsoft is still terrible with naming developer products. There are 2 Blazors, the server-side blazor (which works like Phoenix LiveView, and then there's the version of Blazor that will target WebAssembly and run completely client-side.) The WebAssembly 'version' of Blazor is still experimental. Server-side Blazor is production-ready as far as Microsoft is concerned (I believe.)

elamje · on April 17, 2020

Using Blazor Server in production right now. Like another commenter said, there is also a client side web assembly target as well. The Server framework is great, but has its issues which I mention elsewhere in the thread.

math · on April 17, 2020

even those who aren't..

hamandcheese · on April 17, 2020

LiveDashboard seems to be something different than LiveView

abrookewood · on April 17, 2020

It's a metrics dashboard for Phoenix applications that uses LiveView to display the data.

aabbcc1241 · on April 17, 2020

node.js folk speaking: For people who like the concept of SSR SPA or "thin as just a screen client" but not comfortable to develop in dynamic type dynamic type language (Elixir and Erlang), checkout TS LiveView. It implemented the idea of Phoenix LiveView for Typescript developer.

tomconroy · on April 16, 2020

Better link is Jose's tweet about it: https://twitter.com/josevalim/status/1250846714665357315

brightball · on April 16, 2020

Having this built in is absolutely awesome...

out_of_protocol · on April 16, 2020

^^^ screenshots!

rubiquity · on April 16, 2020

It’s awesome for Phoenix users to have this built right into their apps. I know it isn’t a full replacement for Splunk or what have you, but it’s a step in the right direction to have this at your finger tips. The SaaSification of operational tools isn’t great from a cost or waste perspective given that most vendors are likely overprovisioned and under utilized. You also can’t beat the user experience of staying in your language and framework’s ecosystem. Well done.

cercatrova · on April 17, 2020

I like Elixir and all of its infrastructure and libraries, like BEAM and Phoenix, but I could never get used to not having static types. What's the best way to have static types in Elixir?

I have seen in Elixir an implementation of functional programming concepts that might prove useful: https://github.com/witchcrafters/witchcraft. It hasn't been updated since last year so I'm not sure if it still works.

These days I've just started using Rust and actix-web instead, seems a lot faster due to no VM, with static typing, and moreover, algebraic data types. I wonder if I could use another language without ADTs anymore.

skrebbel · on April 17, 2020

Same. I suffer through it, because the rest is pretty neat.

There is no best way. Fortunately the Elixir core libraries are annotated with typespecs, which gets you like 30% there. But dialyzer sucks, and the typespec syntax itself, as well as its expressiveness, leave a lot to be desired. It's clearly an afterthought.

One day I'm gonna make a TypeScript for Elixir.

Note: There's eg Gleam a statically typed language for BEAM, but it's quite unlike Elixir and it has a much smaller community. https://github.com/gleam-lang/gleam. I doubt using eg Phoenix on Gleam is possible, or if it is, whether it's fun. It looks promising though, especially if you're into Ocaml/Haskell-style "hard" FP.

lpil · on April 17, 2020

Hi! I'm the author of Gleam, thanks for your interest.

For how young we are I think we're doing a good job of building a community, there are more familiar faces in IRC and on GitHub every day.

My goal is for Gleam to be easier to use than Elixir, and in many ways we are taking inspiration from Elm which excels here. Hopefully in the future Gleam will meet your needs! :)

hopia · on April 17, 2020

Gleam seems like effort in the right place, I'm looking forward to seeing it in the wild!

Since we got you here, can you tell us what's your plan with static type checking for message passing semantics? From what I understand, that's the hard part about producing a static type checker for BEAM systems.

As for things like gen_server, is the idea currently that you simply give type definitions to callbacks, allowing the gen_server to model its state with types?

At first glance, I also don't understand yet how parametric polymorphism works. Is it implemented at language level?

midgetjones · on April 17, 2020

Gleam is pretty young, so I wouldn't expect the community to be huge.

Not sure what you mean by 'Phoenix on Gleam', but you could write core code in Gleam and use it in an Elixir/Phoenix app, no probs

https://dev.to/contact-stack/mixing-gleam-elixir-3fe3

skrebbel · on April 17, 2020

I just mean "how easy would it be to common Elixir libraries from Gleam, especially if they're rather involved like Phoenix".

Like, quite some of the more popular Elixir libraries (eg Phoenix and Ecto) depend quite heavily on macros, which I assume sortof rules out using them from other languages. Like Elixir itself, I think Gleam will only really shine once a community builds its own libraries for non-trivial messy stuff such as DB layers and web frameworks for it.

The interop is be great for simple functional stuff though, like unicode converters, JWT libraries and AWS wrappers.

lpil · on April 17, 2020

Witchcraft works well but it's not statically typed. It uses the patterns of Haskell but they are checked at runtime.

I'm working on a statically typed programming language in the Erlang VM called Gleam. It's very young but perhaps one day it will meet your needs! https://github.com/gleam-lang/gleam

cercatrova · on April 17, 2020

I've used Gleam back in the 0.1 or 0.2 days, back when you posted about it on the Elixir forum, glad to see it's still moving along nicely!

Zalastax · on April 17, 2020

We're working on a gradual type system for Erlang which also works quite okay for Elixir: https://github.com/josefs/Gradualizer/.

reledi · on April 17, 2020

Definitely. When I go back to Elixir from Go I feel the pain of not having static types.

AFAIK, the closest we have right now is

- typespecs

- guard clauses

- structs w/ required keys (cannot specify types)

- ecto embedded schemas (e.g. structs with "typed" fields)

conradfr · on April 17, 2020

I don't care that much between dynamic and strict types but I like strict typing.

So yeah typesecs and guard clauses cover a lot of that (and the ever fun dialyxir) but it's more cumbersome than just typing your function parameters and variables.

And funnily enough in the end what I miss the most is the type hinting when I write function calls. I use Intellij and the elixir plugin doesn't support that AFAIK, it's a real productivity killer.

dudul · on April 17, 2020

After using Elixir for almost a year professionally, I find structs/schemas to be fairly useless when it comes to type safety. These are essentially just maps, there is absolutely no guarantees, anywhere in your code, that your map was created properly, and went through the proper set of validations.

When I get a `Point` struct, I have no guarantee that the wrapped lat/long are actually valid. After 15 years of static typing this is just so stressful :)

brightball · on April 17, 2020

Static typing will create a lot of problems in a BEAM language IMO.

The distributed nature and ability to send calls to other nodes that can be running completely different code would throw it for a loop unless you had a way to get the types from the other nodes as well. And even if you could there are a number of synchronization problems that come from it.

You’ll end up getting the static benefits on your single node, but beyond that those guarantees will be gone just like interacting with a REST API.

cercatrova · on April 17, 2020

Couldn't it work like GraphQL which is typed?

dognotdog · on April 17, 2020

It's interesting, I started out seriously programming in C++, then took to Smalltalk, Lisp, Io, as exploratory languages, doing more work in ObjC, and eventually Erlang and now Elixir, and I do like the Elixir compromise of some typechecking, though indeed it is a little cumbersome.

Especially doing embedded work, and without doing proper mockups and test driven dev, the turnaround time with uploading code, rebooting the device, and waiting for a runtime error is quite long, and more compile-time error checking could help.

However, I also feel like types are more important to JITs and compilers optimizing high-performance numerical code, and dropping to C to do that with Elixir's FFI is thankfully simple enough.

I'm quite ok with Duck-Typing, as it does help me to iterate fast, even if it doesn't help the computer go fast :)

cercatrova · on April 17, 2020

I use TypeScript a lot and I only write in transformations of types in functions, rather than classes, so having ADTs is a big part of my workflow. I also write Python for machine learning but that's more glue code for Tensorflow, not full applications. I feel like once you go to FP land it's hard to come back :D

lawik · on April 17, 2020

I think there is work to get static typing in Erlang.

This tweet thread was where I stumbled into it: https://mobile.twitter.com/slashdotdash/status/1238513622189...

GordonS · on April 17, 2020

Same here.

I spent a day learning and playing with Elixir around a year ago, and immediately liked it a lot.

But more main dev language is C#, and I'm very used to static typing - as much as I enjoyed Elixir, the lack of static typing was a problem for me, so unfortunately I chose not to take it any further.

toastal · on April 17, 2020

There's purerl, a PureScript compiler to Erlang as an option of types to something on BEAM. I can't comment about it's maturity but it the Pinto + Stetson libraries seem like an interesting take.

namelosw · on April 17, 2020

Phoenix is amazing. Having shallow experience in Rails, I found I can be productive in day 1. It just works, and solid, and fast.

Yet there are more brilliant things ahead like OTP and LiveView.

There are many other outstanding functional languages, and I like some of them more than Elixir, but none of them have any framework as solid and straightforward as Phoenix.

krat0sprakhar · on April 16, 2020

Wow, this looks amazing - I'm so jealous of the elixir / pheonix ecosystem! The built in telemetry is the killer feature IMO as generally one would have to use an external system such as NewRelic etc to get this info.

Awesome work by the folks behind it - now I just need a project idea for which I can try this out :)

nickjj · on April 16, 2020

Not to knock the live dashboard but from what I read on IRC you will still likely need to use New Relic or other tools to look at logs and explore the state of your system from the past to help track down errors or analyze / filter those logs from the past.

The live dashboard only captures information at the moment. There is no persistence or advanced filtering capabilities. Meaning, if you received an error notification 5 minutes ago but didn't already have the dashboard open, you couldn't just load live dashboard and check it out there. You would have to troll through your own logs or use a 3rd party logging tool just like before.

Where live dashboard seems to shine is if you already have it open beforehand, you can see various metrics about your app in real time. Although other logging tools are pretty close to real-time too.

But it is cool that something like this is included by default. I hope future releases expand on what it allows you to do.

fataliss · on April 17, 2020

I really miss working with Phoenix/Elixir. While switching to the latest hottest tech rarely make sense, I really feel that the Phoenix core team has its head in the right place and when I see the features they put out it really makes me want to switch every project to it.

djm_ · on April 17, 2020

>latest hottest tech rarely make sense

I would agree! But I'd also say that I believe Elixir has moved past this part of the curve and is seeing serious adoption amongst companies.

I first got interested in 2014 and back then it was definitely still early days. 6 years later and I'm still yet to regret the decision to invest time Elixir and BEAM ecosystem.

peteforde · on April 17, 2020

Congrats to the Phoenix team on another milestone win. Your rising tide is lifting many boats.

For those of you working in Rails, you will be excited to know that we have StimulusReflex:

https://docs.stimulusreflex.com/

SR was inspired by LiveView. It's built on ActionCable, CableReady and morphdom. It is fast like a demon and works well with AnyCable. You can see demos here:

http://expo.stimulusreflex.com/demos/tabular

theonething · on April 17, 2020

Neat! How does this compare with Turbolinks?

peteforde · on April 17, 2020

It's not an OR, it's an AND.

StimulusReflex is built on top of Stimulus, which is designed to work arm-in-arm with Turbolinks. We think that the combination of Rails + Stimulus + Turbolinks + StimulusReflex + solid Russian doll caching is the greatest thing since sliced bread.

porker · on April 17, 2020

It would be great to have a higher-level explanation that included this. I read through the first 4-5 pages of the docs and saw it mentioned these, but didn't pick up (until the code examples) that it used them.

I'm not a Rails programmer, but I'm keen to learn about all the different approaches being used & tried, to see what I can make fit with my language.

peteforde · on April 20, 2020

Help! I'm the guy who writes those docs. I want them to be better. Nothing strikes terror like "I read through the first 4-5 pages of the docs and saw it mentioned these, but didn't pick up that it used them."

To me, that's like "When I ordered beef, I didn't know you meant a cow!" lol

Seriously, help me help you. Where did I fuck this up?

The very first paragraph of the Quick Start page:

"A great user experience can be created with Rails alone. Tools such as UJS remote elements, Stimulus, and Turbolinks are incredibly powerful when combined. Could you build your application using these tools without introducing StimulusReflex?"

https://docs.stimulusreflex.com/quickstart

What is the thing you could have read and understood right away? I really appreciate whatever you can suggest.

kuzee · on April 16, 2020

Elixir/Phoenix developers will get a lot of mileage out of the ability to quickly collect and inspect stats like this. This is really great and solves ~50% of the use cases for an outside error monitoring service while also making debugging specific instances easier.

systemd0wn · on April 16, 2020

It looks like Plangora made a youtube video recently. https://www.youtube.com/watch?v=Nqr5ly35tu8

eqmvii · on April 17, 2020

This is really cool! I'd love to get this into some apps, and the LiveView dependency might be a nice excuse to sneak that in at the same time...

PopeDotNinja · on April 16, 2020

The looks pretty neat. Is this a pretty wrapper for the Erlang Observer, or is it more than that?

sb8244 · on April 17, 2020

This is more than that. The actual metrics come from erlang functions and aren't tied to observer. https://github.com/phoenixframework/phoenix_live_dashboard/b...

Hook into telemetry is also really cool because any library can be instrumented.

The multi node cluster aspect allows easier monitoring of the full cluster

valuearb · on April 16, 2020

What if you live slightly outside of Phoenix? Can you still use the dashboard?

out_of_protocol · on April 16, 2020

Dashboard built on top of LiveView, which is elixir-specific.

* For any other language it means trying to replicate the project from scratch (also unlikely to be of any use on php, python, ruby etc)

* For non-phoenix elixir projects you could make it work but you'll rewrite half of Phoenix framework anyway so i don't see the point

sorentwo · on April 16, 2020

Searching for things related to Phoenix is a constant problem. Even Phoenix Elixir will sometimes give you results for random bars in Arizona.

bigbassroller · on April 17, 2020

Try “Elixir Phoenix” and the topic.

defnotarobot · on April 17, 2020

I think it's up and running as far away as Flagstaff.

dogweather · on April 16, 2020

taspeotis · on April 16, 2020

Is Erlang's BEAM VM really slow compared to other virtual machines, or is it Phoenix? It seems to be a bit of a laggard on benchmarks [1] and there's been this thread going on for years [2] which hasn't reached a conclusion.

[1] https://www.techempower.com/benchmarks/#section=data-r18&hw=...

[2] https://elixirforum.com/t/techempower-benchmarks/171

EDIT: Thanks everyone for the responses.

josevalim · on April 16, 2020

I have commented this in other places but the main explanation for the benchmark you linked is that Phoenix (and its underlying Cowboy webserver) are not optimized for this particular workflow.

For a plain-text benchmark, you are ultimately measuring the ability of sitting on top of a socket and reading/writing to that socket as fast as you can.

However, Phoenix/Cowboy, whenever there is an HTTP connection, spawns a VM light-weight process to own that connection, and then each individual request runs in its own light-weight process too. This comes with its own guarantees in terms of state isolation, reasoning about failures, and you get both I/O and CPU concurrency for free - if you were to do any meaningful work on those requests.

Even though the VM processes are cheap to spawn and lightweight, it is overhead compared to something that is just directly reading and writing to the socket. I actually wrote a proof of concept where we just sit on top of the socket without spawning processes and it performs quite well - although I don't think it has any practical purpose. There is also an interesting article from [StressGrid](https://stressgrid.com/blog/cowboy_performance_part_2/) that shows how removing the per-request process speeds it up by ~60%. But once you are going through long requests (i.e. 1ms because you need to talk to a database or API, encode JSON, etc), these differences tend to matter less and the concurrency model starts to give you an upper hand.

taspeotis · on April 16, 2020

Thanks. So a 60% improvement would be approximately 300K requests/sec vs. ASP.NET Core's seven million. Which still looks concerning.

> But once you are going through long requests (i.e. 1ms because you need to talk to a database or API, encode JSON, etc), these differences tend to matter less and the concurrency model starts to give you an upper hand.

If you look at the other tabs in the benchmark there are "heavier" loads that include database access and the most favourable one I can find is "Fortunes" which puts Phoenix at 17.8% of the relative performance of the best ASP.NET Core one.

josevalim · on April 16, 2020

For the fortunes case, you are still comparing a web framework with a platform. When comparing web frameworks, some ASP.NET frameworks performs worse, others perform better, but at most 2.4x faster.

On the plain text one, the difference between frameworks is at most 6x. You would have to yank Phoenix and sit directly on top of a socket, as per my previous comment, if you want to compare both VMs.

derefr · on April 16, 2020

BEAM is by default tuned for throughput over latency, but also for soft-realtime responsiveness over throughput. The right benchmark for BEAM performance isn’t how fast it can do a single-threaded or even multi-threaded CPU-bound task; but rather the 99th percentile packet-to-pcket latency + makespan + memory-requirement volatility of a C10M workload. In other words: if the VM were hosting a million VoIP calls, how many frames would it deliver “on time” such that the client wouldn’t discard them?

(This is not to say you can’t tune BEAM to the needs of other workloads; those are just the defaults. Though, because that is the default assumed workload, it’s also the one most optimization effort goes into.)

BEAM is also pure-functional in a way that a lot of VMs hosting pure-functional languages aren’t; things like mutable atomic memory handles were only introduced recently, and there’s no plan to rewrite core operations in terms of them, because these primitives are less predictable in their time costs than equivalent functional primitives.

You can go fast under BEAM, but the people who want to do so, often decide that it’s too hard to go fast while also retaining soft-real-time guarantees they began using a BEAM language to attain; so they instead use one of BEAM’s many IPC bridges to hook up a fast native “port program” written in another language to the BEAM node, ensuring that any hiccups or crashes in the native code are isolated to its own address space and execution threads.

This last consideration means that very few people are really driving BEAM’s performance forward, because so many instead take the “escape hatch” of low-overhead IPC.

Taken as whole systems, though, you’ll find BEAM in a lot of services that go very fast indeed—you just won’t usually find BEAM itself handling the performance-critical part!

hopia · on April 17, 2020

> BEAM is by default tuned for throughput over latency

I thought it's exactly the other way around.

I Always assumed BEAM is built for concurrent, consistent low latency IO. The benchmarks seem to vaguely reflect that.

derefr · on April 17, 2020

Right; we're saying the same thing in different ways, I think. BEAM's concerns, in descending order, are:

1. that an actor that wants to run is never de-scheduled for too long (= 99th percentile latency);

2. that every actor that wants to issue an IOP to the kernel gets to do so efficiently (= whole-system throughput)†;

3. that any individual actor makes actual appreciable progress (= 50th percentile TTFB latency.)

This is basically the definition of a soft-realtime system. If latency always came first, it'd be a hard-realtime system; if throughput always came first, it wouldn't be a realtime system at all.

† BEAM is actually quite bad about this for disk IO in the normal case, because all disk IO by default goes through the bottleneck of sending messages to a single file-server actor. This has the advantage of virtualizing the filesystem, allowing for e.g. diskless BEAM nodes that boot by using a peer's file server as their own. But it's not very good for filling up your disk's IO queue. Luckily, you can bypass the file server and use file-IO ports/BIFs directly. Or, if you're working against network storage anyway, just skip your own OS kernel and have BEAM speak iSCSI to the disk server directly (like Erlang-on-Xen does.) That approach will likely give you more disk IOPS than you could achieve even in C, presuming the C program is relying on OS syscalls to talk to the remote disk through the OS VFS layer.

hopia · on April 17, 2020

That's a great explanation for a soft real-time system!

However, I meant that I always assumed that BEAM prioritized low consistent latency at the expense of high throughout (but without providing hard real-time guarantees). And the mechanism being that preemptive scheduling and counting those reductions so no single process is able to hog the entire system's resources for expensive computations.

While these benchmarks may not be reflective of the true performance it seems that Phoenix has a smaller throughput but lower latencies than most of its peer group in them.

derefr · on April 17, 2020

I would say that BEAM prioritizes throughput over 50th-percentile latency, because BEAM prioritizes serving “port” pseudo-actors (distinct scheduled interactions with the kernel through syscalls for queuing/resolving IO ops) over serving actual userland execution-thread actors.

One actor queuing up a heavy amount of IOPS through many different ports, can actually cause the BEAM node to generate a visible latency spike, as it always tries to schedule all the ready ports assigned to the given OS scheduler-thread per scheduler-timeslice, before trying to schedule any user-defined actors; and so will find itself with very little time left in that timeslice to schedule the execution threads of actual actors. A single actor can accomplish this “uneven” loading because BEAM’s reduction scheduling means that just queuing IO ops through ports in a linear block of code can never cause an actor to be descheduled, since the port BIFs are not function calls per se, and therefore have no ability to trigger scheduler yield-checking.

Mind you, this is a highly-unusual usage pattern. Actors don’t usually batch up huge binary payloads to send and then blast them all through at once into many different ports in an unrolled loop. And, usually, one actor doesn’t even usually hold multiple ports; the idiomatic approach is for each port to have its own separate “owner” actor (because this means that these “owner” actors will be scheduled semi-synchronously in response to activity on their respective ports, sort of like Go with goroutines getting scheduled in response to activity on channels.) In an idiomatic usage pattern, you could say that the node doesn’t become IOPS-bound but rather actor-scheduling-bound, where a scheduler-saturating number of ready ports can’t be brought about, because as ready ports per timeslice increase, the amount of the scheduler-timeslice available to actors decreases, and so fewer actors that would generate IOPS are getting run, meaning that more of the next timeslice will be available. In other words, with idiomatic use of ports, this is a self-limiting problem.

But, well, we’re talking about how the BEAM is built, rather than about what Erlang/OTP systems do in practice. The BEAM itself has these priorities, even if, in practice, they’re papered over with other ones :)

taspeotis · on April 16, 2020

Thank you for the explanation of BEAM + its objectives and internals.

> It's incredibly good at multiprocessing, which is really useful in the websphere

So on that benchmark I linked, if you switched to the latency tab Phoenix clocks in at 14.6ms average latency with a standard deviation of 23ms and a max of 460.7ms. The best looking ASP.NET Core result from a standard deviation and max point of view is 1.5ms average latency, 1.4ms standard deviation and 49ms max.

So again it's hard for me to tell whether it's BEAM or Phoenix that's responsible for the relative performance hit.

derefr · on April 16, 2020

Very likely Phoenix (Phoenix is “more” software than ASP.Net is; a better comparison might be to Erlang’s cowboy library); but also very likely that the author of that benchmark didn’t really tune BEAM. By default, BEAM has runtime tracing hooks enabled; native code compilation (HiPE) disabled; no core pinning; low enough process-heap-size upon spawn that an immediate grow is usually necessary; and a slew of other things.

There’s also the fact that Elixir and even Erlang don’t take full advantage of the possibilities of BEAM-bytecode whole-program optimization. For example, any list you walk using Erlang’s lists module or Elixir’s Enum module is going to generate a back-and-forth of remote (symbolic) calls, rather than resulting in the body of that function of lists/Enum being inlined and fused into the caller. Which in turn means, even if you use HiPE, you’ll be thunking in and out of native code so much that the overhead will eat any potential benefit; and each side will be opaque to the static analysis of the other, so even a wholesale native recompilation of the runtime won’t save you. (Same with gen_servers: remote callbacks through library code, rather than fused module-local receive statements; ergo, 10x optimization opportunity lost.)

You’ll see real optimization in code generated by e.g. Erlang’s lexer-generator library leex; or in specific parts of each language’s stdlib, like Elixir’s Unicode-handling functions. But such optimized code emission is thin-on-the-ground compared to “naive” coding in Erlang/Elixir. (And I don’t blame the language devs: the languages are fast enough for almost everything people try to do; and for the rest, you can ignore the language-as-framework and write entirely self-contained modules that only rely on BEAM primitives.)

gregors · on April 17, 2020

looking at latencies for framework vs framework tells another story https://www.techempower.com/benchmarks/#section=data-r18&hw=... Where ASP.NET MVC clocks in with 39.3ms average latency, 154.1ms standard deviation

TylerE · on April 16, 2020

It's much more complicated than a simple slow <--> fast continuum.

Elixir is pretty slow for simple singlethreaded CPU bound tasks.

It's incredibly good at multiprocessing, which is really useful in the websphere since your workload is much more likely to resemble thousands of simultaneous communications that need to not block each other rather than calculating a 13042345235223452 digit prime number.

nathan_long · on April 17, 2020

I like to say that the BEAM's preemptive scheduling provides non-blocking IO AND non-blocking computation.

To expand on your workload description: things like waiting for a database query or an API result are smoothly handled in this paradigm; the process goes to sleep for a bit whle another request is processed.

If your workflow is like a delivery truck which makes a lot of stops, "faster trucks" doesn't help much with throughput. "More trucks" is better.

taspeotis · on April 16, 2020

> It's incredibly good at multiprocessing, which is really useful in the websphere

Right so the benchmark I linked to was TechEmpower Web Framework Benchmarks, and Phoenix managed 186,774 requests/sec on a 14C/28T Xeon to do basically nothing (echo Hello World) and ASP.NET Core managed ... seven million.

ghayes · on April 16, 2020

I'd throw this hat into the arena, which is the creator of Phoenix trying to push the framework to its limits (from a few years ago) and getting 2MM active websocket connections from one server: https://www.phoenixframework.org/blog/the-road-to-2-million-...

leeoniya · on April 16, 2020

or 1MM on an old laptop with uWebSockets.js:

https://medium.com/@alexhultman/millions-of-active-websocket...

taspeotis · on April 16, 2020

I think that's a good example of how Phoenix can scale but not necessarily how fast it is. Scale is one part of performance but I should have made it more clear: I'm interested in how fast it can be. Looking at those charts it takes them 7.5 minutes to reach 2 million subscribers. Without knowing how much work is involved for each subscription it's hard to take away information about its speed.

kilburn · on April 16, 2020

> I should have made it more clear: I'm interested in how fast it can be.

The answer you are looking for is: much slower than .NET core.

The "being good at multiprocessing" of the BEAM VM people are bringing up is not about raw speed at it: it is about giving you the tools to actually being able to do multiprocessing in a sensible, stable manner.

For instance, what happens if a bug in your program blocks threads for a long time in .NET core? What if one thread crashes? Do you have any guarantees that one thread cannot affect others in any way? Etc.

Some developers think that giving up raw performance for guarantees in these areas makes sense. Other developers think they can deal with those issues without nice guarantees from the platform and thus prefer the speed. There's no definitely right answer...

nathan_long · on April 17, 2020

Another consideration: Phoenix PubSub makes it simple and performant to add more servers. So if one (say) chat app server can provide acceptable performance for N users, you can serve N more users by adding another node.

See the architecture of Phoenix Channels (which powers LiveView and other websocket-based solutions in Phoenix): https://hexdocs.pm/phoenix/channels.html

dnautics · on April 16, 2020

Why do you care so much about pure speed out of context?

TylerE · on April 16, 2020

Have you considered that's revealing more about the webserver used than the software running behind it?

Plus, 186k/sec is still a hell of a lot.

taspeotis · on April 16, 2020

That seems to be what the forum thread is about. It couldn't possibly be that Phoenix is slow, it must be ...

gregors · on April 17, 2020

I think the repo with these benchmarks have some really interesting things in them in general. It's really worth taking a peek at some of the implementations.

The aspcore platform benchmark does some cool trickery based on whether the test even requires a database using compiler #if directives. Though it seems to be flagged as "realistic approach" Dunno if I agree with that. I think I'm looking at the right code. https://github.com/TechEmpower/FrameworkBenchmarks/blob/mast...

zinclozenge · on April 17, 2020

I'm typing this all from memory so I might be wrong, but the short answer is that gen_tcp:recv/2 is a blocking call. There is a private API called prim_inet:async_recv which is async, and ironically gen_tcp builds on it. There have been attempts writing experimental HTTP servers using prim_inet:async_recv which have much better performance,