Python is just another language VM. CPython should work just fine in WebAssembly, and bridging the APIs to the web (e.g. the filesystem) is already possible through Emscripten.
We don't plan to do things for WebAssembly; we plan to make WebAssembly powerful enough that anyone can run their language VM :-)
I think what they meant is that they could use Python without having to also ship an implementation like CPython. I mean, that thing is huge. How is it feasible to serve 10 MB for a website?
That's awesome, and it means that, alongside Objective-C, I've effectively waited out deep-diving on two (soon-to-be) formerly popular programming languages. I get to use my preferred language, Python, for even more.
Julia would still require the garbage collector, but I don't think the downvotes here are fair at all. If there were a language that would be a natural transition for people used to JavaScript but wanting native speed, Julia would be it. In a way, it is a glimpse of what JavaScript could have been, given a lot of experience and much more design.
Question, please: even though there is a stated desire to support the Python VM and others, won't this be hampered by the need for the client to download the large VM, versus native code that can be distributed by itself?
I see your excitement about Numba; I guess you follow it closely, or perhaps have other valid reasons, so all the best. However, the last time I tried to use it, about 4 months ago, installation was a bitch and it would crap out compiling some functions. It holds promise, of course, but so do the alternatives.
What I think the current post is about is bringing similarly nice and performant abstractions to D. I love Python a lot, but I have to pay extra attention to make it performant and to avoid stupid typos. I have to rewrite parts in Cython or Weave (sadly, the latter gets no love anymore). The larger the code base becomes, the more time I have to spend on the tail part of the 'tooth to tail' ratio. In D, many of these things are taken care of by default, so I have to worry a lot less about them.

The other fantastic thing is that DMD has super fast compilation times. It used to be faster than Go in compilation time, but the latter has caught up in that department (while grossly lacking in others). D binaries execute a lot faster (when compiled with LDC or GDC).

D may lack some libraries here and there, but that does not worry me as much; everyone has to start somewhere, and to be honest, Python is lacking in that respect compared to R. What would be worrisome is the presence of structural aspects that might get in the way of such an ecosystem emerging. I don't see anything of that kind in D.
In any case, I don't quite see how Numba obviates the issues I mentioned. It does not change the ndarray representation, and the extra indirections lie right there. OTOH, some copy elision I would grant.
I can give concrete examples. The isotonic regression implementation in scikits.learn was absolutely pathetic. It has been rewritten several times, and its performance is still severely lacking in spite of being Cythonized. You can take a stab at it with Numba and see what you can do. In C++ (D would have been nicer), the very first idiomatic implementation in the STL was faster than any of those rewrites; I wrote it once and moved on. But C++ is a horrifying mess. Now, if there were a language that gave me C++ performance (not asking for much) but none of the mess, and NumPy-level expressiveness to boot, that would be sweet. Julia, Nim, D, and PyPy are some of the contenders trying to reach this holy grail.
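To make it concrete, the core of isotonic regression is the pool-adjacent-violators loop, which looks roughly like this (a from-memory Python sketch of the algorithm, not the actual scikits.learn code): a sequential, stack-based merge that is painful to vectorize but trivial in a compiled language.

```python
def isotonic_regression(y):
    # Pool adjacent violators: maintain a stack of (mean, size) blocks;
    # merge backwards whenever a new value breaks monotonicity.
    means, sizes = [], []
    for v in y:
        m, s = float(v), 1
        while means and means[-1] > m:
            m = (means[-1] * sizes[-1] + m * s) / (sizes[-1] + s)
            s += sizes[-1]
            means.pop()
            sizes.pop()
        means.append(m)
        sizes.append(s)
    # Expand the block means back to the original length.
    out = []
    for m, s in zip(means, sizes):
        out.extend([m] * s)
    return out
```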
Numba certainly does not obviate all of these issues. I think user `tadlan` is referring to some of the compiler optimizations, like loop unrolling or fusion (e.g. noticing that two subsequent loops can be 'fused' into the body of one single loop). These things can offer speed-ups even beyond NumPy, and they can work even when the Python code you start out with already uses NumPy.
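A minimal sketch of what that fusion buys you (my example, not from the article): the NumPy version makes two passes over the data and allocates a temporary array, while the jitted loop fuses the multiply and the reduction into a single pass.

```python
import numpy as np
from numba import jit

def dot_numpy(a, b):
    # Two vectorized passes: a temporary array for a * b, then a sum.
    return np.sum(a * b)

@jit(nopython=True)
def dot_fused(a, b):
    # One fused pass: multiply and accumulate in a single loop,
    # with no temporary array.
    total = 0.0
    for i in range(a.shape[0]):
        total += a[i] * b[i]
    return total
```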
The thing is (and `tadlan` seems unaware of this), a lot of this stuff just fails in production environments and hits corner cases that the Numba compiler does not handle. I'm talking about the first part of the Numba compiler pipeline, where it converts CPython bytecode to Numba IR (not yet to LLVM IR) and, among other things, rewrites CPython's stack-based representation into a register-based one that is compatible with LLVM and ultimately with the actual machine itself. In that compilation step, the only things that can be handled are things that the Numba team (I used to be a member of it) explicitly supports. They don't support a full-blown compiler for the entire Python language, nor even for every type of NumPy operation. That's not a knock against Numba at all -- it's a specializing compiler, and obviously they need to prioritize what to support and set longer-term goals about supporting more general things. But the point still remains that you cannot just assume that if you call `jit` on arbitrary Python code, it will always become faster. In some cases, it can even become slower.
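You can watch that specialization happen yourself (a quick sketch; assumes a reasonably recent Numba):

```python
from numba import jit

@jit(nopython=True)
def add(x, y):
    return x + y

add(1.0, 2.0)          # first call triggers compilation for (float64, float64)
print(add.signatures)  # the specialized signatures compiled so far
add.inspect_types()    # dump the inferred Numba IR types, line by line
```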
I suspect `tadlan` is very interested in Numba and enthusiastic about knowing the taxonomy of Numba details, but it does not seem like that user has had real-world experience trying to get Numba to work in production and seeing all of the numerous buggy and missing features. I don't want to diminish anyone's enthusiasm for Numba, so it's probably best just not to engage with `tadlan` about it. That user's mind seems made up already.
As a former member of the Numba team, what is your outlook on the technology and the broader Continuum ecosystem?
I'm looking at Numba and friends (DyND, Blaze) for a new stack, but I'm not sure where the development arc will end up versus, say, Julia. I'm also curious about the sustainability of Continuum's business model and practices in the medium and long term.
Any thoughts on this? I understand if you are limited in what you can say, but I'm open to any nuggets.
This is one of those things that is extremely hard to predict. Even though I was part of that team, it doesn't mean I have any special insight into what will happen.
A lot will depend on sources of funding. Will folks like Nvidia start sponsoring Numba, and if so, what will that mean for support of Nvidia alternatives like OpenCL?
It seems like NumbaPro as a stand-alone for-pay product is not viable on its own, but that claim could be wrong based on more recent data that I don't have access to. So external sponsorship may be necessary.
One form of this could be through Continuum's already established business model of consulting and support services. But then the question is whether the nature of those consulting and support projects will let developers actually further the cause of Numba, or merely hack in poorly conceived features demanded by the customers. Since Numba is open source, it should be easy enough for anyone to follow the commits and discussions on GitHub and form their own opinion about which direction it is going.
The other question that is always hard is staffing. The colleagues I had the chance to work with on the Numba team were far and away amazingly good. But it's not clear whether working solely on Numba can justify the sort of salary that would be required to attract very top engineers and grow the team. You might start to see more interns and/or post-doc-type labor feeding into Numba, and again, I don't know what that will mean for the project ... it could be good or bad.
At the same time, there is also a lot of active development on Julia and PyPy, and a lot of people still prefer to use Cython rather than jitting functions. Some people even call into question the entire goal of making something that is "easy" but also a "black box" -- like the way just dropping in `jit` works for people who merely use, but don't understand the inner workings of, CPython.
It's an exciting area, and the Numba team has as much talent and ability as anyone else to claim a significant piece of the tool space surrounding high-performance computing. Whether that will pan out for them is still really hard to predict.
What do you think about the foundational tech of Numba itself?
Do you think it is any more of a black box than, say, Julia? Is there anything about it that would impede its extension into a more stable, predictable, and feature-rich product?
Julia is really fast in 95% of cases, but the remaining 5% still makes all the difference. Pairwise summation is one example. I will post D vs. Julia benchmarks next week ;)
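For those unfamiliar, pairwise summation looks roughly like this (a Python sketch just to show the shape of the algorithm; the actual benchmarks will be in D and Julia):

```python
def pairwise_sum(a, lo=0, hi=None):
    # Recursively split and sum halves; rounding error grows like
    # O(log n) instead of the O(n) of naive left-to-right summation.
    if hi is None:
        hi = len(a)
    if hi - lo <= 8:              # small base case: plain sequential sum
        return sum(a[lo:hi])
    mid = (lo + hi) // 2
    return pairwise_sum(a, lo, mid) + pairwise_sum(a, mid, hi)
```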
I would say the biggest benefit of D is static typing. In Julia, you can run a simulation and discover only after half an hour that you misspelled a function name.
Considering that in the benchmarked example only one line of NumPy code was used, which already calls into compiled C, I have a hard time believing that that approach would catch up to using all compiled code.
While I agree broadly that the benchmark example in the post is not representative or useful as a comparison between D and NumPy, I disagree with your strong insistence on Numba in this case.
There is still a lot of Python code that the Numba-to-LLVM compiler cannot handle. Yes, it is true that Numba can do a decent job of removing CPython virtual machine overhead, even for functions in which you statically type the arguments merely as 'pyobject' -- but not universally.
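Roughly this kind of thing (an illustrative sketch; the exact decorator options have shifted across Numba versions): even with fully generic object-typed arguments, Numba compiles the loop control flow and shaves off some interpreter dispatch overhead.

```python
from numba import jit

# Object mode: arguments remain generic PyObjects, so no type
# specialization happens, but the loop structure is still compiled,
# trimming some bytecode-dispatch overhead.
@jit(forceobj=True)
def count_truthy(items):
    n = 0
    for x in items:
        if x:
            n += 1
    return n
```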
And there are also a ton of super basic things that Numba doesn't handle -- for example, creating a new array inside the body of a function when using Numba in 'nopython' mode [1]. These things will improve over time, but they may not improve quickly enough for a given use case, and the D language and this ndarray implementation may be a fair competitor to Numba in the short-to-medium term (and could even be superior in the long run, who knows).
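Concretely, code like the following is the kind of thing that trips 'nopython' mode (a sketch; whether it actually fails depends on your Numba version):

```python
import numpy as np
from numba import jit

@jit(nopython=True)
def squares(n):
    out = np.empty(n)      # allocating a new array inside the jitted body:
    for i in range(n):     # historically unsupported in nopython mode,
        out[i] = i * i     # producing a typing error at compile time
    return out
```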
A fairer alternative would be to compare with Cython, which for my money still hands-down beats anything like D or Nim/Pymod for performance enhancement without sacrificing pleasantness of design and development.
Though, of course, none of this stuff holds a candle to Haskell :)