
For implementing Python, you have at least five options:

- Naive interpreter (CPython). Everything is a dict. Slow.
- Transliterate to some hard-compiled language, but all data is still one kind of dynamically typed object (Nuitka). A little faster, and compatible, but has limited optimization potential.
- Infer types and try to create appropriate code in a faster language (Shed Skin). Hard to do, but promising. (Shed Skin has one implementor.)
- Restrict the language (RPython). Potentially much faster, but incompatible.
- Build a JIT compiler/interpreter combo and handle all the hard cases that require recompiling during execution (PyPy). Hard to do, and results in a huge system, but almost compatible. After 10 years of work, it's finally happening.

If you're willing to restrict the language, it's much easier. RPython was written only to help build PyPy, but the concept could be extended to allow most of Python. Both Shed Skin and RPython insist that type inference succeed at disambiguating types. If you're willing to accept using an "any" type when type inference fails, you can handle more of the language.
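To illustrate where type inference succeeds and where an "any" fallback would kick in, here is a small sketch. The functions `total` and `parse` are hypothetical examples, not taken from Shed Skin or RPython, and the comments describe how an inferring compiler could plausibly treat them rather than what any particular tool does.

```python
from typing import Any

def total(prices):
    # The only call below passes a list of floats, so an inferring
    # compiler could settle on one concrete type and emit fast code.
    result = 0.0
    for p in prices:
        result += p
    return result

def parse(text) -> Any:
    # The result type depends on run-time data: int for "42", str otherwise.
    # Strict inference cannot pick a single type here; an "any" fallback
    # would keep the function legal at the cost of dynamic dispatch.
    return int(text) if text.isdigit() else text

a = total([1.5, 2.5])
b = parse("42")
c = parse("forty-two")
```

Only code that touches the result of `parse` would pay for the dynamic type; `total` could still be compiled with fixed types.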

The big boat-anchor feature of Python is "setattr", combined with the ability to examine and change most of the program and its objects. This isn't just reflection; it's insinuation; you can get at things which should be private to other objects or threads and mess with them. By string name, no less. This invalidates almost all compiler optimizations. It's not a particularly useful feature. It just happens to be easy to implement given CPython's internal dictionary-based model. If "setattr" were limited to a class of objects derived from a "dynamic object" class, most of the real use cases (like HTML/XML parsers where the tags appear as Python attributes) would still work, while the rest of the code could be compiled with hard offsets for object fields.
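A minimal sketch of the "by string name" point, assuming CPython's default behavior for ordinary classes. The `Account` class and its field names are made up for illustration.

```python
class Account:
    def __init__(self, balance):
        self._balance = balance  # "private" by convention only

acct = Account(100)

# Any code holding a reference can rewrite the field, naming it with a string.
field = "_" + "balance"          # the name can be computed at run time
setattr(acct, field, 10**6)

# Attributes the class never declared can be attached as well.
setattr(acct, "surprise", "added at run time")

# The instance's fields live in an ordinary dict, exposed by vars().
fields = sorted(vars(acct))
```

Because the attribute name is an arbitrary string computed at run time, a compiler cannot know ahead of time which fields an `Account` has, which is why fixed field offsets are off the table.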

The other big problem with Python is that its threading model is no better than C's. Everything is implicitly shared. To make this work, the infamous Global Interpreter Lock is needed. Some implementations split this into many locks, but because the language has no idea of which thread owns what, there's lots of unnecessary locking.
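A small sketch of what "implicitly shared" means in practice. Nothing in the code marks `counter` as shared between threads; the names `counter` and `worker` are hypothetical, and the explicit lock is the programmer's responsibility, not something the language tracks.

```python
import threading

counter = {"hits": 0}      # ordinary object, visible to every thread
lock = threading.Lock()

def worker(n):
    for _ in range(n):
        with lock:         # the programmer, not the language, guards the dict
            counter["hits"] += 1

threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = counter["hits"]
```

Since any object might be reachable from any thread, an implementation has no way to tell from the source which objects are thread-local and could skip locking.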

Python is a very pleasant language in which to program. If it ever gets rid of the less useful parts of the "extremely dynamic semantics", sometimes called the Guido van Rossum Memorial Boat Anchor, it could be much more widely useful.



> The big boat-anchor feature of Python is "setattr", combined with the ability to examine and change most of the program and its objects. [...] If "setattr" were limited to a class of objects derived from a "dynamic object" class, most of the real use cases [...] would still work, while the rest of the code could be compiled with hard offsets for object fields.

Isn't __slots__ made for that specific use case? (when you want to optimize your code by specifying specific attribute names)
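For reference, a short sketch of how `__slots__` behaves. The `Point` class is a hypothetical example.

```python
class Point:
    __slots__ = ("x", "y")   # fixed set of attribute names, no per-instance __dict__

    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)
setattr(p, "x", 10)          # declared names still work with setattr

try:
    setattr(p, "z", 3)       # an undeclared name is rejected
    rejected = False
except AttributeError:
    rejected = True

has_dict = hasattr(p, "__dict__")
```

Note that this is opt-in per class, the reverse of the "dynamic object" proposal quoted above, and it restricts instances only: the class object itself can still be modified at run time.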

I think the "dynamic" behavior is a sane default (since most people don't need that optimization).


setattr and similar dynamic features do make Python harder to optimize, but they're not that different from what you see in JavaScript. JavaScript has had an incredible amount of work spent optimizing it, and the result is a bunch of pretty damn fast JITs that implement the full language, usually only slowing down in cases where actual use of those features requires it. Is it really that hard to come up with something like that for Python?

(Threading is a separate issue, though.)


> Is it really that hard to come up with something like that for Python?

Nope, we just need a large company willing to spend tons of money funding the effort.

PyPy has made incredible strides in this area, especially for long-running processes where the JIT has time to warm up. But they need a lot more funding if people ever want Python to get fast.


I'm very impressed that PyPy got their JIT working. But it took 10 years from initial funding by the European Union. It's a hard problem. They had to come up with some new, elegant solutions to make it work. See "https://pypy.readthedocs.org/en/release-2.3.x/jit/pyjitpl5.h...

JavaScript is a bit easier because it doesn't have concurrency. In Python, you can change a method of an object while an instance of that object is being executed in another thread. So you have to worry about invalidating code currently being executed asynchronously.
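A sketch of the method-replacement case, made deterministic with events so the swap lands between two calls. In real code nothing prevents it from landing at any point while the other thread is running. The `Greeter` class and the event names are hypothetical.

```python
import threading

class Greeter:
    def greet(self):
        return "hello"

g = Greeter()
results = []
first_call_done = threading.Event()
patched = threading.Event()

def worker():
    results.append(g.greet())      # sees the original method
    first_call_done.set()
    patched.wait()                 # wait until the main thread swaps the method
    results.append(g.greet())      # same instance, new behavior

t = threading.Thread(target=worker)
t.start()
first_call_done.wait()
Greeter.greet = lambda self: "goodbye"   # replace the method on the class
patched.set()
t.join()
```

After this runs, `results` holds `["hello", "goodbye"]`: the same instance in the same thread observed two different method bodies, so any compiled code specialized on the first one would have to be invalidated.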



