Had to write a fairly substantial native extension to Python a couple years ago and one of the things I enjoyed was that the details were not easily "Googleable" because implementation results were swamped by language level results.
It took me back to the old days of source diving and accumulated knowledge that you carried around in your head.
I made some small contributions to CPython during the 3.14 cycle. The codebase is an interesting mix of modern and “90s style” C code.
I found that agentic coding tools were quite good at answering my architectural questions; even when their answers were only half correct, they usually pointed me in the right direction. (I didn’t use AI to write code and I wonder if agentic tools would struggle with certain aspects of the codebase like, for instance, the Cambrian explosion of utility macros used throughout.)
This was around 2021 so AI code tools had not yet eaten everyone. One of the most interesting challenges was finding the right value judgements when blending multiple type systems. I doubt any agentic coding tool could do it today.
I blended the Python type system with a large low-level type system (STEP AIM low-level types) and a smaller set of higher-level types (STEP ARM, similar to a database view). I was already familiar with STEP, so I needed to really grok what Python was doing under the covers because I needed to virtualize the STEP ARM and AIM access while making it look like "normal" Python.
Oh, that's very interesting work. And, yes, I'd also be surprised if (today's) agentic tools were at all helpful for that: it's way outside of distribution, and conceptual correctness truly matters.
I've been comparing various platforms and discussing them with ChatGPT, for instance why Python's execution is slower than JavaScript's V8. It claimed this is due to technical debt and the inability to change because libraries like NumPy bypass public interfaces and access data directly.
I'm wondering how much of that is true and what is just a hallucination.
Btw: JavaScript seems to have similar complexity issues.
If we are being very pedantic, languages don't have "speed", only implementations do.
Of course, in real life there are de facto implementations, and language features lead to better or worse tradeoffs.
With that out of the way, Python is basically the de facto glue language. It is very often used to provide a scripting API over lower-level C libraries. To be ergonomic in this role, CPython (the major implementation) exposed some internal details of its execution model, which C libraries can reach into. This makes more aggressive optimizations very hard; as one example, a C library can directly increase or decrease the reference count of an object. Another design decision (that got some discussion recently) is the GIL (global interpreter lock), which makes Python much less competitive for multithreaded work than something like Java. (JS also has a single thread of execution, though there are ways around it.)
JS has a different use case, so access to the C world doesn't impose such restrictions on it.
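To make the reference-count point concrete, here is a minimal sketch that pokes at an object's refcount from outside the interpreter's normal bookkeeping. It assumes CPython, since `ctypes.pythonapi` and `sys.getrefcount` are CPython-specific, and it uses `Py_IncRef`/`Py_DecRef`, the function forms of the `Py_INCREF`/`Py_DECREF` macros a C extension would normally use. The helper name `poke_refcount` is made up for illustration.

```python
import ctypes
import sys

# Function forms of the Py_INCREF / Py_DECREF macros from CPython's C API.
# ctypes.pythonapi calls them with the GIL held, as a C extension would.
_incref = ctypes.pythonapi.Py_IncRef
_incref.argtypes = [ctypes.py_object]
_incref.restype = None

_decref = ctypes.pythonapi.Py_DecRef
_decref.argtypes = [ctypes.py_object]
_decref.restype = None


def poke_refcount():
    """Return (before, bumped, restored) refcounts for a fresh object.

    Hypothetical helper: it only demonstrates that code outside the
    interpreter loop can change an object's reference count directly.
    """
    obj = object()                   # fresh object, not an immortal singleton
    before = sys.getrefcount(obj)
    _incref(obj)                     # the interpreter is not consulted here
    bumped = sys.getrefcount(obj)
    _decref(obj)                     # undo it, otherwise obj would leak
    restored = sys.getrefcount(obj)
    return before, bumped, restored


if __name__ == "__main__":
    print(poke_refcount())
```

Because extensions are allowed to do this at any time, the interpreter cannot quietly replace reference counting with a different memory-management scheme without breaking them.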
Both you and the grandparent comment are correct. The implementation is slow because the API that it exposes is so leaky that implementation changes (for example a tracing garbage collector) are impossible to implement without changing the API, and the API cannot easily change because of the dependence of the ecosystem on it (e.g. NumPy).
Lots of people. Several people from Arm and Microsoft, various PhD students... I don't know if anyone working at Facebook worked on the JIT, maybe they did.
As someone who has many times dived into deep rabbit holes like this (e.g. how does JavaScript's prototype-based class work?), some effective ways to handle this are to ask follow-up questions, use web search, or ask for references. Deep search also helps. Often it corrects itself or takes back claims that have no basis. At the very least, it provides references that you can read yourself.
Of course, you can't really do all of that on a free plan.
That's far from ideal, but if you are motivated and care about these technical details (which you probably do), you can get pretty good results.
=====
Putting all of this aside, you can sometimes find YouTube videos on obscure channels that talk about these things. Chances are that someone who cares to make a YouTube video about these hardcore topics knows what they are talking about.
I wish they would just go back to calling it Python, since it’s the Python that everyone knows and uses. No one gets confused over Python the spec and Python the implementation. Every time I see “CPython” I have to double check we’re just talking about Python.
I guess they “CPython’ed” back when people thought Jython would take off, and it never did because Java sucks.
Yes, but no one is ever talking about PyPy or Jython implicitly. They are always mentioned by name because they represent <0.1% of all Python usage and are relegated essentially exclusively to niche or experimental use cases for power users.
It’s a bit like arguing people should start saying “homo sapiens” when referencing “people” for added precision. It may be useful to anthropologists but the rest of us really don’t need that. Similarly, CPython is really only a sensible level of precision in a discussion directly about alternative Python implementations.
(although in this case the original post is about implementation internals so I’d give it a pass)
This seems to be literally looking at the details of the C implementation of a Python interpreter. Exactly specifying the implementation makes sense here. You wouldn't say "how does the C++ compiler work" then look only at gcc.
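For anyone unsure which implementation a given `python` actually is, the standard library can report it. A small sketch (the function name `describe_runtime` is made up for illustration):

```python
import platform
import sys


def describe_runtime() -> str:
    """Return a short label naming the implementation running this code."""
    # platform.python_implementation() reports names such as "CPython",
    # "PyPy", "Jython", or "IronPython".
    impl = platform.python_implementation()
    version = platform.python_version()
    # sys.implementation.name is the lowercase identifier, e.g. "cpython".
    return f"{impl} {version} (sys.implementation.name={sys.implementation.name})"


if __name__ == "__main__":
    print(describe_runtime())
```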
IDK why /InternalDocs/ instead of /Doc/internals/? (`ln -s` works on Mac/Linux/WSL.)
Ideally what's in InternalDocs/ would be built into the docs.python.org docs.
Is it just that markdown support in sphinx is not understood to exist?
Sphinx does not parse Markdown natively; Markdown support, including MyST Markdown, comes from an extension. To support MyST Markdown in a Sphinx project, you must e.g. `pip install myst-parser` and add "myst_parser" to the extensions list in conf.py.
Sphinx resolves reference labels at docs build time, so each reference is replaced with the relative URL (path#fragment) of the document where the label is defined; in reStructuredText and then MyST Markdown:
.. _label-name:
(label-name)=
:ref:`Link title <label-name>`
{ref}`Link title <label-name>`
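The setup described above might look roughly like this in `conf.py`. This is a sketch, assuming the `myst-parser` package is installed; the `project` value is a placeholder, and the explicit `source_suffix` mapping is optional since the extension registers `.md` on its own.

```python
# conf.py (sketch): enable MyST Markdown alongside reStructuredText.
project = "example-docs"  # placeholder name, not from the thread

extensions = [
    "myst_parser",  # provided by the myst-parser package
]

# Optional: spell out which parser handles which file suffix.
source_suffix = {
    ".rst": "restructuredtext",
    ".md": "markdown",
}
```

With this in place, `.md` files under the docs source directory are built alongside `.rst` files, and labels defined in either format can be referenced from the other.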
It took me back to the old days of source diving and accumulated knowledge that you carried around in your head.
https://www.dave.org/posts/20220806_python/