Hacker News | thomasmg's comments

I don't have a strong opinion what is better in this case, but my view is:

> document and express intent clearly

Arguably, the void* does that as well?

> Any seasoned C++ developer seeing this knows what this reinterpret_cast means.

Same for void*?

> it's a bit more text to read

If you have to call it many times, this adds up.

> Some might also say it complexifies and uglifies the code

I think the point is that it adds security, which the other options don't. And, it doesn't add complexity on the caller, but only at one place: the implementation.

> makes it non-portable on top of that.

This can be solved.


> Arguably, the void* does that as well?

Sort of (I mean: seeing void* and a size probably means 'arbitrary sequence of bytes' or something like that, but well it's void* so it can be like anything whereas with std::span you get more of a hint what's going on just based on the type), but not at the callsite which is what the author is referring to when it's about reinterpret_cast.

> I think the point is that it adds security, which the other options don't

Imo span also does that to some extent, but already when writing the code and not afterwards in e.g. static analysis. E.g. if I get an std::span<const char> I'd have to do counterintuitive things to misuse it. Annotating a void* still leaves it a void* which I then need to cast to char* if I think that's what is intended.

Don't get me wrong: I've written my fair share of void* but these days I really feel like there's almost always a better thing which can be used instead. Though I do admit that since I've written and consumed a lot of code with such alternatives I'm not hindered by readability/apparent complexity of it anymore but I understand that's not the same for everyone.


> whereas with std::span you get more of a hint what's going on just based on the type

You don’t if it’s a span of bytes (or equivalent).

Encoding the length in a span is a meaningful thing. But the fact that it holds a random memory pointer labeled “byte” instead of “void” doesn’t change anything.


> Arguably, the void* does that as well?

How do you figure? The type is a pointer to quite literally anything, including nothing (ie a pointer that cannot be dereferenced). If you're working with bytes, indicate this with the type.


> The type is a pointer to quite literally anything

No, it can only be a pointer to an object. It can't be a pointer to a function, for example.


All pointers in C/C++ can point to “nothing”. Swapping “void *” for “byte *” is basically an aesthetic choice.

Right. So why not use a typed reference?

1. void * is a type

2. The functions in question here operate on raw memory. void * conveys the same information as byte *. It’s not untyped vs typed. It’s two different types that convey the same information but require slightly different mechanics to obtain.

3. The author explained his reasoning for preferring void * over byte * (or uint8_t *)

4. A reference also isn’t the same thing as a pointer in C++


I'm not trying to criticize, but Python is known to be much slower than e.g. Java or Go. So for performance-critical code, why use Python? I find Python to be very good because it is concise and simple, but I have not used it for production so far.


You use Python when it makes sense for other reasons (library support, coworker familiarity, etc), same as for any other project. Additionally, sometimes performance matters, but perhaps not enough to overcome whatever else is drawing you to Python in the first place.

Right this second I'm writing something in Python with critical performance requirements. It needs to average processing 25k things per second. That won't be particularly hard, but it's close enough to the edge of what the language is capable of that I do need to be at least a tiny bit careful with the implementation. I'm highly unlikely to need a profiler for this project in particular, but earlier in my career I probably would have needed one.

Python is fairly commonly used as a glue engine around faster code too, and it's not always obvious when the wrapper code is inducing nontrivial overhead (hidden copies and that sort of thing). Profilers are great for teasing out those sorts of problems. They shine a spotlight on the section of code which should take 0us and is instead dominating your runtime.


The simple answer is, I choose to use Python because I am productive with it. I get a lot done compared to the other languages I have tried. Performance is almost never the limiting factor in my work, nor has it been for the vast majority of the work I've ever been witness to. When it is, it comes up in very particular circumstances, and can usually be fixed algorithmically. Indeed, that is the situation where I have used profilers.

The fact that the base language is an order of magnitude (or two!) slower has almost never mattered. If my work gets to the point where it does, and I have an excuse to go mess around with a rust extension or some cool optimized library, things are going very well.

I've been a professional developer for over 20 years now, and I've read this forum obsessively for much of that time. I've seen people write things like, "Most engineers would kill for a 5% speedup" and I think, on what planet? Most engineers have much larger problems that cannot be so easily quantified. Come to think of it, I think that there is an allure to performance optimization due to the fact that it can be so easily quantified.


> I choose to use Python because I am productive with it.

That, I fully understand. I think many developers are productive in one language: the one that they know best, which is probably the one they use most. It might happen that this is a "fast" language by accident (like Java or Go), or a language like Python. And then there is never enough reason to switch.

> "Most engineers would kill for a 5% speedup"

I think this is very rare - maybe a heavily used app in Facebook or Google, where 5% could mean a lot of money. But a factor of 10 speedup is much more common (and possible sometimes).

> there is an allure to performance optimization due to the fact that it can be so easily quantified.

That's true. I also think simplicity is quantifiable, and so my personal hobby is to write something impressive in few lines. Like a chess engine, QR code reader, editor, data compression tool, compiler, in 500 lines. But this is mostly for hobbies I guess. For work, it's mostly about features, and then performance, I guess.


> I think many developers are productive in one language: in the one that they know best.

While superficially true, this conflates cause and effect. I am, or was at one time, extremely proficient in multiple assembly languages, Fortran, C, Pascal, Modula-2, Verilog, and Python, marginally proficient in multiple other assembly languages, Perl, Tcl, etc., and have toyed with various lisps and functional programming languages.

I viscerally recoiled from a few languages, like Java, Javascript, and C++, and found other languages, like Go, Lua and Julia, not compelling enough to bother switching to for my use cases.

And while I can certainly believe that many people are most productive in the language they know best, that most emphatically doesn't mean that they learned a language and became proficient in it to the exclusion of others. It most certainly meant in my case that I spent more time with, and learned ever more about, a language that was already productive and useful for me.


I think it's the opposite: the fact that it is slow means profiling is more important. This is because a 10x difference between unoptimized 0.1ms and optimized 0.01ms in Go could translate to 10s vs 1s in an equivalent Python script, which is a considerably more noticeable difference.


Even when performance is not critical it’s possible to write (or vibe?) disastrously slow code, so having a profiler handy is always a plus. It might be the deciding factor between “we must rewrite it all in Rust” and “oh we added exponential complexity by accident”, and save a ton of time.
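To make the "exponential complexity by accident" case concrete, here is a minimal sketch using the standard library's cProfile and pstats. The function names `fib_naive` and `count_calls` are made up for illustration, and the profiler discussed in this thread may report things differently. The point is only that a call count makes the blow-up obvious long before anyone proposes a rewrite.

```python
import cProfile
import pstats


def fib_naive(n):
    # Deliberately naive: recomputes the same subproblems over and over.
    return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)


def count_calls(func, *args):
    """Run func under cProfile and return (result, number of calls to func)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stats = pstats.Stats(profiler)
    # stats.stats maps (file, line, name) -> (primitive calls, total calls, ...)
    calls = sum(
        total_calls
        for (_file, _line, name), (_prim, total_calls, *_rest) in stats.stats.items()
        if name == func.__name__
    )
    return result, calls


if __name__ == "__main__":
    value, calls = count_calls(fib_naive, 15)
    print(f"fib_naive(15) = {value}, computed with {calls} calls")
```

A call count that grows this fast for a small input is usually an algorithm problem, not a language problem.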


Sometimes you get there by accident. You make a thing, it grows or is used in unexpected ways, now suddenly performance matters.

Sometimes Python is just the language used in the domain. Lots of sciences live on Python because it is easy to teach to grad students and the package ecosystem is strong.


Profiling is about acknowledging hot paths. What to do with that info is up to programmers - usually trying to optimize code that takes more execution time than expected. Every language needs good tooling, no matter how fast its runtime.


I wrote a logistics optimization program (think travelling-salesman++) in Python and it gradually took over from the v1 version written in Java.

People generally compare "speed of execution" to "speed of coding".

I'll add "speed of troubleshooting". When something goes wrong and the clock is ticking, you need tools to identify quickly if the data is inconsistent, a config is set wrongly, or the algorithm is stuck at a local optimum. And then decide if things can be mitigated with a patch or whether we should grovel and buy time from the client... All of this is much easier in Python (bar the grovelling).

LLMs are changing things though. I recently ported a complex part of the kernel to C++. 90% of it was creating the scaffolding before the actual algorithms were ported. Very tedious work but Claude can just chug through. Wouldn't like to maintain it just yet.


In production what's making your application slow is extremely unlikely to be the python code. It's going to be I/O, the threading/concurrency architecture, other mistakes or inefficiency that can be cleared up without leaving the ecosystem. The question of fast vs. slow languages doesn't make a lot of sense to entertain before you have any context of the specific needs of the application or use case. On its own it's just unsophisticated vanity blog fodder.


With the obvious caveat that low-level game engine, image/video processing, numerical code etc. isn't really viable in Python. But outside of that, it's fast enough for gluing together other code that's doing the heavy lifting.


Right, if you're writing an Xbox game for example you wouldn't go with Python but if you're in the range of use cases where a language isn't fully disqualified then the pure speed of the language itself rarely needs to guide the decision.


Generally you choose Python for the conciseness you mentioned, and then move the performance-critical functions into another language like C or (I find to be easiest) Cython. Ideally most of your code stays Python, and you either optimize self-contained pieces, or find library bindings that have done it for you.

A profiler like this can be used to identify which parts to rewrite in a faster language. Sometimes it's easier to write everything in Python first, then measure, than guess at the start which parts need to be fast.

You can also get gains by switching algorithms, both in pure Python and when using a compiled library like `numpy`. And there are also some operations, like string manipulation or the `sqlite3` module, where the Python runtime's implementation has already been optimized in a compiled language.
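As a rough illustration of the "switching algorithms" point in pure Python, the sketch below counts common elements twice: once with list membership tests and once with a set. Both function names are hypothetical. The second version does the same work with a hash lookup per item instead of a linear scan.

```python
def count_common_slow(needles, haystack):
    # `in` on a list scans it, so this is O(len(needles) * len(haystack)).
    return sum(1 for n in needles if n in haystack)


def count_common_fast(needles, haystack):
    # Building a set once makes each membership test O(1) on average.
    lookup = set(haystack)
    return sum(1 for n in needles if n in lookup)


if __name__ == "__main__":
    needles = list(range(0, 2000, 2))
    haystack = list(range(1000))
    print(count_common_slow(needles, haystack))
    print(count_common_fast(needles, haystack))
```

Both versions return the same count; only the cost per lookup changes, which is the kind of gain that needs no compiled extension.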


It's super easy. If you collaborate with non-software people (say, in hardware engineering), there aren't many good alternatives. I've seen attempts at DSLs, Lua, Perl, and C (disaster), but Python lets anyone dig down and understand/debug.


Because your colleagues insist on it and now you're stuck with slow Python despite telling them that this is exactly what would happen...


Well... actually, it isn't. I'm also writing my own programming language (named "Bau"). I asked Claude to convert a minesweeper game from C to that language. I only gave some example programs in my language and the grammar. This worked on the first try (Claude didn't even have access to the compiler).


Not to single you out but I really don’t like the whole “I asked an LLM” phrasing. All this language anthropomorphizing a tool is just weird to me


I understand what you mean; Claude is a tool and does not have feelings, that's clear to me. But how else can I describe what I did? "Wrote to Claude" has the same issue. Posted, typed, inputted?


“I used Claude to…” “I tried to X using Claude” etc

Anyway doesn’t matter. I’m just kind of whining, I probably should’ve never written that comment in the first place. I think it just sticks out to me, unlike a lot of common parlance in other industries (which can definitely steer into anthropomorphizing), because we’re seeing all kinds of issues with people attributing actual intelligence to these things or just experiencing general psychological distress because of them. Using language that ascribes human characteristics to describe using LLMs just feels weird in that context.


Given these machines are the product of massive intentional and increasingly successful efforts to humanize computers, increased anthropomorphization is appropriate.

The behavior/attribute overlap isn't a coincidence or misunderstanding, it is by design.

In the case of "ask", that describes our behavior, not the machine's.

But if a machine is able to recall and use some fact fluently then it makes sense to say it "knows" it. We routinely use words like "know", without any confusion, when talking about simpler lifeforms that are far less human-like than these models.

None of the above means the machine feels pain, is conscious, has a continuous identity, etc. Yet.


I say "I asked Python to".


It is not quite as rare. I calculated it to be less common than being hit by a meteorite, and added a section about that and the Birthday Paradox to Wikipedia, to the article about UUIDs. It got removed / replaced a few years ago however. (If my source was correct, there was actually a woman hit by a meteorite, but she survived, with a leg injury.)

If you do have a UUID collision, chances are extremely high that it's either a software bug, or glitch in the computer. It could be a cosmic ray. Cosmic rays messing with the computer memory or CPU are actually relatively common.
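For a sense of scale, the birthday-paradox estimate can be sketched in a few lines. This assumes version 4 UUIDs with 122 random bits and a perfect random source; the function name is made up, and the formula is the usual approximation p ≈ 1 - exp(-n(n-1) / (2N)).

```python
import math

RANDOM_BITS = 122  # assumption: version 4 UUIDs, 122 of the 128 bits are random


def collision_probability(n):
    """Approximate probability of at least one collision among n random UUIDs."""
    space = 2 ** RANDOM_BITS
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-n * (n - 1) / (2 * space))


if __name__ == "__main__":
    for n in (10 ** 9, 10 ** 12, int(2.71e18)):
        print(f"{n:>22,d} UUIDs -> p = {collision_probability(n):.3e}")
```

Under these assumptions the probability only approaches one half at around 2.7 quintillion UUIDs, which is why a bug or a hardware glitch is the more plausible explanation for any collision seen in practice.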


Well there is Google Sheets, Microsoft Office, Figma, and some other heavier web apps.


You could use "local.tee". It is kind of "store" + "duplicate".


`local.tee` doesn't duplicate. It just doesn't remove the value from the stack. (So it is "just" `local.set` followed by `local.get`.)


Sure. But it does save you one instruction: "tee", "get" instead of "set", "get", "get".
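The equivalence can be checked with a toy stack machine. This is a minimal Python sketch that models only four instructions; `run` and the tuple encoding of the program are invented for illustration and are not how a real WebAssembly engine works.

```python
def run(program):
    """Execute a tiny subset of WebAssembly-like instructions.

    Returns the final (stack, local_vars).
    """
    stack = []
    local_vars = {}
    for op, arg in program:
        if op == "i32.const":
            stack.append(arg)
        elif op == "local.set":
            local_vars[arg] = stack.pop()   # store and remove from the stack
        elif op == "local.get":
            stack.append(local_vars[arg])   # push a copy of the local
        elif op == "local.tee":
            local_vars[arg] = stack[-1]     # store, but leave the value in place
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return stack, local_vars


with_tee = [("i32.const", 7), ("local.tee", 0), ("local.get", 0)]
with_set = [("i32.const", 7), ("local.set", 0), ("local.get", 0), ("local.get", 0)]

if __name__ == "__main__":
    print(run(with_tee))
    print(run(with_set))
```

Both programs end with the value stored in local 0 and two copies on the stack; the `local.tee` version gets there with one instruction fewer.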


Well that is how it mostly worked until recently... unless the developer copied and pasted from Stack Overflow without understanding much. Which did happen.


It depends on the use case, but do you consider NaN to be equal to NaN? For an assert macro, I would expect so. Also, your code works differently for very large and very small numbers, eg. 1.0000001, 1.0000002 vs 1e-100, 1.0000002e-100.

For my own soft floating-point math library, I expect the value to be off by some percentage, not just off by epsilon. And so I have my own almostSame method [1] which accounts for that and is quite a bit more complex. Actually multiple such methods. But well, that's just my own use case.

[1] https://github.com/thomasmueller/bau-lang/blob/main/src/test...
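To illustrate the difference between an absolute epsilon and a relative tolerance (this is not the almostSame method linked above, just a minimal Python sketch with made-up names), the following compares both on the numbers mentioned, and makes NaN == NaN an explicit choice:

```python
import math


def absolute_close(a, b, eps=1e-6):
    # Fixed epsilon: any two tiny numbers look "equal", and NaN never does.
    return abs(a - b) <= eps


def relative_close(a, b, rel_tol=1e-6, nan_equal=True):
    # Assumption for this sketch: an assert helper treats NaN as equal to NaN.
    if math.isnan(a) or math.isnan(b):
        return nan_equal and math.isnan(a) and math.isnan(b)
    # math.isclose scales the tolerance by the larger magnitude.
    return math.isclose(a, b, rel_tol=rel_tol)


if __name__ == "__main__":
    pairs = [
        (1.0000001, 1.0000002),
        (1e-100, 1.0000002e-100),
        (1e-100, 2e-100),
        (float("nan"), float("nan")),
    ]
    for a, b in pairs:
        print(a, b, absolute_close(a, b), relative_close(a, b))
```

With the absolute check, 1e-100 and 2e-100 compare as equal even though one is twice the other; the relative check rejects them while still accepting the two nearly identical pairs.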


For the few days without wind, natural gas is cheaper than nuclear. There is also biogas and hydro. Nuclear is not cheap to turn on and off. Also, the insurance cost of nuclear power is not accounted for: basically, there is no insurance, and the state (the population) just has to live with the risk.


Natural gas emits CO2. The risks of climate change caused by CO2 utterly dwarf those of any nuclear reactor. Nuclear power in the US has the lowest deaths per joule of all of them.


I think the main issue with nuclear reactors is the cost, but yes I agree.


There is a programming language that is reversible: Janus [1]. You could write a (lossless) data compression algorithm in this language, and if run in reverse this would uncompress. In theory you could do all types of computation, but the "output" (when run forward) would need to contain the old state. With reversible computing, there is no erased information. Landauer's principle links information theory with thermodynamics: putting order to things necessarily produces heat (ordering something locally requires "disorder" somewhere else). That is why Toffoli gates are so efficient: if the process is inherently reversible, less heat needs to be produced. Arguably, heat is not "just" disorder: it is a way to preserve the information in the system. The universe is just one gigantic reversible computation. And so, if we all live in a simulation, maybe the simulation is written in Janus?

[1] https://en.wikipedia.org/wiki/Janus_(time-reversible_computi...


That sounds interesting. I just checked out the examples in the Haskell Janus implementation: https://github.com/mbudde/jana


Could you really do general compression in this language? I was under the impression that the output is always the same size or larger than the input.


Yes you can do compression, if the text is compressible. The playground [1] has a "run length encoding" example.

Maybe you meant sorting. You can implement sorting algorithms, as long as you store the information which entries were swapped. (To "unsort" the entries when running in reverse). So, an array that is already sorted doesn't need much additional information; one that is unsorted will require a lot of "undo" space. I think this is the easiest example to see the relation between reversible computing and thermodynamics: in thermodynamics, to bring "order" to a system requires "unorder" (heat) somewhere else.
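The sorting idea can be sketched in ordinary Python (not Janus, so reversibility is only imitated here by keeping an explicit log). The names `sort_with_log` and `unsort` are made up. It is a bubble sort that records every swap, and replaying the log backwards restores the original order.

```python
def sort_with_log(items):
    """Bubble sort that returns (sorted list, list of swap positions)."""
    data = list(items)
    swaps = []
    for end in range(len(data) - 1, 0, -1):
        for i in range(end):
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
                swaps.append(i)  # the "undo" information
    return data, swaps


def unsort(sorted_items, swaps):
    """Replay the swap log in reverse to recover the original order."""
    data = list(sorted_items)
    for i in reversed(swaps):
        data[i], data[i + 1] = data[i + 1], data[i]
    return data


if __name__ == "__main__":
    ordered, log = sort_with_log([3, 1, 2])
    print(ordered, log, unsort(ordered, log))
    print(sort_with_log([1, 2, 3]))
```

An already sorted input produces an empty log, while a badly shuffled one produces a long log, which matches the "undo space" intuition.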

There are also examples for encryption / decryption, but I find compression and sorting more interesting.

[1] https://topps.diku.dk/pirc/?id=janusP


He meant lossy compression


You could package all your data into a zip using this language but you would also have a worthless stretch of memory seemingly filled with noise / things you’re not interested in.


Why do you think so? The code example shows that you can do RLE (run length encoding) without noise / additional space. I'm pretty sure you can do zip as well. It would just be very hard to implement, but it wouldn't necessarily require that the output contains noise.

[1] https://topps.diku.dk/pirc/?id=janusP


Hmm. As a physicist my intuition is that information-preserving transformations are unitary (unitary transformations are 1-to-1). If a compression algorithm is going to yield a bit string (the zip file, for example) shorter than the original it can't be 1-to-1. So it must yield the zip file and some other stuff to make up for the space saved by the compression.
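One way to reconcile the two views, as far as I can tell: a lossless compressor is 1-to-1, but nothing says every output is shorter. A counting argument shows some inputs must grow. The Python sketch below (hypothetical helper names, with run-length encoding as the example) shows both halves.

```python
from itertools import groupby


def rle_encode(text):
    """Run-length encode a string into (character, run length) pairs."""
    return [(ch, len(list(run))) for ch, run in groupby(text)]


def rle_decode(pairs):
    """Inverse of rle_encode: expand each pair back into a run."""
    return "".join(ch * count for ch, count in pairs)


def shorter_strings(n):
    """Number of bit strings strictly shorter than n bits (lengths 0..n-1)."""
    return sum(2 ** k for k in range(n))


if __name__ == "__main__":
    for text in ("aaaabbb", "abab"):
        pairs = rle_encode(text)
        print(text, pairs, rle_decode(pairs) == text)
    print(shorter_strings(8), "shorter strings vs", 2 ** 8, "inputs of 8 bits")
```

There are 2^n inputs of n bits but only 2^n - 1 strictly shorter strings, so an invertible encoder cannot shrink all of them. It shrinks inputs with long runs and makes others, like "abab", larger, without needing any leftover noise.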


> The universe is just one gigantic reversible computation.

Assuming that the Many-Worlds interpretation is true.


Actually this is true whichever interpretation you take, give or take some knowledge around black holes. I think Hawking actually proved that this is true regardless of how black holes work, due to Hawking radiation.


> Actually this is true whichever interpretation you take

In the Copenhagen interpretation the collapse of the wave function explicitly violates unitarity (and thus reversibility).


(This is way beyond my area of expertise so excuse me that this might be a stupid idea.)

I assume the following happens: while a (small) subsystem is in a "pure state" (in quantum coherence), no information flows out of this subsystem. Then, when measuring, information flows out and other information flows in, which disturbs the pure state. This collapses the wave function (quantum decoherence). For all practical purposes, it looks like quantum decoherence is irreversible, but technically this could still be reversible; it's just that the subsystem (that is in coherence) got much, much larger. Sure, for all practical purposes it's then irreversible, but for us most of physics looks irreversible anyway (e.g. black holes).


The problem is that the larger subsystem includes an observer in a superposition of states of observing different measured values. And we never observe this. Copenhagen interpretation doesn't deal with this at all. It just states this empirical fact.


So if I understand correctly, you are saying the observer doesn't feel like he is in a superposition (multiple states at once). Sure: I agree that observers never experience being in a superposition.

But I don't think that necessarily means we are in a Many-Worlds universe. I rather think that we don't have enough knowledge in this area. Assuming we live in a simulation, an alternative explanation would be that unlikely branches are not simulated further, to save energy. And in this case, superposition is just branch prediction :-)


Yes, I think that's a stance many physicists take these days. Unfortunately, it's not verifiable. And we also don't have any clue how gravity (which does become relevant at our scales) would fit into this picture.


You just need unitarity.


Unitarity means that information (about quantum states) is not lost, despite it appearing otherwise after a measurement. The Many-Worlds interpretation seems to be the simplest way to explain where this information has gone.

