Hacker News | new | past | comments | ask | show | jobs | submit | ralphb's comments

> It's also the constant stream of digital nomad influencers on Twitter who sell extremely distorted, rosy, and often times false dreams to indie entrepreneurs like myself.

I believe that all sales are about selling into people's dreams. I am sure that the false dreams you refer to make for an extremely good business with a willing market, because who doesn't want to buy the dream of being a successful entrepreneur?

Ironically, assuming these people are creating products primarily based on their own experiences, there might not be anything false about it - it's just that you're trying to sell into less desirable dreams!

Fundamentally I believe that for me it is impossible to passionately pursue stuff I am not really interested in. This limits the desirability of the dreams I can sell into, and I have come to terms with that. You might even think of this as bad luck.

I've also come to terms with the fact that I have to create for myself first, and any interest I get from other people is purely a bonus. Obviously that doesn't pay the bills :-)


> I believe that all sales are about selling into people's dreams.

Nah, there's selling the dream of quitting the 9-to-5: financial independence, control over one's own time. That's what this is all about. It's on a totally different level than selling the dream of, I don't know, running a marathon (by selling you a running shoe) or stopping the flow of your period with a tampon. Those are actually useful products.

IMHO, selling the dream of quitting the rat race is financially extremely lucrative – and a somewhat cheap shot.


I have a feeling that most influencers live off the story, not off the actual product they built. They sell coaching on how to live the great life. This works well, but in the end it's nothing more than a Ponzi scheme or MLM scam.


I'm confused and very far from an expert here. What is wrong with parsers, and what is the alternative?


A specific class of parsers

> parsers for untrusted input in C


Pretty much all input is untrusted unless it originated (exclusively!) from something with more permissions that is trustworthy.

The kernel is written in C.

So that pretty much means all parsers written in C and every other language should consider all input untrustworthy, no?


Linux is probably the most carefully constructed C codebase in existence and still falls into C pitfalls semi-regularly. Every other project has no hope of safely using C. It's looking more and more like Linux should be carefully rewritten in Rust. It's a monstrous task, but I can see it happening over the next decade.


I agree with the spirit of your comment.

> Linux is probably the most carefully constructed C codebase in existence and still falls into C pitfalls semi-regularly.

My guess is that it would actually be OpenBSD, but I'm not sure either way.


That didn't answer anything. If you want to do anything with your input, you have to run it through a parser. Doesn't matter if it's untrusted or not. Your only options are ignoring the input, echoing it somewhere, or parsing it.


They're saying don't write such a parser in C. Use something else (memory safe language, parser generator, whatever).


And then do what with it? Throw it away?

If it hands it to a C program, that C program needs to parse (in some form!) those values!

How is a C program expected to ever do anything if it can’t safely handle input?


Untrusted input -> memory-safe parser -> trusted input -> C program.

Probably not that important for `ls`, probably worth it for OpenSSL.


The challenge, of course, is the link to the ‘memory-safe parser’: how the untrusted input gets handed to it when that handoff is itself mediated by C, correct?


Right, but you do have the option of writing that parser in a language other than C. And given how often severe security issues are caused by such parsers written in C, one probably ought to choose a different language, or at least use C functions and string types that store a length rather than relying on null termination.


>>>>> Why are people still using parsers for untrusted input in C?

No matter what the parser itself is written in, if you're writing in C you'll be using the parser in C.


If you have the input in a buffer of known length in C, hand it off to a (dynamic or static) library written in a safe language, and get back trusted parsed output, then there's much less attack surface in your C code.


The issue in many of these cases is there appears to be no canonical safe way to know the length of the input in C, and people apparently screw up keeping track of the lengths of the buffers all the time.


This is why you reduce the amount of C code that has to keep track of it to as little as possible.


1. Well don’t write in C then if your program is security critical or going to be exposed over a network. Sure, there are some targets that require C, but that’s not the case for the vast majority of platforms running OpenSSL.

2. That’s still less of a problem, as the C will then be handling trusted data validated by the safe language.


If you make argument 2), could you explain how writing a parser is more security critical than any other code that has a (direct or indirect) interaction with the network? At least recursive descent parsers are close to trivial. I usually start by writing a "next_byte" function and then "next_token". You'll have to look very hard to find any pointer code there. It's close to impossible to get this wrong, and I don't see how the fact that it's a parser would make it any more dangerous.


Well, if you're dealing with a struct, then the compiler will provide type safety if, say, you try to access a field that doesn't exist. You don't get the same safeguards when dealing with raw bytes. Admittedly, in C you can also run into these hazards with arrays and strings, which is why I suggest using non-standard array and string types which actually store the length if you insist on using C.


When a C program is factored well there needn't be all that much access by pointer + index. I'm not saying it can't be frequent in certain kinds of code, but for many things it's easy to just put a simple abstraction (API consisting of a few functions) that you have to get right once, then can reuse dozens of times.

Plain pointer access in high-level code (say when parsing a particular syntactic element by hand in a recursive descent parser) is a violation of the principle of separation of concerns IMO.

In any case, I still don't see what's special about parsers. I suspect most vulnerabilities are in the higher levels, like validating parsed numbers and references, to take a trivial example. In general, those are checks that are likely to be implemented much closer to the core of the application.


> I suspect most vulnerabilities are in the higher levels, like validating parsed numbers and references, to take a trivial example. In general, those are checks that are likely to be implemented much closer to the core of the application.

What I see (especially in libraries like OpenSSL) is that the core logic often receives a lot of scrutiny and testing, and thus it is silly mistakes with offsets and bounds checks that make up the majority of bugs.

It’s also worth considering the severity of different kinds of bug. A bug in high level logic might allow an attacker to do something they shouldn’t be able to do, but it doesn’t give them code execution.

The worst bit is, an attacker can often gain code execution through a part of the code that otherwise wouldn’t be security critical (where a logic mistake would be low impact). So writing code in a language that allows for these vulnerabilities greatly increases your attack surface.


> It's close to impossible to get this wrong and I don't see how the fact that it's a parser would make it any more dangerous.

I can answer that one. The parser is more dangerous because a parser, essentially by definition, takes untrusted input.

Nothing the parser does is any more dangerous than the rest of the code; it's all about the parser's position in the data flow.


I agree that the original statement encourages that interpretation, but I think it admits the interpretation that the parser itself is in C and I think that is what was intended.


Even if application constraints mean you can't write a parser in another language that's linkable to C, why couldn't you use a parser generator that outputs C?


I am almost unhappy to learn that this was a joke. It would have been nice to put a final (personal) nail in the C++ coffin with this insanity. However, I guess it says enough that I had to dig quite far into the paper to figure out whether Bjarne was joking or not.


I've found that a super simple way to parse basic expressions is a recursive descent parser. It is very simple to implement: no need to break the work into a tokenizer and parser, no need to generate an AST, just evaluate the expression while parsing.


Do you have an example of a simple parser like that?


https://gist.github.com/revivalizer/935dfcc345b009a0207a033c...

Even includes error handling :)

At the bottom you can see some test examples of what it can do. Obviously it is a basic calculator.


This is well-known, and almost trivial. But because I am curious I looked this up in Numerical Recipes (3rd edition). Section 7.1.5 says

> The steps above that convert a 64-bit integer to a double precision floating-point value involves both a non-trivial type conversion and a 64-bit floating multiply. They are performance bottlenecks. One can instead directly move the random bits into the right place in the double word with a union structure, a mask, and some 64-bit logical operations; but in our experience this is not significantly faster

It goes on to describe a lagged Fibonacci generator which generates values directly as floating point.


That's interesting. My kneejerk reaction to your comment was "no, you don't have a strategy for landing a man on the moon, you have a mission architecture!". But on further consideration there really seems to be a lot of overlap between system architectures and strategies (in the Rumelt sense). Diagnosis, kernel, guiding policies. Seems to fit both domains.


We can talk about the advantages of monorepos, but your question is phrased in a way that makes me think you don't see any "trouble" in multiple repositories.

I would encourage you to do some research and keep an open mind.


I see "trouble" in all approaches: the solutions companies use to manage their git monorepos amount to switching to multi-repo emulation.


Coffee dampens appetite. (Drink it black.)



The language called "Rust" prior to 2013 is a completely different language from what people today know as "Rust". That language had a garbage collector, mutable aliasing, and no borrow checker (the inverse of the three features that most define today's Rust), and was basically "golang with different syntax":

http://smallcultfollowing.com/babysteps/blog/2012/11/18/imag...

The whole language got rebooted shortly after the blog post above, mostly because the borrow checker made so many other things suddenly unnecessary or trivial. What we call Rust today is at most 9 years old, and any similarities to pre-2013 Rust are strictly superficial syntax. They share a name and some syntax, sort of like Java and JavaScript do.

Zig today at T+7 is not where Rust was in 2020 at T+7.


What matters is time from initial inception.


That's funny. I am old, and I think 3 seconds is unacceptable.

I use zig anyway.

