Hacker News .hnnew | past | comments | ask | show | jobs | submit | vintermann's commentslogin

Attacks? That's a choice of words.

Definitely Anthropic playing the victim after distilling the whole internet.

Well, at least we know that's one gotcha/benchmark they aren't gaming.

The dataset excites me more than the fairly vague conclusion that some SNPs possibly linked to traits were selected for (or hitched along to genes which were selected for). Genetic archaeology is just so much more exciting than this.

But I bet there will be a ton more of that too, thanks to the high quality dataset.


> the fairly vague conclusion that some SNPs possibly linked to traits were selected for

Interesting. I find that part of the paper the most exciting. We always knew selection would happen for valuable traits. But seeing demonstrations of it in the timelines we have is pretty important.


Makes you wonder what is being selected for currently.

Levenshtein distance is rarely the similarity measure you need. Words usually mean something, and it's usually the distance in meaning you need.

As usual, examples from my genealogy hobby: many sites allow you to upload your family tree as a gedcom file and compare it to other people's trees or a public tree. Most of these use Levenshtein distance on names to judge similarity, and it's terrible. Anne Nilsen and Anne Olsen could be the same person, right? No!! These tools are unfortunately useless to me because they give so many false positives.

These days, an embedding model is the way to go. Even a small, bad embedding model is better than Levenshtein distance if you care about the meaning of the string.


It depends on if or not you're trying to correct for typos, or do something semantic. Also, embedding distance is much much more expensive.

Yes, this article is kicking in open doors, the original article was quite clear about the scope.

The present article could rather have spent time arguing why this isn't like NAND gate functional completeness.

I would have thought the differences lie in the other direction: not that trees of EML and 1 can describe too little, but that they can describe too much already. It's decidable whether two NAND circuits implement the same function, I'm pretty sure it's not decidable if two EML trees describe the same function.


You are correct, it is undecidable by Richardson's theorem [1].

[1] https://en.wikipedia.org/wiki/Richardson%27s_theorem


that result does not apply for EML: EML doesn't have the | . | absolute value function, a prerequisite for Richardson's theorem.

Yes it does; you can build the absolute value as sqrt(x²), and sqrt(x) and x² are both constructible using eml.

If I understand the page correctly, the extension by Miklós Laczkovich should be enough to show that it's undecidable.

You wrote:

> It's decidable whether two NAND circuits implement the same function, I'm pretty sure it's not decidable if two EML trees describe the same function.

Perhaps, perhaps not, same function so basically is this question solvable:

A(x[,y,...]) = f(x[,y,...])-g(x[,y,...]) == 0 everywhere?

if a user brings EML functions f and g; given their binary EML trees; can we decide if they represent the same function, so the question form is

A(x)=0 EVERYWHERE?

(like given 2 fractions a/b == c/d ? do the fractions represent the same fraction?)

From Wikipedia link reikonomusha gave:

> Miklós Laczkovich removed also the need for π and reduced the use of composition.[5] In particular, given an expression A(x) in the ring generated by the integers, x, sin xn, and sin(x sin xn) (for n ranging over positive integers), both the question of whether A(x) > 0 for some x and whether A(x) = 0 for some x are unsolvable.

Here the question forms are

1) exist x such that A(x) > 0 (does there exist an x where A(x) becomes positive?)

2) exist x such that A(x) = 0 (does there exist a value such that A(x) becomes 0? or basically find real roots

so at least the forms on WikiPedia don't generate the results both of you claim it does.

it does present undecidability results, but not straightforwardly in the context of this EML work.

second the Richardson's theorem is about the function on the reals, not complex functions (I mean the roots must lay somewhere)


> an expression in the ring generated by the integers, x, sin xn, and sin(x sin xn)

We can always write AML trees for expressions generated by the integers, x, sin xn, and sin(x sin xn), right?

So we should be able to write EML trees for any two such expressions, A and B. If they're equal everywhere, then A - B = 0 everywhere. A - B is also in the aforementioned ring.

If there was a decision procedure always to determine if EML trees represent the same function, then that contradicts Miklós Laczkovich's extension, right?


no Miklós Laczkovich's extension as described on wikipedia only says that both of the following questions are proven undecidable:

1) is there some value x such that some function F(x)=A(x)-B(x)=0?

2) is there some value x such that F(x)>0?

while you asked:

> I'm pretty sure it's not decidable if two EML trees describe the same function.

that would be

3) is for every x F(x)=A(x)-B(x)==0?

which Miklós Laczkovich's extension does not provide.

And you ignore the fact that Miklós Laczkovich's extension applies to real numbers and functions...


If it's undecidable whether it's 0 at even ONE point, clearly you can't prove that it's 0 everywhere.

Likewise, if it's not decidable for real-valued functions, clearly it's not decidable for complex valued functions.


thats not how this works

decidability does not distribute over pointwise question asking on sets, or if you believe it does, show us the proof.

Telling if an EML(x,y),1 constructed expression is identically 0 is in the gray zone, as far as I can tell, it has neither been proven decidable nor been proven undecidable.

Nevertheless regardless of decidability the authors clearly show the multipoint sampling/testing is a decent filter, and the shorter resulting expressions have been proven correct in the results for the construction at least.


> It's decidable whether two NAND circuits implement the same function

Well, sure. At least, until you have a loop that starts clocking for you, and now you've got the halting problem.


In rpn notation you just put the input on the stack, right? The encodings seems like they could get pretty big, and encodings certainly wouldn't be unique, but you should be able to encode pretty much any constant you could think of.

I'm way too unschooled to say if it's important or not, but what really excites me is the Catalan structure ("Every EML expression is a binary tree [...] isomorphic to well-studied combinatorial objects like full binary trees and Catalan objects").

So, what happens if you take say the EML expression for addition, and invert the binary tree?


If recorded music didn't kill music, then AI probably won't either.

But recorded music was a crisis. And it did tempt a lot of people into supporting fabulously abusable, rich-enriching "intellectual property" law as a means of financing art.

Rich people are lobbying to capitalize on this crisis as well.


Because I prefer not to think about the hair I'm removing from my shower drain?

The generous way of seeing it is that you don't know what the customer wants, and the customer doesn't know all that well what they want either, and certainly not how to express it to you. So you try something, and improve it from there.

But for aerospace, the customer probably knows pretty well what they want.


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: