Is that so? Our color perception is weird. It's one dimension split into three overlapping sectors. Adding a fourth sector may add information that makes it easier to distinguish colors.
We do have four sectors: three for color perception, plus the brightness perception that is used in the dark. In dim light you get a mix of all of those, although the fourth is not really perceived as a color, so it can be a bit hard to use.
Brightness is another dimension, not a "sector" (as I dubbed it) on the color spectrum. But it would be equal for all subjects in a test, so it can't add information.
Something I think about often is an Oliver Sacks book about an ethnic group that has a particularly high rate of true monochromacy. The people with no color perception at all are particularly adept at spotting certain plants based on some characteristic of their leaves that is obscured by color. So even removing information can change perception in surprising ways.
OTOH Sacks seems to have fabricated a lot of shit over the years, so who knows if this is even real. Another thing I think about a lot now.
And the eye's cones are not sharp filters; their ranges overlap, with low-to-mid sensitivity in the overlapping regions. That must be enough for someone with tetrachromacy to perceive something different on an RGB screen.
> More precisely, she had an additional cone type L′, intermediate between M and L in its responsivity, and showed 3 dimensional (M, L′, and L components) color discrimination for wavelengths 546–670 nm (to which the fourth type, S, is insensitive).
Source: Wikipedia
Deep neural networks can generalize well even when they're far into the overparametrized regime where classical statistical learning theory predicts overfitting. This is usually called "double descent" and there are many papers on it.
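If you want to poke at it yourself, here's a minimal sketch of the effect with random-feature regression (the setup, constants, and names are illustrative, not taken from any particular paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_test = 40, 500
x_train = rng.uniform(-1, 1, (n_train, 1))
x_test = rng.uniform(-1, 1, (n_test, 1))

def target(x):
    return np.sin(2 * np.pi * x)

y_train = target(x_train) + 0.3 * rng.normal(size=(n_train, 1))
y_test = target(x_test)

def relu_features(x, n_features, seed=1):
    """Random ReLU features: fixed random first layer, only the readout is fit."""
    r = np.random.default_rng(seed)
    w = r.normal(size=(1, n_features))
    b = r.uniform(-1, 1, (1, n_features))
    return np.maximum(x @ w + b, 0.0)

for n_features in [5, 10, 20, 40, 80, 160, 640]:
    phi_tr = relu_features(x_train, n_features)
    phi_te = relu_features(x_test, n_features)
    # lstsq returns the minimum-norm solution once n_features > n_train,
    # which is the regime where the second descent shows up.
    coef, *_ = np.linalg.lstsq(phi_tr, y_train, rcond=None)
    mse = float(np.mean((phi_te @ coef - y_test) ** 2))
    print(f"{n_features:4d} features -> test MSE {mse:.3f}")
```

Test error typically spikes around n_features ≈ n_train (the interpolation threshold) and then comes back down as the model gets even more overparametrized.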
If you have read the blog post, it's the difference between the chemical symbols Ge and Gr, which as I understand it is what you would refer to as a "semantic error".
How would the reader know the writer intended Ge instead of Ga? More importantly: why should the burden of figuring that out fall on the reader instead of the writer? Especially when considering that every publication normally has a lot more readers than writers.
In this case, the chemistry of Ga and Ge is a bit different, and the Cr compound that was misstated is part of a family of materials that relies heavily on the coordination chemistry of Ge and its neighbors in the same period. So Ge makes more sense. If it were indeed Ga, that would be an interesting compound that probably wouldn't look anything like the material families being discussed by these authors.
I think the reader and the writer share the burden of accurate communication. The reader should ideally come prepared and the writer should provide as best they can. A prepared reader makes quick work of this typo.
Thanks for replying, I understand your original reasoning now in a way that I didn't when I last responded. I was only considering how it would appear to people who don't recognize that Gr isn't an element; I agree that it's a syntactic mistake to those who know chemical symbols well.
Version numbers for LLMs don't mean anything consistent. They don't even publicly announce at this point which models are built from new base models and which aren't. I'm pretty sure Claude 3.5 was a new set of base models since Claude 3.
What do you mean by "it's a 1.0" and "3rd iteration"? I'm having trouble parsing those in context.
If Claude 3.5 was a base model*, 3.7 is a third iteration** of that model.
GPT-4.5 is a 1.0, or, the first iteration of that model.
* My thought process when writing: "When evaluating this, I should assume the least charitable position for GPT-4.5 having headroom. I should assume Claude 3.5 was a completely new model scale, and it was the same scale as GPT-4.5." (this is rather unlikely, can explain why I think that if you're interested)
** 3.5 is an iteration, 3.6 is an iteration, 3.7 is an iteration.
Not really. Dirac's trick works entirely at a depth of two logs, using sqrt like unary to increment the number. It requires O(n) symbols to represent the number n, i.e. O(2^n) symbols to represent n bits of precision. This thing has arbitrary nesting depth of logs (or exps), and can represent a number to n bits of precision in O(n) symbols.
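Dirac's version is easy to sanity-check numerically; here's a minimal sketch (the function name is mine):

```python
import math

def dirac(n):
    """Dirac's trick: n nested square roots of 2, wrapped in two logs, denote n."""
    x = 2.0
    for _ in range(n):
        x = math.sqrt(x)              # sqrt applied n times: 2 ** (2 ** -n)
    return -math.log2(math.log2(x))   # -log2(2 ** -n) == n

print(dirac(7))  # ~7.0, up to floating-point rounding
```

The n square-root signs are the unary part: O(n) symbols to denote the number n, exactly as described above.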
Sure, but that was only because of an arbitrary restriction on the problem Dirac solved.
Still seems a fairly simple variation once you remove the arbitrary restriction. To the point:
I don't believe for a second that anyone familiar with his solution, asked to make it more bit efficient, would not have come up with this. Nor do I believe they would call it anything other than a variation.
That doesn't make it less cool but I don't think it's like amazingly novel.
I think it's better phrased as "find the best rule", with a tacit understanding that people mostly agree on what makes a rule decent vs. terrible (maybe not on what makes one great) and a tacit promise that the sequence presented has at least one decent rule and does not have multiple.
A rule being "good" is largely about simplicity, which is also essentially the trick that deep learning uses to escape no-free-lunch theorems.
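As a toy version of that simplicity preference: pick the lowest-degree polynomial that reproduces the sequence (degree as the simplicity measure is my illustrative choice, not anything from the thread):

```python
import numpy as np

def simplest_poly_rule(seq, max_degree=5):
    """Return the lowest polynomial degree (and coefficients) that fits seq exactly."""
    x = np.arange(len(seq))
    for degree in range(max_degree + 1):   # lower degree == "simpler" rule
        coeffs = np.polyfit(x, seq, degree)
        if np.allclose(np.polyval(coeffs, x), seq, atol=1e-8):
            return degree, coeffs
    return None

degree, coeffs = simplest_poly_rule([1, 4, 9, 16, 25])
print(degree, round(float(np.polyval(coeffs, 5))))  # degree 2, next term 36
```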
There are two kinds of naturalness principle in physics, sometimes called "technical naturalness" and "Dirac naturalness" respectively.
Dirac naturalness is as you describe: skepticism towards extremely large numbers, end of story. It has the flaw you (and every other person who's ever heard it) point out.
Technical (or 't Hooft) naturalness is different, and specific to quantum field theory.
To cut a long story short, the "effective", observable parameters of the Standard Model, such as the mass of the electron, are really the sums of enormous numbers of contributions from different processes happening in quantum superposition. (Keywords: Feynman diagram, renormalization, effective field theory.) The underlying, "bare" parameters each end up affecting many different observables. You can think of this as a big machine with N knobs and N dials, but where each dial is sensitive to each knob in a complicated way.
Technical naturalness states: the sum of the contributions to e.g. the Higgs boson mass should not be many orders of magnitude smaller than each individual contribution, without good reason.
The Higgs mass that we observe is not technically natural. As far as we can tell, thousands of different effects due to unrelated processes are all cancelling out to dozens of decimal places, for no reason anyone can discern. There's a dial at 0.000000.., and turning any knob by a tiny bit would put it at 3 or -2 or something.
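For a sense of scale, here is the usual back-of-the-envelope version, assuming the standard one-loop top-quark contribution and a Planck-scale cutoff (both illustrative choices, not measurements):

```latex
% Schematic one-loop top-quark correction to the Higgs mass-squared,
% with a hard cutoff \Lambda (illustrative, not a full calculation):
m_H^2\big|_{\mathrm{obs}} \;\approx\; m_{H,\mathrm{bare}}^2
  \;-\; \frac{3\,y_t^2}{8\pi^2}\,\Lambda^2 \;+\; \dots ,
\qquad \Lambda \sim M_{\mathrm{Pl}} \approx 10^{19}\ \mathrm{GeV}
```

Each term on the right is then of order 10^36 GeV^2, while the observed left-hand side is (125 GeV)^2 ≈ 1.6×10^4 GeV^2, i.e. a cancellation to roughly one part in 10^32.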
There are still critiques to be made here. Maybe the "underlying" parameters aren't really the only fundamental ones, and somehow the effective ones are also fundamental? Maybe there's some reason things cancel out, which we just haven't done the right math to discover? Maybe there's new physics beyond the SM (as we know there eventually has to be)?
But overall it's a situation that, imo, demands an explanation beyond "eh, sometimes numbers are big". If you want to say that physical calculations "explain" anything -- if, for example, you think electromagnetism and thermodynamics can "explain" the approximate light output of a 100W bulb -- then you should care about this.
She's saying that a different model -- one of the three disagreeing methods for distance ladder measurements -- must be wrong, because they disagree with each other. But if one or more of those models are wrong, then there's not much evidence that the LambdaCDM model is wrong.
Conversely, the hypothesis that LambdaCDM is wrong does nothing to explain why the distance ladder methods disagree.
She clearly isn't saying that any model is infallible, she's just saying that clear flaws with one set of models throw into question some specific accusations that a different model is wrong.
You actually need to pay attention to the details; the physicists certainly are. Glib contrarianism isn't very useful here.