To minimise the KL divergence you just minimise the average surprisal (the negative log-likelihood) of your data under the fitted model. The integral can be approximated by sampling over your training data. It's a direct expression of the information loss between your real data and your fitted probability distribution.
Calculating the JSD could be more difficult, since the expression uses a mixture of the 'true' and 'fitted' distributions. You can still simulate this, but half the time you'd be fitting the model to itself, and I just don't see why that would be useful.
I think the JSD is most useful when you need an actual metric, but as long as you have a fitted and target distribution the KL divergence is a natural fit since you can interpret the result as information loss.
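To make the sampling point concrete, here is a minimal sketch. The Gaussians, parameter values and helper names are all made up for illustration. It draws samples from a 'true' distribution and shows that the gap in average surprisal between two fitted models matches the sampled KL estimate, so minimising one minimises the other.

```python
import math
import random

def gaussian_logpdf(x, mu, sigma):
    """Log density of N(mu, sigma^2) at x."""
    return -0.5 * math.log(2 * math.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2)

def mean_surprisal(samples, mu, sigma):
    """Average negative log-likelihood of the samples under N(mu, sigma^2)."""
    return -sum(gaussian_logpdf(x, mu, sigma) for x in samples) / len(samples)

def kl_estimate(samples, true_params, fit_params):
    """Monte Carlo estimate of KL(true || fit) from samples of the true distribution."""
    total = sum(
        gaussian_logpdf(x, *true_params) - gaussian_logpdf(x, *fit_params)
        for x in samples
    )
    return total / len(samples)

rng = random.Random(0)           # fixed seed so the run is repeatable
true_params = (1.0, 2.0)         # assumed 'real data' distribution: mean 1, std 2
samples = [rng.gauss(*true_params) for _ in range(20000)]

good_fit = (1.0, 2.0)            # hypothetical fitted model that matches the data
bad_fit = (0.0, 1.0)             # hypothetical fitted model that does not

nll_good = mean_surprisal(samples, *good_fit)
nll_bad = mean_surprisal(samples, *bad_fit)
kl_bad = kl_estimate(samples, true_params, bad_fit)

# The extra surprisal paid by the bad fit is the sampled KL divergence.
print(nll_bad - nll_good, kl_bad)
```

The entropy of the true distribution is a constant that drops out of the comparison, which is why you only need the surprisal term when fitting.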
That's just a baseless assumption. To use AI well you should do the things that allow you to use stuff well. You shouldn't just use it any way you can because you assume that 'not using it at all' is not the best option.
This is literally the same with every single technological development.
Ironically, companies overusing it will probably die at a similar speed. Maybe faster, even, depending on whether cash burn or technical debt catches up to them first.
That's like saying farmers that don't use pesticides will die out. There are whole industries around doing things not the way big companies say you have to. Human-centric firms will pop up and prosper.
I'm also pretty sure a 14-point font is a bit outdated at this point; 16 should probably be the minimum with current screens. It's not as if screens aren't wide enough to fit bigger text.
Haha I keep forgetting that. Fortunately the browser remembers my zoom settings per page. I'm pretty sure the font is now at 16 or something via repeated Cmd +.
10 point at 96 dpi or with correctly applied scaling is very readable. But some toolkits like GTK have huge paddings for their widgets, so the text will be readable, but you’ll lose density.
Oh that's annoying, seems to me there wouldn't have been an issue if you just merged B into A after merging A into main, or the other way around but that already works fine as you pointed out.
I mean if you've got a feature set to merge into dev, and it suddenly merges into main after someone merged dev into main then that's very annoying.
Huh interesting, my mental model is unable to see any difference between them.
I mean a branch is just jamming a flag into a commit with a polite note to move the flag along if you're working on it. You make a long trail, leave several flags and merge the whole thing back.
Of course leaving multiple waypoints only makes sense if merging the earlier parts makes any sense, and if the way you continue actually depends on the previous work.
If you can split it into several small changes made to a central branch it's a lot easier to merge things. Otherwise you risk making a new feature codependent on another even if there was no need to.
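The 'flag on a commit' picture can be sketched as a toy model. This is not how git is implemented. The `Repo` class and its methods are hypothetical, and unlike git's default it always creates a merge commit instead of fast-forwarding. The only point is that a branch is a movable name for one commit.

```python
class Repo:
    """Toy model: commits know their parents, branches are just named pointers."""

    def __init__(self):
        self.parents = {"c0": []}        # commit id -> list of parent commit ids
        self.branches = {"main": "c0"}   # branch name -> commit id (the 'flag')
        self._count = 1

    def _new_commit(self, parents):
        cid = f"c{self._count}"
        self._count += 1
        self.parents[cid] = list(parents)
        return cid

    def branch(self, name, start):
        # Plant a new flag on the commit another branch currently points at.
        self.branches[name] = self.branches[start]

    def commit(self, branch):
        # Add a commit on top of the flag, then move the flag along.
        self.branches[branch] = self._new_commit([self.branches[branch]])

    def merge(self, source, target):
        # Always a merge commit with two parents (real git may fast-forward).
        self.branches[target] = self._new_commit(
            [self.branches[target], self.branches[source]]
        )

repo = Repo()
repo.branch("A", "main")
repo.commit("A")           # work on A
repo.branch("B", "A")      # B continues from A's waypoint
repo.commit("B")           # work on B, which depends on A
repo.merge("A", "main")    # merging A leaves B's flag where it was
print(repo.branches)
```

After the merge, `B` still points at its own commit on top of `A`, which is the sense in which a later waypoint only makes sense if it really depends on the earlier work.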
I'd probably go with something like the wave function collapse algorithm. It should be possible to make it generate trees with somewhat uniform probability.
Interesting idea, but the problem is that being connected and being non-cyclic (properties you want for a perfect maze where you can reach every location and where there is exactly one route between every two locations) are global conditions that are difficult to enforce with wave function collapse algorithms, which are local.
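For what it's worth, checking those two global conditions after the fact is easy even if enforcing them locally is not. A rough sketch, assuming cells are numbered 0..n-1 and the maze is given as a list of open passages between cells (the function name and this representation are made up for illustration):

```python
def is_perfect_maze(num_cells, passages):
    """True if the passages connect all cells with exactly one route between any two.

    Equivalent to asking whether the passages form a spanning tree:
    no cycles, and exactly num_cells - 1 passages.
    """
    parent = list(range(num_cells))  # union-find forest, one set per cell

    def find(x):
        # Follow parent links to the set representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in passages:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            return False             # a and b were already linked: this closes a cycle
        parent[root_a] = root_b      # merge the two components

    # An acyclic graph with n - 1 edges on n nodes is necessarily connected.
    return len(passages) == num_cells - 1

# 2x2 grid, cells numbered 0 1 / 2 3
print(is_perfect_maze(4, [(0, 1), (1, 3), (3, 2)]))           # one winding corridor
print(is_perfect_maze(4, [(0, 1), (1, 3), (3, 2), (2, 0)]))   # adds a loop
print(is_perfect_maze(4, [(0, 1), (2, 3)]))                   # two separate halves
```

Note that the check needs the whole passage list at once, which is exactly the global information a purely local constraint solver doesn't have.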
I think being connected is easy enough, being non-cyclic is trickier I suppose. If you do it badly the shape of the maze is going to depend on the order it's generated in. I imagine some people may have looked into it.
> being connected and being non-cyclic (properties you want for a perfect maze where you can reach every location and where there is exactly one route between every two locations)
Connected, sure, that's table stakes. But why is being non-cyclic a desirable property? (Other than it being the definition of "perfect maze", a term I've come to despise)
This. It's so far out there that I have to wonder if it's a rogue employee who thought this a good excuse to cause reputational damage without it being too obvious. Doesn't pass several razors though (not the simplest explanation; malice involved... is that Hanlon's and Occam's razor?), so I don't truly believe it... but it would be possible.
Since it's AI and Microsoft I can believe that someone who doesn't know what they're doing would be given a mandate to promote AI by any means necessary, at the cost of some other team's reputation.
But it's an insane move. If anything, AI has made it more important than ever to know who authored something, and then someone does this to promote AI.
Occam's razor is about the simplest solution often being the correct one.
Hanlon's razor is about not assuming malice, which makes no sense when applied to faceless mega-corporations or even random strangers where you know conflicting motives exist.
Thanks for confirming I remembered the razor names correctly!
I still don't assume malice, at least as a default / until strongly indicated otherwise, from any individual employee. Emergent behavior of complex artificial incentive systems is, of course, a whole other matter, so I can see what you mean: the razor won't apply there without breaking it down to an individual, as in the scenario I mentioned about an ill-meaning employee.
It's the one question that AIs seem unable to answer correctly.