They call the Dijkstra implementation slow, but that's because they aren't using the full information it produces. Dijkstra gives shortest paths from one node to every other node in the graph, so you run it once, materialize the result, and now you have a full Kevin Bacon database.
Eh? You have the shortest paths from one node to every other node, not the shortest path between every pair (which I assume is what a full Kevin Bacon database would be).
Oh, that's true, for some reason I was thinking of the path from A->Bacon. But Dijkstra from Bacon->A is just as computationally intensive and much more valuable to keep around.
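For what it's worth, here's a minimal Python sketch of the "run it once from Bacon" idea. The co-star graph, the actor names, and the unit edge weights are all made up for illustration; it's not the implementation being discussed, just the single-source shape of it.

```python
import heapq

def dijkstra(graph, source):
    """Return the shortest distance from source to every reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for neighbor, weight in graph.get(node, []):
            nd = d + weight
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist

# Hypothetical co-star graph: every edge has weight 1 (shared a movie).
costars = {
    "Bacon": [("A", 1), ("B", 1)],
    "A": [("Bacon", 1), ("C", 1)],
    "B": [("Bacon", 1)],
    "C": [("A", 1), ("D", 1)],
    "D": [("C", 1)],
    "E": [],  # never worked with anyone connected to Bacon
}

# One run from Bacon yields every reachable actor's Bacon number.
bacon_numbers = dijkstra(costars, "Bacon")
```

Since the co-star relation is symmetric, the distance from Bacon to A is also the distance from A to Bacon, so this one table answers every per-actor lookup. Unreachable actors (like `E` here) simply don't appear in it.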
Couldn’t you “just” create a nullable distance column, then set it to 1 for anyone who is null and reachable from Bacon? Then set anyone still null and reachable from a 1 to 2, etc. Then you have a materialized view of everyone’s Bacon number.
Not a generic solution of course, but seems extremely effective for the Bacon case.
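A rough sketch of that level-by-level update, using Python's `sqlite3` with an in-memory database. The `actor` and `costar` tables, their columns, and the sample rows are hypothetical; a real dataset would derive `costar` from a cast table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE actor (name TEXT PRIMARY KEY, distance INTEGER);
    CREATE TABLE costar (a TEXT, b TEXT);
""")

# Made-up sample data; distance starts out NULL for everyone.
actors = ["Bacon", "A", "B", "C", "D", "E"]
pairs = [("Bacon", "A"), ("Bacon", "B"), ("A", "C"), ("C", "D")]
conn.executemany("INSERT INTO actor (name) VALUES (?)", [(n,) for n in actors])
# Store each co-star pair in both directions so the join stays simple.
conn.executemany("INSERT INTO costar VALUES (?, ?)",
                 pairs + [(b, a) for a, b in pairs])

conn.execute("UPDATE actor SET distance = 0 WHERE name = 'Bacon'")
level = 0
while True:
    # Anyone still NULL who co-starred with someone at the current level
    # gets level + 1.
    cur = conn.execute(
        """
        UPDATE actor SET distance = ?
        WHERE distance IS NULL
          AND name IN (
              SELECT c.b FROM costar c
              JOIN actor known ON known.name = c.a
              WHERE known.distance = ?
          )
        """,
        (level + 1, level),
    )
    if cur.rowcount == 0:
        break  # nobody new was reached, so we're done
    level += 1

bacon_numbers = dict(conn.execute("SELECT name, distance FROM actor"))
```

Anyone left with a NULL distance at the end (`E` in this toy data) has no path to Bacon at all. The loop runs once per distance level, so the number of UPDATE statements is bounded by the largest Bacon number plus one.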
Very, very good: perfect for didactic purposes. I thought I'd try this on SQLite, but I am in a rush right now... So I checked the documentation to refresh my memory about the extent of the relevant SQL implementation in SQLite, and...
TIL that the 'WITH' clause in SQLite can draw the classic Mandelbrot Set:
Now do the Erdos number: defined the same way, but for coauthors of papers linked to the absurdly productive (and always high on amphetamines) great Paul Erdos. The most glorious of all, the Erdos-Bacon number, is defined as the sum of your Bacon number and Erdos number.
Mathematicians Daniel Kleitman and Bruce Reznick tie for the top spot with a number of 3 (Erdos 1s and Bacon 2s). Danica McKellar and Elon are both in there with 6s. And Mayim Bialik, Natalie Portman, and Kristen Stewart are all up there with 7s.
My ex had an Erdos-Bacon number of six or seven when she was in grad school. One paper with your thesis advisor and one role as an extra gets you there quickly.
I don't see why one has to create the actor-actor relation table. I would rather search for shortest paths in the bipartite graph where nodes are movies and actors, and divide the results by 2.
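Something like this, as a sketch in Python. The `cast` mapping and the movie and actor names are invented for illustration. Every actor-to-actor hop passes through exactly one movie node, so an actor's BFS depth in the bipartite graph is twice their Bacon number.

```python
from collections import deque

def bacon_numbers_bipartite(cast, source):
    """BFS over the movie/actor bipartite graph; returns actor -> Bacon number."""
    # Invert the movie -> actors mapping so we can walk actor -> movies too.
    movies_of = {}
    for movie, actors in cast.items():
        for actor in actors:
            movies_of.setdefault(actor, []).append(movie)

    # Nodes are tagged so a movie and an actor with the same name can't collide.
    start = ("actor", source)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        kind, name = queue.popleft()
        if kind == "actor":
            neighbors = [("movie", m) for m in movies_of.get(name, [])]
        else:
            neighbors = [("actor", a) for a in cast[name]]
        for node in neighbors:
            if node not in dist:
                dist[node] = dist[(kind, name)] + 1
                queue.append(node)

    # Actor nodes always sit at even depth; halve it to get the Bacon number.
    return {name: d // 2 for (kind, name), d in dist.items() if kind == "actor"}

# Hypothetical cast lists.
cast = {
    "M1": ["Bacon", "A", "B"],
    "M2": ["A", "C"],
    "M3": ["C", "D"],
    "M4": ["E"],
}
result = bacon_numbers_bipartite(cast, "Bacon")
```

The upside is that the graph has one edge per (movie, actor) credit instead of one edge per pair of co-stars, which matters for movies with large casts.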
That's all very cool. As an aside, I always remembered the Bacon number as being between Bacon and anyone else, Hollywood actor or not. Keeping it within Hollywood makes a lot more sense.