
What is everyone's favorite function / formula?

Mine is

    n/(n+x)
It has a bunch of interesting aspects:

    As n goes from 0 to infinity, the value goes from 0 to 1.

    When n equals x, it is 0.5.

    As n gets bigger, the difference between the values at n and n+1 gets smaller.

    For two sufficiently large n's, the results are nearly equal.
Say somebody told you about a new cafe in town and that it is completely awesome. The best cafe ever. What probability do you assign to it really being an exceptionally awesome cafe? If your x is 3, then the probability after one person praised it is 25%:

    1/(1+3) = 0.25
And if another person told you about that cafe being awesome, the probability becomes 40%:

    2/(2+3) = 0.4
And after 3 people told you the cafe is awesome, chances are 50% it really is:

    3/(3+3) = 0.5
The changes in probability are pretty strong at the beginning. But after 1000 people reported about the awesome cafe, the next report makes almost no difference anymore. It only ups the probability from 0.997009 to 0.997012.

By changing x from 3 to 4, the formula becomes more "suspicious"; by changing it from 3 to 2, it becomes more "gullible".

I wonder if this formula has a name already. If not, "the trust formula" might be a candidate.
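Here is a minimal Python sketch of the formula (the function name and the default skepticism value are mine, just for illustration):

    def trust(n, x=3.0):
        # probability assigned after n independent positive reports,
        # with skepticism parameter x
        return n / (n + x)

    # Updates are big at first, vanishing later:
    for n in (1, 2, 3, 1000, 1001):
        print(n, round(trust(n), 6))
    # 1 0.25
    # 2 0.4
    # 3 0.5
    # 1000 0.997009
    # 1001 0.997012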



One way to view this formula is to use the fact that the Beta distribution is a conjugate prior for the binomial distribution.

Essentially if you have a Beta(a, b) prior then your prior mean is a/(a+b) and after observing n samples from a Bernoulli distribution that are all positive, your posterior is Beta(a+n, b) with posterior mean (a+n)/(a+n+b). So in your example you effectively have a Beta(0, x) prior and x (“suspicious”/“gullible”) is directly interpreted as the strength of your prior!
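A small sketch of that view in Python (note that Beta(0, x) itself is improper, so this uses a tiny a > 0 to approximate it):

    from scipy.stats import beta

    a, b = 1e-9, 3.0                 # approximates the improper Beta(0, 3) prior
    for n in (1, 2, 3):
        posterior = beta(a + n, b)   # n positive observations, no negatives
        print(n, round(posterior.mean(), 4))
    # 1 0.25
    # 2 0.4
    # 3 0.5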


Can this way to view the formula be expressed without the terms

    beta distribution
    conjugate
    prior
    binomial distribution
    bernoulli distribution
    posterior
?

Because I could easily grasp that it is a "trust formula" in the way mg described it. But this way to "view" the formula is a mystery to me.


Yeah, that's a lot of jargon associated with Bayesian statistics, but at its root the idea is simple: how to merge information you have before observing some data (a.k.a. the prior) with new information you just observed, to obtain updated information (a.k.a. the posterior) that includes both what you believed initially and the new evidence you observed.

The probability machinery (Bayes' rule) is a principled way to do this, and in the case of count data (number of positive reviews for the cafe) works out to be the simple fraction n/(n+x).

Define: x = parameter of how skeptical you are in general about the quality of cafes (large x means very skeptical), m = number of positive reviews for the cafe,

p = (m+1)/(m+1+x): your belief (expressed as a probability) that the cafe is good after hearing m positive reviews about it.

Learning about the binomial and the beta distribution would help you see where the formula comes from. People really like Bayesian machinery, because it has a logical/consistent feel: i.e. rather than coming up with some formula out of thin air, you derive the formula based on general rules about reasoning under uncertainty + updating beliefs.
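To make that concrete without the jargon, here is a tiny sketch of the pseudocount reading (this framing is mine): the skepticism x behaves like x imaginary negative reviews you start out with.

    def belief(positive_reviews, x):
        # x acts like x imagined negative reviews baked in before any data
        return positive_reviews / (positive_reviews + x)

    print(belief(3, x=3))   # 0.5: three real positives vs. three imagined negatives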


> Can this way to view the formula be expressed without the terms

You're asking "Can this way of viewing the formula in terms of Bayesian probability be expressed without any of the machinery of Bayesian probability?".


Have you ever heard of the "Up Goer 5"?


Also, in case anyone is interested, the uninformative Jeffreys prior for this in Bayesian statistics (meaning it does not assume anything and is invariant to certain transformations of the inputs) is Beta(0.5, 0.5). Thus the initial guess is 0.5, and it evolves from there as the data comes in.
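Concretely (my arithmetic, following that setup): with a Beta(0.5, 0.5) prior and n all-positive reports, the posterior is Beta(n + 0.5, 0.5), whose mean is

    (n + 0.5)/(n + 1)

so the estimate starts at 0.5 with no data and is already 0.75 after a single positive report.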


Isn't 0.5 an absurd guess for the probability of a new restaurant being exceptionally good?


This reminds me of a simple algorithm to determine which product to choose if all have similar ratings but varying numbers of votes: add one positive and one negative review and recalculate.

https://www.youtube.com/watch?v=8idr1WZ1A7Q
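If I understood the video right, a sketch of that trick in Python (the vote counts here are made up):

    # Both products have a raw rating of 0.9; adding one positive and
    # one negative vote favors the one with more data.
    products = {"A": (90, 10), "B": (9, 1)}   # (positive, negative) votes

    def adjusted_rating(positive, negative):
        return (positive + 1) / (positive + negative + 2)

    for name, (pos, neg) in products.items():
        print(name, round(adjusted_rating(pos, neg), 3))
    # A 0.892
    # B 0.833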


Doesn't quite work. If nobody has told you about the cafe, the probability that it is awesome should not be zero.

Perhaps you only define the trust function for n >= 1.


Good point! The formula indeed assumes a base probability of zero. That's actually why I put "The best cafe ever" in there and why I called it an "exceptionally awesome cafe". I got a bit lax later in the text just calling it "awesome".

For a cafe aficionado, who spends most of their time in cafes, reading HN and thinking about formulas, the probability that some random cafe becomes their new favorite is virtually zero.

In other words: The more cafes you already know, the closer to zero the chance that a random one will be the best of them all.

So yeah, it is a formula for cafe lovers. Not for the casual person who is happy with random filtered coffee swill from a vending machine. Those would have to add a base probability b, turning the formula into something like b + (1-b)*n/(n+x).
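For example (my numbers, just to show the shape): with base probability b = 0.1 and x = 3, three positive reports give

    0.1 + (1 - 0.1) * 3/(3+3) = 0.55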


I think Laplace's Rule of succession [1] could be better here. It assumes there are binary "successes" and "failures" (e.g. thumbs up/down). Let s be the number of "successes", n be the total number of data points (successes+failures), and 1/x the prior probability of a success. Then the probability that the next data point will be a success is:

(s + 1)/(n + x)

E.g. for prior probability 1/2 (success and failure initially equally likely), x=2, so

(s + 1)/(n + 2)

[1] https://en.wikipedia.org/wiki/Rule_of_succession
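For example (my numbers): with x = 2, after 3 thumbs-up and 2 thumbs-down (s = 3, n = 5), the probability that the next vote is a thumbs-up is

    (3 + 1)/(5 + 2) = 4/7 ≈ 0.571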


Interesting philosophical question: is awesomeness intrinsic or extrinsic (a matter of perception)? Can anything be intrinsic?

If it's intrinsic, then yes, the probability that it is awesome should not be zero if you've never heard of it. Its awesomeness exists independently of any measurement. But, by definition, you can't know its awesomeness until you measure it, so awesomeness quotients only matter after they've been measured. And a measured value must be expressible/observable outside the system (i.e. extrinsic).


This isn't the same formula, but the concept is a bit reminiscent of https://steamdb.info/blog/steamdb-rating/


I would view this as Laplace smoothing or additive smoothing for binary distributions (https://en.wikipedia.org/wiki/Additive_smoothing). I use it all the time when I'm estimating rates of events from a limited number of samples.
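A sketch of how I'd use it for rate estimation (parameter names are mine; alpha = 1 is classic Laplace smoothing):

    def smoothed_rate(events, trials, alpha=1.0, categories=2):
        # add alpha pseudo-observations to each category before dividing
        return (events + alpha) / (trials + alpha * categories)

    print(smoothed_rate(0, 0))       # 0.5     -- no data: falls back to the prior
    print(smoothed_rate(2, 3))       # 0.6     -- raw rate 0.667, pulled toward 0.5
    print(smoothed_rate(200, 300))   # ~0.666  -- plenty of data: prior barely matters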


This is similar to calculating a probability from an odds ratio: if x = 1 and n = your odds ratio, then p = n/(n+1).
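A one-liner sketch of that correspondence (the odds value is made up):

    def odds_to_probability(odds):
        return odds / (odds + 1)   # the n/(n+x) formula with x = 1

    print(odds_to_probability(3.0))   # odds of 3:1 -> probability 0.75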

A neat explainer for how this helps with Bayesian updates (which is the update in trust that you're describing) is here - https://www.lesswrong.com/posts/QGkYCwyC7wTDyt3yT/0-and-1-ar...


The Weierstrass ℘-function. It is what your function wants to be when it grows up.


I think the jump from 1->2 people telling me it's the greatest cafe ever is a bigger jump than the jump from 0->1. Thus I think it would be more like a logarithmic curve.


Initially thought this project was about formulas.


How do you pick "x"?


I'm with you - I easily see how it affects the formula, but I'm not clear on how you decide what the right value or range of values would be.


Based on the application/your experience.


You pass it in?

It looks like it is a two-argument function.



