
What is everyone's favorite function / formula?

Mine is

    n/(n+x)
It has a bunch of interesting aspects:

    As n goes from 0 to infinity, the value goes from 0 to 1.

    When n equals x, it is 0.5.

    As n gets bigger, the difference between the values at n and n+1 gets smaller.

    For two sufficiently large n's, the results are nearly equal.
Say somebody told you about a new cafe in town and that it is completely awesome. The best cafe ever. What probability do you assign to it really being an exceptionally awesome cafe? If your x is 3, then the probability after one person praised it is 25%:

    1/(1+3) = 0.25
And if another person told you about that cafe being awesome, the probability becomes 40%:

    2/(2+3) = 0.4
And after 3 people told you the cafe is awesome, chances are 50% it really is:

    3/(3+3) = 0.5
The changes in probability are pretty strong at the beginning. But after 1000 people reported about the awesome cafe, the next report makes almost no difference anymore. It only ups the probability from 0.997009 to 0.997012.

By changing x from 3 to 4, the formula becomes more "suspicious"; by changing it from 3 to 2, it becomes more "gullible".

I wonder if this formula has a name already. If not, "the trust formula" might be a candidate.
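Here is a minimal Python sketch of the formula (the function name and the default skepticism value are mine, just for illustration):

    def trust(n, x=3.0):
        # probability assigned after n independent positive reports,
        # with skepticism parameter x
        return n / (n + x)

    # Updates are big at first, vanishing later:
    for n in (1, 2, 3, 1000, 1001):
        print(n, round(trust(n), 6))
    # 1 0.25
    # 2 0.4
    # 3 0.5
    # 1000 0.997009
    # 1001 0.997012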



One way to view this formula is to use the fact that the Beta distribution is a conjugate prior for the binomial distribution.

Essentially if you have a Beta(a, b) prior then your prior mean is a/(a+b) and after observing n samples from a Bernoulli distribution that are all positive, your posterior is Beta(a+n, b) with posterior mean (a+n)/(a+n+b). So in your example you effectively have a Beta(0, x) prior and x (“suspicious”/“gullible”) is directly interpreted as the strength of your prior!
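A small sketch of that view in Python (note that Beta(0, x) itself is improper, so this uses a tiny a > 0 to approximate it):

    from scipy.stats import beta

    a, b = 1e-9, 3.0                 # approximates the improper Beta(0, 3) prior
    for n in (1, 2, 3):
        posterior = beta(a + n, b)   # n positive observations, no negatives
        print(n, round(posterior.mean(), 4))
    # 1 0.25
    # 2 0.4
    # 3 0.5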


Can this way to view the formula be expressed without the terms

    beta distribution
    conjugate
    prior
    binomial distribution
    bernoulli distribution
    posterior
?

Because I could easily grasp that it is a "trust formula" in the way mg described it. But this way to "view" the formula is a mystery to me.


Yeah, that's a lot of jargon associated with Bayesian statistics, but at its root the idea is simple: how to merge information you have before observing some data (a.k.a. the prior) with new information you just observed, to obtain updated information (a.k.a. the posterior) that includes both what you believed initially and the new evidence you observed.

The probability machinery (Bayes' rule) is a principled way to do this, and in the case of count data (number of positive reviews for the cafe) works out to be the simple fraction n/(n+x).

Define: x = parameter of how skeptical you are in general about the quality of cafes (large x means very skeptical), m = number of positive reviews for the cafe,

p = (m+1)/(m+1+x): your belief (expressed as a probability) that the cafe is good after hearing m positive reviews about it.

Learning about the binomial and the beta distribution would help you see where the formula comes from. People really like Bayesian machinery, because it has a logical/consistent feel: i.e. rather than coming up with some formula out of thin air, you derive the formula based on general rules about reasoning under uncertainty + updating beliefs.
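To make that concrete without the jargon, here is a tiny sketch of the pseudocount reading (this framing is mine): the skepticism x behaves like x imaginary negative reviews you start out with.

    def belief(positive_reviews, x):
        # x acts like x imagined negative reviews baked in before any data
        return positive_reviews / (positive_reviews + x)

    print(belief(3, x=3))   # 0.5: three real positives vs. three imagined negatives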


> Can this way to view the formula be expressed without the terms

You're asking "Can this way of viewing the formula in terms of Bayesian probability be expressed without any of the machinery of Bayesian probability?".


Have you ever heard of the "Up Goer 5"?


Also, in case anyone is interested, the uninformative Jeffreys prior for this in Bayesian statistics (meaning it does not assume anything and is invariant to certain transformations of the inputs) is Beta(0.5, 0.5). Thus the initial guess is 0.5, and it evolves from there as the data comes in.
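Concretely (my arithmetic, following that setup): with a Beta(0.5, 0.5) prior and n all-positive reports, the posterior is Beta(n + 0.5, 0.5), whose mean is

    (n + 0.5)/(n + 1)

so the estimate starts at 0.5 with no data and is already 0.75 after a single positive report.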


Isn't 0.5 an absurd guess for the probability of a new restaurant being exceptionally good?


This reminds me of a simple algorithm to determine which product to choose if all have similar ratings but varying numbers of votes: add one positive and one negative review and recalculate.

https://www.youtube.com/watch?v=8idr1WZ1A7Q
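If I understood the video right, a sketch of that trick in Python (the vote counts here are made up):

    # Both products have a raw rating of 0.9; adding one positive and
    # one negative vote favors the one with more data.
    products = {"A": (90, 10), "B": (9, 1)}   # (positive, negative) votes

    def adjusted_rating(positive, negative):
        return (positive + 1) / (positive + negative + 2)

    for name, (pos, neg) in products.items():
        print(name, round(adjusted_rating(pos, neg), 3))
    # A 0.892
    # B 0.833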


Doesn't quite work. If nobody has told you about the cafe, the probability that it is awesome should not be zero.

Perhaps you only define the trust function for n >= 1.


Good point! The formula indeed assumes a base probability of zero. That's actually why I put "The best cafe ever" in there and why I called it an "exceptionally awesome cafe". I got a bit lax later in the text just calling it "awesome".

For a cafe aficionado, who spends most of their time in cafes, reading HN and thinking about formulas, the probability that some random cafe becomes their new favorite is virtually zero.

In other words: The more cafes you already know, the closer to zero the chance that a random one will be the best of them all.

So yeah, it is a formula for cafe lovers. Not for the casual person who is happy with random filtered coffee swill from a vending machine. Those would have to add a base probability b, turning the formula into something like b + (1-b)*n/(n+x).
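For example (my numbers, just to show the shape): with base probability b = 0.1 and x = 3, three positive reports give

    0.1 + (1 - 0.1) * 3/(3+3) = 0.55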


I think Laplace's Rule of succession [1] could be better here. It assumes there are binary "successes" and "failures" (e.g. thumbs up/down). Let s be the number of "successes", n be the total number of data points (successes+failures), and 1/x the prior probability of a success. Then the probability that the next data point will be a success is:

(s + 1)/(n + x)

E.g. for prior probability 1/2 (success and failure initially equally likely), x=2, so

(s + 1)/(n + 2)

[1] https://en.wikipedia.org/wiki/Rule_of_succession
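For example (my numbers): with x = 2, after 3 thumbs-up and 2 thumbs-down (s = 3, n = 5), the probability that the next vote is a thumbs-up is

    (3 + 1)/(5 + 2) = 4/7 ≈ 0.571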


Interesting philosophical question: is awesomeness intrinsic or extrinsic (a matter of perception)? Can anything be intrinsic?

If it's intrinsic, then yes, the probability that it is awesome should not be zero if you've never heard of it. Its awesomeness exists independently of any measurement. But, by definition, you can't know its awesomeness until you measure it, so awesomeness quotients only matter after they've been measured. And a measured value must be expressible/observable outside the system (i.e. extrinsic).


This isn't the same formula, but the concept is a bit reminiscent of https://steamdb.info/blog/steamdb-rating/


I would view this as Laplace smoothing or additive smoothing for binary distributions (https://en.wikipedia.org/wiki/Additive_smoothing). I use it all the time when I'm estimating rates of events from a limited number of samples.
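A sketch of how I'd use it for rate estimation (parameter names are mine; alpha = 1 is classic Laplace smoothing):

    def smoothed_rate(events, trials, alpha=1.0, categories=2):
        # add alpha pseudo-observations to each category before dividing
        return (events + alpha) / (trials + alpha * categories)

    print(smoothed_rate(0, 0))       # 0.5     -- no data: falls back to the prior
    print(smoothed_rate(2, 3))       # 0.6     -- raw rate 0.667, pulled toward 0.5
    print(smoothed_rate(200, 300))   # ~0.666  -- plenty of data: prior barely matters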


This is similar to calculating a probability from an odds ratio: if x = 1 and n = your odds ratio, then p = n/(n+1).
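A one-liner sketch of that correspondence (the odds value is made up):

    def odds_to_probability(odds):
        return odds / (odds + 1)   # the n/(n+x) formula with x = 1

    print(odds_to_probability(3.0))   # odds of 3:1 -> probability 0.75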

A neat explainer for how this helps with Bayesian updates (which is the update in trust that you're describing) is here - https://www.lesswrong.com/posts/QGkYCwyC7wTDyt3yT/0-and-1-ar...


The Weierstrass ℘-function. It is what your function wants to be when it grows up.


I think the jump from 1->2 people telling me it's the greatest cafe ever is a bigger jump than the jump from 0->1. Thus I think it would be more like a logarithmic curve.


Initially thought this project was about formulas.


How do you pick "x"?


I'm with you - I easily see how it affects the formula, but I'm not clear on how you decide what the right value or range of values would be.


Based on the application/your experience.


You pass it in?

It looks like it is a two-argument function.



