
A few things I wish I had known when I took Statistics courses at university some 25 or so years ago:

- Statistical significance testing and hypothesis testing are two completely different approaches, with different philosophies behind them, developed by different groups of people. They kinda do the same thing, but not quite, and textbooks tend to blur the distinction completely.

- The above approaches were developed in the early 1900s in the context of farms and breweries, where three things were true: 1) data was extremely limited, often with only 5 or 6 data points available; 2) there were no electronic computers, so computation was limited to pen and paper and slide rules; and 3) the cost in time and money of running experiments (e.g., planting a crop differently and waiting for harvest) was enormous.

- The majority of classical statistics was focused on two simple questions: 1) what can I reliably say about a population based on a sample taken from it, and 2) what can I reliably say about the differences between two populations based on the samples taken from each? That's it. An enormous mathematical apparatus was built around answering those two questions in the context of the limitations in point #2.




That was a nice summary.

The data-poor and computation-poor context of old school statistics definitely biased the methods towards the "recipe" approach that scientists are supposed to follow mechanically, where each recipe is some predefined sequence of steps, justified by an analytical approximation to a sampling distribution (given lots of assumptions).

In these computation-rich days, we can get away from the recipes by using resampling methods (e.g. permutation tests and the bootstrap), so we don't need the analytical approximation formulas anymore.

I think there is still room for small sample methods though... it's not like biological and social sciences are dealing with very large samples.
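
For anyone who hasn't seen the resampling idea spelled out, here is a rough sketch of a permutation test for a difference in means and a percentile bootstrap interval, using only the Python standard library. The function names, the default of 10,000 resamples, and the add-one correction on the p-value are my own choices for illustration, not a canonical recipe.

```python
import random
from statistics import mean


def permutation_test(a, b, n_resamples=10_000, seed=0):
    """Two-sided permutation test for a difference in means (illustrative sketch)."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_resamples):
        # Under the null hypothesis the group labels are exchangeable,
        # so shuffle the pooled data and split it at the original group size.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction (an assumption of this sketch) keeps p above zero.
    return (hits + 1) / (n_resamples + 1)


def bootstrap_ci(sample, stat=mean, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for `stat` (illustrative sketch)."""
    rng = random.Random(seed)
    n = len(sample)
    # Resample with replacement, recompute the statistic each time, then sort.
    stats = sorted(stat(rng.choices(sample, k=n)) for _ in range(n_resamples))
    lo = stats[int((alpha / 2) * n_resamples)]
    hi = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi
```

Neither function needs a sampling-distribution formula, which is the point: the computer does the work that the analytical approximation used to do. With only 5 or 6 data points per group, though, the number of distinct permutations is small, so the achievable p-values are coarse.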


My understanding is that frequentist statistics was developed in response to the Bayesian methodology which was prevalent in the 1800s and which was starting to be perceived as having important flaws. The idea that the invention of Bayesian statistics made frequentist statistics obsolete doesn't quite agree with the historical facts.



