
Kaggle CEO here.

Agree with a mild version of the author's statement: that in many competitions, the difference between the top n spots is not statistically significant. However, the author's statement (as represented by this chart https://lukeoakdenrayner.files.wordpress.com/2019/09/ai-comp...) is far too strong.
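
To make that mild version concrete, here's a minimal back-of-the-envelope sketch (my own illustration with a made-up metric, made-up scores, and a hypothetical test-set size, not taken from any real competition): two nearby accuracy scores evaluated on a few thousand test examples have overlapping confidence intervals, so the gap between them can easily be noise.

    # Illustrative sketch only: hypothetical scores and test-set size, assuming the
    # metric is plain accuracy and each test prediction is an independent Bernoulli trial.
    import math

    def accuracy_ci(acc, n, z=1.96):
        """Normal-approximation 95% confidence interval for accuracy measured on n examples."""
        se = math.sqrt(acc * (1 - acc) / n)
        return acc - z * se, acc + z * se

    n_test = 3000                              # hypothetical private test-set size
    first_place, second_place = 0.912, 0.908   # hypothetical leaderboard scores

    for label, score in [("1st", first_place), ("2nd", second_place)]:
        lo, hi = accuracy_ci(score, n_test)
        print(f"{label} place: {score:.3f}  95% CI ({lo:.3f}, {hi:.3f})")
    # The intervals overlap heavily, so a 0.4-point gap on this test set
    # doesn't establish that the 1st-place model is genuinely better.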

The actual best model may not always win, but will typically be in the top 0.1%.

There are people on this thread who have poked holes in the author's sample size calculator (I'm not going to rehash that).

But here's an empirical observation: the same top-ranked Kagglers consistently perform well, competition after competition.

You can see this by digging through the profiles of top-ranked Kagglers (https://www.kaggle.com/rankings), or by looking at competition leaderboards. For example, the leaderboard screenshot the author shared in the post (https://lukeoakdenrayner.files.wordpress.com/2019/09/pneumo-...) shows that 11 of the 13 top performers are Masters and Grandmasters, which puts them among the top-ranked ~1,500 members of our community of 3.4 million data scientists (orange and gold dots under the profile pictures indicate Master and Grandmaster rank).

I actually think the author's headline is often correct: there are many cases where machine learning competitions don't produce useful models. But for a completely different reason: competitions sometimes have leakage.



To elaborate on leakage: it's a case where something in the training or test dataset wouldn't be available in a production setting.

As a funny example: I remember we were once given a dataset to predict prostate cancer from ~300 variables. One of the variables was "had prostate cancer surgery". Turned out that was a very good predictor of prostate cancer ;). Thankfully that was an example where we caught the leakage. Unfortunately there are cases where we don't catch the leakage.
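
To make that leakage pattern concrete, here's a minimal sketch on synthetic data (hypothetical column names and numbers, not the actual competition dataset): a feature that is only recorded after the diagnosis makes cross-validation look far better than anything achievable in production, where that feature doesn't exist yet.

    # Illustrative sketch with synthetic data and hypothetical column names.
    import numpy as np
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    n = 5000
    df = pd.DataFrame({
        "age": rng.integers(40, 90, n),
        "psa_level": rng.normal(4.0, 2.0, n),
    })
    # Synthetic label: a noisy function of PSA level.
    df["has_cancer"] = (df["psa_level"] + rng.normal(0.0, 2.0, n) > 6.0).astype(int)
    # Leaky feature: surgery is only performed on patients already diagnosed,
    # so it encodes the label and would not exist at prediction time.
    df["had_prostate_surgery"] = ((df["has_cancer"] == 1) & (rng.random(n) < 0.8)).astype(int)

    y = df["has_cancer"]
    X_leaky = df[["age", "psa_level", "had_prostate_surgery"]]
    X_clean = df[["age", "psa_level"]]

    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    print("CV accuracy with leak:   ", cross_val_score(clf, X_leaky, y, cv=5).mean())
    print("CV accuracy without leak:", cross_val_score(clf, X_clean, y, cv=5).mean())
    # The leaky feature inflates the score; the model learns "surgery => cancer"
    # rather than anything usable for screening.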



