
Yes, it's a very thought-provoking article. I'm sure there are many competitions on Kaggle that were won due to the testing/training splits or other incidental choices rather than better machine learning.

But I don't think it's fair to complain about $30,000 prizes being awarded to first rather than second place in a specific competition without doing at least a little checking of whether that was actually the case. And the article kind of reads like cynicism that all machine learning is a waste, and all the algorithms are just producing random numbers that randomly happen to be right some of the time and win the competition by random chance.



All I can really say is that my usual readers understand that I am pro-ML; in fact, I'm probably more gung-ho about the potential of deep learning than many of my compatriots.

I've fallen victim to getting a Twitter bump and assuming that people know I'm not anti-ML.

The blog post is meant to be educational, not argumentative. Since it has had wider exposure, I'll do a follow-up to clarify my position on ImageNet.


It's a great post; I love ML, and I've spent many years trying to get value out of it, sometimes succeeding. But folks are applying it without any of the checks and balances that are needed to produce real value in a sustained way.

Two reasons: 1. It's harder to do this than to optimise the behooozas out of a dataset and throw the best model over the wall (and this is often done in good faith, complete with a whole gamut of "standard practices" that are in fact information leaks from test to train, like checking which features are informative on the test set before doing training). 2. Folks don't know better, and best practice is sparsely documented or taught. This is because there are almost no practitioners turned teachers in comp sci. I'm not running down the great people who do great work pushing the field; they are my betters. But the next generation is being misled into thinking that the skills they are picking up in their ML classes are going to keep them gainfully employed in the long term.
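
To make the test-to-train leak concrete, here is a minimal sketch in plain Python. Everything in it is an illustrative assumption rather than anything from the blog post: the data is pure coin-flip noise, the "model" is a single binary feature (optionally flipped) used directly as the prediction, and the function names are hypothetical. The honest run picks the feature on the training split only; the leaky run picks it by checking which feature is informative on the test split, then reports accuracy on that same split.

```python
import random


def agreement(feature_column, labels):
    """Fraction of samples where a binary feature equals the binary label."""
    matches = sum(1 for f, y in zip(feature_column, labels) if f == y)
    return matches / len(labels)


def score(columns, labels, index, flip):
    """Accuracy of predicting the label with one feature, optionally negated."""
    acc = agreement(columns[index], labels)
    return 1.0 - acc if flip else acc


def best_feature(columns, labels):
    """Return the (index, flip) pair that scores highest on the given labels."""
    best_index, best_flip, best_score = 0, False, -1.0
    for index in range(len(columns)):
        for flip in (False, True):
            s = score(columns, labels, index, flip)
            if s > best_score:
                best_index, best_flip, best_score = index, flip, s
    return best_index, best_flip


def run_experiment(n_train=50, n_test=50, n_features=200, seed=0):
    """Compare honest and leaky feature selection on pure noise."""
    rng = random.Random(seed)

    def noise(n):
        return [rng.randint(0, 1) for _ in range(n)]

    # Labels and features are independent coin flips: nothing can be learned.
    train_y, test_y = noise(n_train), noise(n_test)
    train_x = [noise(n_train) for _ in range(n_features)]
    test_x = [noise(n_test) for _ in range(n_features)]

    # Honest: choose the feature using the training split only.
    i, flip_i = best_feature(train_x, train_y)
    honest = score(test_x, test_y, i, flip_i)

    # Leaky: choose the feature by peeking at the test split.
    j, flip_j = best_feature(test_x, test_y)
    leaky = score(test_x, test_y, j, flip_j)

    return honest, leaky


if __name__ == "__main__":
    honest, leaky = run_experiment()
    print(f"honest test accuracy: {honest:.2f}")
    print(f"leaky test accuracy:  {leaky:.2f}")
```

Because the leaky run takes the maximum over every feature on the test split, its reported accuracy can never be lower than the honest one, even though there is no signal to find. With many features and few test samples the gap tends to look like a real result, which is how this kind of leak slips through in good faith.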


This is the problem with the internet and links. You come in with inappropriate context and make judgements based on single pages of text.



