Has anyone seen a similar comparison for medium-sized document classification ta...

ogrisel · on June 25, 2012

Have a look at the RCV1 benchmarks on this page: http://leon.bottou.org/projects/sgd

SGD is still slightly faster, but liblinear performs well enough in that case.

ogrisel · on June 25, 2012

One caveat of SGD that has not been mentioned is that you have to configure the learning rate schedule. scikit-learn uses Bottou's tricks, which seem to work reasonably well in practice, but it might be even better to implement the online estimate of the optimal learning rate schedule from this NIPS 2012 pre-print: http://arxiv.org/abs/1206.1106 (No More Pesky Learning Rates).
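
To make the learning rate issue concrete, here is a minimal pure-Python sketch of SGD on an L2-regularized hinge loss with a decaying step size of the form eta_t = eta0 / (1 + lam * eta0 * t), which is the kind of schedule usually associated with Bottou's SGD notes. This is an illustration only: the names `bottou_schedule`, `sgd_hinge`, `predict` and the toy data are made up for this example, and scikit-learn's actual implementation may differ in its details (for instance in how the initial step size is chosen).

```python
import random


def bottou_schedule(eta0, lam, t):
    """Decaying step size eta_t = eta0 / (1 + lam * eta0 * t)."""
    return eta0 / (1.0 + lam * eta0 * t)


def sgd_hinge(data, eta0=0.1, lam=0.01, epochs=20, seed=0):
    """SGD on the L2-regularized hinge loss (linear SVM, no bias term)."""
    rng = random.Random(seed)
    w = [0.0] * len(data[0][0])
    t = 0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            x, y = data[i]
            eta = bottou_schedule(eta0, lam, t)
            margin = y * sum(wj * xj for wj, xj in zip(w, x))
            # Gradient step on the L2 penalty: shrink the weights.
            w = [wj * (1.0 - eta * lam) for wj in w]
            # Gradient step on the hinge loss, active only inside the margin.
            if margin < 1.0:
                w = [wj + eta * y * xj for wj, xj in zip(w, x)]
            t += 1
    return w


def predict(w, x):
    """Return +1 or -1 depending on the sign of the linear score."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) > 0 else -1


# Hypothetical toy data: the label is the sign of the first feature.
toy = [((2.0, 1.0), 1), ((1.5, -1.0), 1), ((3.0, 0.5), 1),
       ((-2.0, 1.0), -1), ((-1.5, -1.0), -1), ((-3.0, -0.5), -1)]

w = sgd_hinge(toy)
print([predict(w, x) == y for x, y in toy])
```

Both `eta0` and `lam` still have to be picked by hand here, which is exactly the tuning burden that an online estimate of the learning rate would aim to remove.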