Has anyone seen a similar comparison for medium-sized document classification tasks? I'd imagine LibLINEAR would perform far better for document classification than it does in these results.
One caveat of SGD that goes unmentioned is how to configure the learning rate schedule. scikit-learn uses Bottou's tricks, which seem to work reasonably well in practice, but it might be even better to implement the online estimate of the optimal learning rate schedule from this NIPS 2012 pre-print: http://arxiv.org/abs/1206.1106 (No More Pesky Learning Rates).
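To make the schedule question concrete, here is a minimal pure-Python sketch of SGD on an L2-regularized hinge loss with a decaying step size. The form eta_t = eta0 / (1 + eta0 * lam * t) is the one I recall from Bottou's SGD write-up, so treat it as an assumption. It is algebraically the same as 1 / (lam * (t + t0)) with t0 = 1 / (eta0 * lam), which is the shape scikit-learn documents for its "optimal" schedule. The function names, the toy data, and the values of eta0 and lam are all made up for illustration. This is not scikit-learn's code, and it does not implement the adaptive method from the paper.

```python
import random


def decaying_rate(eta0, lam, t):
    """Step size eta_t = eta0 / (1 + eta0 * lam * t) (assumed form, see above)."""
    return eta0 / (1.0 + eta0 * lam * t)


def sgd_hinge(samples, lam=0.01, eta0=0.1, epochs=20, seed=0):
    """Plain SGD on an L2-regularized hinge loss; no bias term, for brevity."""
    rng = random.Random(seed)
    w = [0.0] * len(samples[0][0])
    data = list(samples)
    t = 0
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:  # labels y are -1 or +1
            eta = decaying_rate(eta0, lam, t)
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            # The L2 penalty shrinks the weights on every step.
            w = [wi * (1.0 - eta * lam) for wi in w]
            if margin < 1.0:
                # Hinge loss is active: move toward classifying x correctly.
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
            t += 1
    return w


def predict(w, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1


# Hypothetical toy data, linearly separable through the origin.
toy = [([2.0, 1.0], 1), ([1.5, 2.0], 1), ([3.0, 2.5], 1),
       ([-1.0, -2.0], -1), ([-2.0, -1.5], -1), ([-2.5, -3.0], -1)]
w = sgd_hinge(toy)
```

The point of the sketch is that eta0 and lam still have to be picked by hand (or by a heuristic), and a bad eta0 either stalls or overshoots. That hand-tuning is exactly what an online estimate of the learning rate would try to remove.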