I'm very surprised that this worked with the small amount of training data he provided! (He "provided the algorithm ten images containing hand written digits..." If we generously estimate that each image contains 10 digits, that's only 100 training examples.)
Author here: I'm equally surprised it worked so well with so little training data. Each image contained 1 example of each character (18 total characters * 10 images = 180 examples). Having said that, I don't think it would generalize well to other people's handwriting unless I provided a lot more training data.
I experimented with a few regularization factors and ultimately settled on a lambda of 0.1. Rather than using a stopping criterion, I ran a fixed number of training iterations (~100) and just eyeballed the cost function results. Since my total training time was fairly brief (~2 minutes, tops), I had the luxury of designing the ANN somewhat heuristically.
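For readers who want to see what "fixed iterations plus a regularized cost" looks like in practice, here is a minimal Python sketch. It is not the author's code: it swaps the ANN for a single logistic unit to stay short, and the function names, the learning rate of 0.5, and the toy data in the usage note are all assumptions made for illustration. Only the lambda of 0.1 and the roughly 100 iterations come from the comment above.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def regularized_cost(weights, bias, samples, labels, lam):
    """Mean cross-entropy plus an L2 penalty (lam / 2m) * sum(w^2)."""
    m = len(samples)
    total = 0.0
    for x, y in zip(samples, labels):
        p = min(max(predict(weights, bias, x), 1e-12), 1 - 1e-12)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    penalty = (lam / (2 * m)) * sum(w * w for w in weights)
    return total / m + penalty


def train_fixed_iterations(samples, labels, lam=0.1, iterations=100,
                           learning_rate=0.5):
    """Plain gradient descent for a fixed number of steps, no stopping rule.

    learning_rate is an assumed value; the bias is not regularized.
    """
    m, n = len(samples), len(samples[0])
    weights, bias = [0.0] * n, 0.0
    history = []  # cost after each iteration, for eyeballing
    for _ in range(iterations):
        grad_w, grad_b = [0.0] * n, 0.0
        for x, y in zip(samples, labels):
            err = predict(weights, bias, x) - y
            for j in range(n):
                grad_w[j] += err * x[j]
            grad_b += err
        for j in range(n):
            weights[j] -= learning_rate * (grad_w[j] / m + (lam / m) * weights[j])
        bias -= learning_rate * grad_b / m
        history.append(regularized_cost(weights, bias, samples, labels, lam))
    return weights, bias, history
```

Calling `train_fixed_iterations([[0, 0], [0, 1], [1, 0], [1, 1]], [0, 1, 1, 1])` returns a `history` list of 100 costs; printing or plotting it is the "eyeballing" step, and a curve that has flattened out suggests the fixed iteration count was enough.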
superb! this is exactly the kind of thing computers should be doing for us. Would also like to have diagram recognition to turn hand-drawn graphs, networks, 3d-surfaces etc into data-structures that can be computed with, and also can it recognize code and run it please t
This looks interesting. But I was a bit disappointed about the amount of work needed to divide the job into subtasks (finding the characters, cropping them, etcetera). I was somehow hoping the ANN would take care of that as well.
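To give a sense of what that "finding and cropping" step can involve, here is a small Python sketch of one common approach: split on blank columns, then trim blank rows. This is an assumption about how such a step could be done, not a description of the author's actual pipeline, and the function names and the 0/1 nested-list image format are made up for the example.

```python
def find_character_spans(image):
    """Return (start, end) column ranges that contain ink.

    image is a list of rows, each a list of 0 (blank) or 1 (ink).
    Characters are assumed to be separated by at least one blank column.
    """
    width = len(image[0])
    has_ink = [any(row[c] for row in image) for c in range(width)]
    spans, start = [], None
    for c, ink in enumerate(has_ink):
        if ink and start is None:
            start = c                      # a character begins
        elif not ink and start is not None:
            spans.append((start, c))       # a character ends
            start = None
    if start is not None:
        spans.append((start, width))       # character touches right edge
    return spans


def crop_character(image, span):
    """Cut out one column span and trim blank rows above and below."""
    left, right = span
    rows = [row[left:right] for row in image]
    inked = [i for i, row in enumerate(rows) if any(row)]
    return rows[inked[0]:inked[-1] + 1]
```

Each cropped glyph would then be resized to the classifier's input size. Note that this simple projection breaks down when characters touch or overlap, which is part of why segmentation takes real work.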
Having a single ANN do everything is the best way to end up with a system that works wonders 95% of the time, but then you give it a simple 1+1 question and it answers 42.
It would also require way more neurons, and a lot of processing power. Consider the curse of dimensionality: he's working with a 5000ish-dimensional vector, so you have to make it simpler on the machine at some point!
But I can also recognize digits by correlating them against a database of millions of test-characters. Recognizing perfectly cropped characters is not a hard problem. The only benefit of an ANN is the compactness of the representation. So, unfortunately, as a non-expert, I would say this is only a minor step in the direction of true artificial intelligence. The application looks quite fun and interesting though.
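The correlation approach described here can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names are invented, the "database" is a short list of labeled templates rather than millions, and images are flattened lists of pixel values of equal length.

```python
import math


def normalized_correlation(a, b):
    """Pearson correlation between two equal-length pixel lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return num / den if den else 0.0


def classify_by_correlation(pixels, templates):
    """Return the label of the template that correlates best with pixels.

    templates is a list of (label, pixel_list) pairs.
    """
    best = max(templates, key=lambda item: normalized_correlation(pixels, item[1]))
    return best[0]
```

The trade-off the comment points at shows up directly: every query is compared against every stored template, so memory and lookup time grow with the database, whereas a trained network keeps only its weights.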
That's because you're confusing ANNs with General/Hard AI. They're not the same thing. ANNs are a tool that excels at classically difficult tasks (pattern matching, for example). But since the AI winters we've learned that, as good as they are at pattern matching, we can't rely on them to do all the work.
And I would like to disagree with you. A system, with many small subsystems dedicated to specific tasks, is not only simpler to develop, but also better from an engineering point of view.
To give a practical example: do you use the same "parts" of the brain to read a poem and to interpret a mathematical formula? If you did, you'd be quite bad at both things. Your brain has specialized "parts" (not necessarily physical parts) to correctly interpret different things. Why should we not do the same with our AI systems?
The author probably could have used an ANN to segment the images as well, before feeding the segments into another NN to classify them. This has the benefit of re-using the character recognition network, which would better simulate what humans do, since we read new equations character by character into memory (unless we've seen so many that we begin chunking them like speed readers do with word groups).
The latest work, MathBoxes, uses the recognition engine from the starPAD sdk http://graphics.cs.brown.edu/research/pcc/research.html#star...
It's a great toolkit for building pen-centric computing tools (especially math recognizers and tools), but unfortunately it is heavily tied to the old Windows 7 tablet APIs and so isn't easily generalized. I've been hoping to port it to work on newer hardware for the past few years, but have not yet found the time. If anyone wants to take on that project it would be incredibly useful (especially since there seems to be a resurgence of pen-centric computing on the near horizon).
You can find more pen-math work on Brown's website: http://cs.brown.edu/research/ptc/FluidMath.html