From here on I will assume you have the Applet properly loaded, and will make reference to various GUI components. In fact, what follows is a description of each of the components, along with the specifics of the network implementation where appropriate; this seems a sensible way to describe the project. Examples, conclusions, and interesting observations will be interspersed throughout the discussion of the components. If you just want to skip to the primary conclusions, look for the PROJECT CONCLUSION fields; you can't miss them.
If you're having trouble loading the Applet, all of the following information and more, along with a screen shot, is available at :
I. The input fields :
PROJECT CONCLUSION #1 : if I were really implementing a character recognition algorithm, a simple neural network would be inadequate, and a great deal of image parsing would be required (well beyond the simple scaling employed here). Furthermore, even if I were going to implement a neural-net-based OCR procedure, a Hopfield net would likely not be the most efficient choice. End project conclusion #1.
PROJECT CONCLUSION #2 : The system is much better at recognizing a figure that has had considerable noise applied to it than it is at discerning a hand-drawn shape. In fact, it is quite good at reconstructing noisy patterns that are unrecognizable to the human eye, but is completely incompetent at building a figure that differs in shape from the prototype, even for figures easily recognizable to a literate reader.
As a quick example, try drawing a simple figure (my favorite is an 'x' that is 5 squares from corner to corner). Click 'scale' (because it looks nicer that way), then click 'train'. Then set the noise field to 20%, and click the 'noise' button. Likely you would never recognize the figure in front of you as an 'x'. However, clicking 'propagate' should fully reconstruct the figure.
This is, of course, a silly demonstration, because only one figure has been stored. But it does show that on some level, the network is capable of reconstructing badly damaged prototypes that are beyond human recognition (but still really bad at handwritten characters). Similar properties are indeed demonstrated for larger training sets. Perhaps this network would thus be well-suited to recovering damaged typewritten characters into digital text (fuzzy-looking faxes are a great example).
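For readers who cannot load the Applet, the single-figure demonstration above can be approximated in a few lines of Python. This is only a sketch under stated assumptions, not the Applet's code: it assumes a 15 x 15 field (N = 225, consistent with the 10 * N = 2250 figure quoted later), plain Hebbian storage, an 'x' drawn along both full diagonals rather than the scaled 5-square figure, and one pass over every element in random order in place of the Applet's random element selection. All function names are hypothetical.

```python
import random

def hebbian_weights(patterns):
    """Sum of outer products of +1/-1 patterns, with self-weights left at zero."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j]
    return w

def add_noise(pattern, fraction, rng):
    """Flip the sign of a randomly chosen `fraction` of the elements."""
    noisy = list(pattern)
    for i in rng.sample(range(len(noisy)), int(round(fraction * len(noisy)))):
        noisy[i] = -noisy[i]
    return noisy

def sweep(w, state, rng):
    """Update every element once, in random order, to the sign of its weighted sum."""
    order = list(range(len(state)))
    rng.shuffle(order)
    for i in order:
        total = sum(w[i][j] * state[j] for j in range(len(state)))
        state[i] = 1 if total >= 0 else -1
    return state

rng = random.Random(0)
side = 15  # assumed field size
# An 'x' from corner to corner: +1 on both diagonals, -1 elsewhere.
x_figure = [1 if (r == c or r + c == side - 1) else -1
            for r in range(side) for c in range(side)]
weights = hebbian_weights([x_figure])
noisy = add_noise(x_figure, 0.20, rng)
flipped = sum(a != b for a, b in zip(noisy, x_figure))
recovered = sweep(weights, list(noisy), rng)
print(flipped, "elements flipped;",
      "recovered" if recovered == x_figure else "not recovered")
```

With only one stored figure, every element's weighted sum is dominated by the 80% of the field that is still correct, which is why the reconstruction is complete even though the noisy figure looks like nothing at all.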
II. The output fields :
This leads to a unique property of the Hopfield Network : the possibility of a stable state at the inverse of a stored pattern. For example, train the network on any arbitrary figure. Set the input field's noise to 50% (complete randomness), and propagate a few times. You'll note that, for random input, the original learned pattern and its inverse occur equally often as output. Hence both represent equally stable states.
This is an inconvenience, and will affect the performance of the network when the only performance criterion is the absolute number of 'correct' elements (see numerical performance analysis below). But the 'inverse pattern' effect probably does not have tremendous implications for character recognition, where shape is the primary parameter. In order to apply the Hopfield Network to character recognition, a post-processing algorithm would be required to compare the network output to prototype characters. It is trivial to deal with inverse patterns at this stage.
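The inverse-pattern property, and the kind of trivial post-processing check mentioned above, can be illustrated with a small Python sketch. It assumes plain Hebbian storage of a single 12-element pattern; the helper names `is_stable` and `same_shape` are hypothetical and are not taken from the Applet.

```python
def hebbian_weights(patterns):
    """Sum of outer products of +1/-1 patterns, with zero self-weights."""
    n = len(patterns[0])
    return [[float(sum(p[i] * p[j] for p in patterns)) if i != j else 0.0
             for j in range(n)] for i in range(n)]

def is_stable(w, state):
    """True if no element would change sign when updated."""
    n = len(state)
    for i in range(n):
        total = sum(w[i][j] * state[j] for j in range(n))
        if total * state[i] <= 0:
            return False
    return True

def same_shape(output, prototype):
    """Treat a pattern and its inverse as the same figure."""
    inverse = [-v for v in prototype]
    return output == prototype or output == inverse

pattern = [1, 1, -1, 1, -1, -1, 1, -1, 1, 1, -1, -1]
inverse = [-v for v in pattern]
weights = hebbian_weights([pattern])
print(is_stable(weights, pattern), is_stable(weights, inverse))
print(same_shape(inverse, pattern))
```

Negating every element negates every weighted sum as well, so each element of the inverse still agrees in sign with its own sum; that is all that stability requires.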
For each iteration, a random element a will be chosen from the field. The connection weight between a and each other element b will be multiplied by the value of b to obtain a weighted sum of connection influences. Provided the 'binary nonlinearization' option is checked below (more on this later), the new value of a will simply be +1 or -1, bearing the sign of the weighted sum. It is relevant to note here that in the Hopfield model, the previous state of an element has no influence whatsoever on its current state. This seems logical, as it would not be sensible for an element to have a connection weight other than 1.0 with itself, and self-weights would thus not influence the network in any useful way.
Iteration ceases when 10 * N (in this case 2250) iterations have passed without a change in an element's value. This is somewhat arbitrary, but provides reasonable confidence that the network is 'finished.' The total number of iterations required (including the 2250 uninteresting iterations at the tail end of propagation) will be displayed in the message window.
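The update rule and stopping criterion described above can be sketched roughly as follows. This is a minimal Python illustration, not the Applet's Java source: the function name `propagate`, the `patience_factor` parameter, and the decision to resolve a weighted sum of exactly zero as +1 are assumptions made for the example.

```python
import random

def propagate(w, state, rng, patience_factor=10):
    """Update randomly chosen elements until patience_factor * N consecutive
    updates leave the state unchanged. Returns (final state, iterations)."""
    n = len(state)
    state = list(state)
    unchanged = 0
    iterations = 0
    while unchanged < patience_factor * n:
        a = rng.randrange(n)
        # Weighted sum of influences from every other element b (no self-weight).
        total = sum(w[a][b] * state[b] for b in range(n) if b != a)
        new_value = 1 if total >= 0 else -1  # binary nonlinearization
        if new_value == state[a]:
            unchanged += 1
        else:
            state[a] = new_value
            unchanged = 0
        iterations += 1
    return state, iterations

rng = random.Random(1)
n = 225
stored = [rng.choice([-1, 1]) for _ in range(n)]
# Hebbian weights for a single stored pattern (the diagonal is never read).
weights = [[stored[i] * stored[j] for j in range(n)] for i in range(n)]
final, iterations = propagate(weights, stored, rng)
print(final == stored, iterations)
```

Starting from the stored pattern itself, no element ever changes, so the reported count is exactly the 10 * N = 2250 'uninteresting' iterations of the tail end.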
Virtually all instructions to propagate result in stable states within a few thousand iterations (again provided binary nonlinearization is applied, again more on this later). The required number of iterations for a given pattern may well be an indication of certainty (or proximity of the initial pattern to a stored state), again useful for a character recognition application where a rejection criterion is essential.
Unfortunately, the stable states reached often do not correspond to trained patterns; these 'spurious stable states' are a major drawback of the Hopfield network. One will often observe stable patterns that represent superpositions of trained patterns and/or their inverses. The occurrence of these states is dramatically reduced when error-correction is applied (discussed below), especially for larger numbers of stored figures.
III. The animation fields :
IV. The pattern set fields :
It is important to note that large numbers of learning trials can take a while, but will vastly improve the stability of the learned patterns (the autoassociative capabilities of the network). And again, the error-correction process is stable provided binary nonlinearization is applied (I still promise to discuss this later).
PROJECT CONCLUSION #3: Widrow-Hoff correction, or some similar supervised learning algorithm, is so effective as to be ESSENTIAL to any potential memory/recall applications of the Hopfield net. The numerical demonstration discussed below will make it very clear that pattern recognition with simple Hebbian learning is, so to speak, really hokey. Widrow-Hoff learning, however, can allow for perfect autoassociation - and thus good recovery from noise that does not damage figure shape - for numbers of patterns on the same order as the number of letters in the alphabet or digits available in Western numbering.
V. The file fields :
http://techhouse.brown.edu/dmorris/JOHN/JOHN.zip
The StinterNetLocal.class file is a Java 1.1 class that will load the above Applet with security restrictions disabled. I'll put a quarter on the fact that no one's interested enough to click that link. If you do, or even if you can just convince your browser to relax security enough to read a URL (which IE4.0 is SUPPOSED to let you do), an error-corrected set of weights representing the 10 Arabic digits in block form is available at :
http://techhouse.brown.edu/dmorris/digitweights.txt
This is essentially the data file that represents my 'character-recognition' prototypes. If you are in a position to read files, simply enter the above URL in the FILENAME field, and click 'READ WEIGHTS'. Then watch digits that you probably didn't intend to draw emerge as stable patterns. This would be a central feature of the program (and I would try much harder to make it more accessible) if character recognition was really all that good. Also, read weights at your own risk... since most browsers won't support it, I haven't been able to fully debug it and funny things happen sometimes...
VI. The correction trial fields :
The results should be rather convincing, provided sufficient learning trials are applied. For example, try running the simulation with a learning constant of .2, 15 patterns, a random seed of 5555, and 350 learning trials. You might not want to actually try it, since it will take five minutes or so. But you will see that Hebbian learning alone gives 1671 errors (varying slightly, perhaps, with other implementations of the random number generator), and Widrow-Hoff learning with these parameters corrects the system to perfect autoassociation.
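A much smaller version of this comparison can be reproduced with the following Python sketch. It is not the Applet's algorithm as such: the exact correction rule is not spelled out in this text, so the sketch assumes the classic linear Widrow-Hoff (delta) rule, a step scaled by 1/N, zero self-weights, and a hand-built set of six strongly correlated 12-element patterns instead of the random 15-pattern set. The function names are hypothetical.

```python
def sign(x):
    return 1 if x >= 0 else -1

def hebbian_weights(patterns):
    """Sum of outer products of +1/-1 patterns, with zero self-weights."""
    n = len(patterns[0])
    return [[float(sum(p[i] * p[j] for p in patterns)) if i != j else 0.0
             for j in range(n)] for i in range(n)]

def weighted_sums(w, state):
    n = len(state)
    return [sum(w[i][j] * state[j] for j in range(n)) for i in range(n)]

def count_errors(w, patterns):
    """Elements whose one-step recall differs from the stored value."""
    return sum(sign(h) != target
               for p in patterns
               for h, target in zip(weighted_sums(w, p), p))

def widrow_hoff(w, patterns, learning_constant, trials):
    """Nudge each element's incoming weights toward reproducing each pattern."""
    n = len(patterns[0])
    for _ in range(trials):
        for p in patterns:
            sums = weighted_sums(w, p)
            for i in range(n):
                step = learning_constant * (p[i] - sums[i]) / n
                for j in range(n):
                    if j != i:  # self-weights stay at zero
                        w[i][j] += step * p[j]
    return w

# Six correlated patterns: all +1 except one adjacent pair set to -1.
n = 12
patterns = [[-1 if j // 2 == k else 1 for j in range(n)] for k in range(n // 2)]
weights = hebbian_weights(patterns)
errors_hebbian = count_errors(weights, patterns)
widrow_hoff(weights, patterns, learning_constant=0.2, trials=350)
errors_corrected = count_errors(weights, patterns)
print(errors_hebbian, errors_corrected)
```

In this toy set the Hebbian weights recall an all-positive field for every pattern, so both elements of every flipped pair are wrong (12 errors in total); after correction each pattern is reproduced exactly.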
PROJECT CONCLUSION #4: See project conclusion #3... I just wanted to reiterate how useless the Hopfield network would be for OCR with no correction, and how much potential is added with Widrow-Hoff weight adjustments. Perfect autoassociation is, of course, a LONG way from useful pattern recognition. But it is certainly a prerequisite, and perhaps the only one of the many prerequisites for OCR that can be achieved solely with the Hopfield net.
Incidentally, for a very small number of learning trials and a small learning constant, you may find that no change at all occurs; weight changes need to be large enough to actually change the sign of elements during propagation. So don't hold it against me if you run a trial with an LC of .2 and 20 learning trials, and nothing happens. In other words, good things come to those who can wait a few iterations.
VII. Nonlinearization to binary elements :
Unfortunately, the behavior of the network is VERY unpredictable with continuous-valued units, especially when weights are not constrained to symmetry (see below). As expected, when it works, one sees more effective learning with the Widrow-Hoff algorithm for a given number of iterations. This is because 'corrections' can draw on more information than just the sign of the weighted connection sum.
I encourage you to try this only at risk of your own patience, as it does have a tendency to result in infinite propagation. This is a consequence of allowing very small values, which can lead to oscillations between signs. Hence the 10*N iterations necessary to declare the network 'stable' will never be reached.
PROJECT CONCLUSION #5: The binary nature of the elements in the traditional Hopfield network may seem a simplification, but it does not prevent high-quality autoassociation and it avoids potentially oscillatory states.
VIII. Constraining weight symmetry :
PROJECT CONCLUSION #6: While constraining the weights in the network to symmetry may intuitively seem to place limitations on possible weight solutions, it results in an immediately observable performance increase and should be applied whenever there is a limit on possible iterations.
Note that Hebbian learning alone inherently results in symmetric weights, with no applied constraints.
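The symmetry of Hebbian weights is easy to check directly, as in the Python sketch below. How the Applet enforces symmetry after correction is not described in this text; averaging each pair of mirrored weights, as `symmetrize` does here, is only one plausible way to do it and is an assumption of the example, as are the function names.

```python
def hebbian_weights(patterns):
    """Sum of outer products of +1/-1 patterns, with zero self-weights."""
    n = len(patterns[0])
    return [[float(sum(p[i] * p[j] for p in patterns)) if i != j else 0.0
             for j in range(n)] for i in range(n)]

def is_symmetric(w):
    n = len(w)
    return all(w[i][j] == w[j][i] for i in range(n) for j in range(i))

def symmetrize(w):
    """Replace each mirrored pair of weights with its average (assumed scheme)."""
    n = len(w)
    for i in range(n):
        for j in range(i):
            mean = (w[i][j] + w[j][i]) / 2.0
            w[i][j] = mean
            w[j][i] = mean
    return w

patterns = [[1, -1, 1, 1, -1, -1], [1, 1, -1, 1, -1, 1]]
weights = hebbian_weights(patterns)
hebbian_symmetric = is_symmetric(weights)

# A one-sided adjustment, such as an unconstrained correction step might make.
weights[0][1] += 0.5
after_adjustment = is_symmetric(weights)

symmetrize(weights)
after_symmetrize = is_symmetric(weights)
print(hebbian_symmetric, after_adjustment, after_symmetrize)
```

The Hebbian term for elements i and j is the same product whichever way round it is read, which is why no constraint is needed until a correction rule starts adjusting the two directions separately.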
And so that's about it. An interesting exploration, though a disappointment with regard to character recognition. A task for graphics-types...