I have been hunting for words in some of the previous posts. Let me try again:
Suppose you have a space of objects, each given by data that can be measured, and you wish to use the measurements to help recognize the object. Here is the method: use a discrete dictionary of parametrized ideal objects, called models, whose measurements are set to match the measurements of a given object you wish to recognize. There may be several models in the dictionary with these same measurements. Each such model is itself an object in the object space. Because the model is in the same space as the object to be recognized, comparison metrics can (and should) be based on model-to-object distance there, not on any distance concepts in the space of the measurements. The best model is the one that shares the object's measurements and is closest to it in object space.
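A minimal sketch of the method in Python (the measurement function, the two model families, and the distance are toy placeholders I am inventing for illustration, not the dictionaries I actually use):

import numpy as np

def measure(obj):
    # Toy measurement: the centroid of the points plus their mean
    # distance from it.
    centroid = obj.mean(axis=0)
    spread = np.linalg.norm(obj - centroid, axis=1).mean()
    return centroid[0], centroid[1], spread

def model_dictionary(measurements):
    # Toy "inversion": parametrized ideal objects (a circle and a
    # segment) whose own measurements match the given ones.
    cx, cy, spread = measurements
    t = np.linspace(0.0, 2.0 * np.pi, 200)
    circle = np.column_stack([cx + spread * np.cos(t),
                              cy + spread * np.sin(t)])
    # A uniform segment of half-length 2*spread also has mean distance
    # `spread` from its centroid, so it shares the same measurements.
    s = np.linspace(-2.0 * spread, 2.0 * spread, 200)
    segment = np.column_stack([cx + s, np.full_like(s, cy)])
    return {"circle": circle, "segment": segment}

def object_distance(model, obj):
    # Comparison is done in object space: mean distance from each data
    # point to the nearest point of the model, not a distance between
    # measurement vectors.
    d = np.linalg.norm(obj[:, None, :] - model[None, :, :], axis=2)
    return d.min(axis=1).mean()

def recognize(obj):
    candidates = model_dictionary(measure(obj))
    return min(candidates, key=lambda name: object_distance(candidates[name], obj))

# Noisy points on a unit circle end up closer (in object space) to the
# circle model than to the segment model that shares its measurements.
rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, 300)
data = np.column_stack([np.cos(theta), np.sin(theta)]) + 0.05 * rng.normal(size=(300, 2))
print(recognize(data))   # "circle"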
In particular, you want to avoid the trap of defining recognition in terms of regions in the measurement space. That can lead to expert systems with training instability and an infinity of corner cases. This new, better way of looking at it - with a "Total Space" of objects, a "Base Space" of measurements, and a method for inverting the measurements - is reminiscent of constructing a section of a fibre bundle (like a branch of the logarithm).
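Spelled out in symbols (my paraphrase, with notation invented just for this post): the measurement map plays the role of the bundle projection, the dictionary inverts it fibre by fibre, and a consistent choice of best model acts like a section - just as a branch of log is a section of exp.

\[
  \mu : E \longrightarrow B, \qquad
  \mu^{-1}(b) = \{\, m \in E : \mu(m) = b \,\},
\]
\[
  \hat{m}(x) \;=\; \operatorname*{arg\,min}_{m \,\in\, \mu^{-1}(\mu(x))} d_E(m, x),
  \qquad \mu\bigl(\hat{m}(x)\bigr) = \mu(x),
\]
\[
  \text{compare:}\quad \exp : \mathbb{C} \to \mathbb{C}^{\times}, \qquad
  \log_k : \mathbb{C}^{\times} \setminus \mathbb{R}_{\le 0} \to \mathbb{C}, \qquad
  \exp \circ \log_k = \mathrm{id}.
\]

Here E is the "Total Space" of objects, B the "Base Space" of measurements, and d_E the model-to-object distance; the fibre over b is the set of models sharing the measurements b, and choosing the best model consistently over all of B is the section.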
But what is most evocative to me is that this recognition takes place in a context where measurement is possible - a context with some form of coordinate system and some mechanism for aligning the coordinates to the objects to be recognized, in order to perform measurements. Hence the recognition is a byproduct of a perception (the measurement) and a projection (the forming of models to be compared to the data). If you think about it, this is a reasonable fit for how we navigate the world about us in a continuous feedback loop of perception and projection. But here is the hardest part of the idea: the initial measurement depends on a prior coordinate-frame attachment which is, itself, a best-model result. The process is inherently hierarchical and (I suppose) will work best when the pattern dictionaries are nested in the same way that details are related to the whole.
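A structural sketch of that loop in Python; every function and name here is an illustrative stub I am making up (the anatomical labels included), not code from the feature detector mentioned below.

def attach_frame(data, model=None):
    # Attach a coordinate frame to the data; after a recognition, the
    # frame is derived from the recognized model instead (stub).
    return ("frame", model)

def measure_in_frame(data, frame):
    # Perception: measurements taken relative to the current frame (stub).
    return (len(data), frame)

def matching_models(dictionary, measurements):
    # The models at this level whose measurements match (stub: every
    # model at the level is treated as a match here).
    return dictionary

def object_distance(model, data):
    # Model-to-object distance in object space (stub).
    return abs(len(model) - len(data))

def recognize_hierarchically(data, nested_dictionaries):
    frame = attach_frame(data)                        # itself a best-model step
    recognized = []
    for level in nested_dictionaries:                 # whole-to-detail nesting
        measurements = measure_in_frame(data, frame)  # perception
        candidates = matching_models(level, measurements)
        best = min(candidates, key=lambda m: object_distance(m, data))
        recognized.append(best)                       # projection accepted as a recognition
        frame = attach_frame(data, best)              # the recognition supplies the next frame
    return recognized

print(recognize_hierarchically("femur scan",
                               [["whole femur", "whole tibia"],
                                ["femoral head", "femoral condyle"]]))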
Since getting my PhD, I have been fascinated not just with the mathematics of moving frames but also with some of the applications of attaching coordinate frames to data (e.g. "Anatomical Frame Standards" for medical imaging). So, to discover these ideas connected to those of another old friend, the logarithm - all in a way that is a reasonable description of a cognitive process - is quite gratifying. I'm sure it sounds loony, but I did use it successfully to solve a problem at work involving automatic feature detection for surfaces in 3D. So here is the full-on crazy:
If you read that carefully, you should be wondering "how can an object become a framework for measuring?". Put another way: "how can an object recognition become a framework for noticing details?".