It is a routine mistake in pattern recognition to make measurements of an input and then judge the input by the distance between its measurements and those of an ideal, comparing the measurements in a flat Euclidean parameter space ("phase space"). When you hear people talking about a "feature vector", they are headed in that direction. The correct thing to do is to look at the structures defined by the measurements and quantify the differences between those structures. [So in this case the feature vector becomes a guide in the placement of a structure template.]
For example, suppose two positive real numbers a and b are the side lengths of a rectangle, and we want to compare shapes independent of size. Then comparing the ratio a1/b1 to a2/b2 is better than using sqrt( (a1-a2)^2 + (b1-b2)^2 ).
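To make the contrast concrete, here is a minimal sketch (the rectangles and numbers are invented for illustration): two rectangles with the same shape but different sizes are "far apart" by the Euclidean formula and identical by the ratio.

    import math

    # Two rectangles with the same shape at different sizes: the Euclidean
    # distance in (a, b) space says "very different"; the aspect ratio says
    # "identical". All numbers here are made up for illustration.
    a1, b1 = 2.0, 1.0     # small rectangle, 2:1
    a2, b2 = 20.0, 10.0   # large rectangle, also 2:1

    euclidean = math.sqrt((a1 - a2)**2 + (b1 - b2)**2)   # about 20.1
    ratio_gap = abs(a1/b1 - a2/b2)                       # exactly 0.0
    print(euclidean, ratio_gap)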
But that is the wrong approach.
For example, suppose a and b describe a step with a tread of length a and a rise of length b. We can embed this step into a function space where we use the L2 metric to quantify distance, perhaps using a formula more like
(a2-a1)*(b2-b1)
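For one concrete choice of embedding (my choice, just to show the shape of the result): put the step on a fixed interval [0, L], with f(x) = 0 for x < a and f(x) = b for x >= a. The exact L2 distance then mixes the tread and rise differences, weighted by the rises themselves - nothing like the flat Euclidean formula.

    import math

    # Exact L2 distance between steps (a1, b1) and (a2, b2) embedded as
    # functions on [0, L]; assumes the particular embedding described above.
    def step_l2(a1, b1, a2, b2, L=10.0):
        if a1 > a2:                       # order the steps so that a1 <= a2
            a1, b1, a2, b2 = a2, b2, a1, b1
        # the steps disagree by b1 on [a1, a2) and by (b1 - b2) on [a2, L]
        return math.sqrt(b1**2 * (a2 - a1) + (b1 - b2)**2 * (L - a2))

    print(step_l2(2.0, 1.0, 3.0, 1.5))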
But that is the wrong approach too, although it shows how different metrics make sense in different contexts.
Go back to how the input was measured. (Sticking with rectangles) imagine fitting rectangles to the data, so the data is measured as a rectangle. Suppose you wish to distinguish ideal pattern X from ideal pattern Y in this world of rectangles. X has (a1, b1) and Y has (a2, b2). Now new input data arrives to be recognized, and we do not bother to measure a and b for the data. Instead we fit a scaled version of X to the data versus fitting a scaled version of Y to the data. Which one is a better fit? That is simpler, cleaner, and I think maybe a more effective pattern recognition method than vector algebra in a Euclidean space.
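A minimal sketch of that idea (the templates, numbers, and the least-squares choice of "best fit" are mine, for illustration; the "data" here is still a pair of numbers for brevity, though in practice the fit would run against the raw measurements):

    import math

    # Fit a scaled copy of a template to the data by least squares: the best
    # scale is the classic projection s = (d . t) / (t . t).
    def fit_residual(data, template):
        s = sum(d*t for d, t in zip(data, template)) / sum(t*t for t in template)
        return math.sqrt(sum((d - s*t)**2 for d, t in zip(data, template)))

    X = (2.0, 1.0)     # ideal pattern X: a 2:1 rectangle
    Y = (1.0, 1.0)     # ideal pattern Y: a square
    data = (4.1, 1.9)  # the rectangle fitted to the input

    best = min(("X", fit_residual(data, X)), ("Y", fit_residual(data, Y)),
               key=lambda p: p[1])
    print(best)        # X wins: the data is nearly a scaled copy of X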
A lot of the work I do like this uses if...else statements and compares the measured values to thresholds. Occasionally you get fancy and look at a ratio or difference. It might be a real relief (and I plan to try it) to find a uniform approach that incorporates all those special relations - by virtue of the structures defined by the parameters rather than algebraic relations between the parameters. It is geometry, not algebra.
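The ad hoc style looks roughly like this (the thresholds are invented); it is exactly this kind of accumulated special-casing that a single structural fit might replace:

    # The usual threshold-and-ratio patchwork, roughly
    def classify(a, b):
        if a / b > 1.8:        # getting fancy: a ratio
            return "X"
        elif b > 0.9:          # a plain threshold
            return "Y"
        return "unknown"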
Update: A reason we do not think this through is the computational burden of fitting more complex shapes to data - there are no good formulas and, when the pattern is allowed to vary over several parameters at once, the calculation can quickly overwhelm a desktop computer. So you don't think about it. But between having an elegant formula (least squares best fit for lines) and a computationally exhausting search for best fit, there is another possibility: a hierarchical search that does coarse alignment using a coarse pattern and fine alignment using sub-patterns or "details" of the coarser one, in such a way that the search space is much smaller. Then you start realizing that there is no pattern "recognition". Instead your measurement tool comes along with an alignment method - a way to hold up the ruler to the data - that requires an alignment step to precede the measurement step. Instead of recognition, the alignment "template" fits or it doesn't. The measurements that start from that template match can either be made or they cannot. A toy sketch of the coarse-to-fine idea follows.
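Everything in this sketch is invented for illustration: a one-dimensional signal, a template to align, and a two-level search that scans downsampled versions of both on a coarse grid, then refines with the full-detail template only near the coarse winner. The alignment precedes (and licenses) the measurement.

    # Coarse-to-fine alignment of a template in a 1-D signal (toy example)
    def sq_err(signal, template, pos):
        return sum((signal[pos + i] - t)**2 for i, t in enumerate(template))

    def coarse_to_fine(signal, template, step=4):
        coarse_s, coarse_t = signal[::step], template[::step]
        spots = range(len(coarse_s) - len(coarse_t) + 1)
        c = min(spots, key=lambda p: sq_err(coarse_s, coarse_t, p)) * step
        lo = max(0, c - step)                     # refine only near the coarse hit
        hi = min(len(signal) - len(template), c + step)
        return min(range(lo, hi + 1), key=lambda p: sq_err(signal, template, p))

    signal = [0.0] * 40 + [1.0, 2.0, 3.0, 2.0, 1.0] + [0.0] * 40
    template = [1.0, 2.0, 3.0, 2.0, 1.0]
    print(coarse_to_fine(signal, template))   # 40: the template "fits" here

The coarse pass searches a grid of len(signal)/step positions instead of all of them; the fine pass touches only 2*step + 1 candidates. That is the sense in which the hierarchical search space is much smaller.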