If all N slots of a narrative are used, then U=N and (U/N)=1. Similarly, if every word is read between the first and last indices, then (R/F)=1. Thus I propose the formula:
GOF = (U/N)*(R/F)
This is between 0 and 1, and it is equal to 1 if and only if every word is read and every narrative slot is filled. (There are minor adjustments to F, for dull words and known control words. Also, the high-level vaulting permits multiple occurrences of the narrative to be counted if they occur repeatedly.)

But that GOF score is not linear and does not transfer up from the sub-narratives to the whole. So when we come to need a goodness-of-fit score during recursion, the linear/superimpose-able aspects need to be used. But how? It only matters when reading the two-part narratives: sequence(a,b) and cause(a,b). What I do is try splitting the text into textA followed by textB. Let U_A be the number of slots used when reading textA with the narrative 'a', and let U_B be the number of slots used when reading textB with the narrative 'b'. Now we seek to maximize
g = U_A * U_B
over all possible ways of dividing the text into two consecutive pieces. It is tricky because the return value from reading this text will be U_A+U_B (using plus, for linearity!), evaluated at the split where g was maximized. The formula for g favors dividing the text into equal-size pieces, but the sum does not.
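Here is a minimal Python sketch of the two formulas above, under stated assumptions. It is not the actual reader: `keyword_reader` is a hypothetical toy that treats each keyword as one slot, and the names `gof`, `best_split`, `read_a`, and `read_b` are made up for illustration. The zero-denominator handling in `gof` and the tie-breaking rule in `best_split` are also assumptions.

```python
from typing import Callable, Sequence, Tuple

# A "reader" is a hypothetical stand-in: it takes a list of words and
# returns how many slots of its narrative were filled.
Reader = Callable[[Sequence[str]], int]


def gof(u: int, n: int, r: int, f: int) -> float:
    """GOF = (U/N) * (R/F); returns 0.0 when N or F is zero (an assumption)."""
    if n <= 0 or f <= 0:
        return 0.0
    return (u / n) * (r / f)


def best_split(words: Sequence[str], read_a: Reader, read_b: Reader) -> Tuple[int, int]:
    """Try every cut into textA = words[:k] and textB = words[k:].

    Picks the cut that maximizes g = U_A * U_B, then returns
    (k, U_A + U_B) at that cut. Ties keep the earliest cut (arbitrary).
    Returns (0, 0) when the text is too short to split.
    """
    best_k, best_g, best_sum = 0, -1, 0
    for k in range(1, len(words)):
        u_a = read_a(words[:k])
        u_b = read_b(words[k:])
        g = u_a * u_b
        if g > best_g:
            best_k, best_g, best_sum = k, g, u_a + u_b
    return best_k, best_sum


def keyword_reader(slots: Sequence[str]) -> Reader:
    """Toy reader: one slot per keyword, filled if the keyword appears."""
    def read(words: Sequence[str]) -> int:
        present = set(words)
        return sum(1 for s in slots if s in present)
    return read


words = "the rain fell and the river flooded".split()
read_a = keyword_reader(["rain", "fell"])       # narrative 'a'
read_b = keyword_reader(["river", "flooded"])   # narrative 'b'
k, total = best_split(words, read_a, read_b)
print(k, total)
```

In this toy case the first maximizing cut falls after "fell" (k = 3), where g = 2 * 2, and the value passed up the recursion is the sum, 4. The product is only used to choose the cut.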
Update: It occurs to me, after explaining that the linear and superimpose-able property is preserved in a recursion regardless of what formula you use for g, that I can see no reason not to use the full GOF formula for g as well. I'll have to think about it.