Tuesday, November 15, 2016

current GOF formula

(min(u,r)/N) * (r/F)
Where:
N = number of slots in the narrative
u = number of slots used
F = number of all words between the first and last word read
r = number of words read

Update: Drop the min(u,r) term, but DO use the total number of words, T. The latest version of the formula is:
(u/N)*(r/F)*(r/T)
This makes a single-word match score lower than a multi-word match, and it favors longer narrative patterns.
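
To see the effect of the (r/T) factor, here is a small Python sketch of the formula. The function name gof and the example numbers are mine, made up for illustration; the zero-denominator guard is an assumption, not something the real scorer necessarily does.

```python
def gof(u, N, r, F, T):
    """Score (u/N) * (r/F) * (r/T), using the symbols defined in the post.

    u = slots used, N = slots in the narrative,
    r = words read, F = words between first and last word read,
    T = total number of words.
    """
    # Assumption: treat an empty narrative or empty text as a zero score.
    if N <= 0 or F <= 0 or T <= 0:
        return 0.0
    return (u / N) * (r / F) * (r / T)


# Hypothetical 10-word text, both patterns fully filled and contiguous.
single = gof(u=1, N=1, r=1, F=1, T=10)  # one-slot pattern, one word read
multi = gof(u=3, N=3, r=3, F=3, T=10)   # three-slot pattern, three words read

print(single, multi)
```

Both matches are "perfect" on (u/N) and (r/F), so only (r/T) separates them: the three-word match covers more of the text and scores higher.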

I should mention that dull words are discounted in F and in T. For no particular reason, they are handled differently in the two ratios: they are added to the numerator of (r/F) and subtracted from the denominator of (r/T). Either way, the scores I am getting now look better (e.g. "1.0").
Update: I changed it again. The numerator of (r/T) is now corrected by replacing r with r + ur, where ur is the count of dull and control words in the full segment of text.
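
A rough Python sketch of how I read the dull-word correction. The names gof_adjusted and dull_in_span are hypothetical, and the sketch assumes F and T are raw word counts and that, after the last update, the denominator of (r/T) is left as T rather than reduced; the actual code may differ on both points.

```python
def gof_adjusted(u, N, r, F, T, dull_in_span, ur):
    """(u/N) * ((r + dull_in_span)/F) * ((r + ur)/T).

    dull_in_span = dull words between the first and last word read
                   (added to the numerator of r/F)
    ur           = dull and control words in the full segment
                   (added to the numerator of r/T)
    Assumption: F and T are raw counts that still include those words.
    """
    if N <= 0 or F <= 0 or T <= 0:
        return 0.0
    return (u / N) * ((r + dull_in_span) / F) * ((r + ur) / T)


# Hypothetical segment: 10 words, 4 of them dull, all 6 others read,
# and all 3 slots of the narrative filled.
score = gof_adjusted(u=3, N=3, r=6, F=10, T=10, dull_in_span=4, ur=4)
print(score)
```

Under these assumptions, a match that reads every non-dull word of the segment reaches the full score, which is consistent with the "1.0" scores mentioned above.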
