Wednesday, May 17, 2017

Combinatoric complexity of NLP versus simplicity of Narwhal

Take an absolutely canned example: someone wants to order a product named X. Here are some simple forms:

order X
I need to order X
I want to order X
we want X
please make me an X

This small variety already stresses out the combinatoric, part-of-speech-based match algorithm, which never comes to grips with the concepts involved: AGENCY {I, me, we}; MOTIVE {need, want}; dull words {to, an, please}; the ORDER {order, make}; and the undefined object X. So in Narwhal (which doesn't actually support variables like X, but let's pretend and call it 'x') you write the keyword lists, followed by
event([GIMME], x, ORDER)
This gets a score of 1.0 on every example except "we want X". Since this sentence is missing the ORDER verb, it gets no score under the current implementation. One workaround is to add a narrative attribute(GIMME,x), which does get a 1.0.
So, at the expense of putting every keyword list in the right place and thinking through the necessary narratives, the Narwhal programmer can accomplish a lot in a very few simple lines. Those lines require the programmer to actually understand the concepts being programmed as concepts, not as words.
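To make the idea concrete, here is a minimal Python sketch of concept-level matching over the five example sentences. It is not Narwhal's API: the names `score_event` and `score_attribute`, the treatment of GIMME as the union of AGENCY and MOTIVE, and the all-or-nothing 1.0/0.0 scoring are all assumptions made for illustration.

```python
import re

# Keyword lists taken from the concepts named in the post.
AGENCY = {"i", "me", "we"}
MOTIVE = {"need", "want"}
ORDER = {"order", "make"}
# Assumption: GIMME is treated here as AGENCY plus MOTIVE.
GIMME = AGENCY | MOTIVE
# Dull words are listed only for completeness; the scorers never consult them.
DULL = {"to", "an", "please"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def score_event(text, product):
    """Toy stand-in for event([GIMME], x, ORDER).

    GIMME is optional; the product and an ORDER word are both required.
    """
    tokens = tokenize(text)
    has_order = any(t in ORDER for t in tokens)
    has_product = product.lower() in tokens
    return 1.0 if has_order and has_product else 0.0


def score_attribute(text, product):
    """Toy stand-in for attribute(GIMME, x): needs a GIMME word and the product."""
    tokens = tokenize(text)
    has_gimme = any(t in GIMME for t in tokens)
    has_product = product.lower() in tokens
    return 1.0 if has_gimme and has_product else 0.0


def score(text, product):
    """Best score over the two hypothetical narratives."""
    return max(score_event(text, product), score_attribute(text, product))


if __name__ == "__main__":
    examples = [
        "order X",
        "I need to order X",
        "I want to order X",
        "we want X",
        "please make me an X",
    ]
    for sentence in examples:
        print(sentence, score_event(sentence, "X"), score(sentence, "X"))
```

In this sketch the word order and the dull words never matter; only the presence of a concept's keywords does, which is why two short narratives cover all five phrasings.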

If I were not such a lazy intellectual, I would try to make this point exactly and publish it. After spending a week playing with AIML, I find that the majority of the programming effort goes into handling variations in the words that are the least important. Quite literally, AIML is designed to spot the pattern of words around the key nouns in the input, so those same nouns can be substituted [without their meaning being important] into different patterns of words in the output. It is designed not to care about the meaning of the topic-defining words. Narwhal could not be more opposite in that regard: it is focused entirely on locating important topic words while remaining as oblivious as possible to the varying pattern of irrelevant words around them.
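For comparison, the word-pattern style can be imitated in a few lines of Python. This is only a rough analogy, not AIML itself: the rule table, the `*` wildcard handling, and the `respond` function are hypothetical, and the response template is invented for the example.

```python
import re

# Hypothetical AIML-like rules: (word pattern with a * wildcard, response template).
# Each phrasing of the same request needs its own entry.
RULES = [
    ("ORDER *", "Ordering {0}."),
    ("I NEED TO ORDER *", "Ordering {0}."),
    ("I WANT TO ORDER *", "Ordering {0}."),
    ("WE WANT *", "Ordering {0}."),
    ("PLEASE MAKE ME AN *", "Ordering {0}."),
]


def compile_rule(pattern):
    """Turn a pattern such as "WE WANT *" into an anchored, case-insensitive regex."""
    parts = [r"(.+)" if word == "*" else re.escape(word) for word in pattern.split()]
    return re.compile(r"^" + r"\s+".join(parts) + r"$", re.IGNORECASE)


COMPILED = [(compile_rule(pattern), template) for pattern, template in RULES]


def respond(text):
    """Return the filled-in template of the first matching rule, or None."""
    for regex, template in COMPILED:
        match = regex.match(text.strip())
        if match:
            # The captured words are copied into the output without being interpreted.
            return template.format(*match.groups())
    return None
```

Here `respond("I want to order X")` matches the third rule, while a new phrasing such as "we would like X" matches nothing until another pattern is added. The rule count grows with the variations in the surrounding words, and the captured noun is passed through without its meaning ever being examined.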
