Content Representation With A Twist

Thursday, October 25, 2007

text/editor auto-completion as a possible real world application for MOM

Right now, I am at my secondary workplace PC. At this one, I am used to typing one-handedly and letting the auto-completion kick in.

In a recent blog posting somewhere else, I was discussing lectures, lecturers, and discussing as a topic, and the next issue I moved on to was seminars. Intuitively, I expected the auto-completion to kick in and offer "seminars" -- which it didn't.

I pondered whether to file a feature request suggesting to use a thesaurus in the background -- a word-processing one, not necessarily a real one -- to predict the words one will most likely use soon. -- Then I noticed that traditional term ordering systems such as thesauri might have a hard time doing so, and even more so the programmers who would actually have to implement such a tool... well, at second glance, maybe brute force could help there, and since a text is a relatively small amount of data (and vocabularies even smaller), it might be doable and easy to implement.

The brute force approach could pick up and stem the words of the text, then follow all the relation edges of each term to its set of neighbours, collect them, order them alphabetically, and treat them like words that really appear in the text: offer them for auto-completion wherever it looks appropriate.
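Those steps could be sketched roughly like this -- a toy, not a real implementation: the thesaurus is just a hypothetical dict mapping a stemmed term to its related terms, and the stemmer is a crude suffix-stripper standing in for a real one.

```python
def stem(word):
    """Very crude stand-in for a real stemmer: lowercase,
    strip a few common English suffixes."""
    w = word.lower()
    for suffix in ("ing", "ed", "ers", "er", "s"):
        if w.endswith(suffix) and len(w) > len(suffix) + 2:
            return w[:-len(suffix)]
    return w

def completion_candidates(text, thesaurus):
    """Collect the thesaurus neighbours of every stemmed word in the
    text and return them alphabetically ordered, ready to be offered
    as additional auto-completion candidates."""
    candidates = set()
    for word in text.split():
        # follow the relation edges of the term to its neighbours
        for neighbour in thesaurus.get(stem(word), ()):
            candidates.add(neighbour)
    return sorted(candidates)

# toy thesaurus (hypothetical data, just to run the sketch)
thesaurus = {
    "lecture": ["lecturer", "seminar", "course"],
    "discuss": ["debate", "discussion"],
}

print(completion_candidates("We discussed lectures today", thesaurus))
# the neighbours of "discuss" and "lecture", sorted alphabetically
```

So after typing about lectures and discussing, "seminar" would already sit in the candidate set -- which is exactly the behaviour I missed above.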

On the other hand, a MOM approach might be to consider the words of the already typed-in text, take a step back, look at the features of the items behind those terms, and count which other item(s) share the most features with the ones mentioned so far. That way, we would additionally get a probability ranking of the upcoming terms. ... I'd do that myself, but the issue with tasks like this remains the old one: where to get such interrelated collections of words in a reasonable amount and for a reasonable .. no, for no cost at all?
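The counting part at least is simple to sketch, assuming one had the data. Here the feature sets are made up by hand; a real MOM item store would supply them.

```python
from collections import Counter

# hypothetical item -> feature-set data (the part that is hard to get for free)
features = {
    "lecture":  {"teaching", "university", "talk"},
    "seminar":  {"teaching", "university", "discussion"},
    "debate":   {"talk", "discussion", "politics"},
    "football": {"sport", "team"},
}

def ranked_predictions(mentioned, features):
    """Rank the not-yet-mentioned items by how many features they
    share with the union of the features of the items mentioned
    so far -- a probability-like ordering of upcoming terms."""
    seen_features = set()
    for term in mentioned:
        seen_features |= features.get(term, set())
    scores = Counter()
    for term, feats in features.items():
        if term not in mentioned:
            scores[term] = len(feats & seen_features)
    return scores.most_common()

print(ranked_predictions({"lecture"}, features))
# "seminar" outranks "debate": it shares more features with "lecture"
```

The auto-completion would then not just have extra candidates, but an order in which to offer them.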

      
Updates:
none so far
