How can a machine tackle this? The idea is to look at the keywords the feed postings are tagged with: how well do they match the user's interests?
To try this out myself, I took the tags from this blog's tag cloud as my interests and ran them against some of the more interesting feeds I read. The results are encouraging, but the actual match values are tiny: yes, the software detects matches, but they usually stay below 2%, often even below 1%.
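To make "match value" a bit more tangible, here is a minimal sketch. I'm assuming the value is the share of distinct tags that show up on both sides (the Jaccard index); the function name `match_value` and the sample tags are made up for illustration and are not the software I actually ran.

```python
def match_value(interests, posting_tags):
    """Share of distinct tags found on both sides (Jaccard index).

    Assumption: this is just one plausible way to score a match,
    not necessarily the formula behind the numbers quoted above.
    """
    # Normalize so that "RSS" and "rss " count as the same tag.
    a = {t.strip().lower() for t in interests}
    b = {t.strip().lower() for t in posting_tags}
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


# Made-up sample data, for illustration only.
interests = ["rss", "semantic web", "tagging", "search", "usability"]
posting_tags = ["Gadgets", "Search", "phones", "RSS", "cameras"]

score = match_value(interests, posting_tags)
print(f"{score:.1%}")
```

With a real tag cloud of a few hundred tags on one side and a handful of tags per posting on the other, the union in the denominator gets large quickly, which would be one explanation for values in the low single digits.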
One issue here might be synonymy: my interests and the feeds' topics may overlap more than the numbers suggest, but the two of us may speak different languages. Engadget might simply use different words for the same things. So the
Determining synonyms from a single given word will likely also bring up synonyms that belong to a different meaning of that word. Besides "canine", "trestle" is also a synonym for "dog", and the software would certainly come up with it.
When I looked into my old university books on this issue, they all implicitly presumed that a human would be looking up the synonym. But here it would be a machine, which cannot detect the shift in meaning intuitively and so cannot skip nonsensical synonyms.
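The problem is easy to reproduce with a naive expansion. The sketch below assumes a tiny hand-written thesaurus (`SYNONYMS` is hypothetical, as is the helper `expand`); any real thesaurus would be far larger, but it would mix the senses in the same way.

```python
# Hypothetical, hand-written thesaurus for illustration only.
# The animal sense and the trestle sense of "dog" end up in one flat set.
SYNONYMS = {
    "dog": {"canine", "hound", "trestle"},
}


def expand(tags, thesaurus=SYNONYMS):
    """Add every known synonym of every tag, regardless of meaning."""
    expanded = set()
    for tag in tags:
        word = tag.lower()
        expanded.add(word)
        # No sense check at all: whatever the thesaurus lists is taken.
        expanded |= thesaurus.get(word, set())
    return expanded


expanded = expand(["Dog"])
print(sorted(expanded))  # ['canine', 'dog', 'hound', 'trestle']
```

A human reader would drop "trestle" without thinking; the code has no reason to.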
Looking further, I found some postings about Google starting to apply synonyms to searches. So there is indeed some kind of algorithm around that determines synonyms based on a small set of given keywords. The remaining question: has that algorithm been published, and what does it look like?
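I don't know what Google does, so the following is only my own guess at how a small set of keywords could help: pick the sense whose cue words overlap most with the other keywords, and add no synonyms at all when nothing overlaps. The sense inventory `SENSES`, its cue words, and the function `synonyms_in_context` are all invented for this sketch.

```python
# Hypothetical sense inventory: each sense carries its own synonyms
# plus a few cue words that tend to appear alongside that meaning.
SENSES = {
    "dog": [
        {"synonyms": {"canine", "hound"},
         "cues": {"pet", "puppy", "animal", "leash"}},
        {"synonyms": {"trestle"},
         "cues": {"workshop", "wood", "tool", "saw"}},
    ],
}


def synonyms_in_context(word, other_keywords, senses=SENSES):
    """Return the synonyms of the sense that best fits the other keywords.

    Assumption: a sense "fits" when its cue words overlap with the
    context. With no overlap at all, return nothing rather than guess.
    """
    context = {k.lower() for k in other_keywords}
    best, best_overlap = set(), 0
    for sense in senses.get(word.lower(), []):
        overlap = len(sense["cues"] & context)
        if overlap > best_overlap:
            best, best_overlap = sense["synonyms"], overlap
    return set(best)


print(sorted(synonyms_in_context("dog", ["pet", "leash"])))
print(sorted(synonyms_in_context("dog", ["saw", "wood"])))
print(sorted(synonyms_in_context("dog", ["rss"])))
```

The catch is obvious: somebody has to supply the cue words, which just moves the question to where those come from.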