Content Representation With A Twist

Showing posts with label clustering. Show all posts

Friday, July 13, 2007

Positive hits in the "content representation" search results

Correct hits for the Google search on the term "content representation" (as opposed to hits that contained "content <something other than whitespace, such as punctuation> representation"): I went through the results from the end (page 79) towards the start, since I presumed there would be more false hits the nearer I got to the end of the tail. But there were few false hits there.

The above results I picked from pages 79 and 78 only -- and I already learned a lesson: it might make more sense to apply some kind of clustering here instead of walking through the list manually. Even the manual check of whether there is anything between "content" and "representation" -- to filter out false hits -- can be done by software.
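To make the "can be done by software" point concrete, here is a minimal sketch of such a check in Python. It assumes the result snippets are already available as plain strings; the name `is_true_hit` and the sample snippets are made up for illustration and do not come from any search API.

```python
import re

# A true hit has "content" and "representation" separated by whitespace only.
EXACT = re.compile(r"\bcontent\s+representation\b", re.IGNORECASE)

def is_true_hit(snippet: str) -> bool:
    """Return True if the snippet contains the phrase with only whitespace in between."""
    return EXACT.search(snippet) is not None

# Hypothetical snippets, standing in for text taken from a result list.
snippets = [
    "A twist on content representation for the web",
    "Rich content; representation of data comes later",
    "Content. Representation matters",
    "content\nrepresentation across lines",
]

true_hits = [s for s in snippets if is_true_hit(s)]
```

Snippets with punctuation between the two words are dropped, while a line break between them still counts as whitespace.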

I'd like to learn the most frequently used terms (besides "content representation"), and, with the help of that clustering/visualization, I want to get the chance to ignore obvious false hits.
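Counting the most frequent terms could look roughly like the following sketch. It is plain word counting, not clustering yet, and it assumes the snippets are available as strings; the stopword list, the function name `top_terms` and the sample snippets are assumptions made for illustration.

```python
import re
from collections import Counter

# Small, hand-picked stopword list (an assumption; a real one would be longer).
STOPWORDS = {"a", "an", "and", "the", "of", "for", "in", "on", "to", "with", "is"}
# The search phrase itself is excluded, since every true hit contains it.
PHRASE = {"content", "representation"}

def top_terms(snippets, n=10):
    """Return the n most frequent terms across all snippets as (term, count) pairs."""
    counts = Counter()
    for snippet in snippets:
        words = re.findall(r"[a-z]+", snippet.lower())
        counts.update(w for w in words if w not in STOPWORDS and w not in PHRASE)
    return counts.most_common(n)

# Hypothetical snippets for illustration.
snippets = [
    "Content representation for semantic search",
    "Semantic content representation in the browser",
    "Content representation and semantic indexing of feeds",
]

most_common = top_terms(snippets, 1)
```

Terms that show up only in a handful of odd snippets would then stand out as candidates for false hits.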

That calls for getting hands-on with the Google API.

      
Updates:
none so far

Sunday, June 17, 2007

cluster your feed news: MOM reorganizer vs RSS feed news

Applying a reorganizer offers the chance to reveal the topics that several different sources deal with, which also implies the chance to cluster RSS feeds by topic: instead of approaching the issue with traditional information science procedures, the tags of the fetched articles could be looked up (retrieve the original article, pick its tags) and run through a reorganizer.

That might make it easier to skip news items from usually valuable feeds when they cover topics completely outside one's interest.
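As a rough illustration of the tag-based idea, the sketch below groups feed items by their tags and drops items on unwanted topics. It does not implement the MOM reorganizer; a plain tag index stands in for it. It also assumes the tags have already been picked from the original articles, and the item structure and function names are made up for this example.

```python
from collections import defaultdict

def group_by_tag(items):
    """Build a mapping from lowercased tag to the titles of items carrying it."""
    groups = defaultdict(list)
    for item in items:
        for tag in item["tags"]:
            groups[tag.lower()].append(item["title"])
    return dict(groups)

def drop_unwanted(items, unwanted_tags):
    """Keep only items that carry none of the unwanted tags (case-insensitive)."""
    unwanted = {t.lower() for t in unwanted_tags}
    return [i for i in items if not unwanted & {t.lower() for t in i["tags"]}]

# Hypothetical feed items; in practice the tags would come from the articles.
items = [
    {"title": "Clustering search results", "tags": ["clustering", "search"]},
    {"title": "Weekend football scores", "tags": ["sports"]},
    {"title": "Tag-based feed filtering", "tags": ["RSS", "clustering"]},
]

by_tag = group_by_tag(items)
kept = drop_unwanted(items, ["Sports"])
```

Here `by_tag` shows which items share a topic, and `kept` is the feed with the off-topic item skipped.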

      
Updates:
none so far