The 'link payload' (tags assigned to a link, which users pick up when they click it, thus building a cloud of tags describing the possible interests of the individual clicking user) could be applied by a company that hosts its own site on its own server, or at least on a server it can set up the way it prefers.
In my previous example of such a link payload, I focused on a news company whose news overview consists of nothing but the article headlines, each linking to its article. That forces users who want to read an article to click the link, thus tagging themselves with the tags assigned to that link. If the company provided the full text by RSS feed, it would never learn the tag cloud the users would generate and reveal about themselves by clicking several such links.
Learning the interests of a current visitor in real time might make it possible to pick more fitting ads to present.
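To make the ad idea a bit more tangible, here is a minimal sketch in Python. It assumes the visitor's self-assigned tags are already available as counts and that each ad carries its own set of tags; the function name `pick_ad`, the ad names, and the scoring rule (sum of matching tag counts) are all my own assumptions for illustration, not how any real ad system works.

```python
from collections import Counter

def pick_ad(user_tags, ads):
    """Return the name of the ad whose tags best match the user's tag counts.

    user_tags: mapping tag -> how often the user tagged themselves with it
    ads:       mapping ad name -> set of tags describing that ad (hypothetical)
    """
    def score(ad_tags):
        # An ad scores higher the more often the user picked up its tags.
        return sum(user_tags.get(tag, 0) for tag in ad_tags)

    return max(ads, key=lambda name: score(ads[name]))

# Hypothetical visitor and ads
visitor = Counter({"solar": 3, "football": 1})
ads = {
    "solar-panels": {"solar", "energy"},
    "soccer-shoes": {"football"},
}
best = pick_ad(visitor, ads)
```

With these made-up numbers the solar ad wins, simply because the visitor clicked solar-tagged links more often.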
Aside from that immediate advantage of tag-based user tracking on a single site, what about the web? Aside from user tracking, a (tagged) headline link to a news article page reveals another particle of content, even without any chance to track the user: the link's tags tag the linked page too. If multiple links point to the same page, a cloud of tags accumulates for that page. In other words, the page gets content beyond the text written on it: the tag cloud. Since that content is not present on the page itself, I'd call it content assigned to that page, not actually there. For short, maybe something like "content [assigned] to a page" instead of the familiar "content of/on a page".
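A rough sketch of how such a tag cloud could accumulate for a page, in Python. The input format (pairs of target URL and link tags), the function name `page_tag_clouds`, and the example URLs are assumptions made up for this illustration.

```python
from collections import Counter, defaultdict

def page_tag_clouds(links):
    """Accumulate the tags of all links pointing to the same page.

    links: iterable of (target_url, tags) pairs, one per tagged link
    Returns a mapping target_url -> Counter of tags assigned to that page.
    """
    clouds = defaultdict(Counter)
    for target_url, tags in links:
        # The link's tags "rub off" onto the page the link points to.
        clouds[target_url].update(tags)
    return clouds

# Hypothetical tagged links found on several overview pages
links = [
    ("https://example.com/article-1", ["solar", "energy"]),
    ("https://example.com/article-1", ["solar", "spain"]),
    ("https://example.com/article-2", ["football"]),
]
clouds = page_tag_clouds(links)
```

Note that none of the article text is touched here: the cloud for a page is built only from the links pointing to it.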
One question is whether content can be mined from the web directly, without processing the text presented on the web; mined, that is, by determining tags for links, for users, for web pages.
One goal of processing the tag cloud assigned to a web page (or any other item, of course) might be to build a MOM net, a condensed form of content: a multi-level directed graph that stores distinct content in each of its nodes. I see that it might be helpful to go into more depth here, explaining what a MOM net actually is, what it stores, and how it does so.
I'll keep that in the back of my head for another post to come.
Updates: none so far
Content Representation With A Twist
Showing posts with label users tagging themselves. Show all posts
Thursday, June 21, 2007
Link payload to get an impression of user interests
Is it link payload? Or is it something like content, or a set of features, that link-clicking web users reveal about themselves?
Having a tool within reach that might mine directly processable content from the web, the reorganizer module of MOM, I keep wondering how to actually mine the web.
Just now, I am skimming a news web site that, on its overview page, provides only the headlines of the articles. Not the least preview, not even a snippet of text hinting at what the linked article might deal with or where it might go into depth. So a human can say: if you click on that link, you are probably interested in the topic spotlighted by the headline. Or, since I know how crudely headlines are sometimes put together, there's a chance you clicked only to get an idea of what the heck the article might deal with. There's also the chance you'd click a link accidentally, but let's skip that possibility for now.
What I noticed a minute ago, while skimming that list of headlines, was that reducing the headline's words to nouns (e.g. by stemming) might suffice to tag the links. Assuming people click only links they are interested in, then, mirrored, any link clicked reveals the topics the user is interested in: the tags peel off the link and adhere to the person who clicked it. In other words: by clicking the link, the users tag themselves. Track what the user clicks over time, and you not only get a cloud of tags you can link to a user; by actually linking them to the user and applying reorganization, it becomes simple to learn the user's interests. Add counting, not of the links, as you might do for plain web site statistics, but of the tags the users tag themselves with, and you might get a rather specific profile of the user. Cover a broad cloud of topics, thus a broad cloud of tags, and your users' profiles would become even sharper.
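A minimal sketch of users tagging themselves, in Python. It is only an illustration under loud assumptions: instead of real stemming or noun extraction, it just lowercases the headline and drops a tiny hand-made stopword list, and the names `tags_for_headline` and `UserProfile` are invented for this example. The reorganization step of MOM is not covered here.

```python
import re
from collections import Counter

# Tiny hand-made stopword list; a stand-in for proper stemming / noun filtering.
STOPWORDS = {"the", "a", "an", "in", "on", "of", "to", "and", "for", "new", "again"}

def tags_for_headline(headline):
    """Derive crude tags from a headline: lowercase words minus stopwords."""
    words = re.findall(r"[a-z]+", headline.lower())
    return [w for w in words if w not in STOPWORDS and len(w) > 2]

class UserProfile:
    """Counts the tags a user picks up by clicking tagged headline links."""

    def __init__(self):
        self.tag_counts = Counter()

    def click(self, headline):
        # The tags peel off the link and stick to the user who clicked it.
        self.tag_counts.update(tags_for_headline(headline))

    def top_interests(self, n=3):
        # Count tags, not links: the most frequent tags sketch the profile.
        return [tag for tag, _ in self.tag_counts.most_common(n)]

# Hypothetical click history of one user
profile = UserProfile()
profile.click("Solar power prices fall again")
profile.click("New solar farm opens in Spain")
```

After these two clicks, "solar" has been counted twice and every other tag once, so it surfaces as the user's strongest interest.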
And, in the back of my head, there's still Google's advertising system. If each page Google puts ads on has to be 'enriched' with a handful of tags, then by visiting that page, users tag themselves with those tags. If Google manages to assign that set of tags to you individually, Google might have quite a good impression of your interests.
Updates: none so far