Content Representation With A Twist

Showing posts with label has a relationship. Show all posts

Sunday, June 24, 2007

Weaknesses of Traditional Notion Organizing Systems

Weaknesses of traditional notion organizing systems:
  • Traditional notion organizing systems aim to give experts --
    1. people who already know the professional vocabulary and the items that vocabulary refers to
    2. people who are trained in using notion organizing systems
    -- a tool for finding matching terms.
  • The concept of a notion that traditional notion organizing systems use implies the real item too. Hence, in practice, notion organizing systems organize not only notions but also the items the notions refer to. That might be a reason for the omnipresent preference for is a relationships over has a ones:
  • Traditional notion organizing systems prefer is a relationships over has a ones: A car is a motorized vehicle, but a motorized vehicle is clearly not a part of a car. Therefore, establishing a has a relationship from car to motorized vehicle would be wrong.
So, because traditional notion organizing systems presuppose that the notion implies the material item too, they require at least two different kinds of relationships to organize a professional vocabulary: the is a and has a relationships already mentioned.

In my opinion, they are wrong to imply the item along with the notion instead of dealing only with the idea of the item and the vocabulary for the item: The idea, i.e. the immaterial variant, of a motorized vehicle is of course a part of, or at least included in, the idea of a car.


      
Updates:
none so far

Friday, June 22, 2007

An Issue about Getting a Replacement for a Heater Part by Using a Thesaurus

One of the oldest questions that led to MOM was an issue a former information science teacher of mine presented in one of his lectures: He sketched the case of someone who had a problem with their heater and had to find the replacement part by using a thesaurus. The actual task was for the guy to look up the matching word in the thesaurus so he could order the part by mail.
 

Simply put, a thesaurus is a vocabulary built with the aim of ordering that vocabulary. It helps experts find the right words, much as the 'thesaurus' embedded in any major word processor does for lay people. The thesaurus relates the items it deals with to each other to bring them into a hierarchy: A mouse is a mammal, and a mammal is a vertebrate. The cat is a mammal too, so the cat node is placed next to the mouse one. A tiger is a cat too, as are the lion and the panther, so they are placed as children of the cat.

The core item the thesaurus deals with is a trinity of the item, the word for that item, and the thought item, i.e. the imagination of the item. It's called a notion. The notion is an abstraction of an item and is thus always immaterial. With regard to the state of matter, the notion focuses on the imaginary part but never loses sight of the material part or the word. The thesaurus orders its words by looking at both the words and the real items.

In a first step, words synonymous with each other are collected into a set. The most familiar of those synonyms is picked and declared to be the descriptor of that set. If the descriptor is referred to, the whole set is implicitly meant -- or thought of along with it.
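
The first step can be illustrated with a minimal sketch. The synonym sets below are made up for illustration, and treating the first word of each set as the descriptor is an assumption of this sketch, not a rule any particular thesaurus prescribes.

```python
# Hypothetical synonym sets; by assumption, the first word of each
# list plays the role of the descriptor.
SYNONYM_SETS = [
    ["car", "automobile", "motorcar"],
    ["heater", "heating appliance"],
]


def build_descriptor_index(synonym_sets):
    """Map every synonym (including the descriptor itself) to its descriptor."""
    index = {}
    for synonyms in synonym_sets:
        descriptor = synonyms[0]
        for word in synonyms:
            index[word] = descriptor
    return index


DESCRIPTOR_OF = build_descriptor_index(SYNONYM_SETS)


def descriptor_for(word):
    """Return the descriptor for a word, or None if the word is unknown."""
    return DESCRIPTOR_OF.get(word.lower())
```

Looking up any member of a set leads to the same descriptor, which is what lets the descriptor stand in for the whole set.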

In a second step, the thesaurus goes on to order the items, identified by their descriptors. The most widely used relationship between descriptors might be the is a relationship, as demonstrated above: The cat is a mammal, the mammal is a vertebrate, and so on. Another common relationship applied by thesauri is the has a relationship. It states that the vertebrate has a spine, and that a human has a head, a torso, a pair of arms, and a pair of legs and feet. However, the is a relationship is used much more than the has a one. The reason might be that a network of has a relationships and descriptors would quickly become a rather dense graph that is hard to maintain.
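
The two kinds of relationship can be sketched as plain dictionaries, using only the examples given above. The function names are hypothetical, and letting a descriptor inherit the has a parts of its broader terms is an assumption of this sketch, not something the text claims thesauri do.

```python
# is a links: each descriptor points to its broader term.
IS_A = {
    "mouse": "mammal",
    "cat": "mammal",
    "tiger": "cat",
    "lion": "cat",
    "panther": "cat",
    "mammal": "vertebrate",
}

# has a links: each descriptor points to a list of its parts.
HAS_A = {
    "vertebrate": ["spine"],
    "human": ["head", "torso", "pair of arms", "pair of legs and feet"],
}


def ancestors(descriptor):
    """Follow is a links upwards and return the chain of broader terms."""
    chain = []
    while descriptor in IS_A:
        descriptor = IS_A[descriptor]
        chain.append(descriptor)
    return chain


def parts(descriptor):
    """Collect has a parts of a descriptor and (by assumption) of its broader terms."""
    collected = []
    for term in [descriptor] + ancestors(descriptor):
        collected.extend(HAS_A.get(term, []))
    return collected
```

Here each descriptor has at most one is a link, so the hierarchy stays a tree, while has a links fan out into lists. That hints at why a has a network grows dense faster.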

Aside from these two kinds of relationship, the creator of a thesaurus can set up any kind of relationship they can imagine, such as associative relationships, which relate notions that cannot be brought together by any other kind of relationship, for example cat and cat food. The available relationships may vary from thesaurus to thesaurus, as their developers might have chosen different kinds of relationships to use.
 

For the case of the heater, we assumed a thesaurus effectively consisting of is a relationships only, since that seems to be the most common setup of a thesaurus.

A thesaurus consisting of is a relationships only helps an expert quickly find the words they already know. A lay person, on the other hand, usually gets stuck in that professional jargon rather quickly, as they become unable to tell one notion from another. Thesauri traditionally don't aim at assisting lay people, so the definitions they provide for descriptors are barely more than a reminder -- as said, of something the thesaurus developers assume the user already knows. That is, if the thesaurus provides a definition text at all. During my studies I learned that thesauri resemble deserts of words, providing definitions as rarely as deserts have oases.

So, the answer to the heater question is: Restricted to a thesaurus, the guy won't find the replacement part for his heater.

And the amazing part, which my information science teacher pointed out too, was that going to the nearest heating supplies shop would solve the problem within a minute -- it would suffice for the guy to describe the missing part by its look.
 

That miracle stuck with me. I came to wonder why one shouldn't set up a kind of thesaurus that prefers has a over is a relationships, and asked the teacher about that. I was pointed to the issues of how to put that into practice: How to manage that heavily wired graph?
 

Well, that's a matter of coping with machines, so I went ahead, although I wasn't going to get any support from that teacher. On the other hand, I had been familiar with programming since 1988 -- so what? ... And over time, MOM evolved.
 
 
Update: While tagging all the rather old postings that were already in this blog when blogger.com didn't offer post tagging yet, I noticed I had presented another variant of the issue earlier, then related to a car replacement part.

      
Updates:
20070624: added reference to the car repair example

Wednesday, February 21, 2007

About the Simple Set Core

The Simple Set Core project is about a set engine. The aim of this set engine is to recognize ("identify") items -- sets of features -- by only some of their features, to store these items and their features recursively as directed graphs, and to reorganize these graphs so that implicit items/features become visible and the graph as a whole becomes less dense and thus easier to handle.
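
To make the idea of recognizing an item by only some of its features concrete, here is a minimal sketch. It is not the Simple Set Core's actual algorithm: the items, their features and the function names are all hypothetical, and plain subset and intersection tests stand in for whatever the engine really does.

```python
# Hypothetical items, each described as a set of features.
ITEMS = {
    "car": {"wheels", "engine", "steering wheel", "seats"},
    "bicycle": {"wheels", "pedals", "handlebar", "saddle"},
    "motorcycle": {"wheels", "engine", "handlebar", "saddle"},
}


def recognize(observed, items=ITEMS):
    """Return the sorted names of all items that have every observed feature."""
    observed = set(observed)
    return sorted(
        name for name, features in items.items() if observed <= features
    )


def shared_features(names, items=ITEMS):
    """Return the features common to all named items.

    A non-empty result can be read as a candidate implicit item that the
    named items have in common (an interpretation assumed for this sketch).
    """
    feature_sets = [items[name] for name in names]
    return set.intersection(*feature_sets)
```

With these sample items, two observed features are enough to single out the motorcycle, while the features shared by car and motorcycle hint at an implicit item that could be stored once instead of twice.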

Background

The set engine is part of a larger project, the Model of Meaning. Its approach is that notions ("meanings") consist of smaller notions (or raw data like "light sensed"). Unlike common approaches, the Model of Meaning drops the familiar "is a" relationship between things: The assumption of the model is that, although a car is a kind of vehicle, a vehicle is a part of the car. -- At first glance this is hard to comprehend, but isn't it so that you are just thinking of the car and the vehicle? Then both are not physical, and thus there is no physical problem of "cramming" a merely imagined vehicle into a car, which is also just imagination.

Benefit For The Web

With the Model of Meaning in mind, the set engine alone can do the web a big service: If tags were related to other tags by "is part of" relationships, then, first, the people tagging could stop mentioning implications, and second, the people searching for content could get the matching content even when looking for low-level implications of that very content.
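
A rough sketch of how such tag relationships could serve a search follows. The tag links, the posts and the function names are invented for illustration; the only idea taken from the text is that a tag implies the tags that are part of it, so authors need to assign only the most specific tag.

```python
from collections import deque

# Hypothetical "is part of" links: each tag maps to the tags that are
# part of it, i.e. the tags it implies in the Model of Meaning sense.
IMPLIES = {
    "car": ["motorized vehicle"],
    "motorized vehicle": ["vehicle", "engine"],
}

# Hypothetical posts; authors assign only the most specific tag.
POSTS = {
    "Fixing my car's clutch": ["car"],
    "Sailing trip": ["boat"],
}


def implied_tags(tag, implies=IMPLIES):
    """Return the tag plus every tag reachable through implication links."""
    seen = {tag}
    queue = deque([tag])
    while queue:
        current = queue.popleft()
        for implied in implies.get(current, []):
            if implied not in seen:
                seen.add(implied)
                queue.append(implied)
    return seen


def search(query, posts=POSTS):
    """Find posts whose tags, directly or by implication, include the query."""
    return sorted(
        title
        for title, tags in posts.items()
        if any(query in implied_tags(tag) for tag in tags)
    )
```

With these sample data, a search for "engine" finds the car post even though nobody tagged it with "engine".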

Perspective

Also, if there is a way to recognize items by just some of their features, tag graphs could be integrated with each other automatically, simply because the items mentioned in the graphs could be recognized by their features as well; and by dropping the tags and replacing them with notion identifiers ("IDs") that the tags become attached to, maybe that could even overcome the language barrier, once and for all.

      
Updates:
none so far