One of the oldest questions that led to MOM was an issue a former information science teacher of mine presented in one of his lectures: He sketched the case of someone who had a problem with their heater and had to find the replacement part using a thesaurus. More precisely, the task was that the person should look up the matching word in the thesaurus in order to mail-order the part.
Simply put, a thesaurus is a vocabulary whose aim is to order that vocabulary. It helps experts find the right words, much like the 'thesaurus' embedded in any major word processor does for lay people. A thesaurus relates the items it deals with to each other to bring them into a hierarchy: A mouse is a mammal, and a mammal is a vertebrate. The cat is a mammal too, so the cat node is placed next to the mouse node. A tiger is a cat, as are the lion and the panther, so they are placed as children of the cat.
The core item a thesaurus deals with is a trinity of item, word for that item, and mental image of the item. This trinity is called a notion. The notion is an abstraction of an item and thus in every case immaterial. With respect to materiality, the notion focuses on the imaginary part but never loses sight of the material item or the word. The thesaurus orders its words by looking at both the words and the real items.
In a first step, words synonymous with each other are collected into a set. The most familiar of those synonyms is picked and declared the descriptor of that set. Whenever the descriptor is referred to, the whole set is implicitly meant -- or, thought along with it.
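As a minimal sketch of this first step, using made-up example data (the sets and function names here are my illustration, not part of any real thesaurus software):

```python
# Each synonym set is stored under its chosen descriptor: referring to
# the descriptor implicitly refers to the whole set.
synonym_sets = {
    "car": {"car", "automobile", "motorcar"},
    "cat": {"cat", "feline", "felid"},
}

def descriptor_for(word):
    """Return the descriptor of the synonym set containing `word`, if any."""
    for descriptor, synonyms in synonym_sets.items():
        if word in synonyms:
            return descriptor
    return None
```

So looking up "automobile" would lead back to the descriptor "car", and with it the whole synonym set.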
In a second step, the thesaurus goes on to order the items, identified by their descriptors. The most widely used relationship between descriptors is probably the is a relationship, just as demonstrated above: The cat is a mammal, the mammal is a vertebrate, and so on. Another common relationship applied by thesauri is the has a relationship. It states that the vertebrate has a spine, and that a human has a head, a torso, a pair of arms, and a pair of legs and feet. However, the is a relationship is used far more often than the has a one. The reason might be that the resulting network of relationships and descriptors would otherwise quickly become a rather dense graph, hard to maintain.
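The two relationship kinds can be sketched as two tiny graphs -- again with made-up data, just to illustrate the structure:

```python
# "is a" edges: each descriptor points to its broader descriptor.
is_a = {
    "tiger": "cat",
    "cat": "mammal",
    "mouse": "mammal",
    "mammal": "vertebrate",
}

# "has a" edges: each descriptor points to a list of its parts.
has_a = {
    "vertebrate": ["spine"],
    "human": ["head", "torso", "arms", "legs"],
}

def ancestors(term):
    """Walk the is-a chain upwards, e.g. tiger -> cat -> mammal -> vertebrate."""
    chain = []
    while term in is_a:
        term = is_a[term]
        chain.append(term)
    return chain
```

Note that the is-a graph stays a neat tree, while has-a edges fan out quickly -- which hints at why thesaurus builders prefer the former.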
Apart from these two kinds of relationship, the creator of a thesaurus can set up any kind of relationship they can imagine -- for example, associative relationships that connect related notions which cannot be brought together by any other kind of relationship, such as cat and cat food. The available relationships may vary from thesaurus to thesaurus, as their developers may have chosen different kinds of relationships to use.
For the case of the heater, we assumed a thesaurus effectively consisting of is a relationships only, since that seems to be the most common setup of a thesaurus.
A thesaurus consisting of is a relationships only helps an expert quickly find the words they already know. A lay person, on the other hand, usually gets stuck in that professional jargon rather quickly, as they become unable to tell one notion from another. Thesauri traditionally don't aim at assisting lay people, so the definitions they provide for descriptors are barely more than a reminder -- as said, of something the thesaurus developers assume the user already knows. That is, if the thesaurus provides definition texts at all: During my course of studies I learned that thesauri resemble deserts of words, providing definitions about as rarely as deserts have oases.
So, the answer to the heater question is: Restricted to a thesaurus, the guy won't find the replacement part for his heater.
And the amazing part my information science teacher also pointed to was that going to the nearest heating supply shop would solve the problem within a minute -- it would suffice for the guy to describe the missing part by its looks.
That miracle stuck with me. I came to wonder why not set up a kind of thesaurus that would prefer has a over is a relationships, and asked the teacher about it. I was pointed to the practical issues: How to put that into practice? How to manage that heavily wired graph?
Well, that's a matter of coping with machines, so I went ahead, although I wasn't about to get any support from that teacher. On the other hand, I had been familiar with programming since 1988 -- so what? ... And over time, MOM evolved.
Update: While tagging all the rather old postings that were already in this blog before blogger.com offered post tagging, I noticed I had presented another variant of the issue earlier, then related to a car replacement part.
Updates: 20070624: added reference to the car repair example
Content Representation With A Twist
Friday, June 22, 2007
Tuesday, February 20, 2007
What is MOM?
MOM is the acronym for "Model of Meaning". It is founded on one basic assumption: that a notion always consists of simpler notions or, alternatively, of raw data -- like "light sensored".
Like ontologies, the MOM building of notions is represented as a graph, a heavily wired network, where the notions make up the nodes and the edges indicate which notion consists of which other notions. Unlike common ontologies, MOM avoids edges that inject knowledge unreachable to a system that gets the network as its knowledge base. For example, in thesauri there are edges like "this is an antonym of that" and "this is a broader term than that". And most of the thesaurus edges (called "relationships") are abstracting. So, for example, a dog node may be immediately attached to an animal node. But that implies: The system cannot sense why the dog may be an animal. It fully depends on the humans who maintain the network. The same goes for antonym or even more sophisticated relationships. -- As far as I know, most ontologies tend to make use of such knowledge-injecting edges.
Related to this, there is another big difference between MOM and common classifications/thesauri (also known as "controlled vocabularies"): MOM builds upon a slightly different definition of the term "notion".
Usually, notions are thought of as a triangle of term, item, and thought of the item. MOM does not focus on the item, nor does it have any interest in the terms. (Different cultures have different terms for the same items, so why care?) Thus, the MOM nodes in fact don't represent notions but just thoughts of items. That makes a huge difference: Relying on the traditional definition of notions, a classification or a thesaurus can validly define a motor vehicle to consist of, e.g., a set of wheels, an engine, and a body (amongst others), but it cannot validly define a van to consist of a motor vehicle and other parts, since a motor vehicle, simply, is not a part of a van. MOM, on the other hand, ignores the chance that a physical item might be attached to a notion. Hence, yes, part of a van may be a motor vehicle.
This insight made it possible to drop the dominant kind of relationship of traditional controlled vocabularies, the abstraction relationship -- which MOM in fact dropped. That left a single remaining kind of edge: the partial (consists-of) one. Questions?
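A single remaining edge kind keeps the data model very small. Here is a sketch of what such a consists-of graph could look like, reusing the van example from above (the data and names are mine, purely illustrative):

```python
# MOM's single edge kind: each notion maps to the notions it consists of.
consists_of = {
    "van": ["motor vehicle", "cargo space"],
    "motor vehicle": ["wheels", "engine", "body"],
}

def all_parts(notion):
    """Recursively collect everything a notion consists of, directly or indirectly."""
    parts = []
    for part in consists_of.get(notion, []):
        parts.append(part)
        parts.extend(all_parts(part))
    return parts
```

With this, a van transitively consists of a motor vehicle, wheels, an engine, a body, and a cargo space -- no abstraction edges needed anywhere.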
Updates: none so far
Saturday, May 22, 2004
[Merged from (the now removed) ia: organizing notions:] A link that could be relevant to this: Automatic Selection of Class Labels from a Thesaurus for an Effective Semantic Tagging of Corpora
[locally referred by: ./. ]
Labels: labelling, links to sources, notion, thesaurus, worth a read
Monday, May 17, 2004
[Merged from (the now removed) ia: organizing notions:] My studies were in information science, but I never officially had contact with information architecture. Today I found a post on an information architecture blog that starts like a discussion I had a few years ago with a friend of mine.
In 2001, I started thinking about a general principle for describing concepts/ideas/notions. (In German we have the single term "Begriff" for all three.) I'm aware that this is a hard venture, but possibly I am on the right track.
Since the author of that blog notes that facets are (or were in 2002) a hot IA topic, I think it's a good idea to launch my own blog on this.
Hopefully, I'll get in contact with some professionals who are interested in this field as well.
[locally referred by: ./. ]