MASC Word Sense Comparisons
How do the word senses listed in different lexical resources compare? How well can we align them?
In this study, we address these questions empirically by examining corpus data annotated with two sense inventories, those of WordNet and FrameNet. We compare how individual occurrences of a word are annotated under each inventory.
We also propose a new evaluation measure, the Expected Jaccard measure, to quantify how well the annotations align.
ICSI researchers and their collaborators have developed a way to statistically compare two of the most widely used lexical resources for English, FrameNet and WordNet. With the Expected Jaccard Index, researchers can for the first time empirically quantify agreement between WordNet and FrameNet expert annotations of words and phrases pulled from real-world texts. Comparing the annotations helps researchers find problematic gaps in the data and align the resources so they can be used together.
The work is part of an effort to understand how lexical resources differ through annotation of the Manually Annotated SubCorpus (MASC), a subset of the American National Corpus, using FrameNet, WordNet, and other lexical resources. The FrameNet Project, established by Professor Charles Fillmore, housed at ICSI, and led by Collin Baker, defines words through the semantic roles they play and the frames (types of event, relation, or entity) they evoke. WordNet, on the other hand, clusters partially synonymous words into “synsets”; researchers then describe the relationships between the synonym sets. Unlike a frame, which may include different parts of speech (such as verbs and nouns) and words with contradictory definitions (such as antonyms related to the same idea), a single synonym set comprises only synonyms of the same part of speech.
WordNet’s information about a word based on its synonyms complements FrameNet’s syntactic information about the role it plays in a sentence; the resources, however, do not always align nicely and sometimes define a different set of senses of the same word. The word curious, for example, has three senses in FrameNet and only two in WordNet: unlike WordNet, FrameNet distinguishes curiosity as a character trait from curiosity as a mental state.
At the International Conference on Language Resources and Evaluation (LREC) in May, researchers will present a new statistical measure that shows where WordNet and FrameNet agree well on the meanings of words and phrases, and where they do not. One team at Vassar and Columbia Universities used WordNet to annotate MASC sentences in which a particular word appeared, and another team at ICSI annotated the same sentences with frames and lexical units. Each instance of the word was assigned to a cell in a contingency table, with its WordNet sense on one axis and the frame it evokes on the other. For example, of the 67 annotated sentences that included the word curious, 48 both evoked the typicality frame (FN1) and used curious in the WordNet sense of “beyond or deviating from the usual or expected” (WN1). As the researchers expected, curious never simultaneously evoked the typicality frame and the WordNet sense “eager to investigate.” See the end of this post for a more complicated example.
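The contingency table described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the labelled pairs are toy data invented for the example, not the counts from the study.

```python
from collections import Counter

def contingency_table(pairs):
    """Count how often each (WordNet sense, FrameNet frame) pair co-occurs.

    `pairs` holds one (wordnet_sense, frame) tuple per annotated sentence.
    """
    return Counter(pairs)

def marginals(table):
    """Return per-sense (row) and per-frame (column) totals."""
    rows, cols = Counter(), Counter()
    for (wn_sense, frame), count in table.items():
        rows[wn_sense] += count
        cols[frame] += count
    return rows, cols

# Toy annotations (hypothetical labels and counts, NOT the study's data).
toy_pairs = (
    [("WN1", "FN1")] * 3    # three sentences: sense WN1, frame FN1
    + [("WN2", "FN2")] * 2  # two sentences: sense WN2, frame FN2
    + [("WN2", "FN3")]      # one sentence: sense WN2, frame FN3
)

table = contingency_table(toy_pairs)
rows, cols = marginals(table)

for (wn_sense, frame), count in sorted(table.items()):
    print(wn_sense, frame, count)
```

A cell that never occurs, such as ("WN1", "FN2") here, simply has a count of zero, which mirrors the observation that some sense and frame combinations are never annotated together.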
If WordNet and FrameNet were perfectly aligned, all of the sentences that evoke one frame would use one WordNet sense, and vice versa. However, that is not always the case. The researchers devised a statistical measure based on the standard Jaccard similarity coefficient to measure how closely aligned the resources are. The measure, the Expected Jaccard Index, will produce a high number when a word’s WordNet senses align well with its FrameNet frames – that is, when most of the sentences assigned to one frame have the same WordNet sense in its contingency matrix. The index will produce a low number when sentences with one WordNet sense evoke several frames, or sentences evoking one frame use several WordNet senses.
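To make the idea concrete, here is a minimal sketch of one plausible reading of such a measure. It is an assumption, not necessarily the exact definition in the paper: pick an annotated sentence at random, take the set of sentences sharing its WordNet sense and the set sharing its frame, compute the standard Jaccard coefficient (size of the intersection divided by size of the union) of those two sets, and average over all sentences. The function name and the toy tables are hypothetical.

```python
from collections import Counter

def expected_jaccard(table):
    """Average Jaccard overlap between a sentence's sense set and frame set.

    `table` maps (wordnet_sense, frame) -> number of annotated sentences.
    NOTE: this formula is an illustrative assumption, not a confirmed
    reproduction of the published Expected Jaccard Index.
    """
    total = sum(table.values())
    if total == 0:
        raise ValueError("contingency table is empty")

    rows, cols = Counter(), Counter()
    for (sense, frame), count in table.items():
        rows[sense] += count
        cols[frame] += count

    score = 0.0
    for (sense, frame), count in table.items():
        if count == 0:
            continue
        # |A ∩ B| = count; |A ∪ B| = row total + column total - count
        union = rows[sense] + cols[frame] - count
        score += (count / total) * (count / union)
    return score

# Toy tables (hypothetical counts, not taken from the study).
aligned = {("WN1", "FN1"): 4, ("WN2", "FN2"): 6}
split = {
    ("WN1", "FN1"): 2, ("WN1", "FN2"): 2,
    ("WN2", "FN1"): 3, ("WN2", "FN2"): 3,
}

print(expected_jaccard(aligned))
print(expected_jaccard(split))
```

Under this reading, a table in which every frame maps to exactly one sense scores 1, while a table whose senses are spread across several frames scores noticeably lower, matching the behavior described above.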
Some results of the statistical measurement were surprising. Researchers expected to find that the more meanings a word has, the less WordNet and FrameNet would agree on its senses. This was not always true: one of the most closely aligned words, board, has six relevant senses in both FrameNet and WordNet, while trace, one of the least aligned words, has two WordNet and five FrameNet meanings.
The work has also helped researchers identify frames and senses that need adjustment. While aligning the word curious, for which only one lexical unit existed at the time (“unorthodox or unexpected,” evoking the typicality frame), FrameNet researchers found that they needed to add two lexical units (for the permanent characteristic of being driven to learn and for the temporary state of being inquisitive). Unexpected results – such as sentences with unpredicted combinations of frames and WordNet senses – help researchers identify aspects of their resources that may be confusing or unhelpful.
The Expected Jaccard Index can also be applied to lexical resources other than FrameNet and WordNet, as well as to different versions of the same lexical resource.
Credits: The above text and the first image were created with help from ICSI's Public Relations staff.
For more information, please refer to the following article:
Empirical Comparisons of MASC Word Sense Annotations
Gerard de Melo, Collin F. Baker, Nancy Ide, Rebecca Passonneau, Christiane Fellbaum (2012)
In: Proc. LREC 2012. ELRA.