DSpace Repository

Computing a coarse-grained linguistic database using wordnet

dc.contributor.advisor Prof. Vilas Wuwongse (Chairperson) en_US
dc.contributor.advisor Dr. Vatcharaporn Esichaikul (Member) en_US
dc.contributor.advisor Dr. Paul Janecek (Member) en_US
dc.contributor.author Kanjana Jiamjitvanich en_US
dc.date.accessioned 2015-01-12T10:46:14Z
dc.date.available 2015-01-12T10:46:14Z
dc.date.issued 2009 en_US
dc.identifier.other AIT Thesis no.IM-09-02 en_US
dc.identifier.uri http://www.cs.ait.ac.th/xmlui/handle/123456789/629
dc.description 60 p. en_US
dc.description.abstract WordNet is an electronic lexical database for English which stores lemmas and exceptional forms of words, word senses and sense glosses, semantic relations between word senses (e.g. hypernym, holonym), syntactic relations between words (e.g. antonym), and other information related to the structure and use of the language. In WordNet, word senses are represented as synsets, which are sets of words with synonymous meaning in a particular context. WordNet synsets and holonym/hypernym relations are used in many natural language processing tasks, such as word sense disambiguation, in SWeb, the semantic web application project of the University of Trento. A known problem of WordNet is that it is too fine-grained in its sense definitions, whereas ordinary users discriminate among fewer word senses. Moreover, many applications which use WordNet data would benefit if the distinction among word senses were made at a more coarse-grained level and if some very rarely used senses were dropped from the database altogether. In SWeb, the WordNet data is stored in a relational database handled by a component called Controlled Vocabulary (CV). The goal of this thesis is to define the appropriate level of granularity of word senses given the requirements defined in the SWeb project, and to develop an algorithm (Coarsealgo) which computes a coarse-grained version of WordNet to improve the current background knowledge used in SWeb. Coarsealgo achieves the highest score on the performance measure for every part of speech, and it performs best at finding polysemy and grouping similar senses correctly. en_US
dc.description.sponsorship RTG Fellowship en_US
dc.language.iso en en_US
dc.publisher Asian Institute of Technology en_US
dc.relation.ispartofseries AIT Publications; en_US
dc.subject WordNet en_US
dc.subject Coarse-grained Database en_US
dc.subject Controlled Vocabulary en_US
dc.title Computing a coarse-grained linguistic database using wordnet en_US
dc.type Thesis en_US
