dc.contributor.author: Moy, Ole-Alexander
dc.date.accessioned: 2009-03-16T09:37:08Z
dc.date.issued: 2008
dc.identifier.uri: http://hdl.handle.net/11250/137025
dc.description: Master's thesis in information and communication technology, 2008, Universitetet i Agder, Grimstad [en]
dc.description.abstract: The huge amount of textual data available in digital form today increases the need for methods that make such data easier to access and navigate. Automatic extraction of keywords from text bodies is one promising approach. However, the relevance of a keyword is context dependent, and extracting relevant keywords often requires semantic analysis, simply because words may have different meanings in different contexts. It is well known that resolving such word sense ambiguity automatically can be very challenging. When the topic of interest is geographic information, important keywords are geographic terms such as countries, cities, counties and states. This thesis presents a probabilistic method for automatic identification of geographic terms within natural language text. The method uses a database of geographic terms to identify possible geographic entities. In contrast to the state of the art, we resolve semantic ambiguity by using a Bayesian classifier that takes the context of ambiguous words into account. In our empirical results, we report a geographic term identification accuracy of 90%. We therefore believe that the approach can be useful to those working in text analysis and data mining who need accurate geographic term identification. [en]
dc.format.extent: 332424 bytes
dc.format.mimetype: application/pdf
dc.language.iso: eng [en]
dc.publisher: Universitetet i Agder / Agder University [en]
dc.subject.classification: IKT590
dc.title: Identifying Geographic Terms within Natural Language Text [en]
dc.type: Master thesis [en]
dc.subject.nsi: VDP::Mathematics and natural science: 400::Information and communication science: 420::Simulation, visualization, signal processing, image processing: 429 [en]
dc.source.pagenumber: 52 [en]
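
The abstract describes a two-step idea: look up candidate terms in a database of geographic names, then use a Bayesian classifier over the surrounding words to decide whether an ambiguous candidate is really geographic. The following is only a minimal sketch of that general idea, not the thesis's implementation. The gazetteer entries, the toy training data, the window size and all function names are assumptions made for illustration, and the classifier is a plain naive Bayes with add-one smoothing.

```python
import math
from collections import Counter

# Hypothetical gazetteer: surface forms that *may* be geographic terms.
GAZETTEER = {"georgia", "washington", "jordan", "paris"}

# Tiny hand-made training set of (context words, label). Purely illustrative.
TRAINING = [
    (["flew", "to", "last", "summer"], "geo"),
    (["the", "capital", "of", "is"], "geo"),
    (["lives", "in", "near", "the", "coast"], "geo"),
    (["travelled", "through", "by", "train"], "geo"),
    (["said", "that", "he", "would"], "other"),
    (["president", "signed", "the", "bill"], "other"),
    (["scored", "points", "in", "the", "game"], "other"),
    (["mr", "told", "reporters", "yesterday"], "other"),
]


def train(examples):
    """Count labels and per-label context words for a naive Bayes model."""
    label_counts = Counter()
    word_counts = {}
    vocab = set()
    for words, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(words)
        vocab.update(words)
    return label_counts, word_counts, vocab


def classify(context, model):
    """Return the label with the highest log posterior for the context words."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in context:
            if word in vocab:  # words never seen in training are ignored
                # Add-one (Laplace) smoothed log likelihood.
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def find_geographic_terms(text, model, window=3):
    """Gazetteer lookup followed by context-based disambiguation."""
    tokens = [t.strip(".,;:!?").lower() for t in text.split()]
    hits = []
    for i, token in enumerate(tokens):
        if token in GAZETTEER:
            context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            if classify(context, model) == "geo":
                hits.append(token)
    return hits


model = train(TRAINING)
print(find_geographic_terms("We flew to Georgia last summer.", model))
print(find_geographic_terms("Jordan scored points in the game.", model))
```

With this toy data, the first sentence keeps "georgia" as a geographic term, while "jordan" in the second sentence is rejected because its context words are more likely under the non-geographic label. A realistic system would need a much larger gazetteer and training corpus than this sketch uses.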

