Problem/issue detection and classification
Master thesis
Permanent link
http://hdl.handle.net/11250/137476
Publication date
2009
Abstract
This thesis investigates the possibility of using pattern recognition and machine
learning to detect online forum posts containing descriptions of problems
and issues. In addition, we seek to further classify these posts as either
informative or non-informative, depending on the quality of their content.
Our motivation for this research is the fact that more and more consumers
are turning to the Internet when expressing opinions, seeking help or searching
for advice about their purchases. As a producer, awareness of this online
“word-of-mouth” has become a key factor in brand and reputation control
and in product recall management. However, over the last few
years, the volume of such online buzz has grown beyond what can be
handled manually: surveying thousands of online forums by hand is no
longer possible. We therefore need systems able to automatically
monitor these web sites and detect posts of interest, such as people having
problems with a product.
In the thesis, we implement and examine three algorithms (Naive
Bayes, k-Nearest Neighbor, and Self-Organizing Maps) and determine which
parameters and features produce the highest accuracy. The prototype is
trained and validated using forum posts, but its application is not limited
to this type of document. The thesis contributes to the areas of
natural language text processing and classification.
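As a rough illustration of the kind of classifier the abstract refers to, the sketch below trains a multinomial Naive Bayes model with add-one smoothing on a handful of invented forum posts and labels them as "problem" or "other". It is not the thesis's implementation: the class name NaiveBayes, the tokenize helper, the label names, and the toy posts are all assumptions made for this example, and the thesis's actual features and parameters are not reproduced here.

```python
import math
from collections import Counter, defaultdict

PUNCTUATION = ".,!?;:()\"'"


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    words = (w.strip(PUNCTUATION) for w in text.lower().split())
    return [w for w in words if w]


class NaiveBayes:
    """Hypothetical minimal multinomial Naive Bayes with add-one smoothing."""

    def fit(self, posts, labels):
        self.class_counts = Counter(labels)       # posts per class
        self.word_counts = defaultdict(Counter)   # word frequencies per class
        for post, label in zip(posts, labels):
            self.word_counts[label].update(tokenize(post))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, post):
        total_posts = sum(self.class_counts.values())
        words = [w for w in tokenize(post) if w in self.vocab]  # skip unseen words
        best_label, best_score = None, -math.inf
        for label, n_posts in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_posts / total_posts)
            score += sum(math.log((counts[w] + 1) / denom) for w in words)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for illustration only.
posts = [
    "my phone keeps crashing after the update",
    "the screen is broken and will not turn on",
    "battery dies quickly and the charger does not work",
    "i love the new camera great pictures",
    "just bought this laptop and it looks great",
    "thanks for the recommendation everyone",
]
labels = ["problem", "problem", "problem", "other", "other", "other"]

model = NaiveBayes().fit(posts, labels)
print(model.predict("the screen keeps crashing"))
print(model.predict("great camera love it"))
```

The same fit/predict interface could be applied a second time to split detected problem posts into informative and non-informative ones, given labeled examples for that task.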
Description
Master's thesis in information and communication technology, 2009, University of Agder, Grimstad