Privacy-preserving document similarity detection
Master thesis
Permanent link
http://hdl.handle.net/11250/137519
Publication date
2011
Abstract
Document similarity detection is an important technique used in many applications. A tool that
guarantees the privacy of documents during comparison would expand the range of settings in which
this technique can be applied. The goal of this project is to develop a method for
privacy-preserving document similarity detection capable of identifying either semantically or
syntactically similar documents. As a result, two methods were designed, implemented, and
evaluated. The first method applies a privacy-preserving data comparison protocol for secure
comparison; this original protocol was created as part of this thesis. The second method uses a
modified private-matching scheme. Both methods use natural language processing techniques to
capture the semantic relations between documents. During testing, the first method was found to
be too slow for practical application. The second method, in contrast, was fairly fast and
effective. It can serve as the basis for a tool that detects syntactic and semantic similarity
in a privacy-preserving way.
Description
Master's thesis in Information and Communication Technology, IKT590, 2011 – Universitetet i Agder, Grimstad