Data mining K-clustering problem

Karoussi, Elham

dc.contributor.author	Karoussi, Elham
dc.date.accessioned	2012-10-03T11:40:16Z
dc.date.available	2012-10-03T11:40:16Z
dc.date.issued	2012
dc.identifier.uri	http://hdl.handle.net/11250/137565
dc.description	Master's thesis in Information and Communication Technology, IKT590, 2012 – Universitetet i Agder, Grimstad	no_NO
dc.description.abstract	In statistics and data mining, k-means clustering is well known for its efficiency in clustering large data sets. The aim is to group data points into clusters such that similar items fall in the same cluster. In general, given a set of objects together with their attributes, the goal is to divide the objects into k clusters such that objects in the same cluster are as close as possible to each other (homogeneity) and objects in different clusters are far apart from each other. However, the classical k-means clustering algorithm has some flaws. First, the algorithm is sensitive to the choice of initial centroids and can easily be trapped at a local minimum of the measure used in the model (the sum of squared errors). Second, the k-means problem of finding a global minimum of the sum of squared errors is NP-hard even when the number of clusters is 2 or the number of attributes per data point is 2, so finding the optimal clustering is believed to be computationally intractable. In this dissertation, to address the k-means clustering problem, we design variant types of k-means in a multilevel context, in which we consider how to derive an optimization model that minimizes the sum of squared errors for a given data set. We introduce the variant type of k-means algorithm to guarantee that the clustering result is more accurate than that obtained with the basic k-means algorithm. We believe this is one type of k-means clustering algorithm that combines theoretical guarantees with positive experimental results.	no_NO
dc.language.iso	eng	no_NO
dc.publisher	Universitetet i Agder / University of Agder	no_NO
dc.title	Data mining K-clustering problem	no_NO
dc.type	Master thesis	no_NO
dc.source.pagenumber	80	no_NO

Files in this item

Name: masteroppgave.pdf
Size: 1.436 MB
Format: PDF

This item appears in the following collection(s)

Master's theses in Information and Communication Technology [491]
MM500, IKT590, IKT591
