On utilizing dependence-based information to enhance micro-aggregation for secure statistical databases

Oommen, B. John; Fayyoumi, Ebaa

dc.contributor.author	Oommen, B. John
dc.contributor.author	Fayyoumi, Ebaa
dc.date.accessioned	2013-05-03T07:18:46Z
dc.date.available	2013-05-03T07:18:46Z
dc.date.issued	2013
dc.identifier.citation	Oommen, B. J., & Fayyoumi, E. (2013). On utilizing dependence-based information to enhance micro-aggregation for secure statistical databases. Pattern Analysis and Applications, 16(1), 99-116. doi: 10.1007/s10044-011-0199-9	no_NO
dc.identifier.issn	1433-7541
dc.identifier.uri	http://hdl.handle.net/11250/138000
dc.description	Published version of an article in the journal: Pattern Analysis and Applications. Also available from the publisher at: http://dx.doi.org/10.1007/s10044-011-0199-9	no_NO
dc.description.abstract	We consider the micro-aggregation problem, which involves partitioning a set of individual records in a micro-data file into a number of mutually exclusive and exhaustive groups. This problem, which seeks the best partition of the micro-data file, is known to be NP-hard and has been tackled using many heuristic solutions. In this paper, we demonstrate that, when developing micro-aggregation techniques (MATs), it is expedient to incorporate information about the dependence between the random variables in the micro-data file. This can be achieved by pre-processing the micro-data before invoking any MAT, in order to extract the useful dependence information from the joint probability distribution of the variables in the micro-data file, and then performing the micro-aggregation on the "maximally independent" variables, thus confirming the conjecture of Domingo-Ferrer et al. [The conjecture, which was recently proposed by Domingo-Ferrer et al. (IEEE Trans Knowl Data Eng 14(1):189-201, 2002), was that micro-aggregation can be enhanced by incorporating dependence-based information between the random variables of the micro-data file by working with (i.e., selecting) the maximally independent variables. Domingo-Ferrer et al. proposed selecting one variable from among the set of highly correlated variables inferred via the correlation matrix of the micro-data file. In this paper, we demonstrate that this process can be automated, and that it is advantageous to select the "most independent variables" by using methods distinct from those involving the correlation matrix.] Our results, on real-life and artificial data sets, show that including such information enhances the process of determining how many variables are to be used, and which of them should be used, in the micro-aggregation process.	no_NO
dc.language.iso	eng	no_NO
dc.publisher	Springer	no_NO
dc.subject	micro-aggregation technique	no_NO
dc.subject	maximum spanning tree	no_NO
dc.subject	projected variables	no_NO
dc.title	On utilizing dependence-based information to enhance micro-aggregation for secure statistical databases	no_NO
dc.type	Journal article	no_NO
dc.type	Peer reviewed	no_NO
dc.subject.nsi	VDP::Mathematics and natural science: 400::Information and communication science: 420::Knowledge based systems: 425	no_NO
dc.subject.nsi	VDP::Technology: 500::Information and communication technology: 550	no_NO
dc.source.pagenumber	99-116	no_NO
dc.source.volume	16	no_NO
dc.source.journal	Pattern Analysis and Applications	no_NO
dc.source.issue	1	no_NO
dc.identifier.doi	10.1007/s10044-011-0199-9

Associated file(s)

Filename: Oommen_2013_On.pdf
Size: 1.464 MB
Format: PDF

This item appears in the following collection(s)

Scientific Publications in Information and Communication Technology [710]