Peptide classification using optimal and information theoretic syntactic modeling

Aygün, Ezra; Oommen, B. John; Cataltepe, Z

dc.contributor.author	Aygün, Ezra
dc.contributor.author	Oommen, B. John
dc.contributor.author	Cataltepe, Z
dc.date.accessioned	2011-01-18T13:36:17Z
dc.date.available	2011-01-18T13:36:17Z
dc.date.issued	2010
dc.identifier.citation	Aygün, E., Oommen, B. J., & Cataltepe, Z. (2010). Peptide classification using optimal and information theoretic syntactic modeling. Pattern Recognition, 43(11), 3891-3899. doi: 10.1016/j.patcog.2010.05.022	en_US
dc.identifier.issn	0031-3203
dc.identifier.uri	http://hdl.handle.net/11250/137852
dc.description	Accepted version of an article published in the journal: Pattern Recognition. Published version available on Sciverse: http://dx.doi.org/10.1016/j.patcog.2010.05.022	en_US
dc.description.abstract	We consider the problem of classifying peptides using the information residing in their syntactic representations. This problem, which has been studied for more than a decade, has typically been investigated using distance-based metrics that involve the edit operations required in the peptide comparisons. In this paper, we demonstrate that the Optimal and Information Theoretic (OIT) model of Oommen and Kashyap [22], applicable to syntactic pattern recognition, can be used to tackle the peptide classification problem. We advocate that one can model the differences between compared strings as a mutation model consisting of random substitutions, insertions and deletions obeying the OIT model. Thus, we show that the probability measure obtained from the OIT model can be perceived as a sequence similarity metric, with which a support vector machine (SVM)-based peptide classifier can be devised. The classifier we have built has been tested for eight different substitution matrices and for two different data sets, namely, the HIV-1 Protease cleavage sites and the T-cell epitopes. The results show that the OIT model performs significantly better than the one which uses a Needleman-Wunsch sequence alignment score, that it is less sensitive to the substitution matrix than the other methods compared, and that, when combined with an SVM, it is among the best peptide classification methods available.	en_US
dc.language.iso	eng	en_US
dc.publisher	Elsevier	en_US
dc.title	Peptide classification using optimal and information theoretic syntactic modeling	en_US
dc.type	Journal article	en_US
dc.type	Peer reviewed	en_US
dc.subject.nsi	VDP::Mathematics and natural science: 400::Information and communication science: 420::Algorithms and computability theory: 422	en_US
dc.subject.nsi	VDP::Medical disciplines: 700::Basic medical, dental and veterinary science disciplines: 710::Medical molecular biology: 711	en_US
dc.source.pagenumber	3891-3899	en_US

Associated file(s)

Filename:: Oommen_2010_Peptide.pdf
Size:: 265.7Kb
Format:: PDF


This item appears in the following collection(s)

Scientific Publications in Information and Communication Technology [710]
