TY - GEN
T1 - Alignment Free Dissimilarities for Nucleosome Classification
AU - Lo Bosco, Giosué
PY - 2016
Y1 - 2016
AB - Epigenetic mechanisms such as nucleosome positioning, histone modifications and DNA methylation play an important role in the regulation of cell type-specific gene activities, yet how epigenetic patterns are established and maintained remains poorly understood. Recent studies have shown that DNA sequences play a role in the recruitment of epigenetic regulators. For this reason, more suitable similarity or dissimilarity measures between DNA sequences could help in epigenetic studies. In particular, alignment-free dissimilarities have already been successfully applied to identify distinct sequence features that are associated with epigenetic patterns and to predict epigenomic profiles. In this work, we focus on the problem of nucleosome classification, providing a benchmark study of 6 alignment-free dissimilarity measures between sequences, belonging to the geometric-based, correlation-based, information-based and compression-based categories. They are compared against an alignment-based dissimilarity by measuring the performance of several nearest neighbour classifiers, each incorporating one of the considered dissimilarities. Results computed on three datasets of nucleosome-forming and nucleosome-inhibiting sequences show that, among the alignment-free dissimilarities, the geometric and correlation-based ones are the most suitable for nucleosome classification, making them a more efficient alternative to alignment-based similarity measures, which nevertheless remain the preferred choice when dealing with sequence similarity measurements.
KW - Alignment free DNA sequence dissimilarities
KW - Epigenetic
KW - kNN classifier
KW - L-tuples
KW - Nucleosome classification
KW - k-mers
UR - http://hdl.handle.net/10447/181678
UR - http://link.springer.com/chapter/10.1007/978-3-319-44332-4_9
M3 - Conference contribution
SN - 9783319443317
T3 - Lecture Notes in Computer Science
SP - 114
EP - 128
BT - Computational Intelligence Methods for Bioinformatics and Biostatistics
ER -