Motivation: Information-theoretic and compositional analysis of biological sequences, in terms of k-mer dictionaries, has a well established role in genomic and proteomic studies. Much less so in epigenomics, although the role of k-mers in chromatin organization and nucleosome positioning isparticularly relevant. Fundamental questions concerning the informational content and compositional structure of nucleosome favouring and disfavoring sequences with respect to their basic building blocks still remain open. Results: We present the first analysis on the role of k-mers in the composition of nucleosome enriched and depleted genomic regions (NER and NDR for short) that is: (i) exhaustive and within thebounds dictated by the information-theoretic content of the sample sets we use and (ii) informative for comparative epigenomics. We analize four different organisms and we propose a paradigmatic formalization of k-mer dictionaries, providing two different and complementary views of thek-mers involved in NER and NDR. The first extends well known studies in this area, its comparative nature being its major merit. The second, very novel, brings to light the rich variety of k-mers involved in influencing nucleosome positioning, for which an initial classification in terms of clustersis also provided. Although such a classification offers many insights, the following deserves to be singled-out: short poly(dA:dT) tracts are reported in the literature as fundamental for nucleosome depletion, however a global quantitative look reveals that their role is much less prominentthan one would expect based on previous studies.
|Numero di pagine||8|
|Stato di pubblicazione||Published - 2015|
All Science Journal Classification (ASJC) codes