Application of SNP reduction approaches and random forest for the identification of population informative markers in cosmopolitan and local cattle breeds

Research output: Contribution to conference › Other › peer-review


In livestock, single nucleotide polymorphism (SNP) genotyping arrays have been used to differentiate breeds and populations for several downstream applications, including breed allocation of individuals, identification of the breeds of origin of crossbred animals, authentication of mono-breed products, and comparative analyses of selection signatures, among several other uses. We previously tested a combination of principal component analysis (PCA), used as a pre-selection method, and random forest (RF), used as a classification method, to assign cosmopolitan Italian breeds with no or very low error rate. In this work, we increased the number of breeds and approaches to obtain a more comprehensive view of the available strategies and their applicability to local Italian breeds. The most common cosmopolitan dairy or dual-purpose breeds (Holstein, Brown, Simmental) and three local breeds subjected to limited or no breeding programs (Reggiana, Modicana and Cinisara) were analyzed, comparing several methods of SNP pre-selection (Delta, Fst and PCA) in addition to RF classifications. From these classifications, two panels of 96 and 48 SNPs containing the most discriminant SNPs were created for each pre-selection method. The results showed that the 96-SNP panels generally discriminated all breeds better, while for the 48-SNP panels the error rate increased, mainly for autochthonous breeds and particularly for Cinisara. This was probably a consequence of limited selection pressure, admixed origin, and ascertainment bias in the construction of the SNP chip, which was not designed with these breeds in mind. Several selected SNPs are located near genes affecting breed-specific traits (e.g. coat color and stature) or associated with production traits. The 96-SNP panel obtained after a chromosome-by-chromosome pre-selection, and used in the previous work with cosmopolitan breeds only, could identify informative SNPs that were particularly useful for the assignment of minor breeds. This panel reached the lowest out-of-bag (OOB) error in the RF test even for Cinisara, whose error was quite high in all other panels. Moreover, this panel also contained the lowest number of SNPs in linkage disequilibrium. Our results showed the usefulness and power of combining PCA pre-selection and RF for the discrimination of local cattle breeds as well.
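As a rough illustration of the Delta and Fst pre-selection ideas mentioned above, the following minimal Python sketch ranks SNPs by their average pairwise score between breeds and keeps the top k. The abstract does not give the exact formulas used in the study, so the definitions here are assumptions: Delta is taken as the absolute allele-frequency difference, and Fst as the simple two-population form (Ht - Hs) / Ht with equal weights. The function names, the 0/1/2 genotype coding, and the toy data are hypothetical, and the RF classification step is not shown.

```python
from itertools import combinations


def allele_freq(genotypes):
    """Frequency of the counted allele, genotypes coded as 0/1/2 copies."""
    return sum(genotypes) / (2 * len(genotypes))


def delta(p1, p2):
    """Assumed Delta: absolute allele-frequency difference between two breeds."""
    return abs(p1 - p2)


def fst(p1, p2):
    """Assumed two-population Fst with equal weights: (Ht - Hs) / Ht."""
    p_bar = (p1 + p2) / 2
    ht = 2 * p_bar * (1 - p_bar)          # expected heterozygosity, pooled
    if ht == 0:
        return 0.0                        # monomorphic SNP carries no signal
    hs = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-breed
    return (ht - hs) / ht


def rank_snps(data, score, k):
    """Return the k SNP names with the highest mean pairwise score.

    data maps breed -> {snp_name: [genotypes]} (hypothetical layout).
    """
    breeds = sorted(data)
    snps = sorted(data[breeds[0]])
    freqs = {b: {s: allele_freq(data[b][s]) for s in snps} for b in breeds}
    pairs = list(combinations(breeds, 2))
    mean_score = {
        s: sum(score(freqs[a][s], freqs[b][s]) for a, b in pairs) / len(pairs)
        for s in snps
    }
    return sorted(snps, key=lambda s: mean_score[s], reverse=True)[:k]


# Toy data, invented purely for illustration (not from the study).
toy = {
    "BreedA": {"snp1": [0, 0, 0, 1], "snp2": [1, 1, 2, 0], "snp3": [2, 2, 2, 2]},
    "BreedB": {"snp1": [2, 2, 1, 2], "snp2": [1, 0, 1, 2], "snp3": [2, 2, 2, 2]},
}

top_by_delta = rank_snps(toy, delta, 1)
top_by_fst = rank_snps(toy, fst, 1)
```

In a workflow like the one described, a ranked list of this kind would be cut to a fixed panel size (for example 96 or 48 SNPs) before being passed to a classifier.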
Original language: English
Number of pages: 1
Publication status: Published - 2017

