The BioDICE Taverna plugin for clustering and visualization of biological data: a workflow for molecular compounds exploration

Salvatore Gaglio, Massimo La Rosa, Antonino Fiannaca, Alfonso Urso, Giuseppe Di Fatta, Riccardo Rizzo

Research output: Article

2 Citations (Scopus)

Abstract

Background: In many experimental pipelines, clustering of multidimensional biological datasets is used to detect hidden structures in unlabelled input data. Taverna is a popular workflow management system used to design and execute scientific workflows and to aid in silico experimentation. The availability of fast unsupervised methods for clustering and visualization in the Taverna platform is important for supporting data-driven scientific discovery in complex and exploratory bioinformatics applications.

Results: This work presents a Taverna plugin, the Biological Data Interactive Clustering Explorer (BioDICE), that performs clustering of high-dimensional biological data and provides a nonlinear, topology-preserving projection for the visualization of the input data and their similarities. The core algorithm in the BioDICE plugin is the Fast Learning Self Organizing Map (FLSOM), an improved variant of the Self Organizing Map (SOM) algorithm. The plugin generates an interactive 2D map that allows the visual exploration of multidimensional data and the identification of groups of similar objects. The effectiveness of the plugin is demonstrated on a case study related to chemical compounds.

Conclusions: Its extensibility and the number and variety of its available tools have made Taverna a popular choice for the development of scientific data workflows. This work presents a novel plugin, BioDICE, which adds a data-driven knowledge discovery component to Taverna. BioDICE provides an effective and powerful clustering tool that can be adopted for the exploratory analysis of biological datasets.
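The abstract names the Self Organizing Map (SOM) as the basis of the plugin's core algorithm. As a rough illustration of how a SOM projects multidimensional vectors onto a 2D grid, the following is a minimal sketch of the plain online SOM, not of FLSOM and not of the BioDICE implementation. The function names (`train_som`, `best_matching_unit`), the linear decay schedules, and all parameter defaults are assumptions made for this example only.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid coordinate whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda u: sum((wi - xi) ** 2 for wi, xi in zip(weights[u], x)),
    )


def train_som(data, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a plain online SOM (illustrative, not FLSOM).

    Returns a dict mapping (row, col) grid coordinates to weight vectors.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    # Each grid unit starts with a random weight vector in [0, 1).
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)              # learning rate decays linearly
        sigma = sigma0 * (1.0 - frac) + 0.5  # neighbourhood radius shrinks
        for x in rng.sample(data, len(data)):  # one shuffled pass over the data
            bmu = best_matching_unit(weights, x)
            for unit, w in weights.items():
                # Gaussian neighbourhood measured on the 2D grid, not in data space.
                d2 = (unit[0] - bmu[0]) ** 2 + (unit[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return weights


if __name__ == "__main__":
    # Toy input: four 3-dimensional vectors forming two loose groups.
    samples = [[0.0, 0.1, 0.0], [0.1, 0.0, 0.1], [0.9, 1.0, 0.9], [1.0, 0.9, 1.0]]
    som = train_som(samples, rows=3, cols=3)
    for s in samples:
        print(s, "->", best_matching_unit(som, s))
```

Each input vector is assigned to the grid cell of its best matching unit, so similar vectors tend to land in the same or neighbouring cells; this is the property that makes a 2D map usable for visual exploration.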
Original language: English
Pages (from-to): 1-6
Number of pages: 6
Journal: Journal of Cheminformatics
Volume: 6
Publication status: Published - 2014

Fingerprint

Self organizing maps
workflow
Visualization
organizing
Chemical compounds
Bioinformatics
Data mining
Pipelines
management systems
Topology
experimentation
Availability
preserving
learning
platforms

All Science Journal Classification (ASJC) codes

  • Computer Science Applications
  • Physical and Theoretical Chemistry
  • Computer Graphics and Computer-Aided Design
  • Library and Information Sciences

Cite this

The BioDICE Taverna plugin for clustering and visualization of biological data: a workflow for molecular compounds exploration. / Gaglio, Salvatore; La Rosa, Massimo; Fiannaca, Antonino; Urso, Alfonso; Di Fatta, Giuseppe; Rizzo, Riccardo.

In: Journal of Cheminformatics, Vol. 6, 2014, pp. 1-6.

@article{27b89d0f850c4f65be8973e883f4537a,
title = "The BioDICE Taverna plugin for clustering and visualization of biological data: a workflow for molecular compounds exploration",
author = "Salvatore Gaglio and {La Rosa}, Massimo and Antonino Fiannaca and Alfonso Urso and {Di Fatta}, Giuseppe and Riccardo Rizzo",
year = "2014",
language = "English",
volume = "6",
pages = "1--6",
journal = "Journal of Cheminformatics",
issn = "1758-2946",
publisher = "Chemistry Central",

}

TY - JOUR

T1 - The BioDICE Taverna plugin for clustering and visualization of biological data: a workflow for molecular compounds exploration

AU - Gaglio, Salvatore

AU - La Rosa, Massimo

AU - Fiannaca, Antonino

AU - Urso, Alfonso

AU - Di Fatta, Giuseppe

AU - Rizzo, Riccardo

PY - 2014

Y1 - 2014

UR - http://hdl.handle.net/10447/96964

M3 - Article

VL - 6

SP - 1

EP - 6

JO - Journal of Cheminformatics

JF - Journal of Cheminformatics

SN - 1758-2946

ER -