Hierarchies of Self-Organizing Maps for action recognition

Haris Dindo, Miriam Buonamente, Magnus Johnsson

Research output: Article

8 Citations (Scopus)

Abstract

We propose a hierarchical neural architecture able to recognise observed human actions. Each layer in the architecture represents increasingly complex human activity features. The first layer consists of a SOM which performs dimensionality reduction and clustering of the feature space. It represents the dynamics of the stream of posture frames in action sequences as activity trajectories over time. The second layer in the hierarchy consists of another SOM which clusters the activity trajectories of the first-layer SOM and learns to represent action prototypes. The third and last layer of the hierarchy consists of a neural network that learns to label the action prototypes of the second-layer SOM and is independent, to a certain extent, of the camera's angle and relative distance to the actor. Experiments were carried out, with encouraging results, on action movies taken from the INRIA 4D repository. In terms of representational accuracy, measured as the recognition rate over the training set, the architecture exhibits 100% accuracy, indicating that actions with overlapping patterns of activity can be correctly discriminated. On the other hand, the architecture exhibits a 53% recognition rate when presented with the same actions interpreted and performed by a different actor. Experiments on actions captured from different viewpoints revealed the robustness of our system to camera rotation: recognition accuracy was comparable to the single-viewpoint case. To further assess the performance of the system, we also devised a behavioural experiment in which humans were asked to recognise the same set of actions, captured from different points of view. Results from this behavioural study let us argue that our architecture is a good candidate as a cognitive model of human action recognition, as its results are comparable to those observed in humans.
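To make the layered processing described in the abstract concrete, below is a minimal, hypothetical Python sketch (not the authors' implementation): a first-layer SOM maps posture frames to trajectories of best-matching-unit (BMU) coordinates, a second-layer SOM clusters fixed-length resamplings of those trajectories into action prototypes, and a one-layer softmax classifier stands in for the labelling network of the third layer. Grid sizes, learning schedules, the trajectory resampling step, and the synthetic data are all illustrative assumptions.

# Sketch of the three-layer hierarchy described in the abstract; all details
# below (sizes, learning rates, resampling) are assumptions of this sketch.
import numpy as np

class SOM:
    """A plain online self-organizing map on a rectangular grid."""
    def __init__(self, rows, cols, dim, seed=0):
        rng = np.random.default_rng(seed)
        self.grid = np.array([(r, c) for r in range(rows) for c in range(cols)], float)
        self.w = rng.normal(size=(rows * cols, dim))

    def bmu(self, x):
        # Index of the best-matching unit for a single input vector.
        return int(np.argmin(np.linalg.norm(self.w - x, axis=1)))

    def fit(self, X, epochs=20, lr0=0.5, sigma0=3.0):
        for e in range(epochs):
            lr = lr0 * (1 - e / epochs)
            sigma = max(sigma0 * (1 - e / epochs), 0.5)
            for x in X:
                b = self.bmu(x)
                d2 = np.sum((self.grid - self.grid[b]) ** 2, axis=1)
                h = np.exp(-d2 / (2 * sigma ** 2))[:, None]   # neighbourhood kernel
                self.w += lr * h * (x - self.w)
        return self

    def trajectory(self, frames):
        """Map a sequence of posture frames to a trajectory of BMU grid coordinates."""
        return np.array([self.grid[self.bmu(f)] for f in frames])

def resample(traj, length=20):
    """Resample a variable-length trajectory to a fixed-length vector so it can
    be fed to the second-layer SOM (an assumption of this sketch)."""
    idx = np.linspace(0, len(traj) - 1, length)
    xs = np.interp(idx, np.arange(len(traj)), traj[:, 0])
    ys = np.interp(idx, np.arange(len(traj)), traj[:, 1])
    return np.concatenate([xs, ys])

# Toy usage on synthetic "posture feature" sequences (three fake action classes).
rng = np.random.default_rng(1)
actions = []
for label in range(3):
    for _ in range(10):
        T = rng.integers(15, 30)
        frames = rng.normal(loc=label, scale=0.3, size=(T, 12))
        actions.append((frames, label))

# Layer 1: posture SOM trained on all individual frames.
som1 = SOM(8, 8, dim=12).fit(np.vstack([f for f, _ in actions]))

# Layer 2: action-prototype SOM trained on fixed-length trajectory vectors.
trajs = np.array([resample(som1.trajectory(f)) for f, _ in actions])
som2 = SOM(5, 5, dim=trajs.shape[1]).fit(trajs)

# Layer 3: a one-layer softmax classifier labels the second-layer prototypes.
labels = np.array([y for _, y in actions])
bmus = np.array([som2.w[som2.bmu(t)] for t in trajs])
W = np.zeros((bmus.shape[1], 3))
for _ in range(200):
    logits = bmus @ W
    p = np.exp(logits - logits.max(1, keepdims=True))
    p /= p.sum(1, keepdims=True)
    p[np.arange(len(labels)), labels] -= 1          # softmax cross-entropy gradient
    W -= 0.1 * bmus.T @ p / len(labels)

pred = np.argmax(bmus @ W, axis=1)
print("training recognition rate:", np.mean(pred == labels))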
Original language: English
Pages (from-to): 33-41
Number of pages: 9
Journal: Cognitive Systems Research
Volume: 39
Publication status: Published - 2016

Fingerprint

Self organizing maps
Cameras
Trajectories
Experiments
Labels
Motion Pictures
Posture
Human Activities
Neural networks
Cluster Analysis

All Science Journal Classification (ASJC) codes

  • Experimental and Cognitive Psychology
  • Cognitive Neuroscience
  • Artificial Intelligence

Cite this

Hierarchies of Self-Organizing Maps for action recognition. / Dindo, Haris; Buonamente, Miriam; Johnsson, Magnus.

In: Cognitive Systems Research, Vol. 39, 2016, pp. 33-41.

Research output: Article

@article{ae6c8ef383b044d39076dc774a82c75c,
title = "Hierarchies of Self-Organizing Maps for action recognition",
abstract = "We propose a hierarchical neural architecture able to recognise observed human actions. Each layer in the architecture represents increasingly complex human activity features. The first layer consists of a SOM which performs dimensionality reduction and clustering of the feature space. It represents the dynamics of the stream of posture frames in action sequences as activity trajectories over time. The second layer in the hierarchy consists of another SOM which clusters the activity trajectories of the first-layer SOM and learns to represent action prototypes. The third and last layer of the hierarchy consists of a neural network that learns to label the action prototypes of the second-layer SOM and is independent, to a certain extent, of the camera's angle and relative distance to the actor. Experiments were carried out, with encouraging results, on action movies taken from the INRIA 4D repository. In terms of representational accuracy, measured as the recognition rate over the training set, the architecture exhibits 100{\%} accuracy, indicating that actions with overlapping patterns of activity can be correctly discriminated. On the other hand, the architecture exhibits a 53{\%} recognition rate when presented with the same actions interpreted and performed by a different actor. Experiments on actions captured from different viewpoints revealed the robustness of our system to camera rotation: recognition accuracy was comparable to the single-viewpoint case. To further assess the performance of the system, we also devised a behavioural experiment in which humans were asked to recognise the same set of actions, captured from different points of view. Results from this behavioural study let us argue that our architecture is a good candidate as a cognitive model of human action recognition, as its results are comparable to those observed in humans.",
keywords = "Action recognition, Artificial Intelligence, Cognitive Neuroscience, Experimental and Cognitive Psychology, Hierarchical models, Intention understanding, Neural network, Self-Organizing Map",
author = "Haris Dindo and Miriam Buonamente and Magnus Johnsson",
year = "2016",
language = "English",
volume = "39",
pages = "33--41",
journal = "Cognitive Systems Research",
issn = "1389-0417",
publisher = "Elsevier",

}

TY - JOUR

T1 - Hierarchies of Self-Organizing Maps for action recognition

AU - Dindo, Haris

AU - Buonamente, Miriam

AU - Johnsson, Magnus

PY - 2016

Y1 - 2016

N2 - We propose a hierarchical neural architecture able to recognise observed human actions. Each layer in the architecture represents increasingly complex human activity features. The first layer consists of a SOM which performs dimensionality reduction and clustering of the feature space. It represents the dynamics of the stream of posture frames in action sequences as activity trajectories over time. The second layer in the hierarchy consists of another SOM which clusters the activity trajectories of the first-layer SOM and learns to represent action prototypes. The third and last layer of the hierarchy consists of a neural network that learns to label the action prototypes of the second-layer SOM and is independent, to a certain extent, of the camera's angle and relative distance to the actor. Experiments were carried out, with encouraging results, on action movies taken from the INRIA 4D repository. In terms of representational accuracy, measured as the recognition rate over the training set, the architecture exhibits 100% accuracy, indicating that actions with overlapping patterns of activity can be correctly discriminated. On the other hand, the architecture exhibits a 53% recognition rate when presented with the same actions interpreted and performed by a different actor. Experiments on actions captured from different viewpoints revealed the robustness of our system to camera rotation: recognition accuracy was comparable to the single-viewpoint case. To further assess the performance of the system, we also devised a behavioural experiment in which humans were asked to recognise the same set of actions, captured from different points of view. Results from this behavioural study let us argue that our architecture is a good candidate as a cognitive model of human action recognition, as its results are comparable to those observed in humans.

AB - We propose a hierarchical neural architecture able to recognise observed human actions. Each layer in the architecture represents increasingly complex human activity features. The first layer consists of a SOM which performs dimensionality reduction and clustering of the feature space. It represents the dynamics of the stream of posture frames in action sequences as activity trajectories over time. The second layer in the hierarchy consists of another SOM which clusters the activity trajectories of the first-layer SOM and learns to represent action prototypes. The third and last layer of the hierarchy consists of a neural network that learns to label the action prototypes of the second-layer SOM and is independent, to a certain extent, of the camera's angle and relative distance to the actor. Experiments were carried out, with encouraging results, on action movies taken from the INRIA 4D repository. In terms of representational accuracy, measured as the recognition rate over the training set, the architecture exhibits 100% accuracy, indicating that actions with overlapping patterns of activity can be correctly discriminated. On the other hand, the architecture exhibits a 53% recognition rate when presented with the same actions interpreted and performed by a different actor. Experiments on actions captured from different viewpoints revealed the robustness of our system to camera rotation: recognition accuracy was comparable to the single-viewpoint case. To further assess the performance of the system, we also devised a behavioural experiment in which humans were asked to recognise the same set of actions, captured from different points of view. Results from this behavioural study let us argue that our architecture is a good candidate as a cognitive model of human action recognition, as its results are comparable to those observed in humans.

KW - Action recognition

KW - Artificial Intelligence

KW - Cognitive Neuroscience

KW - Experimental and Cognitive Psychology

KW - Hierarchical models

KW - Intention understanding

KW - Neural network

KW - Self-Organizing Map

UR - http://hdl.handle.net/10447/216545

UR - http://www.elsevier.com/wps/find/journaldescription.cws_home/620288/description#description

M3 - Article

VL - 39

SP - 33

EP - 41

JO - Cognitive Systems Research

JF - Cognitive Systems Research

SN - 1389-0417

ER -