We present a system for automatically classifying the sentiment orientation expressed in book reviews written in Italian. The system is founded on a lexicon-based approach and uses NLP techniques to take into account the linguistic relations between terms in the analyzed texts. The classification of a review is based on the average sentiment strength of its sentences, while the classification of each sentence is obtained through a parsing process that inspects, for each term, a window of previous items to detect particular combinations of elements that invert or vary polarity. The score of a single word depends on all its associated meanings, also considering semantically related concepts such as synonyms and hypernyms. Concepts associated with words are extracted from a suitable stratification of linguistic resources, which we adopt to address the lack of an opinion lexicon specifically tailored to the Italian language. The system has been prototyped in Python and tested on a dataset of reviews crawled from Amazon.it, the Italian Amazon website. Experiments show that the proposed system is able to automatically classify both positive and negative reviews, with an average accuracy above 82%.
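The scoring scheme summarized in the abstract (word scores adjusted by a window of previous items, then averaged over sentences) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the toy lexicon, the negator and intensifier lists, the window size of 2, the names `score_sentence` and `classify_review`, and the `neutral` label for a zero average. It also omits the synonym and hypernym expansion and the stratified linguistic resources the abstract describes.

```python
from statistics import mean

# Toy resources: illustrative assumptions only, not the paper's
# stratified Italian linguistic resources.
LEXICON = {"bello": 1.0, "ottimo": 1.0, "noioso": -1.0, "brutto": -1.0}
NEGATORS = {"non"}                          # invert polarity
INTENSIFIERS = {"molto": 1.5, "poco": 0.5}  # vary polarity strength
WINDOW = 2  # hypothetical number of previous tokens to inspect


def score_sentence(tokens):
    """Sum word scores, adjusting each by modifiers in the preceding window."""
    total = 0.0
    for i, tok in enumerate(tokens):
        score = LEXICON.get(tok)
        if score is None:
            continue  # word carries no sentiment in the toy lexicon
        # Inspect up to WINDOW tokens before the current term.
        for prev in tokens[max(0, i - WINDOW):i]:
            if prev in NEGATORS:
                score = -score
            elif prev in INTENSIFIERS:
                score *= INTENSIFIERS[prev]
        total += score
    return total


def classify_review(text):
    """Label a review by the average score of its sentences."""
    # Naive sentence and token splitting, assumed here for brevity.
    sentences = [s.split() for s in text.lower().split(".") if s.strip()]
    if not sentences:
        return "neutral", 0.0
    avg = mean(score_sentence(s) for s in sentences)
    if avg > 0:
        label = "positive"
    elif avg < 0:
        label = "negative"
    else:
        label = "neutral"
    return label, avg


print(classify_review("Il libro è molto bello. Il finale non è noioso."))
```

In this sketch, "molto bello" is strengthened by the intensifier and "non è noioso" is flipped by the negator, so both sentences contribute positive scores to the review average.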
|Title of host publication|Proceedings of the 12th International Conference on Web Information Systems and Technologies (WEBIST 2016)|
|Number of pages|12|
|Publication status|Published - 2016|
All Science Journal Classification (ASJC) codes
- Computer Networks and Communications
- Information Systems