Classifying Written Texts Through Rhythmic Features

Mihaela Balint, Stefan Trausan-Matu, Mihai Dascalu

Research output: Chapter in Book/Report/Conference proceeding › Conference Article in proceeding › Academic › peer-review

Abstract

Rhythm analysis of written texts focuses on literary analysis and mainly considers poetry. In this paper, we investigate the relevance of rhythmic features for categorizing prose texts from different genres. Our contribution is threefold. First, we define a set of rhythmic features for written texts. Second, we extract these features from three corpora: speeches, essays, and newspaper articles. Third, we perform feature selection by means of statistical analyses and determine a subset of features that efficiently discriminates between the three genres. We find that, using as few as eight rhythmic features, documents can be assigned to a genre with an accuracy of around 80%, significantly higher than the 33% baseline that results from random assignment.
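As a rough illustration of the pipeline the abstract outlines, the sketch below computes a few surface statistics of a prose text and the expected accuracy of random assignment. The feature names and the function `rhythm_proxy_features` are hypothetical stand-ins chosen for this example; they are not the rhythmic features defined in the paper, which this record does not list.

```python
import re
from statistics import mean, pstdev


def rhythm_proxy_features(text: str) -> dict:
    """Hypothetical surface features; NOT the feature set defined in the paper."""
    # Split into sentences after '.', '!' or '?' followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    if not sentences:
        raise ValueError("text contains no sentences")
    lengths = [len(s.split()) for s in sentences]  # words per sentence
    commas = [s.count(",") for s in sentences]     # commas per sentence
    return {
        "n_sentences": len(sentences),
        "mean_words_per_sentence": mean(lengths),
        "std_words_per_sentence": pstdev(lengths),
        "mean_commas_per_sentence": mean(commas),
    }


def random_baseline(n_classes: int) -> float:
    """Expected accuracy of uniform random assignment over n_classes."""
    return 1.0 / n_classes


sample = "We came. We saw, we fought, and we won. Did it matter?"
features = rhythm_proxy_features(sample)
baseline = random_baseline(3)  # three genres: speeches, essays, newspaper articles
```

With three genres, `random_baseline(3)` gives one third, which matches the 33% baseline quoted in the abstract. Feature vectors of this kind could then be passed to any feature-selection and classification step; the statistical analyses the authors actually used are not specified here.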
Original language: English
Title of host publication: Artificial Intelligence: Methodology, Systems, and Applications. AIMSA 2016
Editors: C. Dichev, G. Agre
Publisher: Springer
Pages: 121-129
ISBN (Electronic): 978-3-319-44748-3
ISBN (Print): 978-3-319-44747-6
Publication status: Published - 18 Aug 2016
Externally published: Yes
Event: International Conference on Artificial Intelligence: Methodology, Systems, and Applications - Varna, Bulgaria
Duration: 7 Sept 2016 to 10 Sept 2016
https://link.springer.com/book/10.1007/978-3-319-44748-3
https://www.springer.com/la/book/9783319447476

Publication series

Series: Lecture Notes in Computer Science (LNCS)
Volume: 9883
Series: Lecture Notes in Artificial Intelligence (subseries)
Volume: 9883

Conference

Conference: International Conference on Artificial Intelligence: Methodology, Systems, and Applications
Abbreviated title: AIMSA 2016
Country/Territory: Bulgaria
City: Varna
Period: 7/09/16 to 10/09/16

Keywords

  • rhythm
  • text classification
  • natural language processing
  • discourse analysis
