Discourse Studies | 2019

A computational model for measuring discourse complexity

 
 

Abstract


Past quantitative approaches to discourse structure have been few, and most were confined to reporting the frequency of discourse relations. A quantitative approach, however, should take into account both the hierarchical and the relational layers of discourse structure. This study considers both layers and addresses how discourse relations and discourse units are related. It draws on an available corpus of discourse structure, the Rhetorical Structure Theory Discourse Treebank (RST-DT), from a new perspective. Because an RST tree can be converted into a dependency tree like those used in syntax, data extracted from the RST-DT can be used to calculate discourse distance in much the same way as syntactic dependency distance is calculated. Discourse distance can also serve as a measure of the depth of human discourse processing. Furthermore, the data derived from the RST-DT are easily converted into network data. The study finds that discourse structure has a discourse distance minimum and that each type of RST relation has its own range of discourse distance. The frequency distributions of the discourse data largely follow a power law at several levels, while a network approach reveals how discourse units are arranged spatially in regular patterns. The two methods complement each other in revealing how discourse relations and discourse units interact, and how people process and comprehend discourse dynamically. Accordingly, we propose merging the two methods to yield a computational model for assessing discourse complexity and comprehension.
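The analogy with syntactic dependency distance can be made concrete with a minimal sketch. It assumes, since the abstract does not give the exact formula, that discourse distance is the absolute difference between the linear positions of a dependent discourse unit and its head, as dependency distance is commonly defined in syntax. The edge list, relation labels, and function names are hypothetical illustrations and are not taken from the RST-DT.

```python
from collections import Counter

# Hypothetical toy data (not from the RST-DT): each triple is
# (dependent_unit, head_unit, relation), with discourse units numbered
# by their linear position in the text.
edges = [
    (2, 1, "Elaboration"),
    (3, 1, "Attribution"),
    (4, 3, "Elaboration"),
    (5, 1, "Contrast"),
]

def discourse_distances(edges):
    """Assumed definition: |position of dependent - position of head|."""
    return [abs(dep - head) for dep, head, _ in edges]

def mean_distance(edges):
    """Mean distance over all dependency edges of one text."""
    distances = discourse_distances(edges)
    return sum(distances) / len(distances)

def distance_range_by_relation(edges):
    """Smallest and largest distance observed for each relation type."""
    ranges = {}
    for dep, head, rel in edges:
        d = abs(dep - head)
        low, high = ranges.get(rel, (d, d))
        ranges[rel] = (min(low, d), max(high, d))
    return ranges

def degree_counts(edges):
    """Treat the same edges as an undirected network and count node degrees."""
    degree = Counter()
    for dep, head, _ in edges:
        degree[dep] += 1
        degree[head] += 1
    return degree

print(mean_distance(edges))
print(distance_range_by_relation(edges))
print(degree_counts(edges))
```

The same edge list feeds both views: the distance functions correspond to the dependency perspective, and `degree_counts` to the network perspective, which is one way the two methods might share a single data representation.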

Volume 21
Pages 690 - 712
DOI 10.1177/1461445619866985
Language English
Journal Discourse Studies
