Patterns in syntactic dependency networks from authored and randomised texts
Brede, Markus
bbd03865-8e0b-4372-b9d7-cd549631f3f7
Newth, David
e4f6e8f6-b8cf-49c0-b3db-489c50143a8a
2008
Brede, Markus and Newth, David
(2008)
Patterns in syntactic dependency networks from authored and randomised texts.
Complexity International, 12 (msid23).
Abstract
Syntactic relationships between words allow a communicator to express a virtually endless array of thoughts with a finite set of elements. The co-occurrence of words in a sentence reflects the syntactic dependency between them and can be represented as a directed graph. In this account we compile the grammar dependency networks of 86 texts by 11 well-known English authors. In an analysis of the common and specific features of these networks, we try to attribute network properties to individual authors. A pointwise-defined measure reveals no significant groups that could be identified with authors. Further, a comparison with randomised versions of the same texts shows a systematic but very small difference between the networks constructed from the originals and those constructed from the randomisations. This suggests that the scale-free and small-world-like nature of these networks can be explained by an underlying regularity in the word frequency distribution, known as Zipf’s law. A stochastic model, which allows the construction of networks for arbitrary word frequency distributions, illustrates this idea.
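The idea that word frequencies alone may shape such networks can be illustrated with a minimal Python sketch. It is not the authors' stochastic model: it assumes that words are drawn independently from a Zipf-like distribution and that a directed link joins each word to the word that directly follows it, a crude stand-in for a syntactic dependency. All function names and parameters (`zipf_weights`, `sample_text`, `adjacency_network`, `out_degrees`, `shuffled`, `exponent`) are hypothetical and chosen for illustration only.

```python
import random
from collections import defaultdict


def zipf_weights(vocab_size, exponent=1.0):
    """Unnormalised Zipf weights: the word of rank r gets weight r**(-exponent)."""
    return [rank ** -exponent for rank in range(1, vocab_size + 1)]


def sample_text(vocab_size, length, exponent=1.0, seed=0):
    """Draw `length` word ids independently from a Zipf-like distribution.

    Assumption: word id 0 is the most frequent word, id 1 the next, and so on.
    """
    rng = random.Random(seed)
    weights = zipf_weights(vocab_size, exponent)
    return rng.choices(range(vocab_size), weights=weights, k=length)


def adjacency_network(words):
    """Directed graph with an edge u -> v whenever v directly follows u.

    Assumption: adjacency stands in for syntactic dependency; self-loops are skipped.
    """
    successors = defaultdict(set)
    for u, v in zip(words, words[1:]):
        if u != v:
            successors[u].add(v)
    return successors


def out_degrees(successors):
    """Number of distinct successors for every word that has at least one."""
    return {word: len(targets) for word, targets in successors.items()}


def shuffled(words, seed=1):
    """Randomised text: same word frequencies, word order destroyed."""
    rng = random.Random(seed)
    copy = list(words)
    rng.shuffle(copy)
    return copy
```

Calling `out_degrees(adjacency_network(words))` on a sampled text and on `shuffled(words)` gives two degree tables that can be compared directly. Because the sampled words are independent by construction, the two should look alike here; for a real text, any gap between them would reflect structure beyond word frequency.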
Text
ComplexityInternational.pdf
- Other
More information
Published date: 2008
Organisations:
Agents, Interactions & Complexity
Identifiers
Local EPrints ID: 272901
URI: http://eprints.soton.ac.uk/id/eprint/272901
PURE UUID: fe7d6d8f-1ac2-419b-99c9-1961d97ecca9
Catalogue record
Date deposited: 30 Sep 2011 15:12
Last modified: 14 Mar 2024 10:12
Contributors
Author:
Markus Brede
Author:
David Newth