Characterizing linguistic structure with mutual information.

We explore mutual information (MI) as a means of characterizing linguistic statistical structure. The MI between two linguistic tokens x and y is the degree to which seeing x helps us anticipate the occurrence of y. We computed MI between words in 595 samples of written text in 25 languages. Our analyses indicate that MI dependencies do not extend beyond a range of five words. Moreover, the similarity between the MI profiles of different languages was used to cluster the languages. These results are discussed in terms of a putative link between short-term memory and linguistic structure, and of the further utility of MI in characterizing the latter.
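To make the idea of an MI profile concrete, the following is a minimal sketch, not the authors' procedure. It assumes a plug-in (maximum-likelihood) estimate of MI in bits between the word at position i and the word at position i + d, computed from raw co-occurrence counts. The function names `mi_at_distance` and `mi_profile`, the whitespace tokenization, and the toy text are all illustrative assumptions.

```python
import math
from collections import Counter


def mi_at_distance(tokens, d):
    """Plug-in estimate of MI (in bits) between words d positions apart.

    Illustrative only: uses raw relative frequencies with no smoothing or
    bias correction, so estimates on small samples are inflated.
    """
    if d < 1:
        raise ValueError("d must be a positive integer")
    pairs = list(zip(tokens, tokens[d:]))
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)                  # counts of (x, y) pairs
    left = Counter(x for x, _ in pairs)     # counts of x in first position
    right = Counter(y for _, y in pairs)    # counts of y in second position
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with counts
        mi += (c / n) * math.log2(c * n / (left[x] * right[y]))
    return mi


def mi_profile(tokens, max_d):
    """MI estimates for distances 1..max_d (a hypothetical 'MI profile')."""
    return [mi_at_distance(tokens, d) for d in range(1, max_d + 1)]


if __name__ == "__main__":
    # Toy text, assumed whitespace-tokenized; real samples would be far longer.
    text = "the cat sat on the mat and the dog sat on the rug"
    for d, mi in enumerate(mi_profile(text.split(), 5), start=1):
        print(f"distance {d}: {mi:.3f} bits")
```

On real corpora, a profile of this kind could be compared across languages, for example by a distance between the vectors returned by `mi_profile`. Which estimator and similarity measure the study actually used is not stated in this abstract.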

Identifying an appropriate way to ...
