Machines in the conversation: detecting themes and trends in informal communication streams.
- Article from: IBM Systems Journal
- Article date: October 1, 2006
- Authors: Spangler, W. Scott; Kreulen, Jeffrey T.; Newswanger, James F.
Copyright information: COPYRIGHT 2006 All Rights Reserved. This material is published under license from the publisher through the Gale Group, Farmington Hills, Michigan. All inquiries regarding rights should be directed to the Gale Group.
Data-mining techniques that detect trends and patterns in structured data are often ill-suited for analysis of unstructured text. Information critical to business--and generated by groups such as employees, customers, and the public--appears in such forms as chats, electronic discussion forums, and blogs. This paper describes techniques developed to detect themes and trends in such informal communication streams. Our approach begins with unsupervised text clustering to create initial categories. A human analyst then refines the categories into easily understandable themes. To facilitate this process, we developed an interactive approach to text category creation and validation that aids ...
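As a rough illustration of the first step described above (unsupervised clustering that produces initial categories for an analyst to refine), the following minimal sketch groups short messages by word overlap. The abstract does not say which clustering algorithm the authors use, so the k-means-style loop, the bag-of-words representation, and the cosine similarity measure are all assumptions here, and the names `tokenize`, `cosine`, and `cluster` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a message and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b.get(term, 0) for term, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(messages, seeds, iterations=10):
    """Assign each message to its most similar centroid, k-means style.

    `seeds` lists the indices of the messages used as initial centroids
    (an assumed, deterministic way to start; not taken from the paper).
    """
    vectors = [Counter(tokenize(m)) for m in messages]
    centroids = [Counter(vectors[i]) for i in seeds]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: pick the centroid with the highest cosine similarity.
        labels = [
            max(range(len(centroids)), key=lambda c: cosine(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid becomes the summed term counts of its
        # members (cosine similarity ignores scale, so no averaging is needed).
        for c in range(len(centroids)):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centroids[c] = sum(members, Counter())
    return labels


# Made-up example messages, not data from the paper.
messages = [
    "printer driver crash after update",
    "refund for late delivery order",
    "driver update broke the printer",
    "order delivery was late, want refund",
]
labels = cluster(messages, seeds=[0, 1])
```

In this sketch the two printer messages end up in one cluster and the two refund messages in the other. The resulting groups are only a starting point: in the approach the abstract describes, a human analyst would then rename, merge, or split them into understandable themes.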