George Mason University
AES/CCS/SCS/Statistics Colloquium Series
Seminar Announcement


Spectral Decomposition Approaches to Analyzing Text Data

Elizabeth Leeds

Naval Surface Warfare Center, Dahlgren Division

Location: Johnson Center: Meeting Room D
Time: 10:30 a.m. Refreshments, 10:45 a.m. Colloquium Talk
Date: November 12, 2004



ABSTRACT

We investigate several approaches to analyzing text data based on the spectral decomposition of matrices created from the documents. The documents are represented as vectors using a "bag-of-words" representation and a word weighting motivated by mutual information. Using this vector representation, traditional spectral decomposition approaches such as principal components analysis are used to visualize the data. In addition, graphs are constructed from the data, and the spectral decomposition of the graph Laplacian is used to visualize the data. We show that the graph approach provides a visualization of the data with the most cluster structure.
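
For readers unfamiliar with the pipeline described in the abstract, the following is a minimal illustrative sketch in Python with NumPy. It is not the speaker's implementation. The toy corpus, the function names, the positive pointwise mutual information weighting, the cosine similarity measure, and the symmetric nearest-neighbor graph are all assumptions chosen for illustration; the abstract does not specify the actual weighting or graph construction used in the talk.

```python
import numpy as np

# Hypothetical toy corpus with two obvious topics (not data from the talk).
docs = [
    "ship radar sonar navy",
    "navy ship fleet radar",
    "sonar fleet navy ship",
    "matrix eigenvalue graph spectrum",
    "graph laplacian eigenvalue matrix",
    "spectrum matrix laplacian graph",
]

def bag_of_words(docs):
    """Return a (documents x vocabulary) matrix of raw word counts."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: j for j, w in enumerate(vocab)}
    counts = np.zeros((len(docs), len(vocab)))
    for i, d in enumerate(docs):
        for w in d.lower().split():
            counts[i, index[w]] += 1
    return counts, vocab

def pmi_weight(counts):
    """Assumed weighting: positive pointwise mutual information
    between each document and each word."""
    p_dw = counts / counts.sum()
    p_d = p_dw.sum(axis=1, keepdims=True)
    p_w = p_dw.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore"):
        pmi = np.log(p_dw / (p_d * p_w))
    # Zero counts get zero weight; negative PMI is clipped to zero.
    return np.where(counts > 0, np.maximum(pmi, 0.0), 0.0)

def pca_embed(X, k=2):
    """Project documents onto the first k principal components via SVD."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :k] * s[:k]

def laplacian_embed(X, n_neighbors=2, k=2):
    """Build a symmetric nearest-neighbor graph on cosine similarity and
    return the eigenvalues of L = D - W together with the k eigenvectors
    that follow the first (constant) one."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.where(norms > 0, norms, 1.0)
    S = Xn @ Xn.T
    np.fill_diagonal(S, -np.inf)  # a document is not its own neighbor
    nearest = np.argsort(-S, axis=1)[:, :n_neighbors]
    W = np.zeros_like(S)
    for i in range(X.shape[0]):
        W[i, nearest[i]] = 1.0
    W = np.maximum(W, W.T)  # symmetrize the adjacency matrix
    L = np.diag(W.sum(axis=1)) - W
    vals, vecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    return vals, vecs[:, 1:k + 1]

counts, vocab = bag_of_words(docs)
weighted = pmi_weight(counts)
pca_xy = pca_embed(weighted)                 # 2-D coordinates from PCA
lap_vals, lap_xy = laplacian_embed(weighted)  # 2-D coordinates from the graph
```

Either set of two-dimensional coordinates can be passed to a scatter plot to compare how clearly the document groups separate. In this sketch the number of near-zero Laplacian eigenvalues equals the number of connected components of the graph, which is one way cluster structure shows up in the spectrum.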