Differences between revisions 4 and 5
Revision 4 as of 2008-11-26 04:34:47
Size: 525
Editor: 219
Revision 5 as of 2009-09-20 21:45:52
Size: 525
Editor: localhost
Comment: converted to 1.6 markup
Line 2: Line 2:
[[TableOfContents(4)]] <<TableOfContents(4)>>


  1. Abstract


The word-count (document-word) matrix approach is the basis of techniques often referred to as latent semantic indexing and document clustering. Raw counts have two caveats: a word that appears frequently in all documents is not useful for clustering, and because document lengths are not uniform, a lengthy document will have higher word counts. This example gives a parallel implementation of the matrix creation; a sparse matrix decomposition technique may follow in the future.
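
As a rough illustration of the matrix-creation step, the sketch below counts words per document in parallel and then merges the counts into one document-word matrix. It is not the implementation this page refers to: the names `count_words` and `build_matrix`, the whitespace tokenizer, and the use of a standard-library thread pool as the parallel mechanism are all assumptions made for the example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(text):
    """Count lowercase, whitespace-separated tokens in one document."""
    return Counter(text.lower().split())


def build_matrix(docs, workers=2):
    """Return (vocabulary, rows); rows[i][j] counts vocabulary[j] in docs[i]."""
    # Counting is independent per document, so it can be farmed out to workers.
    # (Assumption: a thread pool stands in for the real parallel framework.)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        counts = list(pool.map(count_words, docs))
    # Merge step: a shared, sorted vocabulary fixes the column order.
    vocabulary = sorted(set().union(*counts))
    # A Counter returns 0 for words a document does not contain.
    rows = [[c[word] for word in vocabulary] for c in counts]
    return vocabulary, rows


docs = ["the cat sat", "the dog sat down", "the cat saw the dog"]
vocabulary, rows = build_matrix(docs)
for doc, row in zip(docs, rows):
    print(row, doc)
```

In this toy input the column for "the" is non-zero in every row, which is the kind of word that carries no information for clustering, and the longest document has the largest row total, which is the length effect mentioned above.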

TermByDocumentMatrix (last edited 2009-09-20 21:45:52 by localhost)