A query about SVD



Hello everybody,

I have a question about SVD. Suppose I compute the SVD of a large
corpus of documents from a single category, giving the decomposition
U S V^T. By "large corpus from a single category", I mean web pages
downloaded from the same category.

Specifically, I build a term-by-document matrix for a certain number
of web pages (rows are terms, columns are documents, and each entry is
the frequency of that term in the corresponding document), and then
apply SVD to that matrix in order to calculate the similarity between
the documents.

My question is this: if the term-by-document matrix is built from
pages taken from the same category, i.e. pages that are already
similar, will the singular values on the diagonal of S be very close
to each other? In other words, will the numerical difference between
one singular value and the next be very small?
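
One way to check this empirically is to compute the singular values of
a small term-by-document matrix and look at the gaps between
consecutive values. The following is only a minimal sketch in Python
with NumPy, not part of any existing setup: the helper name
singular_value_gaps and the toy term counts are made up for
illustration, and the real matrix would come from the downloaded pages.

```python
import numpy as np


def singular_value_gaps(term_doc):
    """Return the singular values of a term-by-document matrix and the
    differences between each singular value and the next one.

    Hypothetical helper written for this illustration only.
    """
    a = np.asarray(term_doc, dtype=float)
    # compute_uv=False returns just the singular values, sorted from
    # largest to smallest.
    s = np.linalg.svd(a, compute_uv=False)
    gaps = s[:-1] - s[1:]
    return s, gaps


# Toy data (assumed, not from a real corpus): 5 terms x 4 documents,
# where the documents use the terms in almost the same proportions.
similar_docs = np.array([
    [4, 5, 4, 5],
    [3, 3, 4, 3],
    [2, 2, 2, 3],
    [0, 1, 0, 0],
    [1, 0, 1, 1],
])

s, gaps = singular_value_gaps(similar_docs)
print("singular values:", np.round(s, 3))
print("consecutive gaps:", np.round(gaps, 3))
```

When the columns are nearly identical the matrix is close to rank one,
so on toy data like this the first singular value is much larger than
the others rather than all of them being close together. A real corpus
may behave differently, so it is worth running the same check on the
actual matrix.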

I would be thankful for any suggestions.
