You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 195 Next »

TableOfContents(4)

This effort is still a "work in progress". Please feel free to add comments. BRBut please make the content less visible by using smaller fonts. – Edward J. Yoon


Sparse Matrix

NOTE:

  • Sparse matrix operations cannot be optimized
  • Sparse structures which are growable can exceed the initial bandwidth allocation, while those which are not growable are fixed, and over-allocation will cause an error
  • Matrices which are column major typically perform better with column-oriented operations, and likewise for row major matrices. Matrix/vector multiplication is row-major, while transpose multiplication is column-major

Why sparse matrices?

  • Many classes of problems result in matrices with a large number of zeros
  • A sparse matrix is a special class of matrix that allows only the non-zero terms to be stored
  • Reduction in the storage requirements for sparse matrices
  • Significant speed improvement as many calculations involving zero elements are neglected

Storage of sparse matrices

We choosed HBase which column-oriented sparse table storage to reduce storage and complexity.

  • Hama use column-oriented storage of matrices (HBase) , and so compressed column format is a natural choice of sparse storage
  • Hama forces the elements of each column to be stored in increasing order of their row index
  1  0  0       (1,1) = 1           
  0  3  1       (2,2) = 3
  0  0  0       (2,3) = 1

See also: [http://labs.google.com/papers/bigtable-osdi06.pdf Bigtable], A Distributed Storage System for Structured Data

  • No labels