2021-05-18

SparseMatrix

Sparse Matrix

There is no strict definition how many elements need to be zero for a matrix to be considered sparse but a common criterion is that the number of non-zero elements is roughly the number of rows or columns.

When storing and manipulating sparse matrices on a computer, it is beneficial and often necessary to use specialized algorithms and data structures that take advantage of the sparse structure of the matrix.(Don’t waste your precious memory on non-sense zeros!) Specialized computers have been made for sparse matrices, as they are common in the machine learning field.

Storage strategy

Formats can be divided into two groups:

Those that support efficient modification, such as DOK (Dictionary of keys), LIL (List of lists), or COO (Coordinate list). These are typically used to construct the matrices.
Those that support efficient access and matrix operations, such as CSR (Compressed Sparse Row) or CSC (Compressed Sparse Column).

快速转置

原稀疏矩阵$m\times n$，共有非0元素$t$个，则算法复杂度为$O(n+t)$

转置后行号为转置前的列号，即转置前列号在前的，转置后行号也在前
转置后的行号（转置前的列号）的存储位置为所有转置前该列前所有非0元素个数和
- 注意递归转化为迭代

根据以上信息，即可构造出转置后的压缩稀疏矩阵表

用空间$(S(n))$换时间，先从原压缩稀疏矩阵表中提取出以上信息$O(t)$,(忽略初始化时间$O(n)$)，迭代法构造出可以直接使用的下标($O(n)$),再用这些信息去构造结果$O(n)$(遍历信息矩阵空间),总共$O(n+t)$

注意在使用下标的时候不要忘记++

#include <stdio.h>
#define Type int
#define N 8
#define columnLength 6
typedef struct Item{
	int row ;
	int col ;
	Type element ;
}Item ; 

typedef struct Info{
	int number_of_elements_of_current_col ;
	int location;
}Info;

int main(){
	Item compressed  [N] = {{0,0,16} , {0,3,22} , {0,5,16} , {1,1,12}, {1,2,3} ,{2,3,-8},{4,0,91} , {6,2,15}};
	Info infomation [columnLength] = {{0,0} };
	Item transposed [N] = {{0,0,0},};

	//loop over compressed to get number of elements of current column
	for (int i = 0 ; i < N ; i++){
		infomation[compressed[i].col].number_of_elements_of_current_col++;
	}
	//loop over information to generate usable location
	for (int i = 1 ; i < columnLength ; i++){
		infomation[i].location = infomation[i-1].location + infomation[i-1].number_of_elements_of_current_col;
	}
	//loop over compressed & transposed to do the transposing
	for (int i = 0 ; i < N ; i++){
		int index = infomation[compressed[i].col].location++;
                //++ means in case of one col contains multiple elements, move the location 1 step forward. So the output location is the last element's location instead of the first one's.
                

		transposed[index].row = compressed[i].col;
		transposed[index].col = compressed[i].row;
		transposed[index].element = compressed[i].element;
	}
	//print out the result
	for (int i = 0 ; i < columnLength ; i++) printf("%d %d\n" , infomation[i].number_of_elements_of_current_col, infomation[i].location);
	for (int i = 0 ; i < N ; i++) printf("{%d,%d,%d}\n" , transposed[i].row , transposed[i].col , transposed[i].element);
}
/*
2 2
1 3
2 5
2 7
0 7
1 8
{0,0,16}
{0,4,91}
{1,1,12}
{2,1,3}
{2,6,15}
{3,0,22}
{3,2,-8}
{5,0,16}
*/