Populate a Pandas SparseDataFrame from a SciPy Sparse Matrix

I noticed Pandas now has support for Sparse Matrices and Arrays. Currently, I create DataFrame()s like this:

return DataFrame(matrix.toarray(), columns=features, index=observations)

Is there a way to create a SparseDataFrame() with a scipy.sparse.csc_matrix() or csr_matrix()? Converting to dense format kills RAM badly. Thanks!

Answers:

Method 1

A direct conversion is not supported at the moment. Contributions are welcome!

Try this; it should be fine on memory, since a SparseSeries is much like a csc_matrix (for one column) and is pretty space-efficient.

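In [35]: import numpy as np; import pandas as pd; from scipy.sparse import csc_matrix

In [36]: row = np.array([0,2,2,0,1,2])

In [37]: col = np.array([0,0,1,2,2,2])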

In [38]: data = np.array([1,2,3,4,5,6],dtype='float64')

In [39]: m = csc_matrix( (data,(row,col)), shape=(3,3) )

In [40]: m
Out[40]: 
<3x3 sparse matrix of type '<type 'numpy.float64'>'
        with 6 stored elements in Compressed Sparse Column format>

In [46]: pd.SparseDataFrame([ pd.SparseSeries(m[i].toarray().ravel()) 
                              for i in np.arange(m.shape[0]) ])
Out[46]: 
   0  1  2
0  1  0  4
1  0  0  5
2  2  3  6

In [47]: df = pd.SparseDataFrame([ pd.SparseSeries(m[i].toarray().ravel()) 
                                   for i in np.arange(m.shape[0]) ])

In [48]: type(df)
Out[48]: pandas.sparse.frame.SparseDataFrame
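
SparseSeries is no longer available from pandas 1.0 onward, so the session above only runs on older releases. The same column-by-column idea can be sketched on newer versions with pd.arrays.SparseArray; this is an illustrative adaptation rather than the answer's original code, and the names sparse_columns and df_cols are made up for the example. Only one column is densified at a time, which suits the column-oriented csc format.

```python
import numpy as np
import pandas as pd
from scipy.sparse import csc_matrix

# Same 3x3 matrix as in the session above.
row = np.array([0, 2, 2, 0, 1, 2])
col = np.array([0, 0, 1, 2, 2, 2])
data = np.array([1, 2, 3, 4, 5, 6], dtype="float64")
m = csc_matrix((data, (row, col)), shape=(3, 3))

# Densify a single column at a time and wrap it in a SparseArray,
# so the full dense matrix is never materialised.
sparse_columns = {
    j: pd.arrays.SparseArray(m[:, j].toarray().ravel(), fill_value=0.0)
    for j in range(m.shape[1])
}

# A regular DataFrame whose columns all carry a sparse dtype.
df_cols = pd.DataFrame(sparse_columns)
print(df_cols.dtypes)
```

Each column of df_cols has a Sparse dtype, so only the non-zero values and their positions are stored.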

Method 2

As of pandas v0.20.0, the SparseDataFrame constructor accepts a SciPy sparse matrix directly.

An example from the pandas docs:

import numpy as np
import pandas as pd
from scipy.sparse import csr_matrix

arr = np.random.random(size=(1000, 5))
arr[arr < .9] = 0
sp_arr = csr_matrix(arr)
sdf = pd.SparseDataFrame(sp_arr)
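
SparseDataFrame itself was removed in pandas 1.0. On pandas 0.25 and later, the equivalent conversion goes through DataFrame.sparse.from_spmatrix, which also takes the index and columns arguments from the question. A minimal sketch, assuming placeholder labels (the feature_* and obs_* names below are invented for illustration):

```python
import numpy as np
import pandas as pd
from scipy.sparse import csr_matrix

# Build a mostly-zero matrix, as in the docs example above.
rng = np.random.default_rng(0)
arr = rng.random(size=(1000, 5))
arr[arr < 0.9] = 0
sp_arr = csr_matrix(arr)

# Hypothetical labels standing in for the question's
# `features` and `observations`.
features = [f"feature_{j}" for j in range(sp_arr.shape[1])]
observations = [f"obs_{i}" for i in range(sp_arr.shape[0])]

# Convert without going through a dense array.
sdf = pd.DataFrame.sparse.from_spmatrix(
    sp_arr, index=observations, columns=features
)

# The .sparse accessor can convert back to a SciPy COO matrix.
round_trip = sdf.sparse.to_coo()
```

The result is an ordinary DataFrame with sparse-typed columns, so the usual indexing and selection methods apply.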

Method 3

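A much shorter version, but note that toarray() builds a dense array first, so it only suits matrices small enough to fit in memory: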

df = pd.DataFrame(m.toarray())
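
To get a feel for what densifying costs, the footprint of a dense frame can be compared with a sparse-backed one. This is a rough sketch on synthetic data, assuming pandas 0.25 or later for DataFrame.sparse.from_spmatrix; the matrix size and sparsity threshold are arbitrary choices for the example, and the actual savings depend on how sparse the real data is.

```python
import numpy as np
import pandas as pd
from scipy.sparse import csr_matrix

# Synthetic matrix in which roughly 1% of entries are non-zero.
rng = np.random.default_rng(0)
arr = rng.random(size=(1000, 50))
arr[arr < 0.99] = 0
sp = csr_matrix(arr)

# Dense route (Method 3) versus a sparse-backed DataFrame.
dense_df = pd.DataFrame(sp.toarray())
sparse_df = pd.DataFrame.sparse.from_spmatrix(sp)

# Sum the per-column memory usage, ignoring the index.
dense_bytes = dense_df.memory_usage(index=False).sum()
sparse_bytes = sparse_df.memory_usage(index=False).sum()

print(f"dense:  {dense_bytes} bytes")
print(f"sparse: {sparse_bytes} bytes")
```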


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
