I have two different arrays, one with strings and another with ints. I want to concatenate them, into one array where each column has the original datatype. My current solution for doing this (see below) converts the entire array into dtype = string, which seems very memory inefficient.
combined_array = np.concatenate((A, B), axis = 1)
Is it possible to mutiple dtypes in combined_array when A.dtype = string and B.dtype = int?
Answers:
Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.
Method 1
One approach might be to use a record array. The “columns” won’t be like the columns of standard numpy arrays, but for most use cases, this is sufficient:
>>> a = numpy.array(['a', 'b', 'c', 'd', 'e'])
>>> b = numpy.arange(5)
>>> records = numpy.rec.fromarrays((a, b), names=('keys', 'data'))
>>> records
rec.array([('a', 0), ('b', 1), ('c', 2), ('d', 3), ('e', 4)],
dtype=[('keys', '|S1'), ('data', '<i8')])
>>> records['keys']
rec.array(['a', 'b', 'c', 'd', 'e'],
dtype='|S1')
>>> records['data']
array([0, 1, 2, 3, 4])
Note that you can also do something similar with a standard array by specifying the datatype of the array. This is known as a “structured array“:
>>> arr = numpy.array([('a', 0), ('b', 1)],
dtype=([('keys', '|S1'), ('data', 'i8')]))
>>> arr
array([('a', 0), ('b', 1)],
dtype=[('keys', '|S1'), ('data', '<i8')])
The difference is that record arrays also allow attribute access to individual data fields. Standard structured arrays do not.
>>> records.keys
chararray(['a', 'b', 'c', 'd', 'e'],
dtype='|S1')
>>> arr.keys
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'numpy.ndarray' object has no attribute 'keys'
Method 2
A simple solution: convert your data to object ‘O’ type
z = np.zeros((2,2), dtype='U2') o = np.ones((2,1), dtype='O') np.hstack([o, z])
creates the array:
array([[1, '', ''],
[1, '', '']], dtype=object)
Method 3
Refering Numpy doc, there is a function named numpy.lib.recfunctions.merge_arraysfunction which can be used to merge numpy arrays in different data type into either structured array or record array.
Example:
>>> from numpy.lib import recfunctions as rfn
>>> A = np.array([1, 2, 3])
>>> B = np.array(['a', 'b', 'c'])
>>> b = rfn.merge_arrays((A, B))
>>> b
array([(1, 'a'), (2, 'b'), (3, 'c')], dtype=[('f0', '<i4'), ('f1', '<U1')])
For more detail please refer the link above.
All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0