I have a dataframe resultstatsDF
resultstatsDF = DataFrame({'a': [1,2,3,4,5]})
resultstatsDF['file'] = 'asdf'
resultstatsDF.dtypes
a int64
file object
dtype: object
with the object column file that I would like to cast to string:
I tried
resultstatsDF = resultstatsDF.astype({'file': str})
resultstatsDF['file'] = resultstatsDF['file'].astype(str)
resultstatsDF['file'] = resultstatsDF['file'].to_string
resultstatsDF['file'] = resultstatsDF.file.apply(str)
resultstatsDF['file'] = resultstatsDF['file'].apply(str)
but whatever I do, when I check with
resultstatsDF.dtypes
the column file remains of type object.
Answers:
Method 1
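pandas stores Python objects such as strings, dicts and lists in columns of dtype object, so dtypes alone does not show the element type. To test the actual type, select a value from the column, e.g. with iat: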
type(resultstatsDF['file'].iat[0])
Sample:
resultstatsDF = pd.DataFrame({'file':['a','d','f']})
print (resultstatsDF)
file
0 a
1 d
2 f
print (type(resultstatsDF['file'].iloc[0]))
<class 'str'>
print (resultstatsDF['file'].apply(type))
0 <class 'str'>
1 <class 'str'>
2 <class 'str'>
Name: file, dtype: object
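Applied to the frame from the question, the same check might look like the sketch below. The astype('string') step is not part of the answer above: it assumes pandas 1.0 or later, where a dedicated string extension dtype is available. The names element_types and as_string are only illustrative.

```python
import pandas as pd

# Rebuild the frame from the question.
resultstatsDF = pd.DataFrame({'a': [1, 2, 3, 4, 5]})
resultstatsDF['file'] = 'asdf'

# Collect the Python types of the elements; the column's dtype label
# says nothing about them, so inspect the values directly.
element_types = set(resultstatsDF['file'].map(type))
print(element_types)

# Optional (assumes pandas >= 1.0): opt in to the dedicated string dtype,
# so that the dtype itself reports a string column.
as_string = resultstatsDF['file'].astype('string')
print(as_string.dtype)
```

The values were already str before the conversion, so the cast changes only how pandas labels and stores the column, not what each element is.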
Sample:
df = pd.DataFrame({'strings':['a','d','f'],
                   'dicts':[{'a':4}, {'c':8}, {'e':9}],
                   'lists':[[4,8],[7,8],[3]],
                   'tuples':[(4,8),(7,8),(3,)],
                   'sets':[set([1,8]), set([7,3]), set([0,1])] })
print (df)
dicts lists sets strings tuples
0 {'a': 4} [4, 8] {8, 1} a (4, 8)
1 {'c': 8} [7, 8] {3, 7} d (7, 8)
2 {'e': 9} [3] {0, 1} f (3,)
All columns have the same dtype:
print (df.dtypes)
dicts object
lists object
sets object
strings object
tuples object
dtype: object
But the element types differ. To check them, loop over the columns:
for col in df:
    print (df[col].apply(type))
0 <class 'dict'>
1 <class 'dict'>
2 <class 'dict'>
Name: dicts, dtype: object
0 <class 'list'>
1 <class 'list'>
2 <class 'list'>
Name: lists, dtype: object
0 <class 'set'>
1 <class 'set'>
2 <class 'set'>
Name: sets, dtype: object
0 <class 'str'>
1 <class 'str'>
2 <class 'str'>
Name: strings, dtype: object
0 <class 'tuple'>
1 <class 'tuple'>
2 <class 'tuple'>
Name: tuples, dtype: object
Or check the first value of each column:
print (type(df['strings'].iat[0]))
<class 'str'>
print (type(df['dicts'].iat[0]))
<class 'dict'>
print (type(df['lists'].iat[0]))
<class 'list'>
print (type(df['tuples'].iat[0]))
<class 'tuple'>
print (type(df['sets'].iat[0]))
<class 'set'>
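The per-column checks above can also be collapsed into one summary. This is a minimal sketch, not part of the original answer: type_names and inferred are illustrative names, and pandas.api.types.infer_dtype is used as an extra helper that reports what pandas itself infers from the values.

```python
import pandas as pd
from pandas.api.types import infer_dtype

df = pd.DataFrame({'strings': ['a', 'd', 'f'],
                   'dicts': [{'a': 4}, {'c': 8}, {'e': 9}],
                   'lists': [[4, 8], [7, 8], [3]]})

# For every column, collect the names of the Python types it contains.
type_names = {col: sorted({type(v).__name__ for v in df[col]}) for col in df}
print(type_names)

# Ask pandas what it infers from the values of each column.
inferred = {col: infer_dtype(df[col]) for col in df}
print(inferred)
```

A column whose list in type_names has more than one entry is a mixed column and may need cleaning before further processing.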
If a column may hold mixed types (which can break some pandas functions), it is possible to filter by type with boolean indexing:
df = pd.DataFrame({'mixed':['3', 5, 9,'2']})
print (df)
mixed
0 3
1 5
2 9
3 2
print (df.dtypes)
mixed object
dtype: object
for col in df:
    print (df[col].apply(type))
0 <class 'str'>
1 <class 'int'>
2 <class 'int'>
3 <class 'str'>
Name: mixed, dtype: object
# Python 3: test against str
# Python 2: test against basestring
mask = df['mixed'].apply(lambda x: isinstance(x,str))
print (mask)
0 True
1 False
2 False
3 True
Name: mixed, dtype: bool
df = df[mask]
print (df)
mixed
0 3
3 2
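Filtering drops the rows that are not strings. If every row should be kept, an alternative is to convert the whole column to a single type. The sketch below is an assumption about what might be wanted rather than part of the original answer; as_text and as_number are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({'mixed': ['3', 5, 9, '2']})

# Option A: turn every value into a str, keeping all rows.
as_text = df['mixed'].astype(str)
print(as_text.apply(type))

# Option B: parse every value as a number; with errors='coerce',
# anything that cannot be parsed would become NaN instead of raising.
as_number = pd.to_numeric(df['mixed'], errors='coerce')
print(as_number)
```

Which option fits depends on whether the column is meant to hold text or numbers. Either way the result has one element type throughout.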
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.