Pandas: Cast column to string does not work

I have a dataframe resultstatsDF:

import pandas as pd

resultstatsDF = pd.DataFrame({'a': [1,2,3,4,5]})
resultstatsDF['file'] = 'asdf'
resultstatsDF.dtypes
a        int64
file    object
dtype: object

with the object column file, which I would like to cast to string.

I tried:

resultstatsDF = resultstatsDF.astype({'file': str})
resultstatsDF['file'] = resultstatsDF['file'].astype(str)
resultstatsDF['file'] = resultstatsDF['file'].to_string
resultstatsDF['file'] = resultstatsDF.file.apply(str)
resultstatsDF['file'] = resultstatsDF['file'].apply(str)

but whatever I do, when I check with

resultstatsDF.dtypes

the column file is still of type object.

Answers:


Method 1

The dtype of a column holding strings, dicts, or lists is always object. To test the actual Python type, select a value from the column, e.g. with iat:

type(resultstatsDF['file'].iat[0])

Sample:

resultstatsDF = pd.DataFrame({'file':['a','d','f']})
print (resultstatsDF)
  file
0    a
1    d
2    f

print (type(resultstatsDF['file'].iloc[0]))
<class 'str'>

print (resultstatsDF['file'].apply(type))
0    <class 'str'>
1    <class 'str'>
2    <class 'str'>
Name: file, dtype: object
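
If the goal is a column whose dtype itself reads "string" rather than object, newer pandas releases offer a dedicated extension dtype. The following is a minimal sketch, not part of the original answer, and it assumes pandas 1.0 or later, where the 'string' dtype alias is available:

```python
import pandas as pd

resultstatsDF = pd.DataFrame({'a': [1, 2, 3, 4, 5]})
resultstatsDF['file'] = 'asdf'

# astype(str) converts the values to Python str objects; on pandas
# versions before 3.0 the column dtype is still reported as object.
as_str = resultstatsDF['file'].astype(str)
first_value_type = type(as_str.iat[0])
print(first_value_type)

# Assumption: pandas >= 1.0, which provides the dedicated 'string' dtype.
as_string_dtype = resultstatsDF['file'].astype('string')
print(as_string_dtype.dtype)
```

Whether the extension dtype is worth using depends on the pandas version and on the code that consumes the column; the values are Python strings in both cases.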

Sample:

df = pd.DataFrame({'strings':['a','d','f'],
                   'dicts':[{'a':4}, {'c':8}, {'e':9}],
                   'lists':[[4,8],[7,8],[3]],
                   'tuples':[(4,8),(7,8),(3,)],
                   'sets':[set([1,8]), set([7,3]), set([0,1])] })

print (df)
      dicts   lists    sets strings  tuples
0  {'a': 4}  [4, 8]  {8, 1}       a  (4, 8)
1  {'c': 8}  [7, 8]  {3, 7}       d  (7, 8)
2  {'e': 9}     [3]  {0, 1}       f    (3,)

All columns have the same dtype:

print (df.dtypes)
dicts      object
lists      object
sets       object
strings    object
tuples     object
dtype: object

But the Python types of the values differ. To check them, loop over the columns:

for col in df:
    print (df[col].apply(type))

0    <class 'dict'>
1    <class 'dict'>
2    <class 'dict'>
Name: dicts, dtype: object
0    <class 'list'>
1    <class 'list'>
2    <class 'list'>
Name: lists, dtype: object
0    <class 'set'>
1    <class 'set'>
2    <class 'set'>
Name: sets, dtype: object
0    <class 'str'>
1    <class 'str'>
2    <class 'str'>
Name: strings, dtype: object
0    <class 'tuple'>
1    <class 'tuple'>
2    <class 'tuple'>
Name: tuples, dtype: object
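
For wide frames, printing one Series per column gets long. A more compact variant of the same check can be sketched as below; the helper name python_types_per_column and the extra 'mixed' column are hypothetical and only there for illustration:

```python
import pandas as pd

df = pd.DataFrame({'strings': ['a', 'd', 'f'],
                   'dicts': [{'a': 4}, {'c': 8}, {'e': 9}],
                   'mixed': ['3', 5, 'x']})  # hypothetical mixed column


def python_types_per_column(frame):
    """Map each column name to the sorted names of the Python types it holds."""
    return {col: sorted({type(value).__name__ for value in frame[col]})
            for col in frame.columns}


type_summary = python_types_per_column(df)
print(type_summary)
```

A column whose list has more than one entry holds mixed types.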

Or check only the first value of each column:

print (type(df['strings'].iat[0]))
<class 'str'>

print (type(df['dicts'].iat[0]))
<class 'dict'>

print (type(df['lists'].iat[0]))
<class 'list'>

print (type(df['tuples'].iat[0]))
<class 'tuple'>

print (type(df['sets'].iat[0]))
<class 'set'>
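
Checking only the first value says nothing about the rest of the column. As a hedged alternative, pandas.api.types.infer_dtype scans all values and returns a short label. The sketch below builds its own object-dtype Series (the data is illustrative, not from the question):

```python
import pandas as pd
from pandas.api.types import infer_dtype

# dtype=object is set explicitly so both Series are plain object columns.
strings = pd.Series(['a', 'd', 'f'], dtype=object)
mixed = pd.Series(['3', 5, 9, '2'], dtype=object)  # str and int mixed

strings_kind = infer_dtype(strings)
mixed_kind = infer_dtype(mixed)

print(strings_kind)
print(mixed_kind)
```

A label other than 'string' for a column expected to hold only text is a hint that it contains something else.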

If a column may contain mixed types (which can break some pandas functions), the rows can be filtered by type with boolean indexing:

df = pd.DataFrame({'mixed':['3', 5, 9,'2']})
print (df)
  mixed
0     3
1     5
2     9
3     2

print (df.dtypes)
mixed    object
dtype: object

for col in df:
    print (df[col].apply(type))
0    <class 'str'>
1    <class 'int'>
2    <class 'int'>
3    <class 'str'>
Name: mixed, dtype: object

# Python 3: check against str
# Python 2: check against basestring
mask = df['mixed'].apply(lambda x: isinstance(x,str))
print (mask)
0     True
1    False
2    False
3     True
Name: mixed, dtype: bool

df = df[mask]
print (df)
  mixed
0     3
3     2
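
Filtering drops the rows that are not strings. If the rows should be kept, the mixed column can be normalised to one type instead. This is a minimal sketch under the assumption that every value is either a string of digits or an int, as in the sample above:

```python
import pandas as pd

df = pd.DataFrame({'mixed': ['3', 5, 9, '2']})

# Option 1: turn every value into a Python str.
as_text = df['mixed'].astype(str)
text_values = as_text.tolist()

# Option 2: parse every value as a number. By default pd.to_numeric
# raises a ValueError if a value cannot be parsed.
as_numbers = pd.to_numeric(df['mixed'])
number_values = as_numbers.tolist()

print(text_values)
print(number_values)
```

Which option fits depends on whether the column is meant to be text or numeric downstream.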


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
