I have successfully read a csv file using pandas. When I am trying to print the a particular column from the data frame i am getting keyerror. Hereby i am sharing the code with the error.
import pandas as pd
reviews_new = pd.read_csv("D:\aviva.csv")
reviews_new['review']
**
reviews_new['review']
Traceback (most recent call last):
File "<ipython-input-43-ed485b439a1c>", line 1, in <module>
reviews_new['review']
File "C:Users30216AppDataLocalContinuumAnaconda2libsite-packagespandascoreframe.py", line 1997, in __getitem__
return self._getitem_column(key)
File "C:Users30216AppDataLocalContinuumAnaconda2libsite-packagespandascoreframe.py", line 2004, in _getitem_column
return self._get_item_cache(key)
File "C:Users30216AppDataLocalContinuumAnaconda2libsite-packagespandascoregeneric.py", line 1350, in _get_item_cache
values = self._data.get(item)
File "C:Users30216AppDataLocalContinuumAnaconda2libsite-packagespandascoreinternals.py", line 3290, in get
loc = self.items.get_loc(item)
File "C:Users30216AppDataLocalContinuumAnaconda2libsite-packagespandasindexesbase.py", line 1947, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandasindex.pyx", line 137, in pandas.index.IndexEngine.get_loc (pandasindex.c:4154)
File "pandasindex.pyx", line 159, in pandas.index.IndexEngine.get_loc (pandasindex.c:4018)
File "pandashashtable.pyx", line 675, in pandas.hashtable.PyObjectHashTable.get_item (pandashashtable.c:12368)
File "pandashashtable.pyx", line 683, in pandas.hashtable.PyObjectHashTable.get_item (pandashashtable.c:12322)
KeyError: 'review'
**
Can someone help me in this ?
Answers:
Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.
Method 1
I think first is best investigate, what are real columns names, if convert to list better are seen some whitespaces or similar:
print (reviews_new.columns.tolist())
I think there can be 2 problems (obviously):
1.whitespaces in columns names (maybe in data also)
Solutions are strip whitespaces in column names:
reviews_new.columns = reviews_new.columns.str.strip()
Or add parameter skipinitialspace to read_csv:
reviews_new = pd.read_csv("D:\aviva.csv", skipinitialspace=True)
2.different separator as default ,
Solution is add parameter sep:
#sep is ;
reviews_new = pd.read_csv("D:\aviva.csv", sep=';')
#sep is whitespace
reviews_new = pd.read_csv("D:\aviva.csv", sep='s+')
reviews_new = pd.read_csv("D:\aviva.csv", delim_whitespace=True)
EDIT:
You get whitespace in column name, so need 1.solutions:
print (reviews_new.columns.tolist())
['Name', ' Date', ' review']
^ ^
Method 2
import pandas as pd
df=pd.read_csv("file.txt", skipinitialspace=True)
df.head()
df['review']
Method 3
dfObj['Hash Key'] = (dfObj['DEAL_ID'].map(str) +dfObj['COST_CODE'].map(str) +dfObj['TRADE_ID'].map(str)).apply(hash) #for index, row in dfObj.iterrows(): # dfObj.loc[`enter code here`index,'hash'] = hashlib.md5(str(row[['COST_CODE','TRADE_ID']].values)).hexdigest() print(dfObj['hash'])
All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0