I want to print all the data (all rows) of a specific column in Python using openpyxl. I am working this way:
from openpyxl import load_workbook
workbook = load_workbook('----------/dataset.xlsx')
sheet = workbook.active
for i in sheet:
    print(sheet.cell(row=i, column=2).value)
But it gives
if row < 1 or column < 1:
TypeError: unorderable types: tuple() < int()
I think this is because I am passing i as row=i. If I use sheet.cell(row=4, column=2).value, it prints the value of the cell. But how can I iterate over the whole document?
Edit 1
After some research, I found that the data can be retrieved using the sheet name. Sheet1 exists in the .xlsx file, but its data is not printed. Is there a problem in this code?
workbook = load_workbook('---------------/dataset.xlsx')
print(workbook.get_sheet_names())
worksheet = workbook.get_sheet_by_name('Sheet1')
c = 2
for i in worksheet:
    d = worksheet.cell(row=c, column=2)
    if(d.value is None):
        return
    else:
        print(d.value)
    c = c + 1
Answers:
Method 1
Read the OpenPyXL Documentation
Iteration over all worksheets in a workbook, for instance:
for n, sheet in enumerate(wb.worksheets):
    print('Sheet Index:[{}], Title:{}'.format(n, sheet.title))
Output:
Sheet Index:[0], Title: Sheet
Sheet Index:[1], Title: Sheet1
Sheet Index:[2], Title: Sheet2
Iteration over all rows and columns in one Worksheet:
# workbook['Sheet'] replaces get_sheet_by_name('Sheet'), which is deprecated in later openpyxl releases
worksheet = workbook['Sheet']
for row_cells in worksheet.iter_rows():
    for cell in row_cells:
        print('%s: cell.value=%s' % (cell, cell.value))
Output:
<Cell Sheet.A1>: cell.value=²234
<Cell Sheet.B1>: cell.value=12.5
<Cell Sheet.C1>: cell.value=C1
<Cell Sheet.D1>: cell.value=D1
<Cell Sheet.A2>: cell.value=1234
<Cell Sheet.B2>: cell.value=8.2
<Cell Sheet.C2>: cell.value=C2
<Cell Sheet.D2>: cell.value=D2
Iteration over all columns of one row, for instance row==2:
for row_cells in worksheet.iter_rows(min_row=2, max_row=2):
    for cell in row_cells:
        print('%s: cell.value=%s' % (cell, cell.value))
Output:
<Cell Sheet.A2>: cell.value=1234
<Cell Sheet.B2>: cell.value=8.2
<Cell Sheet.C2>: cell.value=C2
<Cell Sheet.D2>: cell.value=D2
Iteration over all rows, only column 2:
for col_cells in worksheet.iter_cols(min_col=2, max_col=2):
    for cell in col_cells:
        print('%s: cell.value=%s' % (cell, cell.value))
Output:
<Cell Sheet.B1>: cell.value=12.5
<Cell Sheet.B2>: cell.value=8.2
<Cell Sheet.B3>: cell.value=9.8
<Cell Sheet.B4>: cell.value=10.1
<Cell Sheet.B5>: cell.value=7.7
Tested with Python:3.4.2 – openpyxl:2.4.1 – LibreOffice: 4.3.3.2
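To tie the column iteration above back to the original question, here is a minimal sketch that collects the values of one column into a list and prints them. The helper name column_values and the in-memory sample data are hypothetical, used only so the example runs without dataset.xlsx; with a real file you would pass the sheet obtained from load_workbook instead.

```python
from openpyxl import Workbook


def column_values(sheet, column_index):
    """Return the values of one column (1-based index) as a list."""
    values = []
    # Restricting min_col and max_col to the same index yields a single column.
    for col_cells in sheet.iter_cols(min_col=column_index, max_col=column_index):
        for cell in col_cells:
            values.append(cell.value)
    return values


# Hypothetical in-memory data standing in for dataset.xlsx.
workbook = Workbook()
sheet = workbook.active
sheet.append(["A1", 12.5])
sheet.append(["A2", 8.2])
sheet.append(["A3", 9.8])

for value in column_values(sheet, 2):
    print(value)
```

Returning a list rather than printing inside the helper keeps the values available for further processing.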
Method 2
Try this:
from openpyxl import load_workbook
workbook = load_workbook('----------/dataset.xlsx')
sheet = workbook.active
row_count = sheet.max_row
# openpyxl rows are numbered from 1, so run from 1 through row_count inclusive
for i in range(1, row_count + 1):
    print(sheet.cell(row=i, column=2).value)
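Note that openpyxl numbers rows and columns from 1, and sheet.max_row is the number of the last used row, so a loop over row numbers has to run from 1 through max_row inclusive. The following is a minimal sketch of that behaviour, assuming a small in-memory workbook with made-up values in place of dataset.xlsx:

```python
from openpyxl import Workbook

# Hypothetical in-memory data standing in for dataset.xlsx.
workbook = Workbook()
sheet = workbook.active
for value in (12.5, 8.2, 9.8):
    sheet.append(["label", value])

# Row 0 does not exist: openpyxl rejects row or column numbers below 1.
try:
    sheet.cell(row=0, column=2)
    zero_row_rejected = False
except ValueError:
    zero_row_rejected = True

# Include max_row itself, otherwise the last row is silently skipped.
column_b = [sheet.cell(row=i, column=2).value
            for i in range(1, sheet.max_row + 1)]

print(zero_row_rejected)
print(column_b)
```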
Method 3
This code reads a sheet as if it were a CSV file and populates a list of dictionaries in result, using the first row as the column titles.
from datetime import datetime
from openpyxl import load_workbook

result = []
wb = load_workbook(filename=file_name)  # file_name: path to the .xlsx file
sheet = wb.active

# Map column number -> heading, stopping at the first empty heading
col_count = sheet.max_column
column_names = {}
for c in range(1, col_count + 1):
    heading = sheet.cell(row=1, column=c).value
    if not heading:
        col_count = c - 1
        break
    column_names[c] = heading

# Build one dictionary per data row, starting at row 2
for r, row_cells in enumerate(sheet.iter_rows(min_row=2), 2):
    row = {}
    for c in range(1, col_count + 1):
        value = sheet.cell(row=r, column=c).value
        # Format dates as ISO-style strings
        if isinstance(value, datetime):
            value = value.strftime('%Y-%m-%d')
        row[column_names[c]] = value
    result.append(row)
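As a variation on the same idea, the sketch below reads cell values directly instead of calling sheet.cell for every position. It assumes openpyxl 2.6 or later, where iter_rows accepts values_only=True; the function name sheet_to_dicts and the sample data are hypothetical and only serve to make the example runnable.

```python
from datetime import datetime

from openpyxl import Workbook


def sheet_to_dicts(sheet):
    """Return the sheet as a list of dicts keyed by the first-row headings."""
    rows = sheet.iter_rows(values_only=True)  # assumes openpyxl >= 2.6
    try:
        header = next(rows)
    except StopIteration:
        return []  # empty sheet, nothing to read
    result = []
    for values in rows:
        record = {}
        for name, value in zip(header, values):
            if not name:
                break  # stop at the first empty heading
            if isinstance(value, datetime):
                value = value.strftime('%Y-%m-%d')
            record[name] = value
        result.append(record)
    return result


# Hypothetical in-memory data; with a real file, use load_workbook(...).active.
wb = Workbook()
sheet = wb.active
sheet.append(["name", "joined"])
sheet.append(["Ann", datetime(2020, 1, 2)])
sheet.append(["Bob", datetime(2021, 3, 4)])

result = sheet_to_dicts(sheet)
print(result)
```

Pairing headings with values through zip avoids tracking column numbers by hand, at the cost of requiring a newer openpyxl than the answers above were tested with.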
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.