Creating multiple columns in a for loop in Python

I’m new to Python.
I’m trying to create multiple columns in a for loop but I’m having trouble with it.
I have several columns, and I’m trying to create new columns that show whether each element in ohlcs is greater than or equal to each element in metrics. I can create one column by hand, but I want to save time since I plan on doing the same thing for different variables.

ohlcs = ['open', 'high', 'low', 'close']
metrics = ['vwap', '9EMA', '20EMA']
wip = []
for idx, row in master_df.iterrows():
    for ohlc in ohlcs:
        for metric in metrics:
            row[f'{ohlc} above {metric}'] = np.where(row[ohlc] >= row[metric], 1, 0)

This didn’t do anything.
I’ve also done this:

ohlcs = ['open', 'high', 'low', 'close']
metrics = ['vwap', '9EMA', '20EMA']
wip = []
for idx, row in master_df.iterrows():
    for ohlc in ohlcs:
        for metric in metrics:
           if master_df[ohlc] >= master_df[metric]: 
               master_df[f'{ohlc} above {metric}'] = 1
           else:
               master_df[f'{ohlc} above {metric}'] = 0

That gave me an error.

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I did other things but I erased those as I worked on it. At this point I’m out of ideas. Please help!

I got it working now, but when I checked manually, the values didn’t line up.

How do I fix it?

Answers:

Method 1

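There is no need to iterate over the rows of the dataframe. In the first attempt, each row yielded by iterrows() is a copy, so assigning to it does not write back to master_df. In the second, master_df[ohlc] >= master_df[metric] is a whole Series of booleans, which an if statement cannot reduce to a single True or False, hence the ValueError. Comparing the columns directly works element-wise and will give you the required result: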

for ohlc in ohlcs:
    for metric in metrics:
        master_df[f'{ohlc} over {metric}'] = (master_df[ohlc] >= master_df[metric]).astype(int)

The astype(int) part only converts True and False into 1 and 0. If you are okay with the True/False representation, you can drop it and use just master_df[f'{ohlc} over {metric}'] = master_df[ohlc] >= master_df[metric].

EDIT: Of course, (master_df[ohlc] >= master_df[metric]).astype(int) is equivalent to np.where(master_df[ohlc] >= master_df[metric], 1, 0); you can use either.
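
As a quick check of that equivalence, here is a minimal sketch on a made-up three-row frame. The data and the single close/vwap pair are assumptions for illustration, not the asker's data:

```python
import numpy as np
import pandas as pd

# Toy data, invented for illustration only.
master_df = pd.DataFrame({
    "close": [10.0, 9.5, 11.0],
    "vwap": [9.8, 9.9, 11.0],
})

# Element-wise comparison: one boolean per row, no Python-level loop over rows.
mask = master_df["close"] >= master_df["vwap"]

# Two ways to turn the boolean mask into 1/0 flags.
via_astype = mask.astype(int)
via_where = pd.Series(np.where(mask, 1, 0), index=master_df.index)

print(via_astype.tolist())
print(via_where.tolist())
```

Both lists should match row for row. Note that np.where returns a plain NumPy array, so the sketch wraps it in a Series only to make the comparison easy; assigning the array straight to a new column works as well.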

Method 2

Consider itertools.product and the method form Series.ge to cover all pairwise combinations in a single, flatter loop:

from itertools import product
...

ohlcs = ['open', 'high', 'low', 'close']
metrics = ['vwap', '9EMA', '20EMA']

pairs = product(ohlcs, metrics)

for ohlc, metric in pairs:
    master_df[f"{ohlc} over {metric}"] = master_df[ohlc].ge(master_df[metric])
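
For a version that runs end to end, the following sketch applies the same loop to a tiny invented frame and adds astype(int) to get 1/0 flags instead of True/False. The two rows of data are assumptions for illustration only:

```python
from itertools import product

import pandas as pd

# Toy data, invented for illustration only.
master_df = pd.DataFrame({
    "open": [10.0, 9.0],
    "high": [11.0, 9.5],
    "low": [9.5, 8.5],
    "close": [10.5, 9.2],
    "vwap": [10.2, 9.3],
    "9EMA": [10.0, 9.4],
    "20EMA": [9.8, 9.6],
})

ohlcs = ["open", "high", "low", "close"]
metrics = ["vwap", "9EMA", "20EMA"]

# product() yields every (ohlc, metric) pair: 4 x 3 = 12 combinations.
for ohlc, metric in product(ohlcs, metrics):
    # .ge() is the method form of >= and compares the two columns row by row.
    master_df[f"{ohlc} over {metric}"] = master_df[ohlc].ge(master_df[metric]).astype(int)

new_cols = [c for c in master_df.columns if " over " in c]
print(len(new_cols))
print(master_df["close over vwap"].tolist())
```

One thing to keep in mind: product() returns an iterator that is exhausted after one pass, so a stored pairs variable cannot be looped over a second time unless it is rebuilt or wrapped in list().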

All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
