Looping over pandas DataFrame

I have a strange issue: the printed result is the same on every iteration of the loop. The code is the following:

import pandas as pd
import numpy as np

X = np.arange(10,100)
Y = X[::-1]
Z = np.array([X,Y]).T

df = pd.DataFrame(Z ,columns = ['col1','col2'])
dif = df['col1'] - df['col2']

for gap in range(100):
    Up = dif > gap
    Down = dif < -gap
    
    df.loc[Up,'predict'] = 'Up'
    df.loc[Down,'predict'] = 'Down'
    
    df_result = df.dropna()
    Total = df.shape[0]
    count = df_result.shape[0]
    ratio = count/Total
    print(f'Total: {Total}; count: {count}; ratio: {ratio}')

Every iteration prints

Total: 90; count: 90; ratio: 1.0

even though count and ratio should fall as gap grows.



Solution 1:[1]

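I found the root of the problem five minutes after posting this question. The 'predict' column persists between iterations: with gap = 0 every row gets a label, because dif is never zero, and later iterations only overwrite some of those labels, so dropna() never removes anything. Resetting the DataFrame to its original state at the start of each iteration fixes the problem.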
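
To make the leftover labels visible, here is a minimal sketch that assumes the same data as in the question. The explicit `df['predict'] = None` initialisation and the `report` list are additions for illustration only, not part of the original code.

```python
import numpy as np
import pandas as pd

X = np.arange(10, 100)
df = pd.DataFrame({'col1': X, 'col2': X[::-1]})
dif = df['col1'] - df['col2']      # odd values from -89 to 89, never 0

df['predict'] = None               # illustrative: start with an empty label column

report = []                        # illustrative: (gap, labelled this pass, non-null in df)
for gap in (0, 50):
    up = dif > gap
    down = dif < -gap
    df.loc[up, 'predict'] = 'Up'
    df.loc[down, 'predict'] = 'Down'

    labelled_now = int((up | down).sum())          # rows this gap actually labels
    non_null = int(df['predict'].notna().sum())    # rows carrying any label, old or new
    report.append((gap, labelled_now, non_null))
    print(f'gap={gap}: labelled now={labelled_now}; non-null in df={non_null}')
```

Both passes report 90 non-null rows even though gap = 50 labels only 40 of them; the other 50 are leftovers from gap = 0. The corrected loop follows.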

import pandas as pd
import numpy as np

X = np.arange(10,100)
Y = X[::-1]
Z = np.array([X,Y]).T

df = pd.DataFrame(Z ,columns = ['col1','col2'])
df2 = df.copy()  # added this line to preserve the original df
dif = df['col1'] - df['col2']

for gap in range(100):
    df = df2.copy()  # reset the altered df back to the original
    Up = dif > gap
    Down = dif < -gap

    df.loc[Up,'predict'] = 'Up'
    df.loc[Down,'predict'] = 'Down'

    df_result = df.dropna()
    Total = df.shape[0]
    count = df_result.shape[0]
    ratio = count/Total
    print(f'Total: {Total}; count: {count}; ratio: {ratio}')
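
A possible alternative, which avoids copying the whole DataFrame on every pass, is to build the labels in a fresh Series each time so nothing carries over between iterations. This is only a sketch under the same data assumptions; `labelled_ratio` is a hypothetical helper name, not something from the original post.

```python
import numpy as np
import pandas as pd

def labelled_ratio(dif: pd.Series, gap: int) -> float:
    """Fraction of rows whose difference lies beyond +/- gap (hypothetical helper)."""
    # A new, all-missing label Series per call, so old labels cannot leak in.
    predict = pd.Series(np.nan, index=dif.index, dtype=object)
    predict.loc[dif > gap] = 'Up'
    predict.loc[dif < -gap] = 'Down'
    return float(predict.notna().sum() / len(dif))

X = np.arange(10, 100)
df = pd.DataFrame({'col1': X, 'col2': X[::-1]})
dif = df['col1'] - df['col2']

ratios = {gap: labelled_ratio(dif, gap) for gap in range(100)}
for gap in (0, 50, 89):
    print(f'gap={gap}: ratio={ratios[gap]}')
```

With this data the ratio starts at 1.0 for gap = 0 and reaches 0.0 from gap = 89 onward, since the largest absolute difference is 89.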

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 mathguy