'pandas non equi join in range
I need to do 'pandas non equi join', when first table joined with second table in range.
first_table
EMPLOYEE_ID SALARY
100 3000.00
101 17000.00
102 17000.00
103 9000.00
104 6000.00
105 4800.00
106 4800.00
………….. …………
………………. …………
second_table
grade_id lowest_sal highest_sal grade_level
1 0 3500 GRADE-A
2 3501 7000 GRADE-B
3 7001 10000 GRADE-C
4 10000 20000 GRADE-D
Need_table(OUTPUT):
EMPLOYEE_ID SALARY grade_level
115 3000 GRADE-A
116 17000 GRADE-D
117 17000 GRADE-D
118 9000 GRADE-C
119 6000 GRADE-B
125 4800 GRADE-B
126 4800 GRADE-B
This equivalent SQL query as:
SELECT f.EMPLOYEE_ID,
f.SALARY,
s.grade_level
FROM first_table f JOIN second_table s
ON f.SALARY BETWEEN s.lowest_sal AND s.highest_sal
Can't to use 'pd.merge' method to join tables because not have any common column.... Please help to find method
Thanks
Solution 1:[1]
If df1
is your first table and df2
is your second table, you could do for example this:
d = df2.set_index('grade_level').to_dict('split')
df1['GRADE'] = df1['SALARY'].apply(
lambda x: next((c for i, c in enumerate(d['index']) if d['data'][i][1] <= x <= d['data'][i][2]), np.nan)
)
print(df1)
Prints:
EMPLOYEE_ID SALARY GRADE
0 100 3000.0 GRADE-A
1 101 17000.0 GRADE-D
2 102 17000.0 GRADE-D
3 103 9000.0 GRADE-C
4 104 6000.0 GRADE-B
5 105 4800.0 GRADE-B
6 106 4800.0 GRADE-B
Solution 2:[2]
One option is with conditional_join from pyjanitor, which avoids a cartesian join (helpful with memory, and performance, depending on the data size):
# pip install pyjanitor
import pandas as pd
import janitor
(first_table
.astype({'SALARY':int})
.conditional_join(
second_table,
('SALARY', 'lowest_sal', '>='),
('SALARY', 'highest_sal', '<='))
.loc[:, ['EMPLOYEE_ID', 'SALARY', 'grade_level']]
)
EMPLOYEE_ID SALARY grade_level
0 100 3000 GRADE-A
1 101 17000 GRADE-D
2 102 17000 GRADE-D
3 103 9000 GRADE-C
4 104 6000 GRADE-B
5 105 4800 GRADE-B
6 106 4800 GRADE-B
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Andrej Kesely |
Solution 2 | sammywemmy |