Spark: join two DataFrames without a common column
I need to join two DataFrames in PySpark.

The first DataFrame, df1, looks like:

city | user_count_city | meeting_session
---|---|---
NYC | 100 | 5
LA | 200 | 10
... | ... | ...

The second DataFrame, df2, looks like:

total_user_count | total_meeting_sessions
---|---
1000 | 100
I need to calculate user_percentage and meeting_session_percentage, so I want something like a left join:

df1 left join df2

How can I join the two DataFrames when they have no common key?
I looked at the solution in this post: Joining two dataframes without a common column. But it is not the same as my case.
Expected results:

city | user_count_city | meeting_session | total_user_count | total_meeting_sessions
---|---|---|---|---
NYC | 100 | 5 | 1000 | 100
LA | 200 | 10 | 1000 | 100
... | ... | ... | ... | ...
Solution 1:[1]

You are looking for a cross join:

```python
result = df1.crossJoin(df2)
```
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source
---|---
Solution 1 | blackbishop