'queries executed using spark-sql run on driver or on the executors
I am curious to understand as where does the query executes when we run it using spark-sql -e.
spark-sql -e "SELECT count(*) FROM table"
moreover when we do a count, is the action called only on the driver?
Solution 1:[1]
An Action means do something. That Action comes from the Driver and work is distributed to Workers where Executors run Tasks against Partitions.
So, counting occurs on Executors. Partial count results from the Executors for counting Partition are sent to- and aggregated on the Driver.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |