How to tail YARN logs?

I am submitting a Spark job using the command below. I want to tail the YARN log by application ID, similar to the `tail` command on a Linux box.

export SPARK_MAJOR_VERSION=2
nohup spark-submit \
  --class "com.test.TestApplication" \
  --name TestApp \
  --queue queue1 \
  --properties-file application.properties \
  --files "hive-site.xml,tez-site.xml,hbase-site.xml,application.properties" \
  --master yarn \
  --deploy-mode cluster \
  Test-app.jar > /tmp/TestApp.log &
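If the application ID needed for the log commands isn't known yet, it can be looked up from YARN by the name passed to `--name` above (the `grep` filter on `TestApp` is just one way to narrow the list):

```shell
# List running YARN applications and filter by the app name given to --name;
# the first column of the matching row is the application ID.
yarn application -list -appStates RUNNING | grep TestApp
```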


Solution 1:[1]

Not easily.

"YARN logs" aren't really in YARN, they are actually on the executor nodes of Spark. If YARN log aggregation is enabled, then logs are in HDFS, and available from Spark History server.

A common production pattern is to configure Spark's log4j properties to write to a local file, run a log forwarder (such as Filebeat, Fluentd, or the Splunk forwarder) that ships the file to a search backend like Elasticsearch, Solr, Graylog, or Splunk, and then tail, search, and analyze the log messages from those tools rather than from the CLI.
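As a sketch of that pattern, a minimal log4j 1.x properties file could route logs to a local file that a forwarder can pick up. The file name, log path, and the way it is wired into spark-submit below are assumptions, not something from the original answer:

```properties
# custom-log4j.properties -- hypothetical file, shipped with e.g.
#   --files custom-log4j.properties \
#   --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=custom-log4j.properties"
log4j.rootCategory=INFO, file

# Write executor logs to a local file for the forwarder to tail
log4j.appender.file=org.apache.log4j.FileAppender
log4j.appender.file.File=/var/log/spark/testapp-executor.log
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
```

The forwarder (Filebeat, Fluentd, etc.) is then pointed at that file path on each node.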

Solution 2:[2]

yarn logs -applicationId application_1648123761230_0106 -log_files stdout -size -1000

https://docs.cloudera.com/HDPDocuments/HDP3/HDP-3.0.1/data-operating-system/content/use_the_yarn_cli_to_view_logs_for_running_applications.html
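In the command above, a negative `-size` retrieves that many bytes from the end of the log, so `-size -1000` shows the last 1000 bytes. YARN has no built-in follow mode, but re-running the command on an interval approximates `tail -f`; a simple sketch (the 5-second interval is arbitrary, and the application ID is the one from this answer):

```shell
# Re-fetch the last 1000 bytes of stdout every 5 seconds,
# roughly emulating tail -f for a running application.
watch -n 5 'yarn logs -applicationId application_1648123761230_0106 \
  -log_files stdout -size -1000'
```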

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution  Source
Solution 1  OneCricketeer
Solution 2  Jacob Joy