Presto cluster: Presto agents restart every minute instead of staying stable
We have a Hadoop cluster that includes Presto, Hive, HDFS, etc.
Our production Presto cluster has 254 Presto agent machines and one Presto coordinator.
All Presto services are installed on RHEL 7.6 machines.
We see strange behavior: the Presto agents restart every ~60 seconds, and so far we have not been able to pinpoint the root cause.
The log server.log looks like the following:
Configuration should be updated:
1) Configuration property 'hive.parquet.fail-on-corrupted-statistics' has been replaced. Use 'parquet.ignore-statistics' instead.
2) Configuration property 'hive.parquet.fail-on-corrupted-statistics' is deprecated and should not be used
==========
2022-02-17T17:33:02.684Z WARN http-client-node-manager-42 io.prestosql.metadata.RemoteNodeState Error fetching node state from http://34.2.37.165:4444/v1/info/state: Server refused connection: http://34.2.37.165:4444/v1/info/state
2022-02-17T17:33:02.684Z WARN http-client-node-manager-40 io.prestosql.metadata.RemoteNodeState Error fetching node state from http://34.2.37.240:4444/v1/info/state: Server refused connection: http://34.2.37.240:4444/v1/info/state
2022-02-17T17:33:03.650Z INFO main io.prestosql.metadata.StaticCatalogStore -- Added catalog hive using connector hive-hadoop2 --
2022-02-17T17:33:03.650Z INFO main io.prestosql.metadata.StaticCatalogStore -- Loading catalog etc/catalog/jmx.properties --
2022-02-17T17:33:04.128Z INFO main Bootstrap PROPERTY DEFAULT RUNTIME DESCRIPTION
2022-02-17T17:33:04.128Z INFO main Bootstrap jmx.dump-period 10.00s 10.00s
2022-02-17T17:33:04.128Z INFO main Bootstrap jmx.dump-tables [] [java.lang:type=Runtime, presto.execution.scheduler:name=NodeScheduler]
2022-02-17T17:33:04.128Z INFO main Bootstrap jmx.max-entries 86400 86400
2022-02-17T17:33:04.355Z INFO main io.prestosql.metadata.StaticCatalogStore -- Added catalog jmx using connector jmx --
2022-02-17T17:33:04.355Z INFO main io.prestosql.metadata.StaticCatalogStore -- Loading catalog etc/catalog/memory.properties --
2022-02-17T17:33:04.762Z INFO main Bootstrap PROPERTY DEFAULT RUNTIME DESCRIPTION
2022-02-17T17:33:04.762Z INFO main Bootstrap memory.enable-lazy-dynamic-filtering true true
2022-02-17T17:33:04.762Z INFO main Bootstrap memory.max-data-per-node 128MB 4GB
2022-02-17T17:33:04.763Z INFO main Bootstrap memory.splits-per-node 20 20
2022-02-17T17:33:04.971Z INFO main io.prestosql.metadata.StaticCatalogStore -- Added catalog memory using connector memory --
2022-02-17T17:33:04.971Z INFO main io.prestosql.metadata.StaticCatalogStore -- Loading catalog etc/catalog/phoenix.properties --
2022-02-17T17:33:05.545Z INFO main stderr log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.Shell).
2022-02-17T17:33:05.545Z INFO main stderr log4j:WARN Please initialize the log4j system properly.
2022-02-17T17:33:05.545Z INFO main stderr log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
2022-02-17T17:33:05.705Z INFO main Bootstrap PROPERTY DEFAULT RUNTIME DESCRIPTION
2022-02-17T17:33:05.705Z INFO main Bootstrap aggregation-pushdown.enabled true true Enable aggregation pushdown
2022-02-17T17:33:05.705Z INFO main Bootstrap allow-drop-table true true Allow connector to drop tables
2022-02-17T17:33:05.705Z INFO main Bootstrap domain-compaction-threshold 32 32 Maximum ranges to allow in a tuple domain without compacting it
2022-02-17T17:33:05.705Z INFO main Bootstrap unsupported-type-handling IGNORE IGNORE Unsupported type handling strategy
2022-02-17T17:33:05.705Z INFO main Bootstrap case-insensitive-name-matching false false
2022-02-17T17:33:05.705Z INFO main Bootstrap case-insensitive-name-matching.cache-ttl 1.00m 1.00m
2022-02-17T17:33:05.705Z INFO main Bootstrap phoenix.connection-url ---- jdbc:phoenix:master01:2181:/hbase-unsecure
2022-02-17T17:33:05.705Z INFO main Bootstrap phoenix.config.resources [] [/opt/mcspace/mass_hbase/hbase/conf/hbase-site.xml]
2022-02-17T17:33:06.368Z INFO main io.airlift.bootstrap.LifeCycleManager Life cycle starting...
2022-02-17T17:33:06.369Z INFO main io.airlift.bootstrap.LifeCycleManager Life cycle startup complete
2022-02-17T17:33:06.370Z INFO main io.prestosql.metadata.StaticCatalogStore -- Added catalog phoenix using connector phoenix --
2022-02-17T17:33:06.373Z INFO main io.prestosql.security.AccessControlManager Using system access control default
2022-02-17T17:33:06.407Z INFO main io.prestosql.server.Server ======== SERVER STARTED ========
2022-02-17T17:33:07.683Z WARN http-client-node-manager-39 io.prestosql.metadata.RemoteNodeState Error fetching node state from http://34.2.37.165:4444/v1/info/state: Server refused connection: http://34.2.37.165:4444/v1/info/state
2022-02-17T17:33:07.686Z INFO node-state-poller-0 io.prestosql.metadata.DiscoveryNodeManager Previously active node is missing: worker01 (last seen at 34.2.37.165)
2022-02-17T17:33:07.686Z INFO node-state-poller-0 io.prestosql.metadata.DiscoveryNodeManager Previously active node is missing: worker04 (last seen at 34.2.37.240)
2022-02-17T17:33:07.687Z WARN http-client-node-manager-42 io.prestosql.metadata.RemoteNodeState Error fetching node state from http://34.2.37.240:4444/v1/info/state: Server refused connection: http://34.2.37.240:4444/v1/info/state
2022-02-17T17:33:22.101Z INFO Thread-44 io.airlift.bootstrap.LifeCycleManager JVM is shutting down, cleaning up
2022-02-17T17:33:22.101Z INFO Thread-40 io.airlift.bootstrap.LifeCycleManager JVM is shutting down, cleaning up
As we can see at the end, the Presto agent is shutting down.
We can also see the same pattern repeating:
grep "JVM is shutting down" server.log | tail -20
2022-02-17T17:28:47.612Z INFO Thread-40 io.airlift.bootstrap.LifeCycleManager JVM is shutting down, cleaning up
2022-02-17T17:28:47.612Z INFO Thread-44 io.airlift.bootstrap.LifeCycleManager JVM is shutting down, cleaning up
2022-02-17T17:29:48.603Z INFO Thread-40 io.airlift.bootstrap.LifeCycleManager JVM is shutting down, cleaning up
2022-02-17T17:29:48.603Z INFO Thread-44 io.airlift.bootstrap.LifeCycleManager JVM is shutting down, cleaning up
2022-02-17T17:31:20.123Z INFO Thread-45 io.airlift.bootstrap.LifeCycleManager JVM is shutting down, cleaning up
2022-02-17T17:32:21.108Z INFO Thread-44 io.airlift.bootstrap.LifeCycleManager JVM is shutting down, cleaning up
2022-02-17T17:32:21.109Z INFO Thread-40 io.airlift.bootstrap.LifeCycleManager JVM is shutting down, cleaning up
2022-02-17T17:33:22.101Z INFO Thread-44 io.airlift.bootstrap.LifeCycleManager JVM is shutting down, cleaning up
2022-02-17T17:33:22.101Z INFO Thread-40 io.airlift.bootstrap.LifeCycleManager JVM is shutting down, cleaning up
2022-02-17T17:34:23.307Z INFO Thread-44 io.airlift.bootstrap.LifeCycleManager JVM is shutting down, cleaning up
2022-02-17T17:34:23.307Z INFO Thread-40 io.airlift.bootstrap.LifeCycleManager JVM is shutting down, cleaning up
2022-02-17T17:35:24.360Z INFO Thread-44 io.airlift.bootstrap.LifeCycleManager JVM is shutting down, cleaning up
2022-02-17T17:35:24.360Z INFO Thread-40 io.airlift.bootstrap.LifeCycleManager JVM is shutting down, cleaning up
2022-02-17T17:36:25.357Z INFO Thread-48 io.airlift.bootstrap.LifeCycleManager JVM is shutting down, cleaning up
2022-02-17T17:36:25.357Z INFO Thread-52 io.airlift.bootstrap.LifeCycleManager JVM is shutting down, cleaning up
2022-02-17T17:41:00.359Z INFO Thread-40 io.airlift.bootstrap.LifeCycleManager JVM is shutting down, cleaning up
2022-02-17T17:41:00.359Z INFO Thread-44 io.airlift.bootstrap.LifeCycleManager JVM is shutting down, cleaning up
2022-02-17T17:42:01.268Z INFO Thread-40 io.airlift.bootstrap.LifeCycleManager JVM is shutting down, cleaning up
2022-02-17T17:42:01.268Z INFO Thread-44 io.airlift.bootstrap.LifeCycleManager JVM is shutting down, cleaning up
2022-02-17T17:43:02.372Z INFO Thread-44 io.airlift.bootstrap.LifeCycleManager JVM is shutting down, cleaning up
So the Presto agent restarts roughly every minute.
What could be the reason for this unstable Presto agent?
I should also add that the servers have no memory shortage and no full partitions.
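To put a number on the restart interval, the shutdown timestamps from the grep output can be turned into gaps in seconds. The following is only a minimal sketch: the function name `shutdown_gaps` and the one-second de-duplication window are my own assumptions, and the sample lines are copied from the grep output above. It treats two "JVM is shutting down" lines that fall within one second of each other as a single shutdown, since the log shows them in pairs from two threads.

```python
from datetime import datetime

# Sample lines copied from the grep output in this question.
SAMPLE_LOG = """\
2022-02-17T17:28:47.612Z INFO Thread-40 io.airlift.bootstrap.LifeCycleManager JVM is shutting down, cleaning up
2022-02-17T17:28:47.612Z INFO Thread-44 io.airlift.bootstrap.LifeCycleManager JVM is shutting down, cleaning up
2022-02-17T17:29:48.603Z INFO Thread-40 io.airlift.bootstrap.LifeCycleManager JVM is shutting down, cleaning up
2022-02-17T17:29:48.603Z INFO Thread-44 io.airlift.bootstrap.LifeCycleManager JVM is shutting down, cleaning up
2022-02-17T17:31:20.123Z INFO Thread-45 io.airlift.bootstrap.LifeCycleManager JVM is shutting down, cleaning up
2022-02-17T17:32:21.108Z INFO Thread-44 io.airlift.bootstrap.LifeCycleManager JVM is shutting down, cleaning up
2022-02-17T17:32:21.109Z INFO Thread-40 io.airlift.bootstrap.LifeCycleManager JVM is shutting down, cleaning up
"""


def shutdown_gaps(log_text, marker="JVM is shutting down"):
    """Return the seconds between consecutive distinct shutdown events."""
    stamps = []
    for line in log_text.splitlines():
        if marker not in line:
            continue
        # The first whitespace-separated field is the UTC timestamp.
        ts = datetime.strptime(line.split()[0], "%Y-%m-%dT%H:%M:%S.%fZ")
        # Assumption: lines less than one second apart belong to the
        # same shutdown (they are logged in pairs by two threads).
        if not stamps or (ts - stamps[-1]).total_seconds() > 1.0:
            stamps.append(ts)
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]


if __name__ == "__main__":
    for gap in shutdown_gaps(SAMPLE_LOG):
        print(f"{gap:.3f} s since previous shutdown")
```

Running the same function over the full server.log (for example with `open("server.log").read()`) should show whether the gaps cluster tightly around 60 seconds, which would point to something periodic stopping the process rather than a crash at a random moment.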
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow