java.util.concurrent.TimeoutException: Heartbeat of TaskManager with id container_e87_1591346769672_60966_01_000006 timed out.

刘超 ⋅ 12 days ago ⋅ 116 views

I. Description

  Flink reports the following error:

2020-07-20 18:11:01,047 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - process (3/6) (154dc0ba06aab8e974980799eeb42aa1) switched from RUNNING to FAILED.
java.util.concurrent.TimeoutException: Heartbeat of TaskManager with id container_e87_1591346769672_60966_01_000006 timed out.
  at org.apache.flink.runtime.jobmaster.JobMaster$TaskManagerHeartbeatListener.notifyHeartbeatTimeout(JobMaster.java:1666)
  at org.apache.flink.runtime.heartbeat.HeartbeatManagerImpl$HeartbeatMonitor.run(HeartbeatManagerImpl.java:318)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:392)
  at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:185)
  at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:74)
  at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.onReceive(AkkaRpcActor.java:147)
  at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.onReceive(FencedAkkaRpcActor.java:40)
  at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:165)
  at akka.actor.Actor$class.aroundReceive(Actor.scala:502)
  at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95)
  at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
  at akka.actor.ActorCell.invoke(ActorCell.scala:495)
  at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
  at akka.dispatch.Mailbox.run(Mailbox.scala:224)
  at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
  at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
  at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
  at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
  at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

II. Solution

  Increase the value of heartbeat.timeout in the Flink configuration (the default is 50000 ms).
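
  As a rough sketch, the setting can be placed in flink-conf.yaml. The value 180000 below is only an illustrative assumption, not a recommendation from the original post; choose a value that fits the pauses you actually observe. Both options are in milliseconds, and heartbeat.interval is shown at its default only to make clear that it should stay well below the timeout.

```yaml
# flink-conf.yaml (sketch; the timeout value is an assumed example)

# How often heartbeat requests are sent, in milliseconds (Flink default: 10000).
heartbeat.interval: 10000

# How long to wait for a heartbeat before the TaskManager is considered lost,
# in milliseconds (Flink default: 50000). Raised here to 3 minutes as an example.
heartbeat.timeout: 180000
```

  The configuration file is read when the Flink processes start, so the new value should only apply to clusters or jobs launched after the change.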

III. References

  1. https://stackoverflow.com/questions/55213150/flink-taskmanager-timeout

  2. https://www.cnblogs.com/createweb/p/12035544.html


Note: This article belongs to the author and may not be reproduced without the author's permission.