Caused by: org.apache.spark.SparkException: Could not execute broadcast in 300 secs. You can increase the timeout for broadcasts via spark.sql.broadcastTimeout or disable broadcast join by setting spark.sql.autoBroadcastJoinThreshold to -1

刘超 ⋅ 3 months ago

I. Description

  Running a Spark audience job fails with the following error:

User class threw exception: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.opera.adx.job.Stat$$anonfun$main$2$$anonfun$apply$2.apply(Stat.scala:131)
at com.opera.adx.job.Stat$$anonfun$main$2$$anonfun$apply$2.apply(Stat.scala:122)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
at com.opera.adx.job.Stat$$anonfun$main$2.apply(Stat.scala:121)
at com.opera.adx.job.Stat$$anonfun$main$2.apply(Stat.scala:99)
at scala.collection.immutable.List.foreach(List.scala:392)
at com.opera.adx.job.Stat$.main(Stat.scala:99)
at com.opera.adx.job.Stat.main(Stat.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:684)
Caused by: org.apache.spark.SparkException: Could not execute broadcast in 300 secs. You can increase the timeout for broadcasts via spark.sql.broadcastTimeout or disable broadcast join by setting spark.sql.autoBroadcastJoinThreshold to -1
at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec.doExecuteBroadcast(BroadcastExchangeExec.scala:150)
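
II. Analysis
  1. Increased the spark.sql.broadcastTimeout parameter, here to spark.sql.broadcastTimeout=3000 (the value is in seconds; the 300 secs in the error message is the default). The error persists.
  2. Checked the logs and found: 20/06/24 08:03:41 INFO MapOutputTracker: Broadcast mapstatuses size = 434, actual size = 621197. In Spark 2.4.4 this message is printed at line 818 of MapOutputTracker.
  3. Set spark.sql.autoBroadcastJoinThreshold=-1 to disable automatic broadcast joins. The error persists.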
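
The two settings tried in steps 1 and 3 can be applied when the SparkSession is built, and the physical plan can then be printed to see whether a broadcast is still planned. The following is only a minimal sketch, not the code of the original job: the object name, the app name, and the two small DataFrames are hypothetical placeholders, and the master URL is assumed to be supplied by spark-submit.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical sketch: object name, app name and sample data are placeholders.
object BroadcastSettingsSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("broadcast-settings-sketch")
      // Step 1: raise the broadcast timeout (seconds; the default is 300).
      .config("spark.sql.broadcastTimeout", "3000")
      // Step 3: disable size-based automatic broadcast joins.
      .config("spark.sql.autoBroadcastJoinThreshold", "-1")
      .getOrCreate()

    import spark.implicits._

    // Confirm the values that are actually in effect for this session.
    println(spark.conf.get("spark.sql.broadcastTimeout"))
    println(spark.conf.get("spark.sql.autoBroadcastJoinThreshold"))

    // Tiny stand-in tables, only used to inspect the join strategy.
    val left  = Seq((1, "a"), (2, "b")).toDF("id", "l")
    val right = Seq((1, "x"), (2, "y")).toDF("id", "r")

    // Print the physical plan and look for BroadcastExchange nodes.
    left.join(right, "id").explain()

    spark.stop()
  }
}
```

If the plan of the real query still contains a BroadcastExchange after the threshold is set to -1, one possible cause is an explicit broadcast hint in the job (for example org.apache.spark.sql.functions.broadcast), since a hint requests a broadcast regardless of the threshold. Whether that applies to this job is an assumption that the stack trace alone does not confirm.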

