1. Description
Running spark-shell fails with the following error:
Exception in thread "main" org.apache.spark.SparkException: When running with master 'yarn' either HADOOP_CONF_DIR or YARN_CONF_DIR must be set in the environment.
at org.apache.spark.deploy.SparkSubmitArguments.error(SparkSubmitArguments.scala:657)
at org.apache.spark.deploy.SparkSubmitArguments.validateSubmitArguments(SparkSubmitArguments.scala:290)
at org.apache.spark.deploy.SparkSubmitArguments.validateArguments(SparkSubmitArguments.scala:251)
at org.apache.spark.deploy.SparkSubmitArguments.<init>(SparkSubmitArguments.scala:120)
at org.apache.spark.deploy.SparkSubmit$$anon$2$$anon$1.<init>(SparkSubmit.scala:907)
at org.apache.spark.deploy.SparkSubmit$$anon$2.parseArguments(SparkSubmit.scala:907)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:81)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
2. Analysis
The spark-shell bundled with HDP runs fine right after installation, but its Spark version is too old. To upgrade to 2.4.4, I downloaded Spark 2.4.4 from Apache; running its {spark2.4.4}/bin/spark-shell then fails with the error above. Tracing the HDP-bundled spark-shell shows that it sources /etc/spark2/2.5.6.0-40/0/spark-env.sh, which sets the HADOOP_CONF_DIR environment variable:
/etc/spark2/2.5.6.0-40/0/spark-env.sh:39:export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-/usr/hdp/current/hadoop-client/conf}
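A quick way to confirm which file sets the variable (a minimal sketch, assuming the standard HDP layout shown above; the version segment in the path may differ on your cluster):

# search every HDP spark2 env script for the variable, with file and line number
grep -rn HADOOP_CONF_DIR /etc/spark2/*/0/spark-env.sh

The vanilla Apache tarball only ships conf/spark-env.sh.template, so nothing sets HADOOP_CONF_DIR and spark-submit's argument validation rejects master 'yarn'. Spark's launch scripts source ${SPARK_HOME}/conf/spark-env.sh through bin/load-spark-env.sh on every start, which is what both fixes below rely on.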
3. Solutions
Method 1: manually run export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-/usr/hdp/current/hadoop-client/conf} before launching spark-shell (see the sketch below).
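For example (a sketch; the export only lives in the current shell session, so append the line to ~/.bashrc or to {spark2.4.4}/conf/spark-env.sh if you want it to persist; --master yarn is written out explicitly on the assumption that YARN mode is intended):

# point Spark at the HDP client Hadoop configs, then launch the 2.4.4 shell
export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-/usr/hdp/current/hadoop-client/conf}
{spark2.4.4}/bin/spark-shell --master yarn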
Method 2: cd into {spark2.4.4}/conf and copy the HDP spark-env.sh there: cp /usr/hdp/2.5.6.0-40/spark2/conf/spark-env.sh . (full sketch below).
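Put together (a sketch, assuming the HDP 2.5.6.0-40 path shown above; adjust the version segment for your release):

cd {spark2.4.4}/conf
cp /usr/hdp/2.5.6.0-40/spark2/conf/spark-env.sh .
# spark-shell now picks up HADOOP_CONF_DIR automatically,
# because bin/load-spark-env.sh sources conf/spark-env.sh on every launch
cd .. && bin/spark-shell --master yarn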