
Python in worker has different version 3.7 than that in driver 3.6, PySpark cannot run with different minor versions


The error came up as follows: I was using Anaconda, whose default interpreter is Python 3.7. I then downloaded Python 3.6 and switched the project interpreter in IDEA to Python 3.6, after which the following error was reported.

    Exception: Python in worker has different version 3.7 than that in driver 3.6, PySpark cannot run with different minor versions. Please check environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are correctly set.
	at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.handlePythonException(PythonRunner.scala:298)
	at org.apache.spark.api.python.PythonRunner$$anon$1.read(PythonRunner.scala:438)
	at org.apache.spark.api.python.PythonRunner$$anon$1.read(PythonRunner.scala:421)
	at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.hasNext(PythonRunner.scala:252)
	at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
	at scala.collection.Iterator$GroupedIterator.fill(Iterator.scala:1126)
	at scala.collection.Iterator$GroupedIterator.hasNext(Iterator.scala:1132)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
	at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
	at org.apache.spark.scheduler.Task.run(Task.scala:109)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

    Driver stacktrace:
	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1602)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1590)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1589)
	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
	at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1589)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831)
	at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831)
	at scala.Option.foreach(Option.scala:257)
	at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:831)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1823)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1772)
	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1761)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
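Before changing any configuration, it helps to confirm which interpreter the driver is actually running. A minimal check with plain Python:

    import sys

    # Where the driver's interpreter lives and which version it is;
    # the worker side is whatever PYSPARK_PYTHON resolves to.
    print(sys.executable)
    print(sys.version)

If this prints 3.6.x while the traceback says the worker is on 3.7, the two sides are indeed resolving to different Python installations.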

The steps to resolve it are as follows:

Method 1: configure PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON for the current project so that the driver and the workers use the same Python 3.6 interpreter. Once this is configured, the job runs normally; a sketch follows.
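A minimal sketch of this per-project fix, done at the top of the script before the SparkContext is created. The interpreter path is a placeholder, so substitute the actual path of your Python 3.6 installation (for example, an Anaconda environment):

    import os
    from pyspark import SparkContext

    # Point both the workers and the driver at the same interpreter.
    # "/path/to/python3.6/bin/python" is a placeholder path.
    os.environ["PYSPARK_PYTHON"] = "/path/to/python3.6/bin/python"
    os.environ["PYSPARK_DRIVER_PYTHON"] = "/path/to/python3.6/bin/python"

    sc = SparkContext(appName="version_check")
    # A tiny job that launches a worker, so any remaining version
    # mismatch would surface immediately.
    print(sc.parallelize([1, 2, 3]).map(lambda x: x + 1).collect())
    sc.stop()

Setting the variables in code works because PySpark reads PYSPARK_PYTHON when it launches worker processes, which happens only after these lines have run.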

Method 2: the method above only takes effect for the current project. If you want the fix to apply to all projects, set PYSPARK_PYTHON in the system environment variables, as shown below:
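For the system-wide variant, define PYSPARK_PYTHON (and, if desired, PYSPARK_DRIVER_PYTHON) pointing at the Python 3.6 executable. A value such as C:\Anaconda3\envs\py36\python.exe is only an example; adjust it to your own machine. After restarting the IDE so it picks up the new variable, a quick way to confirm it is visible to newly started processes:

    import os

    # Should print the interpreter path configured system-wide;
    # None means the variable has not been picked up yet.
    print(os.environ.get("PYSPARK_PYTHON"))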
