Spark / SPARK-2356

Exception: Could not locate executable null\bin\winutils.exe in the Hadoop binaries


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Duplicate
    • Affects Version/s: 1.0.0, 1.1.1, 1.2.1, 1.2.2, 1.3.1, 1.4.0, 1.4.1, 1.5.0, 1.5.1, 1.5.2
    • Fix Version/s: None
    • Component/s: Windows
    • Labels: None

    Description

      I'm trying to run some transformations on Spark. They work fine on a cluster (YARN, Linux machines). However, when I try to run them on a local machine (Windows 7) under a unit test, I get the following errors (I don't use Hadoop; I read files from the local filesystem):

      14/07/02 19:59:31 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
      14/07/02 19:59:31 ERROR Shell: Failed to locate the winutils binary in the hadoop binary path
      java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
      	at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:318)
      	at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:333)
      	at org.apache.hadoop.util.Shell.<clinit>(Shell.java:326)
      	at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:76)
      	at org.apache.hadoop.security.Groups.parseStaticMapping(Groups.java:93)
      	at org.apache.hadoop.security.Groups.<init>(Groups.java:77)
      	at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:240)
      	at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:255)
      	at org.apache.hadoop.security.UserGroupInformation.setConfiguration(UserGroupInformation.java:283)
      	at org.apache.spark.deploy.SparkHadoopUtil.<init>(SparkHadoopUtil.scala:36)
      	at org.apache.spark.deploy.SparkHadoopUtil$.<init>(SparkHadoopUtil.scala:109)
      	at org.apache.spark.deploy.SparkHadoopUtil$.<clinit>(SparkHadoopUtil.scala)
      	at org.apache.spark.SparkContext.<init>(SparkContext.scala:228)
      	at org.apache.spark.SparkContext.<init>(SparkContext.scala:97)
      

      This happens because the Hadoop configuration is initialized every time a SparkContext is created, regardless of whether Hadoop is required or not.
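
      A minimal sketch of the scenario (the app name and file path below are illustrative, not taken from the actual project): merely constructing a SparkContext with a local master runs the Hadoop static initializers shown in the stack trace above, even though only the local filesystem is read.

          import org.apache.spark.{SparkConf, SparkContext}

          object LocalRepro {
            def main(args: Array[String]): Unit = {
              // No cluster, no HDFS: master is local and the input is a plain Windows file.
              val conf = new SparkConf().setMaster("local[2]").setAppName("winutils-repro")
              // SparkContext -> SparkHadoopUtil -> UserGroupInformation -> Shell.<clinit> runs here
              // and fails to find winutils.exe when HADOOP_HOME / hadoop.home.dir is not set.
              val sc = new SparkContext(conf)
              try {
                println(sc.textFile("C:/data/input.txt").count())
              } finally {
                sc.stop()
              }
            }
          }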

      I propose adding a special flag to indicate whether the Hadoop configuration is required (or making it possible to start this configuration manually).
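
      As a side note, a common workaround (not the change proposed above): Hadoop's Shell class resolves winutils.exe via the hadoop.home.dir system property, falling back to the HADOOP_HOME environment variable, so pointing either one at a directory that contains bin\winutils.exe avoids the error. The C:/hadoop path below is illustrative.

          // e.g. inserted at the top of main() in the sketch above:
          // must run before the first SparkContext is created, because Hadoop's
          // Shell class caches the winutils lookup in a static initializer.
          System.setProperty("hadoop.home.dir", "C:/hadoop")  // expects C:/hadoop/bin/winutils.exe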

      Attachments

        Issue Links

        Activity


          People

            Assignee: Unassigned
            Reporter: Kostiantyn Kudriavtsev
            Votes: 17
            Watchers: 27

            Dates

              Created:
              Updated:
              Resolved:
