SparkContext Error - File not found /tmp/spark-events does not exist
--
Full question
https://stackoverflow.com/questions/3835...
Answer 2 links:
[How to enable spark-history server for standalone cluster non hdfs mode]: https://stackoverflow.com/questions/4483...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #amazonwebservices #apachespark #amazonec2 #pyspark
#avk47
ACCEPTED ANSWER
Score 45
/tmp/spark-events is the location where Spark stores its event logs. Just create this directory on the master machine and you're set.
$ mkdir /tmp/spark-events
$ sudo /root/spark-ec2/copy-dir /tmp/spark-events/
RSYNC'ing /tmp/spark-events to slaves...
ec2-54-175-163-32.compute-1.amazonaws.com
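For a PySpark driver, the same directory can also be created from Python before the SparkContext is started. This is only a minimal sketch, assuming the event log directory is on the local filesystem; the helper name `ensure_event_log_dir` is hypothetical and not part of Spark.

```python
import os


def ensure_event_log_dir(path="/tmp/spark-events"):
    """Create the event log directory if it is missing and return its path."""
    # exist_ok=True makes the call safe to repeat on every driver start.
    os.makedirs(path, exist_ok=True)
    return path
```

Calling `ensure_event_log_dir()` before constructing the SparkContext only creates the directory on the machine where the script runs; other nodes would still need it copied over as shown above.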
ANSWER 2
Score 10
While trying to set up my Spark history server on my local machine, I had the same 'File file:/tmp/spark-events does not exist.' error. I had customized my log directory to a non-default path. To resolve this, I needed to do 2 things.
- edit $SPARK_HOME/conf/spark-defaults.conf
-- add these 2 lines
spark.history.fs.logDirectory /mycustomdir
spark.eventLog.enabled true
- create a link from /tmp/spark-events to /mycustomdir.
ln -fs /tmp/spark-events /mycustomdir
Ideally, step 1 would have solved my issue entirely, but I still needed to create the link, so I suspect there might have been one other setting I missed. Anyhow, once I did this, I was able to run my history server and see new jobs logged in my web UI.
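One way to look for a missed setting is to read the configuration back and see which log directories are actually set. The sketch below is a simplified parser for spark-defaults.conf style text, not Spark's own loader; the function names are hypothetical, and the fallback to /tmp/spark-events is an assumption taken from the error message above.

```python
import re

# Assumed fallback when a directory setting is absent (taken from the error message).
DEFAULT_DIR = "/tmp/spark-events"


def parse_spark_defaults(text):
    """Parse spark-defaults.conf style text into a dict (simplified)."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # Key and value are separated by whitespace or '='.
        parts = re.split(r"\s*=\s*|\s+", line, maxsplit=1)
        if len(parts) == 2:
            settings[parts[0]] = parts[1].strip()
    return settings


def effective_log_dirs(settings):
    """Return the directory used for writing logs and for the history server."""
    return {
        "spark.eventLog.dir": settings.get("spark.eventLog.dir", DEFAULT_DIR),
        "spark.history.fs.logDirectory": settings.get(
            "spark.history.fs.logDirectory", DEFAULT_DIR
        ),
    }
```

With only the two lines from step 1 as input, `spark.eventLog.dir` is not set and resolves to the fallback, which would be consistent with the link still being needed.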
ANSWER 3
Score 5
Use spark.eventLog.dir for the client/driver program:
spark.eventLog.dir=/usr/local/spark/history
and use spark.history.fs.logDirectory for the history server:
spark.history.fs.logDirectory=/usr/local/spark/history
as mentioned in: How to enable spark-history server for standalone cluster non hdfs mode
This holds at least as of Spark version 2.2.1.
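In a PySpark driver, the client-side property can also be set in code rather than in the configuration file. This is a rough sketch, assuming PySpark is installed and the directory already exists; `event_log_settings`, `build_context`, and the application name are hypothetical names used only for illustration.

```python
def event_log_settings(log_dir="/usr/local/spark/history"):
    """Return the client-side properties that turn on event logging."""
    return {
        "spark.eventLog.enabled": "true",
        "spark.eventLog.dir": log_dir,
    }


def build_context(app_name="event-log-demo", log_dir="/usr/local/spark/history"):
    """Create a SparkContext that writes its event logs to log_dir."""
    # Imported here so event_log_settings() works without PySpark installed.
    from pyspark import SparkConf, SparkContext

    conf = SparkConf().setAppName(app_name)
    # setAll takes a list of (key, value) pairs.
    conf.setAll(list(event_log_settings(log_dir).items()))
    return SparkContext(conf=conf)
```

The history server is a separate process, so spark.history.fs.logDirectory still has to point at the same directory in its own configuration.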
ANSWER 4
Score 1
I just created /tmp/spark-events on the {master} node and then distributed it to the other nodes in the cluster to get it to work.
mkdir /tmp/spark-events
rsync -a /tmp/spark-events {slaves}:/tmp/spark-events
My spark-defaults.conf:
spark.history.ui.port=18080
spark.eventLog.enabled=true
spark.history.fs.logDirectory=hdfs:///home/elon/spark/events