Hadoop File Already Exists Exception
org.apache.hadoop.mapred.FileAlreadyExistsException
Hello folks!
The aim of this article is to make developers aware of an issue they may face while developing a MapReduce application. The error above, "org.apache.hadoop.mapred.FileAlreadyExistsException", is one of the most basic exceptions that nearly every beginner hits while writing a first MapReduce program.
Exception in thread "main" org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://localhost:9000/home/facebook/crawler-output already exists
at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:146)
at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:269)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:142)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1308)
at com.wagh.wordcountjob.WordCount.main(WordCount.java:68)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Let's start from scratch.
To run a MapReduce job, you use a command similar to the one below:
$ hadoop jar {name_of_the_jar_file.jar} {fully_qualified_main_class} {hdfs_input_path} {output_directory_path}
Example: hadoop jar facebookCrawler.jar com.wagh.wordcountjob.WordCount /home/facebook/facebook-cocacola-page.txt /home/facebook/crawler-output
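Pay attention to the {output_directory_path}, i.e. /home/facebook/crawler-output. If this directory already exists in HDFS, whether you created it yourself or an earlier run of the job left it there, Hadoop throws "org.apache.hadoop.mapred.FileAlreadyExistsException". As the stack trace shows, the check happens in FileOutputFormat.checkOutputSpecs when the job is submitted, before any map or reduce task runs.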
Solution: Always specify an output directory that does not yet exist in HDFS. Hadoop creates the directory for you when the job runs, so you do not need to create it beforehand.
For the example above, the same command can be run as follows: "hadoop jar facebookCrawler.jar com.wagh.wordcountjob.WordCount /home/facebook/facebook-cocacola-page.txt /home/facebook/crawler-output-1"
The output directory crawler-output-1 is then created at run time by Hadoop.
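If you rerun a job often, picking a fresh directory name by hand gets tedious. One possible approach, shown here only as a minimal sketch, is to append a timestamp to a base path so that each run gets its own output directory. The class name UniqueOutputPath, the method withTimestamp, and the suffix format are assumptions made for illustration; they are not part of Hadoop or of the original WordCount job. The sketch uses only the Java standard library and does not check HDFS itself, so two runs started within the same second would still produce the same name.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper: builds output directory names that are unlikely to exist yet. */
class UniqueOutputPath {

    // Suffix pattern (an assumption for this sketch), e.g. 20240131-154502.
    private static final DateTimeFormatter SUFFIX =
            DateTimeFormatter.ofPattern("yyyyMMdd-HHmmss");

    /** Appends "-<timestamp>" to the base path, dropping one trailing slash first. */
    static String withTimestamp(String basePath, LocalDateTime now) {
        String trimmed = basePath.length() > 1 && basePath.endsWith("/")
                ? basePath.substring(0, basePath.length() - 1)
                : basePath;
        return trimmed + "-" + now.format(SUFFIX);
    }

    public static void main(String[] args) {
        // Base path taken from the article's example unless one is passed in.
        String base = args.length > 0 ? args[0] : "/home/facebook/crawler-output";
        String output = withTimestamp(base, LocalDateTime.now());

        // Print the command to run; the job itself is not started here.
        System.out.println("hadoop jar facebookCrawler.jar com.wagh.wordcountjob.WordCount "
                + "/home/facebook/facebook-cocacola-page.txt " + output);
    }
}
```

In a real driver you could call withTimestamp on the output argument before passing it to the job, so every submission writes to a new directory and the existence check at submission time passes.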
Hi Rahul,
I am facing the same problem with "yarn-cluster"; in local mode it works fine. In cluster mode the first node works fine, but after the first node finishes, the other nodes throw this exception: FileAlreadyExistsException.
Any idea about this?
Thanks,
Ajeet
Hi Ajeet,
Based on your description, check whether the directory already exists on the other nodes. Because of replication, the directory might exist there. Try removing the directory from the other nodes and then specify the output directory at run time. I hope that solves your problem.
Regards
Rahul wagh
Hi!
Please can you help? You are my only hope...
I am completely new to this, and even after following the advice above I still can't get the right file in my output. It's driving me crazy! Can you help?
Hello, thanks for your comment.
In my experience, the issue is probably related to the output file path.
Remember never to specify an output directory that you have already created in HDFS.
Example: /your_path/output_directory
So "output_directory" should never already exist in HDFS. Just give it a new name at run time and you will not get the exception.
Let me know how it goes.
Regards
Rahul wagh