How to change the file cached by Distributed Cache?

A file in the MapReduce Distributed Cache can be updated in one of two ways: replace the file with a new one, change the pointer to the new location, and restart the MapReduce job; or append the new values to the cached file and restart the job.
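The replace-and-repoint approach can be illustrated with a small local sketch. This is not the Hadoop API: the helper names `publish_new_version` and `resolve_cache_file`, the pointer file, and the `lookup_v<N>.txt` naming scheme are all hypothetical, and a plain local directory stands in for HDFS. The idea is only that each update is written as a new file, the pointer is switched to it, and a restarted job resolves the pointer once at startup.

```python
import os
import tempfile


def publish_new_version(cache_dir, pointer_path, version, content):
    """Write the updated data as a NEW file, then repoint the pointer at it.

    The previous version is left untouched, so a job that is still running
    against it never sees the file change underneath it.
    """
    new_path = os.path.join(cache_dir, f"lookup_v{version}.txt")  # hypothetical naming
    with open(new_path, "w") as f:
        f.write(content)

    # Write the pointer to a temporary file and rename it into place,
    # so readers see either the old pointer or the new one, never a partial write.
    tmp_pointer = pointer_path + ".tmp"
    with open(tmp_pointer, "w") as f:
        f.write(new_path)
    os.replace(tmp_pointer, pointer_path)
    return new_path


def resolve_cache_file(pointer_path):
    """Return the file a (re)started job should register in its cache."""
    with open(pointer_path) as f:
        return f.read().strip()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as cache_dir:
        pointer = os.path.join(cache_dir, "current.ptr")
        publish_new_version(cache_dir, pointer, 1, "country=IN\n")
        publish_new_version(cache_dir, pointer, 2, "country=IN\ncountry=US\n")
        # A job restarted now would pick up version 2.
        print(resolve_cache_file(pointer))
```

In a real cluster the resolved path would be the one handed to the job's cache configuration before the job is resubmitted.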


What is Shuffling in MapReduce?

The Reducer takes the Mapper output, also called intermediate data, as its input, and that data must reach the Reducer sorted by key. For this purpose all the unique keys
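The effect of shuffling can be sketched in a few lines. This is a single-process simulation, not Hadoop code: the function name `shuffle_and_sort` and the sample word-count pairs are assumptions made for illustration. It gathers the intermediate (key, value) pairs from several mappers, groups the values under each unique key, and hands the groups to the reducer in sorted key order.

```python
from collections import defaultdict


def shuffle_and_sort(mapper_outputs):
    """Group intermediate values by key and return the groups sorted by key.

    mapper_outputs: one list of (key, value) pairs per mapper.
    """
    grouped = defaultdict(list)
    for output in mapper_outputs:
        for key, value in output:
            grouped[key].append(value)
    # The reducer sees each unique key exactly once, in sorted order.
    return [(key, grouped[key]) for key in sorted(grouped)]


if __name__ == "__main__":
    # Hypothetical word-count output from two mappers.
    mapper_outputs = [
        [("hadoop", 1), ("big", 1), ("data", 1)],
        [("data", 1), ("hadoop", 1)],
    ]
    for key, values in shuffle_and_sort(mapper_outputs):
        print(key, sum(values))  # a word-count reducer would sum the values
```

On a real cluster the same grouping happens across machines, with each key routed to one reducer by a partitioner; that step is left out here.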


Can we update the file cached by the Distributed Cache?

No. Distributed Cache tracks cached files by their timestamps, so a cached file should not be changed during job execution. Distributed Cache in MapReduce can be updated by replacing the file with a new one and changing the pointer to point to the
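Timestamp tracking can be pictured with a conceptual sketch. The functions `record_timestamp` and `verify_unchanged` are hypothetical and use local file modification times rather than anything from the Hadoop API; they only show why editing a cached file in the middle of a job is detected as an error.

```python
import os
import tempfile


def record_timestamp(path):
    """Remember the file's modification time when the job is submitted."""
    return os.stat(path).st_mtime_ns


def verify_unchanged(path, recorded_ns):
    """Refuse to use the cached file if it was modified after submission."""
    current_ns = os.stat(path).st_mtime_ns
    if current_ns != recorded_ns:
        raise RuntimeError(f"{path} changed after the job was submitted")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        cached = os.path.join(d, "lookup.txt")
        with open(cached, "w") as f:
            f.write("country=IN\n")

        stamp = record_timestamp(cached)
        verify_unchanged(cached, stamp)  # passes: file untouched

        # Simulate an edit during the job by moving the mtime forward one second.
        later = stamp + 1_000_000_000
        os.utime(cached, ns=(later, later))
        try:
            verify_unchanged(cached, stamp)
        except RuntimeError as err:
            print("rejected:", err)
```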


Hadoop Architecture (Article 2 in Hadoop series)

Hadoop architecture is divided into two core layers: one for storage, and the other for the programming or computational part of Hadoop. One is a framework written in Java that allows the system to store the various forms of data generated at


What is Hadoop? (Article 1 in Hadoop series)

Hadoop is the buzzword heard everywhere on the World Wide Web. The moment you open your social networking profile, there are countless ads from prominent educational and training institutes about their expertise in teaching you Hadoop. One must get a doubt


What is Big Data?

What actually is Big Data? Where did Big Data come from? How big is Big Data? Why is everyone so concerned about Big Data? When should you use Big Data? If you are reading this article, you must be one of the
