How to change the file cached by Distributed Cache?

A file in the MapReduce Distributed Cache can be changed in one of two ways: replace the file with a new one and point the job at the new location, or append the new values to the cached file. In either case, the MapReduce job must be restarted for the change to take effect.

Note: We cannot update the Distributed Cache while the MapReduce job is running. It becomes a race between the two operations (the update and the running tasks reading the cached file), and both will lose.

We have to restart the job and submit the new DistributedCache data with it. DistributedCache is not persistent between jobs, so each job has to register its cache files again.
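One way to make the "replace the file and point to the new location" approach less error-prone is to keep each version of the file under its own directory and build the cache URI from a version number. The sketch below is only an illustration under assumptions: the class name, the `v<N>` directory layout, and the `hdfs:///cache` base path are all hypothetical and not something Hadoop prescribes. It relies on the documented behavior that the URI fragment (the part after `#`) is used as the link name for the cached file in the task's working directory, so the mapper can keep opening the same name across versions.

```java
import java.net.URI;

// Hypothetical helper: the class name and the "v<N>" directory layout are
// illustrative assumptions, not part of the Hadoop API.
final class CacheFileVersioning {
    private CacheFileVersioning() {}

    /**
     * Builds a URI such as hdfs:///cache/v2/lookup.txt#lookup.txt.
     * The fragment keeps a stable name for the task-side link, so only
     * the version number changes between job submissions.
     */
    static URI versionedUri(String baseDir, String fileName, int version) {
        // Normalize the base directory so it always ends with one slash.
        String dir = baseDir.endsWith("/") ? baseDir : baseDir + "/";
        return URI.create(dir + "v" + version + "/" + fileName + "#" + fileName);
    }
}
```

In the job driver, the resulting URI would be registered before the job is submitted, for example with `job.addCacheFile(CacheFileVersioning.versionedUri("hdfs:///cache", "lookup.txt", 2))` on an `org.apache.hadoop.mapreduce.Job`. To switch to a newer file, upload it under the next version directory, bump the version number, and restart the job.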

Happy Hadooping   :-) 
