I need to modify LD_LIBRARY_PATH, JAVA_LIBRARY_PATH, and CLASSPATH before running a Hadoop job on my cluster. In LD_LIBRARY_PATH and JAVA_LIBRARY_PATH I need to add the location of some jars that are required while running the job. These jars are already present on my cluster nodes, and the same applies to CLASSPATH.
I have a 3-node cluster. I need to modify LD_LIBRARY_PATH and CLASSPATH on all 3 data nodes so that the jars already available on each node are added to the classpath and visible while the job runs; I want to avoid shipping jars with every job submission and instead use the copies already on the cluster nodes. I have tried the options below.
1. I tried modifying hadoop-env.sh to change CLASSPATH,
but that modifies HADOOP_CLASSPATH, not CLASSPATH.
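For reference, this is the kind of line that goes into hadoop-env.sh (the jar directory here is only an assumption based on the paths in this question); note that it extends HADOOP_CLASSPATH, which affects the Hadoop daemons and client, not the plain CLASSPATH environment variable of the task JVMs:

```shell
# hadoop-env.sh -- appends extra jars to Hadoop's own classpath
# (HADOOP_CLASSPATH), not to the generic CLASSPATH of child task JVMs
export HADOOP_CLASSPATH=/opt/oracle/oraloader-2.0.0-2/lib/*:$HADOOP_CLASSPATH
```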
2. For LD_LIBRARY_PATH and JAVA_LIBRARY_PATH, I tried adding the property below to mapred-site.xml, as suggested at my workplace, but that didn't work:
<property>
  <name>mapred.child.env</name>
  <value>JAVA_LIBRARY_PATH=/opt/oracle/oraloader-2.0.0-2/lib/,LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/oracle/oraloader-2.0.0-2/lib/</value>
  <description>User added environment variables for the task tracker child processes. Example: 1) A=foo sets the env variable A to foo. 2) B=$B:c inherits the tasktracker's B env variable.</description>
</property>
I have also restarted all 3 data nodes, all tasktrackers, and both namenodes. Still, these variables are not set, and my Hadoop job cannot find the jar files it needs to run.
When I run echo $HADOOP_CLASSPATH on my cluster nodes, all the jars required for running the Hadoop job show up. But I think these libraries also need to appear in JAVA_LIBRARY_PATH, and they are not showing up there.
Do not reinvent the wheel.
If your implementation uses ToolRunner (which you really should use if you implement MapReduce in Java), then you can use -libjars jar1,jar2 to ship your jars to the cluster.
Check out the "Side Data Distribution" section in "Hadoop: The Definitive Guide" by Tom White.
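As a minimal sketch (class name and job wiring are placeholders, not from this question), a ToolRunner-based driver looks like this. ToolRunner invokes GenericOptionsParser, which strips standard options such as -libjars, -files, and -D from the arguments and applies them to the Configuration before run() is called:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

// Hypothetical driver class; ToolRunner handles -libjars/-files/-D for you.
public class MyJobDriver extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        // getConf() already reflects any -libjars / -D options passed
        // on the command line; build and submit the Job from it here.
        Configuration conf = getConf();
        // ... job setup and submission ...
        return 0;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new Configuration(), new MyJobDriver(), args));
    }
}
```

It would then be launched with something like `hadoop jar myjob.jar MyJobDriver -libjars /opt/oracle/oraloader-2.0.0-2/lib/some.jar input output` (the jar file name here is illustrative), and the listed jars are copied to the cluster and placed on the task classpath automatically.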