In this tutorial I show how to use Java to interact with your Hadoop Distributed File System (HDFS) through Hadoop's FileSystem API.
This Java program creates a file named hadoop.txt, writes a short message into it, then reads it back and prints it to the screen. If the file already exists, it is deleted first.
import java.io.IOException;

// Hadoop configuration and FileSystem API classes
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.Path;

public class HDFSExample {

    public static final String FileName = "hadoop.txt";
    public static final String message = "My First Hadoop API call!\n";

    public static void main(String[] args) throws IOException {
        // Load the default Hadoop configuration found on the classpath
        Configuration conf = new Configuration();

        // Get the FileSystem implementation that the configuration points to
        FileSystem fs = FileSystem.get(conf);

        // Path of the file inside that file system
        Path filenamePath = new Path(FileName);

        try {
            // If the file already exists, remove it first (non-recursive delete)
            if (fs.exists(filenamePath)) {
                fs.delete(filenamePath, false);
            }

            // Write the message to the file
            FSDataOutputStream out = fs.create(filenamePath);
            out.writeUTF(message);
            out.close();

            // Open the file again, read the message back and print it
            FSDataInputStream in = fs.open(filenamePath);
            String messageIn = in.readUTF();
            System.out.print(messageIn);
            in.close();
        } catch (IOException ioe) {
            System.err.println("IOException during operation: " + ioe.toString());
            System.exit(1);
        }
    }
}
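One detail of the program above is easy to miss: FSDataOutputStream extends java.io.DataOutputStream, so writeUTF does not write plain text. It writes a two-byte length prefix followed by the string in modified UTF-8, which is why the file has to be read back with readUTF. The sketch below shows that encoding using only the JDK, with an in-memory buffer standing in for the HDFS file; the class name WriteUtfDemo and its helper methods are made up for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper class, not part of Hadoop.
class WriteUtfDemo {

    static final String MESSAGE = "My First Hadoop API call!\n";

    // Encode text the way DataOutputStream.writeUTF does.
    static byte[] encode(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Decode bytes that were produced by writeUTF.
    static String decode(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = encode(MESSAGE);
        // The first two bytes hold the encoded length as an unsigned 16-bit value.
        int prefix = ((bytes[0] & 0xFF) << 8) | (bytes[1] & 0xFF);
        System.out.println("characters in message: " + MESSAGE.length());
        System.out.println("bytes written:         " + bytes.length);
        System.out.println("length prefix:         " + prefix);
        System.out.print(decode(bytes));
    }
}
```

If the file should be readable as ordinary text by other tools, writing the raw bytes with out.write(message.getBytes(StandardCharsets.UTF_8)) is one alternative to writeUTF.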
For more information:
Complete JavaDoc for the HDFS API is provided at http://wiki.apache.org/hadoop/LibHDFS
Thanks, one issue with this is that it doesn’t compile.
theFilename is undeclared. I assume this is the same variable as FileName?
I modified the two names to agree, and also modded the code to add the core-site.xml config file from my production HDFS instance, like this:
conf.addResource(new Path("/usr/lib/hadoop/conf/core-site.xml"));
and it worked. Thanks for the leg up! But you’re gonna wanna fix that variable name issue.
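A likely reason the extra addResource call matters: when no core-site.xml is found on the classpath, Configuration falls back to Hadoop's built-in defaults, where the default file system is the local one (file:///), so FileSystem.get(conf) does not talk to HDFS at all. To see which file system a given core-site.xml would select, a small JDK-only sketch like the one below can read the property out of the XML. The class name CoreSiteReader, the sample XML and the host namenode.example.com are invented for illustration; the property is called fs.defaultFS in newer Hadoop releases and fs.default.name in older ones.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper class, not part of Hadoop.
class CoreSiteReader {

    // Made-up sample in the usual Hadoop <configuration>/<property> layout.
    static final String SAMPLE =
        "<?xml version=\"1.0\"?>"
        + "<configuration>"
        + "  <property>"
        + "    <name>fs.defaultFS</name>"
        + "    <value>hdfs://namenode.example.com:8020</value>"
        + "  </property>"
        + "</configuration>";

    // Returns the value of the named property, or null if it is not set.
    static String property(InputStream xml, String name) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().parse(xml);
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element p = (Element) props.item(i);
                if (name.equals(childText(p, "name"))) {
                    return childText(p, "value");
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse configuration XML", e);
        }
    }

    // Convenience overload for XML held in a String.
    static String property(String xml, String name) {
        return property(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), name);
    }

    // Trimmed text of the first child element with the given tag, or null.
    private static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        System.out.println("fs.defaultFS = " + property(SAMPLE, "fs.defaultFS"));
    }
}
```

To check a real installation, pass a FileInputStream for the core-site.xml in question instead of the sample string.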
Yes correct!
I am getting the following error:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/logging/LogFactory
    at org.apache.hadoop.conf.Configuration.<clinit>(Configuration.java:153)
    at pkgHdfs.HDFSClient.main(HDFSClient.java:192)
Caused by: java.lang.ClassNotFoundException: org.apache.commons.logging.LogFactory
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
    ... 2 more
Java Result: 1
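A NoClassDefFoundError like this generally means the named class (here org.apache.commons.logging.LogFactory, which ships in the commons-logging jar) was not on the runtime classpath, even if the code compiled. A quick way to confirm what is and is not visible at run time is a small check like the sketch below. The class name ClasspathCheck and the list of class names to test are only an illustration; adjust the list to whatever your program needs.

```java
// Hypothetical diagnostic helper, not part of Hadoop.
class ClasspathCheck {

    // True if the class can be located by this class's loader; does not initialize it.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Classes the HDFS example depends on, directly or indirectly.
        String[] required = {
            "org.apache.commons.logging.LogFactory",
            "org.apache.hadoop.conf.Configuration",
            "org.apache.hadoop.fs.FileSystem"
        };
        for (String name : required) {
            System.out.println((isOnClasspath(name) ? "found    " : "MISSING  ") + name);
        }
    }
}
```

Run it with the same classpath as the failing program; any line marked MISSING points to a jar that still needs to be added.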
Hi,
Can you please suggest a way to read the contents of a file in RC file format using Java?