Saturday, November 1, 2014
Using Pig to Load and Store data from HBase
Let's first store data from HDFS into our HBase table. For this we will be using:
org.apache.pig.backend.hadoop.hbase
Class HBaseStorage
public HBaseStorage(String columnList) throws org.apache.commons.cli.ParseException, IOException
Warning: Make sure that your PIG_CLASSPATH refers to all the library files in HBase, Hadoop and ZooKeeper. Doing this will save you countless hours of debugging.
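One way to set this up (assuming HBASE_HOME, HADOOP_HOME and ZOOKEEPER_HOME point at your installations; adjust the paths for your setup):
>> export PIG_CLASSPATH="$HBASE_HOME/*:$HBASE_HOME/lib/*:$HADOOP_HOME/lib/*:$ZOOKEEPER_HOME/*:$PIG_CLASSPATH"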
Let's create an HBase table named testtable for the data given below.
Make sure that the first column of your data is the ROWKEY when inserting into an HBase table.
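For illustration, assume a comma-delimited file already sitting in HDFS at /user/hadoop/testdata.csv (a hypothetical path and layout), where the first column is the row key:
001,John,25
002,Jane,30
003,Bob,28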
Now let's create the table for this data in HBase.
>> cd $HBASE_HOME/bin
>> ./hbase shell
This will take you to the HBase shell. Create the table with a single column family named 'cf', then list and scan it (the scan will show no rows yet):
>> create 'testtable','cf'
>> list 'testtable'
>> scan 'testtable'
Now let's fire up the Grunt shell.
Type in the following commands in the Grunt shell:
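Here is a minimal sketch of those commands, assuming the hypothetical testdata.csv from above (adjust the path, field names and types to your data). The first field of the relation becomes the HBase row key, and the remaining fields map, in order, to the columns named in HBaseStorage's columnList argument:

-- load the comma-delimited file from HDFS
raw_data = LOAD '/user/hadoop/testdata.csv' USING PigStorage(',')
           AS (id:chararray, name:chararray, age:int);

-- store into HBase; 'id' becomes the row key, the rest map to cf:name and cf:age
STORE raw_data INTO 'hbase://testtable'
      USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:name cf:age');

To read the table back, the same class works as a loader; passing '-loadKey true' returns the row key as the first field:

-- load from HBase, including the row key
hbase_data = LOAD 'hbase://testtable'
             USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:name cf:age', '-loadKey true')
             AS (id:chararray, name:chararray, age:int);
DUMP hbase_data;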
and TaDa....
Tuesday, November 11, 2014
Pig Casting and Schema Management
Pig is quite flexible when schemas need to be manipulated.
Consider this data set:
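For illustration, assume a small comma-delimited file on HDFS at /data/fruits.csv (a hypothetical path and layout):
1,apple,0.50
2,banana,0.25
3,cherry,2.75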
Suppose we needed to define a schema after some processing; we could cast the columns to their data types.
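A minimal sketch against the hypothetical fruits.csv above: when no AS clause is given, every field defaults to bytearray, and the positional fields can later be cast to concrete types:

-- no schema given, so every field is a bytearray
raw = LOAD '/data/fruits.csv' USING PigStorage(',');

-- ...some processing...

-- cast the positional fields to concrete types and give them names
typed = FOREACH raw GENERATE (int)$0 AS id, (chararray)$1 AS name, (float)$2 AS price;

-- DESCRIBE now reports: typed: {id: int,name: chararray,price: float}
DESCRIBE typed;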
That's all for today, folks.
Cheers!