Tuesday, November 11, 2014

MapR SingleNode in Ubuntu

Creating a single-node instance of MapR on Ubuntu.
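A rough outline of the steps, assuming the MapR apt repository for your MapR version has already been added per MapR's documentation; the cluster name my.cluster.com and the disk list /tmp/disks.txt are placeholders, and the service packages may need adjusting for your setup:

# install the core single-node services (CLDB, MapR-FS fileserver, ZooKeeper, web UI)
>> sudo apt-get update
>> sudo apt-get install mapr-cldb mapr-fileserver mapr-zookeeper mapr-webserver

# point the node at itself for CLDB (7222) and ZooKeeper (5181) and name the cluster
>> sudo /opt/mapr/server/configure.sh -C `hostname -f`:7222 -Z `hostname -f`:5181 -N my.cluster.com

# hand MapR-FS the raw disks/partitions listed (one per line) in /tmp/disks.txt
>> sudo /opt/mapr/server/disksetup -F /tmp/disks.txt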


Saturday, November 1, 2014

Using Pig to Load and Store data from HBase

Let's first store data from HDFS into our HBase table. For this we will be using

org.apache.pig.backend.hadoop.hbase
Class HBaseStorage

public HBaseStorage(String columnList) throws org.apache.commons.cli.ParseException, IOException


Warning: Make sure that your PIG_CLASSPATH includes the library JARs from HBase, Hadoop and ZooKeeper. Doing this will save you countless hours of debugging.
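Something along these lines in your shell profile usually does the trick (the *_HOME paths below are assumptions; point them at wherever your installs actually live):

>> export HADOOP_HOME=/usr/local/hadoop
>> export HBASE_HOME=/usr/local/hbase
>> export ZOOKEEPER_HOME=/usr/local/zookeeper
>> export PIG_CLASSPATH="$HBASE_HOME/lib/*:$HBASE_HOME/conf:$HADOOP_HOME/conf:$ZOOKEEPER_HOME/*:$PIG_CLASSPATH"

Including the conf directories matters too, so that Pig picks up hbase-site.xml and knows where your ZooKeeper quorum is.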

Let's create an HBase table named testtable for the data given below.

Make sure that the first column of your relation is the row key when inserting into an HBase table; HBaseStorage uses the first field of each tuple as the row key and maps the remaining fields onto the listed columns.



Let's create a table for this data in HBase.

>> cd $HBASE_HOME/bin
>> ./hbase shell

This will take you into the HBase shell.

>> create 'testtable','cf'
>> list 'testtable'
>> scan 'testtable'

Now let's fire up the Grunt shell.

Type the following commands into the Grunt shell.
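A minimal sketch of the store step, assuming the input is a two-column CSV on HDFS at /user/hduser/testdata.csv with fields id and name (the path and field names are placeholders; swap in your own):

-- load the raw file; the first field (id) will become the HBase row key
raw = LOAD '/user/hduser/testdata.csv' USING PigStorage(',') AS (id:chararray, name:chararray);

-- every field after the row key maps, in order, onto the columns listed in HBaseStorage
STORE raw INTO 'hbase://testtable' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:name');

Running scan 'testtable' in the HBase shell afterwards should show one row per input record.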

and TaDa....
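And for the other half of the title, reading the data back out of HBase works the same way; here is a sketch under the same assumed column names:

-- '-loadKey true' returns the row key as the first field of each tuple
data = LOAD 'hbase://testtable'
       USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('cf:name', '-loadKey true')
       AS (id:chararray, name:chararray);

DUMP data;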







Pig Casting and Schema Management

Pig is quite flexible when schemas need to be manipulated.

Consider this data set



Suppose we needed to define a schema after some processing; we could cast the columns to their data types.
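A sketch of what that looks like, assuming a comma-separated file at /user/hduser/people.csv with three fields (the path, field names and types are made up for illustration):

-- with no AS clause, every field comes in as bytearray
raw = LOAD '/user/hduser/people.csv' USING PigStorage(',');

-- cast the positional fields to concrete types and give them names
typed = FOREACH raw GENERATE (int)$0 AS id, (chararray)$1 AS name, (float)$2 AS score;

-- confirm the new schema took hold
DESCRIBE typed;
DUMP typed;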



That's all for today, folks.

Cheers!