Friday, October 31, 2014

Sum in Pig

This is a simple PigScript i wrote to understand the concepts of pig.
Well here it goes

Today we will see how to do a simple sum operation in Pig.

Consider this as my input data
.

The first example - sum
The second example - group and sum

Wednesday, October 29, 2014

Installing R with RStudio in Ubuntu

Helllo people,
lets start the installation by opening your terminal first.

and type in the flowering commands.

sudo apt-get install r-base
sudo apt-get install gdebi-core
sudo apt-get install libapparmor1
wget http://download2.rstudio.org/rstudio-server-0.98.1085-amd64.deb
sudo gdebi rstudio-server-0.98.1085-amd64.deb

and here you go with a brand new R-Studio.


Apache Pig Day 1

Hello People,
I increasingly spending more time working on pig. (Thankgod).
This experience has been very valuable as it has increased my knowledge on data.
Pig is an open source project.
Language for developing pig scripts is called pig latin.
Its easier to write pig script than mapreduce code (Its saves the developers time)
Pig is a dataflow language that means it allows the users to describe data.
Pig Latin Scripts are describes a DAG. Directed Acyclic Graph
Developed by Yahoo!.
Aim to this language is to find a sweet spot between SQL and Shell Scripts.
To sum it up Pig is you to build data pipelines / run ETL workloads.