POC #: Generate Analytics from a Product-based Company Web Log.




This POC is based on analysing the web log of a product-based company.


Industry: E-Commerce


1. Load weblog data into HDFS using the HDFS client.

2. Develop a Pig program to load the log and perform analytics on IP, Category-1, Category-2, page, status_code (a sample record is shown after this list).

2.1. Count of page views by individual user, i.e. [IP, count(*)]

2.2. Top / Bottom 2: category-1 / category-2 / page / users (exclude status codes other than 200)

Top 2 and bottom 2 records

· Category, total_number_views

· page, total_number_views

· IP, total_number_of_views

2.3. Total page views / Category-wise page views / Unique page views

· page, total_number_of_views

· category, total_views

· page, total_number_of_unique_views

2.4. Count of status code = 200 / 404 / 400 / 500

· status_code, count

3. Load the results into tables in a MySQL database using Sqoop.
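
Each record in weblog.txt is a comma-separated line in the order IP, Category-1, Category-2, page, status_code. An illustrative record (the values here are assumed, not taken from the actual data set) would look like:

192.168.56.101,Electronics,Mobiles,101,200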


CODING DETAILS


Pig Script


weblog.pig


/* Develop Pig program to extract data for the following KPIs */


FILE1 = LOAD '/home/bhavesh/weblog.txt' using PigStorage(',') as (IP:chararray,Category1:chararray,Category2:chararray,page:int,statuscode:int);

GRPD = group FILE1 by IP;  

IPCOUNT = foreach GRPD generate group,COUNT(FILE1.IP);


/* Count of page views by individual user, i.e. IP, count(*) */

STORE IPCOUNT into '/home/bhavesh/Count_by_Pageview_by_IndividualUser/' using PigStorage(',');


/* Exclude status codes other than 200 */


FILE_STATUS_CODE_200 = FILTER FILE1 by statuscode == 200;


/*

Top 2 and bottom 2 records

Category, total_number_views

page, total_number_views

IP, total_number_of_views

*/


GRPD_CATEGORY1 = group FILE_STATUS_CODE_200 by Category1;

CNT_FOR_CATEGORY1 = FOREACH GRPD_CATEGORY1 generate group,COUNT(FILE_STATUS_CODE_200.Category1) as COUNTING;


SORTED_CATEGORY1_DEC = ORDER CNT_FOR_CATEGORY1 by COUNTING DESC;

TOP_2_CATEGORY1 = limit SORTED_CATEGORY1_DEC 2;

STORE TOP_2_CATEGORY1 into '/home/bhavesh/Top2_By_Category1/' using PigStorage(',');


SORTED_CATEGORY1_ASC = ORDER CNT_FOR_CATEGORY1 by COUNTING ASC;

BOTTOM_2_CATEGORY1 = limit SORTED_CATEGORY1_ASC 2;

STORE BOTTOM_2_CATEGORY1 into '/home/bhavesh/Bottom2_By_Category1/' using PigStorage(',');


GRPD_CATEGORY2 = group FILE_STATUS_CODE_200 by Category2;

CNT_FOR_CATEGORY2 = FOREACH GRPD_CATEGORY2 generate group,COUNT(FILE_STATUS_CODE_200.Category2) as COUNTING;


SORTED_CATEGORY2_DEC = ORDER CNT_FOR_CATEGORY2 by COUNTING DESC;

TOP_2_CATEGORY2 = limit SORTED_CATEGORY2_DEC 2;

STORE TOP_2_CATEGORY2 into '/home/bhavesh/Top2_By_Category2/' using PigStorage(',');


SORTED_CATEGORY2_ASC = ORDER CNT_FOR_CATEGORY2 by COUNTING ASC;

BOTTOM_2_CATEGORY2 = limit SORTED_CATEGORY2_ASC 2;

STORE BOTTOM_2_CATEGORY2 into '/home/bhavesh/Bottom2_By_Category2/' using PigStorage(',');


GRPD_PAGES = group FILE_STATUS_CODE_200 by page;

CNT_FOR_PAGE = FOREACH GRPD_PAGES generate group,COUNT(FILE_STATUS_CODE_200.page) as COUNTING;


SORTED_PAGE_DEC = ORDER CNT_FOR_PAGE by COUNTING DESC;

TOP_2_PAGE = limit SORTED_PAGE_DEC 2;

STORE TOP_2_PAGE into '/home/bhavesh/Top2_By_PAGE/' using PigStorage(',');


SORTED_PAGE_ASC = ORDER CNT_FOR_PAGE by COUNTING ASC;

BOTTOM_2_PAGE = limit SORTED_PAGE_ASC 2;

STORE BOTTOM_2_PAGE into '/home/bhavesh/Bottom2_By_PAGE/' using PigStorage(',');


GRPD_IP = group FILE_STATUS_CODE_200 by IP;

CNT_FOR_IP = FOREACH GRPD_IP generate group,COUNT(FILE_STATUS_CODE_200.IP) as COUNTING;


SORTED_IP_DEC = ORDER CNT_FOR_IP by COUNTING DESC;

TOP_2_IP = limit SORTED_IP_DEC 2;

STORE TOP_2_IP into '/home/bhavesh/Top2_By_IP/' using PigStorage(',');


SORTED_IP_ASC = ORDER CNT_FOR_IP by COUNTING ASC;

BOTTOM_2_IP = limit SORTED_IP_ASC 2;

STORE BOTTOM_2_IP into '/home/bhavesh/Bottom2_By_IP/' using PigStorage(',');


/* Total page views / Category wise pageviews / Unique pageviews

page,total_number_of_views

category, total_views

page, total_number_of_unique_views */


GRPD_TOTALPAGES = group FILE1 by page;

CNT_FOR_TOTALPAGE = FOREACH GRPD_TOTALPAGES generate group,COUNT(FILE1.page) as COUNTING;

SORTED_PAGES_DEC = ORDER CNT_FOR_TOTALPAGE by COUNTING DESC;

STORE SORTED_PAGES_DEC into '/home/bhavesh/Total_PAGES_Count/' using PigStorage(',');


GRPD_TOTALCATEGORY1 = group FILE1 by Category1;

CNT_FOR_TOTALCATEGORY1 = FOREACH GRPD_TOTALCATEGORY1 generate group,COUNT(FILE1.page) as COUNTING;

SORTED_TOTALCATEGORY1_DEC = ORDER CNT_FOR_TOTALCATEGORY1 by COUNTING DESC;

STORE SORTED_TOTALCATEGORY1_DEC into '/home/bhavesh/Total_Category1_Count/' using PigStorage(',');


GRPD_TOTALCATEGORY2 = group FILE1 by Category2;

CNT_FOR_TOTALCATEGORY2 = FOREACH GRPD_TOTALCATEGORY2 generate group,COUNT(FILE1.page) as COUNTING;

SORTED_TOTALCATEGORY2_DEC = ORDER CNT_FOR_TOTALCATEGORY2 by COUNTING DESC;

STORE SORTED_TOTALCATEGORY2_DEC into '/home/bhavesh/Total_Category2_Count/' using PigStorage(',');
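
/* Unique page views: for each page, count the DISTINCT visitor IPs using a nested FOREACH */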


GRPD_TOTALPAGES_UNIQUEVIEW = group FILE1 by page;

CNT_FOR_TOTALPAGE_UNIQUEVIEW = FOREACH GRPD_TOTALPAGES_UNIQUEVIEW {
        internet_protocol = FILE1.IP;
        unique_internet_protocol = DISTINCT internet_protocol;
        GENERATE group, COUNT(unique_internet_protocol);
};

STORE CNT_FOR_TOTALPAGE_UNIQUEVIEW into '/home/bhavesh/Page_Total_Number_Of_Unique_Views/' using PigStorage(',');


/*

Count of status code = 200 / 404 / 400 / 500

status_code, count

*/


GRPD = group FILE1 by statuscode;  

STATUS_CODE_COUNT = foreach GRPD generate group,COUNT(FILE1.statuscode);


STORE STATUS_CODE_COUNT into '/home/bhavesh/Status_Code_Count/' using PigStorage(',');
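
Once the Pig script has run, the stored results can be spot-checked directly in HDFS before the Hive and Sqoop steps below are executed. A minimal sketch (the directory and part-file names match the ones used elsewhere in this POC):

hadoop fs -ls /home/bhavesh/Status_Code_Count
hadoop fs -cat /home/bhavesh/Status_Code_Count/part-r-00000 | head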


Shell Script


#####################################################################
#############################  COMPLETE SCRIPT  ####################
### HEADER - PROGRAM NAME -
### AUTHOR - BHAVESH BHADRICHA
### DATE - 11/DEC/2015
### VERSION - 1.0
### DESCRIPTION - Data: information gathered from websites,
###               containing IP, two categories of product, page and status code.
###
### Problem Statement: Analyse the data in the Hadoop ecosystem to:
### 1. Load data into HDFS using the HDFS client.
###
### 2. Develop a PIG program to parse the web logs and derive meaningful
###    results from them.
###    INPUT file format:
###    IP,Category-1,Category-2,page,status_code
###
###    PIG program to extract data for the following:
###
### 3. Count of page views by individual user
###    IP, count(*)
###
### 4. Top / Bottom 2: category-1 / category-2 / page / users
###    (Exclude status codes other than 200)
###    Top 2 and bottom 2 records
###    Category, total_number_views
###    page, total_number_views
###    IP, total_number_of_views
###
### 5. Total page views / Category-wise page views / Unique page views
###    page, total_number_of_views
###    category, total_views
###    page, total_number_of_unique_views
###
### 6. Count of status code = 200 / 404 / 400 / 500
###    status_code, count
###
### 7. Load results into tables in a MySQL database using Sqoop.
#####################################################################

#####################################################################
### DEFINING THE LOCAL VARIABLES ###
#####################################################################

DATE=$(date +"%Y%m%d_%H%M%S")

LOGFILE="/home/bhavesh/POC/WEBLOG_POC/LOG/"$DATE".log"
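
# Make sure the log directory exists before the first write to $LOGFILE
# (illustrative safeguard; the path is taken from the LOGFILE definition above).
mkdir -p /home/bhavesh/POC/WEBLOG_POC/LOG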

#####################################################################

###  Load data into HDFS using HDFS client  ####################################

#####################################################################


hadoop fs -put weblog.txt /home/bhavesh/weblog.txt
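
# Optional sanity check (illustrative): confirm the upload landed in HDFS before Pig runs.
# Assumes weblog.txt sits in the directory this script is started from, as -put above does.
hadoop fs -ls /home/bhavesh/weblog.txt >> $LOGFILE 2>&1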


########################## PIG Processing ###############################

#### PIG computes the web-log KPIs: page views per IP, top/bottom 2, unique page views, status-code counts ###

#####################################################################


echo "Pig Script starts here"


echo "PIG Script,Weblog Processing" >> $LOGFILE


hadoop fs -rmr /home/bhavesh/Total_Category1_Count

hadoop fs -rmr /home/bhavesh/Bottom2_By_PAGE

hadoop fs -rmr /home/bhavesh/Top2_By_Category1

hadoop fs -rmr /home/bhavesh/Top2_By_IP

hadoop fs -rmr /home/bhavesh/Total_PAGES_Count

hadoop fs -rmr /home/bhavesh/Bottom2_By_IP

hadoop fs -rmr /home/bhavesh/Bottom2_By_Category2

hadoop fs -rmr /home/bhavesh/Status_Code_Count

hadoop fs -rmr /home/bhavesh/Count_by_Pageview_by_IndividualUser

hadoop fs -rmr /home/bhavesh/Page_Total_Number_Of_Unique_Views

hadoop fs -rmr /home/bhavesh/Total_Category2_Count

hadoop fs -rmr /home/bhavesh/Bottom2_By_Category1

hadoop fs -rmr /home/bhavesh/Top2_By_Category2

hadoop fs -rmr /home/bhavesh/Top2_By_PAGE


pig /home/bhavesh/POC/WEBLOG_POC/weblog.pig


if [ $? -eq 0 ]; then

    echo "Succesfully finished PIG  Processing " >> $LOGFILE

else

    echo "PIG Processing Failed Please check the Log " >> $LOGFILE

fi



############################ HIVE Processing ###############################

###### HIVE will load the Pig output for each KPI into Hive tables  ##########

########################################################################


echo "HIVE Script starts here"


echo "HIVE LOAD data into Table " >> $LOGFILE


hive -e 'drop table if exists TotalCategory1Count';

hive -e 'drop table if exists Bottom2ByPAGE';

hive -e 'drop table if exists Top2ByCategory1';

hive -e 'drop table if exists Top2ByIP';

hive -e 'drop table if exists TotalPAGESCount';

hive -e 'drop table if exists Bottom2ByIP';

hive -e 'drop table if exists Bottom2ByCategory2';

hive -e 'drop table if exists StatusCodeCount';

hive -e 'drop table if exists CountbyPageviewbyIndividualUser';

hive -e 'drop table if exists PageTotalNumberOfUniqueViews';

hive -e 'drop table if exists TotalCategory2Count';

hive -e 'drop table if exists Bottom2ByCategory1';

hive -e 'drop table if exists Top2ByCategory2';

hive -e 'drop table if exists Top2ByPAGE';


hive -e "create external table TotalCategory1Count

(Category1 string,

countings int)

row format delimited

fields terminated by ','

lines terminated by '\n'

stored as textfile location '/home/bhavesh/hive/TotalCategory1Count'";


hive -e "create external table Bottom2ByPAGE

(Pages int,

countings int)

row format delimited

fields terminated by ','

lines terminated by '\n'

stored as textfile location '/home/bhavesh/hive/Bottom2ByPAGE'";


hive -e "create external table Top2ByCategory1

(Category1 string,

countings int)

row format delimited

fields terminated by ','

lines terminated by '\n'

stored as textfile location '/home/bhavesh/hive/Top2ByCategory1'";



hive -e "create external table Top2ByIP

(IP string,

countings int)

row format delimited

fields terminated by ','

lines terminated by '\n'

stored as textfile location '/home/bhavesh/hive/Top2ByIP'";


hive -e "create external table TotalPAGESCount

(PAGES int,

countings int)

row format delimited

fields terminated by ','

lines terminated by '\n'

stored as textfile location '/home/bhavesh/hive/TotalPAGESCount'";


hive -e "create external table Bottom2ByIP

(IP string,

countings int)

row format delimited

fields terminated by ','

lines terminated by '\n'

stored as textfile location '/home/bhavesh/hive/Bottom2ByIP'";


hive -e "create external table Bottom2ByCategory2

(Category2 string,

countings int)

row format delimited

fields terminated by ','

lines terminated by '\n'

stored as textfile location '/home/bhavesh/hive/Bottom2ByCategory2'";


hive -e "create external table StatusCodeCount

(StatusCode int,

countings int)

row format delimited

fields terminated by ','

lines terminated by '\n'

stored as textfile location '/home/bhavesh/hive/StatusCodeCount'";


hive -e "create external table CountbyPageviewbyIndividualUser

(IP string,

countings int)

row format delimited

fields terminated by ','

lines terminated by '\n'

stored as textfile location '/home/bhavesh/hive/CountbyPageviewbyIndividualUser'";


hive -e "create external table PageTotalNumberOfUniqueViews

(page int,

countings int)

row format delimited

fields terminated by ','

lines terminated by '\n'

stored as textfile location '/home/bhavesh/hive/PageTotalNumberOfUniqueViews'";


hive -e "create external table TotalCategory2Count

(Category2 string,

countings int)

row format delimited

fields terminated by ','

lines terminated by '\n'

stored as textfile location '/home/bhavesh/hive/TotalCategory2Count'";


hive -e "create external table Bottom2ByCategory1

(Category1 string,

countings int)

row format delimited

fields terminated by ','

lines terminated by '\n'

stored as textfile location '/home/bhavesh/hive/Bottom2ByCategory1'";


hive -e "create external table Top2ByCategory2

(Category2 string,

countings int)

row format delimited

fields terminated by ','

lines terminated by '\n'

stored as textfile location '/home/bhavesh/hive/Top2ByCategory2'";


hive -e "create external table Top2ByPAGE

(page int,

countings int)

row format delimited

fields terminated by ','

lines terminated by '\n'

stored as textfile location '/home/bhavesh/hive/Top2ByPAGE'";


hive -e "load data inpath '/home/bhavesh/Total_Category1_Count/part-r-00000' overwrite into table TotalCategory1Count";

hive -e "load data inpath '/home/bhavesh/Bottom2_By_PAGE/part-r-00000' overwrite into table Bottom2ByPAGE";

hive -e "load data inpath '/home/bhavesh/Top2_By_Category1/part-r-00000' overwrite into table Top2ByCategory1";

hive -e "load data inpath '/home/bhavesh/Top2_By_IP/part-r-00000' overwrite into table Top2ByIP";

hive -e "load data inpath '/home/bhavesh/Total_PAGES_Count/part-r-00000' overwrite into table TotalPAGESCount";

hive -e "load data inpath '/home/bhavesh/Bottom2_By_IP/part-r-00000' overwrite into table Bottom2ByIP";

hive -e "load data inpath '/home/bhavesh/Bottom2_By_Category2/part-r-00000' overwrite into table Bottom2ByCategory2";

hive -e "load data inpath '/home/bhavesh/Status_Code_Count/part-r-00000' overwrite into table StatusCodeCount";

hive -e "load data inpath '/home/bhavesh/Count_by_Pageview_by_IndividualUser/part-r-00000' overwrite into table CountbyPageviewbyIndividualUser";

hive -e "load data inpath '/home/bhavesh/Page_Total_Number_Of_Unique_Views/part-r-00000' overwrite into table PageTotalNumberOfUniqueViews";

hive -e "load data inpath '/home/bhavesh/Total_Category2_Count/part-r-00000' overwrite into table TotalCategory2Count";

hive -e "load data inpath '/home/bhavesh/Bottom2_By_Category1/part-r-00000' overwrite into table Bottom2ByCategory1";

hive -e "load data inpath '/home/bhavesh/Top2_By_Category2/part-r-00000' overwrite into table Top2ByCategory2";

hive -e "load data inpath '/home/bhavesh/Top2_By_PAGE/part-r-00000' overwrite into table Top2ByPAGE";



############################ SQOOP Processing ##############################

###### Pushing the Hive table data into RDBMS tables via SQOOP #######################

#########################################################################
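
# Sqoop export assumes the target tables already exist in MySQL. An illustrative sketch
# for one of them is shown below (credentials and database are taken from the connection
# string used in the export commands; the column names are assumed to mirror the Hive
# schema). The remaining tables follow the same two-column pattern.
mysql -u root -proot -D mysql -e "CREATE TABLE IF NOT EXISTS StatusCodeCount (StatusCode INT, countings INT);"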


sqoop export --connect jdbc:mysql://localhost/mysql --username root --password root --table TotalCategory1Count --export-dir /home/bhavesh/hive/TotalCategory1Count/part-r-00000 --input-fields-terminated-by ',';


sqoop export --connect jdbc:mysql://localhost/mysql --username root --password root --table Bottom2ByPAGE --export-dir /home/bhavesh/hive/Bottom2ByPAGE/part-r-00000 --input-fields-terminated-by ',';


sqoop export --connect jdbc:mysql://localhost/mysql --username root --password root --table Top2ByCategory1 --export-dir /home/bhavesh/hive/Top2ByCategory1/part-r-00000 --input-fields-terminated-by ',';


sqoop export --connect jdbc:mysql://localhost/mysql --username root --password root --table Top2ByIP --export-dir /home/bhavesh/hive/Top2ByIP/part-r-00000 --input-fields-terminated-by ',';


sqoop export --connect jdbc:mysql://localhost/mysql --username root --password root --table TotalPAGESCount --export-dir /home/bhavesh/hive/TotalPAGESCount/part-r-00000 --input-fields-terminated-by ',';


sqoop export --connect jdbc:mysql://localhost/mysql --username root --password root --table Bottom2ByIP --export-dir /home/bhavesh/hive/Bottom2ByIP/part-r-00000 --input-fields-terminated-by ',';


sqoop export --connect jdbc:mysql://localhost/mysql --username root --password root --table Bottom2ByCategory2 --export-dir /home/bhavesh/hive/Bottom2ByCategory2/part-r-00000 --input-fields-terminated-by ',';


sqoop export --connect jdbc:mysql://localhost/mysql --username root --password root --table StatusCodeCount --export-dir /home/bhavesh/hive/StatusCodeCount/part-r-00000 --input-fields-terminated-by ',';


sqoop export --connect jdbc:mysql://localhost/mysql --username root --password root --table CountbyPageviewbyIndividualUser --export-dir /home/bhavesh/hive/CountbyPageviewbyIndividualUser/part-r-00000 --input-fields-terminated-by ',';


sqoop export --connect jdbc:mysql://localhost/mysql --username root --password root --table PageTotalNumberOfUniqueViews --export-dir /home/bhavesh/hive/PageTotalNumberOfUniqueViews/part-r-00000 --input-fields-terminated-by ',';


sqoop export --connect jdbc:mysql://localhost/mysql --username root --password root --table TotalCategory2Count --export-dir /home/bhavesh/hive/TotalCategory2Count/part-r-00000 --input-fields-terminated-by ',';


sqoop export --connect jdbc:mysql://localhost/mysql --username root --password root --table Bottom2ByCategory1 --export-dir /home/bhavesh/hive/Bottom2ByCategory1/part-r-00000 --input-fields-terminated-by ',';


sqoop export --connect jdbc:mysql://localhost/mysql --username root --password root --table Top2ByCategory2 --export-dir /home/bhavesh/hive/Top2ByCategory2/part-r-00000 --input-fields-terminated-by ',';


sqoop export --connect jdbc:mysql://localhost/mysql --username root --password root --table Top2ByPAGE --export-dir /home/bhavesh/hive/Top2ByPAGE/part-r-00000 --input-fields-terminated-by ',';
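
# Quick spot check (illustrative): confirm rows actually landed in MySQL after the exports.
mysql -u root -proot -D mysql -e "SELECT * FROM StatusCodeCount LIMIT 5;" >> $LOGFILE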


Input Data


Script Execution


Hive Output


MySQL Output