Sunday, April 30, 2017

Import data in HDFS from mysql db using Sqoop Commands

Import data in HDFS from mysql db using Sqoop:


1. when target dir is not available:
>sqoop import --connect jdbc:mysql://localhost/retail_db --username root --password cloudera --table departments --m 1

2. To check departments part and success files:
>hdfs dfs -ls -R departments

3. To view content of departments part file:
>hdfs dfs -cat /departments/part-m-00000

=============

when target-dir is available:


1. >sqoop import --connect jdbc:mysql://localhost/retail_db --username root --password cloudera --table departments -m 1 --target-dir /home/cloudera/sqoopQueryResult1/


2. To check departments part and success files:
>hdfs dfs -ls -R departments


3. To view content of departments part file:
>hdfs dfs -cat /home/cloudera/sqoopQueryResult1/part-m-00000


==============


Import records sqoop command while applying where clause in the table:


>sqoop import --connect jdbc:mysql://localhost/retail_db --username root --password cloudera --table categories -m 1 --where "category_name like '%Golf%'" --target-dir /home/cloudera/sqoopQueryResult3/


=============
Incremental column value defines newly added rows in the table,
like here fetch the rows after the category_id last value is 57:

>sqoop import --connect jdbc:mysql://localhost/retail_db --username root --password cloudera --table categories -m 1 --incremental append --check-column category_id --last-value 57 --target-dir /home/cloudera/sqoopQueryResult4/

Note:

You can also specified the –class-name for the generated code and specified the –bindir where the compiled code and .jar file should be located. (Without these arguments, Sqoop would place the generated Java source file in your current working directory and the compiled .class file and .jar file in /tmp/sqoop-/compile.)

The class name simply derives from the table name unless you specify a name with the help of the –class-name command line argument (CLA). The –target-diris the location in HDFS where you want the imported table to be placed.

>sqoop import --connect jdbc:mysql://localhost/retail_db --username root --password cloudera --table departments -m 1 --class-name departmentsNew --target-dir /home/cloudera/sqoopQueryResult5/ --bindir /home/cloudera/sqoopQueryResult5/

This will create departmentsNew.class and departmentsNew.jar file in /home/cloudera/sqoopQueryResult5/ directory.

No comments:

Post a Comment