Sqoop is a command-line interface application for transferring data between relational databases and Hadoop. Sqoop Helps in efficiently transferring bulk data between Hadoop and the Oracle database.

Prerequisites : 

1.Oracle Database 10g express edition should be installed.

2.Oracle connector(ojdbc6_g.jar) ,The jar file can be downloaded from http://www.oracle.com/technetwork/database/enterprise-edition/jdbc-112010-090769.html

To import data from oracle database to HDFS through Sqoop follow these steps :

Step 1: Create a table in the oracle database and Load data into the table .

For Example :

create table monthlypassengers (Activity_Period varchar(50),Operating_Airline varchar(50),Operating_Airline_IATA_Code varchar(50),Published_Airline varchar(50),Published_Airline_IATA_Code varchar(50),GEO_Summary varchar(50),GEO_Region varchar(50),Activity_Type_Code varchar(50),Cargo_Type_Code varchar(50),Cargo_Aircraft_Type varchar(50),Cargo_Weight_LBS varchar(50),Cargo_Metric_TONS varchar(50));

The data set used in the example is:



Operating    Airline Operating Airline IATA Code Published Airline Published Airline IATA Code GEO Summary GEO Region Activity Type Code Cargo Type Code Cargo Aircraft Type Cargo Weight LBS

Cargo Metric TONS


ABX Air GB ABX Air GB Domestic US Deplaned Cargo Freighter 45423 20.60387


ABX Air GB ABX Air GB Domestic US Enplaned Cargo Freighter 106869 48.47578


ATA Airlines TZ ATA Airlines TZ Domestic US Deplaned Cargo Passenger 55427



ATA Airlines TZ ATA Airlines TZ Domestic US Deplaned Mail Passenger 50278


200507     ATA        Airlines       TZ      ATA      Airlines      TZ  Domestic    US Enplaned Cargo Passenger 74183 33.64941


Step 2 : Import the data of the table emp present in Oracle database to HDFS. Note that the oracle connector shoud be present in the sqoop directory and the command should be executed from the sqoop library.

Syntax :

/usr/bin/sqoop import --connect jdbc:oracle:thin:system/system@<IP address>:1521:xe  --username <username> -P--table <database name>.<table name> --columns "<column names>" --target-dir  <target directory path> -m 1

Example :

[cloudera@localhost sqoop]$ /usr/bin/sqoop import --connect jdbc:oracle:thin:system/system@ 
--username system -P --table system.monthlypassengers --columns "Activity_Period,Operating_Airline,Operating_Airline_IATA_Code,Published_Airline,Published_Airline_IATA_Code,GEO_Summary,GEO_Region,Activity_Type_Code,Cargo_Type_Code,Cargo_Aircraft_Type,Cargo_Weight_LBS,Cargo_Metric_TONS"--target-dir /user/cloudera/sqoop_out -m 1