Create a Hive table and insert data: example

Create table :

hive> CREATE TABLE students (name VARCHAR(64), age INT, gpa DECIMAL(3, 2));

OK
Time taken: 1.084 seconds

List tables :

hive> show tables;
OK
students
values__tmp__table__1
Time taken: 0.023 seconds, Fetched: 2 row(s)

Describe table :

hive> describe students;
OK
name                    varchar(64)
age                     int
gpa                     decimal(3,2)
Time taken: 0.052 seconds, Fetched: 3 row(s)
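The gpa column is declared as DECIMAL(3, 2): at most three digits in total, two of them after the decimal point, so the largest value that fits is 9.99. The following is a minimal Python sketch of that precision/scale rule, useful for checking sample values before inserting them. The helper name fits_decimal is hypothetical and not part of Hive, and the sketch assumes finite numeric input passed as a string.

```python
from decimal import Decimal


def fits_decimal(value, precision, scale):
    """Return True if `value` fits DECIMAL(precision, scale) without rounding.

    Hypothetical helper for illustration only; it is not a Hive API.
    Assumes `value` is a string holding a finite number.
    """
    # normalize() drops trailing zeros, so "1.280" is treated like "1.28".
    _sign, digits, exponent = Decimal(value).normalize().as_tuple()
    fractional_digits = max(0, -exponent)
    integer_digits = max(0, len(digits) + exponent)
    # The integer part may use only the digits left over after the scale.
    return fractional_digits <= scale and integer_digits <= precision - scale


if __name__ == "__main__":
    for candidate in ["1.28", "2.32", "3.22", "9.99", "10.00", "3.225"]:
        print(candidate, fits_decimal(candidate, 3, 2))
```

The three gpa values used in this example (1.28, 2.32, 3.22) all pass the check, while 10.00 and 3.225 do not.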

Insert sample data into the table created above :

hive> INSERT INTO TABLE students VALUES ('ABC', 35, 1.28), ('DEF', 32, 2.32), ('KLM', 37, 3.22);
Query ID = root_20180206225557_c24602db-d9bf-480c-ac98-ece3fe4381e8
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id application_1517954394617_0001)


Show data from the above table :

hive> select * from students;
OK
ABC     35      1.28
DEF     32      2.32
KLM     37      3.22
Time taken: 0.212 seconds, Fetched: 3 row(s)

Create an HBase table and insert data: example

Log in to the master node and start the HBase shell :

[ec2-user@ip-123-45-67-89 ~]$ sudo hbase shell

HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 1.3.1, rUnknown, Fri Sep 22 21:28:57 UTC 2017

hbase(main):001:0> list

Create table :

Command syntax

create '<table_name>', '<column_family>'

Example

hbase(main):004:0> create 'employee_hbase', 'cf1'

Insert data into the above table :

hbase(main):008:0> put 'employee_hbase', 'r1', 'cf1:empid', '111'
hbase(main):008:0> put 'employee_hbase', 'r1', 'cf1:name', 'sudhir'
hbase(main):008:0> put 'employee_hbase', 'r1', 'cf1:dept', 'RnD'
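Each put writes a single cell, addressed by table, row key, and column family:qualifier, so one logical row takes one put per column. As a rough sketch, the Python helper below generates those put lines from a dictionary so they can be pasted into the HBase shell. The function name hbase_put_commands is hypothetical, and the quoting assumes the shell's Ruby-style single-quoted strings.

```python
def quote(text):
    """Quote a value as a Ruby-style single-quoted string for the HBase shell."""
    escaped = str(text).replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def hbase_put_commands(table, row_key, family, columns):
    """Build one `put` line per column (hypothetical helper, illustration only).

    `columns` maps a column qualifier to the cell value.
    """
    commands = []
    for qualifier, value in columns.items():
        column = family + ":" + qualifier
        commands.append(
            "put {}, {}, {}, {}".format(
                quote(table), quote(row_key), quote(column), quote(value)
            )
        )
    return commands


if __name__ == "__main__":
    row = {"empid": "111", "name": "sudhir", "dept": "RnD"}
    for line in hbase_put_commands("employee_hbase", "r1", "cf1", row):
        print(line)
```

Run with the sample row above, it prints the same three put commands used in this example.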

Show data from the table :

hbase(main):015:0> get 'employee_hbase', 'r1'
COLUMN CELL
cf1:dept timestamp=1517963999636, value=RnD
cf1:empid timestamp=1517963967215, value=111
cf1:name timestamp=1517963994900, value=sudhir
1 row(s) in 0.0080 seconds

Count rows :

hbase(main):025:0> count 'employee_hbase', { INTERVAL => 1 }

Where is emrfs-site.xml ?

The emrfs-site.xml file is created if EMRFS is enabled when the EMR cluster is created in AWS. To view or manage these settings, log in to the master node and look in the following location:

[ec2-user@ip-123-45-67-89 ~]$ ls -ltr /usr/share/aws/emr/emrfs/conf/emrfs-site.xml
-rw-r--r-- 1 root root 609 Feb 6 21:59 /usr/share/aws/emr/emrfs/conf/emrfs-site.xml

[ec2-user@ip-123-45-67-89 ~]$ cat /usr/share/aws/emr/emrfs/conf/emrfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>

<property>
<name>fs.s3.consistent.retryPeriodSeconds</name>
<value>10</value>
</property>

<property>
<name>fs.s3.consistent.retryCount</name>
<value>5</value>
</property>

<property>
<name>fs.s3.consistent</name>
<value>true</value>
</property>

<property>
<name>fs.s3.consistent.metadata.tableName</name>
<value>EmrFSMetadataTest</value>
</property>

<property>
<name>fs.s3.maxConnections</name>
<value>10000</value>
</property>

</configuration>
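The file follows the usual Hadoop site layout: a configuration root holding property elements, each with a name and a value. A minimal Python sketch for reading such a file into a dictionary is shown below. The function name parse_hadoop_site is hypothetical, and the embedded XML is an abbreviated copy of the listing above so the example runs without access to the master node.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the emrfs-site.xml contents shown above.
EMRFS_SITE_XML = """<?xml version="1.0"?>
<configuration>
<property>
<name>fs.s3.consistent.retryPeriodSeconds</name>
<value>10</value>
</property>
<property>
<name>fs.s3.consistent.retryCount</name>
<value>5</value>
</property>
<property>
<name>fs.s3.consistent</name>
<value>true</value>
</property>
<property>
<name>fs.s3.consistent.metadata.tableName</name>
<value>EmrFSMetadataTest</value>
</property>
<property>
<name>fs.s3.maxConnections</name>
<value>10000</value>
</property>
</configuration>
"""


def parse_hadoop_site(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style site file.

    Hypothetical helper for illustration; values are kept as strings.
    """
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name"): prop.findtext("value")
        for prop in root.iter("property")
    }


if __name__ == "__main__":
    settings = parse_hadoop_site(EMRFS_SITE_XML)
    for name in sorted(settings):
        print(name, "=", settings[name])
```

To read the real file on the master node, the same function can be given the text of /usr/share/aws/emr/emrfs/conf/emrfs-site.xml instead of the embedded string.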