Create and Insert to Hive table example

Create table :

hive> CREATE TABLE students (name VARCHAR(64), age INT, gpa DECIMAL(3, 2));

OK
Time taken: 1.084 seconds

List tables :

hive> show tables;
OK
students
values__tmp__table__1
Time taken: 0.023 seconds, Fetched: 2 row(s)

Describe table :

hive> describe students;

OK

name                 varchar(64)

age                 int

gpa                 decimal(3,2)

Time taken: 0.052 seconds, Fetched: 3 row(s)

Insert sample data to the above created table :

hive> INSERT INTO TABLE students VALUES (‘ABC’, 35, 1.28), (‘DEF’, 32, 2.32), (‘KLM’, 37, 3.22);
Query ID = root_20180206225557_c24602db-d9bf-480c-ac98-ece3fe4381e8
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id application_1517954394617_0001)


Show data from the above table

hive> select * from students;

OK

ABC 35 1.28

DEF 32 2.32

KLM 37 3.22

Time taken: 0.212 seconds, Fetched: 3 row(s)

Advertisements

Solved: Hive work directory creation issue

 

Exception:

smartechie:~ sudhir.pradhan$ hive

hiveSLF4J: Class path contains multiple SLF4J bindings.SLF4J: Found binding in [jar:file:/usr/local/Cellar/hive/2.3.1/libexec/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]SLF4J: Found binding in [jar:file:/usr/local/Cellar/hadoop/2.8.0/libexec/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Logging initialized using configuration in jar:file:/usr/local/Cellar/hive/2.3.1/libexec/lib/hive-common-2.3.1.jar!/hive-log4j2.properties Async: trueException in thread “main” java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D at org.apache.hadoop.fs.Path.initialize(Path.java:254) at org.apache.hadoop.fs.Path.<init>(Path.java:212) at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:659) at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:582) at org.apache.hadoop.hive.ql.session.SessionState.beginStart(SessionState.java:549) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:750) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.util.RunJar.run(RunJar.java:234) at org.apache.hadoop.util.RunJar.main(RunJar.java:148)Caused by: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D at java.net.URI.checkPath(URI.java:1823) at java.net.URI.<init>(URI.java:745) at org.apache.hadoop.fs.Path.initialize(Path.java:251) … 12 more

 

Solution :

<property>
<name>hive.exec.scratchdir</name>
<value>/tmp/hive-${user.name}</value>
</property>

<property>
 <name>hive.exec.local.scratchdir</name>
 <value>/tmp/${user.name}</value>
</property>

<property>
<name>hive.downloaded.resources.dir</name>
<value>/tmp/${user.name}_resources</value>
</property>

<property>
<name>hive.scratch.dir.permission</name>
    <value>733</value>
</property>

Unzip Multiple Files from Linux Command Line

Problem :

[hadoop@spradhan]$ unzip *.zip

Archive: a.csv.zip
caution: filename not matched: b.csv.zip
caution: filename not matched: c.csv.zip
caution: filename not matched: d.csv.zip
caution: filename not matched: e.csv.zip

Solution :

[hadoop@spradhan]$ unzip ‘*.zip’

If you run in background,

[hadoop@spradhan]$ nohup unzip ‘*.zip’ &

Copy file or folder from amazon S3 to EC2

  1. Install aws cli in the ec2 instance [if not installed]
  2. $ sudo yum install aws-cli

  3. Configure aws cli
  4. $ aws configure
    AWS Access Key ID [None]: <your_access_key>
    AWS Secret Access Key [None]:<your_secret_key>
    Default region name [None]:
    Default output format [None]:

  5. Execute sync command in ec2 instance
  6. aws s3 sync s3://<path_to_file> <ec2_local_path>

Hive : Unable to start metastore issue

Exception :

smartechie:confluent-3.3.0 sudhir.pradhan$ hive

SLF4J: Class path contains multiple SLF4J bindings.SLF4J: Found binding in [jar:file:/usr/local/Cellar/hive/2.1.0/libexec/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]SLF4J: Found binding in [jar:file:/usr/local/Cellar/hadoop/2.8.0/libexec/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Logging initialized using configuration in jar:file:/usr/local/Cellar/hive/2.1.0/libexec/lib/hive-common-2.1.0.jar!/hive-log4j2.properties Async: trueException in thread “main” java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:578) at org.apache.hadoop.hive.ql.session.SessionState.beginStart(SessionState.java:518) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:705) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.util.RunJar.run(RunJar.java:234) at org.apache.hadoop.util.RunJar.main(RunJar.java:148)Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient at org.apache.hadoop.hive.ql.metadata.Hive.registerAllFunctionsOnce(Hive.java:226) at org.apache.hadoop.hive.ql.metadata.Hive.<init>(Hive.java:366) at org.apache.hadoop.hive.ql.metadata.Hive.create(Hive.java:310) at org.apache.hadoop.hive.ql.metadata.Hive.getInternal(Hive.java:290) at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:266) at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:545) … 9 moreCaused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1627) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:80) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:130) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:101) at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3317) at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3356) at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3336) at org.apache.hadoop.hive.ql.metadata.Hive.getAllFunctions(Hive.java:3590) at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:236) at org.apache.hadoop.hive.ql.metadata.Hive.registerAllFunctionsOnce(Hive.java:221) … 14 moreCaused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1625) … 23 moreCaused by: MetaException(message:Could not connect to meta store using any of the URIs provided. Most recent failure: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused (Connection refused) at org.apache.thrift.transport.TSocket.open(TSocket.java:226) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:466) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:279) at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:70) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1625) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:80) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:130) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:101) at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3317) at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3356) at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3336) at org.apache.hadoop.hive.ql.metadata.Hive.getAllFunctions(Hive.java:3590) at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:236) at org.apache.hadoop.hive.ql.metadata.Hive.registerAllFunctionsOnce(Hive.java:221) at org.apache.hadoop.hive.ql.metadata.Hive.<init>(Hive.java:366) at org.apache.hadoop.hive.ql.metadata.Hive.create(Hive.java:310) at org.apache.hadoop.hive.ql.metadata.Hive.getInternal(Hive.java:290) at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:266) at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:545) at org.apache.hadoop.hive.ql.session.SessionState.beginStart(SessionState.java:518) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:705) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.util.RunJar.run(RunJar.java:234) at org.apache.hadoop.util.RunJar.main(RunJar.java:148)Caused by: java.net.ConnectException: Connection refused (Connection refused) at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) at java.net.Socket.connect(Socket.java:589) at org.apache.thrift.transport.TSocket.open(TSocket.java:221) … 31 more) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:513) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:279) at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:70) … 28 more

 

Solution : Remove all lock files, you should good to go.

rm metastore_db/*.lck

smartechie:confluent-3.3.0 sudhir.pradhan$ ls metastore_db/*.lck

smartechie:confluent-3.3.0 sudhir.pradhan$ ls metastore_db/*.lck

metastore_db/db.lck

dbex.lck

Hive: Unable to instantiate Metastore

Exception :

2017-12-28T15:05:52,943 INFO [main] org.apache.hadoop.hive.metastore.HiveMetaStore – 0: Opening raw store with implementation class:org.apache.hadoop.hive.metastore.ObjectStore2017-12-28T15:05:52,943 INFO [main] org.apache.hadoop.hive.metastore.HiveMetaStore – 0: Opening raw store with implementation class:org.apache.hadoop.hive.metastore.ObjectStoreMetaException(message:Version information not found in metastore. ) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:83) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:92) at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:6883) at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:6878) at org.apache.hadoop.hive.metastore.HiveMetaStore.startMetaStore(HiveMetaStore.java:7136) at org.apache.hadoop.hive.metastore.HiveMetaStore.main(HiveMetaStore.java:7063) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.util.RunJar.run(RunJar.java:234) at org.apache.hadoop.util.RunJar.main(RunJar.java:148)Caused by: MetaException(message:Version information not found in metastore. ) at org.apache.hadoop.hive.metastore.ObjectStore.checkSchema(ObjectStore.java:7870) at org.apache.hadoop.hive.metastore.ObjectStore.verifySchema(ObjectStore.java:7848) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:101) at com.sun.proxy.$Proxy20.verifySchema(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMSForConf(HiveMetaStore.java:591) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:584) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:651) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:427) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:79) … 11 more2017-12-28T15:05:56,262 ERROR [main] org.apache.hadoop.hive.metastore.HiveMetaStore – MetaException(message:Version information not found in metastore. )

Solution : follow the link to upgrade the store

https://cwiki.apache.org/confluence/display/Hive/Hive+Schema+Tool

Installing Maven using Yum on EC2 instance (Amazon Linux)

  1. sudo wget http://repos.fedorapeople.org/repos/dchen/apache-maven/epel-apache-maven.repo -O /etc/yum.repos.d/epel-apache-maven.repo
  2. sudo sed -i s/\$releasever/6/g /etc/yum.repos.d/epel-apache-maven.repo
  3. sudo yum install -y apache-maven
  4. mvn –version

And you all set to run the “mvn” command in ec2 instance.

[ec2-user@ip-123-45-67-89 sudhir]$ mvn –version

Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5; 2015-11-10T16:41:47+00:00)

Maven home: /usr/share/apache-maven

Java version: 1.7.0_151, vendor: Oracle Corporation

Java home: /usr/lib/jvm/java-1.7.0-openjdk-1.7.0.151.x86_64/jre

Default locale: en_US, platform encoding: UTF-8

OS name: “linux”, version: “4.9.43-17.39.amzn1.x86_64”, arch: “amd64”, family: “unix”