Exception when creating hive table from hdfs parquet file

Problem

FAILED: SemanticException Cannot find class ‘parquet.hive.DeprecatedParquetInputFormat’

Solution

[hadoop@ip-123-45-67-890 extjars]$mkdir extjars

[hadoop@ip-123-45-67-890 extjars]$cd extjars/

Now Download required jars:

[hadoop@ip-123-45-67-890 extjars]$for f in parquet-avro parquet-cascading parquet-column parquet-common parquet-encoding parquet-generator parquet-hadoop parquet-hive parquet-pig parquet-scrooge parquet-test-hadoop2 parquet-thrift
do
curl -O https://oss.sonatype.org/service/local/repositories/releases/content/com/twitter/${f}/1.2.4/${f}-1.2.4.jar
done
curl -O https://oss.sonatype.org/service/local/repositories/releases/content/com/twitter/parquet-format/1.0.0/parquet-format-1.0.0.jar

 

[hadoop@ip-123-45-67-890 extjars]$ ls -ltr
total 5472
-rw-rw-r– 1 hadoop hadoop 891821 Dec 12 23:05 parquet-avro-1.2.4.jar
-rw-rw-r– 1 hadoop hadoop 13205 Dec 12 23:05 parquet-cascading-1.2.4.jar
-rw-rw-r– 1 hadoop hadoop 648465 Dec 12 23:05 parquet-column-1.2.4.jar
-rw-rw-r– 1 hadoop hadoop 10934 Dec 12 23:05 parquet-common-1.2.4.jar
-rw-rw-r– 1 hadoop hadoop 271868 Dec 12 23:05 parquet-encoding-1.2.4.jar
-rw-rw-r– 1 hadoop hadoop 10102 Dec 12 23:05 parquet-generator-1.2.4.jar
-rw-rw-r– 1 hadoop hadoop 1085440 Dec 12 23:05 parquet-hadoop-1.2.4.jar
-rw-rw-r– 1 hadoop hadoop 70280 Dec 12 23:05 parquet-hive-1.2.4.jar
-rw-rw-r– 1 hadoop hadoop 1019306 Dec 12 23:05 parquet-pig-1.2.4.jar
-rw-rw-r– 1 hadoop hadoop 6008 Dec 12 23:05 parquet-scrooge-1.2.4.jar
-rw-rw-r– 1 hadoop hadoop 1758 Dec 12 23:05 parquet-test-hadoop2-1.2.4.jar
-rw-rw-r– 1 hadoop hadoop 1107251 Dec 12 23:05 parquet-thrift-1.2.4.jar
-rw-rw-r– 1 hadoop hadoop 440035 Dec 12 23:05 parquet-format-1.0.0.jar

Copy downloaded jars to hadoop :

[hadoop@ip-123-45-67-890 extjars]$ hadoop fs -mkdir /home/hadoop/extjars

[hadoop@ip-123-45-67-890 extjars]$ hadoop fs -copyFromLocal extjars/* /home/hadoop/extjars/

[hadoop@ip-123-45-67-890 extjars]$ hadoop fs -ls /home/hadoop/extjars
Found 13 items
-rw-r–r– 3 hadoop hadoop 891821 2017-12-12 23:24 /home/hadoop/extjars/parquet-avro-1.2.4.jar
-rw-r–r– 3 hadoop hadoop 13205 2017-12-12 23:24 /home/hadoop/extjars/parquet-cascading-1.2.4.jar
-rw-r–r– 3 hadoop hadoop 648465 2017-12-12 23:24 /home/hadoop/extjars/parquet-column-1.2.4.jar
-rw-r–r– 3 hadoop hadoop 10934 2017-12-12 23:24 /home/hadoop/extjars/parquet-common-1.2.4.jar
-rw-r–r– 3 hadoop hadoop 271868 2017-12-12 23:24 /home/hadoop/extjars/parquet-encoding-1.2.4.jar
-rw-r–r– 3 hadoop hadoop 440035 2017-12-12 23:24 /home/hadoop/extjars/parquet-format-1.0.0.jar
-rw-r–r– 3 hadoop hadoop 10102 2017-12-12 23:24 /home/hadoop/extjars/parquet-generator-1.2.4.jar
-rw-r–r– 3 hadoop hadoop 1085440 2017-12-12 23:24 /home/hadoop/extjars/parquet-hadoop-1.2.4.jar
-rw-r–r– 3 hadoop hadoop 70280 2017-12-12 23:24 /home/hadoop/extjars/parquet-hive-1.2.4.jar
-rw-r–r– 3 hadoop hadoop 1019306 2017-12-12 23:24 /home/hadoop/extjars/parquet-pig-1.2.4.jar
-rw-r–r– 3 hadoop hadoop 6008 2017-12-12 23:24 /home/hadoop/extjars/parquet-scrooge-1.2.4.jar
-rw-r–r– 3 hadoop hadoop 1758 2017-12-12 23:24 /home/hadoop/extjars/parquet-test-hadoop2-1.2.4.jar
-rw-r–r– 3 hadoop hadoop 1107251 2017-12-12 23:24 /home/hadoop/extjars/parquet-thrift-1.2.4.jar

Run hive console and add jars with fully quaified path:

[hadoop@ip-123-45-67-890 extjars]$ hive

hive>add jar hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-avro-1.2.4.jar;
Added [/mnt/tmp/e941b6f7-f294-4f00-b120-844a6f5418de_resources/parquet-avro-1.2.4.jar] to class path
Added resources: [hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-avro-1.2.4.jar]
hive> add jar hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-cascading-1.2.4.jar;
Added [/mnt/tmp/e941b6f7-f294-4f00-b120-844a6f5418de_resources/parquet-cascading-1.2.4.jar] to class path
Added resources: [hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-cascading-1.2.4.jar]
hive> add jar hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-column-1.2.4.jar;
Added [/mnt/tmp/e941b6f7-f294-4f00-b120-844a6f5418de_resources/parquet-column-1.2.4.jar] to class path
Added resources: [hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-column-1.2.4.jar]
hive> add jar hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-common-1.2.4.jar;
Added [/mnt/tmp/e941b6f7-f294-4f00-b120-844a6f5418de_resources/parquet-common-1.2.4.jar] to class path
Added resources: [hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-common-1.2.4.jar]
hive> add jar hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-encoding-1.2.4.jar;
Added [/mnt/tmp/e941b6f7-f294-4f00-b120-844a6f5418de_resources/parquet-encoding-1.2.4.jar] to class path
Added resources: [hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-encoding-1.2.4.jar]
hive> add jar hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-format-1.0.0.jar;
Added [/mnt/tmp/e941b6f7-f294-4f00-b120-844a6f5418de_resources/parquet-format-1.0.0.jar] to class path
Added resources: [hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-format-1.0.0.jar]
hive> add jar hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-generator-1.2.4.jar;
Added [/mnt/tmp/e941b6f7-f294-4f00-b120-844a6f5418de_resources/parquet-generator-1.2.4.jar] to class path
Added resources: [hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-generator-1.2.4.jar]
hive> add jar hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-hadoop-1.2.4.jar;
Added [/mnt/tmp/e941b6f7-f294-4f00-b120-844a6f5418de_resources/parquet-hadoop-1.2.4.jar] to class path
Added resources: [hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-hadoop-1.2.4.jar]
hive> add jar hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-hive-1.2.4.jar;
Added [/mnt/tmp/e941b6f7-f294-4f00-b120-844a6f5418de_resources/parquet-hive-1.2.4.jar] to class path
Added resources: [hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-hive-1.2.4.jar]
hive> add jar hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-pig-1.2.4.jar;
Added [/mnt/tmp/e941b6f7-f294-4f00-b120-844a6f5418de_resources/parquet-pig-1.2.4.jar] to class path
Added resources: [hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-pig-1.2.4.jar]
hive> add jar hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-scrooge-1.2.4.jar;
Added [/mnt/tmp/e941b6f7-f294-4f00-b120-844a6f5418de_resources/parquet-scrooge-1.2.4.jar] to class path
Added resources: [hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-scrooge-1.2.4.jar]
hive> add jar hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-test-hadoop2-1.2.4.jar;
Added [/mnt/tmp/e941b6f7-f294-4f00-b120-844a6f5418de_resources/parquet-test-hadoop2-1.2.4.jar] to class path
Added resources: [hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-test-hadoop2-1.2.4.jar]
hive> add jar hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-thrift-1.2.4.jar;
Added [/mnt/tmp/e941b6f7-f294-4f00-b120-844a6f5418de_resources/parquet-thrift-1.2.4.jar] to class path
Added resources: [hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-thrift-1.2.4.jar]

 

 

Advertisements

Solved: Hive work directory creation issue

 

Exception:

smartechie:~ sudhir.pradhan$ hive

hiveSLF4J: Class path contains multiple SLF4J bindings.SLF4J: Found binding in [jar:file:/usr/local/Cellar/hive/2.3.1/libexec/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]SLF4J: Found binding in [jar:file:/usr/local/Cellar/hadoop/2.8.0/libexec/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Logging initialized using configuration in jar:file:/usr/local/Cellar/hive/2.3.1/libexec/lib/hive-common-2.3.1.jar!/hive-log4j2.properties Async: trueException in thread “main” java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D at org.apache.hadoop.fs.Path.initialize(Path.java:254) at org.apache.hadoop.fs.Path.<init>(Path.java:212) at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:659) at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:582) at org.apache.hadoop.hive.ql.session.SessionState.beginStart(SessionState.java:549) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:750) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.util.RunJar.run(RunJar.java:234) at org.apache.hadoop.util.RunJar.main(RunJar.java:148)Caused by: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D at java.net.URI.checkPath(URI.java:1823) at java.net.URI.<init>(URI.java:745) at org.apache.hadoop.fs.Path.initialize(Path.java:251) … 12 more

 

Solution :

<property>
<name>hive.exec.scratchdir</name>
<value>/tmp/hive-${user.name}</value>
</property>

<property>
 <name>hive.exec.local.scratchdir</name>
 <value>/tmp/${user.name}</value>
</property>

<property>
<name>hive.downloaded.resources.dir</name>
<value>/tmp/${user.name}_resources</value>
</property>

<property>
<name>hive.scratch.dir.permission</name>
    <value>733</value>
</property>

Unzip Multiple Files from Linux Command Line

Problem :

[hadoop@spradhan]$ unzip *.zip

Archive: a.csv.zip
caution: filename not matched: b.csv.zip
caution: filename not matched: c.csv.zip
caution: filename not matched: d.csv.zip
caution: filename not matched: e.csv.zip

Solution :

[hadoop@spradhan]$ unzip ‘*.zip’

If you run in background,

[hadoop@spradhan]$ nohup unzip ‘*.zip’ &