Working with PostgreSQL on macOS

Install / upgrade PostgreSQL

smartechie-macos :~ $  brew <install | upgrade> postgresql
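
To confirm which client and server binaries ended up on your PATH after the install or upgrade (a quick sanity check; the reported versions will vary with your setup):

smartechie-macos :~ $  psql --version
smartechie-macos :~ $  postgres -V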

Get installation details

smartechie-macos :~ $  brew info postgres

Result:

postgresql: stable 10.4 (bottled), HEAD

Object-relational database system

https://www.postgresql.org/

Conflicts with:

  postgres-xc (because postgresql and postgres-xc install the same binaries.)

/usr/local/Cellar/postgresql/9.6.3 (3,260 files, 36.6MB)

  Poured from bottle on 2017-06-05 at 20:47:39

/usr/local/Cellar/postgresql/10.4 (3,389 files, 39.2MB) *

  Poured from bottle on 2018-07-09 at 14:15:06

From: https://github.com/Homebrew/homebrew-core/blob/master/Formula/postgresql.rb

==> Dependencies

Required: openssl , readline

Optional: python , python@2

==> Options

--with-dtrace

Build with DTrace support

--with-python

Enable PL/Python3 (incompatible with --with-python@2)

--with-python@2

Enable PL/Python2

--without-perl

Build without Perl support

--without-tcl

Build without Tcl support

--HEAD

Install HEAD version

==> Caveats

To migrate existing data from a previous major version of PostgreSQL run:

  brew postgresql-upgrade-database

To have launchd start postgresql now and restart at login:

  brew services start postgresql

Or, if you don’t want/need a background service you can just run:

  pg_ctl -D /usr/local/var/postgres start

Start the PostgreSQL service

If you look at the info command output above, it indicates the data directory for the database along with the full start command. Just run that command to start PostgreSQL.

To have launchd start postgresql now and restart at login:

smartechie-macos :~ $ brew services start postgresql

If you don't want or need a background service, you can just run:

smartechie-macos :~ $  pg_ctl -D /usr/local/var/postgres start
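
Once the server is up, a quick way to confirm it is accepting connections (assuming the default socket and port 5432):

smartechie-macos :~ $  pg_isready
smartechie-macos :~ $  psql -l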

Sample commands and examples

DDL Commands (see the example session after the list) :

  • create database
  • create table
  • drop table
  • alter table
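
A minimal psql session illustrating these commands; demo_db and employee are placeholder names used only for this example:

smartechie-macos :~ $  psql postgres
postgres=# create database demo_db;
postgres=# \c demo_db
demo_db=# create table employee (id serial primary key, name varchar(50));
demo_db=# alter table employee add column email varchar(100);
demo_db=# drop table employee;
demo_db=# \q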

 

Troubleshooting

  1. Error
    1. smartechie-macos ~ $ pg_ctl -D /usr/local/var/postgres start && brew services start postgresql
      waiting for server to start….2018-07-16 11:31:59.391 PDT [57964] LOG: could not bind IPv4 address “127.0.0.1”: Address already in use
      2018-07-16 11:31:59.391 PDT [57964] HINT: Is another postmaster already running on port 5432? If not, wait a few seconds and retry.
      2018-07-16 11:31:59.392 PDT [57964] LOG: could not bind IPv6 address “::1”: Address already in use
      2018-07-16 11:31:59.392 PDT [57964] HINT: Is another postmaster already running on port 5432? If not, wait a few seconds and retry.
      2018-07-16 11:31:59.392 PDT [57964] WARNING: could not create listen socket for “localhost”
      2018-07-16 11:31:59.392 PDT [57964] FATAL: could not create any TCP/IP sockets
      2018-07-16 11:31:59.392 PDT [57964] LOG: database system is shut down
      stopped waiting
      pg_ctl: could not start server
      Examine the log output.
      spradhan-macos-2:/ sudhir.pradhan$ psql postgres
      psql: FATAL: role “sudhir.pradhan” does not exist

    2. Solution
      1. smartechie-macos ~ $ sudo -u postgres createuser -s $(whoami); createdb $(whoami)
        Password:

      2. Result
        1. smartechie-macos ~$ psql postgres
          psql (10.4, server 9.6.2)
          Type “help” for help.

          postgres=#

 


Solved : Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try

Caused by: java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[172.27.10.191:50010,DS-51d68378-35de-4c70-b27e-7a98a53919cc,DISK], DatanodeInfoWithStorage[172.27.10.223:50010,DS-f172c682-713d-4a8f-b8af-69198ddc6756,DISK]], original=[DatanodeInfoWithStorage[172.27.10.191:50010,DS-51d68378-35de-4c70-b27e-7a98a53919cc,DISK], DatanodeInfoWithStorage[172.27.10.223:50010,DS-f172c682-713d-4a8f-b8af-69198ddc6756,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via ‘dfs.client.block.write.replace-datanode-on-failure.policy’ in its configuration.

at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:925)

at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:988)

at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1156)

at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:454)

[2017-11-10 07:08:03,097] ERROR Task hdfs-sink-prqt-stndln-rpro-fct-0 threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerSinkTask:455)

java.lang.RuntimeException: java.util.concurrent.ExecutionException: io.confluent.connect.hdfs.errors.HiveMetaStoreException: Invalid partition for default.revpro-perf-oracle-jdbc-stdln-raw-KFK_RPRO_RI_WF_SUMM_GV: partition=0

at io.confluent.connect.hdfs.DataWriter.write(DataWriter.java:226)

at io.confluent.connect.hdfs.HdfsSinkTask.put(HdfsSinkTask.java:103)

at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:435)

at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:251)

at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:180)

at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:148)

at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:146)

at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:190)

at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)

at java.util.concurrent.FutureTask.run(FutureTask.java:266)

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

at java.lang.Thread.run(Thread.java:748)

Caused by: java.util.concurrent.ExecutionException: io.confluent.connect.hdfs.errors.HiveMetaStoreException: Invalid partition for default.revpro-perf-oracle-jdbc-stdln-raw-KFK_RPRO_RI_WF_SUMM_GV: partition=0

at java.util.concurrent.FutureTask.report(FutureTask.java:122)

at java.util.concurrent.FutureTask.get(FutureTask.java:192)

at io.confluent.connect.hdfs.DataWriter.write(DataWriter.java:220)

… 12 more

Caused by: io.confluent.connect.hdfs.errors.HiveMetaStoreException: Invalid partition for default.revpro-perf-oracle-jdbc-stdln-raw-KFK_RPRO_RI_WF_SUMM_GV: partition=0

at io.confluent.connect.hdfs.hive.HiveMetaStore.addPartition(HiveMetaStore.java:107)

at io.confluent.connect.hdfs.TopicPartitionWriter$3.call(TopicPartitionWriter.java:662)

at io.confluent.connect.hdfs.TopicPartitionWriter$3.call(TopicPartitionWriter.java:659)

… 4 more

Caused by: InvalidObjectException(message:default.revpro-perf-oracle-jdbc-stdln-raw-KFK_RPRO_RI_WF_SUMM_GV table not found)

at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$append_partition_by_name_with_environment_context_result$append_partition_by_name_with_environment_context_resultStandardScheme.read(ThriftHiveMetastore.java:51619)

at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$append_partition_by_name_with_environment_context_result$append_partition_by_name_with_environment_context_resultStandardScheme.read(ThriftHiveMetastore.java:51596)

at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$append_partition_by_name_with_environment_context_result.read(ThriftHiveMetastore.java:51519)

at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)

at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_append_partition_by_name_with_environment_context(ThriftHiveMetastore.java:1667)

at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.append_partition_by_name_with_environment_context(ThriftHiveMetastore.java:1651)

at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.appendPartition(HiveMetaStoreClient.java:606)

at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.appendPartition(HiveMetaStoreClient.java:600)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.invoke(Method.java:498)

at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:152)

at com.sun.proxy.$Proxy51.appendPartition(Unknown Source)

at io.confluent.connect.hdfs.hive.HiveMetaStore$1.call(HiveMetaStore.java:97)

at io.confluent.connect.hdfs.hive.HiveMetaStore$1.call(HiveMetaStore.java:91)

at io.confluent.connect.hdfs.hive.HiveMetaStore.doAction(HiveMetaStore.java:87)

at io.confluent.connect.hdfs.hive.HiveMetaStore.addPartition(HiveMetaStore.java:103)

… 6 more

Root cause :

This error typically occurs on clusters with only three or fewer DataNodes, which can experience append failures under heavy load. The configuration parameters below fix it.

Solution :

Add the following configuration to your client's hdfs-site.xml file (not to the NameNode or DataNode configuration):

 

<property>
<name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
<value>true</value>
<description>
If there is a datanode/network failure in the write pipeline,
DFSClient will try to remove the failed datanode from the pipeline
and then continue writing with the remaining datanodes. As a result,
the number of datanodes in the pipeline is decreased. The feature is
to add new datanodes to the pipeline.

This is a site-wide property to enable/disable the feature.

When the cluster size is extremely small, e.g. 3 nodes or less, cluster
administrators may want to set the policy to NEVER in the default
configuration file or disable this feature. Otherwise, users may
experience an unusually high rate of pipeline failures since it is
impossible to find new datanodes for replacement.

See also dfs.client.block.write.replace-datanode-on-failure.policy
</description>
</property>

<property>
<name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
<value>DEFAULT</value>
<description>
This property is used only if the value of
dfs.client.block.write.replace-datanode-on-failure.enable is true.

ALWAYS: always add a new datanode when an existing datanode is removed.

NEVER: never add a new datanode.

DEFAULT:
Let r be the replication number.
Let n be the number of existing datanodes.
Add a new datanode only if r is greater than or equal to 3 and either
(1) floor(r/2) is greater than or equal to n; or
(2) r is greater than n and the block is hflushed/appended.
</description>
</property>

<property>
<name>dfs.client.block.write.replace-datanode-on-failure.best-effort</name>
<value>false</value>
<description>
This property is used only if the value of
dfs.client.block.write.replace-datanode-on-failure.enable is true.

Best effort means that the client will try to replace a failed datanode
in write pipeline (provided that the policy is satisfied), however, it
continues the write operation in case that the datanode replacement also
fails.

Suppose the datanode replacement fails.
false: An exception should be thrown so that the write will fail.
true : The write should be resumed with the remaining datanodes.

Note that setting this property to true allows writing to a pipeline
with a smaller number of datanodes. As a result, it increases the
probability of data loss.
</description>
</property>

The property descriptions above explain the behavior of each setting in more detail.
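
For a one-off client operation, the same client-side setting can usually be passed as a generic -D option instead of editing hdfs-site.xml (a sketch; the file and target path are placeholders, and your client must go through Hadoop's generic option parsing):

hadoop fs -Ddfs.client.block.write.replace-datanode-on-failure.policy=NEVER -put localfile.txt /user/hadoop/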

Create Superuser in HUE

[ec2-user@ip-123-45-67-890 ~]$ ls -ltr /usr/lib/hue/build/env/bin/hue

-rwxr-xr-x 1 root root 523 Sep 22 22:09 /usr/lib/hue/build/env/bin/hue

[ec2-user@ip-123-45-67-890 ~]$ sudo /usr/lib/hue/build/env/bin/hue createsuperuser

Username (leave blank to use ‘root’): sudhir

Email address: mail2sudhir.online@gmail.com

Password:

Password (again):

Superuser created successfully.

[ec2-user@ip-123-45-67-890 ~]$

ERROR when writing a file to an S3 bucket from an EMRFS-enabled Spark cluster

ERROR :

18/03/02 01:42:17 INFO RetryInvocationHandler: Exception while invoking ConsistencyCheckerS3FileSystem.mkdirs over null. Retrying after sleeping for 10000ms. com.amazon.ws.emr.hadoop.fs.consistency.exception.ConsistencyException: Directory ‘bucket/folder/_temporary’ present in the metadata but not s3 at com.amazon.ws.emr.hadoop.fs.consistency.ConsistencyCheckerS3FileSystem.getFileStatus(ConsistencyCheckerS3FileSystem.java:506)

 

Root cause :

Mostly, the consistency problem occurs due to:

  • Manual deletion of files and directories from the S3 console
  • Retry logic in Spark and Hadoop systems
  • A file creation on S3 failing after the entry has already been written to DynamoDB
  • The Hadoop process retrying the operation while the entry is already present in DynamoDB, which throws the consistency error

Solution :

Re-run your Spark job after cleaning up the EMRFS metadata in DynamoDB.

Follow the steps below to clean up and restore the metadata for the intended directory in the S3 bucket.

 

emrfs delete removes all the metadata entries for the path. Because it uses a hash function to find the records, it may delete unwanted entries as well, which is why the import and sync steps follow.

Delete all the metadata:

emrfs delete s3://<bucket>/path

Import the metadata for the objects that are physically present in S3 back into DynamoDB:

emrfs import s3://<bucket>/path

Sync the data between S3 and the metadata:

emrfs sync s3://<bucket>/path

After all the operations, check whether the objects are present in both S3 and the metadata:

emrfs diff s3://<bucket>/path
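
Putting the sequence together for a hypothetical bucket and prefix (my-bucket and data/output are placeholders):

emrfs delete s3://my-bucket/data/output
emrfs import s3://my-bucket/data/output
emrfs sync s3://my-bucket/data/output
emrfs diff s3://my-bucket/data/output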

Amazon Aurora MySQL Command Line

Connecting to a Database on a DB Instance Running the MySQL Database Engine

Once Amazon RDS provisions your DB instance, you can use any standard SQL client application to connect to a database on the DB instance. In this example, you connect to a database on a MySQL DB instance using MySQL monitor commands. One GUI-based application you can use to connect is MySQL Workbench; for more information, go to the Download MySQL Workbench page. For more information on using MySQL, go to the MySQL documentation.

 

To connect to a database on a DB instance using MySQL monitor

  1. Find the endpoint (DNS name) and port number for your DB Instance.
    1. Open the RDS console and then choose Instances to display a list of your DB instances.
    2. Choose the MySQL DB instance and choose See details from Instance actions to display the details for the DB instance.
    3. Scroll to the Connect section and copy the endpoint. Also, note the port number. You need both the endpoint and the port number to connect to the DB instance.
  2. Type the following command at a command prompt on a client computer to connect to a database on a MySQL DB instance using the MySQL monitor. Substitute the DNS name for your DB instance for <endpoint>, the master user name you used for <mymasteruser>, and the master password you used for <password>.
    spradhan.macosx$ mysql -h <endpoint> -P 3306 -u <mymasteruser> -p

    You should see output similar to the following.

    Welcome to the MySQL monitor.  Commands end with ; or \g.
    Your MySQL connection id is 350
    Server version: 5.6.27-log MySQL Community Server (GPL)
    
    Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
    
    mysql>

 

Basic MySQL command-line operations

To login (from unix shell) use -h only if needed.
# [mysql dir]/bin/mysql -h hostname -u username -ppassword
To login (from windows)
[mysql dir]/bin/mysql.exe -h hostname -u username -ppassword

Create a database.
mysql> create database [databasename];

List all databases on the server.
mysql> show databases;

Switch to a database.
mysql> use [db name];

To see all the tables in the db.
mysql> show tables;

To see table’s field formats.
mysql> describe [table name];

To delete a db.
mysql> drop database [database name];

To delete a table.
mysql> drop table [table name];

Show all data from a table.
mysql> SELECT * FROM [table name];

To return columns and column information.
mysql> show columns from [table name];

Show particular rows with the given value.
mysql> SELECT * FROM [table name] WHERE [field name] = "value";

Show all records containing the name “Something” AND the phone number ‘0123456789’.
mysql> SELECT * FROM [table name] WHERE name = "Something" AND phone_number = '0123456789';

Show all records not containing the name “Something” AND the phone number ‘0123456789’ order by the phone_number field.
mysql> SELECT * FROM [table name] WHERE name != "Something" AND phone_number = '0123456789' order by phone_number;

Show all records starting with the letters ‘Something‘ AND the phone number ‘0123456789’.
mysql> SELECT * FROM [table name] WHERE name like "Something%" AND phone_number = '0123456789';

Show all records starting with letters ‘Something’ AND the phone number ‘0123456789’ limit to records 1 through 5.
mysql> SELECT * FROM [table name] WHERE name like "Something%" AND phone_number = '0123456789' limit 1,5;

Use a regular expression to find records. Use “REGEXP BINARY” to force case-sensitivity. This finds any record beginning with a.
mysql> SELECT * FROM [table name] WHERE rec RLIKE "^a";

Show unique records.
mysql> SELECT DISTINCT [column name] FROM [table name];

Show selected records sorted in an ascending (asc) or descending (desc).
mysql> SELECT [col1],[col2] FROM [table name] ORDER BY [col2] DESC;

Return number of rows.
mysql> SELECT COUNT(*) FROM [table name];

Sum a column.
mysql> SELECT SUM([column name]) FROM [table name];

Creating a new user. Login as root. Switch to the MySQL db. Make the user. Update privs.
# mysql -u root -p
mysql> use mysql;
mysql> INSERT INTO user (Host,User,Password) VALUES('%','username',PASSWORD('password'));
mysql> flush privileges;

Change a users password from unix shell.
# [mysql dir]/bin/mysqladmin -u username -h hostname -ppassword 'new-password'

Change a users password from MySQL prompt. Login as root. Set the password. Update privileges.
# mysql -u root -p
mysql> SET PASSWORD FOR 'user'@'hostname' = PASSWORD('password');
mysql> flush privileges;

Recover a MySQL root password :

  1. Stop the MySQL server process.
  2. Start again with no grant tables.
  3. Login to MySQL as root. Set new password.
  4. Exit MySQL and restart MySQL server.
    # /etc/init.d/mysql stop
    # mysqld_safe --skip-grant-tables
    # mysql -u root
    mysql> use mysql;
    mysql> update user set password=PASSWORD("newpassword") where User='root';
    mysql> flush privileges;
    mysql> quit
    # /etc/init.d/mysql stop
    # /etc/init.d/mysql start

Set a root password if there is no root password.
# mysqladmin -u root password newpassword

Update a root password.
# mysqladmin -u root -p oldpassword newpassword

Allow the user “Someone” to connect to the server from localhost using the password “passwd”. Login as root. Switch to the MySQL db. Give privs. Update privs.
# mysql -u root -p
mysql> use mysql;
mysql> grant usage on *.* to Someone@localhost identified by 'passwd';
mysql> flush privileges;

Give a user privileges for a db. Login as root. Switch to the MySQL db. Grant privs. Update privs.
# mysql -u root -p
mysql> use mysql;
mysql> INSERT INTO db (Host,Db,User,Select_priv,Insert_priv,Update_priv,Delete_priv,Create_priv,Drop_priv)
VALUES ('%','databasename','username','Y','Y','Y','Y','Y','N');
mysql> flush privileges;

or

mysql> grant all privileges on databasename.* to username@localhost;
mysql> flush privileges;

To update info already in a table.
mysql> UPDATE [table name] SET Select_priv = 'Y',Insert_priv = 'Y',Update_priv = 'Y' where [field name] = 'user';

Delete a row(s) from a table.
mysql> DELETE from [table name] where [field name] = 'fieldvalue';

Update database permissions/privileges.
mysql> flush privileges;

Delete a column.
mysql> alter table [table name] drop column [column name];

Add a new column to db.
mysql> alter table [table name] add column [new column name] varchar (20);

Change column name.
mysql> alter table [table name] change [old column name] [new column name] varchar (50);

Make a unique column so you get no dupes.
mysql> alter table [table name] add unique ([column name]);

Make a column bigger.
mysql> alter table [table name] modify [column name] VARCHAR(3);

Delete unique from table.
mysql> alter table [table name] drop index [colmn name];

Load a CSV file into a table.
mysql> LOAD DATA INFILE '/tmp/filename.csv' replace INTO TABLE [table name] FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' (field1,field2,field3);

Dump all databases for backup. Backup file is sql commands to recreate all db’s.
# mysqldump -u username -ppassword --all-databases --opt > /tmp/alldatabases.sql

Dump one database for backup.
# mysqldump -u username -ppassword --databases databasename > /tmp/databasename.sql

Dump a table from a database.
# mysqldump -u username -ppassword databasename tablename > /tmp/databasename.tablename.sql

Restore database (or database table) from backup.
# mysql -u username -ppassword databasename < /tmp/databasename.sql

Create Table Example 1.
mysql> CREATE TABLE employee (name VARCHAR(20));

Create Table Example 2.
mysql> create table person (personid int(50) not null auto_increment primary key,firstname varchar(35),middlename varchar(50),lastname varchar(50) default 'UNKNOWN');

Exception when creating a Hive table from an HDFS parquet file

Problem

FAILED: SemanticException Cannot find class ‘parquet.hive.DeprecatedParquetInputFormat’

Solution

[hadoop@ip-123-45-67-890 extjars]$mkdir extjars

[hadoop@ip-123-45-67-890 extjars]$cd extjars/

Now download the required jars:

[hadoop@ip-123-45-67-890 extjars]$for f in parquet-avro parquet-cascading parquet-column parquet-common parquet-encoding parquet-generator parquet-hadoop parquet-hive parquet-pig parquet-scrooge parquet-test-hadoop2 parquet-thrift
do
curl -O https://oss.sonatype.org/service/local/repositories/releases/content/com/twitter/${f}/1.2.4/${f}-1.2.4.jar
done
curl -O https://oss.sonatype.org/service/local/repositories/releases/content/com/twitter/parquet-format/1.0.0/parquet-format-1.0.0.jar

 

[hadoop@ip-123-45-67-890 extjars]$ ls -ltr
total 5472
-rw-rw-r-- 1 hadoop hadoop 891821 Dec 12 23:05 parquet-avro-1.2.4.jar
-rw-rw-r-- 1 hadoop hadoop 13205 Dec 12 23:05 parquet-cascading-1.2.4.jar
-rw-rw-r-- 1 hadoop hadoop 648465 Dec 12 23:05 parquet-column-1.2.4.jar
-rw-rw-r-- 1 hadoop hadoop 10934 Dec 12 23:05 parquet-common-1.2.4.jar
-rw-rw-r-- 1 hadoop hadoop 271868 Dec 12 23:05 parquet-encoding-1.2.4.jar
-rw-rw-r-- 1 hadoop hadoop 10102 Dec 12 23:05 parquet-generator-1.2.4.jar
-rw-rw-r-- 1 hadoop hadoop 1085440 Dec 12 23:05 parquet-hadoop-1.2.4.jar
-rw-rw-r-- 1 hadoop hadoop 70280 Dec 12 23:05 parquet-hive-1.2.4.jar
-rw-rw-r-- 1 hadoop hadoop 1019306 Dec 12 23:05 parquet-pig-1.2.4.jar
-rw-rw-r-- 1 hadoop hadoop 6008 Dec 12 23:05 parquet-scrooge-1.2.4.jar
-rw-rw-r-- 1 hadoop hadoop 1758 Dec 12 23:05 parquet-test-hadoop2-1.2.4.jar
-rw-rw-r-- 1 hadoop hadoop 1107251 Dec 12 23:05 parquet-thrift-1.2.4.jar
-rw-rw-r-- 1 hadoop hadoop 440035 Dec 12 23:05 parquet-format-1.0.0.jar

Copy the downloaded jars to HDFS:

[hadoop@ip-123-45-67-890 extjars]$ hadoop fs -mkdir /home/hadoop/extjars

[hadoop@ip-123-45-67-890 extjars]$ hadoop fs -copyFromLocal extjars/* /home/hadoop/extjars/

[hadoop@ip-123-45-67-890 extjars]$ hadoop fs -ls /home/hadoop/extjars
Found 13 items
-rw-r--r-- 3 hadoop hadoop 891821 2017-12-12 23:24 /home/hadoop/extjars/parquet-avro-1.2.4.jar
-rw-r--r-- 3 hadoop hadoop 13205 2017-12-12 23:24 /home/hadoop/extjars/parquet-cascading-1.2.4.jar
-rw-r--r-- 3 hadoop hadoop 648465 2017-12-12 23:24 /home/hadoop/extjars/parquet-column-1.2.4.jar
-rw-r--r-- 3 hadoop hadoop 10934 2017-12-12 23:24 /home/hadoop/extjars/parquet-common-1.2.4.jar
-rw-r--r-- 3 hadoop hadoop 271868 2017-12-12 23:24 /home/hadoop/extjars/parquet-encoding-1.2.4.jar
-rw-r--r-- 3 hadoop hadoop 440035 2017-12-12 23:24 /home/hadoop/extjars/parquet-format-1.0.0.jar
-rw-r--r-- 3 hadoop hadoop 10102 2017-12-12 23:24 /home/hadoop/extjars/parquet-generator-1.2.4.jar
-rw-r--r-- 3 hadoop hadoop 1085440 2017-12-12 23:24 /home/hadoop/extjars/parquet-hadoop-1.2.4.jar
-rw-r--r-- 3 hadoop hadoop 70280 2017-12-12 23:24 /home/hadoop/extjars/parquet-hive-1.2.4.jar
-rw-r--r-- 3 hadoop hadoop 1019306 2017-12-12 23:24 /home/hadoop/extjars/parquet-pig-1.2.4.jar
-rw-r--r-- 3 hadoop hadoop 6008 2017-12-12 23:24 /home/hadoop/extjars/parquet-scrooge-1.2.4.jar
-rw-r--r-- 3 hadoop hadoop 1758 2017-12-12 23:24 /home/hadoop/extjars/parquet-test-hadoop2-1.2.4.jar
-rw-r--r-- 3 hadoop hadoop 1107251 2017-12-12 23:24 /home/hadoop/extjars/parquet-thrift-1.2.4.jar

Run the hive console and add the jars with their fully qualified paths:

[hadoop@ip-123-45-67-890 extjars]$ hive

hive>add jar hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-avro-1.2.4.jar;
Added [/mnt/tmp/e941b6f7-f294-4f00-b120-844a6f5418de_resources/parquet-avro-1.2.4.jar] to class path
Added resources: [hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-avro-1.2.4.jar]
hive> add jar hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-cascading-1.2.4.jar;
Added [/mnt/tmp/e941b6f7-f294-4f00-b120-844a6f5418de_resources/parquet-cascading-1.2.4.jar] to class path
Added resources: [hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-cascading-1.2.4.jar]
hive> add jar hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-column-1.2.4.jar;
Added [/mnt/tmp/e941b6f7-f294-4f00-b120-844a6f5418de_resources/parquet-column-1.2.4.jar] to class path
Added resources: [hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-column-1.2.4.jar]
hive> add jar hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-common-1.2.4.jar;
Added [/mnt/tmp/e941b6f7-f294-4f00-b120-844a6f5418de_resources/parquet-common-1.2.4.jar] to class path
Added resources: [hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-common-1.2.4.jar]
hive> add jar hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-encoding-1.2.4.jar;
Added [/mnt/tmp/e941b6f7-f294-4f00-b120-844a6f5418de_resources/parquet-encoding-1.2.4.jar] to class path
Added resources: [hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-encoding-1.2.4.jar]
hive> add jar hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-format-1.0.0.jar;
Added [/mnt/tmp/e941b6f7-f294-4f00-b120-844a6f5418de_resources/parquet-format-1.0.0.jar] to class path
Added resources: [hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-format-1.0.0.jar]
hive> add jar hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-generator-1.2.4.jar;
Added [/mnt/tmp/e941b6f7-f294-4f00-b120-844a6f5418de_resources/parquet-generator-1.2.4.jar] to class path
Added resources: [hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-generator-1.2.4.jar]
hive> add jar hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-hadoop-1.2.4.jar;
Added [/mnt/tmp/e941b6f7-f294-4f00-b120-844a6f5418de_resources/parquet-hadoop-1.2.4.jar] to class path
Added resources: [hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-hadoop-1.2.4.jar]
hive> add jar hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-hive-1.2.4.jar;
Added [/mnt/tmp/e941b6f7-f294-4f00-b120-844a6f5418de_resources/parquet-hive-1.2.4.jar] to class path
Added resources: [hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-hive-1.2.4.jar]
hive> add jar hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-pig-1.2.4.jar;
Added [/mnt/tmp/e941b6f7-f294-4f00-b120-844a6f5418de_resources/parquet-pig-1.2.4.jar] to class path
Added resources: [hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-pig-1.2.4.jar]
hive> add jar hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-scrooge-1.2.4.jar;
Added [/mnt/tmp/e941b6f7-f294-4f00-b120-844a6f5418de_resources/parquet-scrooge-1.2.4.jar] to class path
Added resources: [hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-scrooge-1.2.4.jar]
hive> add jar hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-test-hadoop2-1.2.4.jar;
Added [/mnt/tmp/e941b6f7-f294-4f00-b120-844a6f5418de_resources/parquet-test-hadoop2-1.2.4.jar] to class path
Added resources: [hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-test-hadoop2-1.2.4.jar]
hive> add jar hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-thrift-1.2.4.jar;
Added [/mnt/tmp/e941b6f7-f294-4f00-b120-844a6f5418de_resources/parquet-thrift-1.2.4.jar] to class path
Added resources: [hdfs://xxx.xx.xx.xxx:8020/home/hadoop/extjars/parquet-thrift-1.2.4.jar]
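
With the jars added, the external table can be declared over the parquet files. A sketch of the DDL, where the table name, columns, and HDFS location are placeholders and the SerDe/format classes are the parquet.hive ones these jars provide:

hive> CREATE EXTERNAL TABLE my_parquet_table (id INT, name STRING)
    > ROW FORMAT SERDE 'parquet.hive.serde.ParquetHiveSerDe'
    > STORED AS INPUTFORMAT 'parquet.hive.DeprecatedParquetInputFormat'
    > OUTPUTFORMAT 'parquet.hive.DeprecatedParquetOutputFormat'
    > LOCATION 'hdfs://xxx.xx.xx.xxx:8020/path/to/parquet/files';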

 

 

Solved: Hive work directory creation issue

 

Exception:

smartechie:~ sudhir.pradhan$ hive

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/Cellar/hive/2.3.1/libexec/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/Cellar/hadoop/2.8.0/libexec/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Logging initialized using configuration in jar:file:/usr/local/Cellar/hive/2.3.1/libexec/lib/hive-common-2.3.1.jar!/hive-log4j2.properties Async: true
Exception in thread "main" java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D
at org.apache.hadoop.fs.Path.initialize(Path.java:254)
at org.apache.hadoop.fs.Path.<init>(Path.java:212)
at org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:659)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:582)
at org.apache.hadoop.hive.ql.session.SessionState.beginStart(SessionState.java:549)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:750)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:234)
at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Caused by: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D
at java.net.URI.checkPath(URI.java:1823)
at java.net.URI.<init>(URI.java:745)
at org.apache.hadoop.fs.Path.initialize(Path.java:251)
... 12 more

 

Solution :

<property>
<name>hive.exec.scratchdir</name>
<value>/tmp/hive-${user.name}</value>
</property>

<property>
 <name>hive.exec.local.scratchdir</name>
 <value>/tmp/${user.name}</value>
</property>

<property>
<name>hive.downloaded.resources.dir</name>
<value>/tmp/${user.name}_resources</value>
</property>

<property>
<name>hive.scratch.dir.permission</name>
    <value>733</value>
</property>
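
These properties go in hive-site.xml. For the Homebrew layout shown in the exception above, the configuration directory is assumed to sit under libexec; after saving the file, start hive again:

smartechie:~ sudhir.pradhan$ vi /usr/local/Cellar/hive/2.3.1/libexec/conf/hive-site.xml
smartechie:~ sudhir.pradhan$ hive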