Step By Step: Getting Kafka Installed on macOS Sierra

In this walkthrough we use the Homebrew package manager throughout the installation process. Homebrew makes it easy to install the latest version of a package and to keep it up to date.

Step 1 : Install Homebrew (as an administrator)

$ /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

The above command will install the following packages :

  • Homebrew
  • Command Line Tools for Xcode 8.2
  • Other supporting libraries

Following is a sample of the log you might see during the Homebrew installation:

Sudhirs-MacBook-Pro:~ sudhir$ /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

==> This script will install:
/usr/local/bin/brew
/usr/local/share/doc/homebrew
/usr/local/share/man/man1/brew.1
/usr/local/share/zsh/site-functions/_brew
/usr/local/etc/bash_completion.d/brew
/usr/local/Homebrew
==> The following new directories will be created:
/usr/local/Cellar
/usr/local/Homebrew
/usr/local/Frameworks
/usr/local/bin
/usr/local/etc
/usr/local/include
/usr/local/lib
/usr/local/opt
/usr/local/sbin
/usr/local/share
/usr/local/share/zsh
/usr/local/share/zsh/site-functions
/usr/local/var


==> Cleaning up /Library/Caches/Homebrew...
==> Migrating /Library/Caches/Homebrew to /Users/sudhir/Library/Caches/Homebrew...
==> Deleting /Library/Caches/Homebrew...
Already up-to-date.
==> Installation successful!

==> Homebrew has enabled anonymous aggregate user behaviour analytics.
Read the analytics documentation (and how to opt-out) here:
  https://git.io/brew-analytics

==> Next steps:
- Run `brew help` to get started
- Further documentation: 
    https://git.io/brew-docs

Step 2 : Homebrew is now ready to use. Install the wget tool

$ brew install wget

The above command will install the wget package along with its dependency, the OpenSSL library.

Sample log for the above command:

Sudhirs-MacBook-Pro:~ sudhir$ brew install wget

==> Installing dependencies for wget: openssl
==> Installing wget dependency: openssl
==> Downloading https://homebrew.bintray.com/bottles/openssl-1.0.2j.sierra.bottle.tar.gz
######################################################################## 100.0%
==> Pouring openssl-1.0.2j.sierra.bottle.tar.gz
==> Using the sandbox
==> Caveats
A CA file has been bootstrapped using certificates from the SystemRoots
keychain. To add additional certificates (e.g. the certificates added in
the System keychain), place .pem files in
  /usr/local/etc/openssl/certs

and run
  /usr/local/opt/openssl/bin/c_rehash

This formula is keg-only, which means it was not symlinked into /usr/local.

Apple has deprecated use of OpenSSL in favor of its own TLS and crypto libraries

Generally there are no consequences of this for you. If you build your
own software and it requires this formula, you will need to add to your
build variables:

    LDFLAGS:  -L/usr/local/opt/openssl/lib
    CPPFLAGS: -I/usr/local/opt/openssl/include

==> Summary
🍺  /usr/local/Cellar/openssl/1.0.2j: 1,695 files, 12M
==> Installing wget 
==> Downloading https://homebrew.bintray.com/bottles/wget-1.18.sierra.bottle.tar.gz
######################################################################## 100.0%
==> Pouring wget-1.18.sierra.bottle.tar.gz
🍺  /usr/local/Cellar/wget/1.18: 9 files, 1.6M

Step 3 : Now it's time to install Kafka

  • You can search available Kafka builds by executing the following command
$ brew search kafka

The above command will give you a result something like this:

Sudhirs-MacBook-Pro:~ sudhir$ brew search kafka

kafka                                           kafkacat                                        librdkafka
homebrew/php/php53-kafka            homebrew/php/php54-rdkafka          homebrew/php/php56-kafka            homebrew/php/php70-rdkafka
homebrew/php/php53-rdkafka          homebrew/php/php55-kafka            homebrew/php/php56-rdkafka          homebrew/php/php71-rdkafka
homebrew/php/php54-kafka            homebrew/php/php55-rdkafka          homebrew/php/php70-kafka            Caskroom/cask/kafka-tool
  • Now execute the following command to install Kafka
$ brew install kafka

The above command will give you a result something like this:

Sudhirs-MacBook-Pro:kafka-0.8.2 sudhir$ brew install kafka

==> Installing dependencies for kafka: zookeeper
==> Installing kafka dependency: zookeeper
==> Downloading https://homebrew.bintray.com/bottles/zookeeper-3.4.9.sierra.bottle.tar.gz
######################################################################## 100.0%
==> Pouring zookeeper-3.4.9.sierra.bottle.tar.gz
==> Caveats
To have launchd start zookeeper now and restart at login:
  brew services start zookeeper
Or, if you do not want/need a background service you can just run:
  zkServer start
==> Summary
🍺  /usr/local/Cellar/zookeeper/3.4.9: 238 files, 18.2M
==> Installing kafka 
==> Downloading https://homebrew.bintray.com/bottles/kafka.sierra.bottle.tar.gz
######################################################################## 100.0%
==> Pouring kafka.sierra.bottle.tar.gz
==> Caveats
To have launchd start kafka now and restart at login:
  brew services start kafka
Or, if you do not want/need a background service you can just run:
  zookeeper-server-start /usr/local/etc/kafka/zookeeper.properties & kafka-server-start /usr/local/etc/kafka/server.properties
==> Summary
🍺  /usr/local/Cellar/kafka/ : 128 files, 35.2M

The above Kafka installation also installs all dependencies, such as ZooKeeper, which is required to run the Kafka server.
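
The caveats in the log above mention brew services as a way to run both processes in the background. As a rough sketch of that route (the SERVICES variable is just an illustrative name), ZooKeeper is started first because Kafka depends on it:

```shell
# Start order matters: ZooKeeper first, then Kafka
SERVICES="zookeeper kafka"

if command -v brew >/dev/null 2>&1; then
  for svc in $SERVICES; do
    brew services start "$svc"   # registers the service with launchd and starts it
  done
  brew services list             # shows the status of each managed service
else
  echo "brew not found on PATH"
fi
```

This is an alternative to starting the servers by hand as in the following steps; use one approach or the other, otherwise the second start fails because the ports are already taken.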

You can see that the configuration files for Kafka & ZooKeeper have been installed as follows:

Sudhirs-MacBook-Pro:~ sudhir$ ls -ltr /usr/local/etc
total 16
drwxr-xr-x   3 sudhir  admin   102 Jan 23 21:42 bash_completion.d
drwxr-xr-x   7 sudhir  admin   238 Jan 23 21:42 openssl
-rw-r--r--   1 sudhir  admin  4945 Jan 23 21:42 wgetrc
drwxr-xr-x   6 sudhir  admin   204 Jan 23 21:47 zookeeper
drwxr-xr-x  15 sudhir  admin   510 Jan 23 21:47 kafka
Sudhirs-MacBook-Pro:~ sudhir$ ls -ltr /usr/local/etc/zookeeper/
total 32
-rw-r--r--  1 sudhir  admin  941 Jan 23 21:47 zoo_sample.cfg
-rw-r--r--  1 sudhir  admin  941 Jan 23 21:47 zoo.cfg
-rw-r--r--  1 sudhir  admin  339 Jan 23 21:47 log4j.properties
-rw-r--r--  1 sudhir  admin   44 Jan 23 21:47 defaults
Sudhirs-MacBook-Pro:~ sudhir$ view /usr/local/etc/zookeeper/zoo.cfg
Sudhirs-MacBook-Pro:~ sudhir$ ls -ltr /usr/local/etc/kafka/
total 120
-rw-r--r--  1 sudhir  admin  1037 Jan 23 21:47 zookeeper.properties
-rw-r--r--  1 sudhir  admin  1032 Jan 23 21:47 tools-log4j.properties
-rw-r--r--  1 sudhir  admin  5350 Jan 23 21:47 server.properties
-rw-r--r--  1 sudhir  admin  1900 Jan 23 21:47 producer.properties
-rw-r--r--  1 sudhir  admin  4369 Jan 23 21:47 log4j.properties
-rw-r--r--  1 sudhir  admin  1199 Jan 23 21:47 consumer.properties
-rw-r--r--  1 sudhir  admin  2061 Jan 23 21:47 connect-standalone.properties
-rw-r--r--  1 sudhir  admin  1074 Jan 23 21:47 connect-log4j.properties
-rw-r--r--  1 sudhir  admin   881 Jan 23 21:47 connect-file-source.properties
-rw-r--r--  1 sudhir  admin   883 Jan 23 21:47 connect-file-sink.properties
-rw-r--r--  1 sudhir  admin  2760 Jan 23 21:47 connect-distributed.properties
-rw-r--r--  1 sudhir  admin   909 Jan 23 21:47 connect-console-source.properties
-rw-r--r--  1 sudhir  admin   906 Jan 23 21:47 connect-console-sink.properties
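
To peek at the settings that matter most for a single local broker, a small helper can pull values out of server.properties. This is only a sketch: prop_get is a hypothetical helper name, and the three keys shown are standard Kafka broker settings whose values depend on your install.

```shell
# prop_get: hypothetical helper, prints the value of KEY from a .properties FILE
prop_get() {
  # drop comment lines, keep the first "key=value" match, print what follows "="
  grep -v '^[[:space:]]*#' "$1" | grep "^$2=" | head -n 1 | cut -d= -f2-
}

CONF=/usr/local/etc/kafka/server.properties
if [ -f "$CONF" ]; then
  for key in broker.id log.dirs zookeeper.connect; do
    echo "$key = $(prop_get "$CONF" "$key")"
  done
fi
```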

Step 3 : Before starting the Kafka server, first start ZooKeeper, which Kafka relies on for cluster coordination and leader election.

  • Check that ZooKeeper is really installed
Sudhirs-MacBook-Pro:~ sudhir$ which zkserver
/usr/local/bin/zkserver

Sudhirs-MacBook-Pro:~ sudhir$ zkserver
ZooKeeper JMX enabled by default
Using config: /usr/local/etc/zookeeper/zoo.cfg
Usage: ./zkServer.sh {start|start-foreground|stop|restart|status|upgrade|print-cmd}

  • Now start the ZooKeeper server
Sudhirs-MacBook-Pro:~ sudhir$ zkserver start

ZooKeeper JMX enabled by default
Using config: /usr/local/etc/zookeeper/zoo.cfg
Starting zookeeper ... STARTED
  • Test whether the ZooKeeper server has really started
Sudhirs-MacBook-Pro:~ sudhir$ telnet localhost 2181
Trying ::1...
Connected to localhost.
Escape character is '^]'.

^CConnection closed by foreign host.

The above log shows that we can telnet to the ZooKeeper port (2181), so the ZooKeeper server is up and running.
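
Telnet only proves that the port accepts connections. As a slightly stronger check, ZooKeeper understands the four-letter command ruok and a healthy server of the 3.4.9 release installed here answers imok. The sketch below sends it with nc, which ships with macOS; the helper name zk_ok is made up for this example, and newer ZooKeeper releases may need the command whitelisted before it responds.

```shell
# zk_ok: hypothetical helper, succeeds if ZooKeeper at HOST PORT answers "imok"
zk_ok() {
  # -w 2 gives up after two seconds instead of hanging
  reply=$(echo ruok | nc -w 2 "$1" "$2" 2>/dev/null)
  [ "$reply" = "imok" ]
}

if zk_ok localhost 2181; then
  echo "ZooKeeper is up"
else
  echo "ZooKeeper is not answering on localhost:2181"
fi
```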

Step 4 : Let's create a symlink to the configuration directory created by the Kafka installation (it's not mandatory, but it makes life easier because you don't have to touch the original directory)

Sudhirs-MacBook-Pro:~ sudhir$ ln -nsf /usr/local/etc/ ./symln.etc/

Sudhirs-MacBook-Pro:~ sudhir$ ls -ltr ~/symln.etc/
total 24
drwxr-xr-x   3 sudhir  admin   102 Jan 23 21:42 bash_completion.d
drwxr-xr-x   7 sudhir  admin   238 Jan 23 21:42 openssl
-rw-r--r--   1 sudhir  admin  4945 Jan 23 21:42 wgetrc
drwxr-xr-x   6 sudhir  admin   204 Jan 23 21:47 zookeeper
drwxr-xr-x  15 sudhir  admin   510 Jan 23 21:47 kafka
lrwxr-xr-x   1 sudhir  admin    15 Jan 23 21:59 etc -> /usr/local/etc/
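
In the ln command, -s creates a symbolic link, -f replaces an existing link, and -n stops ln from following an existing link that points to a directory. The extra etc entry at the end of the listing above is most likely what you get when the command is repeated with a trailing slash on the link name, because the new link then lands inside the target directory. The sketch below shows the same idea in a scratch directory (all paths are stand-ins, not the real /usr/local/etc) without the trailing slash:

```shell
# Work in a scratch directory so nothing in $HOME is touched
work=$(mktemp -d)
mkdir "$work/etc"                       # stand-in for /usr/local/etc

ln -nsf "$work/etc" "$work/symln.etc"   # no trailing slash on the link name
ln -nsf "$work/etc" "$work/symln.etc"   # safe to repeat: the link is simply replaced

readlink "$work/symln.etc"              # prints the link target
ls -A "$work/etc"                       # prints nothing: no stray link inside
```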

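Step 5 : Now let's start the Kafka server (this assumes ~/symln.kafka is a symlink to the Kafka installation directory, created in the same way as ~/symln.etc)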

Sudhirs-MacBook-Pro:~ sudhir$ cd ~/symln.kafka/

Sudhirs-MacBook-Pro:~ sudhir$ pwd
/Users/sudhir/symln.kafka/

Sudhirs-MacBook-Pro: sudhir$ ./bin/kafka-server-start ~/symln.etc/kafka/server.properties

After successful execution of the above command, you might see the following result.

Sudhirs-MacBook-Pro: sudhir$ ./bin/kafka-server-start ~/symln.etc/kafka/server.properties
[2017-01-24 22:50:08,731] INFO KafkaConfig values: 
	advertised.host.name = null
	advertised.listeners = null
	advertised.port = null
	.........
.............
[2017-01-24 22:50:34,751] INFO Completed load of log myTopic-0 with 1 log segments and log end offset 0 in 4 ms (kafka.log.Log)
[2017-01-24 22:50:34,753] INFO Created log for partition [myTopic,0] in /usr/local/var/lib/kafka-logs with properties {compression.type -> producer, message.format.version -> 0.10.1-IV2, file.delete.delay.ms -> 60000, max.message.bytes -> 1000012, min.compaction.lag.ms -> 0, message.timestamp.type -> CreateTime, min.insync.replicas -> 1, segment.jitter.ms -> 0, preallocate -> false, min.cleanable.dirty.ratio -> 0.5, index.interval.bytes -> 4096, unclean.leader.election.enable -> true, retention.bytes -> -1, delete.retention.ms -> 86400000, cleanup.policy -> [delete], flush.ms -> 9223372036854775807, segment.ms -> 604800000, segment.bytes -> 1073741824, retention.ms -> 604800000, message.timestamp.difference.max.ms -> 9223372036854775807, segment.index.bytes -> 10485760, flush.messages -> 9223372036854775807}. (kafka.log.LogManager)
[2017-01-24 22:50:34,754] INFO Partition [myTopic,0] on broker 0: No checkpointed highwatermark is found for partition [myTopic,0] (kafka.cluster.Partition)
[
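
The log above shows that the topic myTopic already existed on the author's broker. On a fresh install the broker normally creates a topic on first use (auto.create.topics.enable defaults to true), but it can also be created explicitly. The following is a sketch for the Kafka 0.10.x tools used in this post; newer Kafka releases take --bootstrap-server instead of --zookeeper, and the variable names are only illustrative.

```shell
TOPIC=myTopic
ZK=localhost:2181

if command -v kafka-topics >/dev/null 2>&1; then
  # One partition and one replica are enough for a single local broker
  kafka-topics --create --zookeeper "$ZK" \
    --replication-factor 1 --partitions 1 --topic "$TOPIC"
  kafka-topics --list --zookeeper "$ZK"                        # should include the topic
  kafka-topics --describe --zookeeper "$ZK" --topic "$TOPIC"   # partition and leader details
else
  echo "kafka-topics not found on PATH"
fi
```

If the topic already exists, the create call reports an error and can simply be skipped.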

Step 6 : Now we are ready to start our Consumer and Producer to pull data from and push data to a Kafka topic, respectively. (Here we will use the command line consumer & producer for testing purposes)

  1. Start Consumer
Sudhirs-MacBook-Pro: sudhir$ pwd
/Users/sudhir/symln.kafka/

Sudhirs-MacBook-Pro:~ sudhir$ cd ~/symln.kafka/


Sudhirs-MacBook-Pro: sudhir$ ls -ltr bin/
total 216
-r-xr-xr-x  1 sudhir  admin  141 Jan 23 21:47 zookeeper-shell
-r-xr-xr-x  1 sudhir  admin  147 Jan 23 21:47 zookeeper-server-stop
-r-xr-xr-x  1 sudhir  admin  148 Jan 23 21:47 zookeeper-server-start
-r-xr-xr-x  1 sudhir  admin  154 Jan 23 21:47 zookeeper-security-migration
-r-xr-xr-x  1 sudhir  admin  151 Jan 23 21:47 kafka-verifiable-producer
-r-xr-xr-x  1 sudhir  admin  151 Jan 23 21:47 kafka-verifiable-consumer
-r-xr-xr-x  1 sudhir  admin  138 Jan 23 21:47 kafka-topics
-r-xr-xr-x  1 sudhir  admin  157 Jan 23 21:47 kafka-streams-application-reset
-r-xr-xr-x  1 sudhir  admin  153 Jan 23 21:47 kafka-simple-consumer-shell
-r-xr-xr-x  1 sudhir  admin  143 Jan 23 21:47 kafka-server-stop
-r-xr-xr-x  1 sudhir  admin  144 Jan 23 21:47 kafka-server-start
-r-xr-xr-x  1 sudhir  admin  141 Jan 23 21:47 kafka-run-class
-r-xr-xr-x  1 sudhir  admin  152 Jan 23 21:47 kafka-replica-verification
-r-xr-xr-x  1 sudhir  admin  151 Jan 23 21:47 kafka-replay-log-producer
-r-xr-xr-x  1 sudhir  admin  151 Jan 23 21:47 kafka-reassign-partitions
-r-xr-xr-x  1 sudhir  admin  150 Jan 23 21:47 kafka-producer-perf-test
-r-xr-xr-x  1 sudhir  admin  158 Jan 23 21:47 kafka-preferred-replica-election
-r-xr-xr-x  1 sudhir  admin  144 Jan 23 21:47 kafka-mirror-maker
-r-xr-xr-x  1 sudhir  admin  150 Jan 23 21:47 kafka-consumer-perf-test
-r-xr-xr-x  1 sudhir  admin  155 Jan 23 21:47 kafka-consumer-offset-checker
-r-xr-xr-x  1 sudhir  admin  147 Jan 23 21:47 kafka-consumer-groups
-r-xr-xr-x  1 sudhir  admin  148 Jan 23 21:47 kafka-console-producer
-r-xr-xr-x  1 sudhir  admin  148 Jan 23 21:47 kafka-console-consumer
-r-xr-xr-x  1 sudhir  admin  139 Jan 23 21:47 kafka-configs
-r-xr-xr-x  1 sudhir  admin  136 Jan 23 21:47 kafka-acls
-r-xr-xr-x  1 sudhir  admin  144 Jan 23 21:47 connect-standalone
-r-xr-xr-x  1 sudhir  admin  145 Jan 23 21:47 connect-distributed

Execute the Kafka consumer in a new terminal (it will write each message to the same console as soon as a producer publishes data to the Kafka topic named “myTopic”)

Sudhirs-MacBook-Pro: sudhir$ ./bin/kafka-console-consumer --zookeeper localhost:2181 --topic myTopic --from-beginning

Using the ConsoleConsumer with old consumer is deprecated and will be removed in a future major release. Consider using the new consumer by passing [bootstrap-server] instead of [zookeeper].

Now the Consumer has started, but we don’t see any messages being consumed because we haven’t started the producer yet and no data has been published

  2. Now start the producer in a new terminal (it will publish data into the Kafka topic “myTopic”, to which the consumer is already subscribed)

Sudhirs-MacBook-Pro:0.10.1.0 sudhir$ ./bin/kafka-console-producer --broker-list localhost:9092 --topic myTopic

Now type something like “Hello Kafka” and press enter

Sudhirs-MacBook-Pro:0.10.1.0 sudhir$ ./bin/kafka-console-producer --broker-list localhost:9092 --topic myTopic
Hello Kafka
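
The console producer reads from standard input, so messages can also be piped in instead of typed, which is handy for scripted tests. A minimal sketch, assuming the broker from Step 5 is listening on localhost:9092 (the variable names are only illustrative):

```shell
TOPIC=myTopic
BROKER=localhost:9092

if command -v kafka-console-producer >/dev/null 2>&1; then
  # Each input line becomes one message on the topic
  printf 'Hello Kafka\nHello again\n' |
    kafka-console-producer --broker-list "$BROKER" --topic "$TOPIC"
else
  echo "kafka-console-producer not found on PATH"
fi
```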

Now look at the Consumer terminal, where you can see that the consumer has consumed the data published by the producer

Sudhirs-MacBook-Pro: sudhir$ ./bin/kafka-console-consumer --zookeeper localhost:2181 --topic myTopic --from-beginning
Using the ConsoleConsumer with old consumer is deprecated and will be removed in a future major release. Consider using the new consumer by passing [bootstrap-server] instead of [zookeeper].
Hello Kafka
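
The deprecation warning above suggests passing bootstrap-server instead of zookeeper. A sketch of that form is shown below; it talks to the broker on port 9092 rather than to ZooKeeper, and --max-messages makes the consumer exit after reading one message instead of waiting forever. The variable names are only illustrative.

```shell
TOPIC=myTopic
BOOTSTRAP=localhost:9092

if command -v kafka-console-consumer >/dev/null 2>&1; then
  # New consumer: connects to the broker directly, no --zookeeper needed
  kafka-console-consumer --bootstrap-server "$BOOTSTRAP" \
    --topic "$TOPIC" --from-beginning --max-messages 1
else
  echo "kafka-console-consumer not found on PATH"
fi
```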

Conclusion :

This is just the beginning with Kafka; there is much more to get into.

Kafka provides the following:

  • Persistent messaging with constant-time, O(1), performance even with terabytes of stored messages.
  • High throughput.
  • Partitioned messaging service in a distributed environment.
  • Support for parallel data streaming & loading into Hadoop.

Note : The performance, latency & throughput might differ according to some of the crucial configuration parameters set in the Kafka setup

I will try to cover some use cases regarding those parameters in a different post.

Thanks for visiting the article & hope it helps.
