
In this post I will show how to install Apache Kafka on a Linux VM.

Apache Kafka is an open-source stream-processing platform developed by the Apache Software Foundation and written in Scala and Java.

First of all we need a Linux VM (e.g. CentOS 7) with at least 1 GB of RAM.

Connect as root and add a new user for Kafka

[root@kafka ~]# adduser kafka

set the password

[root@kafka ~]# passwd kafka
Changing password for user kafka.
New password:
Retype new password:
passwd: all authentication tokens updated successfully.

now add the newly created user to the wheel group so it can use sudo

[root@kafka ~]# gpasswd -a kafka wheel


[root@kafka ~]# visudo

find the following lines

## Allow root to run any commands anywhere
root ALL=(ALL) ALL

and add the following line below them

kafka ALL=(ALL) ALL
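To confirm the change you can check the group membership without logging out; a small sketch (the function name `check_wheel` is just illustrative):

```shell
# succeeds if the given user is a member of the wheel group
check_wheel() {
  id "$1" 2>/dev/null | grep -q '(wheel)'
}

if check_wheel kafka; then
  echo "kafka is in the wheel group"
fi
```

Note that the wheel membership only matters if the `%wheel` line in sudoers is enabled; the explicit `kafka ALL=(ALL) ALL` entry above works either way.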

update and reboot the system

[root@kafka ~]# yum update -y

#reboot the system

[root@kafka ~]# reboot

login with the user created above and install Java

[kafka@kafka ~]$ sudo yum install java-1.8.0-openjdk -y


check the java version

[kafka@kafka ~]$ java -version

openjdk version "1.8.0_144"
OpenJDK Runtime Environment (build 1.8.0_144-b01)
OpenJDK 64-Bit Server VM (build 25.144-b01, mixed mode)

add java variables to .bash_profile

#java settings
export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk
export JRE_HOME=/usr/lib/jvm/jre
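If you script the setup, you may want to append these exports only once; a hedged sketch (same paths as above, assuming a bash login shell):

```shell
# append the Java settings to ~/.bash_profile unless they are already present
PROFILE="$HOME/.bash_profile"
if ! grep -q 'JAVA_HOME' "$PROFILE" 2>/dev/null; then
  cat >> "$PROFILE" <<'EOF'
# java settings
export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk
export JRE_HOME=/usr/lib/jvm/jre
EOF
fi
```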

source the profile again and check the settings

[kafka@kafka ~]$ . .bash_profile

[kafka@kafka ~]$ echo $JAVA_HOME
[kafka@kafka ~]$ echo $JRE_HOME

Now we are ready to install and configure Kafka. We will use the Confluent Platform for this environment. The Confluent open source platform is freely available for download.


So first of all we have to import Confluent's public key.

[kafka@kafka ~]$ sudo rpm --import https://packages.confluent.io/rpm/3.3/archive.key

After that we need to add the Confluent repository to yum. For that create a new file called confluent.repo in /etc/yum.repos.d

[kafka@kafka ~]$ sudo vi /etc/yum.repos.d/confluent.repo

and paste the following (the repository definitions for Confluent 3.3):

[Confluent.dist]
name=Confluent repository (dist)
baseurl=https://packages.confluent.io/rpm/3.3/7
gpgcheck=1
gpgkey=https://packages.confluent.io/rpm/3.3/archive.key
enabled=1

[Confluent]
name=Confluent repository
baseurl=https://packages.confluent.io/rpm/3.3
gpgcheck=1
gpgkey=https://packages.confluent.io/rpm/3.3/archive.key
enabled=1

Now clean up your yum caches with

[kafka@kafka ~]$ sudo yum clean all
Loaded plugins: fastestmirror, langpacks
Cleaning repos: Confluent Confluent.dist base extras updates
Cleaning up everything
Maybe you want: rm -rf /var/cache/yum, to also free up space taken by orphaned data from disabled or removed repos
Cleaning up list of fastest mirrors

The repo is now ready for use and we can install Confluent Open Source

[kafka@kafka ~]$ sudo yum install confluent-platform-oss-2.11


Dependencies Resolved

Package Arch Version Repository Size
confluent-platform-oss-2.11 noarch 3.3.0-1 Confluent 6.7 k
Installing for dependencies:
confluent-camus noarch 3.3.0-1 Confluent 22 M
confluent-cli noarch 3.3.0-1 Confluent 15 k
confluent-common noarch 3.3.0-1 Confluent 2.4 M
confluent-kafka-2.11 noarch Confluent 42 M
confluent-kafka-connect-elasticsearch noarch 3.3.0-1 Confluent 4.3 M
confluent-kafka-connect-hdfs noarch 3.3.0-1 Confluent 91 M
confluent-kafka-connect-jdbc noarch 3.3.0-1 Confluent 6.0 M
confluent-kafka-connect-s3 noarch 3.3.0-1 Confluent 4.7 M
confluent-kafka-connect-storage-common noarch 3.3.0-1 Confluent 88 M
confluent-kafka-rest noarch 3.3.0-1 Confluent 19 M
confluent-rest-utils noarch 3.3.0-1 Confluent 7.3 M
confluent-schema-registry noarch 3.3.0-1 Confluent 27 M

Transaction Summary
Install 1 Package (+12 Dependent packages)

Downloading packages:
(1/13): confluent-cli-3.3.0-1.noarch.rpm | 15 kB 00:00:00
(2/13): confluent-common-3.3.0-1.noarch.rpm | 2.4 MB 00:00:00
(3/13): confluent-camus-3.3.0-1.noarch.rpm | 22 MB 00:00:06
(4/13): confluent-kafka-connect-elasticsearch-3.3.0-1.noarch.rpm | 4.3 MB 00:00:01
(5/13): confluent-kafka-2.11- | 42 MB 00:00:16
(6/13): confluent-kafka-connect-jdbc-3.3.0-1.noarch.rpm | 6.0 MB 00:00:03
(7/13): confluent-kafka-connect-s3-3.3.0-1.noarch.rpm | 4.7 MB 00:00:01
(8/13): confluent-kafka-connect-hdfs-3.3.0-1.noarch.rpm | 91 MB 00:00:28
(9/13): confluent-kafka-rest-3.3.0-1.noarch.rpm | 19 MB 00:00:07
(10/13): confluent-platform-oss-2.11-3.3.0-1.noarch.rpm | 6.7 kB 00:00:00
(11/13): confluent-kafka-connect-storage-common-3.3.0-1.noarch.rpm | 88 MB 00:00:23
(12/13): confluent-rest-utils-3.3.0-1.noarch.rpm | 7.3 MB 00:00:02
(13/13): confluent-schema-registry-3.3.0-1.noarch.rpm | 27 MB 00:00:04
Total 6.1 MB/s | 314 MB 00:00:51
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Installing : confluent-common-3.3.0-1.noarch 1/13
Installing : confluent-rest-utils-3.3.0-1.noarch 2/13
Installing : confluent-kafka-connect-storage-common-3.3.0-1.noarch 3/13
Installing : confluent-kafka-connect-s3-3.3.0-1.noarch 4/13
Installing : confluent-schema-registry-3.3.0-1.noarch 5/13
Installing : confluent-kafka-rest-3.3.0-1.noarch 6/13
Installing : confluent-kafka-connect-hdfs-3.3.0-1.noarch 7/13
Installing : confluent-kafka-connect-jdbc-3.3.0-1.noarch 8/13
Installing : confluent-kafka-connect-elasticsearch-3.3.0-1.noarch 9/13
Installing : confluent-kafka-2.11- 10/13
Installing : confluent-cli-3.3.0-1.noarch 11/13
Installing : confluent-camus-3.3.0-1.noarch 12/13
Installing : confluent-platform-oss-2.11-3.3.0-1.noarch 13/13
Verifying : confluent-camus-3.3.0-1.noarch 1/13
Verifying : confluent-cli-3.3.0-1.noarch 2/13
Verifying : confluent-schema-registry-3.3.0-1.noarch 3/13
Verifying : confluent-kafka-connect-hdfs-3.3.0-1.noarch 4/13
Verifying : confluent-common-3.3.0-1.noarch 5/13
Verifying : confluent-rest-utils-3.3.0-1.noarch 6/13
Verifying : confluent-kafka-connect-storage-common-3.3.0-1.noarch 7/13
Verifying : confluent-kafka-connect-s3-3.3.0-1.noarch 8/13
Verifying : confluent-platform-oss-2.11-3.3.0-1.noarch 9/13
Verifying : confluent-kafka-rest-3.3.0-1.noarch 10/13
Verifying : confluent-kafka-connect-jdbc-3.3.0-1.noarch 11/13
Verifying : confluent-kafka-connect-elasticsearch-3.3.0-1.noarch 12/13
Verifying : confluent-kafka-2.11- 13/13

Installed:
confluent-platform-oss-2.11.noarch 0:3.3.0-1

Dependency Installed:
confluent-camus.noarch 0:3.3.0-1 confluent-cli.noarch 0:3.3.0-1
confluent-common.noarch 0:3.3.0-1 confluent-kafka-2.11.noarch 0:
confluent-kafka-connect-elasticsearch.noarch 0:3.3.0-1 confluent-kafka-connect-hdfs.noarch 0:3.3.0-1
confluent-kafka-connect-jdbc.noarch 0:3.3.0-1 confluent-kafka-connect-s3.noarch 0:3.3.0-1
confluent-kafka-connect-storage-common.noarch 0:3.3.0-1 confluent-kafka-rest.noarch 0:3.3.0-1
confluent-rest-utils.noarch 0:3.3.0-1 confluent-schema-registry.noarch 0:3.3.0-1

Complete!


As the binaries are installed we're able to launch Kafka and ZooKeeper 🙂

Check the Zookeeper configuration:

[kafka@kafka ~]$ cat /etc/kafka/zookeeper.properties
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# the directory where the snapshot is stored.
dataDir=/var/lib/zookeeper
# the port at which the clients will connect
clientPort=2181
# disable the per-ip limit on the number of connections since this is a non-production config
maxClientCnxns=0


For a lab environment the above settings are fine; no need to change anything.
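Before starting the services it can be worth checking that ZooKeeper's client port (2181 above) is still free; a small sketch (`port_in_use` is just an illustrative helper name, `ss` ships with iproute on CentOS 7):

```shell
# succeeds if something is already listening on the given TCP port
port_in_use() {
  ss -ltn 2>/dev/null | grep -q ":$1 "
}

if port_in_use 2181; then
  echo "port 2181 is already in use; stop the conflicting service first"
fi
```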

So now we can start ZooKeeper

[kafka@kafka ~]$ sudo zookeeper-server-start /etc/kafka/zookeeper.properties
[2017-10-20 12:56:08,253] INFO binding to port 0.0.0.0/0.0.0.0:2181 (org.apache.zookeeper.server.NIOServerCnxnFactory)

As ZooKeeper is running we can now start Kafka.
Therefore leave the current session running and open a new ssh or terminal session.

[kafka@kafka ~]$ sudo kafka-server-start /etc/kafka/server.properties
[2017-10-20 12:59:47,386] INFO KafkaConfig values:
	advertised.host.name = null
	advertised.listeners = null
	advertised.port = null
	auto.create.topics.enable = true
	...

[2017-10-20 12:59:48,745] INFO [Kafka Server 0], started (kafka.server.KafkaServer)
[2017-10-20 12:59:48,809] INFO Waiting 10092 ms for the monitored broker to finish starting up... (io.confluent.support.metrics.MetricsReporter)
[2017-10-20 12:59:58,904] INFO Monitored broker is now ready (io.confluent.support.metrics.MetricsReporter)
[2017-10-20 12:59:58,904] INFO Starting metrics collection from monitored broker... (io.confluent.support.metrics.MetricsReporter)

(output shortened)

Now Kafka is running and we are able to create a new topic.

To check Kafka you can run

[kafka@kafka ~]$ kafka-topics --list --zookeeper localhost:2181

Since no topics exist yet the command returns nothing; the important part is that no error occurs 😉

To create a new topic leave the current terminal or ssh session running and start another terminal or ssh session.

[kafka@kafka ~]$ kafka-topics --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic mytopic

creates the topic “mytopic”
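If you script topic management, a guard against re-creating an existing topic can help; a hedged helper sketch (`topic_exists` is an illustrative name, and it assumes the cluster from above is running):

```shell
# succeeds if the topic name appears (as an exact line) in the broker's topic list
topic_exists() {
  kafka-topics --list --zookeeper localhost:2181 | grep -qx "$1"
}

# usage (with the cluster running):
#   topic_exists mytopic || kafka-topics --create --zookeeper localhost:2181 \
#     --replication-factor 1 --partitions 1 --topic mytopic
```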

Topic creation is also visible in Kafka brokers standard out and log:

[2017-10-20 13:04:00,186] INFO Partition [mytopic,0] on broker 0: No checkpointed highwatermark is found for partition mytopic-0 (kafka.cluster.Partition)
[2017-10-20 13:04:00,189] INFO Partition [mytopic,0] on broker 0: mytopic-0 starts at Leader Epoch 0 from offset 0. Previous Leader Epoch was: -1 (kafka.cluster.Partition)

list the newly created topic

[kafka@kafka ~]$ kafka-topics --list --zookeeper localhost:2181
mytopic

Now we can produce some messages to the topic mytopic.

[kafka@kafka ~]$ kafka-console-producer --broker-list localhost:9092 --topic mytopic

>be curios
>learn kafka
>have fun

Leave the producer session running and open a new terminal or ssh session.
We can run a consumer session which shows all the messages of the topic from the beginning

[kafka@kafka ~]$ kafka-console-consumer --zookeeper localhost:2181 --topic mytopic --from-beginning
be curios
learn kafka
have fun

Also every new message you type in the producer session is immediately visible in your consumer session.

To stop the producer, the consumer and the cluster itself just press CTRL+C.
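If you started the broker or ZooKeeper in the background instead, the distribution also ships stop scripts (`kafka-server-stop`, `zookeeper-server-stop`); a sketch with an illustrative helper name:

```shell
# stop the components in reverse start order: broker first, then ZooKeeper
stop_kafka_stack() {
  kafka-server-stop
  zookeeper-server-stop
}
```

The broker goes first because it still needs ZooKeeper to shut down cleanly.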

Hope you enjoyed this post about how to create a small Kafka cluster.

Next post will follow soon 😉
