
How to Setup Logstash on Linux with ElasticSearch, Redis, Nginx

Logstash is an open source central log file management application.

You can collect logs from multiple servers and multiple applications, parse those logs, and store them in a central place. Once they are stored, you can use a web GUI to search the logs, drill down into them, and generate various reports.

This tutorial explains the fundamentals of Logstash and everything you need to know to install and configure it on your system.

1. Download Logstash Binary

Logstash is part of the Elasticsearch family. Download it from the Logstash website. Please note that you should have Java installed on your machine for this to work.

Or, use wget to download it directly from the website:

wget https://download.elasticsearch.org/logstash/logstash/logstash-1.4.2.tar.gz 

tar zxvf logstash-1.4.2.tar.gz 

cd logstash-1.4.2
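
Since Logstash runs on the JVM, you can quickly verify that Java is available before proceeding:

java -version

If this prints a version string, you are ready to go; otherwise, install Java first.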

Note: We’ll be installing Logstash using yum later. For now, we’ll download the binary manually to check how it works from the command line.

2. Specify Logstash Options on the Command Line

To understand the basics of Logstash, let us quickly test a few things from the command line.

Execute Logstash from the command line as shown below. When it prompts, just type “hello world” as the input.

# bin/logstash -e 'input { stdin { } } output { stdout {} }' 
hello world 
2014-07-06T17:27:25.955+0000 base hello world

In the above output, the 1st line is the “hello world” that we entered using stdin.

The 2nd line is the output that Logstash displayed using stdout. Basically, it just spits out whatever we entered on stdin.

Please note that specifying the -e command line flag allows Logstash to accept a configuration directly from the command line. This is very useful for quickly testing configurations without having to edit a file between iterations.

3. Modify the Output Format Using a Codec

The rubydebug codec will output your Logstash event data using the ruby-awesome-print library.

So, by re-configuring the “stdout” output (adding a “codec”), we can change the output of Logstash. By adding inputs, outputs and filters to your configuration, it is possible to massage the log data in many ways, in order to maximize flexibility of the stored data when you are querying it.

# bin/logstash -e 'input { stdin { } } output { stdout { codec => rubydebug } }' 
hello world 
{ 
       "message" => "", 
      "@version" => "1", 
    "@timestamp" => "2014-07-06T17:40:48.775Z", 
          "host" => "base" 
} 
{ 
       "message" => "hello world", 
      "@version" => "1", 
    "@timestamp" => "2014-07-06T17:40:48.776Z", 
          "host" => "base" 
}
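
Besides rubydebug, Logstash ships with other codecs as well. For example, switching the stdout codec to json makes each event come out as a single JSON line, which is a quick variant worth trying:

# bin/logstash -e 'input { stdin { } } output { stdout { codec => json } }'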

4. Download ElasticSearch

Now that we have seen how Logstash works, let us go one step further. Obviously, we cannot feed the input and output of every log manually. To overcome this problem, we will install a tool called Elasticsearch.

Download Elasticsearch from the Elasticsearch website.

Or, use curl as shown below.

curl -O https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.4.0.tar.gz

tar zxvf elasticsearch-1.4.0.tar.gz

Start elasticsearch service as shown below:

cd elasticsearch-1.4.0/ 

./bin/elasticsearch

Note: This tutorial specifies running Logstash 1.4.2 with Elasticsearch 1.4.0. Each release of Logstash has a recommended version of Elasticsearch to pair with. Make sure the versions match based on the Logstash version that you are running.

5. Verify ElasticSearch

By default, Elasticsearch runs on port 9200.
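
A quick way to confirm that Elasticsearch is up and listening is to query its root endpoint, which returns a short JSON status document:

curl http://localhost:9200/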

For testing purpose, we’ll still take the input from the stdin (similar to our previous example), but the output will not be displayed on the stdout. Instead, it will go to elasticsearch.

To verify Elasticsearch, let us execute the following. When it asks for the input, just type “the geek stuff” as shown below.

# bin/logstash -e 'input { stdin { } } output { elasticsearch { host => localhost } }' 
the geek stuff

Since we won’t see the output on stdout, we should look at Elasticsearch.

Go to the following URL:

http://localhost:9200/_search?pretty

The above will display all the messages available in the elasticsearch. You should see the message that we entered in the above logstash command here in the output.

{ 
  "took" : 4, 
  "timed_out" : false, 
  "_shards" : { 
    "total" : 5, 
    "successful" : 5, 
    "failed" : 0 
  }, 
  "hits" : { 
    "total" : 9, 
    "max_score" : 1.0, 
    "hits" : [ { 
      "_index" : "logstash-2014.07.06", 
      "_type" : "logs", 
      "_id" : "G3uZPQCMQ6ed4joNCuseew", 
      "_score" : 1.0, "_source" : {"message":"the geek stuff","@version":"1","@timestamp":"2014-07-06T18:09:46.612Z","host":"base"} 
    } ] 
  }
}
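
Instead of listing every document, you can also search for the specific message using the Elasticsearch query-string syntax. For example, the following should return just the event we entered above:

curl 'http://localhost:9200/_search?q=message:geek&pretty'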

6. Logstash Inputs, Filters, and Outputs

Inputs, Outputs, Codecs and Filters are at the heart of the Logstash configuration. By creating a pipeline of event processing, Logstash is able to extract the relevant data from your logs and make it available to elasticsearch, in order to efficiently query your data.

The following are some of the available inputs. Inputs are the mechanism for passing log data to Logstash (a minimal file input sketch follows the list).

  • file: reads from a file on the filesystem, much like the UNIX command “tail -0a”
  • syslog: listens on the well-known port 514 for syslog messages and parses according to RFC3164 format
  • redis: reads from a redis server, using both redis channels and also redis lists. Redis is often used as a “broker” in a centralized Logstash installation, which queues Logstash events from remote Logstash “shippers”.
  • lumberjack: processes events sent in the lumberjack protocol. Now called logstash-forwarder.
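
For example, a minimal file input block looks like the following sketch (the path shown is a hypothetical placeholder; point it at your own application’s logs):

input { 
  file { 
    # hypothetical application log path 
    path => "/var/log/myapp/*.log" 
  } 
}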

The following are some of the filters. Filters are used as intermediary processing devices in the Logstash chain. They are often combined with conditionals to perform an action on an event if it matches particular criteria (a small filter sketch follows the list).

  • grok: parses arbitrary text and structures it. Grok is currently the best way in Logstash to parse unstructured log data into something structured and queryable. With 120 patterns shipped built-in to Logstash, it’s more than likely you’ll find one that meets your needs!
  • mutate: The mutate filter allows you to do general mutations to fields. You can rename, remove, replace, and modify fields in your events.
  • drop: drop an event completely, for example, debug events.
  • clone: make a copy of an event, possibly adding or removing fields.
  • geoip: adds information about geographical location of IP addresses (and displays amazing charts in kibana)
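
For example, the following sketch uses a conditional together with the drop filter to discard noisy events before they are stored (the “healthcheck” marker is a hypothetical example):

filter { 
  # discard any event whose message matches the healthcheck pattern 
  if [message] =~ "healthcheck" { 
    drop { } 
  } 
}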

The following are some of the available outputs (a small output sketch follows the list). Outputs are the final phase of the Logstash pipeline. An event may pass through multiple outputs during processing, but once all outputs are complete, the event has finished its execution.

  • elasticsearch: If you’re planning to save your data in an efficient, convenient and easily queryable format
  • file: writes event data to a file on disk.
  • graphite: sends event data to graphite, a popular open source tool for storing and graphing metrics
  • statsd: a service which “listens for statistics, like counters and timers, sent over UDP and sends aggregates to one or more pluggable backend services”.
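
For example, the following sketch writes a copy of every event to a daily file (the path is a hypothetical placeholder):

output { 
  file { 
    # one file per day, named from the event timestamp 
    path => "/var/log/logstash/events-%{+YYYY.MM.dd}.log" 
  } 
}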

7. Use a Logstash Config File

Now it is time to move from command line options to a configuration file. Instead of specifying the options on the command line, you can specify them in a .conf file as shown below:

# vi logstash-simple.conf 
input { stdin { } } 
output { 
  elasticsearch { host => localhost } 
  stdout { codec => rubydebug } 
}

Now let us ask Logstash to read the configuration file we just created, using the -f option as shown below. For testing purposes, this still uses stdin and stdout. So, type a message after entering this command.

# bin/logstash -f logstash-simple.conf 
This is Vadiraj
{ 
       "message" => "This is Vadiraj", 
      "@version" => "1", 
    "@timestamp" => "2014-11-07T04:59:20.959Z", 
          "host" => "base.thegeekstuff.com" 
}

8. Parse the Input Apache Log Message

Now, let us do a slightly more advanced configuration. Create a new configuration file called logstash-filter.conf (this is the name we will pass to Logstash below) and add the following lines:

# vi logstash-filter.conf 
input { stdin { } } 

filter { 
  grok { 
    match => { "message" => "%{COMBINEDAPACHELOG}" } 
  } 
  date { 
    match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ] 
  } 
} 

output { 
  elasticsearch { host => localhost } 
  stdout { codec => rubydebug } 
}

Now, execute Logstash with the new configuration file. This time, paste the following sample Apache access log entry as the input:

# bin/logstash -f logstash-filter.conf 
127.0.0.1 - - [11/Dec/2013:00:01:45 -0800] "GET /xampp/status.php HTTP/1.1" 200 3891 "http://cadenza/xampp/navi.php" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0"

The output from Logstash will be something similar to the following:

{ 
    "message" => "127.0.0.1 - - [11/Dec/2013:00:01:45 -0800] \"GET /xampp/status.php HTTP/1.1\" 200 3891 \"http://cadenza/xampp/navi.php\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0\"", 
   "@version" => "1", 
 "@timestamp" => "2013-12-11T08:01:45.000Z", 
       "host" => "base.tgs.com", 
   "clientip" => "127.0.0.1", 
      "ident" => "-", 
       "auth" => "-", 
  "timestamp" => "11/Dec/2013:00:01:45 -0800", 
       "verb" => "GET", 
    "request" => "/xampp/status.php", 
"httpversion" => "1.1", 
   "response" => "200", 
      "bytes" => "3891", 
   "referrer" => "\"http://cadenza/xampp/navi.php\"", 
      "agent" => "\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0\"" 
}

As you can see from the above output, our input is parsed accordingly and all the values are split and stored in the corresponding fields.

The grok filter has extracted the Apache log fields and broken the entry up into useful bits that we can query later.

9. Logstash Config File for Apache Error Log

Create the following Logstash configuration file for the Apache error_log file.

# vi logstash-apache.conf 
input { 
  file { 
    path => "/var/log/httpd/error_log" 
    start_position => beginning 
  } 
} 

filter { 
  if [path] =~ "error" { 
    mutate { replace => { "type" => "apache_error" } } 
    grok { 
      match => { "message" => "%{COMBINEDAPACHELOG}" } 
    } 
  } 
  date { 
    match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ] 
  } 
} 

output { 
  elasticsearch { 
    host => localhost 
  } 
  stdout { codec => rubydebug } 
}

In the above configuration file:

  • Input file is /var/log/httpd/error_log and the starting position will be the beginning of the file.
  • Filter: if the path contains “error”, the event type is renamed (mutated) to apache_error; grok then parses the message using the COMBINEDAPACHELOG pattern, and the date filter reads the timestamp in the given format.
  • Output will be stored in Elasticsearch on localhost and echoed via stdout in Ruby debug print format with codec => rubydebug

10. Logstash Config File for Both Apache Error Log and Access Log

We can specify a wildcard character to read all the log files ending with *_log, as shown below.

But we also need to change the filter conditions accordingly, to parse both the access and error logs, as shown below.

# vi logstash-apache.conf 
input { 
  file { 
    path => "/var/log/httpd/*_log" 
  } 
} 

filter { 
  if [path] =~ "access" { 
    mutate { replace => { type => "apache_access" } } 
    grok { 
      match => { "message" => "%{COMBINEDAPACHELOG}" } 
    } 
    date { 
      match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ] 
    } 
  } else if [path] =~ "error" { 
    mutate { replace => { type => "apache_error" } } 
  } else { 
    mutate { replace => { type => "random_logs" } } 
  } 
} 

output { 
  elasticsearch { host => localhost } 
  stdout { codec => rubydebug } 
}

11. Setup Additional Yum Repositories

Testing is over. Now we know how Logstash works with Elasticsearch.

We’ll be installing the following:

  • Logstash – Our central log server
  • Elasticsearch – To store the logs
  • Redis – As a broker that queues the incoming logs
  • Nginx – To serve Kibana
  • Kibana – A GUI dashboard that ties everything together

Before we install, setup the following repositories:

# cd /etc/yum.repos.d/ 

# vi /etc/yum.repos.d/logstash.repo 
[logstash] 
name=Logstash
baseurl=http://packages.elasticsearch.org/logstash/1.4/centos 
gpgcheck=1 
gpgkey=http://packages.elasticsearch.org/GPG-KEY-elasticsearch 
enabled=1 

# vi /etc/yum.repos.d/elasticsearch.repo 
[elasticsearch] 
name=Elasticsearch
baseurl=http://packages.elasticsearch.org/elasticsearch/1.4/centos 
gpgcheck=1 
gpgkey=http://packages.elasticsearch.org/GPG-KEY-elasticsearch 
enabled=1

Also, set up the EPEL repository as we discussed earlier.
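
If yum complains about the GPG key during installation, you can import it manually:

rpm --import http://packages.elasticsearch.org/GPG-KEY-elasticsearch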

12. Install Elasticsearch, Redis, Nginx and Logstash

First bring the system up-to-date and then install logstash along with elasticsearch, redis and nginx as shown below:

yum clean all 

yum update -y 

yum install -y elasticsearch redis nginx logstash

13. Install Kibana

Install Kibana for the dashboard as shown below:

cd /opt/ 
wget https://download.elasticsearch.org/kibana/kibana/kibana-3.1.2.tar.gz

tar -xvzf kibana-3.1.2.tar.gz 

mv kibana-3.1.2 /usr/share/kibana3

14. Configure Kibana

We have to tell Kibana about Elasticsearch. For this, modify the following config.js.

# vi /usr/share/kibana3/config.js 
elasticsearch: "http://log.thegeekstuff.com:9200"

Inside the above file, search for the elasticsearch parameter and change the hostname on that line to your own server’s domain name (for example: log.thegeekstuff.com).

15. Setup Kibana to run from Nginx

We also have to make Kibana run from the Nginx web server.

Add the following to nginx.conf

server { 
  listen                *:80 ; 

  server_name           log.thegeekstuff.com; 
  access_log            /var/log/nginx/kibana.access.log; 

  location / { 
    root  /usr/share/kibana3; 
    index  index.html  index.htm; 
  }
}

Also, don’t forget to set the appropriate IP address of your server in the redis.conf file, as shown below.
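
For example, on CentOS the EPEL redis package reads /etc/redis.conf; set the bind directive there to the address that remote shippers will connect to (the IP below is the same example address used in the Logstash config in the next section):

# vi /etc/redis.conf 
bind 10.37.129.8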

16. Configure Logstash Config File

Now we need to create a logstash config file similar to the example configuration file we used earlier.

We will define the log file paths and the port on which to receive remote logs, and tell Logstash about the Elasticsearch server.

# vi /etc/logstash/conf.d/logstash.conf 
input { 
  file { 
   type => "syslogpath => [ "/var/log/*.log", "/var/log/messages", "/var/log/syslog" ] 
   sincedb_path => "/opt/logstash/sincedb-access" 
  } 
  redis { 
    host => "10.37.129.8" 
    type => "redis-input" 
    data_type => "list" 
    key => "logstash" 
  } 
  syslog { 
    type => "syslog" 
    port => "5544" 
  } 
} 

filter { 
  grok { 
    type => "syslog" 
    match => [ "message", "%{SYSLOGBASE2}" ] 
    add_tag => [ "syslog", "grokked" ] 
  } 
} 

output { 
 elasticsearch { host => "log.thegeekstuff.com" } 
}"

17. Verify and Start Logstash, Elasticsearch, Redis and Nginx

Start all these services as shown below:

service elasticsearch start 

service logstash start 

service nginx start 

service redis start
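
To make these services start automatically after a reboot, you can also enable them at boot time on a SysV-style CentOS system (service names assume the packages installed above):

chkconfig elasticsearch on 
chkconfig logstash on 
chkconfig nginx on 
chkconfig redis on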

18. Verify Logstash web GUI

Open a browser and go to the server name (host) that you used in the above configuration file. For example: log.thegeekstuff.com

You’ll see a dashboard similar to the following, from where you can browse, drill down into, and manipulate all the log files collected by Logstash.

Logstash Kibana

Now that the log server is ready, you just have to forward the remote servers’ log files managed by rsyslog to this central server, by modifying the rsyslog.conf file on each client, as shown below.
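
A minimal sketch of that client-side change, assuming the syslog input on port 5544 that we configured above (@@ forwards over TCP; a single @ would use UDP):

# vi /etc/rsyslog.conf 
*.* @@log.thegeekstuff.com:5544

After restarting rsyslog on the client (service rsyslog restart), its logs should start appearing in the dashboard.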



Comments on this entry are closed.

  • RT December 9, 2014, 12:00 pm

    For elastic search 1.4.1 and higher, you need to disable CORS in the elastic search config. http.cors.enabled = false.

  • Jalal Hajigholamali December 9, 2014, 11:37 pm

    Hi,
    Very nice article…
    Thanks a lot…

  • Sg December 12, 2014, 5:43 am

    can we set log notification mail or trigger to administrator or owner of the server

  • cybernard December 12, 2014, 11:39 pm

    I have a question.
    vi /etc/logstash/conf.d/logstash.conf
    It states port 5544 in relationship to syslog, but syslog is port 514. What is the correct answer.
    In addition I get complaints about should be a #, {, } on line 3 char 30
    I had to do a hack job to get it to pass –configtest

  • R063R December 23, 2014, 6:16 am

    Hi,
    I installed it on our production environment with Jboss Wildfly integration and it is awesome.
    Thank you a lot for this article!

  • Gonzalo January 6, 2015, 6:42 am

    You should modify the logstash.conf
    Change:

    type => “syslogpath =>

    for

    type => “syslog” path =>

  • Christian January 20, 2015, 4:07 pm

    Gonzalo has the best point. Listen to Gonzalo.

  • Chan October 28, 2015, 10:30 pm

    Hello Folks

    My ELK working fine with the localhost. But I have a remote machine without the root privilege on it and I need to get the logs from the remote machine into my logstash. Since I do not have the root privilege I cannot install logstash forwarder into it. However I have set up a SSH password less authentication to the remote machine. Wondering if anyone succeeded with the below two options. I am failed 🙁

    Q1 : specify the full ssh path on logstash conf like user@ipaddress:/var/log/httpd/log

    Q2 : Mount the remote log file directory over sshfs and give the local path on the logstash conf

    thanks in advance

  • robin February 14, 2017, 4:00 am

    great article man

    really appreciate your work….

  • Andrea February 20, 2017, 5:46 am

    Nice!

  • Bhavik February 20, 2017, 9:18 am

    Hi,
    Thank you for the post.

    Question: Is there any 3rd party remote configuration tool available for Logstash with Role Based Access Control features. We are trying to figure out a central way to manage logstash config, pipelines and input/output modules.

    Thanks in advance