Thứ Tư, 8 tháng 10, 2014

Centralized Logs Management with Logtash, ElasticSearch, and Redis

Deploying a Centralized Logs Management System seems very easy these days with such these great tools:

+ Logtash: collect logs, index logs, process logs, and ship logs
+ Redis: receive logs from logs shippers
+ ElasticSearch: store logs
+ Kibana: web interface with graphs, tables...

We will implement the logs management system as the following architecture:



In  this tutorial, I only deploy one shipper (nginx logs of my Django app) on one machine, and one server to play as logs indexer (redis, logstash, elasticsearch, kibana):


1. On the indexer server, install and run Redis

http://iambusychangingtheworld.blogspot.com/2013/11/install-redis-and-run-as-service.html

2. On the indexer server, install and run ElasticSearch:

$ sudo aptitude install openjdk-6-jre
$ wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-0.90.7.deb
$ sudo dpkg -i elasticsearch


3. On the indexer server, download, create config and run Logtash to get log from Redis and store them to ElasticSearch:

+ Download Logtash:

$ sudo mkdir /opt/logstash /etc/logstash
$ sudo cd /opt/logstash
$ sudo wget https://download.elasticsearch.org/logstash/logstash/logstash-1.2.2-flatjar.jar


+ Create Logtash config file /etc/logstash/logstash-indexer.conf with the following content:

input {
        redis {
                host => "127.0.0.1"
                data_type => "list"
                key => "logstash"
                codec => json
        }
}
output {
        elasticsearch {
                embedded => true
        }
}


+ Run Logstash, this will also activate the Kibana web interface on port 9292:

$ java -jar /opt/logstash/logstash-1.2.2-flatjar.jar agent -f /etc/logstash/logstash-indexer.conf -- web


 4. On the shipper machine (my computer), download Logstash, and create config file for Logtash to copy my Django app's logs to the indexer server:

+ Download Logstash:

$ sudo mkdir /opt/logstash /etc/logstash
$ sudo cd /opt/logstash
$ sudo wget https://download.elasticsearch.org/logstash/logstash/logstash-1.2.2-flatjar.jar

+ Create a config file at /etc/logstash/logstash-shipper.conf for Logstash to copy logs file redis at the indexer server:

input {
        file {
                path => "/home/projects/logs/*ecap.log"
                type => "nginx"
        }
}
output {
        redis {
                host => "indexer.server.ip"
                data_type => "list"
                key => "logstash"
        }
}



+ Run Logstash:

$ java -jar /opt/logstash/logstash-1.2.2-flatjar.jar agent -f /etc/logstash/logstash-shipper.conf


5. From a random machine on my network, open browser to access the kibana web interface to manage all the logs:




From now on, If I want to monitor any services's logs, I just need to run a Logstash instance on the server which runs that service.


But, there is one annoying thing: the CPU usages on the indexer server is very high. It's because I'm running all the services (logstash, redis, elasticsearch, kibana) on a same server, and the java processes consume a lot of CPU. Look at the following htop screenshots and you will see:

  • Indexer server, before running all the services:


  • Indexer server, after running all the services:



These are all listening ports on the indexer server:


Some tuning on ElasticSearch maybe helpful. http://jablonskis.org/2013/elasticsearch-and-logstash-tuning/




References:
[0] http://michael.bouvy.net/blog/en/2013/11/19/collect-visualize-your-logs-logstash-elasticsearch-redis-kibana/
[1] http://logstash.net/docs/1.2.2/tutorials/getting-started-centralized
[2] http://logstash.net/docs/1.2.2/tutorials/10-minute-walkthrough/