Sunday, May 4, 2014

Elasticsearch - err failed to connect to master - when changing/using a different IP address



As a general rule of thumb, first check
/var/log/elasticsearch/elasticsearch.log
and
/var/log/logstash/logstash.log
whenever you run into any kind of issue with Kibana.
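One quick way to do that check is to filter both logs for warnings and errors. This is a minimal sketch, assuming the log locations mentioned above; the function name show_problems is only an illustration and not part of any ELK tool.

```shell
#!/bin/sh
# Print the last WARN/ERROR lines from a log file (case-insensitive match,
# so it catches both "[WARN ]" and ":level=>:warn" style entries).
# Usage: show_problems FILE [COUNT]
show_problems() {
    file="$1"
    count="${2:-20}"
    # Refuse files that do not exist or cannot be read.
    [ -r "$file" ] || { echo "cannot read $file" >&2; return 1; }
    grep -iE 'warn|error' "$file" | tail -n "$count"
}

# Assumed default log paths; adjust them if your installation differs.
for log in /var/log/elasticsearch/elasticsearch.log /var/log/logstash/logstash.log; do
    echo "== $log"
    show_problems "$log" || true
done
```

The match is deliberately broad, so expect some noise; narrow the pattern once you know what you are looking for.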

I stumbled upon this when I changed the IP/network of the interface on my test virtual machine, which holds an ELK (Elasticsearch/Logstash/Kibana) installation used for log analysis for Suricata IDPS.

I managed to solve the issue with the help of these two sources:
https://github.com/elasticsearch/elasticsearch/issues/4194
http://www.concept47.com/austin_web_developer_blog/errors/elasticsearch-error-failed-to-connect-to-master/

The new IP address is 192.168.1.166 and the old one was 10.0.2.15
(notice the errors in the log below: the node was still trying to connect to the old address):

root@debian64:~/Work/# more /var/log/elasticsearch/elasticsearch.log
[2014-05-04 07:17:24,960][INFO ][node                     ] [Jamal Afari] version[1.1.0], pid[7178], build[2181e11/2014-03-25T15:59:51Z]
[2014-05-04 07:17:24,960][INFO ][node                     ] [Jamal Afari] initializing ...
[2014-05-04 07:17:24,964][INFO ][plugins                  ] [Jamal Afari] loaded [], sites []
[2014-05-04 07:17:27,828][INFO ][node                     ] [Jamal Afari] initialized
[2014-05-04 07:17:27,828][INFO ][node                     ] [Jamal Afari] starting ...
[2014-05-04 07:17:27,959][INFO ][transport                ] [Jamal Afari] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/192.168.1.166:9300]}
[2014-05-04 07:17:57,977][WARN ][discovery                ] [Jamal Afari] waited for 30s and no initial state was set by the discovery
[2014-05-04 07:17:57,978][INFO ][discovery                ] [Jamal Afari] elasticsearch/F9HgSmYJQcS6bxdgdeurAA
[2014-05-04 07:17:57,986][INFO ][http                     ] [Jamal Afari] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/192.168.1.166:9200]}
[2014-05-04 07:17:58,017][INFO ][node                     ] [Jamal Afari] started
[2014-05-04 07:18:01,026][WARN ][discovery.zen            ] [Jamal Afari] failed to connect to master [[Hellion][zcx2fIF2SrmwSYQ08la6PQ][LTS-64-1][inet[/10.0.2.15:9300]]], retrying...
org.elasticsearch.transport.ConnectTransportException: [Hellion][inet[/10.0.2.15:9300]] connect_timeout[30s]
    at org.elasticsearch.transport.netty.NettyTransport.connectToChannels(NettyTransport.java:718)
    at org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:647)
    at org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:615)
    at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:129)
    at org.elasticsearch.discovery.zen.ZenDiscovery.innerJoinCluster(ZenDiscovery.java:338)
    at org.elasticsearch.discovery.zen.ZenDiscovery.access$500(ZenDiscovery.java:79)
    at org.elasticsearch.discovery.zen.ZenDiscovery$1.run(ZenDiscovery.java:286)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:701)
Caused by: org.elasticsearch.common.netty.channel.ConnectTimeoutException: connection timed out: /10.0.2.15:9300
    at org.elasticsearch.common.netty.channel.socket.nio.NioClientBoss.processConnectTimeout(NioClientBoss.java:137)
    at org.elasticsearch.common.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:83)
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
    at org.elasticsearch.common.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
.......
.......
.......
[2014-05-04 07:37:05,783][WARN ][discovery.zen            ] [Vivisector] failed to connect to master [[Hellion][zcx2fIF2SrmwSYQ08la6PQ][LTS-64-1][inet[/10.0.2.15:9300]]], retrying...
org.elasticsearch.transport.ConnectTransportException: [Hellion][inet[/10.0.2.15:9300]] connect_timeout[30s]
    at org.elasticsearch.transport.netty.NettyTransport.connectToChannels(NettyTransport.java:718)
    at org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:647)
    at org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:615)
    at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:129)
    at org.elasticsearch.discovery.zen.ZenDiscovery.innerJoinCluster(ZenDiscovery.java:338)
    at org.elasticsearch.discovery.zen.ZenDiscovery.access$500(ZenDiscovery.java:79)
    at org.elasticsearch.discovery.zen.ZenDiscovery$1.run(ZenDiscovery.java:286)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:701)
Caused by: org.elasticsearch.common.netty.channel.ConnectTimeoutException: connection timed out: /10.0.2.15:9300
    at org.elasticsearch.common.netty.channel.socket.nio.NioClientBoss.processConnectTimeout(NioClientBoss.java:137)
    at org.elasticsearch.common.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:83)
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
    at org.elasticsearch.common.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
    at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
    at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
    ... 3 more
   
   
That was giving me all sorts of weird errors and failed queries in Kibana. The root of the problem was that I had changed the IP address on the ELK server.
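To see at a glance which master address the node keeps failing to reach, the warnings can be reduced to just the IP. This is a rough sketch that assumes the log line format shown in the excerpt above; failed_masters is a made-up helper name.

```shell
#!/bin/sh
# List the distinct master IP addresses from "failed to connect to master" warnings.
# Usage: failed_masters FILE
failed_masters() {
    grep 'failed to connect to master' "$1" \
        | sed -n 's|.*inet\[/\([0-9.]*\):[0-9]*]].*|\1|p' \
        | sort -u
}

# Assumed default log path; only run if the file is readable.
log=/var/log/elasticsearch/elasticsearch.log
if [ -r "$log" ]; then
    failed_masters "$log"
fi
```

If the address printed is not one that the machine currently holds (compare with the output of "ip addr"), the node is chasing a master it can no longer reach, which is the situation described here.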

The solution is simple.
Find the Discovery section in /etc/elasticsearch/elasticsearch.yml
and change this line from:
# 1. Disable multicast discovery (enabled by default):
#
# discovery.zen.ping.multicast.enabled: false

to

# 1. Disable multicast discovery (enabled by default):
#
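discovery.zen.ping.multicast.enabled: false

Remove only the "# " (the hash and the space after it) in front of "discovery.zen.ping.multicast.enabled: false", so that the setting starts at the beginning of the line; YAML is sensitive to indentation.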
Save the file and restart the service:
service elasticsearch restart
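After the restart it is worth confirming that the setting really is uncommented and that the node answers. The sketch below assumes the config path used above and that Elasticsearch listens for HTTP on localhost:9200; setting_is_active is a hypothetical helper, while _cluster/health is the standard cluster health endpoint.

```shell
#!/bin/sh
# Succeed if KEY is set (not commented out) in the given YAML file.
# Usage: setting_is_active FILE KEY
setting_is_active() {
    # Escape the dots in the key so they match literally in the regex.
    key=$(printf '%s' "$2" | sed 's/\./\\./g')
    # The key may be indented, but must not be preceded by '#'.
    grep -Eq "^[[:space:]]*${key}[[:space:]]*:" "$1"
}

conf=/etc/elasticsearch/elasticsearch.yml
if [ -r "$conf" ]; then
    if setting_is_active "$conf" discovery.zen.ping.multicast.enabled; then
        echo "multicast setting is active"
    else
        echo "multicast setting is still commented out"
    fi
fi

# Ask the local node for its cluster health (assumes HTTP on localhost:9200).
curl -s --max-time 5 'http://localhost:9200/_cluster/health?pretty' \
    || echo "node not reachable on port 9200"
```

The node may need a few seconds after the restart before it responds on port 9200, so retry the last command if it fails the first time.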

Then everything went back to normal.
In /var/log/elasticsearch/elasticsearch.log:
   
[2014-05-04 07:37:07,936][INFO ][node                     ] [Vivisector] stopping ...
[2014-05-04 07:37:07,970][INFO ][node                     ] [Vivisector] stopped
[2014-05-04 07:37:07,971][INFO ][node                     ] [Vivisector] closing ...
[2014-05-04 07:37:07,979][INFO ][node                     ] [Vivisector] closed
[2014-05-04 07:37:09,685][INFO ][node                     ] [Vibraxas] version[1.1.0], pid[5291], build[2181e11/2014-03-25T15:59:51Z]
[2014-05-04 07:37:09,686][INFO ][node                     ] [Vibraxas] initializing ...
[2014-05-04 07:37:09,689][INFO ][plugins                  ] [Vibraxas] loaded [], sites []
[2014-05-04 07:37:12,597][INFO ][node                     ] [Vibraxas] initialized
[2014-05-04 07:37:12,597][INFO ][node                     ] [Vibraxas] starting ...
[2014-05-04 07:37:12,751][INFO ][transport                ] [Vibraxas] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/192.168.1.166:9300]}
[2014-05-04 07:37:15,777][INFO ][cluster.service          ] [Vibraxas] new_master [Vibraxas][esQHE1EtTuWVK9MVNiQ5jA][debian64][inet[/192.168.1.166:9300]], reason: zen-disco-join (elected_as_master)
[2014-05-04 07:37:15,806][INFO ][discovery                ] [Vibraxas] elasticsearch/esQHE1EtTuWVK9MVNiQ5jA
[2014-05-04 07:37:15,877][INFO ][http                     ] [Vibraxas] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/192.168.1.166:9200]}
[2014-05-04 07:37:16,893][INFO ][gateway                  ] [Vibraxas] recovered [16] indices into cluster_state
[2014-05-04 07:37:16,898][INFO ][node                     ] [Vibraxas] started
[2014-05-04 07:37:17,547][INFO ][cluster.service          ] [Vibraxas] added {[logstash-debian64-3408-4020][dTsgT1H9Srq6mUr_w5rpXQ][debian64][inet[/192.168.1.166:9301]]{client=true, data=false},}, reason: zen-disco-receive(join from node[[logstash-debian64-3408-4020][dTsgT1H9Srq6mUr_w5rpXQ][debian64][inet[/192.168.1.166:9301]]{client=true, data=false}])

It is also highly recommended that you read the whole Discovery section in your elasticsearch.yml:
############################# Discovery #############################

# Discovery infrastructure ensures nodes can be found within a cluster
# and master node is elected. Multicast discovery is the default.

.....

