Different errors can appear in different logs, but any of them can lead to the TSPS shutting itself down. Here are some examples:

TrueSight.log

ERROR 11/07 15:11:32.192 [Timer-2] c.b.t.p.c.p.e.ESClientConnection BMC_TS-PL000134E Exception while checking the Indexserver status
java.util.concurrent.ExecutionException: RemoteTransportException[[4lVQt-D][127.0.0.1:9300][cluster:monitor/health]]; nested: MasterNotDiscoveredException;
    at org.elasticsearch.common.util.concurrent.BaseFuture$Sync.getValue(BaseFuture.java:262)
    at org.elasticsearch.common.util.concurrent.BaseFuture$Sync.get(BaseFuture.java:249)
    at org.elasticsearch.common.util.concurrent.BaseFuture.get(BaseFuture.java:91)
    at com.bmc.truesight.platform.components.persistence.es.ESClientConnection.checkIfESRunning(ESClientConnection.java:135)
    at com.bmc.truesight.platform.components.persistence.es.ESClientConnection.isESReachableWithRetry(ESClientConnection.java:116)

indexserver.log

[2018-03-17T04:44:12,795][INFO ][o.e.d.z.ZenDiscovery ] [4lVQt-D] master_left [{Niv1l2z}{Niv1l2znSxeebOqV6piIag}{PaNyRE54Ria_Bjxiff5Ksw}{eng-bmpre03.in.lab}{172.20.243.33:9300}], reason [transport disconnected]
[2018-03-17T04:44:12,796][WARN ][o.e.d.z.ZenDiscovery ] [4lVQt-D] master left (reason = transport disconnected), current nodes: nodes: {4lVQt-D}{4lVQt-DZSaiqUDcptLGWnw}{ONkNQsukQfOTRH5rK6-qVg}{eng-bmpre04.in.lab}{172.20.247.34:9300}, local {Niv1l2z}{Niv1l2znSxeebOqV6piIag}{PaNyRE54Ria_Bjxiff5Ksw}{eng-bmpre03.in.lab}{172.20.243.33:9300}, master delaying allocation for [0] unassigned shards, next check in [0s]
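To check whether a given log contains the messages shown in the excerpts above, a small script along the following lines may help. This is only a sketch: the function names `find_symptoms` and `scan_file` are hypothetical, the list of search strings is taken directly from the excerpts and is not an exhaustive list of symptoms, and the log path must be supplied by the reader.

```python
# Substrings copied from the log excerpts in this article.
# This list is illustrative, not a complete catalogue of symptoms.
SYMPTOMS = (
    "BMC_TS-PL000134E",
    "MasterNotDiscoveredException",
    "master_left",
    "master left (reason = transport disconnected)",
)


def find_symptoms(lines):
    """Return (line_number, symptom) pairs for every matching log line."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for symptom in SYMPTOMS:
            if symptom in line:
                hits.append((number, symptom))
    return hits


def scan_file(path):
    """Scan one log file (hypothetical helper); undecodable bytes are replaced."""
    with open(path, errors="replace") as handle:
        return find_symptoms(handle)
```

For example, calling `scan_file` with the path to TrueSight.log or indexserver.log returns the line numbers at which any of the strings occur. An empty result only means that these particular strings were not found.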
1) Stop the secondary TSPS Server
2) Stop the primary TSPS Server
3) On both primary and secondary TSPS nodes, edit the file /etc/sysctl.conf and add the following lines at the end of the file:

net.ipv4.tcp_keepalive_time=600
net.ipv4.tcp_keepalive_intvl=60
net.ipv4.tcp_keepalive_probes=20

4) On both primary and secondary TSPS nodes, back up the file $TRUESIGHTPSERVER_HOME/truesightpserver/modules/elasticsearch/config/elasticsearch.yml to a location outside $TRUESIGHTPSERVER_HOME
5) Edit the file elasticsearch.yml and append the following entries to the end of the file, or edit the values if the lines already exist:

NOTE: Paste the content exactly as it appears in this article; otherwise you may get a parsing error.

# How often a node gets pinged. Defaults to 1s.
discovery.zen.fd.ping_interval: 30s
# How long to wait for a ping response. Defaults to 30s.
discovery.zen.fd.ping_timeout: 50s
# How many ping failures / timeouts cause a node to be considered failed. Defaults to 3.
discovery.zen.fd.ping_retries: 5
# If the master does not receive acknowledgement from at least discovery.zen.minimum_master_nodes nodes within a certain time
# the cluster state change is rejected. Defaults to 30 seconds.
discovery.zen.commit_timeout: 180s

6) Reboot the primary TSPS machine
7) Start the TSPS Server on the primary machine
8) Make sure the UI comes up
9) Reboot the secondary TSPS machine
10) Start the TSPS Server on the secondary machine
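Before rebooting, it can be worth confirming that both files on each node actually contain the values from steps 3 and 5. The following is a minimal sketch of such a check, not a BMC-provided tool. The names `parse_settings` and `find_mismatches` are hypothetical, and the parser assumes simple one-line `key=value` or `key: value` entries without quoting, nesting, or inline comments, so it is not a full sysctl.conf or YAML parser.

```python
# Expected values, copied from steps 3 and 5 of this article.
EXPECTED_SYSCTL = {
    "net.ipv4.tcp_keepalive_time": "600",
    "net.ipv4.tcp_keepalive_intvl": "60",
    "net.ipv4.tcp_keepalive_probes": "20",
}
EXPECTED_ES = {
    "discovery.zen.fd.ping_interval": "30s",
    "discovery.zen.fd.ping_timeout": "50s",
    "discovery.zen.fd.ping_retries": "5",
    "discovery.zen.commit_timeout": "180s",
}


def parse_settings(text, separator):
    """Parse flat 'key<separator>value' lines, skipping blanks and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or separator not in line:
            continue
        key, value = line.split(separator, 1)
        # If a key appears more than once, the last occurrence is kept.
        settings[key.strip()] = value.strip()
    return settings


def find_mismatches(actual, expected):
    """Return {key: (found, wanted)} for each missing or different value."""
    return {
        key: (actual.get(key), wanted)
        for key, wanted in expected.items()
        if actual.get(key) != wanted
    }
```

To use it, read the contents of /etc/sysctl.conf and pass them to `parse_settings` with `"="` as the separator, do the same for elasticsearch.yml with `":"`, and then compare each result against the matching dictionary with `find_mismatches`. An empty dictionary means every expected entry was found with the expected value. Note that this inspects only the file contents, not the values currently in effect on the running system.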