How can I configure Eureka Server to recognize if a service is DOWN/OUT_OF_SERVICE?

andreas.l...@gmail.com

Nov 3, 2014, 2:54:48 AM

to eureka_...@googlegroups.com

Hi,

I'm trying to configure Eureka Server to recognize when a service registered with it is returning DOWN/OUT_OF_SERVICE, using a HealthCheckHandler on the client's DiscoveryClient.

However, the handler does not seem to be invoked by the Eureka server. I've seen some examples using Karyon; is that the only way?

How can I configure the Eureka server to invoke, e.g., /health on the given service to check the service's internal state?

I cannot rely only on the service sending heartbeats.

Thanks!

Kind regards,
Andreas

tb...@netflix.com

Nov 3, 2014, 12:10:23 PM

to eureka_...@googlegroups.com, andreas.l...@gmail.com

Hi,

Eureka server expects a client to send heartbeats, not the other way around.

The heartbeat logic executes on the client side. The bottom line is that it updates the local InstanceInfo status, which is then pushed to the Eureka server.
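To make the push model concrete, here is a minimal, dependency-free sketch of one client-side cycle: the check runs locally and the result is pushed to the server only when it changes. The class `ClientPushSketch`, the local `Status` enum and the in-memory "server" list are hypothetical stand-ins for illustration, not Eureka APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class ClientPushSketch {

    // Local stand-in for Eureka's InstanceStatus so the sketch has no dependencies.
    enum Status { UP, DOWN }

    private final Supplier<Status> healthCheck;                   // hypothetical local health probe
    private final List<Status> sentToServer = new ArrayList<>();  // stands in for the registry
    private Status localStatus = Status.UP;

    ClientPushSketch(Supplier<Status> healthCheck) {
        this.healthCheck = healthCheck;
    }

    // One cycle: run the check on the client, push only when the status changed.
    void replicateOnce() {
        Status fresh = healthCheck.get();
        if (fresh != localStatus) {
            localStatus = fresh;
            sentToServer.add(fresh); // a real client would contact the server here
        }
    }

    List<Status> sent() {
        return sentToServer;
    }

    public static void main(String[] args) {
        Status[] probe = { Status.UP };
        ClientPushSketch client = new ClientPushSketch(() -> probe[0]);
        client.replicateOnce();   // still UP, nothing is pushed
        probe[0] = Status.DOWN;
        client.replicateOnce();   // change detected, DOWN is pushed
        System.out.println(client.sent()); // prints [DOWN]
    }
}
```

The point of the sketch is only the direction of the call: the server never reaches out to the service, so a status change becomes visible no sooner than the next client-side cycle.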

Can you paste the code snippet with your setup/initialization logic?

/Tomasz

andreas.l...@gmail.com

Nov 4, 2014, 2:59:28 AM

to eureka_...@googlegroups.com, andreas.l...@gmail.com

I set it up with:

discoveryClient.registerHealthCheck(new DefaultHealthCheckHandler(healthIndicatorService));

where 'healthIndicatorService' is a health check service from Spring Boot.

The settings that I guess control how the instance info is pushed out are (in DefaultEurekaClientConfig):

- instanceInfoReplicationIntervalSeconds;
- initialInstanceInfoReplicationIntervalSeconds;

Here's my HealthCheckHandler:

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.Status;

import se.payzone.crypto.service.HealthIndicatorService;
import se.payzone.crypto.utils.LogMessage;

import com.netflix.appinfo.HealthCheckHandler;
import com.netflix.appinfo.InstanceInfo.InstanceStatus;

public class DefaultHealthCheckHandler implements HealthCheckHandler {

    private final Logger logger = LoggerFactory.getLogger(this.getClass());

    private final HealthIndicatorService healthIndicatorService;

    public DefaultHealthCheckHandler(
            final HealthIndicatorService healthIndicatorService) {
        this.healthIndicatorService = healthIndicatorService;
    }

    @Override
    public InstanceStatus getStatus(final InstanceStatus currentStatus) {

        this.logger.debug(LogMessage.createForAction("getStatus").markStart()
                .toString());

        InstanceStatus newStatus = InstanceStatus.UP;

        // Anything other than UP from Spring Boot is reported as OUT_OF_SERVICE.
        final Health health = this.healthIndicatorService.health();
        if (!Status.UP.equals(health.getStatus())) {
            newStatus = InstanceStatus.OUT_OF_SERVICE;
        }

        this.logger.debug(LogMessage.createForAction("getStatus")
                .addPart("instanceStatus", newStatus).markEnd().toString());

        return newStatus;
    }

}

On Monday, November 3, 2014 at 18:10:23 UTC+1, tb...@netflix.com wrote:
> Hi, Eureka server expects a client to send heartbeats, not the other way around.

tb...@netflix.com

Nov 4, 2014, 6:44:48 PM

to eureka_...@googlegroups.com, andreas.l...@gmail.com

Hi,

Your code looks good. Maybe you have an invalid configuration. I have written a simple client that does pretty much the same thing, but with a hardcoded instance status:

Client code:

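import com.netflix.appinfo.ApplicationInfoManager;
import com.netflix.appinfo.InstanceInfo.InstanceStatus;
import com.netflix.appinfo.MyDataCenterInstanceConfig;
import com.netflix.discovery.DefaultEurekaClientConfig;
import com.netflix.discovery.DiscoveryClient;
import com.netflix.discovery.DiscoveryManager;

public class SampleDiscoveryClient {

    public static void main(String[] args) {
        System.setProperty("eureka.region", "default");
        System.setProperty("eureka.environment", "test");
        System.setProperty("eureka.client.props", "sample-eureka-client");

        DiscoveryManager.getInstance().initComponent(
                new MyDataCenterInstanceConfig(),
                new DefaultEurekaClientConfig());

        ApplicationInfoManager.getInstance().setInstanceStatus(InstanceStatus.UP);

        DiscoveryClient discoveryClient = DiscoveryManager.getInstance().getDiscoveryClient();
        discoveryClient.registerHealthCheck(new DefaultHealthCheckHandler());

        System.out.println("Waiting indefinitely");
        // Keep the JVM alive so the client's background threads keep running.
        while (true) {
            try {
                Thread.sleep(1000000);
            } catch (InterruptedException e) {
                // IGNORE
            }
        }
    }
}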

Health check handler:

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import com.netflix.appinfo.HealthCheckHandler;
import com.netflix.appinfo.InstanceInfo.InstanceStatus;

public class DefaultHealthCheckHandler implements HealthCheckHandler {

    private final Logger logger = LoggerFactory.getLogger(this.getClass());

    @Override
    public InstanceStatus getStatus(final InstanceStatus currentStatus) {
        this.logger.debug("Called get status with current status=" + currentStatus);

        // Hardcoded status, for testing only.
        InstanceStatus newStatus = InstanceStatus.OUT_OF_SERVICE;
        this.logger.debug("Setting instance status to " + newStatus);

        return newStatus;
    }
}

Client configuration (taken from eureka/eureka-server/conf/sampleclient):

###Eureka Client configuration for Sample Eureka Client

#Properties based configuration for eureka client. The properties specified here is mostly what the users
#need to change. All of these can be specified as a java system property with -D option (eg)-Deureka.region=us-east-1
#For additional tuning options refer <url to go here>

#Region where eureka is deployed -For AWS specify one of the AWS regions, for other datacenters specify a arbitrary string
#indicating the region.This is normally specified as a -D option (eg) -Deureka.region=us-east-1
eureka.region=default

#Name of the application to be identified by other services
eureka.name=sampleEurekaClient

#Virtual host name by which the clients identifies this service
#eureka.vipAddress=eureka.mydomain.net

#The port where the service will be running and servicing requests
#eureka.port=80

#For eureka clients running in eureka server, it needs to connect to servers in other zones
eureka.preferSameZone=true

#Change this if you want to use a DNS based lookup for determining other eureka servers. For example
#of specifying the DNS entries, check the eureka-client-test.properties, eureka-client-prod.properties
eureka.shouldUseDns=false

eureka.us-east-1.availabilityZones=default

eureka.serviceUrl.default=<your_discovery_service_url>

It worked for me with my test cluster.

/Tomasz

andreas.l...@gmail.com

Nov 7, 2014, 3:47:30 AM

to eureka_...@googlegroups.com, andreas.l...@gmail.com

I tested your code and it does seem to work!

However, if the result from my HealthCheckHandler changes after a while, the change does not seem to end up in the Eureka server.

Changing the status works when I call the Eureka server through the REST API, but I cannot get the status to change based on the result from the HealthCheckHandler.

Regards,
Andreas

tb...@netflix.com

Nov 7, 2014, 12:16:09 PM

to eureka_...@googlegroups.com, andreas.l...@gmail.com

Hi,

I have modified my code a little to read the instance status from the terminal; the health check then returns that value on its next invocation:

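import java.io.IOException;
import java.io.InputStreamReader;
import java.io.LineNumberReader;

import com.netflix.appinfo.ApplicationInfoManager;
import com.netflix.appinfo.InstanceInfo.InstanceStatus;
import com.netflix.appinfo.MyDataCenterInstanceConfig;
import com.netflix.discovery.DefaultEurekaClientConfig;
import com.netflix.discovery.DiscoveryClient;
import com.netflix.discovery.DiscoveryManager;

public class SampleDiscoveryClient {

    public static void main(String[] args) {
        System.setProperty("eureka.region", "default");
        System.setProperty("eureka.environment", "test");
        System.setProperty("eureka.client.props", "sample-eureka-client");

        DiscoveryManager.getInstance().initComponent(
                new MyDataCenterInstanceConfig(),
                new DefaultEurekaClientConfig());

        ApplicationInfoManager.getInstance().setInstanceStatus(InstanceStatus.UP);

        DiscoveryClient discoveryClient = DiscoveryManager.getInstance().getDiscoveryClient();
        discoveryClient.registerHealthCheck(new DefaultHealthCheckHandler());

        LineNumberReader lr = new LineNumberReader(new InputStreamReader(System.in));
        while (true) {
            System.out.print("Enter new status: ");
            try {
                String status = lr.readLine();
                if (status == null) {
                    break; // end of input
                }
                // The handler returns this value the next time it is invoked.
                DefaultHealthCheckHandler.nextStatus = InstanceStatus.valueOf(status.trim());
                System.out.println("Set new status value to " + DefaultHealthCheckHandler.nextStatus);
            } catch (IOException e) {
                e.printStackTrace();
            } catch (IllegalArgumentException ex) {
                System.err.println("Invalid status value");
            }
        }
    }
}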

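import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import com.netflix.appinfo.HealthCheckHandler;
import com.netflix.appinfo.InstanceInfo.InstanceStatus;

public class DefaultHealthCheckHandler implements HealthCheckHandler {

    private final Logger logger = LoggerFactory.getLogger(this.getClass());

    // Written by the main thread and read by the client's background thread, hence volatile.
    public static volatile InstanceStatus nextStatus = InstanceStatus.UP;

    @Override
    public InstanceStatus getStatus(final InstanceStatus currentStatus) {
        this.logger.debug("Called get status with current status=" + currentStatus);

        InstanceStatus newStatus = nextStatus;
        this.logger.debug("Setting instance status to " + newStatus);

        return newStatus;
    }
}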

It works for me. Whenever I change the instance status, the Eureka registry gets updated the next time the health check is called (it runs at a 30-second interval).
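Sharing the next status through a public static field is fine for a demo. A minimal sketch of the same idea with the value held in an `AtomicReference` is shown here; the nested `InstanceStatus` enum and `HealthCheckHandler` interface are local stand-ins that mirror the shape used in this thread so the example compiles on its own, and `SwitchableHandler` is a hypothetical name.

```java
import java.util.concurrent.atomic.AtomicReference;

public class SwitchableHealthCheck {

    // Local stand-ins for the com.netflix.appinfo types used in the thread.
    enum InstanceStatus { UP, DOWN, STARTING, OUT_OF_SERVICE, UNKNOWN }

    interface HealthCheckHandler {
        InstanceStatus getStatus(InstanceStatus currentStatus);
    }

    // Handler whose answer can be changed safely from another thread.
    static final class SwitchableHandler implements HealthCheckHandler {
        private final AtomicReference<InstanceStatus> next =
                new AtomicReference<>(InstanceStatus.UP);

        void setNext(InstanceStatus status) {
            next.set(status);
        }

        @Override
        public InstanceStatus getStatus(InstanceStatus currentStatus) {
            // Whatever was set last is reported on the next invocation.
            return next.get();
        }
    }

    public static void main(String[] args) {
        SwitchableHandler handler = new SwitchableHandler();
        System.out.println(handler.getStatus(InstanceStatus.UP)); // prints UP
        handler.setNext(InstanceStatus.DOWN);
        System.out.println(handler.getStatus(InstanceStatus.UP)); // prints DOWN
    }
}
```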

You should see the following log lines printed after your healthcheck is called and changes instance status:

DOWN

Set new status value to DOWN

Enter new status: 2014-11-07 09:09:17,368 INFO com.netflix.discovery.DiscoveryClient$InstanceInfoReplicator:1651 [DiscoveryClient-3] [run] DiscoveryClient_SAMPLEEUREKACLIENT/lgml-tbak - retransmit instance info with status DOWN

2014-11-07 09:09:17,368 INFO com.netflix.discovery.DiscoveryClient:614 [DiscoveryClient-3] [register] DiscoveryClient_SAMPLEEUREKACLIENT/lgml-tbak: registering service...

2014-11-07 09:09:17,443 INFO com.netflix.discovery.DiscoveryClient:619 [DiscoveryClient-3] [register] DiscoveryClient_SAMPLEEUREKACLIENT/lgml-tbak - registration status: 204

2014-11-07 09:09:38,245 INFO com.netflix.discovery.DiscoveryClient$HeartbeatThread:1590 [pool-2-thread-1] [run] DiscoveryClient_SAMPLEEUREKACLIENT/lgml-tbak - Re-registering apps/SAMPLEEUREKACLIENT

2014-11-07 09:09:38,246 INFO com.netflix.discovery.DiscoveryClient:614 [pool-2-thread-1] [register] DiscoveryClient_SAMPLEEUREKACLIENT/lgml-tbak: registering service...

2014-11-07 09:09:38,322 INFO com.netflix.discovery.DiscoveryClient:619 [pool-2-thread-1] [register] DiscoveryClient_SAMPLEEUREKACLIENT/lgml-tbak - registration status: 204

OUT_OF_SERVICE

Set new status value to OUT_OF_SERVICE

Enter new status: 2014-11-07 09:10:17,453 INFO com.netflix.discovery.DiscoveryClient$InstanceInfoReplicator:1651 [DiscoveryClient-1] [run] DiscoveryClient_SAMPLEEUREKACLIENT/lgml-tbak - retransmit instance info with status OUT_OF_SERVICE

2014-11-07 09:10:17,453 INFO com.netflix.discovery.DiscoveryClient:614 [DiscoveryClient-1] [register] DiscoveryClient_SAMPLEEUREKACLIENT/lgml-tbak: registering service...

2014-11-07 09:10:17,527 INFO com.netflix.discovery.DiscoveryClient:619 [DiscoveryClient-1] [register] DiscoveryClient_SAMPLEEUREKACLIENT/lgml-tbak - registration status: 204

andreas.l...@gmail.com

Nov 10, 2014, 7:44:36 AM

to eureka_...@googlegroups.com, andreas.l...@gmail.com

Hmm, your code indeed makes a good example of it working.

I think I found an issue that might be what breaks things for my code.

I run two copies of the same service (under different names) on the same host (same IP, same hostname) on my machine.

It seems that Eureka manages services at the host level?

The thing is, if I have a service running as two different processes on the same machine and I change the status using the REST API, both services get the new status, even though my REST call targeted a single instance.

Consider two applications registered with the IDs "MY-SERVICE-8042" and "MY-SERVICE-8081" on the same machine/instance, called "MYHOST57".

This call will change the status of both applications:

http://localhost:8761/v2/apps/MY-SERVICE-8042/MYHOST57/status?value=UP
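For reference, a status call like the one above can be built from Java roughly as follows. This is only a sketch: it assumes Java 11 or later for `java.net.http`, uses PUT as in the curl call quoted later in this thread, and the class and method names (`StatusOverrideRequest`, `build`) are made up for illustration. The request is built but not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class StatusOverrideRequest {

    // Builds (but does not send) a PUT to the status endpoint used in this thread.
    static HttpRequest build(String baseUrl, String appId, String instanceId, String status) {
        URI uri = URI.create(baseUrl + "/v2/apps/" + appId + "/" + instanceId
                + "/status?value=" + status);
        return HttpRequest.newBuilder(uri)
                .PUT(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build("http://localhost:8761", "MY-SERVICE-8042", "MYHOST57", "UP");
        // prints PUT http://localhost:8761/v2/apps/MY-SERVICE-8042/MYHOST57/status?value=UP
        System.out.println(request.method() + " " + request.uri());
    }
}
```

To actually send it, the request could be passed to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.discarding())`. Note that the URL carries both the application ID and the instance ID, so the call is meant to address exactly one instance.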

//Andreas

tb...@netflix.com

Nov 10, 2014, 12:19:42 PM

to eureka_...@googlegroups.com, andreas.l...@gmail.com

I tried to reproduce this in my environment, but it always works fine. I ran two instances of the app I posted above, with two different names.

First I tried my local deployment. I could change the statuses independently by posting a new state directly, like you did above. Only the target app instance was updated.

The same held in an AWS deployment.

I vaguely remember now an issue with two apps deployed in the same node, but I cannot remember the details of it.

Looking into the code, we always start with application record fetch, which aggregates a list of server instances. If the latter have overlapping names, it does not matter, as long as app names are different.

/Tomasz

Andreas Eriksson

Nov 11, 2014, 2:27:37 AM

to tb...@netflix.com, eureka_...@googlegroups.com

Hmm, ok...

I can see my client sending its status to URL apps/MY-SERVICE-8042/MYHOST57?status=UP&lastDirtyTimestamp=...

What's the name of the endpoint in Eureka that receives this update? I guess I'll have to debug the request.

PS. Thank you for spending time on something that is probably caused by me misconfiguring or coding something wrong :-/

--

¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
Andreas Eriksson

¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

Andreas Eriksson

Nov 11, 2014, 2:49:07 AM

to tb...@netflix.com, eureka_...@googlegroups.com

I found the class InstanceResource with 'renewLease' method.

I can see that the query param 'status' is set to UP, param 'overriddenStatus' is null and 'lastDirtyTimestamp' is set.

From what I can tell the only part setting a new status is: registry.storeOverriddenStatusIfRequired(this.id, InstanceInfo.InstanceStatus.valueOf(overriddenStatus));

But the condition to get to this code is never fulfilled:

if ((response.getStatus() == Response.Status.NOT_FOUND.getStatusCode()) && (overriddenStatus != null) && (!InstanceInfo.InstanceStatus.UNKNOWN.equals(overriddenStatus)) && (isFromReplicaNode))

Am I on to something?

--

¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
Andreas Eriksson

¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

Andreas Eriksson

Nov 11, 2014, 5:03:44 AM

to tb...@netflix.com, eureka_...@googlegroups.com

Sorry for spamming :-/

Think I found the problem.

My HealthCheckHandler returned the status 'OUT_OF_SERVICE', which seems to be a state from which you cannot get back to 'UP' or any other status (?).

It works fine when switching between 'UP' and 'DOWN'.

--

¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤
Andreas Eriksson

¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤¤

tb...@netflix.com

Nov 13, 2014, 11:55:32 AM

to eureka_...@googlegroups.com, tb...@netflix.com, andreas.l...@gmail.com

The typical pattern is to use the UP/DOWN status values on the application side, and OUT_OF_SERVICE from the management console. For example, if you use Asgard (https://github.com/Netflix/asgard), you can disable services there (set OUT_OF_SERVICE) when doing red/black pushes.
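Applied to a health check handler, that pattern means an unhealthy application reports DOWN rather than OUT_OF_SERVICE. A minimal sketch of such a mapping follows; the nested `InstanceStatus` enum is a local stand-in for the Eureka type, and the `"UP"` health code string is an assumption modeled on the Spring Boot status used earlier in the thread.

```java
public class AppSideStatusMapping {

    // Local stand-in for com.netflix.appinfo.InstanceInfo.InstanceStatus.
    enum InstanceStatus { UP, DOWN, STARTING, OUT_OF_SERVICE, UNKNOWN }

    // Application-side mapping: report only UP or DOWN, never OUT_OF_SERVICE,
    // which is left to operators and management tools.
    static InstanceStatus fromHealthCode(String healthCode) {
        return "UP".equals(healthCode) ? InstanceStatus.UP : InstanceStatus.DOWN;
    }

    public static void main(String[] args) {
        System.out.println(fromHealthCode("UP"));             // prints UP
        System.out.println(fromHealthCode("OUT_OF_SERVICE")); // prints DOWN
    }
}
```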

Have you found out the reason why setting OUT_OF_SERVICE does not work for you?

johnw...@gmail.com

Oct 23, 2015, 1:10:03 PM

to eureka_netflix, andreas.l...@gmail.com

I believe I am seeing a similar issue. If I send an OUT_OF_SERVICE request to a particular app on a host that runs multiple apps, all Eureka-enabled apps on that same host start registering an OUT_OF_SERVICE state.

For example, in my test environment I have a Eureka server instance and 3 Eureka-enabled apps running across 3 nodes. If I issue an OUT_OF_SERVICE to appA running on node1 (curl -X PUT http://node1:9090/eureka/v2/apps/appA/node1/status?value=OUT_OF_SERVICE), all apps on node1 (including the Eureka server) start showing an OUT_OF_SERVICE state. Is this expected behavior?

Thanks.

-John

Tomasz Bak

Oct 23, 2015, 1:44:00 PM

to eureka_...@googlegroups.com, andreas.l...@gmail.com

I can confirm that this is current behavior, and it is actually a bug in the implementation.

Please report it as an issue on github.com/Netflix/eureka. It may, however, take a few weeks before we have time to fix it.


johnw...@gmail.com

Oct 23, 2015, 2:31:53 PM

to eureka_netflix, andreas.l...@gmail.com

Will do. Thanks for the quick response.
