Marathon switches to inactive in Mesos after 2 minutes

Terry Siu

Mar 14, 2017, 12:10:39 PM
to marathon-framework
Hi all,

I have a non-HA Mesos and Marathon setup in AWS ECS. It consists of:

1 Zookeeper
1 Mesos Master
1 Mesos Slave
1 Marathon

Each of the above components is running in its own t2.micro ECS cluster. Everything looks fine after deployment, in that I can bring up both the Mesos and Marathon web UIs. However, I am unable to get the Marathon framework to stay active: on every restart, it switches to inactive in Mesos after 2 minutes. I think the Marathon scheduler is not able to contact the Mesos master after startup, but I don't fully understand why, and web searches so far have not turned up a solution. I see the following in my Mesos master logs, which tells me that the master is attempting to send messages to Marathon.

W0314 03:22:11.573559 9 master.hpp:2113] Master attempted to send message to disconnected framework 3d45339a-7e1f-4b2a-9cfc-bddfaf704f96-0002 (marathon) at scheduler-ee8b1f2b-74b1...@10.0.128.229:39207
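One way to narrow this down might be to pull the scheduler address out of that warning and test whether it can be reached over TCP from the master host. The following is only a minimal sketch under that assumption, not a confirmed diagnosis. The names `scheduler_endpoint` and `can_connect` are hypothetical helpers written for this illustration and are not part of Mesos or Marathon.

```python
import re
import socket

# Matches the "@<ip>:<port>" tail of the scheduler address in the master warning.
SCHEDULER_ADDR = re.compile(r"@(\d{1,3}(?:\.\d{1,3}){3}):(\d+)\s*$")


def scheduler_endpoint(log_line):
    """Return (ip, port) of the scheduler named in a master log line, or None."""
    match = SCHEDULER_ADDR.search(log_line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def can_connect(ip, port, timeout=3.0):
    """Return True if a TCP connection to ip:port opens within the timeout."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


# The warning from the master log above, split only for readability.
line = ("W0314 03:22:11.573559 9 master.hpp:2113] Master attempted to send "
        "message to disconnected framework "
        "3d45339a-7e1f-4b2a-9cfc-bddfaf704f96-0002 (marathon) at "
        "scheduler-ee8b1f2b-74b1...@10.0.128.229:39207")
print(scheduler_endpoint(line))
```

Running `can_connect(*scheduler_endpoint(line))` on the master host while Marathon is up would show whether that IP and port accept connections from there. A `False` result would point toward a network or security group issue rather than a Marathon setting, though it would not prove it.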

I'm sure I have a misconfiguration somewhere, but I thought I would put this out to the community while I continue to investigate. Some more details: each of my Docker containers is running in host network mode. My task definitions are as follows:

ZK (image from zookeeper)
  • port mappings for 2181, 2888, 3888
Mesos master (image from mesosphere/mesos-master:1.0.3)
  • port mapping for 5050
  • environment variables:
    • MESOS_HOSTNAME=<set to public IP of the EC2 instance hosting the master> (I needed to set this because the Mesos UI would previously display a pop-up stating it could not connect. The underlying reason turned out to be the UI pinging the Mesos master at its private IP.)
    • MESOS_IP=<set to private IP of the EC2 instance hosting the master>
    • MESOS_LOG_DIR=/var/log/mesos
    • MESOS_PORT=5050
    • MESOS_QUORUM=1
    • MESOS_REGISTRY=in_memory
    • MESOS_WORK_DIR=/var/tmp/mesos
    • MESOS_ZK=zk://<ZK public IP>:2181/mesos
Mesos slave (image from mesosphere/mesos-slave:1.0.3)
  • port mapping for 5051
  • command parameter
    • --launcher=posix
  • environment variables:
    • MESOS_CONTAINERIZERS=mesos,docker
    • MESOS_HOSTNAME=<set to public IP of EC2 instance hosting the slave>
    • MESOS_IP=<set to private IP of ECS instance hosting the slave>
    • MESOS_LOG_DIR=/var/log/mesos
    • MESOS_MASTER=zk://<ZK public IP>:2181/mesos
    • MESOS_PORT=5051
    • MESOS_SWITCH_USER=0
    • MESOS_WORK_DIR=/var/tmp/mesos
Marathon (image from mesosphere/marathon)
  • port mappings for 8080
  • command parameters
    • --master zk://<ZK public IP>:2181/mesos
    • --zk zk://<ZK public IP>:2181/marathon
    • --hostname <private IP of EC2 instance hosting Marathon>
    • --disable_ha

If any expert eyes can offer insight into my setup, the help would be greatly appreciated.

Thanks,
-Terry