Looks like you also posted to this issue, which is probably a better place to discuss.
On Sat, Jan 14, 2017 at 12:56 PM, David Oppenheimer <davi...@google.com> wrote:
I don't know much about Spark, so just trying to understand your requirements.

If all you care about is making sure executors get scheduled to *some* node that has a Cassandra pod, then the existing pod affinity feature is sufficient. Put label "cassandra" on the Cassandra pods, and pod affinity for "cassandra" on the executor pods, and you're done. Pod affinity works off of labels, not hostnames.

If you're looking for something more sophisticated than that, can you explain in more detail?
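For concreteness, here is a rough sketch of that setup using the official kubernetes Python client. The image name, pod names, labels, and namespace are placeholders, and this assumes a cluster where pod affinity is available as a first-class field in the pod spec:

from kubernetes import client, config

config.load_kube_config()

# Require executor pods to land on a node that already runs a pod
# labeled app=cassandra; topologyKey=kubernetes.io/hostname means
# "same topology" = "same node".
affinity = client.V1Affinity(
    pod_affinity=client.V1PodAffinity(
        required_during_scheduling_ignored_during_execution=[
            client.V1PodAffinityTerm(
                label_selector=client.V1LabelSelector(
                    match_labels={"app": "cassandra"}
                ),
                topology_key="kubernetes.io/hostname",
            )
        ]
    )
)

executor_pod = client.V1Pod(
    metadata=client.V1ObjectMeta(
        name="spark-executor-0", labels={"app": "spark-executor"}
    ),
    spec=client.V1PodSpec(
        affinity=affinity,
        # "spark-executor:latest" is a placeholder image name.
        containers=[
            client.V1Container(name="executor", image="spark-executor:latest")
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=executor_pod)

With topologyKey set to kubernetes.io/hostname, the scheduler will only place the executor on a node that already hosts a pod carrying the "cassandra" label, which is the colocation being asked about here.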
On Sat, Jan 14, 2017 at 3:08 AM, vincent gromakowski <vincent.g...@gmail.com> wrote:
Some additional information on this issue:
- Kubernetes allows pod-to-pod affinity, so it's easy to launch Spark executors on the same node as a backend (for instance Cassandra nodes) based on the Cassandra pod labels.
- For each task, Spark tries to schedule it on the executor with the best possible data locality, based on the hostnames of the executor pods and the hostnames of the Cassandra pods.
- Because the Spark pods and the Cassandra pods advertise different hostnames, Spark doesn't know which hostnames are physically identical. This can result in a complete shuffle even when the executors and Cassandra are colocated.

Resolving this issue would mean getting the physical information (nodes, racks, DC) and using it in Spark and Cassandra advertising/scheduling.
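One way to get at that physical information is to ask the API server which node each pod actually landed on. A rough sketch with the kubernetes Python client (the namespace and the "app" label are placeholders), just to show the pod-to-node mapping that the Spark side would need in order to recognize that an executor and a Cassandra pod share a node despite advertising different hostnames:

from collections import defaultdict

from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

# Group pods by the node they were scheduled on, so an executor pod and a
# Cassandra pod that advertise different hostnames can still be recognized
# as living on the same physical node.
pods_by_node = defaultdict(list)
for pod in v1.list_namespaced_pod(namespace="default").items:
    pods_by_node[pod.spec.node_name].append(
        ((pod.metadata.labels or {}).get("app"), pod.metadata.name, pod.status.pod_ip)
    )

for node, pods in pods_by_node.items():
    print(node, pods)

Rack and DC information would similarly have to be derived from node labels (e.g. the zone/failure-domain labels) and fed into the Cassandra snitch and Spark's locality preferences.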
On Thursday, January 12, 2017 at 18:02:51 UTC+1, vincent gromakowski wrote:
Hi all,
Does anyone have experience running Spark with CNI on Kubernetes and benefiting from data locality with backend nodes like HDFS or Cassandra? Is there any mechanism in Kubernetes to colocate containers?
Tx
Vincent