I've implemented a custom task driver that spins up Firecracker microVMs, and I'm having difficulties with the networking bit.
I'm creating a tap network interface for each VM and assigning the VM an IP in that interface's subnet. For example, the tap interface for a task could be `172.18.0.1` and the VM would have the IP `172.18.0.2`. From StartTask I return a DriverNetwork struct (as per the function signature) with the `172.18.0.2` IP, the supplied port map (right now it's "http = 80") and AutoAdvertise set to true.
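To make that concrete, here is a minimal sketch of the value I'm building. The `DriverNetwork` type below is a local copy of the three fields I set on Nomad's `drivers.DriverNetwork`, redeclared only so the snippet compiles without the Nomad module. `guestIP` and `buildDriverNetwork` are hypothetical helper names for this post, and the "VM gets the address right after the tap address" rule is an assumption taken from the example above, not something Nomad requires.

```go
package main

import (
	"fmt"
	"net/netip"
)

// DriverNetwork is a local stand-in for the PortMap, IP and AutoAdvertise
// fields of Nomad's drivers.DriverNetwork, so this sketch has no
// external dependencies.
type DriverNetwork struct {
	PortMap       map[string]int
	IP            string
	AutoAdvertise bool
}

// guestIP (hypothetical helper) returns the address immediately after the
// tap interface's address, e.g. 172.18.0.1 -> 172.18.0.2.
func guestIP(tapIP string) (string, error) {
	addr, err := netip.ParseAddr(tapIP)
	if err != nil {
		return "", fmt.Errorf("parsing tap address %q: %w", tapIP, err)
	}
	next := addr.Next() // zero Addr if there is no following address
	if !next.IsValid() {
		return "", fmt.Errorf("no address after %s", tapIP)
	}
	return next.String(), nil
}

// buildDriverNetwork (hypothetical helper) assembles what StartTask would
// return alongside the task handle.
func buildDriverNetwork(tapIP string, portMap map[string]int) (*DriverNetwork, error) {
	ip, err := guestIP(tapIP)
	if err != nil {
		return nil, err
	}
	return &DriverNetwork{PortMap: portMap, IP: ip, AutoAdvertise: true}, nil
}

func main() {
	dn, err := buildDriverNetwork("172.18.0.1", map[string]int{"http": 80})
	if err != nil {
		panic(err)
	}
	fmt.Printf("ip=%s http=%d auto_advertise=%t\n", dn.IP, dn.PortMap["http"], dn.AutoAdvertise)
}
```

In the real driver the struct comes from the `github.com/hashicorp/nomad/plugins/drivers` package and is the second return value of StartTask.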
I'm running the Nomad agent locally with `nomad agent -dev`.
The Consul health check passes, because I set both the service and its check to use the driver address_mode.
The issue is that Nomad shows the allocation's addresses on the loopback interface (or whatever IP `network_interface` resolves to in the client config).
This is roughly my job file:
job "example" {
  group "vms" {
    task "vm" {
      driver = "firecracker"

      config {
        port_map {
          http = 80
        }
      }

      resources {
        cpu    = 500 # 500 MHz
        memory = 256 # 256MB

        network {
          mbits = 10

          port "http" {
            static = 80
          }
        }
      }

      service {
        name         = "app123"
        port         = "http"
        address_mode = "driver"

        check {
          address_mode = "driver"
          name         = "alive"
          type         = "tcp"
          interval     = "10s"
          timeout      = "2s"
        }
      }
    }
  }
}
And this is what I get for an allocation:
$ nomad alloc status 672164e9
ID                  = 672164e9
Eval ID             = 5dc2f315
Name                = example.vms[0]
Node ID             = 76c1e075
Job ID              = example
Job Version         = 6
Client Status       = running
Client Description  = Tasks are running
Desired Status      = run
Desired Description = <none>
Created             = 20s ago
Modified            = 1s ago
Deployment ID       = db938cae
Deployment Health   = healthy

Task "vm" is "running"
Task Resources
CPU      Memory   Disk     Addresses
500 MHz  256 MiB  300 MiB  http: 127.0.0.1:80

Task Events:
Started At     = 2019-05-06T20:19:12Z
Finished At    = N/A
Total Restarts = 0
Last Restart   = N/A

Recent Events:
Time                       Type        Description
2019-05-06T16:19:12-04:00  Started     Task started by client
2019-05-06T16:19:10-04:00  Task Setup  Building Task Directory
2019-05-06T16:19:10-04:00  Received    Task received by client
Is there a way to make Nomad show the 172.x IP I'm supplying?
Does it matter? Should I just use Consul directly for service discovery? Consul appears to be storing the right address in its catalog (hence the health check succeeds).
Am I missing documentation on custom task drivers, or are we expected to look at the already-implemented task drivers (like docker, qemu, lxc, singularity, etc.)? The latter is fine, I know what it's like to run a small business :)
Thanks!