Docker, IPv6 and --net="host"

As you recall from the last few blog posts, I went through basic IPv6 deployment for Docker Engine, Docker Hub, Docker Registry and Docker Swarm.  All of those configurations used Docker's default networking with the Docker-provided bridge layout.

Over the past few months, I have met with several customers who don't use the Docker bridge setup at all. They use the Docker run option --net="host" to do what they call "native" networking.  Simply put, this flag has containers run using the networking configuration of the underlying Linux host.  The advantage of this is that it is brain-dead simple to understand, troubleshoot and use.  The one drawback is that you can very easily have port conflicts: if one container is listening natively on port 80 on the Linux host and I run another container that needs that same port, there is a conflict because only one listener can be active on that port at a time.

None of the customers I have met with need to run containers on the same host that use the exact same port, so this is a magical option for them.

IPv6 works just as it should in this networking scenario.  Let’s take a look at an example setup.

In the ‘ip a’ output below, you can see that ‘docker-v6-1’ has IPv6 addressing on the eth0 interface (See the Docker Engine post on enabling IPv6):

root@docker-v6-1:~# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:0c:29:f3:f8:48 brd ff:ff:ff:ff:ff:ff
    inet 192.168.80.200/24 brd 192.168.80.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fd15:4ba5:5a2b:1009:e91e:221:a4a0:2223/64 scope global temporary dynamic
       valid_lft 83957sec preferred_lft 11957sec
    inet6 fd15:4ba5:5a2b:1009:20c:29ff:fef3:f848/64 scope global dynamic
       valid_lft 83957sec preferred_lft 11957sec
    inet6 fe80::20c:29ff:fef3:f848/64 scope link
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 00:0c:29:f3:f8:52 brd ff:ff:ff:ff:ff:ff
4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:93:33:cc:66 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever
    inet6 fd15:4ba5:5a2b:100a::1/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::42:93ff:fe33:cc66/64 scope link
       valid_lft forever preferred_lft forever
    inet6 fe80::1/64 scope link
       valid_lft forever preferred_lft forever

As a test, I will run an NGINX container using the --net="host" option. Before I run the container, I disable the "ipv6only" functionality in the NGINX default.conf file so that I have dual-stack support.

Create/edit an NGINX default.conf file with the following setting changed:

listen       [::]:80 ipv6only=off;
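
For reference, a minimal sketch of what that default.conf could look like (assuming the content paths used by the official nginx image); the listen line is the only IPv6-relevant change:

server {
    listen       [::]:80 ipv6only=off;
    server_name  localhost;

    location / {
        root   /usr/share/nginx/html;
        index  index.html index.htm;
    }
}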

Now, run the container with the --net="host" option set and bind mount a volume to where that default.conf file is located on the Docker host:

root@docker-v6-1:~# docker run -itd --net="host" -v ~/default.conf:/etc/nginx/conf.d/default.conf nginx

Using --net="host", the new container will use the same IPv4 and IPv6 addresses as the host and listen on port 80 (the default in the NGINX setup):

root@docker-v6-1:~# netstat -nlp | grep nginx
tcp6       0      0 :::80                   :::*                    LISTEN      2554/nginx: master

Test accessing the NGINX default page over IPv4 using the 192.168.80.200 address (reference eth0 above):

root@docker-v6-1:~# wget -O - http://192.168.80.200
--2016-04-21 11:00:50--  http://192.168.80.200/
Connecting to 192.168.80.200:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 612 
Saving to: ‘STDOUT’

 0% [                                                                                                 ] 0           --.-K/s              

Welcome to nginx!

.....[output truncated for clarity]

Test accessing the NGINX default page over IPv6 using the fd15:4ba5:5a2b:1009:20c:29ff:fef3:f848 address (reference eth0 above):

root@docker-v6-1:~# wget -O - http://[fd15:4ba5:5a2b:1009:20c:29ff:fef3:f848]
--2016-04-21 11:01:13--  http://[fd15:4ba5:5a2b:1009:20c:29ff:fef3:f848]/
Connecting to [fd15:4ba5:5a2b:1009:20c:29ff:fef3:f848]:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 612 
Saving to: ‘STDOUT’

 0% [                                                                                                 ] 0           --.-K/s              

Welcome to nginx!

.....[output truncated for clarity]

It works!

Remember that this is cool and all, but watch out for port conflicts between containers.  Shown below is an example of what you will see if you run two containers on the same host with --net="host" set and both use the same port: one or more of the containers will exit, and a message like this will likely show up in the log:

root@docker-v6-1:~# docker logs b47aa56cc822
2016/06/17 17:49:34 [emerg] 1#1: bind() to [::]:80 failed (98: Address already in use)
nginx: [emerg] bind() to [::]:80 failed (98: Address already in use)
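
If you genuinely do need two host-networked NGINX containers on the same box, one hedged workaround is to give the second container its own config file that listens on a different port (the ~/default-8080.conf name below is just an example, with the listen line changed to listen [::]:8080 ipv6only=off;):

docker run -itd --net="host" -v ~/default-8080.conf:/etc/nginx/conf.d/default.conf nginx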

Docker Swarm and etcd with IPv6

Hey gang!

It has been awhile since I posted here.  Part of that is due to pure laziness and business travel nonsense and the other part is that I was waiting on a couple of bugs to get resolved for Docker Swarm.  Well, the bugs have been resolved and I have gotten this whole thing to work, finally.

As you recall from the last few blog posts, I went through basic IPv6 deployment for Docker Engine, Docker Hub and Docker Registry.  In this post I will talk about using IPv6 with Docker Swarm.

There is a bunch of content out there to teach you Swarm and how to get it running. Here is a link to the main Docker Swarm docs: https://docs.docker.com/swarm/.  What I care about is doing basic stuff with Docker Swarm over IPv6.

As with my previous Docker posts, I am not teaching you what the various Docker tools are, and the same goes for Docker Swarm.  Normally I don't even tell you how to install Docker Engine, but this time I will give you some basic guidance to get it running, as you will need to grab the latest Docker Swarm release candidate that has the IPv6 fixes merged.  A few resolved issues allowed me to get this thing running:

One issue was with the command “docker -H tcp://[ipv6_address]”: https://github.com/docker/docker/issues/18879 and that was resolved in https://github.com/docker/docker/pull/16950. This fix was in Docker Engine 1.10.

Another issue was with Swarm discovery and the use of ‘swarm join’ command: https://github.com/docker/swarm/issues/1906 and resolved in  https://github.com/docker/docker/pull/20842. This fix is in Docker Swarm 1.2.0-rc1 and above.

Like most things with IPv6, you can reference a host via a DNS name as long as it has a AAAA record or a local /etc/hosts entry.  In this post I will be using IPv6 literals so you see the real IPv6 addresses at work.  NOTE: As with web browsers or pretty much anything where you need to call an IPv6 address along with a port number, you must put the IPv6 address in [] brackets so that the address can be differentiated from the port number.

This post expands on my earlier Docker Engine post where I added the --ipv6 flag to the DOCKER_OPTS line in the /etc/default/docker config. Here I am using the following line in the /etc/default/docker file:

DOCKER_OPTS="-H=tcp://0.0.0.0:2375 -H=unix:///var/run/docker.sock --ipv6 --fixed-cidr-v6=2001:db8:cafe:1::/64"

Note that each host running Docker will have a configuration similar to this; only the IPv6 prefix will be different (see the diagram and interface output below).
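
After editing /etc/default/docker, the daemon has to be restarted so it picks up the new options. On these Ubuntu 14.04 hosts (upstart-based), something like this should do it:

sudo service docker restart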

Let’s get started.

Below is a diagram that shows the high-level layout of the lab setup I am testing with.  I have four VMs running Ubuntu, deployed via Vagrant.  The four VMs are assigned IPv6 addresses out of the 2001:db8:cafe:14::/64 network. The "etcd-01" VM runs etcd (a distributed key-value store), "manager-01" runs the Docker Swarm manager role, and "node-01" and "node-02" are Docker Swarm nodes that run the containerized workloads.  As I stated above, the Docker daemon needs to be told which IPv6 prefix to use for containers; the "--ipv6 --fixed-cidr-v6" options in the "DOCKER_OPTS" line I referenced above do that. You can see on manager-01, node-01 and node-02 that under the "docker0" bridge there are prefixes specific to each host: manager-01 = 2001:db8:cafe:1::/64, node-01 = 2001:db8:cafe:2::/64 and node-02 = 2001:db8:cafe:3::/64.  Containers that launch on these nodes will get an IPv6 address out of the corresponding prefix on that specific Docker bridge.  Again, I spell all of this out in the blog post here.

[Diagram: swarm-v6-topo]

Note: Vagrant is limiting in that when you add a new "config.vm.network" entry you have no option to add it to an existing NIC. Each "config.vm.network" line creates a new NIC, so I didn't have the option to purely dual-stack a single interface. Instead, it created an eth2 with the IPv6 address I assigned.  No biggie, but it does look odd when you look at the interface output (example shown below).

vagrant@node-01:~$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:08:9d:5f brd ff:ff:ff:ff:ff:ff
    inet 10.0.2.15/24 brd 10.0.2.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fe08:9d5f/64 scope link
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:27:f1:28 brd ff:ff:ff:ff:ff:ff
    inet 192.168.20.12/24 brd 192.168.20.255 scope global eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fe27:f128/64 scope link
       valid_lft forever preferred_lft forever
4: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:b3:7a:d2 brd ff:ff:ff:ff:ff:ff
    inet6 2001:db8:cafe:14::c/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:feb3:7ad2/64 scope link
       valid_lft forever preferred_lft forever
5: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:fa:77:dd:e0 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever
    inet6 2001:db8:cafe:2::1/64 scope global tentative
       valid_lft forever preferred_lft forever
    inet6 fe80::1/64 scope link tentative
       valid_lft forever preferred_lft forever

I am not recommending or talking about how to use Vagrant to deploy this as you may want to deploy Docker Swarm on another type of setup but you can grab my super basic Vagrantfile that I used.  I have a few Vagrant plugins such as virtualbox, vagrant-hostmanager and vagrant-vbguest, so make sure you have that stuff squared away first before you use my Vagrantfile – https://github.com/shmcfarl/swarm-etcd

When IPv6 is enabled for the Docker daemon, Docker takes care of adding a route for the prefix that is defined as well as enabling the appropriate forwarding settings (see: https://docs.docker.com/engine/userguide/networking/default_network/ipv6/), but you do have to ensure that routing is correctly set up to reach each Docker IPv6 prefix on all nodes, or you will end up with broken connectivity to/from containers running on different nodes.  The configurations below statically set the IPv6 routes on each node to reach the appropriate Docker IPv6 prefixes:

On etcd-01, three routes are added, one for each Docker IPv6 prefix on manager-01, node-01 and node-02 (reference the diagram above):

vagrant@etcd-01:~$ sudo ip -6 route add 2001:db8:cafe:1::/64 via 2001:db8:cafe:14::b
vagrant@etcd-01:~$ sudo ip -6 route add 2001:db8:cafe:2::/64 via 2001:db8:cafe:14::c
vagrant@etcd-01:~$ sudo ip -6 route add 2001:db8:cafe:3::/64 via 2001:db8:cafe:14::d

Here is the IPv6 route table for etcd-01:

vagrant@etcd-01:~$ sudo ip -6 route
2001:db8:cafe:1::/64 via 2001:db8:cafe:14::b dev eth2  metric 1024
2001:db8:cafe:2::/64 via 2001:db8:cafe:14::c dev eth2  metric 1024
2001:db8:cafe:3::/64 via 2001:db8:cafe:14::d dev eth2  metric 1024
2001:db8:cafe:14::/64 dev eth2  proto kernel  metric 256
fe80::/64 dev eth0  proto kernel  metric 256
fe80::/64 dev eth1  proto kernel  metric 256
fe80::/64 dev eth2  proto kernel  metric 256

manager-01 (Note: only routes to node-01 and node-02 are needed, as the etcd-01 service is not running in a container in this example):

vagrant@manager-01:~$ sudo ip -6 route add 2001:db8:cafe:2::/64 via 2001:db8:cafe:14::c
vagrant@manager-01:~$ sudo ip -6 route add 2001:db8:cafe:3::/64 via 2001:db8:cafe:14::d

Here is the IPv6 route table for manager-01:

vagrant@manager-01:~$ sudo ip -6 route
2001:db8:cafe:2::/64 via 2001:db8:cafe:14::c dev eth2  metric 1024
2001:db8:cafe:3::/64 via 2001:db8:cafe:14::d dev eth2  metric 1024
2001:db8:cafe:14::/64 dev eth2  proto kernel  metric 256
fe80::/64 dev eth0  proto kernel  metric 256
fe80::/64 dev eth1  proto kernel  metric 256
fe80::/64 dev eth2  proto kernel  metric 256

node-01:

vagrant@node-01:~$ sudo ip -6 route add 2001:db8:cafe:1::/64 via 2001:db8:cafe:14::b
vagrant@node-01:~$ sudo ip -6 route add 2001:db8:cafe:3::/64 via 2001:db8:cafe:14::d

node-02:

vagrant@node-02:~$ sudo ip -6 route add 2001:db8:cafe:1::/64 via 2001:db8:cafe:14::b
vagrant@node-02:~$ sudo ip -6 route add 2001:db8:cafe:2::/64 via 2001:db8:cafe:14::c

Test reachability between the nodes (e.g. node-01 can reach etcd-01 at 2001:db8:cafe:14::a):

vagrant@node-01:~$ ping6 2001:db8:cafe:14::a
PING 2001:db8:cafe:14::a(2001:db8:cafe:14::a) 56 data bytes
64 bytes from 2001:db8:cafe:14::a: icmp_seq=1 ttl=64 time=0.638 ms
64 bytes from 2001:db8:cafe:14::a: icmp_seq=2 ttl=64 time=0.421 ms
64 bytes from 2001:db8:cafe:14::a: icmp_seq=3 ttl=64 time=0.290 ms
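
Since the containers themselves also need to reach etcd-01 and each other, it is worth repeating the test from inside a container on each node once the routes are in place. A hedged example, using the same ubuntu image that is used later in this post:

vagrant@node-01:~$ docker run --rm ubuntu ping6 -c 3 2001:db8:cafe:14::a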

The basic setup of each node is complete and it is time to setup etcd and Docker Swarm.

Disclaimer: I am not recommending a specific approach for deploying etcd or Docker Swarm. Please check the documentation for each one of those for the recommended deployment options.

On the etcd-01 node, download and untar etcd:

curl -L  https://github.com/coreos/etcd/releases/download/v2.2.5/etcd-v2.2.5-linux-amd64.tar.gz -o etcd-v2.2.5-linux-amd64.tar.gz

tar xzvf etcd-v2.2.5-linux-amd64.tar.gz

rm etcd-v2.2.5-linux-amd64.tar.gz

cd etcd-v2.2.5-linux-amd64/

In the example below, I am setting a variable with the bracketed IPv6 address of the etcd-01 node and then running etcd on the console:

vagrant@etcd-01:~/etcd-v2.2.5-linux-amd64$ MY_IPv6="[2001:db8:cafe:14::a]"
vagrant@etcd-01:~/etcd-v2.2.5-linux-amd64$
vagrant@etcd-01:~/etcd-v2.2.5-linux-amd64$ ./etcd \
> -initial-advertise-peer-urls http://$MY_IPv6:2380 \
> -listen-peer-urls="http://0.0.0.0:2380,http://0.0.0.0:7001" \
> -listen-client-urls="http://0.0.0.0:2379,http://0.0.0.0:4001" \
> -advertise-client-urls="http://$MY_IPv6:2379" \
> -initial-cluster-token etcd-01 \
> -initial-cluster="default=http://$MY_IPv6:2380" \
> -initial-cluster-state new
2016-04-01 00:34:41.915304 I | etcdmain: etcd Version: 2.2.5
2016-04-01 00:34:41.915508 I | etcdmain: Git SHA: bc9ddf2
2016-04-01 00:34:41.915922 I | etcdmain: Go Version: go1.5.3
2016-04-01 00:34:41.916287 I | etcdmain: Go OS/Arch: linux/amd64
2016-04-01 00:34:41.916676 I | etcdmain: setting maximum number of CPUs to 1, total number of available CPUs is 1
2016-04-01 00:34:41.917671 W | etcdmain: no data-dir provided, using default data-dir ./default.etcd
2016-04-01 00:34:41.917858 N | etcdmain: the server is already initialized as member before, starting as etcd member...
2016-04-01 00:34:41.918340 I | etcdmain: listening for peers on http://0.0.0.0:2380
2016-04-01 00:34:41.918809 I | etcdmain: listening for peers on http://0.0.0.0:7001
2016-04-01 00:34:41.919324 I | etcdmain: listening for client requests on http://0.0.0.0:2379
2016-04-01 00:34:41.919644 I | etcdmain: listening for client requests on http://0.0.0.0:4001
2016-04-01 00:34:41.920224 I | etcdserver: name = default
2016-04-01 00:34:41.920416 I | etcdserver: data dir = default.etcd
2016-04-01 00:34:41.921540 I | etcdserver: member dir = default.etcd/member
2016-04-01 00:34:41.921949 I | etcdserver: heartbeat = 100ms
2016-04-01 00:34:41.922321 I | etcdserver: election = 1000ms
2016-04-01 00:34:41.922612 I | etcdserver: snapshot count = 10000
2016-04-01 00:34:41.923036 I | etcdserver: advertise client URLs = http://[2001:db8:cafe:14::a]:2379
2016-04-01 00:34:41.923664 I | etcdserver: restarting member d68162a449565404 in cluster 89b5c84d35f7a1e at commit index 10
2016-04-01 00:34:41.923919 I | raft: d68162a449565404 became follower at term 2
2016-04-01 00:34:41.924187 I | raft: newRaft d68162a449565404 [peers: [], term: 2, commit: 10, applied: 0, lastindex: 10, lastterm: 2]
2016-04-01 00:34:41.924780 I | etcdserver: starting server... [version: 2.2.5, cluster version: to_be_decided]
2016-04-01 00:34:41.926767 N | etcdserver: added local member d68162a449565404 [http://[2001:db8:cafe:14::a]:2380] to cluster 89b5c84d35f7a1e
2016-04-01 00:34:41.926885 N | etcdserver: set the initial cluster version to 2.2
2016-04-01 00:34:43.325015 I | raft: d68162a449565404 is starting a new election at term 2
2016-04-01 00:34:43.325357 I | raft: d68162a449565404 became candidate at term 3
2016-04-01 00:34:43.326087 I | raft: d68162a449565404 received vote from d68162a449565404 at term 3
2016-04-01 00:34:43.326594 I | raft: d68162a449565404 became leader at term 3
2016-04-01 00:34:43.327107 I | raft: raft.node: d68162a449565404 elected leader d68162a449565404 at term 3
2016-04-01 00:34:43.328579 I | etcdserver: published {Name:default ClientURLs:[http://[2001:db8:cafe:14::a]:2379]} to cluster 89b5c84d35f7a1e

Note: You can run etcd with the "-debug" flag as well, which is very helpful for seeing the GETs and PUTs from the Swarm nodes.
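
Before moving on to Swarm, it is also easy to confirm that etcd is reachable over IPv6 from the other nodes. A hedged check from manager-01 (curl needs -g so the bracketed IPv6 literal is not treated as a glob); it should return the etcd server version as JSON:

vagrant@manager-01:~$ curl -g http://[2001:db8:cafe:14::a]:2379/version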

On the manager-01 node, fire up the Swarm manager. I am doing a 'docker run' and publishing the user-defined manager port (4000) on the host, which maps to the Docker Engine port (2375) inside the Swarm container. I am specifically calling for the Swarm 1.2.0-rc1 image that I referenced before, which has the latest IPv6 bug fixes. The manager is launched in the container and references etcd at etcd-01's IPv6 address and port (see "client URLs" in the etcd output above). (Note: I am running it without the '-d' flag so that it runs in the foreground):

vagrant@manager-01:~$ docker run -p 4000:2375 swarm:1.2.0-rc1 manage etcd://[2001:db8:cafe:14::a]:2379
time="2016-04-01T00:36:01Z" level=info msg="Initializing discovery without TLS"
time="2016-04-01T00:36:01Z" level=info msg="Listening for HTTP" addr=":2375" proto=tcp

On each Swarm node, kick off a "swarm join". Similar to the manager example above, a "docker run" is used to launch a container using the 1.2.0-rc1 Swarm image. The node does a "join" (participating in the discovery process), advertises its own IPv6 address and references etcd at etcd-01's IPv6 address and port (see why testing reachability from the node AND its containers was required?) 😉

node-01:

vagrant@node-01:~$ docker run swarm:1.2.0-rc1 join --advertise=[2001:db8:cafe:14::c]:2375 etcd://[2001:db8:cafe:14::a]:2379
time="2016-04-01T00:36:36Z" level=info msg="Initializing discovery without TLS"
time="2016-04-01T00:36:36Z" level=info msg="Registering on the discovery service every 1m0s..." addr="[2001:db8:cafe:14::c]:2375" discovery="etcd://[2001:db8:cafe:14::a]:2379"

node-02:

vagrant@node-02:~$ docker run swarm:1.2.0-rc1 join --advertise=[2001:db8:cafe:14::d]:2375 etcd://[2001:db8:cafe:14::a]:2379
time="2016-04-01T00:36:57Z" level=info msg="Initializing discovery without TLS"
time="2016-04-01T00:36:57Z" level=info msg="Registering on the discovery service every 1m0s..." addr="[2001:db8:cafe:14::d]:2375" discovery="etcd://[2001:db8:cafe:14::a]:2379"

Back on the Manager node you will see some messages like these:

time="2016-04-01T00:37:39Z" level=info msg="Registered Engine node-01 at [2001:db8:cafe:14::c]:2375"
time="2016-04-01T00:38:00Z" level=info msg="Registered Engine node-02 at [2001:db8:cafe:14::d]:2375"

SWEET!

Now, do some Docker-looking stuff. Take a look at the running Docker containers on the Swarm cluster. Point Docker at the manager-01 IPv6 address and published port. The "docker ps -a" shows that the swarm containers (running the join --advertise) are running on node-01 and node-02.

vagrant@node-01:~$ docker -H tcp://[2001:db8:cafe:14::b]:4000 ps -a
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS               NAMES
335952181ea4        swarm:1.2.0-rc1     "/swarm join --advert"   5 minutes ago       Up 5 minutes        2375/tcp            node-02/compassionate_wilson
057159f355b0        swarm:1.2.0-rc1     "/swarm join --advert"   5 minutes ago       Up 5 minutes        2375/tcp            node-01/adoring_mirzakhani

“docker images” shows the swarm image:

vagrant@node-01:~$ docker -H tcp://[2001:db8:cafe:14::b]:4000 images
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
swarm               1.2.0-rc1           2fe11064a124        8 days ago          18.68 MB

“docker info” shows basic info about the Swarm cluster to include the two nodes (node-01/node-02):

vagrant@node-01:~$ docker -H tcp://[2001:db8:cafe:14::b]:4000 info
Containers: 2
 Running: 2
 Paused: 0
 Stopped: 0
Images: 2
Server Version: swarm/1.2.0
Role: primary
Strategy: spread
Filters: health, port, dependency, affinity, constraint
Nodes: 2
 node-01: [2001:db8:cafe:14::c]:2375
  └ Status: Healthy
  └ Containers: 1
  └ Reserved CPUs: 0 / 1
  └ Reserved Memory: 0 B / 1.019 GiB
  └ Labels: executiondriver=, kernelversion=3.13.0-83-generic, operatingsystem=Ubuntu 14.04.4 LTS, storagedriver=aufs
  └ Error: (none)
  └ UpdatedAt: 2016-04-01T00:42:33Z
 node-02: [2001:db8:cafe:14::d]:2375
  └ Status: Healthy
  └ Containers: 1
  └ Reserved CPUs: 0 / 1
  └ Reserved Memory: 0 B / 1.019 GiB
  └ Labels: executiondriver=, kernelversion=3.13.0-83-generic, operatingsystem=Ubuntu 14.04.4 LTS, storagedriver=aufs
  └ Error: (none)
  └ UpdatedAt: 2016-04-01T00:42:47Z
Plugins:
 Volume:
 Network:
Kernel Version: 3.13.0-83-generic
Operating System: linux
Architecture: amd64
CPUs: 2
Total Memory: 2.038 GiB
Name: 39ed14412c1f
Docker Root Dir:
Debug Mode (client): false
Debug Mode (server): false
WARNING: No kernel memory limit support

Run another container:

vagrant@node-01:~$ docker -H tcp://[2001:db8:cafe:14::b]:4000 run -it ubuntu /bin/bash
root@61324b3d7117:/#

Check that the container shows up under “docker ps -a” and check which Swarm node it is running on:

vagrant@node-01:~$ docker -H tcp://[2001:db8:cafe:14::b]:4000 ps -a
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS               NAMES
61324b3d7117        ubuntu              "/bin/bash"              19 seconds ago      Up 18 seconds                           node-01/tender_blackwell
335952181ea4        swarm:1.2.0-rc1     "/swarm join --advert"   7 minutes ago       Up 7 minutes        2375/tcp            node-02/compassionate_wilson
057159f355b0        swarm:1.2.0-rc1     "/swarm join --advert"   8 minutes ago       Up 8 minutes        2375/tcp            node-01/adoring_mirzakhani

Run another container:

vagrant@node-01:~$ docker -H tcp://[2001:db8:cafe:14::b]:4000 run -itd ubuntu /bin/bash

Check that the container shows up and that it is running on the other Swarm node (because Swarm scheduled it there):

vagrant@node-01:~$ docker -H tcp://[2001:db8:cafe:14::b]:4000 ps -a
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS               NAMES
650d6678eba0        ubuntu              "/bin/bash"              12 seconds ago      Up 11 seconds                           node-02/pedantic_ardinghelli
61324b3d7117        ubuntu              "/bin/bash"              2 minutes ago       Up 2 minutes                            node-01/tender_blackwell
335952181ea4        swarm:1.2.0-rc1     "/swarm join --advert"   9 minutes ago       Up 9 minutes        2375/tcp            node-02/compassionate_wilson
057159f355b0        swarm:1.2.0-rc1     "/swarm join --advert"   9 minutes ago       Up 9 minutes        2375/tcp            node-01/adoring_mirzakhani

Check the IPv6 address on each container (Hint: Each container running on a different Swarm node should have a different IPv6 prefix):

vagrant@node-01:~$ docker -H tcp://[2001:db8:cafe:14::b]:4000 attach 6132
root@61324b3d7117:/# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
8: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:ac:11:00:03 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.3/16 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 2001:db8:cafe:2:0:242:ac11:3/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::42:acff:fe11:3/64 scope link
       valid_lft forever preferred_lft forever
vagrant@node-01:~$ docker -H tcp://[2001:db8:cafe:14::b]:4000 attach 650d
root@650d6678eba0:/# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
8: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:ac:11:00:03 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.3/16 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 2001:db8:cafe:3:0:242:ac11:3/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::42:acff:fe11:3/64 scope link
       valid_lft forever preferred_lft forever

Check IPv6 reachability between the two containers that are running on different Swarm nodes:

root@61324b3d7117:/# ping6 2001:db8:cafe:3:0:242:ac11:3
PING 2001:db8:cafe:3:0:242:ac11:3(2001:db8:cafe:3:0:242:ac11:3) 56 data bytes
64 bytes from 2001:db8:cafe:3:0:242:ac11:3: icmp_seq=1 ttl=62 time=0.993 ms
64 bytes from 2001:db8:cafe:3:0:242:ac11:3: icmp_seq=2 ttl=62 time=0.493 ms
64 bytes from 2001:db8:cafe:3:0:242:ac11:3: icmp_seq=3 ttl=62 time=0.362 ms

Very nice! We have one container on node-01 with an IPv6 address from the IPv6 prefix that is set in the DOCKER_OPTS line on node-01 and we have another container running on node-02 that has an IPv6 address from a different IPv6 prefix from the DOCKER_OPTS line on node-02. The routes we created earlier are allowing these nodes and containers to communicate with each other over IPv6.

TROUBLESHOOTING:

Here is a quick summary of troubleshooting tips:

  • Make sure you are on Docker Engine 1.11 or above
  • Make sure you are on Docker Swarm 1.2.0 or above
  • Make sure that you can ping6 between every node and container from every other node. The routes created on each node (or on the first hop router) are critical in ensuring the containers can reach each other. This is the #1 issue with making this work correctly.
  • Run etcd with the "-debug" flag for debugging
  • Run the Swarm manager with debugging enabled ("swarm --debug manage")
  • Check the etcd node to make sure the Swarm nodes are registered in the K/V store:
vagrant@node-01:~$ curl -L -g http://[2001:db8:cafe:14::a]:2379/v2/keys/?recursive=true | json_pp
{
   "action" : "get",
   "node" : {
      "nodes" : [
         {
            "dir" : true,
            "key" : "/docker",
            "modifiedIndex" : 5,
            "createdIndex" : 5,
            "nodes" : [
               {
                  "createdIndex" : 5,
                  "nodes" : [
                     {
                        "createdIndex" : 5,
                        "nodes" : [
                           {
                              "key" : "/docker/swarm/nodes/[2001:db8:cafe:14::c]:2375",
                              "modifiedIndex" : 50,
                              "ttl" : 131,
                              "value" : "[2001:db8:cafe:14::c]:2375",
                              "expiration" : "2016-04-01T01:02:39.198366301Z",
                              "createdIndex" : 50
                           },
                           {
                              "ttl" : 153,
                              "modifiedIndex" : 51,
                              "key" : "/docker/swarm/nodes/[2001:db8:cafe:14::d]:2375",
                              "value" : "[2001:db8:cafe:14::d]:2375",
                              "expiration" : "2016-04-01T01:03:00.867453746Z",
                              "createdIndex" : 51
                           }
                        ],
                        "dir" : true,
                        "key" : "/docker/swarm/nodes",
                        "modifiedIndex" : 5
                     }
                  ],
                  "key" : "/docker/swarm",
                  "modifiedIndex" : 5,
                  "dir" : true
               }
            ]
         }
      ],
      "dir" : true
   }
}

Enjoy!

Docker Registry with IPv6

If you have been following along, you know that I started a series of posts aimed at identifying IPv6 support for the various Docker components/services.

The first blog post was focused on Docker Engine, which has pretty reasonable support for basic IPv6.

The second blog post was focused on Docker Hub, which has zero IPv6 support. This is due to it being hosted on AWS and no IPv6-enabled front-end is deployed.

This blog post will focus on Docker Registry.

As I stated in the past two blog entries, I am not here to teach you Docker (what it is, how to deploy it, etc..). I am simply showing basic functionality of various Docker components/services when used with IPv6.

For information on setting up your own Docker Registry, check out:

https://docs.docker.com/registry/

I am using Docker version 1.8.3, Docker Compose version 1.4.2 and Docker Registry version 2.

I am using the same Ubuntu 14.04.3 hosts that I have used in the last two blog posts.

My setup uses two hosts with the following configuration:

  • docker-v6-1:
    • Role: Docker Registry
    • IPv6 Address: fd15:4ba5:5a2b:1009:20c:29ff:fef3:f848/64
  • docker-v6-2:
    • Role: Docker Host/Client
    • IPv6 Address: fd15:4ba5:5a2b:1009:20c:29ff:febb:cbf8/64

My Docker Registry (running on 'docker-v6-1') uses a self-signed cert and is started using either the 'docker run' syntax or Docker Compose. I show both examples below:

docker run:

docker run -d -p 5000:5000 --restart=always --name registry \
  -v `pwd`/certs:/certs \
  -e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/domain.crt \
  -e REGISTRY_HTTP_TLS_KEY=/certs/domain.key \
  registry:2

Use Docker Compose to run Docker Registry
I am using a file named “docker-compose.yml” to launch my registry.

registry:
  restart: always
  image: registry:2
  ports:
    - 5000:5000
  environment:
    REGISTRY_HTTP_TLS_CERTIFICATE: /certs/domain.crt
    REGISTRY_HTTP_TLS_KEY: /certs/domain.key
  volumes:
    - /certs:/certs

Run Docker Compose:

docker-compose up -d
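
Both the 'docker run' and Compose examples assume a certs directory containing domain.crt and domain.key. If you don't already have a certificate, a hedged way to generate a self-signed pair is roughly the openssl one-liner from the Docker Registry docs, with -subj added so it does not prompt (adjust the CN to the name your clients will use; docker-v6-1.example.com here):

mkdir -p certs
openssl req -newkey rsa:4096 -nodes -sha256 \
  -keyout certs/domain.key -x509 -days 365 -out certs/domain.crt \
  -subj "/CN=docker-v6-1.example.com"

Also remember that the client host (docker-v6-2 here) has to trust that cert, typically by copying domain.crt to /etc/docker/certs.d/docker-v6-1.example.com:5000/ca.crt on the client.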

Verify Connectivity
On the Docker host/client (“docker-v6-2”),  verify that the Docker Registry host (“docker-v6-1”) can be reached over IPv6:

root@docker-v6-2:~# ping6 -n docker-v6-1.example.com
PING docker-v6-1.example.com(fd15:4ba5:5a2b:1009:20c:29ff:fef3:f848) 56 data bytes
64 bytes from fd15:4ba5:5a2b:1009:20c:29ff:fef3:f848: icmp_seq=1 ttl=64 time=0.402 ms
64 bytes from fd15:4ba5:5a2b:1009:20c:29ff:fef3:f848: icmp_seq=2 ttl=64 time=0.367 ms

Docker Registry Push/Pull Verification
Now that connectivity to the Docker Registry host is working, tag a local Docker image and then push it (over IPv6) to the Docker Registry:

root@docker-v6-2:~# docker tag ubuntu docker-v6-1.example.com:5000/ubuntu

root@docker-v6-2:~# docker push docker-v6-1.example.com:5000/ubuntu
The push refers to a repository [docker-v6-1.example.com:5000/ubuntu] (len: 1)
a005e6b7dd01: Image successfully pushed
002fa881df8a: Image successfully pushed
66395c31eb82: Image successfully pushed
0105f98ced6d: Image successfully pushed
latest: digest: sha256:167f1c34ead8f1779db7827a55de0d517b7f0e015d8f08cf032c7e5cd6979a84 size: 6800
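
To exercise the pull side as well, a hedged round trip is to remove the tagged image locally and pull it back from the registry over IPv6:

docker rmi docker-v6-1.example.com:5000/ubuntu
docker pull docker-v6-1.example.com:5000/ubuntu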

A tcpdump on the Docker Registry shows traffic between docker-v6-1 and docker-v6-2 for the ‘push’ using the previously defined port 5000:

root@docker-v6-1:~# tcpdump -n -vvv ip6 -i eth0
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
19:36:09.283820 IP6 (hlim 64, next-header TCP (6) payload length: 40) fd15:4ba5:5a2b:1009:20c:29ff:febb:cbf8.56066 > fd15:4ba5:5a2b:1009:20c:29ff:fef3:f848.5000: Flags [S], cksum 0x65b1 (correct), seq 2754754540, win 28800, options [mss 1440,sackOK,TS val 645579 ecr 0,nop,wscale 7], length 0
19:36:09.283930 IP6 (hlim 64, next-header TCP (6) payload length: 40) fd15:4ba5:5a2b:1009:20c:29ff:fef3:f848.5000 > fd15:4ba5:5a2b:1009:6540:bb36:2e23:f5a2.56066: Flags [S.], cksum 0xcd92 (incorrect -> 0x50bd), seq 1577491031, ack 2754754541, win 28560, options [mss 1440,sackOK,TS val 859496 ecr 645579,nop,wscale 7], length 0

It works!

Happy Dockering,

Shannon

Docker Hub – We don’t need no stinking IPv6!

I got a lot of great feedback on my last post about the basic configuration of Docker Engine with IPv6.  The next topic that I wanted to cover (and was excited to test) was Docker Hub with IPv6.

My hopes and dreams were smashed in about 15 seconds when I found out that IPv6 is not enabled for hub.docker.com and none of my docker login, docker search, docker pull, docker push or even a browser session to https://hub.docker.com would work over IPv6.

An nslookup reveals no IPv6. Nada. Zip:

> hub.docker.com
Server:	208.67.222.222
Address:	208.67.222.222#53

Non-authoritative answer:
hub.docker.com	canonical name = elb-default.us-east-1.aws.dckr.io.

Authoritative answers can be found from:
> docker.com
Server:	208.67.222.222
Address:	208.67.222.222#53

Non-authoritative answer:
docker.com	nameserver = ns-1289.awsdns-33.org.
docker.com	nameserver = ns-1981.awsdns-55.co.uk.
docker.com	nameserver = ns-207.awsdns-25.com.
docker.com	nameserver = ns-568.awsdns-07.net.
docker.com
origin = ns-207.awsdns-25.com
mail addr = awsdns-hostmaster.amazon.com
serial = 1
refresh = 7200
retry = 900
expire = 1209600
minimum = 86400
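
A quicker way to check for the missing AAAA record (assuming dig is installed) is:

dig AAAA hub.docker.com +short

At the time of this writing, that comes back empty, matching the nslookup result above.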

Insert your favorite sad panda image here. :-(

I know the Docker folks are in a bind with this since they are using Amazon who is likely the last cloud provider on earth who does not have real IPv6 support (none in EC2-VPC but you can in EC2-Classic).

I will move on from Docker Hub and start checking out other Docker stuff like Registry, Compose, etc…

See you next time. Sorry for the epic failure of this post.

Shannon

Basic Configuration of Docker Engine with IPv6

This is the start of a blog series dedicated to enabling IPv6 for the various components in the Docker toolbox.

I am starting the series off by talking about the basic configuration for enabling IPv6 with Docker Engine.  There are some good examples that the Docker folks have put together that you will want to read through: https://docs.docker.com/engine/userguide/networking/default_network/ipv6/

Disclaimer: I am not teaching you Docker.  There are a zillion places to go learn Docker.  I am making the dangerous assumption that you already know what Docker is, how to install it and how to use it.

I am also not teaching you IPv6.  There are also a zillion places to go learn IPv6.  I am making the even more dangerous assumption that you know what IPv6 is, what the addressing details are and how to use it.

Diagram

The graphic below shows a high-level view of my setup.  I have two Docker hosts (docker-v6-1 and docker-v6-2) that are running Ubuntu 14.04.  As of this first post, I am using Docker 1.8.2. Both hosts are attached to a Layer-2 switch via their eth0 interfaces.  I am using static IPv4 addresses (not relevant here) for the hosts and StateLess Address AutoConfiguration (SLAAC) for IPv6 address assignment out of the Unique Local Address (ULA) FD15:4BA5:5A2B:1009::/64 range.

[Diagram: Docker Engine - Basic IPv6]

Preparing the Docker Host for IPv6:

As I mentioned before, I am using SLAAC-based assignment for IPv6 addressing on each host.  You can use static, SLAAC, Stateful DHCPv6 or Stateless DHCPv6 if you want.  I am not covering any of that as they don’t pertain directly to Docker.

Each Docker host has an IPv6 address and can reach the outside world:

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:0c:29:f3:f8:48 brd ff:ff:ff:ff:ff:ff
    inet 192.168.80.200/24 brd 192.168.80.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fd15:4ba5:5a2b:1009:cc7:2609:38b7:e6c6/64 scope global temporary dynamic
       valid_lft 86388sec preferred_lft 14388sec
    inet6 fd15:4ba5:5a2b:1009:20c:29ff:fef3:f848/64 scope global dynamic
       valid_lft 86388sec preferred_lft 14388sec
    inet6 fe80::20c:29ff:fef3:f848/64 scope link
       valid_lft forever preferred_lft forever
root@docker-v6-1:~# ping6 -n www.google.com
PING www.google.com(2607:f8b0:400f:802::2004) 56 data bytes
64 bytes from 2607:f8b0:400f:802::2004: icmp_seq=1 ttl=255 time=13.7 ms
64 bytes from 2607:f8b0:400f:802::2004: icmp_seq=2 ttl=255 time=14.5 ms

Since I am using router advertisements (RAs) for my IPv6 address assignment, it is important to force the acceptance of RAs even when forwarding is enabled:

sysctl net.ipv6.conf.eth0.accept_ra=2
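
That sysctl does not survive a reboot on its own. To make it persistent, something like this in /etc/sysctl.conf (or a file under /etc/sysctl.d/) should work:

echo "net.ipv6.conf.eth0.accept_ra = 2" >> /etc/sysctl.conf
sysctl -p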

Now, if you haven’t already, install Docker using whatever method you are comfortable with.  Again, this is not a primer on Docker. :-)

Docker! Docker! Docker!

Now that the IPv6 basics are there on the host and you have Docker installed, it is time to set the IPv6 subnet for Docker.  You can do this via the ‘docker daemon’ command or you can set it in the /etc/default/docker file.  Below is the example using the ‘docker daemon’ command. Here, I am setting the fixed IPv6 prefix as FD15:4BA5:5A2B:100A::/64.

root@docker-v6-1:~# docker daemon --ipv6 --fixed-cidr-v6="fd15:4ba5:5a2b:100a::/64"

Here is the same IPv6 prefix being set, but this is using the /etc/default/docker file:

DOCKER_OPTS="--dns 8.8.8.8 --dns 8.8.4.4 --ipv6 --fixed-cidr-v6=fd15:4ba5:5a2b:100a::/64"

Let’s fire up a container and see what happens. The example below shows that the container got an IPv6 address out of the prefix we set above:

root@docker-v6-1:~# docker run -it ubuntu bash
root@aea405985524:/# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
5: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 02:42:ac:11:00:01 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fd15:4ba5:5a2b:100a:0:242:ac11:1/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::42:acff:fe11:1/64 scope link
       valid_lft forever preferred_lft forever

Ping the outside world:

root@aea405985524:/# ping6 www.google.com
PING www.google.com(den03s10-in-x04.1e100.net) 56 data bytes
64 bytes from den03s10-in-x04.1e100.net: icmp_seq=1 ttl=254 time=14.6 ms
64 bytes from den03s10-in-x04.1e100.net: icmp_seq=2 ttl=254 time=12.5 ms

Fire up another container and ping the first container over IPv6:

root@docker-v6-1:~# docker run -it ubuntu bash
root@e8a8662fad76:/# ping6 fd15:4ba5:5a2b:100a:0:242:ac11:1
PING fd15:4ba5:5a2b:100a:0:242:ac11:1(fd15:4ba5:5a2b:100a:0:242:ac11:1) 56 data bytes
64 bytes from fd15:4ba5:5a2b:100a:0:242:ac11:1: icmp_seq=1 ttl=64 time=0.094 ms
64 bytes from fd15:4ba5:5a2b:100a:0:242:ac11:1: icmp_seq=2 ttl=64 time=0.057 ms

Add the 2nd Docker host

Sweet! We have one host (docker-v6-1) running with two containers that can reach each other over IPv6 and reach the outside world.  Now let’s add the second Docker host (docker-v6-2).

Repeat all of the steps from above but change the IPv6 prefix that Docker is going to use. Here is an example using FD15:4BA5:5A2B:100B::/64:

DOCKER_OPTS="--dns 8.8.8.8 --dns 8.8.4.4 --ipv6 --fixed-cidr-v6=fd15:4ba5:5a2b:100b::/64

In order to have containers on one host reach containers on another host over IPv6, we have to figure out routing. You can enable host-based routing (the example I will show below) or you can just use the Layer-3 infrastructure you likely already have in your Data Center. I would recommend the latter option. Remember that Docker is not doing NAT for IPv6 so you have to have some mechanism to allow for pure L3 reachability between the various IPv6 address spaces you are using.

Here is an example of using host-based routing on each of the two Docker hosts. First, configure a static IPv6 route on the first Docker host (i.e. docker-v6-1). The route statement below says to route all traffic destined for the fd15:4ba5:5a2b:100b::/64 prefix (the one being used on docker-v6-2) to the IPv6 address of the docker-v6-2 eth0 interface.

root@docker-v6-1:~# ip -6 route add fd15:4ba5:5a2b:100b::/64 via fd15:4ba5:5a2b:1009:20c:29ff:febb:cbf8

Now, do the same on the 2nd Docker host (docker-v6-2). This route statement says to route all traffic destined for the fd15:4ba5:5a2b:100a::/64 prefix (used on docker-v6-1) to the IPv6 address of the docker-v6-1 eth0 interface:

root@docker-v6-2:~# ip -6 route add fd15:4ba5:5a2b:100a::/64 via fd15:4ba5:5a2b:1009:20c:29ff:fef3:f848

The final test is to ping from one container on docker-v6-1 to a container on docker-v6-2:

root@e8a8662fad76:/# ping6 fd15:4ba5:5a2b:100b:0:242:ac11:1
PING fd15:4ba5:5a2b:100b:0:242:ac11:1(fd15:4ba5:5a2b:100b:0:242:ac11:1) 56 data bytes
64 bytes from fd15:4ba5:5a2b:100b:0:242:ac11:1: icmp_seq=3 ttl=62 time=0.570 ms
64 bytes from fd15:4ba5:5a2b:100b:0:242:ac11:1: icmp_seq=4 ttl=62 time=0.454 ms

It works!

We will build on this scenario in upcoming posts as we walk through enabling IPv6 functionality in a variety of Docker network scenarios and other Docker services.

Shannon

VMware Fusion 8 Pro – IPv6 NAT

I just upgraded to VMware Fusion 8 Pro and noticed that there was a new feature in there for IPv6 NAT. You all know my views on NAT, especially IPv6 NAT but we won’t get into all of that here. :-)

It looks as though you have to be on Fusion 8 Pro to get this feature. It is super simple to enable.

Below is a basic view of my topology.  My Mac (using the en0 adapter) has an IPv6 address from my local CPE (connected to Comcast).  I have a Linux VM attached to a custom network (vmnet2) that has the IPv4 subnet of 172.16.1.0/24 and the autogenerated (by Fusion) Unique Local IPv6 prefix of FD15:4BA5:5A2B:1002::/64.

[Diagram: fusion-v6-nat topology]

Here is what the Linux host looks like prior to enabling IPv6 NAT:

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:0c:29:2e:cf:c0 brd ff:ff:ff:ff:ff:ff
    inet 172.16.1.129/24 brd 172.16.1.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fe2e:cfc0/64 scope link
       valid_lft forever preferred_lft forever

In VMware Fusion 8 Pro, you can enable IPv6 NAT for a network by going into VMware Fusion > Preferences > Network > then select the custom network that you want to enable IPv6 NAT on. The graphic shown below is what my vmnet2 network looks like:
[Screenshot: vmnet2 network settings in VMware Fusion]

With IPv6 NAT enabled, the Linux host now has an IPv6 address:

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:0c:29:2e:cf:c0 brd ff:ff:ff:ff:ff:ff
    inet 172.16.1.129/24 brd 172.16.1.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fd15:4ba5:5a2b:1002:24fd:5bf0:baba:4866/64 scope global temporary dynamic
       valid_lft 86398sec preferred_lft 14398sec
    inet6 fd15:4ba5:5a2b:1002:20c:29ff:fe2e:cfc0/64 scope global dynamic
       valid_lft 86398sec preferred_lft 14398sec
    inet6 fe80::20c:29ff:fe2e:cfc0/64 scope link
       valid_lft forever preferred_lft forever

You can see that the Linux host gets two addresses out of the ULA prefix that was autogenerated by Fusion (see the graphic). The first address is the IPv6 privacy extension address and the second is the EUI-64 derived IPv6 address.

I can now ping from the Linux host to the outside (via IPv6 NAT):

localadmin@v6-nat-demo:~$ ping6 -n www.google.com
PING www.google.com(2607:f8b0:400f:803::2004) 56 data bytes
64 bytes from 2607:f8b0:400f:803::2004: icmp_seq=1 ttl=255 time=12.7 ms
64 bytes from 2607:f8b0:400f:803::2004: icmp_seq=2 ttl=255 time=15.0 ms
64 bytes from 2607:f8b0:400f:803::2004: icmp_seq=3 ttl=255 time=14.8 ms

Ping, the ultimate test of success, works. :-)

Thanks,
Shannon

Using OpenStack Heat to Deploy an IPv6-enabled Instance

In this post I will talk about how to use a basic OpenStack Heat template to build a dual-stack (IPv4 and IPv6) Neutron network, router and launch an instance that will use StateLess Address AutoConfiguration (SLAAC) for IPv6 address assignment.

In the May 2015 post I discussed, in detail, how to build a dual-stack tenant and use a variety of IPv6 address assignment methods (SLAAC, Stateless DHCPv6, Stateful DHCPv6) for OpenStack instances.

To build on the previous post, I wanted to show a basic Heat template for building an IPv4 and IPv6 network with the basic parameters such as CIDR, gateway and pools.  I want Heat to also launch an OpenStack instance (what Heat calls a Server) that attaches to those networks.  Finally, the template will create a new security group that will create security group rules for both IPv4 and IPv6.

The Heat template that I am referring to in this post can be found here: https://github.com/shmcfarl/my-heat-templates/blob/master/single-v6-test.yaml. That specific template is using SLAAC.  You can also take a look at this template which uses Stateless DHCPv6: https://github.com/shmcfarl/my-heat-templates/blob/master/stateless-demo-slb-trusty.yaml. You can modify the template from there to play around with DHCPv6 Stateful. Hint, it’s all in the resource properties of:

ipv6_address_mode: <slaac/dhcpv6-stateless/dhcpv6-stateful>
ipv6_ra_mode: <slaac/dhcpv6-stateless/dhcpv6-stateful>

Heat Template

I am not going to teach you Heat. There are countless resources out there that do a much better job than I ever could on teaching Heat.  A couple of places to start are:

The Heat Orchestration Template (HOT) Guide is a great resource for finding the various parameters, resources and properties that can be used in Heat.

The primary place to dig into IPv6 capabilities within Heat is in the Heat template guide under OS::Neutron::Subnet.  You can jump to it here: http://docs.openstack.org/hot-reference/content/OS__Neutron__Subnet.html.  I am not going to walk through all of what is in the guide but I will point out specific properties that I have used in the example Heat template I referenced before.

Let’s take a look at the IPv6-specific parts of the example template.  In the example template file I have created a parameter section that includes various items such as key, image, flavor and so on.  The IPv6 section includes:

  • The private IPv6 network (2001:db8:cafe:1e::/64)
  • The private IPv6 gateway (2001:db8:cafe:1e::1)
  • The beginning and ending range of the IPv6 address allocation pool (2001:db8:cafe:1e::2 to 2001:db8:cafe:1e:ffff:ffff:ffff:fffe)

private_net_v6:
    type: string
    description: Private IPv6 subnet address
    default: 2001:db8:cafe:1e::/64
private_net_v6_gateway:
    type: string
    description: Private IPv6 network gateway address
    default: 2001:db8:cafe:1e::1
private_net_v6_pool_start:
    type: string
    description: Start of private network IPv6 address allocation pool
    default: 2001:db8:cafe:1e::2
private_net_v6_pool_end:
    type: string
    description: End of private network IPv6 address allocation pool
    default: 2001:db8:cafe:1e:ffff:ffff:ffff:fffe

The next section to look at is in the “resources” section and this is where things go into action. The “private_v6_subnet” has various resource types and properties to include:

  • Version is IPv6
  • IPv6 address and RA modes are SLAAC
  • The network property (set in the parameter section)
  • The CIDR property which is the “private_net_v6” from the parameter section
  • The gateway IPv6 address is defined in the “private_net_v6_gateway” parameter
  • The allocation pool is defined in the "private_net_v6_pool_start/end" parameters

  private_v6_subnet:
    type: OS::Neutron::Subnet
    properties:
      ip_version: 6
      ipv6_address_mode: slaac
      ipv6_ra_mode: slaac
      network: { get_resource: private_net }
      cidr: { get_param: private_net_v6 }
      gateway_ip: { get_param: private_net_v6_gateway }
      allocation_pools:
        - start: { get_param: private_net_v6_pool_start }
          end: { get_param: private_net_v6_pool_end }

The next IPv6-relevant area of the resource section is “router_interface_v6”. In the “router_interface_v6” resource, there is a reference to the previously created “router” resource (see template file for full resource list) and the “private_v6_subnet”. This entry is simply attaching a new router interface to the Private IPv6 subnet.

  router_interface_v6:
    type: OS::Neutron::RouterInterface
    properties:
      router: { get_resource: router }
      subnet: { get_resource: private_v6_subnet }

Next, there is the Server (AKA “instance” or “VM”) creation section. There is nothing IPv6 specific here. On the network property line, Heat is pointing to “get_resource: private_net” which is the private network that both IPv4 and IPv6 subnets are associated with. That line, basically, attaches the server to a dual-stack network.

server1:
    type: OS::Nova::Server
    properties:
      name: Server1
      image: { get_param: image }
      flavor: { get_param: flavor }
      key_name: { get_param: key_name }
      networks:
        - network: { get_resource: private_net }
      config_drive: "true"
      user_data_format: RAW
      user_data: |
        #!/bin/bash
      security_groups: [{ get_resource: server_security_group }]

Finally, there is the security group section which enables rules for both IPv4 and IPv6. In this example ports 22, 80 and ICMP are open for IPv4 and IPv6.

server_security_group:
    type: OS::Neutron::SecurityGroup
    properties:
      description: Heat-deployed security group.
      name: heat-security-group
      rules: [
        {remote_ip_prefix: 0.0.0.0/0,
        protocol: tcp,
        port_range_min: 22,
        port_range_max: 22},
        {remote_ip_prefix: 0.0.0.0/0,
        protocol: icmp},
        {remote_ip_prefix: 0.0.0.0/0,
        protocol: tcp,
        port_range_min: 80,
        port_range_max: 80},
        {remote_ip_prefix: "::/0",
        ethertype: IPv6,
        protocol: tcp,
        port_range_min: 22,
        port_range_max: 22},
        {remote_ip_prefix: "::/0",
        ethertype: IPv6,
        protocol: icmp},
        {remote_ip_prefix: "::/0",
        ethertype: IPv6,
        protocol: tcp,
        port_range_min: 80,
        port_range_max: 80}]

Now, let’s deploy this template and see how it all looks. I am deploying the Heat “stack” using the Heat “stack-create” command (alternatively you can deploy it using the ‘Orchestration > Stacks > Launch Stack’ interface in the OpenStack Dashboard). In this example I am running “stack-create” using the “-r” argument to indicate ‘rollback’ (in the event something goes wrong, I don’t want the whole stack to build out). Then I am using the “-f” argument to indicate that I am using a file to build the Heat stack. The stack is named “demo-v6”:

root@c71-kilo-aio:~$ heat stack-create -r -f Heat-Templates/single-v6-test.yaml demo-v6
+--------------------------------------+------------+--------------------+----------------------+
| id                                   | stack_name | stack_status       | creation_time        |
+--------------------------------------+------------+--------------------+----------------------+
| 688388f5-4ae1-4d39-bf85-6f9a591a4420 | demo-v6    | CREATE_IN_PROGRESS | 2015-06-29T15:44:18Z |
+--------------------------------------+------------+--------------------+----------------------+

After a few minutes, the Heat stack is built:

root@c71-kilo-aio:~$ heat stack-list
+--------------------------------------+------------+-----------------+----------------------+
| id                                   | stack_name | stack_status    | creation_time        |
+--------------------------------------+------------+-----------------+----------------------+
| 688388f5-4ae1-4d39-bf85-6f9a591a4420 | demo-v6    | CREATE_COMPLETE | 2015-06-29T15:44:18Z |
+--------------------------------------+------------+-----------------+----------------------+
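
With the stack built, a quick way to see which addresses Server1 picked up (assuming the nova CLI is configured for this tenant) is a plain nova list; the Networks column shows both the IPv4 and IPv6 addresses on test_net:

root@c71-kilo-aio:~$ nova list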

Here is a messy view of the obligatory OpenStack Dashboard Network Topology view (Note: Some Horizon guru needs to line break the IPv4 and IPv6 addresses for the instances so they are readable ;-)):

[Screenshot: net-topo - OpenStack Dashboard Network Topology view]

Here’s a cleaner view of things:

  • Network list – You can see the new Heat-built “test_net” with the two subnets (IPv4/IPv6) as well as the previously built (by the admin) “Public-Network”:
root@c71-kilo-aio:~$ neutron net-list
+--------------------------------------+----------------+------------------------------------------------------------+
| id                                   | name           | subnets                                                    |
+--------------------------------------+----------------+------------------------------------------------------------+
| 2e03a628-e85e-4519-b1bb-a579880be0ae | test_net       | 93764f36-c56b-4c65-b7d7-cb78a694353b 10.10.30.0/24         |
|                                      |                | cafb610a-2aaa-4640-b0f0-8bb4b60cbaf2 2001:db8:cafe:1e::/64 |
| f6a55029-d875-48a8-aab9-1a5a5399592b | Public-Network | dda7d8f1-89a6-40bb-b11b-64a62c103828 192.168.81.0/24       |
|                                      |                | f2107125-c98e-4375-a81f-d0f4d34bdae3 2001:db8:cafe:51::/64 |
+--------------------------------------+----------------+------------------------------------------------------------+
  • Subnet list:
root@c71-kilo-aio:~$ neutron subnet-list
+--------------------------------------+----------------------------------------+-----------------------+---------------------------------------------------------------------------------+
| id                                   | name                                   | cidr                  | allocation_pools                                                                |
+--------------------------------------+----------------------------------------+-----------------------+---------------------------------------------------------------------------------+
| 93764f36-c56b-4c65-b7d7-cb78a694353b | demo-v6-private_subnet-6evpyylqyux7    | 10.10.30.0/24         | {"start": "10.10.30.2", "end": "10.10.30.254"}                                  |
| dda7d8f1-89a6-40bb-b11b-64a62c103828 | Public-Subnet-v4                       | 192.168.81.0/24       | {"start": "192.168.81.5", "end": "192.168.81.254"}                              |
| f2107125-c98e-4375-a81f-d0f4d34bdae3 | Public-Subnet-v6                       | 2001:db8:cafe:51::/64 | {"start": "2001:db8:cafe:51::3", "end": "2001:db8:cafe:51:ffff:ffff:ffff:fffe"} |
| cafb610a-2aaa-4640-b0f0-8bb4b60cbaf2 | demo-v6-private_v6_subnet-vvsmlbkc6sds | 2001:db8:cafe:1e::/64 | {"start": "2001:db8:cafe:1e::2", "end": "2001:db8:cafe:1e:ffff:ffff:ffff:fffe"} |
+--------------------------------------+----------------------------------------+-----------------------+---------------------------------------------------------------------------------+
  • Here is the “subnet-show” of the Heat-built subnet for the Private IPv6 subnet. The allocation pool range, gateway, IPv6 version, IPv6 address mode and IPv6 RA modes are all defined as we wanted (based on the Heat template):
root@c71-kilo-aio:~$ neutron subnet-show demo-v6-private_v6_subnet-vvsmlbkc6sds
+-------------------+---------------------------------------------------------------------------------+
| Field             | Value                                                                           |
+-------------------+---------------------------------------------------------------------------------+
| allocation_pools  | {"start": "2001:db8:cafe:1e::2", "end": "2001:db8:cafe:1e:ffff:ffff:ffff:fffe"} |
| cidr              | 2001:db8:cafe:1e::/64                                                           |
| dns_nameservers   |                                                                                 |
| enable_dhcp       | True                                                                            |
| gateway_ip        | 2001:db8:cafe:1e::1                                                             |
| host_routes       |                                                                                 |
| id                | cafb610a-2aaa-4640-b0f0-8bb4b60cbaf2                                            |
| ip_version        | 6                                                                               |
| ipv6_address_mode | slaac                                                                           |
| ipv6_ra_mode      | slaac                                                                           |
| name              | demo-v6-private_v6_subnet-vvsmlbkc6sds                                          |
| network_id        | 2e03a628-e85e-4519-b1bb-a579880be0ae                                            |
| subnetpool_id     |                                                                                 |
| tenant_id         | dc52b50429f74aeabb3935eb3e2bcb04                                                |
+-------------------+---------------------------------------------------------------------------------+
  • Router port list – You can see that the router has IPv4/IPv6 addresses on the tenant and public network interfaces:
root@c71-kilo-aio:~$ neutron router-port-list demo-v6-router-txy5s5bcixqd | grep ip_address | sed -e 's#.*ip_address": "\([^"]\+\).*#\1#'
10.10.30.1
2001:db8:cafe:1e::1
192.168.81.75
2001:db8:cafe:51::3f
  • Security Group list:
root@c71-kilo-aio:~$ neutron security-group-list
+---------------------+---------------------+----------------------------------------------------+
| id                  | name                | security_group_rules                               |
+---------------------+---------------------+----------------------------------------------------+
| 69f81e8e-5059-4a... | heat-security-group | egress, IPv4                                       |
|                     |                     | egress, IPv6                                       |
|                     |                     | ingress, IPv4, 22/tcp, remote_ip_prefix: 0.0.0.0/0 |
|                     |                     | ingress, IPv4, 80/tcp, remote_ip_prefix: 0.0.0.0/0 |
|                     |                     | ingress, IPv4, icmp, remote_ip_prefix: 0.0.0.0/0   |
|                     |                     | ingress, IPv6, 22/tcp, remote_ip_prefix: ::/0      |
|                     |                     | ingress, IPv6, 80/tcp, remote_ip_prefix: ::/0      |
|                     |                     | ingress, IPv6, icmp, remote_ip_prefix: ::/0        |
+---------------------+---------------------+----------------------------------------------------+
  • Server/Instance list:
root@c71-kilo-aio:~$ nova list
+--------------------------------------+---------+--------+------------+-------------+-----------------------------------------------------------+
| ID                                   | Name    | Status | Task State | Power State | Networks                                                  |
+--------------------------------------+---------+--------+------------+-------------+-----------------------------------------------------------+
| d7bfc606-f9da-4be5-b3e8-2219882c3da6 | Server1 | ACTIVE | -          | Running     | test_net=10.10.30.3, 2001:db8:cafe:1e:f816:3eff:fea8:7d2c |
+--------------------------------------+---------+--------+------------+-------------+-----------------------------------------------------------+

Finally, inside the instance, you can see that both IPv4 and IPv6 addresses are assigned:

root@c71-kilo-aio:~$ ip netns exec qrouter-d2ff159a-b603-4a3b-b5f7-481bff40613e ssh fedora@2001:db8:cafe:1e:f816:3eff:fea8:7d2c
The authenticity of host '2001:db8:cafe:1e:f816:3eff:fea8:7d2c (2001:db8:cafe:1e:f816:3eff:fea8:7d2c)' can't be established.
ECDSA key fingerprint is 41:e2:ea:28:e5:6d:ae:50:24:81:ad:5e:db:d7:a0:21.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '2001:db8:cafe:1e:f816:3eff:fea8:7d2c' (ECDSA) to the list of known hosts.
[fedora@server1 ~]$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1400 qdisc pfifo_fast state UP group default qlen 1000
    link/ether fa:16:3e:a8:7d:2c brd ff:ff:ff:ff:ff:ff
    inet 10.10.30.3/24 brd 10.10.30.255 scope global dynamic eth0
       valid_lft 79179sec preferred_lft 79179sec
    inet6 2001:db8:cafe:1e:f816:3eff:fea8:7d2c/64 scope global mngtmpaddr dynamic
       valid_lft 86400sec preferred_lft 14400sec
    inet6 fe80::f816:3eff:fea8:7d2c/64 scope link
       valid_lft forever preferred_lft forever

I hope this gives you a starting point for adding IPv6 to your Heat template collection.

Thanks,
Shannon

Tenant IPv6 Deployment in OpenStack Kilo Release

You folks know that I have been dancing to the IPv6 music for the last 12 years or so. IPv6 is so popular that I still earn about 75 cents a month on my book royalties. :-)

I have been fairly disappointed in the deployment of IPv6 in an OpenStack environment.  Many individuals and companies have beaten on IPv6 over the past couple of years to get it to where it is today in OpenStack, and we should all be super grateful.  In this post I will go over some design and deployment considerations for IPv6 in the OpenStack Kilo release.

Note: I use the term “tenant” rather than what OpenStack now calls a “project”.

Cloud and IPv6

IPv6 has had a rough go of it over the many years it has been around. It is usually a rough go when you try to inject it into any cloud deployment. And I do mean ANY cloud deployment.  The hard stuff with cloud and IPv6 includes (not an exhaustive list):

  • API Endpoints – Enabling IPv6 on the non-Tenant-facing portion of the cloud stack
  • Provisioning, Orchestration and Management – This includes CI/CD, bare-metal provisioning (e.g. Cobbler, Ironic), automation tools such as Ansible, Puppet, etc., and the overall Data Center management tools
  • Management/communication protocol interoperability and support of IPv6
  • Support for IPv6 in both virtual and physical networking
  • Oh, let’s not forget – expertise

In this post I am not going to address all of the above as none of us have that kind of time.  What I will talk about is making a choice to enable ALL of OpenStack for IPv6 or settling for the really important part, which is the tenant-facing part and how that looks when configured.

There are two common ways of embracing IPv6 within OpenStack (I am not discussing IPv6-only options here):

  • Dual Stack everything
  • Conditional Dual Stack

Figure 1 shows an example of a “dual stack everything” approach. It is what it sounds like. Everything in the entire stack is both IPv4 and IPv6 enabled.  This includes the API endpoints, the DBs, the tools and systems that surround OpenStack (e.g. provisioning and automation).  This, kids, can be a real pain in the butt, especially if this is a brownfield deployment where you have to go in and muck around with database entries for IPv6 and so on.

Figure 1. Dual Stack Everything


This is where “conditional dual stack” comes in.  It allows you to control when and where IPv6 is enabled.  In many cases, it is just in the tenant space, at least initially.

Figure 2 shows an example of where we have an IPv4-only OpenStack control plane (API/DB/etc) but dual stack for anything that faces something a tenant would see/control/interact with.  Optionally, this is where you can begin experimenting (I stress that word) with IPv6-only tenant networks.

Figure 2. Conditional Dual Stack



No matter which path you go down (and there are variances to the two I have pointed out here), you will end up dealing with IPv6 in the tenant space.  So, let’s talk about that part.

Tenant IPv6 Address Options

There are many combinations of IPv6 address assignment options and even of the address types you can use.  For tenant address types you can use:

  • Global Unicast Addresses (GUA)
  • Unique Local Addresses (ULA)

You can use those independently or together.  As far as planning out the use of GUA, ULA or a combination of the two, you have to think about the requirements of the tenant.  Most of the time the tenant has no clue what ‘routable’ means or what NAT is, and they shouldn’t.  In the context of IPv6, we don’t want to do NAT, ever. Well, with one exception – we may need, for a very temporary use case, NAT64, where we translate incoming IPv6 connections to back-end IPv4-only nodes.  This is an entirely different use case than wanting to translate between two IPv6 address spaces.  Don’t do that. :-)

With all of that said, I tend to see two of MANY address options in production deployments:

  • Cloud provider assigned addressing – Provider owns and assigns IPv6 addressing
  • Tenant-provided addressing – Tenant brings their own IPv6 addressing and the cloud provider has to route it

What I mean by “cloud provider assigned addressing” is that the owner of the system (the cloud provider – which could be a public provider or just an Enterprise IT shop) will obtain a Provider Independent (PI) or Provider Assigned (PA) IPv6 address block, design an address plan and then assign blocks out of that plan to each tenant. This works well, is easy to do and is the way you should do it.

Tenant-provided addressing is possible, but messy.  In the Enterprise space, this is not something that usually happens unless you are dealing with a merger & acquisition (M&A) situation where an acquired company wants their existing IPv6 addressing to be used in the acquiring company’s cloud service.  It is also something that gets requested when an Enterprise or another SP is using another public cloud provider.  Again, it is totally doable, but it requires a lot of planning with BGP, address space design, etc.

From an address assignment perspective, you can use:

  • Static/manual configuration
  • StateLess Address AutoConfiguration (SLAAC) – The IPv6 prefix (i.e. a /64) is assigned to the end node via a router advertisement (RA) and the node self-constructs the interface ID (IID) portion of the address (i.e. the last 64 bits)
  • Stateful DHCPv6 – Just like IPv4 DHCP, a DHCPv6 server hands out full IPv6 addressing and any configured options
  • Stateless DHCPv6 – A combination of SLAAC (for address assignment) and DHCPv6 (for option assignment – DNS name, domain name)

In the OpenStack Kilo release we have functional support for SLAAC, Stateful DHCPv6 and Stateless DHCPv6.  I will cover the configuration of all three in this post.
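
For quick reference, here are the Neutron CLI mode combinations used in the rest of this post. The network names and prefixes below are just placeholders; the full commands are shown in each section:

# SLAAC: address from the RA prefix, no DHCPv6
neutron subnet-create --ip-version=6 --ipv6-address-mode=slaac --ipv6-ra-mode=slaac <net> <v6-prefix>

# Stateful DHCPv6: address and options from DHCPv6
neutron subnet-create --ip-version=6 --ipv6-address-mode=dhcpv6-stateful --ipv6-ra-mode=dhcpv6-stateful <net> <v6-prefix>

# Stateless DHCPv6: address via SLAAC, options (DNS, domain name) via DHCPv6
neutron subnet-create --ip-version=6 --ipv6-address-mode=dhcpv6-stateless --ipv6-ra-mode=dhcpv6-stateless <net> <v6-prefix>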

Example Topology

The lab setup I am using for this post includes:

  • Kilo
  • Devstack – only because I did this testing before packages were available for RDO/Packstack, Ubuntu/Juju/MaaS or any other installer
  • All-in-one node (although in another lab I have I did this same setup on 7 nodes and it worked as expected)
  • The lab I did this test in did not have external IPv6 connectivity to the Internet, but I do have IPv6 throughout that lab so I could verify proper routing beyond OpenStack

Figure 3 shows a basic topology of the example setup.  The all-in-one (AIO) has a management network interface (eth0) and an OpenStack public network interface (eth1).  Both interfaces are connected to a Top-of-Rack (ToR) switch which has an uplink to the Data Center Aggregation Layer switches (via trunked VLANs).  The Aggregation Layer switches have VLAN interfaces for each network and are providing IPv4 and IPv6 routing for the rest of the lab. There are IPv6 routes configured on the Aggregation Layer switches for each IPv6 prefix within the OpenStack tenant space.  Off of the ToR is a DNS server that is IPv4 and IPv6 enabled.

Figure 3. Example Topology


Before I walk through each address assignment type, I will create the public network and subnets as the cloud admin. This can be done in the dashboard or via the Neutron client CLI. In this example a “public” network is created and the upstream Data Center L3 switches (--router:external) will be used as the gateway. An IPv4 subnet is created with an allocation range so that the addresses assigned do not collide with IPv4 addresses assigned on the Data Center L3 switches. An IPv6 subnet is created which also has an allocation range defined. If you have an upstream gateway address that is not .1 (v4) or ::1 (v6) then you need to identify the real gateway address using the “--gateway” option:

neutron net-create public --router:external

neutron subnet-create --name public-subnet --allocation-pool start=172.16.12.5,end=172.16.12.254 public 172.16.12.0/24

neutron subnet-create --ip-version=6 --name=public-v6-subnet --allocation-pool start=2001:db8:cafe:d::5,end=2001:db8:cafe:d:ffff:ffff:ffff:fffe --disable-dhcp public 2001:db8:cafe:d::/64
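
Before moving on, it doesn’t hurt to do a quick sanity check that both subnets landed on the public network (output omitted here):

neutron net-show public

neutron subnet-show public-v6-subnet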

Let’s walk through SLAAC, Stateful DHCPv6 and Stateless DHCPv6.

SLAAC

Figure 4 gives a topology view of the SLAAC example I am referencing.  The Neutron router has a private network with IPv4 and IPv6 subnets and a public network with gateway connections (Also, dual stack) to the Data Center. The instance will have a dual stack connection and receive DHCPv4 address assigned from DNSMASQ and IPv6 address assignment via radvd in SLAAC mode:

Figure 4. SLAAC Topology Example


When you use SLAAC mode in the tenant space, the Router Advertisement Daemon (radvd) is used to send router advertisements (RAs) in response to Router Solicitations (RS) as well as at a regular interval.  When you create a Neutron network and subnet for IPv6 you have to select the “IPv6 Address Configuration Mode” from the “Subnet Details” screen in the dashboard. In this case you would select “SLAAC: Address discovered from OpenStack Router”.  If you were doing this via CLI, you would use the option arguments of “--ipv6-address-mode=slaac --ipv6-ra-mode=slaac” when you create the subnet.  In the example below, a new Neutron network (“private”) is created along with the IPv6 subnet creation (IPv4 subnet creation is not shown). The option arguments are added to identify SLAAC mode. Also, no DNS servers are added here as that information for IPv6 would not get injected into the instance and the instance (since it is set for SLAAC-only) would not ask for DNS and other options:

neutron net-create private
 
neutron subnet-create --ip-version=6 --name=private_v6_subnet --ipv6-address-mode=slaac --ipv6-ra-mode=slaac private 2001:db8:cafe::/64

Created a new subnet:
+-------------------+----------------------------------------------------------------+
| Field             | Value                                                          |
+-------------------+----------------------------------------------------------------+
| allocation_pools  | {"start": "2001:db8:cafe::2", "end": "2001:db8:cafe:0:ffff:ffff:ffff:fffe"}|
| cidr              | 2001:db8:cafe::/64                                             |
| dns_nameservers   |                                                                |
| enable_dhcp       | True                                                           |
| gateway_ip        | 2001:db8:cafe::1                                               |
| host_routes       |                                                                |
| id                | 42cc3dbc-938b-4ad6-b12e-59aef7618477                           |
| ip_version        | 6                                                              |
| ipv6_address_mode | slaac                                                          |
| ipv6_ra_mode      | slaac                                                          |
| name              | private_v6_subnet                                              |
| network_id        | 7166ce15-c581-4195-9479-ad2283193d06                           |
| subnetpool_id     |                                                                |
| tenant_id         | f057804eb39b4618b40e06196e16265b                               |
+-------------------+----------------------------------------------------------------+

Inside the tenant (as a tenant member or admin) a Neutron router needs to be created along with attaching the router to the public network and the two subnets:

neutron router-create router

neutron router-gateway-set router public

neutron router-interface-add router private_v4_subnet

neutron router-interface-add router private_v6_subnet
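
As an optional check, you can confirm that the router picked up its gateway and interface ports on both subnets (output omitted):

neutron router-show router

neutron router-port-list router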

When an instance boots inside the tenant and attaches to the “private” network it will have access to both the IPv4 and the IPv6 subnets that were defined previously. Note: Both Windows (since Windows 7/Server 2008) and Linux (from a long time ago) support SLAAC out of the box as long as basic IPv6 protocol support is enabled.  The instance will receive an IPv4 address via DNSMASQ and an IPv6 address via radvd. Note: The host will NOT receive any kind of IPv6 DNS entry from OpenStack. There are a couple of considerations to understand with this. An instance can get both IPv4 and IPv6 DNS information for a host over IPv4 transport. You don’t need to access a DNS server natively over IPv6 in order to receive IPv6 host information in a lookup. But, if you do want to have an IPv6 entry in the /etc/resolv.conf file then, in SLAAC mode, you will need to configure it manually, set up cloud-init to inject it or just bake that entry into the image that is booted.
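
If you go the cloud-init route, here is a minimal sketch of what that could look like. The DNS server address is the lab example used later in this post, and the image/flavor/net names are placeholders, so adjust for your environment:

# Write a tiny cloud-config that appends an IPv6 resolver entry on first boot
cat > user-data.yaml <<'EOF'
#cloud-config
runcmd:
  - echo "nameserver 2001:db8:cafe:a::e" >> /etc/resolv.conf
EOF

# Boot an instance with it
nova boot --flavor m1.small --image fedora-21 --nic net-id=<private-net-uuid> --user-data user-data.yaml server-slaac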

A highly summarized view of a tcpdump capture reveals the basic flow of a SLAAC exchange.  The ICMPv6 Flags field states “none” and the Prefix Information Flag field has “auto” enabled. The combination of “none” (Managed and Other bits = 0) and “auto” (Auto bit = 1) indicates SLAAC is used and DHCPv6 is not used. You can read more on this in RFC 4861:

IP6 (hlim 255, next-header ICMPv6 (58) payload length: 16) fe80::f816:3eff:fe79:5acc > ff02::2: [icmp6 sum ok] ICMP6, router solicitation, length 16
	  source link-address option (1), length 8 (1): fa:16:3e:79:5a:cc
	    0x0000:  fa16 3e79 5acc

IP6 (hlim 255, next-header ICMPv6 (58) payload length: 56) fe80::f816:3eff:fec3:17b4 > ff02::1: [icmp6 sum ok] ICMP6, router advertisement, length 56
	hop limit 64, Flags [none], pref medium, router lifetime 30s, reachable time 0s, retrans time 0s
	  prefix info option (3), length 32 (4): 2001:db8:cafe::/64, Flags [onlink, auto], valid time 86400s, pref. time 14400s
	    0x0000:  40c0 0001 5180 0000 3840 0000 0000 2001
	    0x0010:  0db8 cafe 0000 0000 0000 0000 0000
	  source link-address option (1), length 8 (1): fa:16:3e:c3:17:b4
	    0x0000:  fa16 3ec3 17b4

The instance will send out a router solicitation and the router will reply with a router advertisement. In that RA, the router will send various bits of information (various lifetimes and other stuff) and, in the case of SLAAC, it will send out a “Flags [none]”, indicating that it is not giving out anything other than a prefix. In that RA is the prefix from the subnet that was configured earlier (2001:db8:cafe::/64). The instance will use that prefix, along with its own IID (discussed earlier) to come up with a valid 128-bit IPv6 address. The instance will use the router’s link-local address as its gateway.  After the host has the IPv6 address it will go through a series of events such as Duplicate Address Detection (DAD) and so on. I won’t be explaining all of that stuff here. 😉
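
If you want to verify this from inside the instance, a couple of standard iproute2 commands (nothing OpenStack-specific) will show the SLAAC-derived address and the link-local gateway:

# The global address combines the advertised prefix with the instance's IID
ip -6 addr show dev eth0

# The default route should point at the router's link-local (fe80::...) address
ip -6 route show default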

Stateful DHCPv6

Figure 5 is a topology view of the Stateful DHCPv6 layout. It is pretty much the same as with SLAAC only using a different IPv4 and IPv6 prefix.

Figure 5. Stateful DHCPv6 Topology Example


Just like with SLAAC mode, radvd is used with Stateful DHCPv6 mode.  Instead of using the prefix announcement (Auto bit set to 1) in the RA to assist with IPv6 address assignment (SLAAC mode), Stateful DHCPv6 uses the Managed bit (set to 1) and the Auto bit (set to 0) to instruct the instance to perform a  DHCPv6 solicit for both address assignment and other options.

Configuring OpenStack to use Stateful DHCPv6 is done using the same process as was shown in the SLAAC example only with different options. In the dashboard, “DHCPv6 stateful: Address discovered from OpenStack DHCP” is selected in the “subnet details” screen.  Unlike with SLAAC, a DNS name server entry can be added and it will be sent to the instance.  If the Neutron client CLI is used then the option arguments would be: “--ipv6-address-mode=dhcpv6-stateful --ipv6-ra-mode=dhcpv6-stateful”.  Here’s an example:

neutron net-create private-dhcpv6
 
neutron subnet-create --ip-version=6 --name=private_dhcpv6_subnet --ipv6-address-mode=dhcpv6-stateful --ipv6-ra-mode=dhcpv6-stateful private-dhcpv6 2001:db8:cafe:1::/64 --dns-nameserver 2001:db8:cafe:a::e
Created a new subnet:
+-------------------+----------------------------------------------------------------+
| Field             | Value                                                          |
+-------------------+----------------------------------------------------------------+
| allocation_pools  | {"start": "2001:db8:cafe:1::2", "end": "2001:db8:cafe:1:ffff:ffff:ffff:fffe"} |
| cidr              | 2001:db8:cafe:1::/64                                           |
| dns_nameservers   | 2001:db8:cafe:a::e                                             |
| enable_dhcp       | True                                                           |
| gateway_ip        | 2001:db8:cafe:1::1                                             |
| host_routes       |                                                                |
| id                | 545ea206-9d14-4dca-8bae-7940719bdab5                           |
| ip_version        | 6                                                              |
| ipv6_address_mode | dhcpv6-stateful                                                |
| ipv6_ra_mode      | dhcpv6-stateful                                                |
| name              | private_dhcpv6_subnet                                          |
| network_id        | 55ed8333-2876-400a-92c1-ef49bc10aa2b                           |
| subnetpool_id     |                                                                |
| tenant_id         | f057804eb39b4618b40e06196e16265b                               |
+-------------------+----------------------------------------------------------------+

After performing “neutron router-interface-add” for each subnet (see the SLAAC example), it’s time to set up the instance operating system for DHCPv6 operation. To learn about how Microsoft Windows supports IPv6 address assignment, check out this blog.  No Linux version that I have ever worked with is configured, by default, to use DHCPv6.  For Ubuntu you have to add the “inet6 dhcp” line in the /etc/network/interfaces file:

auto eth0
iface eth0 inet dhcp
iface eth0 inet6 dhcp

I have tested CentOS 7 and Fedora 21 in this test bed and the following configuration works on those (add to the appropriate interface in /etc/sysconfig/network-scripts/):

IPV6INIT="yes"
DHCPV6C="yes"
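
For reference, here is a minimal sketch of a full ifcfg file with those two lines in context (the interface name, and really the whole file, is an assumption; match it to your image):

# /etc/sysconfig/network-scripts/ifcfg-eth0 (minimal sketch)
DEVICE="eth0"
ONBOOT="yes"
BOOTPROTO="dhcp"
IPV6INIT="yes"
DHCPV6C="yes"

Then restart networking (systemctl restart network) or reboot the instance to pick up the change.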

Included below is a highly summarized output from a tcpdump capture of a Stateful DHCPv6 exchange.  There is a lot more going on here than in the SLAAC example.  Just like the SLAAC example, there is an RS/RA exchange, but that is where the similarity ends.  In the RA below, the ICMPv6 Flags field is set to “managed” (M bit = 1). Note that when the M bit is set to 1, the O bit is redundant per RFC 4861 and can be ignored.  The instance will receive the RA, see that the “managed” flag is set and then begin the DHCPv6 client process.  The instance will send a DHCPv6 solicit to the ‘all DHCP servers’ well-known IPv6 multicast address (ff02::1:2).  The DHCPv6 server (dnsmasq in the Neutron reference implementation) will respond with a DHCPv6 advertise containing all of the good stuff (address and options). The instance will do a DHCPv6 request for options (it already processed the address info from the advertise) and the server will finish the sequence with a DHCPv6 reply.  You can read a summary view of this process here:

IP6 (hlim 255, next-header ICMPv6 (58) payload length: 24) fe80::f816:3eff:fe77:e5a0 > ff02::1: [icmp6 sum ok] ICMP6, router advertisement, length 24
	hop limit 64, Flags [managed], pref medium, router lifetime 30s, reachable time 0s, retrans time 0s
	  source link-address option (1), length 8 (1): fa:16:3e:77:e5:a0
	    0x0000:  fa16 3e77 e5a0

IP6 (hlim 1, next-header UDP (17) payload length: 64) fe80::f816:3eff:fe22:386b.546 > ff02::1:2.547: [udp sum ok] dhcp6 solicit (xid=85680b (client-ID hwaddr/time type 1 time 482446373 fa163e22386b) (option-request DNS-server DNS-search-list Client-FQDN SNTP-servers) (elapsed-time 101) (IA_NA IAID:1042430059 T1:3600 T2:5400))

IP6 (class 0xc0, hlim 64, next-header UDP (17) payload length: 175) fe80::f816:3eff:fe06:176f.547 > fe80::f816:3eff:fe22:386b.546: [udp sum ok] dhcp6 advertise (xid=85680b (client-ID hwaddr/time type 1 time 482446373 fa163e22386b) (server-ID hwaddr type 1 fa163e06176f) (IA_NA IAID:1042430059 T1:43200 T2:75600 (IA_ADDR 2001:db8:cafe:1::4 pltime:86400 vltime:86400)) (status-code success) (preference 255) (DNS-search-list openstacklocal.) (DNS-server 2001:db8:cafe:a::e) (Client-FQDN))

IP6 (hlim 1, next-header UDP (17) payload length: 106) fe80::f816:3eff:fe22:386b.546 > ff02::1:2.547: [udp sum ok] dhcp6 request (xid=9cb172 (client-ID hwaddr/time type 1 time 482446373 fa163e22386b) (server-ID hwaddr type 1 fa163e06176f) (option-request DNS-server DNS-search-list Client-FQDN SNTP-servers) (elapsed-time 0) (IA_NA IAID:1042430059 T1:3600 T2:5400 (IA_ADDR 2001:db8:cafe:1::4 pltime:7200 vltime:7500)))

IP6 (class 0xc0, hlim 64, next-header UDP (17) payload length: 186) fe80::f816:3eff:fe06:176f.547 > fe80::f816:3eff:fe22:386b.546: [udp sum ok] dhcp6 reply (xid=9cb172 (client-ID hwaddr/time type 1 time 482446373 fa163e22386b) (server-ID hwaddr type 1 fa163e06176f) (IA_NA IAID:1042430059 T1:3600 T2:6300 (IA_ADDR 2001:db8:cafe:1::4 pltime:7200 vltime:7500)) (status-code success) (DNS-search-list openstacklocal.) (DNS-server 2001:db8:cafe:a::e) (Client-FQDN))

Stateless DHCPv6

Stateless DHCPv6 is a combo of SLAAC (for address assignment) and DHCPv6 (for options).  For the sake of consistency with the last two sections, here is a topology sample for Stateless DHCPv6:

Figure 6. Stateless DHCPv6 Sample Topology


The configuration of Stateless DHCPv6 is similar to that of Stateful DHCPv6.  In the dashboard, select “DHCPv6 stateless: Address discovered from OpenStack Router and additional information from OpenStack DHCP” from the “subnet details” screen.  If using the Neutron client CLI, the option arguments change to “--ipv6-address-mode=dhcpv6-stateless --ipv6-ra-mode=dhcpv6-stateless”.  Here is an example:

neutron net-create private-dhcpv6-stateless

neutron subnet-create --ip-version=6 --name=private_dhcpv6_stateless_subnet --ipv6-address-mode=dhcpv6-stateless --ipv6-ra-mode=dhcpv6-stateless private-dhcpv6-stateless 2001:db8:cafe:2::/64 --dns-nameserver 2001:db8:cafe:a::e
Created a new subnet:
+-------------------+--------------------------------------------------------------+
| Field             | Value                                                        |
+-------------------+--------------------------------------------------------------+
| allocation_pools  | {"start": "2001:db8:cafe:2::2", "end": "2001:db8:cafe:2:ffff:ffff:ffff:fffe"} |
| cidr              | 2001:db8:cafe:2::/64                                         |
| dns_nameservers   | 2001:db8:cafe:a::e                                           |
| enable_dhcp       | True                                                         |
| gateway_ip        | 2001:db8:cafe:2::1                                           |
| host_routes       |                                                              |
| id                | edd1d404-e949-4cdf-9812-334bbf0e5cec                         |
| ip_version        | 6                                                            |
| ipv6_address_mode | dhcpv6-stateless                                             |
| ipv6_ra_mode      | dhcpv6-stateless                                             |
| name              | private_dhcpv6_stateless_subnet                              |
| network_id        | f65c6e60-d31e-4a1c-8136-599c3855b86a                         |
| subnetpool_id     |                                                              |
| tenant_id         | f057804eb39b4618b40e06196e16265b                             |
+-------------------+--------------------------------------------------------------+

After performing “neutron router-interface-add” for each subnet (again, see the SLAAC example), edits need to be made to the instance operating system for Stateless DHCPv6 operation. In the Ubuntu /etc/network/interfaces file, ensure the “inet6” line is set to “auto” and enable the stateless flag:

iface eth0 inet6 auto
    dhcp 1

Again, I only tested CentOS 7 and Fedora 21 with this setup so your mileage may vary, but for those OSes, set the following for the appropriate interface file in /etc/sysconfig/network-scripts/:

IPV6INIT="yes"
DHCPV6C="yes"
DHCPV6C_OPTIONS="-S"

Below is the output from tcpdump for the Stateless DHCPv6 exchange.  Radvd sends out the RA with the ICMPv6 Flags field showing “other stateful” (O bit set to 1) and the Prefix Information Flag field has “auto” enabled. This combination basically says “I, the router, will provide you, the instance, a prefix via SLAAC and you come back asking for options”.  The instance will issue a “dhcp6 inf-req” to get the options and the DHCPv6 server will reply:

IP6 (hlim 255, next-header ICMPv6 (58) payload length: 56) fe80::f816:3eff:fec1:bc52 > ff02::1: [icmp6 sum ok] ICMP6, router advertisement, length 56
	hop limit 64, Flags [other stateful], pref medium, router lifetime 30s, reachable time 0s, retrans time 0s
	  prefix info option (3), length 32 (4): 2001:db8:cafe:2::/64, Flags [onlink, auto], valid time 86400s, pref. time 14400s
	    0x0000:  40c0 0001 5180 0000 3840 0000 0000 2001
	    0x0010:  0db8 cafe 0002 0000 0000 0000 0000
	  source link-address option (1), length 8 (1): fa:16:3e:c1:bc:52
	    0x0000:  fa16 3ec1 bc52

IP6 (hlim 1, next-header UDP (17) payload length: 44) fe80::f816:3eff:fefe:d157.546 > ff02::1:2.547: [udp sum ok] dhcp6 inf-req (xid=d2dbc8 (client-ID hwaddr type 1 fa163efed157) (option-request DNS-server DNS-search-list Client-FQDN SNTP-servers) (elapsed-time 94))

IP6 (class 0xc0, hlim 64, next-header UDP (17) payload length: 88) fe80::f816:3eff:fe2d:a6de.547 > fe80::f816:3eff:fefe:d157.546: [udp sum ok] dhcp6 reply (xid=d2dbc8 (client-ID hwaddr type 1 fa163efed157) (server-ID hwaddr type 1 fa163e2da6de) (DNS-search-list openstacklocal.) (DNS-server 2001:db8:cafe:a::e) (lifetime 86400))

I am sorry for the length of the blog post. In retrospect I should have broken this post into multiple posts but oh well.

I hope this was helpful.

Cheers,
Shannon

OpenStack – Juno Release – Layer 3 High-Availability

In the Juno release of OpenStack we were blessed with many new features and functionality. One of them is the Layer 3 (L3) High-Availability (HA) functionality. The L3 HA support that came in Juno uses Keepalived with Virtual Routing Redundancy Protocol (VRRP).

This blog is not meant to give you a full dump of Keepalived or VRRP. It’s not even meant to give you a deep dive on L3 HA. I will provide some explanation of my pictures and configs, but you really should familiarize yourself with the original info on L3 HA.

Why do we need L3 HA in OpenStack?

Historically, there was not much in the way of providing a tenant a First Hop Redundancy Protocol (FHRP) for the tenant Neutron router/L3 agent.  If a tenant used a router for connectivity to other tenants or external access (Internet) and the node that housed the L3 agent died, or the agent itself puked, then the tenant was isolated with no connectivity in or out.  The one exception to this is if you did not use a tenant router and instead used the provider network model with VLANs, where the first L3 hop for the tenant instance was a physical L3 device (like an aggregation layer L3 switch). In that case the FHRP (e.g. VRRP, HSRP, GLBP) being used between the redundant aggregation layer switches would provide your L3 HA capabilities.

So, we needed an answer for this L3 HA issue. In Juno, the L3 HA functionality was released so that we now had redundancy for the neutron router (L3 agent).

High-Level Considerations

There are a few things to keep in mind when using the L3 HA functionality:

  • L3 HA can be configured manually via the Neutron client by an admin:
    • neutron router-create --ha True|False
  • L3 HA functionality can be set as a system default within the /etc/neutron/neutron.conf and l3_agent.ini files (see examples later in the post)
  • An existing non-HA router can be updated to HA:
    • neutron router-update <router-name> --ha=True
  • Requires a minimum of two network nodes (or controllers), each running the L3 agent

What does the tenant see?

The tenant sees one router with a single gateway IP address. (Note: Non-admin users cannot control whether the router is HA or non-HA.)  From the tenant’s perspective, the router behaves the same in HA or non-HA mode. In Figure 1 below, the tenant sees the instances, the private network and a single router (the example below is from an admin user).

Figure 1. Tenant View of L3 HA-enabled Router


Routing View

In Figure 2 below, a basic view of the routing and VRRP setup is shown.  In this example the tenant network is assigned 10.10.30.x/24. VRRP is using 169.254.0.x over a dedicated HA-only network that traverses the same tenant network type (i.e. VXLAN).  The router (L3 agent) on the left is the VRRP master and is the tenant gateway (10.10.30.1).

Tenant instances will use the 10.10.30.1 as their gateway and traffic northbound will pass through the master (the left router in this example). Return traffic will pass back through the router acting as a master.
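
As an admin, you can usually spot the dedicated HA network that Neutron builds for this purpose in the network list (the naming below is what I have seen in Juno; treat it as an approximation):

neutron net-list | grep "HA network"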

Figure 2. Routing View of L3 HA


Host View of L3 HA

In Figure 3 below, there are three OpenStack nodes: a compute node, a control node and a network node. (Note: In this example the control node is acting as a network node as well.)  The L3 HA-enabled router has been created and there is a Neutron router (L3 agent) running on both the control and network nodes. Keepalived/VRRP is enabled by the L3 HA code. In this example, br-int has a variety of ports connecting:

  • Port to br-tun (used for VXLAN)
  • Port to br-eth1 (used for eth1 connection)
  • Port to qrouter (qr-xxxx)
  • Port to keepalived (ha-xxxx)

The Neutron router (qrouter) in this example has the ports listed above for keepalived and br-int but also has a port to br-eth1 (via qg-xxxx).

The compute node has instances attached to br-int (via tap interfaces, veth pairs and linux bridge). br-int has a port connecting it to br-tun, again, used in this example for VXLAN.

Figure 3. Host View of L3 HA


Figure 4 shows a basic traffic flow example.  L3 HA uses keepalived/VRRPv2 to manage the master/backup relationship between the Neutron routers (L3 agents). VRRPv2 control traffic is using the 169.254.192.x network (configurable) and advertisements are sent using the well-known IPv4 multicast group of 224.0.0.18 (not configurable).

In this example, traffic leaving  an instance on the compute node will follow the path to its default gateway (via the VXLAN tunnels between br-tun on each node). The traffic flows through whichever L3 agent is acting as ‘master’. In this example the L3 agent is running on the control node.  The L3 agent will then route traffic towards the destination.

Figure 4. Traffic Flow Example for L3 HA


Enabling L3 HA in Juno

On the node running Neutron server (controller), edit the /etc/neutron/neutron.conf file and uncomment/edit the following lines:

router_distributed = False
# =========== items for l3 extension ==============
# Enable high availability for virtual routers.
l3_ha = True
#
# Maximum number of l3 agents which a HA router will be scheduled on. If it
# is set to 0 the router will be scheduled on every agent.
max_l3_agents_per_router = 3
#
# Minimum number of l3 agents which a HA router will be scheduled on. The
# default value is 2.
min_l3_agents_per_router = 2
#
# CIDR of the administrative network if HA mode is enabled
l3_ha_net_cidr = 169.254.192.0/18
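
Since l3_ha and friends live in neutron.conf on the controller, you will most likely also need to restart neutron-server for the new default to take effect (the service name may differ depending on your packaging):

systemctl restart neutron-server.service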

On the nodes running the L3 agent (in my example the control and network nodes), edit the /etc/neutron/l3_agent.ini file and uncomment/edit the following lines (Note: Set a better password than the one I’ve included ;-)):

# Location to store keepalived and all HA configurations
ha_confs_path = $state_path/ha_confs

# VRRP authentication type AH/PASS
ha_vrrp_auth_type = PASS

# VRRP authentication password
ha_vrrp_auth_password = cisco123

# The advertisement interval in seconds
ha_vrrp_advert_int = 2

Restart the L3 agent service on each node:

systemctl restart neutron-l3-agent.service
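
After the restart, it is worth confirming that the L3 agents on both nodes are checking in before you build the router (output omitted):

neutron agent-list | grep "L3 agent"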

If you completed the above steps then you can create the neutron router without any ‘--ha’ flags set and the router will be created as an HA-enabled router. You can also create an HA-enabled router using the ‘--ha=True’ flag as shown in the following example (Note: Only admins have permissions to run with the --ha flag set):

[root@net1 ~]# neutron router-create --ha True test1
Created a new router:
+-----------------------+--------------------------------------+
| Field                 | Value                                |
+-----------------------+--------------------------------------+
| admin_state_up        | True                                 |
| distributed           | False                                |
| external_gateway_info |                                      |
| ha                    | True                                 |
| id                    | 1fe9e406-2bb5-42c4-af62-3daef314e181 |
| name                  | test1                                |
| routes                |                                      |
| status                | ACTIVE                               |
| tenant_id             | 45e1c2a0b3a244a3a9fad48f67e28ef4     |
+-----------------------+--------------------------------------+
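
You can also verify which L3 agents ended up hosting the HA router (with min_l3_agents_per_router = 2 you should see at least two entries):

neutron l3-agent-list-hosting-router test1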

Once the router is created via the dashboard or via CLI (or via Heat) and your networks are all attached, you can look at the keepalived settings in the /var/lib/neutron/ha_confs/<id>/keepalived.conf file. In the example below, “interface ha-0d655b16-c6” is the L3 HA interface; VRRP will track that interface. The virtual IP (VIP) address for the ha-xxx interface is 169.254.0.1 (only the master holds this address). The tenant-facing router IP (10.10.30.1) and the public-facing router IP (192.168.81.13) are the VIPs for each respective network. The external default gateway is 192.168.81.2.

[root@net1 ~]# cat /var/lib/neutron/ha_confs/719b853f-539e-420b-a76b-0440146f05de/keepalived.conf
. . . output abbreviated 
vrrp_instance VR_1 {
    state BACKUP
    interface ha-0d655b16-c6
    virtual_router_id 1
    priority 50
    nopreempt
    advert_int 2
    authentication {
        auth_type PASS
        auth_pass cisco123
    }
    track_interface {
        ha-0d655b16-c6
    }
    virtual_ipaddress {
        169.254.0.1/24 dev ha-0d655b16-c6
    }
    virtual_ipaddress_excluded {
        10.10.30.1/24 dev qr-c3090bd6-1b
        192.168.81.13/24 dev qg-4f163e63-c4
    }
    virtual_routes {
        0.0.0.0/0 via 192.168.81.2 dev qg-4f163e63-c4
    }
}

Run a tcpdump on the L3 HA interface (inside the router namespace) to watch the VRRPv2 advertisements:

[root@net1 ~]# ip netns exec qrouter-719b853f-539e-420b-a76b-0440146f05de tcpdump -n -i ha-0d655b16-c6
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ha-0d655b16-c6, link-type EN10MB (Ethernet), capture size 65535 bytes
14:00:03.123895 IP 169.254.192.33 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype simple, intvl 2s, length 20
14:00:05.125386 IP 169.254.192.33 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype simple, intvl 2s, length 20
14:00:07.128133 IP 169.254.192.33 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype simple, intvl 2s, length 20
14:00:09.129421 IP 169.254.192.33 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype simple, intvl 2s, length 20
14:00:11.130814 IP 169.254.192.33 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype simple, intvl 2s, length 20
14:00:13.131529 IP 169.254.192.33 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype simple, intvl 2s, length 20

Testing a failure

First, check who is master:

[root@net1 ~]# cat /var/lib/neutron/ha_confs/719b853f-539e-420b-a76b-0440146f05de/state
master
[root@net2 ~]# cat /var/lib/neutron/ha_confs/719b853f-539e-420b-a76b-0440146f05de/state
backup

Simulate a failure by shutting down the HA interface (remember that it was in the tracked interface list). Have a ping running on the instance to verify connectivity through the failure:

[root@net1 ~]# ip netns exec qrouter-719b853f-539e-420b-a76b-0440146f05de ifconfig ha-0d655b16-c6 down

Check to see that the master role changed after failure:

[root@net1 ~]# cat /var/lib/neutron/ha_confs/719b853f-539e-420b-a76b-0440146f05de/state
fault
[root@net2 ~]# cat /var/lib/neutron/ha_confs/719b853f-539e-420b-a76b-0440146f05de/state
master

Checking in on the ping shows a delay but no loss (In a loaded system you will likely see a brief loss of traffic):

ubuntu@server1:~$ ping 8.8.8.8
64 bytes from 8.8.8.8: icmp_seq=20 ttl=127 time=65.4 ms
64 bytes from 8.8.8.8: icmp_seq=21 ttl=127 time=107 ms
64 bytes from 8.8.8.8: icmp_seq=22 ttl=127 time=64.5 ms
64 bytes from 8.8.8.8: icmp_seq=23 ttl=127 time=67.6 ms
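
To recover, bring the HA interface back up inside the router namespace. The state file should move from ‘fault’ back to ‘backup’ and, since nopreempt is set in the generated keepalived config, that router should not take the master role back:

[root@net1 ~]# ip netns exec qrouter-719b853f-539e-420b-a76b-0440146f05de ifconfig ha-0d655b16-c6 up
[root@net1 ~]# cat /var/lib/neutron/ha_confs/719b853f-539e-420b-a76b-0440146f05de/state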

Happy testing!

What you can expect from this site

I am horrible at blogging.  It takes me months to log in to a blog and type anything. Much of what you will read here is drivel and nonsense. You’ve been warned.

When I do have something to say it will likely be about OpenStack, Docker, IPv6, general DC networking or whatever I happen to be interested in at that moment.  I will likely be posting stuff that is not ready for primetime but that I got working. Don’t mistake ping testing for production-ready. I am usually too far out on the edge for anything I do to be considered production ready. 😉

Be patient, be gentle and hopefully I will have something interesting to post.