Running Tailscale in GitLab CI/CD with Alpine container

OAuth client and running service in Alpine container

  1. Background
  2. Tailscale
  3. Tailscale ACL
  4. NixOS
  5. OAuth client
  6. Alpine container
    1. GitLab CI
  7. Additional reading

Background §

Previously, I used a cloudflared-powered SSH certificate to access my servers. Since the SSH connection is proxied through a cloudflared tunnel, which is created by initiating outbound connections only, I could restrict the SSH port to localhost without opening any inbound port. It worked for my workstation: I initiate SSH through cloudflared, which opens a browser, and I authenticate on Cloudflare Access through OTP.

But obviously that wouldn’t work for automated deployment, where my blog is built in GitLab CI and then deployed to my web servers automatically. Cloudflare Access supports authenticating through service tokens, but cloudflared apparently cannot obtain an SSH certificate that way. Reading through the documentation, I don’t see any mention of which username a service token is associated with, so Cloudflare Access cannot identify which identity to issue an SSH certificate to.

For the CI/CD pipeline use case, Cloudflare instead recommends using the WARP connector, which creates a Site-to-Site VPN tunnel between networks: a source network hosting the build server that runs the pipelines connects to a destination network hosting the web servers. The connector agent acts as a subnet router and can run on a router, on a dedicated server, or directly on each server. However, I run my pipelines on public shared runners, which are serverless to me as I don’t manage the underlying servers. This rules out a long-running tunnel, so the only method left is to run the WARP client in the CI/CD pipeline, specifically in the deployment job. However, the WARP client is not meant to run in an ephemeral container. It could, but imagine cleaning up all the stale clients in the Zero Trust portal.

Tailscale §

Reading through comments on SSH through Cloudflare Access, I noticed that some people mentioned they had moved on to Tailscale instead. When I searched the web for “tailscale gitlab”, the first result was this official guide by Tailscale. The guide notably mentions the concept of ephemeral nodes: a Tailscale device registered in this mode is automatically removed from the device list after it goes offline.

The guide is brief, too brief in fact. It mentions creating an auth key, which lasts at most 90 days; perhaps that is good security practice, but I find having to update a GitLab secret every quarter unappealing. Instead, an OAuth client should be used because it has no expiry; it is also the recommended and only option when running in GitHub Actions. Even the auth key documentation suggests using an OAuth client to create auth keys instead of creating them directly. I also had difficulty running the tailscaled daemon on node:alpine, which I later figured out.

Tailscale ACL §

The first thing I did after signing up for Tailscale was to replace the allow-all-by-default ACL with the following ACL, which allows port 22 from the owner’s devices (my workstation) and the GitLab Runner to my web servers. The ACL is also used to create tags: a tag must exist in the ACL before you can assign it to a device.

{
  "tagOwners": {
    "tag:server1": ["autogroup:owner"],
    "tag:server2": ["autogroup:owner"],
    "tag:ci":      ["autogroup:owner"],
  },

  "acls": [
    {
      "action": "accept",
      "src":    ["autogroup:owner", "tag:ci"],
      "dst":    ["tag:server1:22", "tag:server2:22"],
      "proto":  "tcp",
    },
  ],

  // Owner must have SSH access
  "tests": [
    {
      "src":    "[email protected]",
      "accept": ["tag:server1:22", "tag:server2:22"],
      "proto":  "tcp",
    },
  ],
}

NixOS §

I added my web servers under the Machines tab and tagged them with server1 and server2 respectively. Each server has a unique (non-ephemeral and non-reusable) auth key, which I saved to the file “/run/secrets/tailscale_key” on that server and chmod-ed to 600. I then added the following lines and ran sudo nixos-rebuild switch. The servers showed up afterwards and I manually approved them because I have device approval enabled.

/etc/nixos/configuration.nix
services.tailscale = {
  enable = true;
  authKeyFile = "/run/secrets/tailscale_key";
  extraDaemonFlags = [ "--no-logs-no-support" ];
};
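
The key file referenced by authKeyFile has to exist before the rebuild. Below is a minimal sketch of one way to write it with owner-only permissions from the start, rather than creating it and tightening the mode afterwards. The function name write_secret is hypothetical, not something Tailscale or NixOS provides, and it assumes the auth key is piped in on stdin.

```shell
# write_secret PATH: store the secret read from stdin at PATH with mode 600.
# Hypothetical helper, not part of Tailscale or NixOS.
write_secret() {
  (
    umask 077        # files created in this subshell are never group/world-readable
    cat > "$1"       # copy stdin into the target file
  )
  chmod 600 "$1"     # also covers the case where the file already existed
}
```

Run as root on the server, it could be used as `printf '%s\n' "$AUTH_KEY" | write_secret /run/secrets/tailscale_key`, where AUTH_KEY is an assumed variable holding the key generated in the admin console.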

OAuth client §

In the Tailscale admin console, navigate to Settings → OAuth clients. Generate a new client with read+write permission to “Auth Keys”.

Create a new GitLab CI/CD variable:

  • Check “masked and hidden” and “protect variable”.
  • Uncheck “expand variable”.
  • Name the key TS_OAUTH_SECRET and paste the secret under value.

Alpine container §

Skip to the actual commands: GitLab CI

Tailscale provides an official Alpine-based container image, tailscale/tailscale:stable, which is probably the easiest way to run it in a container. I don’t use it because the Alpine package repository only has the LTS version of Node.js, not the latest release available in the node:alpine image. Instead of using the tailscale image as a base, I prefer to use the larger Node.js image as a base and install tailscale on top of it.

It took me a few attempts to run Tailscale in the node:alpine image as I was unfamiliar with the behaviour of init/OpenRC in an Alpine container. These are my attempts in order:

  1. tailscale up failed because tailscaled is not running.
  2. rc-service tailscale start failed because “openrc” is not installed.
  3. It still failed with openrc.
  4. Found this workaround which worked but I wasn’t sure why.
  5. Then I found this StackOverflow question. One of the answers mentioned that a container doesn’t really boot, which explains why the container neither installs nor runs openrc by default.
  6. Instead of starting a service (which executes tailscaled), I could just run tailscaled in the background.
  7. I followed tailscaled’s environment variables and default arguments from its Alpine package, including the $PATH override. I only changed --state from “/var/lib/tailscale/tailscaled.state” to “mem:” since it will be an ephemeral node.
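
Since tailscaled is started in the background, a later command could in principle run before the daemon has created its socket. I have not confirmed that tailscale up actually needs this, so treat the following as an optional, defensive sketch rather than part of my working setup. The helper name wait_for_path and the 10-second default are hypothetical.

```shell
# wait_for_path PATH [TIMEOUT_SECONDS]: poll once per second until PATH exists.
# Returns 0 when PATH appears, 1 if it is still missing after the timeout.
# Hypothetical helper; the default of 10 seconds is an arbitrary choice.
wait_for_path() {
  path=$1
  remaining=${2:-10}
  while [ ! -e "$path" ]; do
    [ "$remaining" -le 0 ] && return 1   # give up once the budget is spent
    sleep 1
    remaining=$((remaining - 1))
  done
  return 0
}
```

In a job it would sit between the backgrounded tailscaled and tailscale up, for example `wait_for_path /run/tailscale/tailscaled.sock || exit 1`, so the job fails early with a clear cause if the daemon never comes up.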

GitLab CI §

.gitlab-ci.yml
before_script:
  - apk update && apk add tailscale
  - export PATH="/usr/libexec/tailscale:$PATH"
  - export TS_DEBUG_FIREWALL_MODE=nftables
  - tailscaled --socket=/run/tailscale/tailscaled.sock --state="mem:" --port=41641 --no-logs-no-support >/dev/null 2>&1 &
  - tailscale up --auth-key="${TS_OAUTH_SECRET}?ephemeral=true&preauthorized=true" --advertise-tags=tag:ci --hostname="gitlab-$(cat /etc/hostname)" --accept-routes
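
The --auth-key value in the job is the OAuth client secret followed by query parameters that describe the node to register. The sketch below only makes that string's structure explicit; the function name oauth_auth_key is hypothetical, and its defaults simply mirror the two parameters used in the job.

```shell
# oauth_auth_key SECRET [EPHEMERAL] [PREAUTHORIZED]
# Prints SECRET followed by the query parameters used in the job above.
# Hypothetical helper; both options default to "true" as in the job.
oauth_auth_key() {
  printf '%s?ephemeral=%s&preauthorized=%s\n' "$1" "${2:-true}" "${3:-true}"
}
```

For example, `tailscale up --auth-key="$(oauth_auth_key "$TS_OAUTH_SECRET")" --advertise-tags=tag:ci` would pass the same value as the job does. The tag is kept alongside it because the node is registered under tag:ci, which the ACL allows to reach port 22.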

Additional reading §

GitOps for Tailscale ACLs with GitLab CI