Nice nginx features for developers

June 02, 2021

A lot of people use nginx as a web server but fall back to something like haproxy or traefik for service routing. You can use nginx for that too! In my experience, nginx provides rich and flexible ways to route your requests. Here are a few things that worked well for me when I was wearing a developer hat.

First, let’s look at a simple config that just forwards requests from http://proxy.local/ to a single backend at http://backend.local:10000.

user nginx;
worker_processes auto;

events {}

http {
    access_log  /var/log/nginx/access.log combined;

    # include /etc/nginx/conf.d/*.conf;

    upstream backend {
        server backend.local:10000;
    }

    server {
        server_name proxy.local;
        listen 8000;

        location / {
            proxy_pass http://backend;
        }
    }
}

You declare your backend service as an upstream group. Each instance of the backend is described with a server directive.

Then you declare the entrypoint with a server and a location. Given that it’s nginx, you can go crazy with regex location matching and stuff, but that’s not what service routing requires.

Finally, you forward requests with the proxy_pass directive.

From this simple config, we can start to build the necessary complexity.

Active/Passive backend configuration

If your service needs an active/passive configuration, where one server is the primary for handling requests and the other is a backup, you can configure it like this:

    ...
    upstream backend {
        server main-backend.local:10000;
        server backup-backend.local:10000 backup;
    }
    ...    

The backup option tells nginx that this server in the upstream group should be used only when the primary server is unavailable.

By default, a server is marked as unavailable after 1 connection error or timeout within a 10-second window (the fail_timeout option). The number of allowed failures can be tuned with the max_fails option for each server in an upstream group like this:

    ...
    upstream backend {
        # Mark the main server unavailable after 3 failed attempts
        server main-backend.local:10000 max_fails=3;

        # Mark the backup unavailable after 10 failed attempts
        server backup-backend.local:10000 backup max_fails=10;
    }
    ...    

In addition to connection errors and timeouts, you can count certain HTTP response codes, like 500, as unsuccessful attempts. This is configured with the proxy_next_upstream directive.

...
    upstream backend {
        server main-backend.local:10000;
        server backup-backend.local:10000 backup;
    }

    server {
        server_name proxy.local;
        listen 8000;

        # Switch to the next upstream in case of a connection error, a timeout,
        # or an HTTP 429 (rate limit) response.
        proxy_next_upstream error timeout http_429;

        location / {
            proxy_pass http://backend;
        }
    }
...

Proxy to Kubernetes service

The max_fails option is crucial if your nginx runs inside Kubernetes and you want to proxy requests to a Kubernetes service (using cluster DNS). In this case, you should have a single server with max_fails=0 like this:

    ...
    upstream backend {
        server app.my-team.svc max_fails=0;
    }
    ...

This way nginx will never mark the Kubernetes service as unavailable and won’t try to do passive health checks. None of that is needed, because the Kubernetes service already does active health checks itself via readiness probes.

Flexible routing with map

Sometimes you need to route requests based on some header value. Or query parameter. Or cookie value. Or hostname. Or any combination of those.

And this is where nginx really shines. It’s the only proxy server (in my experience) that allows routing requests with almost arbitrary logic.

The key part that makes this possible is ngx_http_map_module. This module lets you define a new variable from a combination of other variables using regular expressions. Sounds complicated, but wait for it.

Say, we have 3 backend services that are serving different kinds of data:

  1. Live data service that returns the most recent data that were just collected.
  2. Historical data service that returns old data.
  3. Aggregated data service that returns precalculated data.

Call it microservices architecture, whatever.

These services are exposed to users via the same endpoint https://<date>.api.com/?report=<report>. Here are a few examples to give you an idea of how it works (report names other than counters are made up):

  • https://api.com/?report=requests - live data
  • https://2021-05-01.api.com/?report=requests - historical data for May 1st, 2021
  • https://api.com/?report=counters - aggregated data

This may seem like an ugly API, but this is how the real world often looks, and you have to deal with it.

So let’s write a routing configuration. First, define 3 upstream groups:

upstream live {
    server live-backend-1:8000;
    server live-backend-2:8000;
    server live-backend-3:8000;
}

upstream hist {
    server hist-backend-1:9999;
    server hist-backend-2:9999;
}

upstream agg {
    server agg-backend-1:7100;
    server agg-backend-2:7100;
    server agg-backend-3:7100;
}

Next, define the server that will listen for all requests and somehow route them:

    server {
        server_name *.api.com "";
        listen 80;

        location / {
            # FIXME: proxy pass to who?
            proxy_pass http://???;
        }
    }

The question is: what should we write in the proxy_pass directive?

Since nginx configuration is declarative, we can write proxy_pass http://$destination/ and build the $destination variable with maps.

In our example service, we make a routing decision based on the report query variable and date subdomain. This is what we need to extract into our variables:

map $host $date {
    "~^((?<subdomain>\d{4}-\d{2}-\d{2})\.)?api\.com$" $subdomain;
    default "";
}

The map parses the $host variable (one of the many predefined nginx variables) and sets the result into our $date variable. Inside the map are the parsing rules.

In our case there are 2 rules: the main one with a regex, and a fallback denoted with the default keyword.

You can inspect the regex in regex101. The leading ~ marks the rule as a regular expression. The regex starts with ^ and ends with $, which anchor it to the start and end of the string - explicitly matching the whole string is a regex best practice, and I use it as much as possible. To extract the subdomain we create a group with parentheses. Inside that group, \d{4}-\d{2}-\d{2} matches the date format 2021-05-01. The ?<subdomain> part makes it a named capture group - it simply gives a name to the matched part of the regex. That name is then used on the right side of the map rule to assign the captured value to the $date variable. Note that the subdomain is optional, so we wrap it in parentheses together with the dot (the subdomain delimiter) and add ? to the whole group.

Phew! The regex part is done so we may relax.

To extract the report we don’t need a map at all, because nginx provides predefined $arg_<param> variables for query parameters. So the report query parameter can be accessed as $arg_report.

The full list of nginx variables can be found by googling “nginx varindex”.
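As a quick sanity check, you can echo both extracted values back with a throwaway debug location (this endpoint is purely illustrative, not part of the real config):

    ...
    server {
        ...
        # Hypothetical debug endpoint. For GET /debug?report=counters
        # on host 2021-05-01.api.com it returns:
        #   report=counters date=2021-05-01
        location /debug {
            default_type text/plain;
            return 200 "report=$arg_report date=$date\n";
        }
    }
    ...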

Ok, so now we have the date and the report. How can we construct the $destination variable from them? With another map! The trick here is that you can use a combination of variables as the source of a new variable in the map:

map "$arg_report:$date" $destination {
    "~counters:.*" agg;
    "~.*:.+" hist;
    default live;
}

The combination here is a string where the 2 variables are joined with a colon. The colon is a personal choice, used for convenience. You can use any symbol, just make sure the resulting regexes stay unambiguous.

In the map, we have 3 rules.

  1. First is to set $destination to agg when report query parameter is counters.
  2. Second is to set $destination to hist when $date variable is not empty.
  3. The default rule sets $destination to live when nothing else matches.

Regexes in the map are evaluated in order.

Note that $destination value is the name of the upstream group.

Here is the full config:

events {}

http {
    upstream live {
        server live-backend-1:8000;
        server live-backend-2:8000;
        server live-backend-3:8000;
    }

    upstream hist {
        server hist-backend-1:9999;
        server hist-backend-2:9999;
    }

    upstream agg {
        server agg-backend-1:7100;
        server agg-backend-2:7100;
        server agg-backend-3:7100;
    }

    map $host $date {
        "~^((?<subdomain>\d{4}-\d{2}-\d{2})\.)?api\.com$" $subdomain;
        default "";
    }

    map "$arg_report:$date" $destination {
        "~counters:.*" agg;
        "~.*:.+" hist;
        default live;
    }

    server {
        server_name *.api.com "";
        listen 80;

        location / {
            proxy_pass http://$destination/;
        }
    }
}

Passing request to Consul services

If you use Consul for service discovery then your services can be accessed via DNS provided by Consul. It’s as simple as curl myapp.service.consul.

Very convenient, but out of the box nothing knows how to resolve names in the .consul zone. The Consul docs give a few ways to configure this across your infrastructure. I’ve used dnsmasq with great success.

Anyway, to route requests via Consul DNS in nginx you don’t have to go to great lengths: nginx has a resolver directive for using custom DNS servers.

Here is how to forward requests via Consul DNS from nginx:

...
    server {
        server_name *.api.com "";
        listen 80;

        # Resolve using Consul DNS. Fallback to Google and Cloudflare DNS.
        resolver 10.0.0.1:8600 10.0.0.2:8600 10.0.0.3:8600 8.8.8.8 1.1.1.1;
        location /v1/api {
            proxy_pass http://prod.api.service.consul/;
        }
        location /v1/rpc {
            proxy_pass http://prod.rpc.service.consul/;
        }
    }
...

Update: Nice people at lobste.rs pointed out that proxy_pass caches the DNS response until restart. There are a few ways to fix this. First, put the Consul service URL into an upstream and use the valid option of the resolver directive to tune how long DNS responses are cached. The other option is to use a variable in proxy_pass, as described by Jeppe Fihl-Pearson. When nginx sees a variable in proxy_pass, it resolves the name at request time and honors the TTL of the DNS response.
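Here’s a minimal sketch of the variable approach (the resolver address and service name are made up, and valid=10s just caps how long nginx caches the answer):

    ...
    server {
        # Re-resolve names every 10 seconds at most.
        resolver 10.0.0.1:8600 valid=10s;

        location /v1/api {
            # Because proxy_pass gets a variable, nginx resolves the name
            # at request time through the resolver above.
            set $api_backend "prod.api.service.consul";
            proxy_pass http://$api_backend/;
        }
    }
    ...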

Yes, it’s not dynamic the way traefik is. If a new service needs to be added, you have to edit the nginx config somehow, while traefik picks it up automatically.

But you can implement decent service discovery with consul-template, which regenerates the nginx config from Consul data.
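A sketch of what that might look like (the file paths and the service name are made up):

    # upstreams.conf.ctmpl - rendered by consul-template
    upstream api {
    {{ range service "prod.api" }}
        server {{ .Address }}:{{ .Port }};
    {{ end }}
    }

Running consul-template -template "upstreams.conf.ctmpl:/etc/nginx/conf.d/upstreams.conf:nginx -s reload" re-renders the upstream block and reloads nginx whenever the set of instances changes in Consul.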

Conclusion

Nginx is a very versatile tool. It has a rich configuration language that enables nice features for developers.

  • Active/passive load balancing with configured failover
  • Flexible requests routing
  • Easy integration with Consul DNS

Yes, it’s not perfect - upstream health checks are passive (in the open source version), the configuration defaults are not modern, and the initial setup is rough.

But given all that richness, investing a little bit of time into nginx is worth it. Before ditching it in favor of something else, think hard about all the features it provides.