PHP DevOps Tutorial: How to Use Nginx With Your Web Application

Codementor PHP expert mentor Chris Fidao is a PHP developer who runs Servers for Hackers, a newsletter about servers for programmers.

He recently sat down with us during Codementor's office hours to talk about what programmers should know about servers and how to use Nginx with your web apps, specifically the different ways Nginx can be used to host applications (PHP and FastCGI, Python and uWSGI, load balancing in Nginx, and more).

The text below is a summary prepared by the Codementor team and may vary from the original video. If you spot any issues, please let us know!


What is Nginx?

  • Nginx is a web server for static files, and as an application proxy it can communicate with dynamically coded applications.
  • Nginx can be used as a load balancer, which means it can distribute traffic among one or more backend servers.
  • Nginx can also be used as a web cache like Varnish: it pays attention to HTTP cache headers and decides when to serve a cached response instead of pulling the resource from the origin server each time.

Some of the high-level things Nginx can do are:

  • Using gzip compression to compress the files you serve, like your CSS and JavaScript
  • Terminating SSL connections: receiving an encrypted connection, decrypting it, and responding to it (a minimal sketch of these two follows this list)
  • Nginx Plus, the commercial product from Nginx, adds extended logging, music and video streaming, and monitoring features
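
As a quick sketch of the first two bullets, here is what gzip compression and SSL termination can look like in an Nginx config. This is a hedged, minimal example: the certificate paths and the gzip_types list are placeholders, not a hardened setup.

server {
   listen 443 ssl;
   server_name example.com;

   # SSL termination: Nginx receives the encrypted connection,
   # decrypts it, and responds (certificate paths are placeholders)
   ssl_certificate     /etc/nginx/ssl/example.com.crt;
   ssl_certificate_key /etc/nginx/ssl/example.com.key;

   # gzip compression for text-based assets such as CSS and JavaScript
   gzip on;
   gzip_types text/css application/javascript application/json;
}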

Nginx vs Apache

Nginx does static content very well, and lately it has reportedly overtaken Apache in the percentage of web servers running in Linux server environments. Both Nginx and Apache have the same core functionality and are great with static content, but they have different process management. A process is essentially your application running, likely in the background, and it may spawn threads as mini versions of the process; threads take up memory but also make concurrent work quicker and easier.

The main difference in how Apache and Nginx take requests comes down to Nginx's event loop.

By default, Apache spawns a process for every web request, on every web connection; after the process handles and responds to the request, the process is killed, and Apache creates a new process for the next request, and so on. Handled this way, processes carry a lot of overhead and are CPU intensive, since keeping high numbers of processes around is a heavy thing for a server to do. Apache has improved this with MPMs, or multi-processing modules. MPMs spawn threads, so you can have a situation where one process handles an HTTP connection while each of its threads handles many requests, or where each process and thread handles its own HTTP connection and request at the same time.

Furthermore, the difference between a connection and a request is that HTTP can open a connection, and within that connection it can handle many requests. So, through these MPM modules, Apache has a more memory-efficient way of handling many requests at the same time. All in all, Apache's process management creates processes and threads, handles connections and requests, and spends a lot of overhead managing those processes and requests. Although a typical server has enough RAM to handle many processes, having too many of them can exhaust resources and hurt the server's ability to handle traffic and connections.

Unlike Apache, Nginx is built for high concurrency: it can handle many concurrent requests at the same time, and it's a "single" process. You can have more than one process, but there are very few of them and they're single threaded. This makes Nginx more efficient, since it doesn't spawn multiple threads per connection and instead runs an efficient event loop. For instance, if Nginx has to read a file or wait on some network operation, it can process another request while it waits for the first one to finish. So, as opposed to Apache, Nginx has few processes but is still able to handle a lot of simultaneous concurrent requests because of its asynchronous nature.
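
To make that process model concrete, here is a hedged sketch of the few directives that control it. The values shown are common defaults, not recommendations:

# Nginx's process model in a nutshell (illustrative values)
worker_processes auto;      # commonly one single-threaded worker per CPU core

events {
   # each worker multiplexes many connections through its event loop
   worker_connections 1024;
}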

In conclusion, Nginx handles static files very efficiently, and sometimes it’s as fast as a cache like Varnish.

Configuring Your Nginx Server

server {
   listen 80 default_server;

   root /var/www;
   index index.html index.htm;
   server_name example.com www.example.com;

   location / {
       try_files $uri $uri/ =404;
   }
}

This is a really basic configuration for Nginx to serve files for a website. It's almost exactly what the Servers for Hackers website uses, since that's just a static site generated with Sculpin, a PHP static site generator.

As you can see from the code, Nginx will listen on port 80, the regular HTTP port, and we tell it this is the default server: if Nginx gets a request that matches no site's configuration, this server block on port 80 handles it. We then set our web root to /var/www, which is where our web files are. The index will be index.html, or if that fails, index.htm, and we set the server_name to example.com or www.example.com. Lastly, we have a try_files directive in a location block, which matches the given URI and any sub-URI beneath it. try_files tries to find a file on the server: first it tries the exact URI as a file, if that fails it tries the URI as a directory, and if that also fails it responds with a 404. So again, this is just a really basic Nginx setup for a static website with HTML files, CSS, JavaScript, etc., but no dynamic application yet.

It is highly recommended to use H5BP for your server configuration setup. If you install Nginx on Linux distributions such as Ubuntu or Debian, you typically get the sites-available and sites-enabled structure already.
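
For reference, that structure is wired up through include directives in the main nginx.conf, which on Debian/Ubuntu typically looks something like this:

# Excerpt from a typical Debian/Ubuntu /etc/nginx/nginx.conf
http {
   # ...
   include /etc/nginx/conf.d/*.conf;
   include /etc/nginx/sites-enabled/*;   # symlinks to files in sites-available
}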

The H5BP includes really useful configurations and defaults for your Nginx servers such as:

  • Gzip Setup
  • Mime Types
  • Cache Expiration
  • “Proper” SSL Config
  • SPDY Setup
  • X-Domain/Web Fonts
  • Cache Busting
  • Security (XSS, dot files)

To expound on some of the benefits listed above: it includes many more MIME types to match the file types your server actually serves, and its SSL configuration prefers modern ciphers and avoids old, broken SSL protocols.
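
As a sketch of how H5BP might be wired in, assuming you've cloned the h5bp/server-configs-nginx repo under /etc/nginx; the h5bp/ directory and file names are assumptions that vary by repo version:

# Hypothetical site config pulling in H5BP defaults
# (the h5bp/ path and basic.conf name depend on the repo version you clone)
server {
   listen 80;
   server_name example.com;
   root /var/www;

   # gzip setup, mime types, cache expiration, security rules, etc.
   include h5bp/basic.conf;
}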

Using Nginx as an Application Proxy

Nginx can take a web request and proxy it, or send it, to your application. PHP developers might not be used to this process simply because they're so used to Apache: stacks like LAMP, MAMP, and WAMP use it, and those usually have mod_php loaded, which essentially loads PHP inside of Apache. You end up with a situation where Apache finds files ending in .php and parses them with the PHP embedded in it. Apache treats those like static files, except it also runs the PHP code inside of them. That's really an exception to how web applications usually work. If you use Ruby, Python, Java, or something else, you'd have a more typical setup, with more moving pieces than you get with Apache and mod_php.

Picture a web client (e.g. a web browser like Chrome) making a request to an application. That HTTP request goes to Nginx, and Nginx has to pass the request off to your application; Nginx doesn't talk directly to your application. There's a gateway in between, as there's always a middleman between a web server and your code: Nginx talks to the gateway, and the gateway talks to your application.

Here are some of the protocols, or languages, that gateways speak.

Gunicorn (Python) and Unicorn (Ruby) can listen over HTTP and act like web servers themselves. PHP-FPM speaks FastCGI, while uWSGI implements WSGI (Python's web server gateway interface) and speaks its own uwsgi protocol when talking to Nginx.

To illustrate how gateways work, let's say you're using PHP and you have a PHP application. You'd use PHP-FPM to talk to the application, which means PHP-FPM is the gateway in this case. PHP-FPM speaks FastCGI, so what happens is: Nginx takes a web request, turns it into a FastCGI request, and sends it to PHP-FPM. PHP-FPM hands that to PHP, your web application, and then sends the response from PHP back to Nginx. The gateway is doing a lot of work here: it's accepting requests from Nginx, sending them to PHP, and then sending the responses back to Nginx, which end up back in your browser (or however you access the application). Nginx can send requests to HTTP listeners, to FastCGI listeners, to uWSGI listeners, and it can even talk directly to Memcached, so you can do some fancy caching over HTTP.

location /static {
   try_files $uri $uri/ =404;
}

location / {
   proxy_pass http://127.0.0.1:9000;
   # note: HTTP proxying has no *_param directive; pass the app
   # environment along as a request header instead
   proxy_set_header APPENV production;
   include proxy_params;
}

This is what it looks like to proxy a request from Nginx to another application. When you have an application, Nginx sits in front of it and accepts requests, but you need a way to tell Nginx when to serve static files (e.g. your JavaScript, CSS, images, etc.) and when to send the request off to your application. A lot of the time, a setup just reserves one directory from which static files are served; in this example, that's the /static directory. Anything inside the static directory, any of its sub-directories, and their files get treated as static files. You also see try_files again here: it tries the URI as a file, then as a directory, and returns a 404 if neither is found. With this setup, all of your application's static files need to live in a static sub-directory of your web root. There are other ways to do this as well, but this is a common setup.

If the request isn't under the static directory, Nginx sends it to your application: the second location block matches all other URIs and proxy passes them to a gateway listening at localhost port 9000. When you see proxy_pass, Nginx is acting as an HTTP proxy, passing the HTTP request along to another process that's also listening over HTTP.

In the example above, the application is listening on HTTP at localhost port 9000, and you can pass it an application environment of production. This is an arbitrary key-value pair; in this case it's used to tell the application to run as production. The example also includes proxy_params, a configuration file that comes with Nginx containing extra parameters you usually want passed along when proxying requests.
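
For reference, the proxy_params file that ships with Nginx on Debian and Ubuntu typically contains just a handful of proxy_set_header lines along these lines:

# Typical contents of /etc/nginx/proxy_params on Debian/Ubuntu
proxy_set_header Host $http_host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;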

location ~ \.php$ {
   fastcgi_split_path_info ^(.+\.php)(/.+)$;

   fastcgi_pass 127.0.0.1:9000;
   # Or:
   #fastcgi_pass unix:/var/run/php5-fpm.sock;
   fastcgi_index index.php;
   fastcgi_param APPENV production;
   include fastcgi.conf;
}

PHP-FPM listens using the FastCGI protocol instead of HTTP. In the example above, it's still listening at localhost port 9000, but it won't accept a plain HTTP request; it will only accept one made in the FastCGI protocol. This is the standard way to process PHP files.

With the HTTP proxy, any URL that wasn't static was matched and passed to the application. With PHP, only URIs containing files that end in .php are matched and passed to the application. There are some extra parameters in this example that PHP specifically needs. Because this matches an actual .php file anywhere in the URI rather than any URI outside the static directory, you need the fastcgi_split_path_info directive at the top.

For instance, let's pretend you're visiting a website at some/uri/index.php/blog/post/1. If you happen to have a PHP application running at a sub-directory (some/uri/index.php), the entire URI would get passed to the FastCGI application, which is not what you want. To fix this, fastcgi_split_path_info splits the path on the PHP file, so /blog/post/1 ends up being used as the URL passed to the application. The PHP application will then think the URL is /blog/post/1 and not some/uri/index.php/blog/post/1. This is the typical setup you need for frameworks like Laravel or Symfony to route to the proper URI, so the request reaches the right controller and responds with the correct thing.
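
As an illustration of what that split produces for the example URI (the PATH_INFO line shown here is the standard way to hand the trailing part to PHP):

# For the request URI some/uri/index.php/blog/post/1, the pattern
# ^(.+\.php)(/.+)$ yields two captures:
fastcgi_split_path_info ^(.+\.php)(/.+)$;
#   $fastcgi_script_name => some/uri/index.php
#   $fastcgi_path_info   => /blog/post/1
fastcgi_param PATH_INFO $fastcgi_path_info;  # pass the trailing part to PHP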

Other than the split_path_info, the example code has another parameter to pass the environment as production for this application. Lastly, the include of fastcgi.conf at the bottom is the FastCGI analog of proxy_params: it sets up the extra information the PHP-FPM process needs to successfully fill out PHP's $_SERVER and $_ENV globals and provide server information for PHP to use. A lot of frameworks rely on that information.
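
For reference, here is a short excerpt of the kind of lines the stock fastcgi.conf contains (the real file has a couple dozen of them):

# Excerpt from the fastcgi.conf that ships with Nginx
fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
fastcgi_param QUERY_STRING    $query_string;
fastcgi_param REQUEST_METHOD  $request_method;
fastcgi_param CONTENT_TYPE    $content_type;
fastcgi_param REMOTE_ADDR     $remote_addr;
fastcgi_param SERVER_NAME     $server_name;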

location /static {
   try_files $uri $uri/ =404;
}

location / {
   uwsgi_pass 127.0.0.1:9000;
   uwsgi_param APPENV production;
   include uwsgi_params;
}

Lastly, with uWSGI it's the same pattern: Nginx takes an HTTP request, converts it to the protocol the gateway speaks (here, uwsgi), and sends it off to the gateway. The gateway then talks to the application and sends back the response.

Nginx as a Load Balancer

There are other software load balancers such as HAProxy, but Nginx makes a pretty good load balancer as well.

Here is a good configuration for load balancing:

upstream app_example {
   zone backend 64k;
   least_conn;
   server 127.0.0.1:9000 max_fails=3 fail_timeout=30s;
   server 127.0.0.1:9001 max_fails=3 fail_timeout=30s;
   server 127.0.0.1:9002 max_fails=3 fail_timeout=30s;
}

server { 
   # usual stuff omitted

   location / {
       health_check;       # note: active health checks require Nginx Plus
       include proxy_params; 

       proxy_pass http://app_example/; 
       # Handle Web Socket connections
       proxy_http_version 1.1;
       proxy_set_header Upgrade $http_upgrade;
       proxy_set_header Connection "upgrade";
   } 
}

The upstream block simply lists the servers you'll be load balancing between.

In this example, there are three web servers listening on ports 9000, 9001, and 9002 respectively. To get Nginx to balance traffic among these three servers whenever it gets a request, make an upstream and name it whatever you want; in this example, it's called app_example. The least_conn directive picks the algorithm Nginx uses to balance among the three servers: any time Nginx gets a new request, it tracks which server currently has the fewest connections and sends the request to that one.
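
If least_conn doesn't fit your application, here is a hedged sketch of two other balancing strategies Nginx supports (the upstream names are illustrative):

# Two alternative balancing strategies (illustrative names)
upstream app_hash {
   ip_hash;                        # pin each client IP to the same server
   server 127.0.0.1:9000;
   server 127.0.0.1:9001;
}

upstream app_weighted {
   server 127.0.0.1:9000 weight=2; # gets roughly twice the requests
   server 127.0.0.1:9001;          # round-robin is the default algorithm
}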

upstream app_example {
   zone backend 64k;
   least_conn;
   server 127.0.0.1:9000 max_fails=3 fail_timeout=30s;
   server 127.0.0.1:9001 max_fails=3 fail_timeout=30s;
   server 127.0.0.1:9002 max_fails=3 fail_timeout=30s;
}

Each of these three servers can fail three times before it's taken out of the rotation of servers Nginx balances between. If a server fails three times within 30 seconds, Nginx stops sending requests to it for 30 seconds, and after that it starts trying again. These extra parameters stop Nginx from sending requests to servers that may be down or broken.
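
Relatedly, here is a hedged sketch of two other per-server parameters that pair well with max_fails (the third server and its port are hypothetical):

upstream app_example {
   least_conn;
   server 127.0.0.1:9000 max_fails=3 fail_timeout=30s;
   server 127.0.0.1:9001 down;     # manually pulled from the rotation
   server 127.0.0.1:9003 backup;   # only used when the others are unavailable
}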

server { 
   # usual stuff omitted

   location / {
       health_check;       # note: active health checks require Nginx Plus
       include proxy_params; 

       proxy_pass http://app_example/; 
       # Handle Web Socket connections
       proxy_http_version 1.1;
       proxy_set_header Upgrade $http_upgrade;
       proxy_set_header Connection "upgrade";
   }
}

The health_check directive runs checks against the servers, and if it finds one has failed or isn't responding, it stops sending requests to it (active health checks like this are an Nginx Plus feature; the max_fails mechanism above is the open-source, passive alternative). This works in conjunction with the zone backend 64k line. If Nginx were running only one process, the shared zone wouldn't be needed, but typically Nginx runs as many processes as there are CPU cores. In other words, if you have a server with four CPU cores, you might run Nginx with four processes.

If there is more than one process running, you need some shared memory Nginx can use to keep track of which servers are up and running and which have failed. That's what the zone directive does: it reserves 64k of shared memory for Nginx to use for the health checks.

The include proxy_params line is just boilerplate to use any time you're proxying to applications or backend servers, and proxy_pass sends the HTTP request on to the backend servers. If you're using WebSockets, for example with Node.js, you can load balance between servers that use them by setting the HTTP version and two headers, which upgrade an HTTP request to a WebSocket connection. This is particularly handy if you're using WebSockets and need load balancing.

Nginx as a Web Cache

If you are familiar with the web cache/web accelerator Varnish, Nginx can act like it, which saves web requests from ever reaching your backend. You can save the result of a request for a static file, or even a dynamic request, in Nginx, and Nginx can serve that response directly to a client without doing the backend processing, because it's a cached response. Nginx can cache static files such as CSS and JavaScript, and it can cache dynamic requests as well.

This is pretty good for a blog. For example, the blog at fideloper.com uses fastCGI caching: since it's dynamic content that doesn't change often, it's a good use case. Nginx saves the result of whatever PHP outputs and serves that to the user's browser, which spares the server from doing any PHP processing for most of the content. The same idea extends to load balancing: just as you can save the response from a FastCGI (PHP-FPM) request, you can also save the response from a load-balanced request.

Some terminology to go along with cache servers:

  • Cache Server – The server (Nginx, here) that does the web caching and pays attention to cache headers
  • Origin Server – The server containing the actual resources the cache server caches; it determines the "cache policy"

The origin server has two responsibilities:
1. To hold the files and the application, and to be the authority on what each resource is.
2. To set the "cache policy", as sketched below. (e.g. the origin server can set rules to cache a CSS file for one month, cache an image file for one hour, and not cache HTML responses at all. The cache server has to follow those rules.)
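
As a hedged sketch of what declaring that policy can look like on the origin server (the rules mirror the example above):

# Origin server declaring its "cache policy" via response headers (illustrative)
location ~* \.css$ {
   expires 1M;                           # cache CSS for one month
}
location ~* \.(png|jpg|gif)$ {
   expires 1h;                           # cache images for one hour
}
location / {
   add_header Cache-Control "no-store";  # don't cache HTML responses
}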

When your browser sends a request to a website fronted by a cache, the cache server receives that request first and inspects its headers. It also checks whether it has a version of the requested resource in its cache.

For example, if your browser requests a CSS file and Nginx, the cache server, sees that it has that CSS file in cache, it just responds with the cached file. If it doesn't, it goes to the origin server, gets the CSS file, and sends it back to the browser. In other words, the cache server can spare the origin server from handling many requests, acting as a bandwidth and processing saver.

If the origin server does receive a request, say for a CSS file, it serves that file and tells the cache server how long to cache it, or whether it can be cached at all. If it's a request for a dynamic application, the request goes through the gateway to your code, then back through Nginx, the cache server, and finally to your browser. The only new thing here is the extra layer, the cache server, which can answer certain requests by itself.

Here is a configuration for caching:

proxy_cache_path /tmp/nginx levels=1:2 keys_zone=my_zone:10m inactive=60m;
proxy_cache_key "$scheme$request_method$host$request_uri";

server {
   # Stuff Omitted
   location / {
       proxy_cache my_zone;
       add_header X-Proxy-Cache $upstream_cache_status;
       include proxy_params;
       proxy_pass http://172.17.0.18:9000;
   }
}

/tmp/nginx is just where Nginx is going to store the cache files.

levels=1:2 controls how Nginx lays out files under that cache location: it takes the MD5 hash of the cache key and creates a couple of levels of directories based on it to save files into. You don't have to set the levels directive at all; you can let Nginx save cache files directly under /tmp/nginx.

keys_zone names the cache zone (my_zone) and reserves 10 megabytes (10m) of shared memory for the cache keys and metadata; the size of the cached files on disk is controlled separately, via the max_size parameter. If your Nginx were being used purely as a cache server, you might reserve a few gigabytes, depending on your server.

The next parameter is inactive=60m: if a file isn't requested within 60 minutes, Nginx removes it from the cache. This is Nginx giving you a way to keep only the most commonly requested files cached. So even if a file has a cache period of a month, if it's only requested once or twice a day, it won't necessarily stay in Nginx's cache. This is good for a busy site.

proxy_cache_key tells Nginx how to identify a cached file. It builds a key from the scheme (HTTP vs. HTTPS), the request_method (GET vs. POST vs. PUT), the host (e.g. serversforhackers.com), and the request_uri, then MD5-hashes that string to create the cache key. You can use other variables too. For instance, you could add a user cookie variable, which would incorporate a cookie into the cache key. The side effect is that Nginx effectively caches per user, because users typically have unique cookies. That's not a typical use of a cache server, but if the use case fits (e.g. you have content you want cached per individual user of your website), it's an option you have.
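
For instance, a hedged one-liner (the cookie name "user" is hypothetical):

# Hypothetical per-user cache key; $cookie_user reads a cookie named "user"
proxy_cache_key "$scheme$request_method$host$request_uri$cookie_user";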

Inside the server block there's another location block. The proxy_cache directive names my_zone, which is the key piece telling Nginx to cache the responses to requests made to this location block. One neat thing here is the add_header for X-Proxy-Cache, which exposes the value of $upstream_cache_status. It tells you whether a response was a cache HIT, MISS, or BYPASS, which is a neat thing to look for in the headers you get back from a web request.

Once again, include proxy_params pulls in the usual proxying boilerplate, and proxy_pass sends the request to a backend server, which here happens to be listening at that IP address on port 9000. Nginx relays the response and caches it on the way through.

In conclusion, the cache can do HTTP proxying and also save responses from FastCGI and uWSGI backends. Anything Nginx can proxy to, it can save the response from, which is pretty neat: you can cache the responses from dynamic applications, not just static files. It's really easy to change, too, since you mostly swap the word proxy for fastcgi or uwsgi, or however you proxy to your applications, and cache the responses from your applications as well as your static files.
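
As a hedged sketch of that swap for a PHP (FastCGI) backend, mirroring the proxy cache example above:

# FastCGI variant of the caching setup above (hedged sketch)
fastcgi_cache_path /tmp/nginx levels=1:2 keys_zone=my_zone:10m inactive=60m;
fastcgi_cache_key "$scheme$request_method$host$request_uri";

server {
   # Stuff Omitted
   location ~ \.php$ {
       fastcgi_cache my_zone;
       add_header X-Proxy-Cache $upstream_cache_status;
       include fastcgi.conf;
       fastcgi_pass 127.0.0.1:9000;
   }
}

The structure is identical to the HTTP proxy version; only the directive prefix changes.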
