This article presents the benefits of using a reverse cache proxy and shows how to set one up with Nginx. A reverse cache proxy gives you a huge performance gain, allowing you to serve far more concurrent users on your landing pages!
New to Nginx?
Nginx is an open-source HTTP web server and reverse proxy server. Pronounced “Engine-Ex,” Nginx currently powers popular websites like Netflix, Box, Hulu, WordPress.com, MaxCDN, Yandex, and Dropbox.
If you’re new to Nginx, check out this article on Codementor based on the office hours hosted by Chris Fidao: PHP DevOps Tutorial: How to Use Nginx With Your Web Application
Benefits of using a Reverse Cache Proxy
How long have you been waiting for that TechCrunch article? How terrible would it be if your Web server crashed because of the traffic generated by that article?
Typically, such articles link to your landing page and can drive over 20,000 visitors. Most of these people are new visitors who have never seen your website before, and each of them triggers the following sequence of events:
- User requests landing page from Nginx
- Nginx requests webroot/index.php from PHP5-FPM
- PHP5-FPM spawns a PHP5 process to interpret webroot/index.php
- PHP5 interprets your application’s code starting from webroot/index.php
- Your application code will most probably query a database for results.
Imagine repeating this process for over 10,000 concurrent visitors!
Since all the pages generated by this process are identical, we can adopt a simple caching strategy: generate the page once, store it in memory, and serve that copy to every other visitor.
Here’s a step by step explanation of what’s happening:
- Step 1: Bob opens your website. A request is sent to your domain and gets processed by the Nginx reverse cache proxy on port 80.
- Step 2: The reverse cache proxy checks if this document is available in the cache, but since it’s not available, it forwards the request to the Nginx backend (port 8080) which then calls your typical stack (say PHP5-FPM, PHP5, your application code, MySQL, Redis)
- Step 3: The response from the web server (nginx backend) is forwarded to the reverse cache proxy.
- Step 4: The same response is served to the user.
- Step 5: All subsequent traffic is directly served from the Reverse Cache Proxy because if you go back to Step 2, the reverse cache proxy now finds the document in its cache.
Configuration and Setup
The following configuration is for Ubuntu 14.04:
Open /etc/nginx/nginx.conf using your favorite text editor and add the following lines right under the http { definition:
proxy_cache_path /var/www/cache levels=1:2 keys_zone=my-cache:8m max_size=1000m inactive=600m;
proxy_temp_path /var/www/cache/tmp;
real_ip_header X-Forwarded-For;
The first 2 lines create the cache directory structure. The real_ip_header X-Forwarded-For directive tells the backend (on port 8080) to read the client’s original IP address from the X-Forwarded-For header set by the proxy, or else all traffic would seem to come from 127.0.0.1.
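Note that real_ip_header only takes effect together with set_real_ip_from, which tells Nginx which proxy addresses to trust. A minimal sketch, assuming the proxy and backend run on the same machine:

```nginx
# Trust the X-Forwarded-For header only when the request
# comes from the local reverse cache proxy
set_real_ip_from 127.0.0.1;
real_ip_header X-Forwarded-For;
```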
Next, we need to create the virtual host under /etc/nginx/sites-available/website:
server {
    listen 80;
    server_name _;
    server_tokens off;

    location / {
        proxy_pass http://127.0.0.1:8080/;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_cache my-cache;
        proxy_cache_valid 3s;
        proxy_no_cache $cookie_PHPSESSID;
        proxy_cache_bypass $cookie_PHPSESSID;
        proxy_cache_key "$scheme$host$request_uri";
        add_header X-Cache $upstream_cache_status;
    }
}
server {
    listen 8080;
    server_name _;
    root /var/www/your_document_root/;
    index index.php index.html index.htm;
    server_tokens off;

    location ~ \.php$ {
        try_files $uri /index.php;
        fastcgi_pass 127.0.0.1:9000;
        fastcgi_index index.php;
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        include /etc/nginx/fastcgi_params;
    }

    location ~ /\.ht {
        deny all;
    }
}
Then enable the site and restart Nginx:
ln -s /etc/nginx/sites-available/website /etc/nginx/sites-enabled/website
/etc/init.d/nginx restart
The first server definition is for the reverse cache proxy running on port 80.
The second server definition is for the backend (a typical Nginx configuration, listening on port 8080 instead of 80).
proxy_pass http://127.0.0.1:8080/
forwards traffic to port 8080, where the Nginx backend is located
proxy_cache my-cache
defines which cache to use, my-cache in this case, which we added earlier in nginx.conf
proxy_cache_valid 3s
sets the cache lifetime to 3 seconds, i.e. the number of seconds before a cached entry is considered expired and purged. This number can be increased or decreased depending on how fresh the content on your website needs to be. Although 3 seconds seems very low, it makes a huge difference in the scenario discussed here: under heavy concurrent traffic, a 3-second caching time lets only one request through to your backend every 3 seconds, which is much better than having around 10,000 requests hit your backend within a short period of time.
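To put those numbers in perspective, here is a back-of-the-envelope calculation (a rough model that assumes a single cached landing page and ignores the occasional expired-cache refresh):

```shell
# With a 3-second cache, roughly one request per cached URL reaches the
# backend every 3 seconds, no matter how many visitors arrive.
burst_seconds=60        # length of the traffic spike
cache_seconds=3         # matches proxy_cache_valid 3s
visitors=10000          # concurrent visitors during the spike

backend_hits=$((burst_seconds / cache_seconds))
echo "backend requests during the spike: $backend_hits (instead of $visitors)"
```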
proxy_no_cache $cookie_PHPSESSID
forbids the reverse cache proxy from caching requests that carry a PHPSESSID cookie. Otherwise your logged-in users’ pages would end up cached and displayed to other people. If you’re using a PHP framework that uses a cookie name other than the default PHPSESSID, make sure to replace it here.
proxy_cache_bypass $cookie_PHPSESSID
instructs the proxy to bypass the cache and forward the request to the backend if the incoming request contains a PHPSESSID cookie. Otherwise logged-in users would be shown the logged-out version of the page (served from the cache).
proxy_cache_key "$scheme$host$request_uri"
defines the key used for caching. This key includes $request_uri, which is good for storing a different version of a page per URL (think different GET parameters, different content).
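As an illustration of how the key is used: Nginx names each cache file after the MD5 hash of this key and lays the files out according to the levels=1:2 setting from proxy_cache_path. A sketch of that mapping (the URL in the key is a made-up example):

```shell
# Reproduce the on-disk cache path for a given request.
# The key mirrors proxy_cache_key "$scheme$host$request_uri".
key="httpexample.com/landing?ref=news"
hash=$(printf '%s' "$key" | md5sum | awk '{print $1}')

# levels=1:2 -> the first-level directory is the last hex character
# of the hash, the second-level directory is the two characters before it
l1=$(printf '%s' "$hash" | cut -c32)
l2=$(printf '%s' "$hash" | cut -c30-31)
echo "/var/www/cache/$l1/$l2/$hash"
```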
add_header X-Cache $upstream_cache_status
can be used for debugging. It adds an X-Cache header to the response whose value is HIT if the response was served from the cache, MISS if it was served from the backend, EXPIRED if the key was found in the cache but had expired (so the request was forwarded to the backend), and BYPASS if the cache was deliberately bypassed (e.g. because of the PHPSESSID cookie). You can inspect the header in Chrome’s developer tools in the Network tab, or with curl -I.
Performance gain
Nginx is amazingly fast at serving static content. Serving content from the cache is a much easier task than spawning PHP processes, interpreting PHP libraries and executing bytecode.
Here are the benchmarks using ab (Apache Bench):
ab -n 1000 -c 100 http://127.0.0.1:80/
This command sends 1,000 requests (100 concurrent) to the reverse cache proxy (on port 80).
On the other hand, this command
ab -n 1000 -c 100 http://127.0.0.1:8080/
sends 1,000 requests (100 concurrent) to the Nginx backend (on port 8080).
We can clearly see a huge difference on the key metrics:
- It takes 0.2 seconds to serve 1,000 requests on port 80, compared to 2.5 seconds on port 8080: 12.5 times faster!
- 4,300 requests per second on port 80, compared to 400 requests per second on port 8080: 10.7 times faster!
- 23 ms per request on port 80, compared to 252 ms on port 8080: 10.9 times faster!
Comparison to PHP accelerators
Having a PHP accelerator is always recommended, although it is not very efficient in this scenario, especially when compared to a reverse cache proxy.
A PHP accelerator is a PHP extension that improves the performance of PHP scripts by caching the compiled bytecode of PHP files in shared memory to reduce the overhead of re-compiling source code on each subsequent request. The stored OpCode of a certain file is then cleared whenever its source code is changed.
PHP 5.5 ships with a built-in accelerator called OPcache, which needs to be enabled manually on some Linux distributions. On versions prior to PHP 5.5, you need to install APC instead.
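Enabling OPcache usually comes down to a few php.ini directives. A minimal sketch (the file location and the values shown here vary by distribution, so treat them as assumptions to adapt):

```ini
; Enable the built-in OPcache (in php.ini or a conf.d snippet)
opcache.enable=1
; Shared memory reserved for cached bytecode, in megabytes
opcache.memory_consumption=64
; Check file timestamps so edited scripts are recompiled
opcache.validate_timestamps=1
```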
The reason a reverse cache proxy outperforms a PHP accelerator in this scenario is that even if most of the files’ bytecode is cached, PHP5-FPM still has to launch PHP processes, which still have to execute that bytecode to produce the output. Finally, a pragmatic decision would be to set up both a reverse cache proxy and a PHP accelerator, to cover several scenarios.
Why not use Varnish?
Varnish is a reverse cache proxy that focuses solely on HTTP, whereas Nginx is mainly a web server that can also act as a reverse cache proxy, a mail proxy, and a load balancer. There are several articles on the internet discussing the differences between Nginx and Varnish, so I’ll keep it concise.
Both Nginx and Varnish do a great job at reverse cache proxying. Varnish is much more configurable than Nginx, but requires more memory and CPU. However, it’s much easier to set up Nginx as both the reverse cache proxy and the backend, without having to install anything new. Adding new software to your stack is no easy task once your infrastructure starts to grow.
Conclusion
Setting up Nginx as a reverse cache proxy is an easy task that can yield a huge performance gain in some scenarios. It can save you downtime when an article on a big news website sends a surge of traffic your way, and it will reduce your infrastructure bills.
PHP expert Codementor Jad Joubran is the co-founder and CTO of eTobb.com, an online platform connecting patients and doctors. He has managed a team of 5 developers and 2 designers and has 6+ years of experience with PHP. He was also selected as one of the Top 20 Lebanese Entrepreneurs by Executive Magazine in 2013. (Beirut, Lebanon and Silicon Valley, United States)