Nginx Optimization for high traffic websites

10 Steps to Optimize Nginx for better performance

Nginx is a fast, lightweight, open-source web server and reverse proxy, and a popular replacement for the Apache web server.

Nginx and Apache comparison

Apache and Nginx are the two most common open source web servers in the world. Together, they are responsible for serving over 50% of the traffic on the internet.

  • NGINX is commonly benchmarked at roughly 2.5 times Apache's throughput for static content
  • NGINX can be deployed as a standalone web server, and as a frontend proxy for Apache and other web servers
  • NGINX – Event-driven architecture
  • Apache – process‑or‑thread‑per‑connection approach

 

The following tweaks can be applied to the Nginx configuration to get the best performance.

  • Switch from TCP to UNIX domain sockets
  • Adjust worker processes
  • Adjust worker connections
  • Setup upstream load balancing
  • Disable access log files
  • Enable GZip
  • Cache information about frequently accessed files
  • PHP-FPM pool tuning
  • /etc/sysctl.conf tuning
  • Monitoring the number of connections

  1. Switch from TCP to Unix domain sockets

UNIX domain sockets perform better than TCP sockets for local traffic: because both ends live on the same machine, the kernel can move data with fewer copies and context switches, and it skips TCP overhead such as connection negotiation, checksums, and routing.

Eg:
For a TCP socket, we point Nginx at an IP address and port:
server 127.0.0.1:8080;
To switch to a UNIX domain socket, we replace the IP:port pair with a socket file path:
server unix:/var/run/fastcgi.sock;

If you expect more than about 1000 concurrent connections per server, use TCP sockets instead, as they scale better under heavy load.
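
As a fuller sketch, a PHP location block might pass requests over the socket like this (the socket path is an example; match it to your FastCGI process manager's listen setting):

```nginx
location ~ \.php$ {
    include fastcgi_params;
    # UNIX domain socket instead of fastcgi_pass 127.0.0.1:9000;
    fastcgi_pass unix:/var/run/fastcgi.sock;
}
```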

  2. Adjust Worker Processes

A worker process is responsible for handling requests. Setting worker_processes in the Nginx config file to the number of CPU cores improves performance. Use the command below to find the core count.

cat /proc/cpuinfo | grep processor
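
For instance, a minimal sketch in nginx.conf; modern Nginx can also autodetect the core count:

```nginx
# Match workers to CPU cores; "auto" lets nginx detect the count itself
worker_processes auto;
```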

  3. Adjust Worker Connections

By default, the maximum number of simultaneous connections per worker process is 512, which lets one worker serve 512 clients; with 2 worker processes you can serve 1024 clients (worker_processes × worker_connections).
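
A sketch of raising the per-worker limit in the events block (1024 is an illustrative value; size it to your traffic and file-descriptor limits):

```nginx
events {
    # 2 workers x 1024 connections ≈ 2048 simultaneous clients
    worker_connections 1024;
}
```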

  4. Setup Upstream Load Balancing

Multiple upstream backends on the same machine produce higher throughput than a single backend. For example, if we have 1000 max children, we can divide that number across multiple backend pools according to the weight specified for each.

Eg:
upstream backend {
    server 10.1.0.101 weight=4;
    server 10.1.0.102 weight=2;
    server 10.1.0.103;
}

Here the first server is selected twice as often as the second, which again gets twice the requests compared to the third.

  5. Disable access log files

Access logging on high-traffic websites consumes significant I/O and disk space. Instead of writing each entry to disk immediately, we can buffer the log in memory: add the buffer=size parameter to the access_log directive so entries are flushed to disk only when the memory buffer fills up. If you don't need access logging at all, disable it with the following:

access_log off;
log_not_found off;
error_log /var/log/nginx-error.log warn;

If you want access logging without immediate disk writes, buffer it in memory so entries are written when the buffer fills:
access_log /var/log/nginx/access.log main buffer=16k;

  6. Enable Gzip

Compression is a huge performance accelerator: compressing responses often significantly reduces the amount of data transmitted. The level is set with the gzip_comp_level directive (e.g. gzip_comp_level 6;), but raising it too high wastes CPU cycles for little additional reduction. You can also restrict compression by content type and minimum response length, and enable it for proxied requests.
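
A sketch of those settings in the http{} context (the type list and minimum length are illustrative choices, not defaults):

```nginx
gzip on;
gzip_comp_level 6;      # 1-9; higher levels cost more CPU for little extra gain
gzip_min_length 1024;   # skip responses too small to benefit
gzip_proxied any;       # also compress responses to proxied requests
gzip_types text/css application/javascript application/json image/svg+xml;
```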

  7. Cache information about frequently accessed files

Caching helps the web server respond faster. In Nginx we can apply open_file_cache to static content; note that open_file_cache stores file metadata only, not file contents. We can also specify the validity period, the caching duration, and whether errors (such as missing files) are cached. Three techniques for caching content generated by the web server:

  • Moving content closer to users
  • Moving content to faster machines
  • Moving content off of overused machines
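
The open_file_cache settings described above might look like this in the http{} context (all values are illustrative; tune them to your file count and churn):

```nginx
open_file_cache max=10000 inactive=30s;  # cache metadata for up to 10k files
open_file_cache_valid 60s;               # revalidate cached entries every 60s
open_file_cache_min_uses 2;              # only cache files requested twice or more
open_file_cache_errors on;               # cache "not found" results as well
```
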

  8. PHP-FPM pool tuning

High-traffic websites use a lot of processor power, and the default PHP-FPM configuration can consume a great deal of RAM and CPU. You can watch the worker processes being created with the top or htop commands in Linux.

Increase PHP-FPM performance by adjusting the following four values:

  • pm.max_children: the maximum number of child processes that can be alive
  • pm.start_servers: the number of children created on startup
  • pm.min_spare_servers: the minimum number of idle children
  • pm.max_spare_servers: the maximum number of idle children

PHP-FPM can be tuned for either static or dynamic process management. With static management we set pm.max_children and pm.max_requests.

Eg:
pm = static
pm.max_children = 5

By setting pm.max_children = 5 (a fixed value) we cap the number of children that can be alive on the server. Similarly, we can kill and respawn processes after they have handled a certain number of requests by setting pm.max_requests. With dynamic process management we set pm.max_children, pm.start_servers, pm.min_spare_servers, pm.max_spare_servers, and pm.process_idle_timeout, starting with:
pm = dynamic
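
A sketch of a full dynamic pool in the pool config file (the values are assumptions for illustration, not recommended defaults):

```ini
; illustrative dynamic pool settings
pm = dynamic
pm.max_children = 50
pm.start_servers = 10
pm.min_spare_servers = 5
pm.max_spare_servers = 15
pm.process_idle_timeout = 10s
pm.max_requests = 200
```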

We can calculate the maximum number of processes with the formula below:
Total Max Processes = (Total RAM – (Used RAM + Buffers)) / (Memory per PHP process)
pm.max_requests is unlimited by default, but it is good to set a fairly low value such as 200 to avoid memory issues from long-lived processes.
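
As a worked example of the formula (the RAM figures below are assumptions for a hypothetical 8 GB server, not measurements):

```python
# Hypothetical sizing for an 8 GB server; substitute your own measurements.
total_ram_mb = 8192     # total system RAM
reserved_mb = 2048      # RAM reserved for the OS, Nginx, and buffers
per_process_mb = 60     # average memory of one PHP-FPM worker process

max_children = (total_ram_mb - reserved_mb) // per_process_mb
print(max_children)  # 102
```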

  9. /etc/sysctl.conf tuning

Cloud instances are not fully tuned out of the box, and the default kernel settings may fail under high load, so we need to tweak sysctl for maximum concurrency. Here is a list of the tweaks that can be made in the sysctl configuration file.

  • Increase system IP port limits to allow for more connections
  • Number of packets to keep in the backlog before the kernel starts dropping them
  • Increase socket listen backlog
  • Increase TCP buffer sizes
  • Ignore broadcast pings to mitigate ping-flood attacks
  • Do less swapping and use RAM
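
The list above might translate into /etc/sysctl.conf entries like these (all values are illustrative starting points; test before applying in production):

```
net.ipv4.ip_local_port_range = 1024 65535   # more ephemeral ports, more connections
net.core.netdev_max_backlog = 65535         # packets queued before the kernel drops them
net.core.somaxconn = 65535                  # larger socket listen backlog
net.ipv4.tcp_rmem = 4096 87380 16777216     # TCP read buffers: min default max
net.ipv4.tcp_wmem = 4096 65536 16777216     # TCP write buffers: min default max
net.ipv4.icmp_echo_ignore_broadcasts = 1    # ignore broadcast pings (ping-flood)
vm.swappiness = 10                          # prefer RAM over swap
```
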

  10. Monitoring the number of connections

By using CloudWatch metrics and Monitis we can monitor logs, connections, resource usage, and the number of waiting threads, and generate a notification when a threshold value is exceeded.
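
Locally, Nginx's own stub_status module gives a quick view of active and waiting connections (the listen address and path here are examples):

```nginx
server {
    listen 127.0.0.1:8081;
    location /nginx_status {
        stub_status;       # reports active, reading, writing, waiting connections
        allow 127.0.0.1;   # restrict to local access
        deny all;
    }
}
```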

Conclusion

NGINX performance optimization is essential for every business. By constantly monitoring and tweaking server resources, we can considerably reduce costs and increase server performance.

Cloud Migration using CloudEndure

CloudEndure is a highly automated lift-and-shift solution that continually replicates your source machines into a staging area in your AWS account, without downtime and without impacting performance.

Benefits of CloudEndure

  • Replication of many machines in parallel as part of large-scale migration projects
  • Non-disruptive replication and testing so business operations continue as usual.
  • Support for any source infrastructure and all applications running on supported operating systems, including databases and other write-intensive workloads
  • Highly automated orchestration minimizes project length and IT skills needed.
  • Physical machines, including both on-premises and co-location data centers
  • Virtual machines from any hypervisor, including VMware, Microsoft Hyper-V, and others
  • Replication is also supported between Regions or Availability Zones in AWS

Migration Steps

Steps to be performed on the CloudEndure portal

  1. Log in to the CloudEndure portal at https://console.cloudendure.com/ and create a new Migration project by clicking the “+” button on the upper left side > give the project a name > click “Create Project”

2. Go to Setup & Info > AWS Credentials

    • Create an IAM user in the destination AWS account with the permissions mentioned on this page
    • Paste the AWS access key ID and secret access key on this page and click SAVE

3. Go to Replication Settings

a. Migration source:- Select “Other infrastructure” if the source server is outside AWS

b. Migration Target:- Select the destination AWS region

c. Replication Servers:- Choose the following details for replication servers

        • Instance type for replication and converter servers
        • Disk type; the available options are fast SSD disks and standard disks
        • The subnet where replication servers will be launched
        • Security Groups to apply to the Replication Servers
        • Option to enable VPN connection to the source server
        • Enable/Disable disk encryption
        • Setup Tags for replication servers (Name Tag is reserved and can’t be used)
        • Enable/Disable Network Bandwidth Throttling

d. Click “Save”

4. Go to “Machines” tab in LHS

5. Copy the URL to download CloudEndure agent and the command to install the agent.

Steps to be performed on the source server

Windows

  1. Connect to Source server using RDP
  2. Open any web-browser and paste the URL to download CloudEndure agent
  3. Save the installation file and open CMD prompt in the same location and execute the command to install the agent
  4. The agent will check the number of disks attached and the total GB of data to be migrated and install the CloudEndure Service
  5. Exit from RDP connection

Linux

  1. Connect to Source server using SSH
  2. Copy the installer download line into the source server’s terminal, then execute the command from the “run the installer” section to install the agent.
  3. The agent will check the number of disks attached and the total GB of data to be migrated and install the CloudEndure Service
  4. Exit from SSH session

Steps to be performed on the CloudEndure portal

  1. Once the CloudEndure agent installation is completed in the source machine, the source machine will be shown in CloudEndure portal’s Machines section.
  2. A replication server is then created in the AWS account, and the initial data replication starts.

3. The ETA for initial data replication depends on the amount of data to be transferred and the available bandwidth.

Setup Target Server configuration (CloudEndure Portal)

  1. Click on “Machines” > Select the source server > BLUEPRINT

2. Select the following details:

    • Instance Type:
    • Launch Type: On-demand/Dedicated instance/Dedicated Host
    • Subnet
    • Security Group
    • Private IP
    • Elastic IP
    • Public IP
    • Placement Group
    • IAM Role
    • Use Existing instance ID
    • Initial target instance state: Started/Stopped
    • Tags
    • Disks: choose disk type & disk IOPS

3. Save BLUEPRINT

Launch Target Server

Once the initial data sync is completed, we can perform the launch action. There are two launch modes:

    • Test Mode: To test and verify that data is migrated successfully.
    • CutOver Mode:
    1. Before you start the cutover, open the User Console > Machines page. There, verify that each source machine you want to cut over has the following status indications:
        • DATA REPLICATION PROGRESS – Continuous Data Replication
        • ETA | LAG – n/a | none
        • STATUS – Target machine can be launched
        • MIGRATION LIFE CYCLE – Ready for Testing/ Tested/ Cutover.

2. Click Launch target Machine and select Cutover

3. Click Continue

4. Check the Cutover progress on the Job Progress window

5. Once the job is finished, get the new server details from “Machines” > “Target”

  1. Once the cutover/migration is finished remove the source server from the Migration project.

2. Click Continue