User Guide - Basic HTTP Load Balancing with HAProxy
- 20/10/2011 8:09 AM
WARNING: This user guide is out of date and in the process of being updated - if you need assistance with load balancing or high availability, please contact eApps Support and updated documentation will be provided to you.
Applicable Plans - eApps Cloud Hosting Plans
Basic Load Balancing configuration
While every scenario is different, a standard load balancing setup generally consists of three Virtual Machines: one load balancer and two web or application servers. You can also add a shared database server if your needs require it.
This documentation explains how to set up a basic load balancing configuration. If you have not already done so, please see the Load Balancing and High Availability Overview - http://support.eapps.com/apps/lb_overview for details on different load balancing approaches.
Please note - the VM(s) used for the load balancer will do nothing else. They cannot be used for a web server, a mail server, or for any other purpose.
Here are the recommended resources for the Virtual Machines. You can get more accurate resource requirements for your configuration by doing load testing and analyzing usage over time, but the recommendations below have been a successful starting point for existing customers using this configuration in our environment.
- Load Balancer - the Virtual Machine for the load balancer has to be built using the Load Balancer Appliance template. We recommend the following resources for the load balancer VM:
    - RAM: 1536MB
    - CPU Shares: 80
    - Primary Disk: 8GB
    - Data Transfer: 300GB
    - Port Speed: 5Mbps
    - IP Addresses: 2
- Web/App Server - the two Virtual Machines for the web or application servers should be built using the template and resources that best suit your needs. You can easily add additional web/app server Virtual Machines to the load balancing setup. Our recommended resources for these VMs are as follows:
    - RAM: 4096MB
    - CPU Shares: 120
    - Primary Disk: 8GB
    - Data Transfer: 25GB
    - Port Speed: 5Mbps
    - IP Addresses: 2
- DB server - in many load balancing configurations, a separate DB (database) server is shared between the two Web/App servers. This way you only have to configure and update the database in one place. You will need a Virtual Machine built with a CentOS template using these recommended resources:
    - RAM: 2048MB
    - CPU Shares: 120
    - Primary Disk: 8GB
    - Data Transfer: 25GB
    - Port Speed: 5Mbps
    - IP Addresses: 2
Before you continue with the load balancing configuration, please contact eApps Support. The second IP address that was purchased for each of the Virtual Machines will need to be configured as an internal (private) IP address, and this can only be done by eApps technicians. Once this has been done, you will receive an e-mail letting you know that you can continue on with the load balancing configuration.
Configuring the Load Balancer Virtual Machine
Configuring the haproxy service
Configuring the haproxy service with SSL
Configuring the Web and Application servers
Apache Web Server and check.txt
Java servers and check.txt
Restarting the haproxy service
Web site statistics
LogFormat, SetEnvIf, and CustomLog configuration
DNS Configuration
Load Balancer Virtual Machine
Mail server configuration
Configuring the Load Balancer Virtual Machine
In order to set up load balancing, you will need to configure HAProxy, which runs as a Linux service called haproxy on the Load Balancer VM.
To configure the haproxy service, connect to the Load Balancer VM using SSH. You will need to be able to work as the root user, and be able to navigate the Linux filesystem and use standard Linux commands. You will also need to be able to edit files in a text editor - vi, vim, and nano are available.
Before making any changes, make a note of the public and private IP addresses for all three Virtual Machines. You can find the IP addresses in the eApps Portal - click on the Virtual Machines tab, and find the Load Balancer and Web/App servers in the list. The public IP addresses will vary depending on the Cloud Zone the VM is in, but the private IP addresses will always start with 10.0... Make a note of the IPs, and which VMs they are associated with.
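If you keep your IP address notes in a small script, a quick check can catch a public and private address that were swapped. The following Python sketch is only an illustration: the names lb1, webA and webB and the helper functions are hypothetical, the addresses are the example values used later in this guide, and the "private" test simply applies the rule above that private addresses start with 10.0.

```python
import ipaddress

# Per this guide, private addresses always start with "10.0."
PRIVATE_NET = ipaddress.ip_network("10.0.0.0/16")


def kind(address):
    """Return "private" for 10.0.x.x addresses, otherwise "public"."""
    return "private" if ipaddress.ip_address(address) in PRIVATE_NET else "public"


def mislabeled(notes):
    """Return the (vm, label, address) notes whose label does not match the address."""
    return [note for note in notes if kind(note[2]) != note[1]]


# Hypothetical notes; replace with the addresses shown in the eApps Portal.
notes = [
    ("lb1", "public", "68.169.54.39"),
    ("webA", "private", "10.0.23.9"),
    ("webB", "private", "10.0.23.10"),
]

for vm, label, address in notes:
    print(f"{vm:5} {label:8} {address}")
print("mislabeled:", mislabeled(notes))
```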
Configuring the haproxy service
Connect to the load balancer Virtual Machine by using the public IP address of the VM. Log in as the root user.
$ ssh root@public_IP
root@public_IP's password:
Linux lb1 2.6.26-2-xen-amd64 #1 SMP Tue Jan 25 06:13:50 UTC 2011 x86_64
The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.
Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
lb1:~#
Change directories to /etc/haproxy. The configuration file to edit is haproxy.cfg. Make a backup copy of that file first in case you need to revert back to a default state. Then open the file in a text editor (vim is used in this example).
lb1:~# cd /etc/haproxy/
lb1:/etc/haproxy# ls -l
total 16
drwxr-xr-x 2 root root 4096 2011-04-01 14:46 errors
-rw-r--r-- 1 root root  971 2011-08-10 18:09 haproxy.cfg
-rw-r--r-- 1 root root 2696 2011-04-01 14:47 haproxy.cfg_orig
lb1:/etc/haproxy# cp haproxy.cfg{,.bck}
lb1:/etc/haproxy# vim haproxy.cfg
Make no changes to the global or defaults section of the haproxy.cfg file unless you have a very specific reason to.
Make your changes to this section of the file:
listen webfarm 192.168.114.55:80
    mode http
    stats enable
    stats auth admin:password
    balance roundrobin
    cookie JSESSIONID prefix
    option httpclose
    option forwardfor
    option httpchk HEAD /check.txt HTTP/1.0
    server webA 192.168.114.56:80 cookie A check
    server webB 192.168.114.57:80 cookie B check
    option forwardfor except 192.168.114.55
    reqadd X-Forwarded-Proto:\ https
    reqadd FRONT_END_HTTPS:\ on
Change the following lines:
listen webfarm 192.168.114.55:80
- change the IP address to the public IP address of the Load Balancer VM. Make sure the :80 port stays in place.
stats auth admin:password
- change password to a password of your choosing. This is used to log in to the HAProxy web stats page.
cookie JSESSIONID prefix
- change to: cookie SERVERID insert indirect nocache
server webA 192.168.114.56:80 cookie A check
- change the IP address to the private IP of one of the Web/App Server VMs. Make sure the :80 port stays in place.
server webB 192.168.114.57:80 cookie B check
- change the IP address to the private IP of the other Web/App Server VM. Make sure that the :80 port stays in place.
option forwardfor except 192.168.114.55
- change the IP address to the public IP of the Load Balancer VM.
Here is an example of a working section of the haproxy.cfg file with all of the changes above applied. Remember that you will need to substitute your own public and private IP addresses, and choose a different password for your Stats.
listen webfarm 68.169.54.39:80
    mode http
    stats enable
    stats auth admin:ha234%
    balance roundrobin
    cookie SERVERID insert indirect nocache
    option httpclose
    option forwardfor
    option httpchk HEAD /check.txt HTTP/1.0
    server webA 10.0.23.9:80 cookie A check
    server webB 10.0.23.10:80 cookie B check
    option forwardfor except 68.169.54.39
    reqadd X-Forwarded-Proto:\ https
    reqadd FRONT_END_HTTPS:\ on
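To make the effect of the balance roundrobin and cookie SERVERID insert lines easier to picture, here is a simplified Python model. It is not how HAProxy is implemented: the WebFarm class and its route method are hypothetical names, and the model ignores health checks and the indirect and nocache options. It only shows that a new visitor is assigned the next server in turn and receives a SERVERID cookie, while a returning visitor is sent back to the server named in that cookie.

```python
from itertools import cycle


class WebFarm:
    """Simplified model of round-robin balancing with an inserted SERVERID cookie."""

    def __init__(self, servers):
        # servers maps the cookie value to the server name, e.g. {"A": "webA"}
        self.servers = dict(servers)
        self._next = cycle(self.servers)  # round-robin over the cookie values

    def route(self, cookies):
        """Return (server_name, cookie_to_set); cookie_to_set is None for returning visitors."""
        value = cookies.get("SERVERID")
        if value in self.servers:
            # Returning visitor: stay on the server recorded in the cookie.
            return self.servers[value], None
        # New visitor: take the next server in turn and hand out its cookie value.
        value = next(self._next)
        return self.servers[value], value


farm = WebFarm({"A": "webA", "B": "webB"})
server, cookie = farm.route({})                 # first visit, no cookie yet
print(server, cookie)
print(farm.route({"SERVERID": cookie}))         # later visit with the cookie
```

In this model the cookie values A and B correspond to the cookie A and cookie B settings on the two server lines.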
Configuring the haproxy service with SSL
If your web sites or web applications are using SSL, you will need to do some additional configuration to the haproxy service and also to your web sites.
HAProxy configuration
Connect to the Load Balancer Virtual Machine, and change directories to /etc/haproxy. Then make a copy of the haproxy.cfg file for safety, and open the original haproxy.cfg file for editing. All of this is documented in Configuring the haproxy service.
Make no changes to the global or defaults section of the haproxy.cfg file unless you have a very specific reason to.
Make your changes to this section of the file:
listen webfarm 192.168.114.55:80
    mode http
    stats enable
    stats auth admin:password
    balance roundrobin
    cookie JSESSIONID prefix
    option httpclose
    option forwardfor
    option httpchk HEAD /check.txt HTTP/1.0
    server webA 192.168.114.56:80 cookie A check
    server webB 192.168.114.57:80 cookie B check
    option forwardfor except 192.168.114.55
    reqadd X-Forwarded-Proto:\ https
    reqadd FRONT_END_HTTPS:\ on
You will need to replace this entire section of the file, so that it looks like this:
listen webfarm
    bind 192.168.114.55:80,192.168.114.55:443
    mode http
    stats enable
    stats auth admin:password
    balance roundrobin
    server webA 192.168.114.56 cookie A check port 80
    server webB 192.168.114.57 cookie B check port 80
    option httpchk HEAD /check.txt HTTP/1.0
Change the following lines:
bind 192.168.114.55:80,192.168.114.55:443
- change the IP address to the public IP address of the Load Balancer VM. This line causes the VM to allow connections on both port 80 and port 443. Note the indentation for the line.
stats auth admin:password
- change password to a password of your choosing. This is used to log in to the HAProxy web stats page.
server webA 192.168.114.56 cookie A check port 80
- change the IP address to the private IP of one of the Web/App Server VMs. The port 80 setting allows haproxy to check using http.
server webB 192.168.114.57 cookie B check port 80
- change the IP address to the private IP of the other Web/App Server VM. The port 80 setting allows haproxy to check using http.
Here is an example of a working section of the haproxy.cfg file with all of the changes above applied. Remember that you will need to substitute your own public and private IP addresses, and choose a different password for your Stats.
listen webfarm
    bind 68.169.54.39:80,68.169.54.39:443
    mode http
    stats enable
    stats auth admin:ha234%
    balance roundrobin
    server webA 10.0.23.9 cookie A check port 80
    server webB 10.0.23.10 cookie B check port 80
    option httpchk HEAD /check.txt HTTP/1.0
Virtual Machine configuration
Once you have configured the haproxy service to use SSL, you will need to set up SSL on both your Virtual Machines. See the User Guide: Using SSL (SSL Certificates) - http://support.eapps.com/apps/ssl for information on SSL configuration. You will need to install your SSL certificate on both VMs so that the web site looks the same to the browser no matter which Virtual Machine answers the request.
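One possible way to confirm that both Virtual Machines present the same certificate is to compare certificate fingerprints. The Python sketch below is a rough illustration only. It assumes each Web/App Server VM answers HTTPS directly on port 443 at an address you can reach, and the function names fingerprint and same_certificate are hypothetical. It uses the standard library calls ssl.get_server_certificate and ssl.PEM_cert_to_DER_cert.

```python
import hashlib
import ssl


def fingerprint(pem_cert):
    """Return the SHA-256 fingerprint (hex) of a PEM encoded certificate."""
    der = ssl.PEM_cert_to_DER_cert(pem_cert)
    return hashlib.sha256(der).hexdigest()


def same_certificate(hosts, port=443, fetch=ssl.get_server_certificate):
    """True if every host in hosts presents a certificate with the same fingerprint.

    fetch can be replaced for testing; by default the certificate is
    downloaded from each host over the network.
    """
    prints = {fingerprint(fetch((host, port))) for host in hosts}
    return len(prints) == 1


# Example use (not run here); substitute the addresses of your own VMs:
# print(same_certificate(["web1_public_ip", "web2_public_ip"]))
```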
Configuring the Web and Application servers
The haproxy service looks for a file on each of the Web/App VMs being load balanced, called check.txt. This is referenced in three lines in the /etc/haproxy/haproxy.cfg file on the load balancer VM:
server webA 192.168.114.56:80 cookie A check
server webB 192.168.114.57:80 cookie B check
option httpchk HEAD /check.txt HTTP/1.0
The last line looks for the check.txt file in the DocumentRoot of the web site. This can be a zero size file, but it should return an HTTP Response status of 200 (Success). In other words, when you browse to the public IP of one of the Web/App Server VMs and request that file, you should receive no errors - http://public_ip/check.txt.
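The check that haproxy performs can be imitated by hand, which is useful when a server shows as down. The following Python sketch sends the same request named in the option httpchk line and reports whether the answer is a 200. It is an approximation for troubleshooting, not HAProxy's own logic, and the function names are hypothetical.

```python
import socket


def build_check_request(path="/check.txt"):
    """Return the request bytes matching 'option httpchk HEAD /check.txt HTTP/1.0'."""
    return f"HEAD {path} HTTP/1.0\r\n\r\n".encode("ascii")


def parse_status(response):
    """Extract the numeric status code from a raw HTTP response."""
    status_line = response.split(b"\r\n", 1)[0].decode("ascii", "replace")
    parts = status_line.split()
    if len(parts) < 2 or not parts[0].startswith("HTTP/") or not parts[1].isdigit():
        raise ValueError(f"not an HTTP status line: {status_line!r}")
    return int(parts[1])


def check_backend(host, port=80, path="/check.txt", timeout=3.0):
    """Return True if the server answers the HEAD request with status 200."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(build_check_request(path))
            response = sock.recv(4096)
        return parse_status(response) == 200
    except (OSError, ValueError):
        # Connection refused, timeout, or a reply that is not HTTP.
        return False


# Example use from the Load Balancer VM (not run here):
# print(check_backend("10.0.23.9"), check_backend("10.0.23.10"))
```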
Apache Web Server and check.txt
To create the check.txt file, log in to the Web/App Server Virtual Machines via SSH, and then change directories to the DocumentRoot of the web site. Using the touch command, create the file, and then set it to the correct owner and group with the chown command.
Remember, this is just an example. Your actual DocumentRoot and directory names will be different.
[root@web1 ~]# cd /home/webadmin/web1.eapps-example.com/html/
[root@web1 html]# touch check.txt
[root@web1 html]# ll
total 36
-rw-r--r-- 1 webadmin webadmin   43 Aug 10 18:40 1.gif
-rw-r--r-- 1 root     root        0 Aug 12 13:48 check.txt
-rw-r--r-- 1 webadmin webadmin 1575 Aug 10 18:40 index.shtml
-rw-r--r-- 1 webadmin webadmin  127 Aug 10 18:40 web_site_lb.gif
-rw-r--r-- 1 webadmin webadmin 7319 Aug 10 18:40 web_site_lemon_tree.jpg
-rw-r--r-- 1 webadmin webadmin  129 Aug 10 18:40 web_site_lt.gif
-rw-r--r-- 1 webadmin webadmin  127 Aug 10 18:40 web_site_rb.gif
-rw-r--r-- 1 webadmin webadmin  130 Aug 10 18:40 web_site_rt.gif
[root@web1 html]# chown webadmin:webadmin check.txt
[root@web1 html]#
You will need to do this on each of the VMs being load balanced.
Java servers and check.txt
If you are using a Java application server such as Tomcat, JBoss, or GlassFish, then you will need to add a line to the Edit Directives or the VirtualHost block for the web site so that the check.txt file can be found by the Apache web server, while all other content goes to the Java server. You can do this with either mod_jk or mod_proxy_ajp.
mod_jk
For mod_jk, add this line to Edit Directives or the VirtualHost block for the web site:
SetEnvIf Request_URI "/check.txt*" no-jk
You will need to make sure that mod_jk is installed.
mod_proxy_ajp
For mod_proxy_ajp, add this line to Edit Directives or the VirtualHost block for the web site:
ProxyPass /check.txt !
Since mod_proxy_ajp is an Apache module, no additional software needs to be installed.
You will need to do this on each of the Web/App VMs.
Restarting the haproxy service
Once the configuration is complete, you will need to restart the haproxy service so that your changes are read.
However, don’t restart the service until you have finished the configuration of your web site or web application on your load balanced Virtual Machines. If you restart haproxy before that, you will receive errors because the haproxy service will not be able to “see” the Web/App server VMs.
The haproxy service is restarted with the /etc/init.d/haproxy restart command.
lb1:~# /etc/init.d/haproxy restart
Restarting haproxy: haproxy.
lb1:~#
If you receive any errors when restarting the haproxy service, go back and recheck your configuration. Generally the error messages give a good indication as to what the actual problem is. If you see errors about “webfarm”, make sure that you can ping the private IP addresses for the load balanced VMs from the HAProxy VM.
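As a complement to ping, the Python sketch below reads the server lines from a haproxy.cfg file and tries to open a TCP connection to each one. Note that this tests the web server port rather than ICMP ping, so it is a different (and somewhat stricter) check than the one described above. The function names are hypothetical, and the parser only understands the simple server lines shown in this guide.

```python
import socket


def parse_servers(cfg_text):
    """Return (name, host, port) for every 'server' line; port is None if not given."""
    servers = []
    for line in cfg_text.splitlines():
        words = line.split()
        if len(words) >= 3 and words[0] == "server":
            host, _, port = words[2].partition(":")
            servers.append((words[1], host, int(port) if port else None))
    return servers


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example use on the Load Balancer VM (not run here):
# with open("/etc/haproxy/haproxy.cfg") as cfg:
#     for name, host, port in parse_servers(cfg.read()):
#         print(name, host, can_connect(host, port or 80))
```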
This completes the basic load balancing configuration. You will still need to configure your web sites or web applications on the Web/App Servers.
Load Balancer Statistics
There is a web interface for the load balancer that shows the statistics of the Web/App Server Virtual Machines. You can access this by browsing to the public IP of the Load Balancer Virtual Machine, and appending the location of the stats page (/haproxy?stats) to the URL: http://public_ip/haproxy?stats
Log in as admin, and use the password you set for this page. This password is found in the haproxy.cfg file, in this line:
stats auth admin:password
The page shows you, among other things, whether the load balanced servers (webA and webB) are up or down, how long they have been up or down, and how many bytes of traffic have gone in or out. Click the Refresh now link under the Display option: heading to refresh the statistics.
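If you would rather read the server status from a script than from the web page, HAProxy can also return the statistics as CSV when ;csv is appended to the stats URL. That suffix is not covered by this guide, so treat it as an assumption to verify against your HAProxy version. The Python sketch below uses hypothetical function names, and the SAMPLE text is an abbreviated, made-up illustration of the CSV layout (the real output has many more columns).

```python
import base64
import csv
import io
import urllib.request


def fetch_stats_csv(host, user, password, timeout=5.0):
    """Download the statistics as CSV using the login from 'stats auth'."""
    request = urllib.request.Request(f"http://{host}/haproxy?stats;csv")
    token = base64.b64encode(f"{user}:{password}".encode()).decode("ascii")
    request.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", "replace")


def server_status(stats_csv):
    """Map 'proxy/server' to its status column, skipping the summary rows."""
    text = stats_csv.lstrip("# ")  # the header line starts with "# "
    rows = csv.DictReader(io.StringIO(text))
    return {
        f"{row['pxname']}/{row['svname']}": row["status"]
        for row in rows
        if row["svname"] not in ("FRONTEND", "BACKEND")
    }


# Abbreviated, made-up sample for illustration only.
SAMPLE = (
    "# pxname,svname,scur,status,\n"
    "webfarm,FRONTEND,1,OPEN,\n"
    "webfarm,webA,0,UP,\n"
    "webfarm,webB,0,DOWN,\n"
    "webfarm,BACKEND,0,UP,\n"
)
print(server_status(SAMPLE))

# Example use (not run here):
# print(server_status(fetch_stats_csv("public_ip", "admin", "password")))
```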
Web site statistics
Because the load balancer uses HAProxy, which is a proxy and not a packet forwarder, the source of the HTTP requests to the web server will be from the private IP address of the Load Balancer VM, instead of the IP addresses of the actual visitors who are accessing the site.
If you are trying to track site visitors, then you will need to change the way that access to the web sites is logged. What needs to change is the default LogFormat and the default CustomLog, along with the addition of an environment variable to tell the web server not to log the requests for the check.txt file. All of these changes are made in the Edit Directives for the web sites.
You will need to do this on every Web/App Server Virtual Machine.
LogFormat, SetEnvIf, and CustomLog configuration
LogFormat
The default LogFormat used by the Apache web server configuration file (/etc/httpd/conf/httpd.conf) logs (among other things) the remote host (the IP address the request came from), the date and time of the request, the Referer (the page that linked to the request) and the User Agent (the browser being used). Since the remote host will always be the private IP address of the HAProxy VM, the LogFormat must be changed so that the IP address of the actual client making the request is logged instead. HAProxy passes that address to the web server in a request header called “X-Forwarded-For”.
Add this line to the Edit Directives for each web site being load balanced:
LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
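To illustrate what the new log lines look like to a reporting script, here is a small Python sketch that pulls the visitor address out of a line written with the LogFormat above. The regular expression, the function name client_ip and the sample line are all hypothetical. The sketch also allows for the header holding several comma separated addresses, in which case it takes the first one, and it does not handle escaped quotes inside the logged fields.

```python
import re

# Field order follows the LogFormat above:
# X-Forwarded-For, %l, %u, %t, "%r", %>s, %b, "Referer", "User-Agent"
LOG_LINE = re.compile(
    r'^(?P<forwarded_for>.+?) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def client_ip(line):
    """Return the first address in the X-Forwarded-For field, or None."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    first = match.group("forwarded_for").split(",")[0].strip()
    # Apache writes "-" when the header was not present.
    return None if first == "-" else first


# Made-up sample line for illustration only.
sample = (
    '198.51.100.24 - - [12/Aug/2011:13:48:02 -0400] '
    '"GET /index.shtml HTTP/1.1" 200 1575 "-" "Mozilla/5.0"'
)
print(client_ip(sample))
```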
SetEnvIf
To keep the requests from the HAProxy VM for the check.txt file out of the logs, add this line to the Edit Directives for each web site being load balanced:
SetEnvIf Request_URI "^/check\.txt$" dontlog
CustomLog
The existing CustomLog line needs to be edited so that it will read the SetEnvIf. By default, the CustomLog line will look like this:
CustomLog "/home/webadmin/web1.eapps-example.com/access_log" "combined"
Change it so that it looks like this:
CustomLog "/home/webadmin/web1.eapps-example.com/access_log" "combined" env=!dontlog
Make sure to use your existing CustomLog, which will have your correct hostname.
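The combined effect of the SetEnvIf and CustomLog lines can be sketched in a few lines of Python. This is only a model of the decision, using Python's re module in place of Apache's own regular expression engine, and the function name is_logged is hypothetical. A request is written to the access log unless its URI matches the dontlog pattern exactly.

```python
import re

# Same pattern as in the SetEnvIf line above.
DONTLOG = re.compile(r"^/check\.txt$")


def is_logged(request_uri):
    """Model of 'env=!dontlog': log the request unless the URI matches the pattern."""
    return DONTLOG.search(request_uri) is None


for uri in ("/check.txt", "/index.shtml", "/app/check.txt"):
    print(uri, "logged" if is_logged(uri) else "not logged")
```

Because the pattern is anchored with ^ and $, only requests for /check.txt in the DocumentRoot are left out of the log.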
These configuration changes will need to be added to the Edit Directives for the virtual host in the Control Panel. To do this, go to System > Website Management and click on the Server Name for the web site being load balanced.
This takes you to the Website Management screen. Click on Edit Directives, and add the new logging directives to the end of the file.
Once you have made your changes, click Save, and then click Apply Changes at the top of the screen.
You can now install an application like AWStats or Google Analytics on each Virtual Machine being load balanced in order to capture web site visits.
DNS Configuration
Load Balancer Virtual Machine
The domain name that is being used for the web sites or web applications being load balanced needs to be created as a DNS entry, with the DNS A record pointing to the IP address of the Load Balancer Virtual Machine.
If you are using eApps DNS, you will do this configuration in the DNS Manager in the eApps Portal. Please see the User Guide: DNS Manager - http://support.eapps.com/portal/dns for more information if necessary.
If you are not using eApps DNS, then you will need to contact your DNS provider if you need assistance adding an A record for your domain.
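Once the A record has been added, you can check from any computer that the domain name resolves to the Load Balancer VM. The Python sketch below is one way to do that with the standard library. The function names are hypothetical, the result depends on the DNS resolver of the machine running it, and a recently changed record may take some time to become visible.

```python
import socket


def resolve_a_records(hostname):
    """Return the sorted IPv4 addresses the local resolver reports for hostname."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    return sorted({info[4][0] for info in infos})


def points_to_load_balancer(hostname, lb_public_ip, resolver=resolve_a_records):
    """True if hostname resolves to exactly the public IP of the Load Balancer VM."""
    try:
        return resolver(hostname) == [lb_public_ip]
    except OSError:
        # Covers lookup failures such as a name that does not exist yet.
        return False


# Example use (not run here); substitute your own domain and IP address:
# print(points_to_load_balancer("www.eapps-example.com", "68.169.54.39"))
```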
Mail server configuration
If you are going to receive e-mail at the domain you are using for the web site or application, you will need to configure one or both of the application server Virtual Machines to be the MX (Mail eXchanger) for the domain. See the Email Handling - MX section of the User Guide: DNS Manager for more information - http://support.eapps.com/portal/dns#email_handling_mx.
eApps also offers the Enterprise Email Service, an Exchange compatible e-mail service using the Zimbra Collaboration Server. More information can be found here: Enterprise Email and Collaboration Services. This will allow you to host your e-mail at eApps, but not directly on your Web/App servers.
If you wish to host your e-mail at a third party, you can use a service like Google Apps - http://support.eapps.com/apps/google_mx.
Links to other information
HAProxy main site - http://haproxy.1wt.eu/
HAProxy Documentation - http://haproxy.1wt.eu/#docs
eApps Portal User Guides - http://support.eapps.com/portal_docs
Control Panel User Guides - http://support.eapps.com/eapps_control_panel
Application User Guides - http://support.eapps.com/apps_docs