Apache Webserver
Performance tuning
Apache server performance
Apache server performance can be improved by adding hardware
resources such as more RAM or a faster CPU. Most of the time,
however, the same result can be achieved by custom configuration
of the server. This article looks into getting maximum
performance out of Apache with the existing hardware resources,
specifically on Linux systems. Of course, it is assumed that
there are enough hardware resources, especially enough RAM, that
the server isn't swapping frequently. The first two sections
look into various Compile-Time and Run-Time configuration
options. The Run-Time section assumes that Apache is compiled
with the prefork MPM. HTTP compression and caching are discussed
next. Finally, using separate servers for serving static and
dynamic content is discussed. Basic knowledge of compiling and
configuring Apache, and of Linux, is assumed.
Compile-Time Configuration Options
Load only the required modules
The Apache HTTP Server is a modular program where the
administrator can choose the functionality to include in the
server by selecting a set of modules. The modules can be either
statically compiled to the httpd binary or else can be compiled
as Dynamic Shared Objects (DSOs). DSO modules can either be
compiled when the server is built or be compiled and added at a
later date with the apxs utility. The module mod_so must be
statically compiled into the Apache core to enable DSO support.
Run Apache with only the required modules. This reduces the
memory footprint and hence improves server performance.
Statically compiling modules saves the RAM that is used for
supporting dynamically loaded modules, but Apache has to be
recompiled whenever a module is added or dropped. This is where
the DSO mechanism comes in handy. Once the mod_so module is
statically compiled, any other module can be added or dropped
using the LoadModule directive in the httpd.conf file. Of course,
the modules have to be compiled using apxs if they weren't
compiled when the server was built.
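As a minimal sketch of the DSO approach, an httpd.conf might load
only the modules a site actually needs. The module selection and
the modules/ path below are assumptions for illustration; actual
file names depend on how Apache was built and installed.

```apache
# Illustrative module list; adjust to what the site really uses.
# mod_so itself must be compiled in statically for LoadModule to work.
LoadModule rewrite_module modules/mod_rewrite.so
LoadModule deflate_module modules/mod_deflate.so
# Unused modules stay commented out so they are never loaded.
#LoadModule status_module modules/mod_status.so
#LoadModule autoindex_module modules/mod_autoindex.so
```

Running "httpd -l" lists the modules compiled statically into the
binary, which helps confirm that mod_so is present.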
Choose appropriate MPM
Apache server ships with a selection of Multi-Processing Modules
(MPMs) which are responsible for binding to network ports on the
machine, accepting requests, and dispatching children to handle
the requests. Only one MPM can be loaded into the server at any
time.
Choosing an MPM depends on various factors such as whether the
OS supports threads, how much memory is available, scalability
versus stability, and whether non-thread-safe third-party
modules are used. Linux systems can choose to use a threaded MPM
like worker or a non-threaded MPM like prefork:
Worker MPM uses multiple child processes. It's multi-threaded
within each child, and each thread handles a single connection.
Worker is fast and highly scalable, and its memory footprint is
comparatively low. It's well suited for multiple processors. On
the other hand, worker is less tolerant of faulty modules, and a
faulty thread can affect all the threads in a child process.
Prefork MPM uses multiple child processes, and each child
handles one connection at a time. Prefork is well suited for
single or double CPU systems, its speed is comparable to that of
worker, and it's highly tolerant of faulty modules and crashing
children. But its memory usage is high, and more traffic leads
to more memory usage.
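The difference between the two MPMs also shows up in the
directives each one reads. The sketch below assumes an Apache
2.0/2.2 style configuration, where the MPM is chosen at build
time (for example with the --with-mpm option of configure) and
only the matching block takes effect. The numbers are common
starting points, not recommendations.

```apache
# Read only when the server was built with the prefork MPM:
# one process per connection.
<IfModule prefork.c>
    StartServers          5
    MinSpareServers       5
    MaxSpareServers      10
    MaxClients          150
    MaxRequestsPerChild   0
</IfModule>

# Read only when the server was built with the worker MPM:
# several processes, each running ThreadsPerChild threads.
<IfModule worker.c>
    StartServers          2
    MaxClients          150
    MinSpareThreads      25
    MaxSpareThreads      75
    ThreadsPerChild      25
    MaxRequestsPerChild   0
</IfModule>
```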
Run-Time Configuration Options
DNS lookup
The HostnameLookups directive enables DNS lookup so that
hostnames can be logged instead of the IP address. This adds
latency to every request since the DNS lookup has to be
completed before the request is finished. HostnameLookups is Off
by default in Apache 1.3 and above. Leave it Off and use a
post-processing program such as logresolve to resolve IP
addresses in Apache's access logfiles. logresolve ships with
Apache.
When using Allow from or Deny from directives, use an IP address
instead of a domain name or a hostname. Otherwise a double DNS
lookup (a reverse lookup followed by a forward lookup) is
performed to make sure that the domain name or the hostname is
not being spoofed.
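A minimal sketch of both recommendations follows. The directory
path and the address range are hypothetical placeholders.

```apache
# Log IP addresses only; resolve them later with logresolve.
HostnameLookups Off

# Restrict access by address, not by name, so no DNS lookups
# are needed to evaluate the rule.
<Directory "/var/www/html/private">
    Order deny,allow
    Deny from all
    Allow from 192.0.2.0/24
</Directory>
```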
AllowOverride
If AllowOverride is not set to 'None', then Apache will attempt
to open the .htaccess file (as named by the AccessFileName
directive) in each directory that it visits. For example:
DocumentRoot /var/www/html
<Directory />
AllowOverride all
</Directory>
If a request is made for URI /index.html, then Apache will
attempt to open /.htaccess, /var/.htaccess, /var/www/.htaccess,
and /var/www/html/.htaccess. These additional file system
lookups add to the latency. If .htaccess is required for a
particular directory, then enable it for that directory alone.
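One way to apply this advice is to switch overrides off at the
root and enable them only where needed. The directory name
/var/www/html/app is a hypothetical example.

```apache
DocumentRoot /var/www/html

# No .htaccess lookups anywhere by default.
<Directory />
    AllowOverride None
</Directory>

# Only this one directory tree is checked for .htaccess files.
<Directory "/var/www/html/app">
    AllowOverride All
</Directory>
```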
FollowSymLinks and SymLinksIfOwnerMatch
If the FollowSymLinks option is set, then the server will follow
symbolic links in this directory. If SymLinksIfOwnerMatch is
set, then the server will follow symbolic links only if the
target file or directory is owned by the same user as the link.
If SymLinksIfOwnerMatch is set, then Apache has to issue
additional system calls to verify whether the ownership of the
link and the target file match. Additional system calls are also
needed when FollowSymLinks is NOT set. For example:
DocumentRoot /var/www/html
<Directory />
Options SymLinksIfOwnerMatch
</Directory>
For a request made for URI /index.html, Apache will perform
lstat() on /var, /var/www, /var/www/html, and
/var/www/html/index.html. These additional system calls add to
the latency. The lstat results are not cached, so they occur on
every request.
For maximum performance, set FollowSymLinks everywhere and never
set SymLinksIfOwnerMatch. Or else, if SymLinksIfOwnerMatch is
required for a directory, then set it for that directory alone.
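A sketch of that arrangement: FollowSymLinks is set globally, and
the ownership check is limited to a single, hypothetical
directory holding user-supplied content.

```apache
DocumentRoot /var/www/html

# Follow symlinks everywhere without extra lstat() calls.
<Directory />
    Options FollowSymLinks
</Directory>

# Pay for the ownership check only where it is really required.
<Directory "/var/www/html/users">
    Options -FollowSymLinks +SymLinksIfOwnerMatch
</Directory>
```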
Content Negotiation
Avoid content negotiation for fast response. If content
negotiation is required for the site, use type-map files rather
than the Options MultiViews directive. With MultiViews, Apache
has to scan the directory for files, which adds to the latency.
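As a rough illustration, MultiViews can be switched off and .var
files mapped to the type-map handler. The file names in the
comments are hypothetical; a type-map file simply lists each
variant explicitly so that no directory scan is needed.

```apache
<Directory "/var/www/html">
    # No implicit directory scan for matching variants.
    Options -MultiViews
    # Treat *.var files as type maps.
    AddHandler type-map .var
</Directory>

# A hypothetical type-map file /var/www/html/index.html.var could read:
#   URI: index.html.en
#   Content-type: text/html
#   Content-language: en
#
#   URI: index.html.fr
#   Content-type: text/html
#   Content-language: fr
```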
MaxClients
The MaxClients directive sets the limit on the number of
simultaneous requests that the server can support. No more than
this number of child processes are spawned. It shouldn't be set
so low that new connections are queued and eventually time out
while server resources are left unused. Setting it too high
will cause the server to start swapping, and the response time
will degrade drastically. An appropriate value for MaxClients
can be calculated as:
MaxClients = Total RAM dedicated to the web server / Max child process size
The child process size for serving static files is about 2-3M.
For dynamic content such as PHP, it may be around 15M. The RSS
column in "ps -ylC httpd --sort:rss" shows the non-swapped
physical memory usage by Apache processes in kilobytes.
If there are more concurrent users than MaxClients, the requests
will be queued up to a number based on ListenBacklog directive.
Increase ServerLimit to set MaxClients above 256.
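To make the formula concrete, here is a worked example under
assumed numbers: 2048M of RAM set aside for Apache and a largest
observed child size of 15M (dynamic content) or 3M (static files
only). Both figures are illustrative; measure the real child
size with ps before choosing a value.

```apache
<IfModule prefork.c>
    # Dynamic content: 2048M / 15M = 136 (rounded down).
    MaxClients  136

    # Static-only server: 2048M / 3M = 682 (rounded down).
    # That is above 256, so ServerLimit has to be raised as well
    # and is placed ahead of MaxClients:
    #ServerLimit 682
    #MaxClients  682
</IfModule>
```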
MinSpareServers, MaxSpareServers, and StartServers
MaxSpareServers and MinSpareServers determine how many idle
child processes to keep while waiting for requests. If
MinSpareServers is too low and a bunch of requests come in, then
Apache will have to spawn additional child processes to serve
the requests. Creating child processes is relatively expensive.
If the server is busy creating child processes, it won't be able
to serve the client requests immediately. MaxSpareServers
shouldn't be set too high either, since idle child processes
still consume resources.
Tune MinSpareServers and MaxSpareServers such that Apache need
not frequently spawn more than 4 child processes per second
(Apache can spawn a maximum of 32 child processes per second).
When more than 4 children are spawned per second, a message is
logged in the ErrorLog.
The StartServers directive sets the number of child server
processes created on startup. Apache will continue creating
child processes until the MinSpareServers setting is reached.
StartServers doesn't have much effect on performance if the
server isn't restarted frequently. If there are a lot of
requests and Apache is restarted frequently, set it to a
relatively high value.
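A sketch of how these three directives might be set for a
moderately busy prefork server. The values are assumptions to be
tuned against the ErrorLog, not measured recommendations.

```apache
<IfModule prefork.c>
    # Children created at startup; matters mostly after restarts.
    StartServers      10
    # Keep at least this many idle children ready for bursts.
    MinSpareServers   10
    # Idle children beyond this number are killed off.
    MaxSpareServers   20
</IfModule>
```

If the ErrorLog keeps reporting that many children are being
spawned per second, raising MinSpareServers is the first thing
to try.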
MaxRequestsPerChild
The MaxRequestsPerChild directive sets the limit on the number
of requests that an individual child server process will handle.
After MaxRequestsPerChild requests, the child process will die.
It's set to 0 by default, which means the child process will
never expire. It is appropriate to set this to a value of a few
thousand. This limits the amount of memory a process can consume
through accidental memory leaks, since the process dies after
serving a certain number of requests. Do not set it too low,
since creating new processes does have overhead.
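For example, a value in the suggested range could be set as
follows; 4000 is an arbitrary illustrative choice.

```apache
<IfModule prefork.c>
    # Recycle each child after 4000 requests so that leaked
    # memory is returned to the system periodically.
    MaxRequestsPerChild 4000
</IfModule>
```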
KeepAlive and KeepAliveTimeout
The KeepAlive directive allows multiple requests to be sent over
the same TCP connection. This is particularly useful while
serving HTML pages with a lot of images. If KeepAlive is set to
Off, then a separate TCP connection has to be made for each
image. The overhead of establishing TCP connections can be
eliminated by turning KeepAlive On.
KeepAliveTimeout determines how long to wait for the next
request. Set this to a low value, perhaps between two and five
seconds. If it is set too high, child processes are tied up
waiting for the client when they could be used for serving new
clients.
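A minimal sketch of these settings. The timeout is picked from
the range suggested above, and MaxKeepAliveRequests is shown at
an illustrative value.

```apache
# Reuse one TCP connection for a page and its images.
KeepAlive On
# Free the child quickly if the client sends nothing more.
KeepAliveTimeout 3
# Upper bound on requests served over a single connection.
MaxKeepAliveRequests 100
```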
HTTP Compression & Caching
HTTP compression is completely specified in HTTP/1.1. The server
applies the gzip or deflate encoding method to the response
payload before it is sent to the client. The client then
decompresses the payload. There is no need to install any
additional software on the client side since all major browsers
support this. Using compression saves bandwidth and improves
response time; studies have found a mean compression gain of
75.2%. HTTP compression can be enabled in Apache using the
mod_deflate module. The payload is compressed only if the
browser requests compression; otherwise uncompressed content is
served. A compression-aware browser informs the server that it
prefers compressed content through the HTTP request header
"Accept-Encoding: gzip,deflate". The server then responds with
the compressed payload and the response header set to
"Content-Encoding: gzip".
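As a sketch, mod_deflate can be limited to text content types,
which compress well; images and archives are left alone because
they are already compressed. The list of types is an
illustrative selection.

```apache
<IfModule mod_deflate.c>
    # Compress only responses with these MIME types.
    AddOutputFilterByType DEFLATE text/html text/plain text/xml text/css
</IfModule>
```

Whether compression is applied can be checked by sending a
request with an "Accept-Encoding: gzip" header and looking for
"Content-Encoding: gzip" in the response headers.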
Separate server for static and dynamic content
An Apache process serving dynamic content takes about 3M to 20M
of RAM. It grows to accommodate the content it's serving and
never shrinks until the process dies. Say an Apache process
grows to 20M to serve some dynamic content. After completing the
request, it is free to serve any other request. If a request for
an image comes in, then this 20M process is serving static
content which could as well be served by a 1M process. Memory is
used inefficiently.
Use a tiny Apache (with a minimum of modules statically
compiled) as the front-end server to serve static content.
Requests for dynamic content are forwarded to the heavy Apache
(compiled with all required modules). Using a light front-end
server has the advantage that static content is served fast
without much memory usage and only requests for dynamic content
are passed over to the heavy server.
Request forwarding can be achieved by using the mod_proxy and
mod_rewrite modules. Suppose there is a lightweight Apache
server listening on port 80 and a heavyweight Apache listening
on port 8088. Then the following configuration in the
lightweight Apache can be used to forward all requests, except
requests for images, to the heavyweight server.
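# Adjust redirect headers sent back by the backend server
ProxyPassReverse / http://%{HTTP_HOST}:8088/
RewriteEngine on
# Leave image requests to the lightweight front-end server
RewriteCond %{REQUEST_URI} !.*\.(gif|png|jpg)$
# Proxy ([P]) everything else to the backend on port 8088
RewriteRule ^/(.*) http://%{HTTP_HOST}:8088/$1 [P]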
All requests, except those for images, are proxied to the
backend server. The response is received by the frontend server
and then supplied to the client. As far as the client is
concerned, all responses seem to come from a single server.
Reducing network load
The following three modules can be used to reduce the network
load generated by Apache. Of these, only mod_gzip will have any
effect if the bandwidth bottleneck is outside of your control,
like the user's modem connection.
mod_gzip
The mod_gzip module attempts to reduce bandwidth use by
compressing data that is being sent out. If the browser claims
to accept 'gzip' encoding, files can be compressed using
Lempel-Ziv coding (LZ77), the same algorithm used by the UNIX
'gzip' command. This compression comes at a cost of processing
time on both the server and the client, however. The module
allows specifying which files are eligible for compression, so
that files that are already (partially) compressed, and which
would have little to gain from compression (like .gz files,
.jpeg files, etc.), can be skipped.
The module is especially effective when using it to compress
text-files (and thus HTML files) which are easily compressible.
The mod_gzip module and the accompanying documentation can be
found at http://www.remotecommunications.com/apache/mod_gzip/.
In Apache 2.0, mod_gzip's functionality is replaced by a new
standard module, mod_deflate, which is documented in the
standard documentation.
mod_bandwidth
Mod_bandwidth is a bandwidth throttling module, useful for
keeping the traffic of a whole Apache installation or of
specific VirtualHosts or directories in check. It allows for two
rates, one for general data and one for files larger than a
specified value, making it convenient for throttling file
downloads so the rest of the site stays responsive.
Throttling can also be done at the OS level, rather than the
application level. For throttling entire Apache installations or
IP-based virtualhosts this is often more efficient. However,
throttling a name-based virtualhost or a directory within a
virtualhost is not possible that way.
The throttling comes at the price of extra calculations on every
packet sent, and a local scratchboard to keep track of bandwidth
usage. It is a good example of a non-speed-oriented optimization.
Mod_bandwidth can be found at
http://www.cohprog.com/mod_bandwidth.html.
Documentation is included in the C source file.
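The two-rate idea can be sketched as below. The directive names
follow the documentation bundled with mod_bandwidth and should
be checked against the installed version; the directory path and
all numbers are hypothetical.

```apache
<IfModule mod_bandwidth.c>
    BandWidthModule On
    <Directory "/var/www/html/downloads">
        # General rate for all clients, in bytes per second.
        BandWidth all 10240
        # Files larger than 500 kilobytes get a lower rate.
        LargeFileLimit 500 4096
    </Directory>
</IfModule>
```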
mod_proxy
Another method to reduce traffic for a specific server and
increase the speed with which pages are served is by using a
front proxy. The standard Apache module mod_proxy can serve as a
front proxy. A front proxy keeps a cache of recently requested
pages and returns the page from that cache if at all possible.
Front proxies are only useful if the real storage of the data is
slower than the cache of the proxy. For example, frequently
requested data that resides on a remote NFS server or on a slow
disk or CD can be significantly sped up by a front proxy.
Another common use of mod_proxy (with the help of mod_rewrite)
is to split up requests to several servers, based on the URL, so
that static content is taken from one server while
auto-generated and server-intensive pages are taken from
another.
Mod_proxy is useful for more than just playing front proxy, but
it is not suited for every task. Since it was designed to proxy
for other servers, not the server it is loaded into, it can be
tricky to incorporate into existing setups. And since mod_proxy
acts as a cache, the logfiles of the actual server no longer
contain information about hits that were satisfied by the proxy.
Furthermore, since all requests have to pass through the proxy
server, it is still a bandwidth and speed bottleneck. Both
mod_proxy and the documentation for it (which includes examples
of configuring it as a front proxy) are part of the standard
Apache distribution, though the module is not compiled in by
default.
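Splitting requests by URL can also be sketched with mod_proxy
alone. The /app/ prefix and the backend address 127.0.0.1:8088
are assumptions made for this example.

```apache
# Act as a reverse (front) proxy only, never as an open forward proxy.
ProxyRequests Off

# Requests under /app/ go to the server that generates pages;
# everything else is served from the local DocumentRoot.
ProxyPass        /app/ http://127.0.0.1:8088/app/
# Rewrite redirects from the backend so clients keep talking
# to the front server.
ProxyPassReverse /app/ http://127.0.0.1:8088/app/
```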
Speeding up CGI scripts
A CGI script is any program that gets executed on demand by the
webserver, and that uses the Common Gateway Interface to
transfer information between the webserver and the browser that
made the original request. Even though compiled C programs that
use CGI are not technically scripts, they are often referred to
as CGI scripts as well. A large portion of today's web programs
are actually CGI scripts, though PHP and other HTML-embedded
server-side languages are getting more popular.
This section gives some hints on how to speed up the execution
of CGI scripts. mod_fastcgi is a general FastCGI module, which
uses FastCGI rather than normal CGI to connect to the CGI
scripts. FastCGI uses some tricks to reduce the fork/exec
overhead of a CGI script, but is not entirely backward
compatible with normal CGI. mod_php and other such
language-specific modules use language-specific information to
speed up the execution process.
FastCGI
FastCGI is a slightly more complex alternative to normal CGI.
With normal CGI, the webserver communicates with the CGI script
through environment variables, and the client browser with the
CGI script through its standard input. With FastCGI, each script
acts as a daemon, being started once and handling multiple
requests. Instead of environment variables, the server passes
all information about the request through standard input,
allowing FastCGI scripts to even be executed on different
servers, over extra TCP connections.
The FastCGI interface allows for far more efficient use of
resources, especially for oft-requested scripts, but might
require a rewrite of the script in question to work properly.
There are FastCGI API libraries for most popular languages, most
of which allow a script to be used both through CGI and FastCGI
without the need for modifications, but scripts that do not
(yet) use these APIs do need to be modified, if not rewritten.
FastCGI is best explained on its website, www.fastcgi.com, which
also contains the Apache module and the API libraries for most
languages.
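On the Apache side, a mod_fastcgi setup might look like the
following sketch. The script paths and the remote address are
hypothetical placeholders.

```apache
<IfModule mod_fastcgi.c>
    # Files ending in .fcgi are handled through FastCGI.
    AddHandler fastcgi-script .fcgi

    # Start three persistent instances of the application once,
    # instead of forking a new process for every request.
    FastCgiServer /var/www/fcgi-bin/app.fcgi -processes 3

    # Alternatively, pass requests over TCP to an application
    # that is already running on another machine:
    #FastCgiExternalServer /var/www/fcgi-bin/remote.fcgi -host 192.0.2.10:9000
</IfModule>
```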
mod_<language>
For several scripting languages (including Perl, Python, PHP,
Tcl and Ruby) there are separate interpreter modules that give
the language more control over Apache, as well as a performance
boost. In general this is done by using specific knowledge about
the language, and by keeping the language's Interpreter or
Virtual Machine hanging around, passing it scripts as they get
invoked. This avoids the execution overhead, and in some
languages the compiling phase.
Each of these modules defines a Handler, to which specific file
extensions and mime-types can be mapped, so that files of that
type automatically get parsed by the right module. Because the
interface to the scripts is far more like CGI than FastCGI, CGI
scripts often need no or little modification to work properly.
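As an illustration of such handler mappings, the sketch below
shows a classic mod_perl 1.x setup that runs existing CGI-style
Perl scripts, and the usual extension mapping for the PHP
module. The /perl/ URL prefix and the file system path are
assumptions.

```apache
# mod_perl 1.x: scripts under /perl/ run inside the embedded
# interpreter through Apache::Registry.
Alias /perl/ /var/www/perl/
<Location /perl>
    SetHandler perl-script
    PerlHandler Apache::Registry
    Options +ExecCGI
</Location>

# PHP module: files ending in .php are parsed by PHP.
AddType application/x-httpd-php .php
```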
The modules generally also allow embedding the language directly
in HTML, using special tags to indicate the start and end of
such embedded code. PHP started this trend by being specifically
designed for the task. Though the resulting file looks like an
HTML page, it should not be thought of as such: it is a script.
The performance of the resulting script is often somewhat worse
than a normal script that outputs HTML, because the HTML file
has to undergo extra parsing to extract the HTML snippets.
The Perl module, mod_perl, can be found at: perl.apache.org.
For Python there are two competing modules, mod_snake and
mod_python. mod_snake was originally written for Apache 2.0 and
has been ported to Apache 1.3, but is currently not being
maintained. mod_python provides less functionality, but is also
more lightweight because of it.
For Tcl there are actually several modules, each with their own
special focus. They can all be found under tcl.apache.org.
PHP is a language that got popular mainly because it was easily
embeddable in HTML. It is currently in its fourth incarnation
and is still undergoing improvements and expansions. PHP can be
found at www.php.net.
Ruby is another OO scripting language that is growing in
popularity, with its own language module. mod_ruby can be found
at www.modruby.net.