Apache2 Authentication against Active Directory
Apache 2 secure reverse proxy running on Debian Linux and
authenticating against Windows 2003 Server Active Directory
using secure LDAP via mod_auth_pam and pam_ldap.
Requirements
Debian 3.1 (Sarge)
Apache 2
mod_auth_pam
pam_ldap
mod_ssl
OpenLDAP
mod_python
Authentication
Apache2
In the case of this implementation, there is a single unsecured home
page in the main server, and then an IP based secure,
authenticated virtual server proxying each origin server. There
is one file created to contain the specific configuration for
each of the proxying servers within /etc/apache2/sites-enabled.
It is in these files where we configure authentication.
Providing mod_auth_pam is loaded (look in /etc/apache2/mods-enabled)
and has not been disabled elsewhere in the configuration, and
that there is no competing authentication module loaded, all
that is required is something of the form:
AuthType Basic
AuthName "Secure reverse proxy"
Require valid-user
Allow from all
This type of authentication didn't work reliably under
apache2-mpm-worker, but did under apache2-mpm-prefork. With the
worker MPM, Internet Explorer occasionally failed to load page
requisites (e.g. images and style sheets), and segmentation faults
were reported in Apache's error log. With other browsers, errors
were still logged, but no items were missing.
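AuthType Basic sends the user name and password with every request, merely base64-encoded, which is why it only makes sense here on the SSL virtual servers. A minimal Python sketch of what the browser sends and how trivially it is reversed; the function names and the sample credentials are made up for illustration:

```python
import base64


def basic_auth_header(username, password):
    """Return the Authorization header value used for HTTP Basic auth."""
    raw = ("%s:%s" % (username, password)).encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def decode_basic_auth_header(value):
    """Recover (username, password) from a Basic Authorization header value."""
    scheme, _, token = value.partition(" ")
    if scheme != "Basic":
        raise ValueError("not a Basic authorization header")
    decoded = base64.b64decode(token).decode("utf-8")
    # Only the first colon separates the two; passwords may contain colons.
    username, _, password = decoded.partition(":")
    return username, password
```

Anyone who can see the request can run the second function, so the encoding offers no protection by itself.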
PAM Configuration
Apache2's PAM configuration will be held in /etc/pam.d/apache2.
This contains:
auth required pam_ldap.so
auth required pam_caseless_listfile.so onerr=fail item=user sense=deny file=/path
account required pam_permit.so
The first line indicates that authentication against pam_ldap
is required. The second line uses a (slightly) custom module to
block certain accounts. The third line means users do not need
accounts on the system running Apache.
The (slightly) custom module is worthy of further discussion. We
use a number of accounts for system registration purposes, and
their usernames and passwords are widely known. We therefore need
to prevent them from being used for authentication here. There is
a standard module, pam_listfile, that allows us to do this, but
it uses case-sensitive username matching. We wanted
case-insensitive matching.
A quick and dirty fix requires a one-line change and a recompile. The
diff for pam_listfile.c is:
/* adapted for caseless comparison of the
content of the list file.
* This involves changing strcmp(aline,citemp) to strcasecmp(aline,citemp)
*/
390c395
< retval = strcmp(aline,citemp);
---
> retval = strcasecmp(aline,citemp);
We made the change slightly less dirty by creating a new module based
on the original.
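The difference between the two comparisons matters because Active Directory accepts a logon name in any mix of upper and lower case, so a case-sensitive deny list can be sidestepped by changing the capitalisation. A small Python sketch of the two behaviours; the function and the sample names are hypothetical, and str.lower() only approximates what strcasecmp does for plain ASCII names:

```python
def is_denied(username, deny_list, caseless=True):
    """Return True if username matches an entry in deny_list.

    caseless=False mimics the strcmp comparison in pam_listfile;
    caseless=True mimics the strcasecmp comparison in the modified module.
    """
    if caseless:
        wanted = username.lower()
        return any(entry.strip().lower() == wanted for entry in deny_list)
    return any(entry.strip() == username for entry in deny_list)


deny = ["register", "Guest"]
# Case-sensitive matching lets a differently capitalised name through.
slipped_through = not is_denied("REGISTER", deny, caseless=False)
# Caseless matching blocks it.
blocked = is_denied("REGISTER", deny)
```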
pam_ldap Configuration
The configuration for pam_ldap is held in /etc/pam_ldap.conf.
It's useful while testing to set up
debugging (but don't forget to turn it off when the system goes
into service). Therefore add lines something like:
debug 1
logdir /tmp
Leaving referrals turned on significantly slowed down authentication,
and since they are not needed in our application, add the line:
referrals no
Tell pam_ldap which domain(s) or host(s) to send requests to with the
host directive. There are multiple Domain Controllers in our
Windows domain and each has an LDAP server. We wanted to be able
to select an arbitrary one by specifying the domain name rather
than giving one or more host names. However, using secure LDAP we
found that the certificate presented by each server contained
the server's host name, and the mismatch between the domain name
and host name upset OpenLDAP. We tried setting tls_checkpeer no
to turn off certificate checking, but this didn't seem to make
any difference. In the end, we listed the host names of the
Domain Controllers (space separated). The result should be that
requests are directed to the first one in the list unless or
until there is a timeout, when the next one will be tried.
The resulting host directive looks something like:
host dc1.our.domain.name.tld dc2.our.domain.name.tld dc3.our.domain.name.tld
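The fallback behaviour expected from a space-separated host list can be sketched in Python as follows. This only illustrates the idea and is not how pam_ldap or OpenLDAP is implemented; the function name and the connect callable are assumptions made for the example:

```python
def first_responding_host(hosts, connect):
    """Try each host in order; return (host, connection) for the first success.

    `connect` is any callable that takes a host name and either returns a
    connection object or raises an exception (for example on a timeout).
    """
    errors = []
    for host in hosts:
        try:
            return host, connect(host)
        except Exception as exc:  # remember the failure and try the next host
            errors.append((host, exc))
    raise RuntimeError("no host responded: %r" % (errors,))
```

With a list such as the one in the host directive, the second Domain Controller is only contacted once the first has failed.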
The search base is set with the base directive. We have user
records potentially under any organisational unit, so the search
has to start at the top of our tree.
base dc=our,dc=domain,dc=name,dc=tld
Anonymous queries are not normally accepted by AD, so we need to set a
distinguished name and password to bind as. We couldn't make
this work with the normal way of giving a distinguished name, so we
used Microsoft's User Principal Name (UPN) format:
binddn name@our.domain.name.tld
bindpw yourpassword
The best port to use appears to be the secure port of the Global
Catalog:
port 3269
Since user records could be anywhere, the search scope needs to be
subtree:
scope sub
The object class we're interested in is User, so set a
filter:
pam_filter objectclass=User
The best login attribute to use in AD is sAMAccountName:
pam_login_attribute sAMAccountName
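To see how pam_filter and pam_login_attribute relate to the search that is sent to AD, here is a Python sketch that builds a combined filter. It assumes the two are joined in the common (&(filter)(attribute=user)) form; the function names are hypothetical, and the escaping follows the special characters listed in RFC 4515:

```python
def escape_filter_value(value):
    """Escape the characters that are special inside an LDAP filter value."""
    special = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\x00": r"\00",
    }
    return "".join(special.get(ch, ch) for ch in value)


def build_user_filter(username, pam_filter="objectclass=User",
                      login_attribute="sAMAccountName"):
    """Combine the configured filter and login attribute for one user name."""
    return "(&(%s)(%s=%s))" % (pam_filter, login_attribute,
                               escape_filter_value(username))
```

Escaping the user name keeps input such as * from matching every account in the directory.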
We're not providing for password updating, but for potential future
use, set the pam_password
directive appropriately:
pam_password ad
Turn on SSL:
ssl on
Regardless of whether we set tls_checkpeer or not, we couldn't get this
working without access to the certificate of the certificate
authority that issued the server certificates.
Getting the right certificate in the right format can be a lengthy
process. Any Windows machine attached to the domain can get this
certificate. From the Control Panel select Internet Options,
then Content, Certificates and click on the Trusted Root
Certification Authorities tab. In that list there should be one
or more certificates issued to and issued by the name of the
Windows domain or the organisation.
Select the one with the latest expiry date and click Export..., then
Next >. Select Base-64 encoded X.509 (.CER), then click Next >.
Browse to save it temporarily somewhere sensible, then make the
obvious clicks to finish.
This file needs to be transferred to the system running Apache 2 and
placed (say) under /usr/share/ca-certificates.
Tell pam_ldap to look at it using something of the form:
tls_cacertfile /usr/share/ca-certificates/ldap_server.cer
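Since exporting in the wrong format (binary DER rather than Base-64) is an easy mistake to make, a quick sanity check on the transferred file can save some head scratching. The Python sketch below only checks the Base-64 wrapper, not that the content is a valid certificate; the function name is made up for the example:

```python
import base64
import binascii

PEM_HEADER = "-----BEGIN CERTIFICATE-----"
PEM_FOOTER = "-----END CERTIFICATE-----"


def looks_like_pem_certificate(text):
    """Return True if text is a single Base-64 (PEM) wrapped certificate block."""
    lines = [line.strip() for line in text.strip().splitlines()]
    if len(lines) < 3 or lines[0] != PEM_HEADER or lines[-1] != PEM_FOOTER:
        return False
    try:
        # validate=True rejects characters outside the base64 alphabet
        base64.b64decode("".join(lines[1:-1]), validate=True)
    except (binascii.Error, ValueError):
        return False
    return True
```

To use it, read the exported file as text and pass the contents in; a binary export fails the header check straight away.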
Finally, disable SASL security so we can work with AD:
sasl_secprops maxssf=0
HTML rewriting
We use a mod_python filter to perform crude but effective regular
expression modification of URLs
in HTML and CSS files. The result is that files are rewritten as
they come through the reverse proxy.
More complex processing could be used at the cost of
performance. The performance hit could probably be offset by
caching the transformed files.
The filter looks like this:
from mod_python import apache
import re

replacements = (
    (re.compile('http://www.le.ac.uk'), 'https://wwwgate.le.ac.uk:8000'),
    (re.compile('http://www.lwms.ac.uk'), 'https://wwwgate.le.ac.uk:8001'),
    (re.compile('http:/(?!/)'), '/'),
)

def outputfilter(filter):
    # AddOutputFilter didn't seem to work for proxied requests,
    # so use SetOutputFilter and have all types come through here
    if (filter.req.content_type != 'text/html' and
            filter.req.content_type != 'text/css'):
        filter.pass_on()
    else:
        if not hasattr(filter.req, 'temp_doc'):  # the start
            filter.req.temp_doc = []  # create new attribute to hold document
            # If content-length ended up wrong, Gecko browsers truncated data, so
            if "Content-Length" in filter.req.headers_out:
                del filter.req.headers_out["Content-Length"]
        temp_doc = filter.req.temp_doc
        s = filter.read()
        while s:  # could get '' at any point, but only get None at end
            temp_doc.append(s)
            s = filter.read()
        if s is None:  # the end
            temp_doc = ''.join(temp_doc)
            for (regex, new) in replacements:
                temp_doc = regex.sub(new, temp_doc)
            # filter.req.set_content_length(len(temp_doc))  # this didn't seem to work
            filter.write(temp_doc)
            filter.close()
Development of this gave a few hitches. It is important to
understand how filters work. They can be called any number of
times and fed an arbitrary chunk of data during the processing
of a single request. The readline method does not seem to
reliably read whole lines, so line-at-a-time processing was not
as easy as it should have been. In the end, we set a temp_doc
attribute on the request, used that to buffer the entire file,
ran the regular expressions over it, and then wrote it out.
Performing the rewriting usually causes the length of the data
to change, so that the Content-Length header is no longer
correct. This can result in the browser stopping reading
before the end of the data is reached. Setting the header to the
new length didn't seem to work,
so we resorted to removing the header altogether.
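The reason for buffering the whole document can be shown without Apache at all. The sketch below is plain Python, not mod_python, and uses a made-up host name and replacement. It compares rewriting each chunk as it arrives with rewriting the joined document, for a URL that happens to be split across two chunks:

```python
import re

# Hypothetical pattern and replacement, for illustration only.
REPLACEMENTS = (
    (re.compile(r"http://www\.example\.org"), "https://gateway.example.org:8000"),
)


def rewrite(text):
    """Apply every (regex, replacement) pair to text."""
    for regex, new in REPLACEMENTS:
        text = regex.sub(new, text)
    return text


def rewrite_per_chunk(chunks):
    """Naive approach: rewrite each chunk independently as it arrives."""
    return "".join(rewrite(chunk) for chunk in chunks)


def rewrite_buffered(chunks):
    """Buffer everything first, then rewrite once at the end."""
    return rewrite("".join(chunks))


chunks = ['<a href="http://www.exa', 'mple.org/page">x</a>']
per_chunk = rewrite_per_chunk(chunks)   # URL is split, so nothing matches
buffered = rewrite_buffered(chunks)     # URL is whole again, so it is rewritten
```

The buffered version trades memory and latency for correctness, which fits the "crude but effective" approach taken by the filter.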
The module (file) containing the filter can be located anywhere
convenient, provided the containing
directory is on the Python Path. This can be done using the
PythonPath directive, best placed in
/etc/apache2/conf.d/python.conf:
PythonPath "sys.path+['/path']"
Finally, configure the filter in each reverse proxying virtual
server
with (assuming the module is called mangleurls.py) something
like:
PythonOutputFilter mangleurls MANGLEURLS
SetOutputFilter MANGLEURLS
Virtual Servers Configuration
As was said earlier, there is a virtual server for each proxied server.
The configuration specific
to each virtual server is contained in a file under
/etc/apache2/sites-enabled. The virtual servers
use different ports on the same host name for ease of adding new
ones. Care needs to be taken that no firewalls along the route
block any of the ports used. Alternatively, a new IP and name
could be used for each proxied site. Example virtual proxy
server configuration looks something like:
NameVirtualHost www.test1.tld:8000
<VirtualHost www.test1.tld:8000>
    # SSL Engine Switch:
    # Enable/Disable SSL for this virtual host.
    SSLEngine on
    SSLCertificateFile /etc/apache2/ssl/cert/server.crt
    SSLCertificateKeyFile /etc/apache2/ssl/key/server.key
    SSLCACertificatePath /etc/apache2/ssl/cacert
    ProxyRequests Off
    # for reverse proxy Off is correct
    <Proxy *>
        AuthType Basic
        AuthName "Authentication Domain Goes Here"
        Require valid-user
        Allow from all
    </Proxy>
    ProxyPass / http://www.name.tld/
    ProxyPassReverse / http://www.name.tld/
    PythonOutputFilter mangleurls MANGLEURLS
    SetOutputFilter MANGLEURLS
</VirtualHost>
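Because every proxied site gets a near-identical file that differs only in port and origin server, the files lend themselves to being generated. This is a sketch of one way to do that in Python, not something the setup described here relied on; the template simply repeats the directives from the example, and the function names and the consecutive port numbering are assumptions:

```python
VHOST_TEMPLATE = """\
NameVirtualHost {host}:{port}
<VirtualHost {host}:{port}>
    SSLEngine on
    SSLCertificateFile /etc/apache2/ssl/cert/server.crt
    SSLCertificateKeyFile /etc/apache2/ssl/key/server.key
    SSLCACertificatePath /etc/apache2/ssl/cacert
    ProxyRequests Off
    <Proxy *>
        AuthType Basic
        AuthName "{realm}"
        Require valid-user
        Allow from all
    </Proxy>
    ProxyPass / {origin}
    ProxyPassReverse / {origin}
    PythonOutputFilter mangleurls MANGLEURLS
    SetOutputFilter MANGLEURLS
</VirtualHost>
"""


def render_vhost(host, port, origin, realm):
    """Return the configuration text for one reverse proxying virtual server."""
    return VHOST_TEMPLATE.format(host=host, port=port, origin=origin, realm=realm)


def render_all(host, first_port, origins, realm):
    """Return {port: config text}, giving each origin the next port in turn."""
    return dict(
        (first_port + i, render_vhost(host, first_port + i, origin, realm))
        for i, origin in enumerate(origins)
    )
```

Each rendered text would then be written to its own file under /etc/apache2/sites-enabled.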
Other
We wished to automatically update the list of blocked accounts from a
group within AD.
We did this with a Python script using python-ldap, run as a
cron job. This turned out to be fairly easy to do:
import ldap
import sys

banned_list = []
ldap.set_option(ldap.OPT_X_TLS_REQUIRE_CERT, ldap.OPT_X_TLS_NEVER)
ldap.set_option(ldap.OPT_REFERRALS, 0)
conn = ldap.initialize("ldaps://domain.name.tld:3269")
conn.simple_bind_s('name@your.domain.name.tld', 'password')
results = conn.search_s('DN of blocked user group',
                        ldap.SCOPE_BASE, 'objectClass=*', ['member'])
if len(results) != 1:  # there can be only one!
    sys.exit("found %d results from LDAP search, expected 1" % len(results))
else:
    rec = results[0][1]
    if 'member' in rec:
        # we have a list of group member DNs, but we need their sAMAccountNames
        for dn in rec['member']:
            r = conn.search_s(dn, ldap.SCOPE_BASE, 'objectClass=*', ['sAMAccountName'])
            if len(r) == 1:
                try:
                    banned_list.append(r[0][1]['sAMAccountName'][0])
                except KeyError:
                    sys.exit("Unable to get sAMAccountName")
            else:
                sys.exit("found %d results from LDAP search, expected 1" % len(r))
for u in banned_list:
    print(u)
conn.unbind_s()
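The script above only prints the names; how they reach the file read by the list module is left to the cron job. One possible approach, sketched here as an assumption rather than a description of what was actually done, is to write the list to a temporary file and rename it into place, so that a login attempt never sees a half-written list. The function name and the file mode are choices made for this example:

```python
import os
import tempfile


def write_list_atomically(names, path):
    """Write one name per line to path, replacing any existing file in one step."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so the rename stays
    # on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as handle:
            for name in names:
                handle.write(name + "\n")
        # mkstemp creates the file readable by its owner only; this assumes
        # the process doing the PAM lookup needs read access.
        os.chmod(tmp_path, 0o644)
        os.replace(tmp_path, path)
    except Exception:
        os.unlink(tmp_path)
        raise
```

A failed LDAP query should stop the script before this point, so that an empty list never replaces a good one.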
Resources