Writing a Proxy
Draft
$Id: proxy.html,v 1.2 1997/05/02 17:23:53 adam Exp adam $
Overview
A proxy is a security tool whose intent is to enforce a
policy, usually some variant of "only messages that conform
to a protocol will be allowed to pass." This document covers
what a proxy does, and how to go about writing a useful one.
What a proxy does
A proxy enforces a part of a security policy. A good starting
security policy is that anything not explicitly permitted is
denied. The part of that policy that proxies enforce is that
only those things that are really doing what they claim to be
doing are allowed to happen.
To enforce a security policy, a proxy needs to take an
incoming connection, "open up" the packets, make a decision
about what to do with the data, and act on that decision. The
proxy unwraps the connection to make sure that the data is
what is expected. This seemingly small step ensures that if a
protocol is being violated, the program that is attacked is
one on the firewall system. Having an attack focus on the
firewall, where there should be containment tools in place,
rather than on an internal system, for which security is a
secondary function, is an appropriate separation of
responsibilities. The proxy needs to decide what to do with a
packet, and can base that decision on the contents of the
packet, on the authentication of the packet, or on rules that
control what the proxy does. For example, if a packet comes in
marked "Send to Accounts Payable" and the proxy has been
instructed never to send packets to Accounts Payable, then the
packet, as legitimate as it may be, is disallowed, because the
firewall may not communicate with Accounts Payable directly.
Once a decision has been made (and possibly logged), actions
are taken, such as repackaging the data and sending it on to
its destination.
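
In code, a rule like that can reduce to a small lookup before
anything is forwarded. The sketch below is illustrative C; the
deny-list contents and the function name are hypothetical, and a
real proxy would load its rules from a configuration file:

#include <string.h>

/* Hypothetical deny-list; a real proxy would load this from its
   configuration rather than compiling it in. */
static const char *denied_dests[] = { "accounts-payable", NULL };

/* Return 1 if policy allows forwarding to this destination. */
int destination_allowed(const char *dest)
{
    int i;
    for (i = 0; denied_dests[i] != NULL; i++)
        if (strcmp(dest, denied_dests[i]) == 0)
            return 0;   /* denied, however legitimate the packet */
    return 1;
}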
Location of proxies
Firewall proxies are normally located on a bastion host, a
tightly controlled environment maintained for security
purposes. I have seen them used in other places, near a back
end system, in order to provide additional logging and control
capabilities; this is a perfectly legitimate use, and good
security practice. It's good security practice because there
is often a 'trust but verify' stance taken towards the
firewall. Anything coming through should breeze through your
local copy of the proxy, but if there is a problem, you have a
local defensive system. Proxies should run chrooted on a
system to prevent their failure from impacting the machine on
which they run.
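
A minimal confinement sketch in C might look like the following;
the jail directory and the unprivileged uid/gid are examples, and
should be chosen to match your own bastion host:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Confine the proxy before it reads any network data. */
void confine(void)
{
    /* Example values: adjust the jail path and ids to your host. */
    if (chroot("/var/proxy/jail") != 0 || chdir("/") != 0) {
        perror("chroot");
        exit(1);
    }
    /* Drop root: group first, then user. */
    if (setgid(65534) != 0 || setuid(65534) != 0) {
        perror("drop privileges");
        exit(1);
    }
}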
Accepting a connection
Proxies should use the strongest authentication available.
Usually, that means they require two-factor authentication
(what you have, what you know) for interactive login, and use
of a secret key in a cryptographic protocol to authenticate a
machine in other circumstances. (Passwords and source machine
names are both bad ways to work. Cryptographic protocol design
is not simple, and you should avoid it. Consider using SSL,
or ANSI X9.17. See Bruce Schneier's "Why Cryptography is
Hard" essay (on www.counterpane.com) for more on this.) The
proxy should log the opening of a connection, and the result
of the authentication step.
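
A sketch of that logging in C: accept() and syslog() are the
standard calls, while authenticate() here is a placeholder for
whatever strong mechanism you settle on:

#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <syslog.h>
#include <unistd.h>

extern int authenticate(int fd);   /* placeholder */

/* Accept one connection, log it, and log the authentication
   result; returns the connected socket, or -1 on failure. */
int accept_and_log(int listen_fd)
{
    struct sockaddr_in peer;
    socklen_t len = sizeof(peer);
    int fd = accept(listen_fd, (struct sockaddr *)&peer, &len);

    if (fd < 0) {
        syslog(LOG_ERR, "accept failed: %m");
        return -1;
    }
    syslog(LOG_INFO, "connection from %s", inet_ntoa(peer.sin_addr));

    if (authenticate(fd) != 0) {
        syslog(LOG_WARNING, "authentication failed for %s",
               inet_ntoa(peer.sin_addr));
        close(fd);
        return -1;
    }
    syslog(LOG_INFO, "authentication succeeded for %s",
           inet_ntoa(peer.sin_addr));
    return fd;
}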
Examining a packet
Unwrap the connection. Figure out what's inside. Parse the
structure, walk the tree. At each step, consider what the data
here is, and whether it's what you expect. As an example, if
you're writing a packet filter, and you expect a TCP packet
(IP protocol number 6), and you find a packet of protocol
number 63, then that's not an acceptable packet. If there is
data you can't parse, that's not acceptable, either. You must
be able to parse a complete packet before sending an OK.
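
As a sketch of that kind of check for IPv4, assuming the BSD
struct ip definitions: anything short, mismatched, or of the
wrong protocol is rejected.

#include <netinet/in.h>
#include <netinet/ip.h>
#include <arpa/inet.h>
#include <stddef.h>

/* Reject anything that is not the expected protocol, and
   anything we cannot parse completely. */
int check_ip_packet(const unsigned char *pkt, size_t len)
{
    const struct ip *iph;

    if (len < sizeof(struct ip))
        return 0;                    /* too short to parse */
    iph = (const struct ip *)pkt;
    if (iph->ip_p != IPPROTO_TCP)    /* expect protocol 6, TCP */
        return 0;
    if (ntohs(iph->ip_len) != len)   /* trailing bytes we can't parse */
        return 0;
    return 1;
}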
If the protocol being proxied has debug modes, privileged
commands, or other non-standard uses, blocking them by
default at the proxy is appropriate.
If the protocol is a command-based protocol (say, FTP as
opposed to telnet), then it may be appropriate for the proxy
to make decisions about commands, refusing any command not on
an approved list. So, an FTP proxy could prohibit the SITE
command, and an HTTP proxy could block POST commands.
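
An allow-list keeps the default-deny property: a command missing
from the list, including SITE, debug commands, and anything you
have never heard of, is refused. A sketch, with a hypothetical
list of permitted FTP commands:

#include <string.h>
#include <strings.h>

/* Hypothetical allow-list; everything absent from it is refused. */
static const char *allowed_cmds[] = {
    "USER", "PASS", "CWD", "TYPE", "PASV", "RETR", "QUIT", NULL
};

int command_allowed(const char *cmd)
{
    int i;
    for (i = 0; allowed_cmds[i] != NULL; i++)
        if (strcasecmp(cmd, allowed_cmds[i]) == 0)
            return 1;
    return 0;    /* default deny */
}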
Logging
The proxy should log when it starts and stops. It should
log when a connection comes in, and the results of any
authentication step. If there is a debug mode, its
invocation is probably cause for logging. Logs are used
after a problem has happened to find out what happened.
Logs can also be used to analyze performance and usage
patterns.
Anything out of the ordinary, and any error conditions, should
be logged. It's easy to throw information away later; it's
impossible to recover information that was never logged.
Any protocol-specific proxy should have a mode in which the
controlling parts of the protocol may be logged. For example,
an FTP proxy should have a mode where the command portions of
the protocol may all be logged, along with the responses to
both client and server. This sort of mode may seem excessive,
but it is invaluable in problem diagnosis or incident
response.
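
Such a mode can be as simple as a flag checked on every line of
the control channel; a sketch in C, with illustrative names:

#include <syslog.h>

static int trace_mode = 0;   /* set from configuration */

/* Log one control-channel line, tagged with its direction
   ("C>" for client-to-proxy, "S>" for server-to-proxy). */
void trace_line(const char *dir, const char *line)
{
    if (trace_mode)
        syslog(LOG_DEBUG, "%s %s", dir, line);
}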
Repacking the data
From the server's point of view, the proxy is the client,
while from the client's point of view, the proxy is the
server. To pull this off, the data should usually be repacked
as it appeared. (Hefty use of pointers rather than copies can
make this much faster.) However, the act of repackaging can
prevent "hidden" data from being sent along. For example,
suppose the data coming in should be six C-style strings, and
each is checked for appropriate content, but there is
additional data after the sixth null. A repacking proxy will
not send that extra data along, because it copies exactly six
strings; a proxy that checks the data and passes it along
as-is will. If that additional data was placed there
maliciously, the second proxy has done the wrong thing.
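
A sketch of the repacking approach for that example: exactly six
checked strings are copied, and whatever lurked past the sixth
null never leaves the proxy. The function name and interface are
illustrative:

#include <string.h>

/* Rebuild the outgoing buffer from exactly the six strings that
   were checked.  Returns bytes packed, or -1 if they don't fit. */
int repack_strings(char *out, size_t outlen, const char *strs[6])
{
    size_t used = 0;
    int i;

    for (i = 0; i < 6; i++) {
        size_t n = strlen(strs[i]) + 1;   /* include the null */
        if (used + n > outlen)
            return -1;
        memcpy(out + used, strs[i], n);
        used += n;
    }
    return (int)used;   /* anything after the sixth null is gone */
}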
Basic structure of a proxy
/* Skeleton: parse_config(), do_listen(), authenticate(),
   parse(), package(), and send_on() stand in for your own
   routines; desthost comes from the configuration. */
#include <stdlib.h>
#include <syslog.h>
#include <unistd.h>

int main(void)
{
    int fd;
    char *data, *newdata;

    parse_config();
    fd = do_listen();            /* wait for a connection */
    data = authenticate(fd);     /* authenticate, then read */

    if (!parse(data)) {          /* fail safe on anything unexpected */
        syslog(LOG_ERR, "parse failed");
        exit(1);
    }
    newdata = package(data);     /* repack only the checked data */
    send_on(desthost, newdata);  /* forward to the real destination */
    close(fd);
    return 0;
}
Simplicity
Despite all of this, a proxy should be kept simple. This
document is long in an attempt to clearly explain the
concepts, but often translating this into practice can be
done with a small amount of code.
A word about threading: Don't. Writing a proxy is
probably something new to you, and taking advantage of the
operating system's memory protection features is a good
thing. Process-based code is simpler to debug and maintain,
and can always be converted to threads later if the
performance gain requires it. We've seen proxies written
multi-threaded to gain speed when their big speed loss was
in cryptographic activity. The threading made the product
unstable, and it was still slow. Don't use threads.
Reviews
Proxies are not incredibly complicated software, so you may be
tempted not to review the design before you start coding.
That's a mistake. Design reviews are incredibly important
parts of the process, ensuring that the code you write does
what you want. When you're mostly done with the code, a code
review is essential.
Fail Safe
The concept of fail safe is very important. If anything
goes wrong, the proxy must either go into an error handling
state or shut down. If you go into an error handling state,
you must be extremely careful not to allow information to
pass inappropriately, nor to allow state or variables to
propagate up from the error handling routines to the main
code body.
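
In practice this often reduces to a single routine that logs,
closes, and exits, so nothing computed in the error path can leak
back into the main body. A sketch, with illustrative names:

#include <stdlib.h>
#include <syslog.h>
#include <unistd.h>

/* Fail safe: record what we know, shut the connection, and stop.
   Nothing is returned, so no state propagates back to the caller. */
void fail_closed(int fd, const char *reason)
{
    syslog(LOG_ERR, "aborting connection: %s", reason);
    if (fd >= 0)
        close(fd);
    exit(1);
}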
Adam Shostack
Last modified: Wed Apr 23 09:46:43 EDT 1997