Writing a Proxy
	
	Draft
	$Id: proxy.html,v 1.2 1997/05/02 17:23:53 adam Exp adam $
	
Overview
	A proxy is a security tool whose intent is to enforce
	policies; usually that policy is some variant of "only
	messages that conform to a protocol will be allowed to
	pass."  This document covers what a proxy does, and how to
	go about writing a useful one.
What a proxy does
	A proxy enforces a part of security policy.  A good starting
	security policy is that anything not explicitly permitted is
	denied.  The part of that policy that proxies enforce is
	that only those things that are really doing what they claim
	to be doing are allowed to happen.
	To enforce a security policy, a proxy needs to take an
	incoming connection, "open up" the packets, make a decision
	about what to do with the data, and act on that decision.  The
	proxy unwraps the connection to make sure that the data is
	what is expected.  This seemingly small step ensures that if a
	protocol is being violated, the program that is attacked is
	one on the firewall system.  Having an attack focus on the
	firewall, where there should be containment tools in place,
	rather than on the internal system, for which security is a
	secondary function, is an appropriate separation of
	responsibilities.  The proxy needs to decide what to do with a
	packet, and can base that on the contents of the packet, on
	the authentication of the packet, or on rules that control
	what the proxy does.  For example, if a packet comes in marked
	"Send to Accounts Payable" and the proxy has been instructed
	never to send packets to accounts payable, then the packet, as
	legitimate as it may be, is disallowed, because the firewall
	may not communicate with accounts payable directly.  Once a
	decision has been made (and possibly logged), actions are
	taken, such as repackaging the data, and sending it on to its
	destination.
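	As a rough illustration of the rule-based decision above,
	the sketch below checks a destination against a deny list.
	The list contents and the function name destination_allowed()
	are illustrative, not part of any particular proxy.

	    #include <string.h>

	    /* Destinations the proxy must never forward to; illustrative only. */
	    static const char *denied_destinations[] = { "accounts-payable", NULL };

	    int destination_allowed(const char *dest)
	    {
	        int i;
	        for (i = 0; denied_destinations[i] != NULL; i++)
	            if (strcmp(dest, denied_destinations[i]) == 0)
	                return 0;        /* the rules say: do not forward */
	        return 1;
	    }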
Location of proxies
	    Firewall proxies are normally located on a bastion host, a
	    tightly controlled environment for security purposes.  I
	    have seen them used in other places, near a back-end
	    system, to provide additional logging and control
	    capabilities; this is a perfectly legitimate use and good
	    security practice.  It is good security practice because
	    there is often a "trust but verify" stance taken towards
	    the firewall.  Anything coming through should breeze
	    through your local copy of the proxy, but if there is a
	    problem, you have a local defensive system.  Proxies
	    should run chrooted on a system to prevent their failure
	    from impacting the machine on which they run.
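	    A minimal sketch of chrooting the proxy and dropping root
	    privileges is shown below; the directory and the
	    unprivileged user/group IDs are placeholders to adapt to
	    your system.

	    #include <stdio.h>
	    #include <stdlib.h>
	    #include <unistd.h>

	    /* Confine the proxy to an empty directory and give up root.
	       Call this early, before handling any untrusted data. */
	    static void confine(void)
	    {
	        if (chroot("/var/empty/proxy") != 0 || chdir("/") != 0) {
	            perror("chroot");
	            exit(1);             /* fail safe: refuse to run unconfined */
	        }
	        if (setgid(65534) != 0 || setuid(65534) != 0) {
	            perror("drop privileges");
	            exit(1);
	        }
	    }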
Accepting a connection
	    Proxies should use the strongest authentication available.
	    Usually, that means requiring two-factor authentication
	    (what you have, what you know) for interactive login, and
	    use of a secret key in a cryptographic protocol to
	    authenticate a machine in other circumstances.  (Passwords
	    and source machine names are both bad ways to work.
	    Cryptographic protocol design is not simple, and you
	    should avoid it.  Consider using SSL or ANSI X9.17.  See
	    Bruce Schneier's "Why Cryptography is Hard" essay, on
	    www.counterpane.com, for more on this.)  The proxy should
	    log the opening of a connection, and the result of the
	    authentication step.
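	    For example, a proxy might log the connection and the
	    authentication result through syslog, roughly as in the
	    sketch below; the peer address argument and the
	    authenticate_peer() routine are assumptions standing in
	    for your own accept and authentication code.

	    #include <stdlib.h>
	    #include <syslog.h>
	    #include <unistd.h>
	    #include <netinet/in.h>
	    #include <arpa/inet.h>

	    extern int authenticate_peer(int fd);   /* assumed, supplied elsewhere */

	    /* fd and peer come from accept() on the listening socket. */
	    void log_connection(int fd, struct sockaddr_in *peer)
	    {
	        openlog("proxy", LOG_PID, LOG_DAEMON);
	        syslog(LOG_INFO, "connection from %s", inet_ntoa(peer->sin_addr));
	        if (authenticate_peer(fd) != 0) {
	            syslog(LOG_WARNING, "authentication failed for %s",
	                   inet_ntoa(peer->sin_addr));
	            close(fd);
	            exit(1);             /* fail safe */
	        }
	        syslog(LOG_INFO, "authentication succeeded for %s",
	               inet_ntoa(peer->sin_addr));
	    }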
Examining a packet
	    Unwrap the connection.  Figure out what's inside.  Parse
	    the structure, walk the tree.  At each step, consider what
	    the data here is, and whether it's what you expect.  As an
	    example, if you're writing a packet filter, and you expect
	    a TCP packet (IP protocol number 6), and you find a packet
	    of protocol 63, then that's not an acceptable packet.  If
	    there is data you can't parse, that's not acceptable,
	    either.  You must be able to parse a complete packet
	    before sending an OK.
	    If the protocol being proxied has debug modes, privileged
	    commands, or other non-standard uses, blocking them by
	    default at the proxy is appropriate.
	    If the protocol is a command- or view-based protocol (say,
	    FTP as opposed to telnet), then it may be appropriate for
	    the proxy to make decisions about commands, not allowing
	    commands that are not on an approved list.  So, an FTP
	    proxy could prohibit the SITE command, and an HTTP proxy
	    could block POST requests.
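	    As a sketch of that kind of command screening, the code
	    below accepts only FTP commands on a short approved list;
	    the list itself is illustrative and would be driven by
	    your policy.

	    #include <strings.h>

	    /* Approved FTP commands; everything else is refused. */
	    static const char *approved[] = {
	        "USER", "PASS", "CWD", "RETR", "STOR", "LIST", "QUIT", NULL
	    };

	    int command_allowed(const char *cmd)
	    {
	        int i;
	        for (i = 0; approved[i] != NULL; i++)
	            if (strcasecmp(cmd, approved[i]) == 0)
	                return 1;
	        return 0;        /* not explicitly permitted, so denied */
	    }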
Logging
	  The proxy should log when it starts and stops.  It should
	  log when a connection comes in, and the results of any
	  authentication step.  If there is a debug mode, its
	  invocation is probably cause for logging.  Logs are used
	  after a problem has happened to find out what happened.
	  Logs can also be used to analyze performance and usage
	  patterns.
	  Anything out of the ordinary, any error conditions, should
	  be logged.  It's easy to throw information away later, but
	  impossible to recover details that were never recorded.
	  Any protocol-specific proxy should have a mode in which the
	  controlling parts of the protocol may be logged.  For
	  example, an FTP proxy should have a mode where the command
	  portions of the protocol may all be logged, along with the
	  responses to both client and server.  This sort of mode may
	  seem excessive, but is invaluable in problem diagnosis or
	  incident response.
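	  A sketch of such a mode appears below; the debug flag and
	  the log_exchange() helper are assumptions, and the length
	  limits simply keep hostile input from flooding the log.

	    #include <syslog.h>

	    static int debug_mode = 0;       /* set from a config option */

	    /* In debug mode, record each protocol command and its response. */
	    void log_exchange(const char *command, const char *response)
	    {
	        if (debug_mode)
	            syslog(LOG_DEBUG, "command: %.128s response: %.128s",
	                   command, response);
	    }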
Repacking the data
	  From the server's point of view, the proxy is the client,
	  while from the client's point of view, the proxy is the
	  server.  To pull this off, the data should usually be
	  repacked as it appeared.  (Hefty use of pointers rather than
	  copies can make this much faster.)  However, the act of
	  repackaging can prevent data that is "hidden" from being
	  sent along.  For example, suppose the data coming in should
	  be six C-style strings, each checked for appropriate
	  contents, but there is additional data after the sixth
	  null.  A repacking proxy will not send that extra data
	  along, because it copies only the six strings, while a
	  proxy that checks the data and passes it on as-is will.  If
	  that additional data was placed there maliciously, the
	  pass-through proxy has done the wrong thing.
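	  A minimal sketch of that repacking step, under the
	  assumption that the payload is exactly six C strings, might
	  look like the following; the function name and buffer
	  handling are illustrative.

	    #include <string.h>

	    /* Copy exactly six NUL-terminated strings from in to out.
	       Anything after the sixth NUL is never copied.  Returns the
	       number of bytes repacked, or 0 if the input is malformed. */
	    size_t repack_six_strings(const char *in, size_t inlen,
	                              char *out, size_t outlen)
	    {
	        size_t used = 0, copied = 0;
	        int i;
	        for (i = 0; i < 6; i++) {
	            size_t len = strnlen(in + used, inlen - used);
	            if (used + len >= inlen || copied + len + 1 > outlen)
	                return 0;        /* missing NUL or too big: fail safe */
	            memcpy(out + copied, in + used, len + 1);
	            copied += len + 1;
	            used += len + 1;
	        }
	        return copied;
	    }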
Basic structure of a proxy
	    main() {
	        parse_config();              /* read the rules the proxy enforces */
	        listen();                    /* wait for and accept a connection  */
	        authenticate();              /* log the result, pass or fail      */
	        if (TRUE != parse(data)) {
	            syslog(LOG_ERR, "unparseable data");
	            exit(1);                 /* fail safe: refuse to pass it on   */
	        }
	        newdata = package(data);     /* repack only what was checked      */
	        sendto(desthost, newdata);   /* forward to the real destination   */
	        close();
	    }
	  	    
 
Simplicity
	    Despite all of this, a proxy should be kept simple.  This
	    document is long in an attempt to clearly explain the
	    concepts, but often translating this into practice can be
	    done with a small amount of code.
	    A word about threading:  Don't.  Writing a proxy is
	    probably something new to you, and taking advantage of the
	    operating system's memory protection features is a good
	    thing.  Process-based code is simpler to debug and
	    maintain, and can always be converted to threads later if
	    the performance gain requires it.  We've seen proxies
	    written multi-threaded to gain speed when their real
	    bottleneck was cryptographic work.  The threading made the
	    product unstable, and it was still slow.  Don't use
	    threads.
Reviews
	  Proxies are not terribly complicated software, so you may
	  be tempted not to review the design before you start
	  coding.  That's a mistake.  Design reviews are an important
	  part of the process, ensuring that the code you write does
	  what you want.  When you're mostly done with the code, a
	  code review is essential.
Fail Safe
	  The concept of fail safe is very important.  If anything
	  goes wrong, the proxy must either go into an error-handling
	  state or shut down.  If you go into an error-handling state,
	  you must be extremely careful not to allow information to
	  pass inappropriately, nor to allow state or variables to
	  propagate up from the error-handling routines to the main
	  code body.
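	  One way to follow this rule is a single error routine that
	  logs and exits without ever returning, as in the sketch
	  below; the name fail() is just an illustration.

	    #include <stdarg.h>
	    #include <stdio.h>
	    #include <stdlib.h>
	    #include <syslog.h>

	    /* Log the problem, then stop.  Nothing is passed on, and no
	       state flows back to the main code body. */
	    void fail(const char *fmt, ...)
	    {
	        char msg[256];
	        va_list ap;

	        va_start(ap, fmt);
	        vsnprintf(msg, sizeof msg, fmt, ap);
	        va_end(ap);
	        syslog(LOG_ERR, "%s", msg);
	        exit(1);
	    }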
    
    Adam Shostack
Last modified: Wed Apr 23 09:46:43 EDT 1997