Shostack + Friends Blog Archive

 

Code/Data Separation

As I mentioned in my “Blue Hat Report,” I want to expand on one of the answers I gave to a question there. My answer involved better separation of code and data. I’ve since found, in talking to a variety of folks, that the concept is not as obvious as it seems to me.

[Image: macro-dialog.jpg]
The basic idea is that when opening a document, a program has to decide how to treat the various bits of it. When the bits are jumbled together, it’s harder to make the right decisions. It’s also harder to write security wrappers that parse for things like JavaScript or Office document macros when those can be scattered throughout the document. The parser needs to understand the whole document, the way the receiver will, rather than just the code parts.

So, to separate code and data the way we’ve separated presentation and data into CSS and HTML, we should give serious thought to breaking out an HTML ‘script’ section. Yes, this would be hard: it involves standardization, and there’s a huge backward-compatibility issue to deal with. But it seems to me that a separate script section would mostly or completely break cross-site scripting attacks.
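To make that concrete, here is a minimal sketch in Python of what a filter could do if such a rule existed. The policy is an assumption for illustration only: script is legal in the document head and nowhere else, so any script element in the body, any event-handler attribute, and any javascript: URL counts as a violation. The names InlineCodeFinder and find_inline_code are hypothetical.

```python
from html.parser import HTMLParser


class InlineCodeFinder(HTMLParser):
    """Collect places where code appears outside the one allowed script section."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.violations = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        # Assumed policy: <script> is only legal inside <head>.
        if tag == "script" and not self.in_head:
            self.violations.append(("script element in body", self.getpos()))
        for name, value in attrs:
            # Attributes such as onclick or onload mix code into markup.
            # (Crude: this flags every attribute whose name starts with "on".)
            if name.startswith("on"):
                self.violations.append((f"inline handler {name}", self.getpos()))
            # javascript: URLs are another place code hides inside data.
            elif value and value.strip().lower().startswith("javascript:"):
                self.violations.append((f"javascript: URL in {name}", self.getpos()))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def find_inline_code(html_text):
    """Return a list of (description, (line, column)) policy violations."""
    finder = InlineCodeFinder()
    finder.feed(html_text)
    finder.close()
    return finder.violations
```

The point of the sketch is how little the filter has to know: it never interprets any script, it only checks where script is allowed to be.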

Similarly, with MS Office moving to an XML data format, it would be great to have an explicit “macros” setting at the top of the document. (I haven’t checked to see where macros can occur in the current definition, but my belief is they can be scattered through the file.) [Update: See Kevin Boske’s comment; apparently Microsoft is doing this.]
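The check Kevin describes in his comment (a file with a macro-free extension that nonetheless carries a VBA project gets refused) could be approximated outside the application roughly as follows. This is a sketch under assumptions, not Office’s actual logic: it assumes the new formats are ZIP packages and that the VBA project is stored in a part whose name ends in vbaProject.bin. The function names are hypothetical.

```python
import io
import zipfile

# Assumption: the VBA project is a package part whose name ends with this.
VBA_PART_SUFFIX = "vbaproject.bin"
# Extensions that promise "no macros here".
MACRO_FREE_EXTENSIONS = (".docx", ".xlsx", ".pptx")


def has_vba_part(package):
    """Return True if the ZIP package lists a VBA project part."""
    with zipfile.ZipFile(package) as zf:
        return any(name.lower().endswith(VBA_PART_SUFFIX) for name in zf.namelist())


def should_reject(filename, package):
    """Reject a file that claims to be macro-free but carries a VBA part."""
    claims_macro_free = filename.lower().endswith(MACRO_FREE_EXTENSIONS)
    return claims_macro_free and has_vba_part(package)


# Usage, with an in-memory package standing in for a real file:
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("xl/workbook.xml", "<workbook/>")
    zf.writestr("xl/vbaProject.bin", b"\x00")
buf.seek(0)
print(should_reject("report.xlsx", buf))  # True: macro-free name, VBA part inside
```

Note that the check only reads the package’s list of parts. It never has to parse the workbook itself, which is exactly what a known, fixed location for code buys you.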

Several years back, I had a conversation with the person responsible for macro security in Office. I really wanted the “tell me more” link in the macro warning dialog to point not to the help, but to either a static analysis of the macros or their content. The conversation convinced me that this was a great idea for a few hundred, or maybe even a few thousand, people, but I was unable to suggest a dialog box that would give a typical user useful context and data for making the decision.

If macros were at the top of the XML, then I could do what I really wanted to do: read the macros myself before opening the document. (I don’t trust that “disable macros” is foolproof.) If I were writing a document firewall, I could make it faster and more effective.
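Here is a rough sketch of what “read the macros first” could look like. The format is entirely hypothetical: it assumes a document whose root element’s first child is a macros element containing one macro element per macro. No real Office format is implied. The reader pulls parse events one at a time and stops as soon as the macros section closes, so it never builds a tree for the document body.

```python
import io
import xml.etree.ElementTree as ET


def read_leading_macros(stream):
    """Return macro texts from a hypothetical leading <macros> section."""
    macros = []
    depth = 0
    for event, elem in ET.iterparse(stream, events=("start", "end")):
        if event == "start":
            depth += 1
            # The first child of the root must be <macros>; if it is anything
            # else, this format says the document has no macros at all.
            if depth == 2 and elem.tag != "macros":
                return macros
        else:
            depth -= 1
            if elem.tag == "macro":
                macros.append(elem.text or "")
            elif elem.tag == "macros" and depth == 1:
                return macros  # stop here; the body is never examined
    return macros


# Usage with an in-memory document:
doc = b"<document><macros><macro>MsgBox 1</macro></macros><body><p>text</p></body></document>"
print(read_leading_macros(io.BytesIO(doc)))  # ['MsgBox 1']
```

A document firewall built this way only needs to understand the macros section. Under the assumed format, anything that looks like a macro further down is by definition not code, so the firewall does not need to look for it.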

One final point: Separating code and data allows the parsers to be smaller and more modular, which makes them faster and more reliable.

By separating code and data, not only do you gain security, but you gain performance and reliability. The sooner we start dealing with the backward-compatibility issues, the better off we’ll be.

5 comments on "Code/Data Separation"

  • beth says:

    Well, you say it’s been years since you had that conversation with the macro security person, I think the time is over-ripe…what do THEY think, or what are they prepared to do?

  • Adam says:

    They’re open to suggestions. I said, “here’s the problem as I see it,” they said “here’s the problem as we see it, how can we solve it?”

  • Kevin Boske says:

    Adam, In Office “12” we address this problem with our new file formats for Word, PowerPoint and Excel by explicitly disallowing VBA in our default formats (“macro-free”) and only allowing VBA in our “macro-enabled” alternative formats. The files are differentiated based on their extension and content type. For example, a macro-free Excel Workbook will carry the extension “.xlsx”, whereas a macro-enabled workbook’s extension is “.xlsm”. If a macro-free file is found to have a VBA project during the load process, the load will fail before opening the VBA “Part”, end of story. You don’t have to trust the “disable macros”; the application simply won’t open the file. Take a look at Brian Jones’s blog entry for more: http://blogs.msdn.com/brian_jones/archive/2005/07/12/438262.aspx

  • Maximillian Dornseif says:

    What is the difference between ‘Code’ and ‘Data’? See my thoughts on the issue at http://md.hudora.de/blog/guids/54/93/6470713426167024.html
    I really would love to see some discussion on what you actually mean by code & data.

  • adam says:

    Hi Max,
    While you make a good point, and there are clearly fuzzy boundaries, there are also places where the answer is reasonably clear. In those places, it is helpful to take advantage of the clarity. This frees up time and energy to worry about those fuzzy spots.
