20th Annual Computer Security Applications Conference December 6-10, 2004, Tucson, AZ, by Jeremy Epstein

Review of the
20th Annual Computer Security Applications Conference,
Tucson, AZ
December 6-10, 2004

Review by Jeremy Epstein
December 20, 2004

The 20th Annual Computer Security Applications Conference (ACSAC) was held Dec 6-10 in Tucson, AZ. This is a three-track conference (two refereed paper tracks and an un-refereed case studies track). Following are my notes, which cover the papers and panels I found most interesting. All papers (and slides for some of the speakers and panelists) are available at www.acsac.org.

Distinguished Practitioner

The Distinguished Practitioner speech was given by Steve Lipner of Microsoft. Steve is a longtime fixture in the research community, having led development of DEC's A1 operating system among other projects. Steve described Microsoft's development process for building secure software. Among his key points:

New employees, especially those fresh out of college, do not arrive with the ability to develop secure software. Academic training teaches them how to build security features (e.g., crypto algorithms), but not how to build things that operate securely.

Training is a key part of how they operate, including new employee orientation and regular refresher courses.

As has long been known, security can't be bolted on at the end or tested in by penetration testing. Security is part of the design and implementation process. The security team does a final review near the end to make sure everything went OK, but that's a final check, not the point at which things start getting looked at.

They try to learn from the security problems that surface in the field, to improve the process by recognizing patterns of failure.

Since implementing the Security Development Lifecycle, they've seen a drop of at least 50% in externally reported vulnerabilities compared to products that don't use SDL. They grade based on externally reported problems since they have no control over those, and it avoids fudging the numbers.

SDL is expensive, but it pays off. Unlike Common Criteria, the focus isn't on paperwork. CC focuses on testing security features, but that's not where attackers look. SDL is much more effective at reducing vulnerabilities than CC. In a Q&A session, Lipner noted that:

If there's not enough money to do everything, threat modeling and static code analysis are the most effective uses of resources.

The quality of code (from a security perspective) anecdotally seems to be improving after developers are trained.

The research community has influenced the SDL by providing good research in static analysis tools. Formal methods have also had an impact, not through the full formal methods process, but through cherry-picking the techniques.

Session: Intrusion Detection

"An Intrusion Detection Tool for AODV-based Ad hoc Wireless Networks" Giovanni Vigna, Sumit Gwalani, Kavitha Srinivasan, Elizabeth Belding-Royer and Richard Kemmerer, University of California Santa Barbara, USA

Intrusion detection in ad hoc wireless networks is much harder than in traditional wired networks because there's no perimeter, all nodes participate in routing, nodes may move in and out of range, etc. They've extended the STAT framework to do detection for wireless networks. They look for both local and distributed scenarios (the latter requiring multiple sensors). To make this real, they've built a testbed with dynamic networks. Since it's hard to simulate the true dynamic nature, they have a simulator that generates packet traces that they then feed into the wireless driver. Attacks can be detected with a relatively small number of false positives. Placement of sensors is much more important than in the static (wired) world; they have to do a baseline and then figure out where to place the sensor. All of the sensors have to trust each other in their architecture.

"Automatic Generation and Analysis of NIDS Attacks" Shai Rubin, Somesh Jha and Barton Miller, University of Wisconsin, Madison, USA

This paper won both the Outstanding Student Paper and the Outstanding Paper awards.

NIDSs miss some attacks; attackers take advantage of this to create attacks that are equivalent to known attacks and differ only in how the attack is represented at the protocol level. The idea of this paper is to build a tool that accepts a representation of an attack and creates variations, which it then (automatically) launches to check whether they're caught (i.e., to look for false negatives). The tool is based on a formal model of the vulnerability, and is useful to both black hats (who want to generate variants that can't be detected) and white hats (who want to determine whether a given TCP sequence is an attack). The tool includes both transport (TCP) level transformations like fragmentation, retransmission, and packet ordering, and application level transformations like padding with innocuous steps and alternate ways to cause the attack. The transforms, which are based on real attack patterns seen in the wild, are simple and preserve semantics. They can also do *backwards* derivation - if you find what looks like an attack, see if it reduces to one of the known attacks. Using the tool they found a handful of bugs in Snort. They're planning to try it against commercial NIDSs to see which variants they catch and which they miss.
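The transport-level idea can be sketched in a few lines of Python. This is only an illustration of semantics-preserving transformations, not the authors' tool: the signature, the attack bytes, the segment representation, and the deliberately naive per-packet matcher are all assumptions made up for the example.

```python
from typing import Dict, List, Tuple

Segment = Tuple[int, bytes]  # (byte offset in the stream, payload)

# Hypothetical signature and attack stream, invented for this sketch.
SIGNATURE = b"EVIL_PAYLOAD"
ATTACK = b"prefix EVIL_PAYLOAD suffix"


def fragment(segs: List[Segment], size: int) -> List[Segment]:
    """Split every segment into pieces of at most `size` bytes."""
    out = []
    for off, data in segs:
        for i in range(0, len(data), size):
            out.append((off + i, data[i:i + size]))
    return out


def reorder(segs: List[Segment]) -> List[Segment]:
    """Send the segments in reverse order."""
    return list(reversed(segs))


def retransmit(segs: List[Segment]) -> List[Segment]:
    """Send the first segment a second time at the end."""
    return segs + segs[:1]


def reassemble(segs: List[Segment]) -> bytes:
    """What the end host sees: bytes ordered by offset, first copy wins."""
    stream: Dict[int, int] = {}
    for off, data in segs:
        for i, byte in enumerate(data):
            stream.setdefault(off + i, byte)
    return bytes(stream[k] for k in sorted(stream))


def per_packet_nids(segs: List[Segment]) -> bool:
    """A deliberately naive detector that matches each packet in isolation."""
    return any(SIGNATURE in data for _, data in segs)


def variants(data: bytes) -> Dict[str, List[Segment]]:
    base = [(0, data)]
    frag = fragment(base, 4)
    return {
        "original": base,
        "retransmitted": retransmit(base),
        "fragmented": frag,
        "fragmented+reordered": reorder(frag),
        "fragmented+retransmitted": retransmit(frag),
    }


def evading_variants(data: bytes) -> List[str]:
    """Names of variants that still reassemble to `data` but go undetected."""
    missed = []
    for name, segs in variants(data).items():
        if reassemble(segs) != data:
            raise ValueError(f"{name} does not preserve semantics")
        if not per_packet_nids(segs):
            missed.append(name)
    return missed


if __name__ == "__main__":
    print(evading_variants(ATTACK))
```

Every variant delivers the same byte stream to the victim, so each one the matcher misses is a false negative. A detector that reassembles the stream before matching would catch all five.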

Debate: The Relationship of System & Product Specifications & Evaluations

Debate chair: Marshall Abrams, MITRE, USA Panelists: Stu Katzke, NIST, USA; Jean Schaffer, NSA, USA; Mary Ellen Zurko, IBM, USA; Steve Lipner, Microsoft, USA

I missed the introductory position announcements, but here are some points from the Q&A:

Q: Most security problems are due to lack of bounds checking and other implementation bugs. At what EAL level can we expect these will be caught? And if they can't be caught at any level why bother doing an evaluation at all?

A: Vendors should do this before evaluation. Looking at adding hoops before an evaluation that vendors should do on their own to catch these.

A: CC doesn't give any statements about the fundamental system characteristics, which is why CC isn't an effective way to secure products.

Q: Shouldn't we be looking more at process, since this is what we're all pointing to?

A: Looking at adding more process into next version of CC (this summer).

A: Tools (and languages) can take the place of other assurance mechanisms to some extent. HCI has done this - APIs have eliminated the need for programmers to have stacks of documentation on how to build menus.

Q: TCSEC requirements for avoiding things like buffer overflows were at B2/B3 (equivalent to CC EAL5/EAL6). At the time, the number of people capable of exploiting that type of flaw was much smaller, and it was thought that the commercial world would never have to worry about it.

A: Anticipated threat environment for TCSEC B2/B3 is roughly equal to what we see today for home users on the Internet.

Q: Might it be appropriate to add more requirements to lower levels (EAL2) so it becomes useful?

A: Better approach would be to throw out what we have today and start over.

Q: CMM was looked at and rejected by CC; should we change that?

A: Motorola announced at a recent meeting that they're CMM Level 5, but couldn't guarantee that they hadn't had any buffer overflows. That's not very helpful.

A: Panelist was part of SSE CMM definition effort, and saw that it was just process. For example, if a process said "we never check for buffer overflows" and stick to that, you can be CMM Level 5 (which is focused on repeatability)... and be totally insecure! So not enthralled with CMM as a cure-all.

A: Vendors who go through CMM and CC find that the processes are the same, but CMM is missing "common sense" part (i.e., do the processes meet the requirements for security).

Q: Is there any scientific evidence that CMM or any other process helps with security?

A: No.

Q: Does CC or CMM address vulnerabilities deliberately introduced by developers (i.e., insider threat)?

A: No. An open, collaborative environment tends to reduce the risk, because many people are looking at each other's work.

Malware Session

"Using Predators to Combat Worms and Viruses - a Simulation Based Study" Ajay Gupta and Daniel C. DuVarney, Stony Brook University, USA

Predators are "benevolent self-propagating code" - worms that have a good intention. There are potential positives (e.g., much faster propagation than relying on patching, and they might stop the spread of some attacks even if no patch is available by adjusting firewall settings or turning off vulnerable services) but lots of negatives (including legal/social, as well as risks of malfunction). There are issues with how the predator spreads, whether using the same entry point as the worm or having a dedicated "predator port" (which in itself introduces new attack avenues). They've simulated predator behavior using finite state machines to represent actions, and looked at various tuning parameters in predators. Different types of predators include "classic" (do not immunize the machine), "persistent" (do not immunize, but lie in wait for an attacker), and "immunizing" (tries to spread, and closes the door behind it). Depending on the type of predator, the fanout level, the time allowed, etc., you can contain attacks. By limiting the parameters, they can avoid overloading the network. The paper contains nice graphs showing how different settings impact the infection rate and the steady state. Among the problems (besides the risk of malicious predators, which they propose to constrain using code signing) are worms locking the predator out.
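To give a feel for this kind of study, here is a toy discrete-time simulation in Python. It is not the authors' simulator: it models only an "immunizing" predator, uses uniform random probing instead of their finite state machines, and every parameter (host count, fanouts, release time, seed) is an assumption chosen for illustration.

```python
import random
from typing import List

VULNERABLE, INFECTED, IMMUNE = "vulnerable", "infected", "immune"


def simulate(n_hosts: int = 1000, steps: int = 30, worm_fanout: int = 2,
             predator_fanout: int = 0, predator_start: int = 5,
             seed: int = 1) -> List[int]:
    """Return the number of infected hosts after each time step.

    Each infected host probes `worm_fanout` random hosts per step and
    infects those that are still vulnerable. From `predator_start` on,
    each immunized host probes `predator_fanout` random hosts per step
    and immunizes them whether or not they are infected.
    """
    rng = random.Random(seed)
    state = [VULNERABLE] * n_hosts
    state[0] = INFECTED  # patient zero
    history = []
    for step in range(steps):
        if predator_fanout and step == predator_start:
            state[n_hosts - 1] = IMMUNE  # release the predator on one host
        nxt = list(state)
        for current in state:
            if current == INFECTED:
                for _ in range(worm_fanout):
                    target = rng.randrange(n_hosts)
                    if nxt[target] == VULNERABLE:
                        nxt[target] = INFECTED
            elif current == IMMUNE:
                for _ in range(predator_fanout):
                    nxt[rng.randrange(n_hosts)] = IMMUNE
        state = nxt
        history.append(state.count(INFECTED))
    return history


if __name__ == "__main__":
    without = simulate(predator_fanout=0)
    with_predator = simulate(predator_fanout=3)
    print("infected at end, no predator:  ", without[-1])
    print("infected at end, with predator:", with_predator[-1])
```

Varying `predator_fanout` and `predator_start` shows the trade-off the paper explores: a larger fanout contains the worm sooner but generates more probe traffic.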

WIP Session

"Finding Security Errors in Java Applications Using Lightweight Static Analysis" Benjamin Livshits, Stanford University

There are lots of static analysis tools that look for problems in C/C++, because the problems come from poor language and API design (buffer overruns happen unless you actively prevent them). By contrast, Java protects against those obvious errors, so the errors that remain are deeper. They've built a tool that finds two types of errors: bad session stores and SQL injection. In 10 web-based Java apps (mostly blogging tools), each consisting of 10s of KLOC, they found 14 session store problems (and 8 false positives) and 6 SQL injection problems.
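For readers unfamiliar with the second error class, the sketch below shows the pattern such a tool looks for: user input concatenated directly into a query string. It uses Python's standard `sqlite3` module rather than Java, and the table, column names, and data are invented for the example; it illustrates the bug, not the analysis tool.

```python
import sqlite3
from typing import List


def make_db() -> sqlite3.Connection:
    """Build a throwaway in-memory database with made-up contents."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def lookup_unsafe(conn: sqlite3.Connection, name: str) -> List[str]:
    # Vulnerable: the input becomes part of the SQL text itself.
    query = "SELECT secret FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]


def lookup_safe(conn: sqlite3.Connection, name: str) -> List[str]:
    # Parameterized: the input is passed as data, never parsed as SQL.
    query = "SELECT secret FROM users WHERE name = ?"
    return [row[0] for row in conn.execute(query, (name,))]


if __name__ == "__main__":
    db = make_db()
    hostile = "x' OR '1'='1"
    print("unsafe:", lookup_unsafe(db, hostile))
    print("safe:  ", lookup_safe(db, hostile))
```

With the hostile input, the concatenated query's WHERE clause is always true, so it returns every row, while the parameterized version returns nothing. A static checker can flag the first form by tracing input strings into the query text.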

"Access Control for Distributed Health Care Applications" Lillian Røstad, Norwegian University of Science and Technology

Norwegian healthcare uses RBAC extensively, with override rules for emergency access. However, the overrides are frequently used when there's no real emergency. They're investigating why people feel the need to bypass, and whether the bypasses are appropriate.

"Augmenting Address Space Layout Randomization with Island Code" Haizhi Xu, Syracuse University

Return-into-libc attacks can be frustrated by moving libc around the address space as a unit, but once an attacker finds it, all the relative positions are unchanged. If there are only 16 bits of randomization in placement, that can be broken in a few minutes. Their idea is to move each entry point individually, rather than all of libc. Even if the attacker finds one, that doesn't help them find anything else.
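The "few minutes" figure is easy to sanity-check with back-of-the-envelope arithmetic. The sketch below assumes the attacker tries each candidate position once in turn and manages 200 guesses per second; that rate is a made-up number for illustration, not one from the talk. It also assumes, conservatively, that with per-entry randomization each needed entry point can be searched for independently.

```python
def expected_guesses(bits: int) -> float:
    """Expected guesses to hit one of 2**bits equally likely positions,
    trying each candidate at most once."""
    positions = 2 ** bits
    return (positions + 1) / 2


def minutes_to_break(bits: int, guesses_per_second: float,
                     entry_points: int = 1) -> float:
    """Expected minutes to locate `entry_points` independently placed targets."""
    total_guesses = entry_points * expected_guesses(bits)
    return total_guesses / guesses_per_second / 60


if __name__ == "__main__":
    rate = 200.0  # hypothetical guesses per second
    # libc moved as a unit: one search reveals every function.
    print(f"whole libc:     {minutes_to_break(16, rate):.1f} min")
    # Each entry point moved separately: an attack needing four
    # functions must run four separate searches.
    print(f"4 entry points: {minutes_to_break(16, rate, entry_points=4):.1f} min")
```

Under these assumptions, 16 bits of randomization cost an average of about 32,768 guesses, or under three minutes. The gain from moving entry points individually is that this cost is paid again for every function the attack needs.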

"The DETER Testbed" Terry V. Benzel, Information Sciences Institute

The goal is to build a testbed to run malicious code to see how preventative technology works. Technology is based on Univ of Utah system. Containment is the key goal, so malicious code doesn't escape to the Internet. To make experiments effective and repeatable, they can do automatic reconfiguration of all of the systems, which allows complete test setup & teardown in 10 minutes. See www.isi.edu/deter for more info.

"Vertical Sensitivity for the Information Security Health Rating for Enterprises" Arcot Desai Narasimhalu, Singapore Management University

S&P and Moody's rate bonds; can we use a scheme similar to that to rate cyber-threats? CxOs are able to understand rating schemes that measure overall risk. He calls the result an INFOSeMM rating, with ratings from DDD to AAA depending on resilience of infrastructure, intelligence, and practices.

Invited Essayist

The invited essayist was Rebecca Mercuri, who is currently at Harvard's Radcliffe Institute, and is best known for her work in electronic voting. Her topic was "Transparency & Trust in Computational Systems". Trust means many things in different contexts; even many of our standard measures (e.g., Orange Book evaluations) don't say anything about how trust is created, only about the rules and metrics (with the implication that following the rules gives trust). There are conflicts between the notions of security by obscurity ("trust me") and open source ("transparency"); she likened the former to moving ICBMs around the country so an attacker wouldn't know where the real ones were vs. making everything available. There's an assumption that transparency is the same as trust, but it's not. Programmers use the word "trust" to mean "control" - if you control the code, then you trust it - but that's not accurate either. Increasing usability increases transparency to people. For example, if you tell the user a task will take a while, then they'll be OK with the delay. Paradoxically, making things *too* easy may make some people distrust the system. And transparency can enhance confidence in inherently untrustworthy products, which isn't the long-term desired goal.

Classic papers

"If A1 is the Answer, What was the Question? An Edgy Naïf's Retrospective on Promulgating the Trusted Computer Systems Evaluation Criteria", Marv Schaefer, Books With a Past; presented by Paul Karger, IBM

Paul gave a history of the development and motivation of the TCSEC (aka Orange Book). The reason for TCSEC was so procurement staff without technical expertise could write competitive procurements that allow them to buy secure systems. Early in the design of the TCSEC, Ted Lee proposed a "Chinese menu" with many different dimensions of measurement but that was too complex. The Nibaldi study at MITRE came up with a set of seven levels (which eventually became A1, B3, B2, B1, C2, C1, and D), with the notion that B1 and C1 were "training wheels" and were not intended for serious use.

Things that went wrong included imprecision in the wording and lack of definitions (e.g., what does it mean to "remove obvious flaws"), overspecification and customer naïveté (too many customers decided they wanted the "best thing" so specified A1, even when they didn't need it), and the rush to get the standard out the door. There were "yards deep" comments on the 1983 draft, but the NCSC director insisted on minimal changes before the 1985 final version came out. Interpretation caused "criteria creep", which meant that products evaluated in year N might no longer be evaluatable in year N+1. The Trusted Network Interpretation (TNI) and Trusted Database Interpretation (TDI) were put out prematurely. And the "C2 by 92" mandate was dead on arrival because of slow evaluations. Some requirements were put in strange places (e.g., negative ACLs at B3, simply because they couldn't fit anywhere else).

TCSEC fostered research that exposed shortfalls in our knowledge (e.g., John McLean's System Z), problems with automated formal proofs, etc. It also led to more flexible criteria including the German criteria and eventually CC. The result of the flexible criteria is that evaluations aren't comparable, and vendors can make vacuous claims (i.e., can get an EAL4 evaluation of a system that doesn't claim to have any security capabilities).

The battle may be lost, because the systems customers demand are far larger and more complex than the systems that were thought to be unsecurable 30 years ago.

Paul emphasized several times that the paper in the proceedings is well worth reading, and contains far more than the presentation.

"A Look Back at 'Security Problems in the TCP/IP Protocol Suite'" Steven M. Bellovin, AT&T Labs -- Research

The original paper was one of Steve's first at AT&T, and is his swan song as he prepares to leave AT&T for Columbia University.

Steve described many of the vulnerabilities he found in the TCP/IP protocol suite, and gave many anecdotes of what's gone wrong, including the "AS 7007" incident where a small ISP erroneously advertised itself as having the best routing on the internet, and promptly got swamped by the traffic. The earliest email problems were in 1984, but some, like phishing, are relatively new. The idea behind reserved ports on UNIX systems (ports less than 1024 are "privileged") was a bad idea then and is worse now.

Lessons learned:

The original internet architecture wasn't designed to be secure, and we're still paying the price.

Cryptography is important, but frequently used as a fig leaf (especially SSL).

Despite all this, most problems on the internet today are due to buggy code or weak passwords, not protocol flaws. Attackers are more likely to attack code than protocols, not because protocols are strong but because code is weak.

Protocols should be analyzed for security during development, not after they're done (unlike WEP, for example).

PANEL - The Cyber Enemy Within...Countering The Threat From Malicious Insiders

Chair: Dick Brackney, Advanced Research and Development Activity, USA Panelists: Terrance Goan, Stottler Henke Associates, USA; Shambhu Upadhyaya, University of Buffalo, USA; Allen Ott, Lockheed Martin, Orincon Information Assurance, USA

Dick described the types of damage insiders can cause (eavesdrop, steal/damage information, use information fraudulently, deny access to other authorized users) and noted that a recent DOD Inspector General report claims that 87% of 1000 intruders examined were insiders. Their goal is to reduce the time between defection and detection. They'd like to have anomaly detection algorithms that detect abnormal insider behavior - a pattern that might make you suspicious, rather than the single telltale event you see with a particular attack.

Terrance wants to find ways to detect malicious insiders without signature matching or anomaly detection. Finding malicious insiders is hard because they have legitimate access and can do fairly safe probing without it looking strange, and because their attacks can include non-cyber events. Personal relationships can also help cover things up - Hanssen, when confronted, was able to explain away his suspicious behavior. Moves post 9/11 to encourage sharing and efficiency mean that "need to know" violations aren't treated seriously. In short, network security personnel may be incapable of identifying suspicious activity because the line between "normal" and "abnormal" is so fuzzy. He advocates identifying the greatest risks and implementing reliable partial solutions - for example, relying on personnel reports to be a "sensor" in finding insider attacks, or providing anomaly reports to information owners who might spot something suspicious that wouldn't be noticed by a system administrator. They've built a system that looks at documents in the system and captures key phrases that it then searches for on Google. If the phrase hits on internet sites, then the document it comes from probably isn't sensitive, but if it only hits on restricted databases then it probably is sensitive. A person sees the matches and validates; the system learns from the results of the searches. This reduces false positives over time.

Allen described DAIwatch, which looks for "activities" not signatures. They use AI, fuzzy mapping, etc. to understand what's going on based on input from operating systems, IDSs, focused searches, etc. By correlating different information sources they've seen lots of network router & IP configuration problems, erroneous registry settings, logins from unknown programs & machines, unknown network services, etc. They're in transition to trying this in financial and government sites.

In the Q&A session, someone pointed out that Hanssen was a system administrator, and therefore had legitimate access. How would any of these systems work against a sysadmin? The conclusion was that you need procedural controls (e.g., "two man rule").

Conference Reception

The conference reception was held Thursday evening. Addison Wesley generously donated several dozen computer security textbooks which were given out as door prizes. In what some thought was a deliberate plant, Steve Bellovin, a longtime UNIX aficionado, won "The .NET Developer's Guide to Windows Security". Despite some early concerns, there were enough drink tickets for all concerned.

New Security Paradigms Workshop Panel

"Designing Good Deceptions in Defense of Information Systems"

They set up a system with lots of fake stuff to entice attackers. To make it convincing, they needed to define a believable policy as to how the system works, and how things don't work. For example, something might fail because the network is down, the system has already been hacked, or the software is a new release. They then try to map suspicion to figure out the attacker's likely moves. Deception can be used as a detector: if the policy causes them to say "the network is down", benign users will go away but malicious users will try to bypass the system. It's a form of "active intrusion detection".

"A Serial Combination of Anomaly and Misuse IDSes Applied to HTTP Traffic"

They put together anomaly and misuse detectors to see whether they agreed or disagreed. They then mapped out different combinations using two real web servers, and tried to figure out which of the combinations are most likely, and whether the combination can be used to reduce the fraction of alerts that have to be examined by a human. Out of 2.2 million events, they reduced the alarm rate from 450K to 20K possible events, or a factor of more than 20. The number of "unknown" events also dropped dramatically. The combination can miss certain types of attacks, but the reduced false positive rate made it more likely that the alerts would be examined.
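A minimal Python sketch of a serial combination follows, assuming the anomaly stage runs first and only the events it flags are passed to the misuse stage. That ordering, the "known good" profile, the length threshold, and the signatures are all assumptions for illustration; the paper's detectors are certainly more elaborate.

```python
import re
from collections import Counter
from typing import Iterable

# Hypothetical profile of normal traffic and hypothetical attack signatures.
KNOWN_GOOD_PATHS = {"/index.html", "/about.html", "/search"}
MAX_NORMAL_LENGTH = 100
SIGNATURES = [re.compile(r"\.\./"), re.compile(r"(?i)union\s+select")]


def is_anomalous(request: str) -> bool:
    """Stage 1: flag anything that deviates from the learned profile."""
    path = request.split("?", 1)[0]
    return path not in KNOWN_GOOD_PATHS or len(request) > MAX_NORMAL_LENGTH


def matches_signature(request: str) -> bool:
    """Stage 2: misuse detection on the events stage 1 passed along."""
    return any(sig.search(request) for sig in SIGNATURES)


def classify(request: str) -> str:
    if not is_anomalous(request):
        return "normal"   # never reaches the misuse detector
    if matches_signature(request):
        return "attack"   # anomalous and matches a known attack
    return "unknown"      # anomalous but unrecognized: needs a human


def summarize(requests: Iterable[str]) -> Counter:
    return Counter(classify(r) for r in requests)


if __name__ == "__main__":
    sample = [
        "/index.html",
        "/search?q=shoes",
        "/../../etc/passwd",
        "/newpage.html",
        "/search?q=1 UNION SELECT password FROM users",
    ]
    print(summarize(sample))
    # Reduction reported in the talk: 450K alarms down to 20K.
    print("reduction factor:", 450_000 / 20_000)
```

The last sample request illustrates the caveat in the talk: it fits the normal profile, so the first stage drops it and the signature that would have caught it never runs. Only the "unknown" bucket needs human review, which is where the reduction in alerts comes from.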

"Securing a Remote Terminal Application with a Mobile Trusted Device"

The goal is to allow you to safely access your home machine from an internet café. VNC is a good way to start, since the VNC protocol pushes screen images, and doesn't allow queries. They use a PDA to enhance the authentication scheme; the PDA is used to establish a master secret over an SSL link, and that master secret is then used by the public terminal to connect to the home machine. You don't have to trust the public terminal except to pass input through to the home machine and to display frame buffers accurately. Any malware on the public terminal can't fetch files or execute commands on the home system, but the public terminal might keep images in its cache of what you've seen, so there are still trust issues. The overhead involved is relatively small. More information at www.parc.com/csl/projects/usable-security/