Review of the
19th Annual Computer Security Applications Conference (ACSAC),
Las Vegas, NV, USA
December 8-12, 2003
Review by Jeremy Epstein
December 31, 2003
Monday and Tuesday had a number of tutorials, plus a workshop on secure web services which I attended. The workshop speakers included representatives from Reactivity, Datapower, IBM, Microsoft, and webMethods (yours truly). There were almost 50 attendees for the workshop.
All the speakers agreed on one key point: standards alone aren't enough; when you hook up a system using web services, you introduce new risks that standards can't address. In particular, web services definitions (such as WSDL) tell an attacker exactly what a message should look like, which provides a template for creating attacks such as those using SQL injection. Additionally, web services typically expose data that would otherwise be much more closely held. So while the fundamental problems aren't any different from those of any other network-based service, web services make the attacker's job somewhat easier and increase the value of the target. There was also some agreement that attacks coming from the inside are at least as risky as attacks from the outside. Simply saying "it's behind the firewall" isn't good enough.
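To make the injection risk concrete, here is a minimal sketch (my own illustration, not from any of the talks) of how a parameter whose exact shape is advertised by a WSDL file becomes an injection vector when the back end builds SQL by string concatenation, and how a parameterized query closes the hole. The table and operation names are invented:

```python
import sqlite3

# Toy back end for a hypothetical web service operation getUser(username),
# whose exact message shape a WSDL file would advertise to an attacker.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

def get_user_unsafe(name):
    # String concatenation: the classic SQL-injection hole.
    return conn.execute(
        "SELECT secret FROM users WHERE name = '%s'" % name).fetchall()

def get_user_safe(name):
    # Parameterized query: the input can never terminate the string literal.
    return conn.execute(
        "SELECT secret FROM users WHERE name = ?", (name,)).fetchall()

payload = "' OR '1'='1"
print(get_user_unsafe(payload))  # leaks every row: [('s3cret',)]
print(get_user_safe(payload))    # no match: []
```

The point the speakers made is that nothing in the SOAP/WSDL standards stack prevents this; the fix lives in the application behind the service.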
Several speakers proposed that the proper solution is to use centralized security devices that enforce all the security requirements in one place, rather than trying to individually harden every platform that's providing web services.
Delegation was also proposed as a key problem (as in almost all distributed systems). But there are many different delegation needs, and it's very hard to do securely.
The 2003 Distinguished Practitioner was Dr. Clark Weissman from Northrop Grumman. Clark's presentation was on the architecture for the new avionics system targeted for 20 years out. A typical (current) military flight will have information on the airplane ranging from unclassified (such as logistics info) through Top Secret (such as targets). Because Multi-Level Secure (MLS) systems aren't readily available today, the aircraft is run as "system high" (all information is treated as being as classified as the most classified information on board), which makes maintenance difficult. In some cases, the pilot may not be cleared to see all of the information (such as information about the specific target until a certain point in the mission).
Based on Moore's law, they expect to have thousands of CPUs available on the airplane, which will allow building Multiple Single-Level (MSL) systems, with one process per processor. The processors at any given classification will be hooked together using VPNs, so encryption keeps the data separate. High assurance Encryption Processing Elements (EPEs) safeguard the keys and provide the encrypted tunnel. A Control Element (CE) can clear the encryption keys if there's a problem (such as aircraft capture).
They're targeting a Common Criteria EAL7 evaluation (the highest level defined in the criteria). To make that possible, the trusted portion of the code (that keeps classified data separate) must be minimized; their target is under 10,000 source lines of code.
Most attacks are targets of opportunity - the script kiddies & other attackers are more interested in the QUANTITY of systems they have than what information the systems have. Many people don't realize that their home systems are therefore targets of opportunity. One recent case had 15,000 systems under hacker control, and another that the Navy saw had 140,000 systems under hacker control. Spammers are taking advantage of this (as has been widely reported) and are taking over home systems to install open relays and porn sites. The victim (home user) doesn't even know they've been compromised. Some 30% of spam is transferred this way. The attackers don't care if they're noisy and get detected, because they have so many other systems under control. It only takes 15 minutes (on average) from when a vulnerable system is connected to the Internet until it's hacked, and some systems are attacked while they're still in the process of being installed. Some hackers are using this for extortion.
The real threat, by contrast, is Targets Of Choice. These types of attacks are more likely to come from the inside, and the attackers don't want to be detected. There's lots of information out there on exploits, but relatively little on how the bad guys are organized, motivated, etc. Honeypots give an opportunity to see what they're up to, by giving you the initiative. Honeypots can be used as a "fixing mechanism": if you set up a honeypot that's "vulnerable" to Code Red (i.e., so a Code Red attack goes after it), then it can turn around, attack the infected machine that probed it, and install the patch to make it immune. Similar tactics are possible against spammers. Honeypots also give you a chance to see attacks without the false positives of an IDS, since (nearly) everything that gets picked up by a honeypot is an attack (so there's a very good signal/noise ratio). A honeypot also works in environments where you have SSL & SSH, unlike IDSs which can't see the traffic. However, honeypots have disadvantages such as a limited field of view (they only see what they're attacked with) and the risk that if they're not done carefully they could spread an attack. To make honeypots easier to use, the honeypot community is coming out in spring 2004 with a bootable CD-ROM that turns a machine into a honeypot.
They have two hypotheses on catching advanced insiders: redirection and honeytokens. Redirection requires making a honeypot a duplicate of a real system, and redirect things that look suspicious into the duplicate copy for closer surveillance. Finding the suspicious things can be based on hotzoning (watching for any attempts to access non-production ports, such as trying to connect to a telnet port on a mail server), known attacks, or bait & switch. Honeytokens are based on putting in false information, such as creating fake user accounts that don't belong to anyone, and looking for attempts to access them (which indicates dictionary attacks or offline cracking).
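The honeytoken idea can be sketched in a few lines. This is my own illustration of the pattern described in the talk, not any real system; all account names and addresses are invented:

```python
# Honeytoken sketch: seed the user database with accounts that belong to
# no one, and treat any authentication attempt against them as evidence of
# a dictionary attack or offline password cracking.

REAL_USERS = {"alice", "bob"}
HONEYTOKENS = {"jsmith_backup", "svc_legacy"}  # fake accounts, never used

alerts = []

def check_login(username, source_ip):
    if username in HONEYTOKENS:
        # No legitimate user knows these accounts exist, so any hit is hostile.
        alerts.append((username, source_ip))
        return False
    return username in REAL_USERS  # real authentication elided

check_login("alice", "10.0.0.5")          # normal traffic, no alert
check_login("svc_legacy", "203.0.113.9")  # trips the honeytoken
print(alerts)  # [('svc_legacy', '203.0.113.9')]
```

Because the tokens carry no production value, the false-positive rate is essentially zero - the same signal/noise argument made for honeypots generally.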
His 10 year prediction is that honeypots will grow in importance in government and academia, but not substantially in the commercial world. They'll never be as ubiquitous as firewalls.
"Bayesian Event Classification for Intrusion Detection" was presented by Christopher Kruegel from UC Santa Barbara. Typical IDSs with multiple models sum the output of the models and compare to a threshold, which doesn't capture the relationship of models, or external information. The result is too many false positives or false negatives, depending on the tuning settings. Their goal is to use a Bayesian network to reduce the false positives, by representing the interdependencies of the different models. They build a set of models - one for each type of system call (e.g., filesystem operations) - which gives a total of four models for UNIX system call monitoring. They then put the models together looking at the relationships. They claim that the resulting system was always better than a threshold-based system. An audience member asked whether that was an artifact of the Lincoln Labs data used for testing; the speaker said it was not.
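A toy illustration (mine, not the paper's actual network or models) of why Bayesian combination can beat summing scores against a threshold: multiplying likelihood ratios into the prior odds weights each model by how informative it actually is, rather than treating all model outputs as interchangeable points:

```python
# Combine detection-model outputs via Bayes' rule on odds:
# posterior odds = prior odds * product of per-model likelihood ratios.

def bayes_combine(prior, likelihood_ratios):
    odds = prior / (1 - prior)
    for lr in likelihood_ratios:
        odds *= lr
    return odds / (1 + odds)

prior_attack = 0.01                            # attacks are rare
p = bayes_combine(prior_attack, [50.0, 1.2])   # one strong, one weak signal
q = bayes_combine(prior_attack, [1.2, 1.2])    # two weak signals
print(round(p, 3), round(q, 3))                # 0.377 0.014
```

A naive score sum could rate the two cases similarly (50 + 1.2 vs. a pair of middling scores pushed over a threshold), while the Bayesian combination keeps the two-weak-signals case far below alarm level.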
"Intrusion Detection: A Bio-Informatics Approach" was presented by Scott Coull from Rensselaer Polytechnic Institute (and was the winner of the "best student paper" award). Their idea is to apply techniques used in DNA sequence matching to find relationships between user actions and signatures of attacks. The idea is to find one or more alignments, where an alignment can be global (aligning the two sequences end to end) or local (finding the best-matching subsequence). For experimentation, they looked for patterns in system call captures from the "acct" UNIX command. They used semi-global alignment, and made it highly tunable by adjusting penalties and bonuses for gaps and matches. They don't know what the proper model is, though, and in particular what mutations are acceptable. An audience member asked whether users can be differentiated using this approach (e.g., for forensics); they didn't know. They accommodate changing behavior over time using "concept drift", which changes the match levels over time without changing user signatures.
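For flavor, here is a compact sketch of semi-global alignment scoring over system-call sequences. The scoring values and sequences are illustrative assumptions, not the authors' tuned penalties and bonuses:

```python
# Semi-global alignment: like Needleman-Wunsch, but gaps at the start and
# end of either sequence are free, so a short signature can match anywhere
# inside a longer trace.

def semiglobal_score(a, b, match=2, mismatch=-1, gap=-1):
    rows, cols = len(a) + 1, len(b) + 1
    S = [[0] * cols for _ in range(rows)]  # zero first row/column = free leading gaps
    for i in range(1, rows):
        for j in range(1, cols):
            diag = S[i-1][j-1] + (match if a[i-1] == b[j-1] else mismatch)
            S[i][j] = max(diag, S[i-1][j] + gap, S[i][j-1] + gap)
    # Free trailing gaps: best score anywhere on the last row or last column.
    return max(max(S[-1]), max(row[-1] for row in S))

signature = ["open", "read", "write", "close"]
trace = ["stat", "open", "read", "write", "close", "exit"]
print(semiglobal_score(signature, trace))  # 8: the signature matches in full
```

Tuning the match/mismatch/gap values is exactly the knob the authors describe for trading off sensitivity against tolerance of "mutations" in behavior.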
"Design, Implementation and Test of an Email Virus Throttle" by Matt Williamson of HP Labs was presented by a colleague, as Matt's wife was 9 months pregnant at the time. This paper extends Matt's work which won the "best paper" award at ACSAC last year. The idea is to throttle the load based on new email destinations, just as his paper last year looked at throttling TCP connections to new hosts to keep worms from using a subverted system to attack other systems. They empirically found that there's a low repeatability of recipients: if you send a message to Jane Doe, the odds are relatively low that the next message (or the next few) will also be to Jane Doe. This is in contrast to web pages, where there's a lot of locality (the odds are high that the next page will come from the same host as one of the past few pages). To avoid unnecessarily slowing down email processing, they look at "slack" time: if you're idle for a while, the throttle is relaxed to allow sending more messages. Unlike the TCP throttle, there's no "working set" maintained because of the lack of locality, just a limit of messages per unit time. The throttle doesn't block messages, but only queues them and dribbles them out. This is an effective way to control email worms. A throttle can be implemented in the client, in the server, or in a proxy.
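The queue-and-dribble behavior can be sketched in a few lines. This is a simplified model of the mechanism described above (the slack-time relaxation is omitted, and the rate is an arbitrary illustrative value):

```python
from collections import deque

# Toy email throttle: messages over the per-tick rate are queued, never
# dropped, and dribbled out one tick at a time. A worm's burst to many new
# recipients is slowed to the throttle rate; a normal user rarely notices.

class EmailThrottle:
    def __init__(self, rate_per_tick=1):
        self.rate = rate_per_tick
        self.queue = deque()
        self.sent = []

    def submit(self, recipient):
        self.queue.append(recipient)

    def tick(self):
        # Each clock tick, release at most `rate` queued messages.
        for _ in range(min(self.rate, len(self.queue))):
            self.sent.append(self.queue.popleft())

t = EmailThrottle(rate_per_tick=1)
for r in ["a@x", "b@x", "c@x"]:  # worm-like burst to three recipients
    t.submit(r)
t.tick()
print(t.sent, list(t.queue))  # ['a@x'] ['b@x', 'c@x']
```

A growing queue is itself a detection signal: a human sender almost never outruns the throttle, so a long backlog suggests a mass mailer.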
"Practical Random Number Generation in Software" was presented by John Viega from Virginia Polytechnic Institute. This paper is a summary of lessons learned with random numbers in real systems. RNGs and PRNGs are critical to many systems; however, even hardware sources are often not as random as assumed, and in software, random data is quite scarce. The goal is to continually collect entropy from the system and use metrics to estimate how much you've got, so that good random numbers can be provided. Unfortunately, entropy isn't absolute: it's relative to what the attacker can see, so you need to include a threat model. He suggests that collecting entropy as part of system initialization is a good idea, since it's one of the few times you can get the administrator's attention. Most RNG systems tend to overestimate the entropy available to them, which yields values an attacker may be able to guess.
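The collect-and-estimate pattern might look like the following sketch. This is my own illustration of the idea, not the paper's design; in particular, the 0.5-bit credit per timing sample is an assumed, deliberately conservative figure, not a measured one:

```python
import hashlib
import time

# Entropy-pool sketch: mix events into a hash, keep a conservative running
# estimate of collected entropy, and refuse to emit output until enough
# has been gathered. Overestimating entropy is the classic failure mode.

class EntropyPool:
    def __init__(self):
        self.pool = hashlib.sha256()
        self.estimated_bits = 0.0

    def add_event(self, data: bytes, credited_bits: float):
        # Credit far fewer bits than the raw data size.
        self.pool.update(data)
        self.estimated_bits += credited_bits

    def get_bytes(self, n=32):
        if self.estimated_bits < 256:
            raise RuntimeError("not enough entropy collected yet")
        return self.pool.digest()[:n]

pool = EntropyPool()
for _ in range(512):
    # Timing jitter source: credit only 0.5 bits per sample (assumed).
    pool.add_event(time.perf_counter_ns().to_bytes(8, "little"), 0.5)
print(len(pool.get_bytes()))  # 32
```

A real design would also re-key rather than reuse one digest, but the structural point stands: the estimate, not the byte count, gates output.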
"Isolated Program Execution: An Application-Transparent Approach for Executing Untrusted Programs" was presented by Zhenkai Liang from Stony Brook University (and was the winner of the "outstanding paper" award). Their idea is to create a "virtualized file system" where you can run a command, and it creates copy-on-write versions of any files you modify. When the program is finished, you're presented with a list of all the files that were touched, and can either accept or reject the collected changes. The prototype they built allows you to diff the old & new versions of a file to see what changes were made. This is useful for running programs that sound useful but that you suspect may be malicious, as well as for testing a program's installation without risking it modifying a file you didn't intend. No change is required to the program being tested. At times, the virtualization can be pretty tricky, as (for example) when a file or directory is renamed or deleted, or if permissions are changed. No other user of the system sees any of the file system changes until the commit is done; the commit operation fails if there have been other changes made to the file after the copy-on-write operation occurred (i.e., it doesn't try to sort out interleaved operations). The prototype is limited to only handling file operations, which isn't very realistic (you can't "undo" a change made through a socket to a database), so they disallow all network access. Also, it may be hard for non-technical users to determine whether a set of changes is reasonable... how many non-geeks would know that changing a .history file is perfectly normal?
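The core copy-on-write idea reduces to an overlay. This in-memory sketch is mine (their system interposes on real file-system calls; the tricky rename/delete/permission cases are omitted here):

```python
# Copy-on-write overlay: writes land in the overlay, reads prefer the
# overlay, and commit applies or discards the collected changes in one step.

class CowFS:
    def __init__(self, base):
        self.base = dict(base)   # the "real" file system
        self.overlay = {}        # copy-on-write layer for modifications

    def read(self, path):
        return self.overlay.get(path, self.base.get(path))

    def write(self, path, data):
        self.overlay[path] = data  # base is never touched directly

    def changed_files(self):
        return sorted(self.overlay)  # what the user reviews at exit

    def commit(self, accept):
        if accept:
            self.base.update(self.overlay)
        self.overlay.clear()

fs = CowFS({"/etc/passwd": "root:x:0:0"})
fs.write("/etc/passwd", "evil:x:0:0")  # untrusted program tampers
print(fs.changed_files())              # ['/etc/passwd']
fs.commit(accept=False)                # reject: base is unchanged
print(fs.read("/etc/passwd"))          # root:x:0:0
```

The accept/reject review step is exactly where the paper's usability concern bites: the mechanism is sound, but judging the change list takes expertise.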
I missed most of Prof Gene Spafford's classic paper "A Failure to Learn from the Past", but heard much of the heated debate. The consensus in the community seems to be that nothing much has changed in the past 15 years since the Morris Worm, and in fact things are getting worse with respect to security. Our code is as poorly built as ever; we still suffer from the same types of flaws. The only difference is that we're more reliant than ever on our software. The relative diversity of the Internet 15 years ago helped limit the damage to only 10% of the computer systems; today's near mono-culture means that nearly everyone is vulnerable to the attack du jour. While the CERT was formed in reaction to the Morris worm and the lack of good ways to promptly distribute critical security information, it has instead become a bottleneck. I was particularly amused by his statement "It is sobering to realize that our overall infrastructure security might well be better had UNIX been written in Cobol rather than C." Perhaps most frightening is the fact that with an increasing use of VoIP phones, in a future attack we may not even be able to use the telephone as a way to communicate about the attack. While there are now numerous laws regarding computer crime, they're hardly ever used.
The first speaker, Daniel Faigin from Aerospace, noted that spam is cost effective: at a response rate of 0.05%, 3 million email addresses will yield 1500 hits, which is a good business. Spam has also become a security problem, as an increasing number of spams carry viruses and worms. There are also privacy concerns - web bugs in messages confirm email addresses even if attachments aren't opened or web sites aren't visited (this is particularly a problem for Outlook users). The newly-passed legislation in the US is likely to be useless at all levels, and may even increase spam. Direct charging for sending messages doesn't work because you can't find out who a spammer is. For example, if the spammer takes over a user's home system (as is common), should the owner of that system get charged for the spam that gets generated by that machine? While it may be philosophically appealing, it's not realistic.
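The economics work out in one line (using the figures from the talk):

```python
# Why spam pays: even a tiny response rate over a huge list yields real sales.
recipients = 3_000_000
response_rate = 0.0005          # the 0.05% figure cited
hits = int(recipients * response_rate)
print(hits)  # 1500
```

Since sending costs are near zero (especially when the sending machines are hijacked home systems), almost any nonzero response rate is profitable.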
The second speaker, Matt Bishop from UC Davis, tried to define spam. It's usually considered "bulk unsolicited email" or "unsolicited commercial email". But when does something become bulk? If I send to all my friends? If I send to everyone who attended a conference? When does it become commercial, and when have you opted in? What is unsolicited? How much of a relationship must I have with an organization before they can send me messages? Defining characteristics would make it easier to stop spam.
The third speaker, Tasneem Brutch from Kaiser Permanente, noted that they're taking a pragmatic approach because of the cost to employees of handling spam. The estimate is that US corporations lose $10 billion/year in additional hardware to process the spam, lost productivity, IT resources to eliminate the messages, etc. She believes there must be enough legal & economic disincentives through federal and international laws to make life harder for spammers. Tier-1 internet providers must do some blocking.
In the discussion period, someone commented that classified US networks are using Doubleclick technology to "help" the analyst find related (but potentially unknown) information. For example, clicking on a document on one classified site might cause you to get an (unsolicited) classified message suggesting other sites with related documents. This is a form of classified spam!
(*) For me, Las Vegas is like Disneyland (artificial, overpriced, crowded, and noisy) without any of the charm. YMMV.