Review of 
Financial Cryptography,
the Workshop on Real-life Cryptographic protocols and Standardization,
and the Workshop on Ethics in Computer Security Research,
St. Lucia
February 28-March 4, 2011

Review by Omar Choudary

Introduction by Sven Dietrich

The Financial Cryptography and Data Security conference (FC'11) took place in Rodney Bay, St. Lucia, from February 28 to March 4, 2011. In attendance were about 100 participants from academia, industry, and government. The FC program chair was George Danezis, the general chair was Steven Murdoch, and the local arrangements chair was Fabian Monrose.

The main conference took place at the conference facilities of the Bay Gardens Beach Resort in Rodney Bay, February 28 to March 3, 2011. The three affiliated workshops took place concurrently at the Bay Gardens Hotel, a shuttle ride away, on March 4, 2011. The workshops had about 20-45 participants each, a number that fluctuated during the day, as workshop participants were free to wander between all three workshops.

On Saturday, March 5, 2011, a full-day excursion took a large group of participants on the 100-person-capacity catamaran Tango Too down the west coast of St. Lucia to Soufriere. There was ample opportunity to chat and mingle with other participants.

Presented here are two sets of notes, one from Omar Choudary for FC'11 and some of the three workshops, and another from Tyler Moore, for the Workshop on Ethics in Computer Security Research.

Comments from the FC11 presentations, by Omar Choudary

28 February 2011

Conference opening keynote (chair Andrew Patrick)

What I Learnt When Trying to Build a Bank
(presented by Jolyon Clulow)

The talk was about the speaker's experience in building a bank. The main topics discussed were Tesco Mobile and mobile payments within the Tesco bank: Tesco has 3000-4000 stores within the UK, and 20% of the population walks through a Tesco in any given week, so the intention to add banking capabilities to Tesco stores provides a large number of branches. The main task was to migrate RBS customers to Tesco's new infrastructure, all in less than 3 years. Jol is responsible for the security architecture and the deployment of security applications. The security architecture service poses non-trivial challenges: fitting into the organisation, and making people understand the security issues and how they affect the overall system. Jol then explained the strategy for getting a 90-95% discount on software solutions: take the first two vendors and tell them bluntly you will choose between them, driving the price down by up to 50%; then bring in a third vendor at a lower target price, and finally have all three compete at an even lower price.

Session: Security Economics (chair Tyler Moore)

Collective Exposure: Peer Effects in Voluntary Disclosure of Personal Data
(presented by Rainer Boehme)

Data came from an online app where people enter personal data in order to get financial credit. Names are replaced by nicknames; only age and city are required. The analysis was based on: a) subjective testing, with trained users categorising the amount and type of information entered by users; b) objective testing, with algorithms based on the length of the input data and some averaging against data from other users that could identify you. Result: users generally volunteer enough data to be identified.

It's about the Benjamins
(presented by Serge Egelman)

61% of US computers are infected, while people say they want more security, so why the infections? What are the mitigation techniques? An experiment paid people to install an unknown executable, using Amazon's Mechanical Turk as the experimental platform (paying $0.01, then x5, x10, x50, x100). The program would run for one hour and monitor whether the user quit. After the hour the user got a code and claimed the payment. For 50% of users the program asked for root privileges. Participation (viewed/downloaded/executed): 291/141/64 at $0.01, up to 1105/738/474 at $1. People with antivirus software had more malware than people without it. Feedback was collected via Mechanical Turk. Even security-diligent people admitted to infecting their machines once the price was right.

Session: Privacy and voting (chair Nick Hopper)

Evaluating the privacy risks of location based services
(presented by Julien Freudiger)

By sharing a location we share part of our identity; using data mining, identity can be inferred from location. The paper asks how many location samples are needed to recover identity (privacy erosion), and considers type vs quantity of location information. An experiment by EPFL and Nokia Research (running since 2009) analyses location (Wi-Fi, GPS, GSM, etc.) and other signals (e.g. Bluetooth) from each of 150 participants. State-of-the-art algorithms were used to infer identity from location (using points of interest such as home/work), based on clustering; see "On the Anonymity of Home/Work Location Pairs" by K. Partridge and P. Golle. Points of interest are also inferred. Results show the home/work anonymity set: how many users share their home/work location pair with others. For most location types, just a few location points give a good uPOI estimate; the uPOI gives a probability of being at home, work, the supermarket, etc. There seems to be a kind of scaling law: from just a little location data you can derive a large amount of identity data.


Q: What is the clustering based on?
Answer: clustering at the office level, an area of around 50 square meters (based on research vs legal aspects).

Selections: Internet voting system with Over-the-shoulder coercion-resistance
(presented by Jeremy Clark)

Coercion resistance is the main challenge analysed, although there are other issues. Two previous methods offer protection: JCJ/Civitas and AFT; JCJ allows revocation. Selections aims to get the best of both: linear-time tallying plus revocation. Authentication uses passwords. A panic password creates something that looks genuine but is invalid, useful for giving false information so we do not reveal our real vote. Building block: exponential ElGamal, i.e. Enc(g^m) where Enc(m) is regular ElGamal. For panic passwords, the basic solution of issuing two passwords does not work, because a coercer can simply ask for both. The 5P system uses passwords that are five words from a dictionary. Protocol: a) registration, where soundness is weak; b) voting: create the values (B, c', pi_1, g^p, pi_2), where c' and pi_1 are used for the roster/re-randomisation, g^p is the encoded password, and g^p together with pi_2 serves as the voting proof. Security is argued via a (sketched) security game: the solution is coercion-resistant if the voter achieves their goal with certainty when deceiving, and the adversary cannot distinguish a genuine from a fake vote. Conclusion: over-the-shoulder coercion and vote selling can be mitigated, but registration still requires in-person interaction, and the voter's untrusted computer remains an issue.
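The exponential-ElGamal building block mentioned above can be sketched in a few lines. This is a toy illustration (tiny group parameters, brute-force decryption of the small exponent), not the paper's implementation:

```python
import random

# Toy group: safe prime p = 2q+1, g generates the order-q subgroup.
p, q, g = 467, 233, 4

def keygen():
    x = random.randrange(1, q)          # secret key
    return x, pow(g, x, p)              # (sk, pk = g^x)

def enc(pk, m):
    # Exponential ElGamal: encrypt g^m so that ciphertexts multiply
    # to add plaintexts (the homomorphism used for linear tallying).
    r = random.randrange(1, q)
    return pow(g, r, p), (pow(pk, r, p) * pow(g, m, p)) % p

def add(c1, c2):
    # Component-wise product encrypts the sum of the plaintexts.
    return (c1[0] * c2[0]) % p, (c1[1] * c2[1]) % p

def dec(sk, c):
    gm = (c[1] * pow(c[0], q - sk, p)) % p   # strip pk^r, leaving g^m
    for m in range(q):                        # brute-force small discrete log
        if pow(g, m, p) == gm:
            return m

sk, pk = keygen()
tally = add(enc(pk, 1), add(enc(pk, 0), enc(pk, 1)))
assert dec(sk, tally) == 2                    # two "yes" votes out of three
```

In a real election the talliers would hold shares of sk and decrypt only the aggregate, never individual ballots.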


John McHugh: can an attacker detect that a panic password is in use?
Answer: the attacker knows the system, and enough panic passwords should be in use.

Nick Hopper: if I want to detect the 5 panic passwords, I just ask again in 5 minutes and see when you cheat.
Answer: there is indeed the memory issue.

Rob Johnson: does the user choose their own password?
Answer: yes.

Malice vs AN.ON
(presented by Benedikt Westermann)

Anonymity networks: Tor and AN.ON. Tor uses an onion-routing mechanism; in AN.ON each new TCP channel is encrypted with mix keys. There are similarities between the two, but AN.ON lacks replay protection and a relay cell. However, AN.ON has better latency, so the authors want to add the two missing properties. Attacker model: the attacker monitors the link from the user to the first mix and wants to link user and website. Redirect attack: make the user visit the attacker's website and use the browsing history and other data; this can be done by modifying the HTTP response (changing the status from 200 to 302 and redirecting to the attacker's website), with a 24% success rate. Replay attack: demonstrated with and without load.


John McHugh: how do you detect the rate of attack per server (Google, Facebook, etc)?
Answer: with Google seems tough.

Nick Hopper: first attack relied on open communication from user to mix. But this might be fixed by using ssl or something similar. What other problems are there?
No answer.

Session: Security and privacy (chair Rachel Greenstadt)

Absolute Pwnage: Security Risks of Remote Administration Tools
(presented by Jay Novak)

Case study: Absolute Manage, a remote administration tool that could allow a remote admin to watch the person in front of the machine using its camera. Analysis of Absolute Manage: communication runs over two channels, on ports 3971 and 3970. Problems: broken encryption and missing authentication. Broken encryption: Blowfish in ECB mode with hard-coded keys; the keys are text phrases, revealed by running the strings command on the binary. Authentication: none; it is based only on a seed value (the serial number in UTF) encrypted under the known key. It is even possible to ask for the seed value directly, by setting NeedSeedValue to true.
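The ECB weakness described above is easy to demonstrate: each block is processed independently under the same key, so equal plaintext blocks produce equal ciphertext blocks and patterns leak even before the hard-coded key is recovered. A minimal sketch, using a toy keyed block function as a stand-in for Blowfish (non-invertible, demo only) and a hypothetical hard-coded key:

```python
import hashlib

HARDCODED_KEY = b"secret passphrase"   # hypothetical; mimics a key recoverable via `strings`

def ecb_encrypt(key, plaintext, bs=8):
    # ECB: every block is processed independently under the same key,
    # so equal plaintext blocks yield equal ciphertext blocks.
    # (Toy keyed function standing in for Blowfish; not invertible.)
    blocks = [plaintext[i:i + bs] for i in range(0, len(plaintext), bs)]
    return [hashlib.sha256(key + b).digest()[:bs] for b in blocks]

ct = ecb_encrypt(HARDCODED_KEY, b"ATTACK!!ATTACK!!RETREAT!")
assert ct[0] == ct[1]       # repeated 8-byte plaintext block leaks through ECB
assert ct[0] != ct[2]
```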


Burt Rosenberg: did you reverse engineer Blowfish?
Answer: yes, using IDA Pro.

Bernhard Esslinger: what was the response in the media?
Answer: Absolute Manage said the next version will use public-key crypto, and they didn't find any complaints about this problem.

Jeremy: when was the product released?
Answer: past summer.

Rachel Greenstadt: what are the general problems observed across multiple systems?
Answer: generally key management although only looked at Absolute Manage in detail.

Protocol for anonymously establishing digital provenance in reseller chains
(presented by Ben Palmer)

Many reseller chains sell exclusively digital media (Amazon, iTunes), and resellers create a long chain. The customer wants to verify that the reseller purchased the item from an honest chain. Digital signatures and DRM do not really solve the problem here: a digital signature does not vouch for a given reseller, and DRM cannot really deal with an untrusted reseller (e.g. Apple's FairPlay system, where Apple is assumed to be trusted). Threat model: spoofing and counterfeiting; both parties would like anonymity and unlinkability. The solution, the Tagged Transaction Protocol, relies on a Tag Generation Centre (TGC). Building blocks: a) a signature scheme with provable security (e.g. RSA with the PSS padding scheme); b) zero-knowledge proofs. What's in the tag: Tag = {pk_x, L_x, pk_tag, a}_{sk_TGC}. The main steps of the protocol are: a) registration, done once per reseller; b) tag generation: reseller -> supplier -> TGC -> supplier -> reseller. It is possible to reissue a tag, and the TGC allows tag verification.


Rachel Greenstadt: what incentive has the client in this?
Answer: the client finds a cheaper version of the good but wants to be sure it is genuine.

Ross: how do you verify that the digital item is not a repackaging of something nasty (e.g. a sniffing app masquerading as Skype)?
Answer: using verified hashes.

Impeding Individual User profiling in Shopper Loyalty Programs
(presented by Patrick Traynor)

About loyalty cards: they are used for data mining, etc. Sometimes they don't even give discounts, but raise the price artificially and then lower it to the actual value once you present your personal card. How can we build an infrastructure through which consumers can make linkage across multiple events difficult? The solution, ShopAnon, is based on randomising the club card you use; it differs from e-cash, and does not deal with the points-allocator system. Assumptions: a) users can use their mobile phone (e.g. to store multiple barcodes); b) retailers can ban cards if they wish. Architecture: a) you open the phone app and the appropriate barcode pops up; b) selection is based on your location and a server; c) oblivious transfer makes it hard for the server to know which tag it sent you. Tests show that the barcode on the phone worked 95-100 percent of the time.


Sven Dietrich: if you pay by card you are linking the user.
Answer: yes, try to avoid that.

Serge Egelman: the club card is an opt-in system; your system seems like fraud by breaking the contract.
Answer: this is a different view.

Certified Lies: Detecting and Defeating Government Interception Attacks Against SSL
(presented by Christopher Soghoian)

CAs give certificates to anyone for money. Browsers accept a large number of "trusted" certificates, including government CAs, though governments are unlikely to use their own CAs to sign malicious certificates. The TLS position is that CAs should carefully verify requesters before signing. The Packet Forensics tool can be used to detect network topology, including a man in the middle. The speaker described a new attack, the Compelled Certificate Creation Attack, in which a government compels a company to issue a new signing CA. One solution assumes you trust CA changes within a country, but there is a problem if a site moves from a US-based to a Russia-based CA. The CA/Browser Forum (including PayPal, Microsoft, Google, Mozilla, etc.) is a cabal and must be replaced, the speaker argued, because its meetings are private and by invitation only, yet it decides everything that goes into browsers.

Ian Goldberg: I don't think there is any cabal there. Are we going to see 6 more weeks of this SSL stuff, or will the forum just give up?
A: probably the forum will not give up as the revenue stream is too high.

Serge Egelman: what hope do you have for addressing better the CA/trusting issue?
Answer: not advocating for my system to be deployed, but to get other people involved and trying to bring a better system.

1 March 2011

Session: Crypto I (chair Kazue Sako)

Authenticated key exchange under bad randomness
(presented by Guomin Yang)

The paper focuses on public-key protocols. Security model: the adversary can eavesdrop on, modify or drop messages; an oracle is also available. Two types of attacks on randomness were presented. Reset-1: the randomness is completely controlled by the adversary. Reset-2: the adversary doesn't know the randomness but can make the same value recur across different sessions (see the reset attacks against virtual machines, NDSS 2010). Protocols investigated: a) SIG-DH (ISO), SIGMA/IKE, JFK: if used with a digital signature scheme, the reset attack can recover the key from 2 signatures; b) HMQV: implicit authentication, but vulnerable to the reset attack; see the paper by Menezes and Ustaoglu in the Journal of Applied Cryptography, vol. 2, no. 2, 2010. The Reset-1 attack model: a) randomness chosen by the adversary; b) does not consider forward secrecy. Reset-2 considers forward secrecy; without it, Reset-1 is the stronger attack. The goal is to design, and prove the security of, a protocol based on HMQV. Construction: a) a transformation from Reset-2 to Reset-1 security by using a secure pseudo-random function that makes the randomness differ across sessions; b) use a digital signature scheme with forward secrecy and apply the PRF transformation.
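The PRF transformation in step (a) can be sketched as follows. This is an illustrative reading of the notes (HMAC standing in for the PRF, invented names), not the paper's exact construction: even if the raw randomness repeats across sessions, keying a PRF with a long-term secret and mixing in a per-session identifier yields fresh-looking values.

```python
import hashlib
import hmac

def session_randomness(prf_key, raw_randomness, session_id):
    # Reset-2 -> Reset-1 transformation sketch: pass the (possibly
    # repeated or adversary-influenced) raw randomness through a keyed
    # PRF together with a session identifier, so each session gets an
    # independent-looking value.
    return hmac.new(prf_key, session_id + raw_randomness, hashlib.sha256).digest()

k = b"long-term PRF key"
r = b"\x00" * 32                       # same (reset) randomness in two sessions
assert session_randomness(k, r, b"session-1") != session_randomness(k, r, b"session-2")
```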


Kazue Sako: why is the first scheme called Reset-1, since the adversary just controls the randomness completely?
Answer: because it is stronger than Reset-2, and to make it easy to memorise.

Oblivious Outsourced Storage with Delegation
(presented by Martin Franz)

Motivation: document sharing (e.g. Google Docs), where central storage can see the clear-text documents and has access to privileges and user patterns; the paper sets out to provide a solution to this issue. Rating-agency access is one example where privacy is paramount for financial markets. The work builds on basic ORAM, introduced in 1996 by Goldreich and Ostrovsky. Oblivious access means the server does not see which documents are accessed or the access type. The construction is based on square-root ORAM: the server stores 2*sqrt(n) additional memory locations, half as dummy data and half as cache. Access works by: a) getting the whole cache; b) modifying/reading the cache; c) reshuffling the cache. The authors provide a solution in which users cannot learn information about items they don't possess. Delegation: the data owner can delegate access to clients. The assumption is that the server is semi-honest: it follows the protocol and keeps the database correctly, but tries to learn as much as possible (user patterns, document data, etc.). Security model: a) access security, using encryption; b) access privacy: all operations on the database look the same to the server and to other clients. Clients identify data using tags. Each item has: a) a tag, a unique random id; b) a payload, the symmetrically encrypted, signed version of the item; c) a keybox, the encryption of the symmetric key used to encrypt the payload. A read/write/insert operation: a) scans the cache (getting each element and trying to decrypt it and verify the signature); b) accesses the main database, requesting a dummy item if the target was found in the cache; c) writes the item back into the cache. The solution has sublinear asymptotic complexity.
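The cache-scan/dummy-fetch access pattern can be sketched as below. This is a heavily simplified illustration: encryption, the keybox/tag machinery, the permutation of the main database, and reshuffling are all omitted (so only the cache/dummy mechanics are shown), and all names are invented.

```python
import random

class SqrtORAMClient:
    # Minimal sketch of the square-root ORAM access pattern:
    # a) scan the whole cache, b) fetch a dummy if the item was cached,
    # c) write the item back into the cache.
    def __init__(self, server_items, n_dummies):
        self.server = dict(server_items)                     # tag -> payload
        self.dummies = [f"dummy{i}" for i in range(n_dummies)]
        self.server.update({d: None for d in self.dummies})
        self.cache = {}
        self.trace = []                                      # what the server observes

    def read(self, tag):
        hit = tag in self.cache                              # a) scan the cache
        fetch = random.choice(self.dummies) if hit else tag  # b) dummy if cached
        self.trace.append(fetch)
        value = self.cache[tag] if hit else self.server[fetch]
        self.cache[tag] = value                              # c) write back into cache
        return value

c = SqrtORAMClient({"doc1": "hello", "doc2": "world"}, n_dummies=4)
assert c.read("doc1") == "hello"
assert c.read("doc1") == "hello"     # second read hits the cache...
assert c.trace[1].startswith("dummy")  # ...so the server only sees a dummy fetch
```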

Ian Goldberg: how do we manage an insert operation securely?
Answer: one solution is to use a fixed-size database. Another is to reshuffle only after a given percentage of inserts has been done, so the server only sees that some items have been inserted; but it is then necessary to have a big non-empty ORAM at the beginning.

Kazue Sako: the good thing about Google Docs is that many people can collaborate. What about this?
Answer: it doesn't work for this model.
Ross: your word-processing app is on the server, while in this case you do some processing on the client, which is a different model; so here you need your own word processor, like a copy of MS Office.

Session: Crypto 2 (chair Radu Sion)

Homomorphic signatures for digital photographs
(presented by Rob Johnson)

Goal: enable anyone to verify that digital photographs are real (i.e. to spot modifications). Perhaps at some point camera manufacturers will include a crypto module to sign images; a photo editor would then have limited access to update the signature, but only for a set of allowed modifications. Canon already has some history of using MAC-enabled USB sticks. Challenge: supporting post-capture edits. The idea is based on Merkle hash trees, but plain trees don't scale well: for a 512x512 image about 12000 witnesses are needed per proof. The better, novel idea is a 2-dimensional Merkle hash using partially overlapping hashing; with this, a 512x512 image needs just about 1500 witnesses. Croppable signatures are used for verification, and other image operations are supported. Scaling is handled via the DCT: scaling the image I corresponds to cropping its transform F, so taking the DCT of the image as the representation yields scalable signatures. Scaling and cropping can be combined to support signatures over both operations. The paper contains proofs.
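For reference, a plain (one-dimensional) Merkle hash tree with witness paths, the baseline the 2-D overlapping construction improves on, can be sketched as follows. The signer would sign only the root; a verifier checks one block against the signed root using the sibling hashes (witnesses) along its path:

```python
import hashlib

def H(data):
    return hashlib.sha256(data).digest()

def _next_level(level):
    if len(level) % 2:
        level = level + [level[-1]]          # duplicate last node on odd levels
    return [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]

def merkle_root(leaves):
    level = [H(leaf) for leaf in leaves]
    while len(level) > 1:
        level = _next_level(level)
    return level[0]

def witness_path(leaves, index):
    # Sibling hashes proving that one block belongs to the signed root.
    level = [H(leaf) for leaf in leaves]
    path = []
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        path.append((level[index ^ 1], index % 2))   # (sibling, am-I-right-child)
        level = _next_level(level)
        index //= 2
    return path

def verify(leaf, path, root):
    h = H(leaf)
    for sib, leaf_is_right in path:
        h = H(sib + h) if leaf_is_right else H(h + sib)
    return h == root

blocks = [bytes([b]) * 16 for b in range(8)]         # 8 fake image blocks
root = merkle_root(blocks)
assert verify(blocks[3], witness_path(blocks, 3), root)
assert not verify(b"tampered block!!", witness_path(blocks, 3), root)
```

A 512x512 image has far more blocks, which is why witness count matters and why the paper's 2-D construction is the real contribution.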


Question: what about legitimate use of photoshop?
Answer: there is some research on more operations and definitely thinking about how to solve the problem.

Ahmad: what is the big difference from Merkle hashes?
Answer: main contribution is efficient version of Merkle hashes.

George: adding some smoke is interesting. Lightening and darkening are also used in the press.
Answer: we could try to come up with a non-injective function applied to the pixels, but just a mask would not satisfy our method.

Ian Goldberg: if you only cut out a little bit of the data, can your hashes be brute-forced?
A: you can use a PRNG and the GGM construction to get good randomness.

Steffen: did you look at scaling the image? This might create artifacts.
Answer: not much analysis, but in several tests the images were OK.

Ian Goldberg: what about adding a watermark before the signature (a kind of second signature)?
A: there is a problem with updating/modifying images later.

Sven Dietrich: what about messing with the firmware of the camera using CHDK, which can be broken?
Answer: yes, that is a vulnerability if you get the module implementation wrong; similar to how the Canon module was broken.

Revisiting the computational practicality of private information retrieval
(presented by Femi Olumofin)

Alice makes a query q(i) to a database and receives the response x(i). The main problem tackled is computational practicality, which is the limiting factor for deployment. An existing result says that no single-server computational PIR scheme is as efficient as trivial PIR (downloading the whole database); see Sion and Carbunar, NDSS 2007. This paper's contribution: extend the practicality analysis to multi-server PIR schemes, never done in full before. Related work: a) offline preprocessing, Beimel et al., Crypto 2000; b) secure coprocessors, Smith et al. 2001, Asonov et al. 2004; c) multi-server PIR schemes, Sion and Carbunar, NDSS 2007. The authors conducted experiments on many datasets and found: a) an efficient single-server PIR; b) the best method uses GPUs, with a server-side processing rate of 2 Gb/s; c) tests ran with 2-28 GB of data. Analysis over many datasets and communication speeds, using multi-server PIR: a) the scheme by Chor et al. (MPIR-C), 0.3 s/GB to 26.7 GB/s; b) the scheme by Goldberg et al. (MPIR-G), 1.4 s/GB to 5.7 GB/s. Tests with 1-256 GB fetched 5 to 10 blocks per database from the servers, comparing trivial and non-trivial PIR schemes under three bandwidth scenarios: a) end-user home internet, 9 Mbps down / 2 Mbps up; b) Ethernet LAN, 10 Gbps up/down; c) commercial inter-site internet, 1.5 Gbps up/down. The trivial scheme is good but not always the best; comparisons covered databases that fit in RAM (16 GB) and those that don't (32 GB), and trivial schemes become better as bandwidth grows. Result: not all non-trivial schemes are more efficient than the trivial one. Older PIR schemes are not as efficient as the trivial scheme, a recent lattice-based PIR scheme is 10x more efficient, and multi-server PIR schemes are 100x to 1000x more efficient. Future research: analyse PIR in a cloud-computing environment.
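The 2-server scheme of Chor et al., the family behind MPIR-C, is simple enough to sketch. This is the textbook XOR construction, not the paper's benchmarked implementation: each server XORs the database bits selected by its query vector, and neither query alone reveals the target index.

```python
import random
from functools import reduce

def xor_bits(bits):
    return reduce(lambda a, b: a ^ b, bits, 0)

def two_server_pir(database, i):
    # Classic 2-server information-theoretic PIR (Chor et al.):
    # q1 is a uniformly random subset; q2 differs from q1 only at i.
    # Each answer alone is a random bit; their XOR is database[i].
    n = len(database)
    q1 = [random.randrange(2) for _ in range(n)]
    q2 = q1[:]
    q2[i] ^= 1
    a1 = xor_bits(b for b, s in zip(database, q1) if s)   # server 1's answer
    a2 = xor_bits(b for b, s in zip(database, q2) if s)   # server 2's answer
    return a1 ^ a2

db = [1, 0, 1, 1, 0, 0, 1, 0]
assert all(two_server_pir(db, i) == db[i] for i in range(len(db)))
```

Here each server touches the whole database per query, which is exactly the server-side computation cost the paper measures.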

Ahmad: did you also consider GPU processing in this context?
Answer: no, we considered both CPU and GPU; the claims are not based on GPUs exclusively.

Question: what are the consequences of using this at large scale in practice?
Answer: these solutions may be too computationally expensive for use in all applications, but might be OK to use partially.

Rob Johnson: how will this change in 10-20 years?
Answer: I expect in 10 years PIR should be practical.

Radu Sion: it would be useful to look at new schemes and how much they cost per transaction (in terms of CPU cycles).

Question: what does it mean to run at 1 GB/s? I am generally interested in bits.
Answer: it refers to the block size; depending on how you choose your block size, that is how much you can transfer per second.

Session: Crypto 2 (short papers) (chair Radu Sion)

Optimal one round almost perfectly secure message transmission
(presented by Mohammed Ashraful)

Goal: introduce a modular construction for secure message transmission (SMT) protocols, giving an efficient 1-round SMT protocol. The SMT problem: a) sender and receiver are connected by a synchronous incomplete network; b) there are n bidirectional wires; c) the attacker can corrupt t out of the n paths. Objective: transmit the message privately and reliably. Applications include multi-party computation and key distribution in sensor networks. Primitives used: a) an (n, t, delta)-Send; b) an (n-2t) x n information matrix. The receiver gets the information matrix with probability at least (1-delta); the first (n-t) rows are completely random. The adversary succeeds if he can replace the polynomial used for the secret message without any of the parties noticing. Privacy amplification: sender and receiver share a vector of size n, of which the adversary knows at most t (less than n) elements. Protocol: a) APSMT for n = 2t + 1; b) call the (n, t, delta)-Send in parallel; c) generate (n-t) one-time pads from the n random elements.
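The privacy-amplification step (n shared random elements, adversary knows at most t, extract n - t pads) can be illustrated with one standard construction: treat the shared vector as evaluations of a polynomial and evaluate it at fresh points, a Reed-Solomon/MDS-code argument. The field and parameters here are illustrative, not the paper's:

```python
import random

p = 257            # small prime field for the demo
n, t = 5, 2

def interpolate_eval(points, x):
    # Lagrange interpolation over GF(p), evaluated at x.
    total = 0
    for xj, yj in points:
        num, den = 1, 1
        for xm, _ in points:
            if xm != xj:
                num = num * (x - xm) % p
                den = den * (xj - xm) % p
        total = (total + yj * num * pow(den, p - 2, p)) % p
    return total

def amplify(shared):
    # shared[i-1] is read as g(i) for a degree-(n-1) polynomial g; the
    # n - t evaluations at fresh points n+1 .. n+(n-t) are uniform to an
    # adversary who knows at most t of the original n values (MDS property).
    pts = list(enumerate(shared, start=1))
    return [interpolate_eval(pts, n + k) for k in range(1, n - t + 1)]

pads = amplify([random.randrange(p) for _ in range(n)])
assert len(pads) == n - t
```

The intuition: any n symbols of the Reed-Solomon codeword determine it, so t known symbols plus the n - t output symbols still leave the outputs conditionally uniform.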

Question: does your protocol provide authentication as well?
Answer: this protocol is information-theoretic, no shared key at start so implicit authentication.

Radu Sion: are you aware of some specific problem or application?
Answer: there are some papers on this (cited in the main paper).

A new approach towards coercion-resistant remote e-voting in linear time
(presented by Rolf Haenni)

An adaptation of the JCJ voting protocol: how to run JCJ in linear time; the contribution is an efficiency optimisation. Intro on what makes a good voting system. Correctness: a) only authorized voters can vote; b) no voter can vote more than once; c) valid votes cannot be altered; d) all valid votes must be counted. Privacy: votes and voters cannot be linked. Verifiability: voters should be able to verify that their vote was counted correctly. Coercion-resistance: voters cannot be urged (forced or paid) to vote in a particular way, to vote at random, to abstain, or to give away private keying material. JCJ was proposed by A. Juels, D. Catalano, and M. Jakobsson at WPES '05. It offers correctness, privacy, verifiability and coercion-resistance, and is probably the best protocol available. Problems: the quadratic-time tallying procedure means it is not yet ready for practical use; the system must accept an unrestricted number of votes, so an attacker could flood it; and there is no secure platform. This paper addresses the quadratic-time tallying procedure of JCJ. Main protocol: a) setup: an ElGamal cryptosystem, with key pairs for registrars and talliers; b) registration; c) vote casting: a ballot B = (X, Y, Z) where X is the id, Y is the vote, and Z is a zero-knowledge proof; d) tallying, where votes with invalid proofs are removed, duplicates are removed, etc. The problem is that the whole thing runs in O(N^2). One way to speed up the computation is Smith/Weber's method: the talliers share a secret random number b. But this is insecure if applied after the mix-net, by using id and id^2; the idea here is to apply the method before the first mix. Modified protocol: a, b) setup/registration unchanged; c) extended ballot B = (X, Y, Z, I) where I = E(i), with the authorities asked to insert a random number of additional fake votes for each index i; d) modified tallying, where duplicate votes are removed using Smith/Weber's method and the remaining votes (X, Y, I) are mixed (first mix-net). The modified tallying runs in O(N) time.
A formal proof and an implementation are still missing.
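The linear-time duplicate removal at the heart of the modification can be sketched on plaintext ids as follows; the real protocol performs the blinding on ElGamal ciphertexts inside the mix-net, and the group parameters here are toy values. Equal ids collide under the shared blinding exponent, while distinct ids remain unlinkable to anyone who doesn't know b:

```python
import random

# Toy group: safe prime p = 2q+1; ids live in the order-q subgroup.
p, q = 467, 233
b = random.randrange(1, q)               # secret exponent shared by the talliers

def remove_duplicates(ids):
    # Smith/Weber-style deduplication sketch: blind each id as id^b and
    # detect collisions with a set. One exponentiation per ballot, so
    # O(N) total instead of O(N^2) pairwise comparisons.
    seen, kept = set(), []
    for i in ids:
        blinded = pow(i, b, p)
        if blinded not in seen:
            seen.add(blinded)
            kept.append(i)
    return kept

ids = [4, 16, 4, 64, 16]                 # subgroup elements; 4 and 16 repeat
assert remove_duplicates(ids) == [4, 16, 64]
```

The id/id^2 attack mentioned above is why the blinded comparison must happen before the first mix rather than after it.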


Radu Sion: what are the advantages of e-voting?
Answer: I don't care but it is being used widely (e.g. in Switzerland).

2 March 2011

Session: Hardware Security (chair Ahmad-Reza Sadeghi)

An attack on PUF-based session key exchange and a hardware-based countermeasure: erasable PUFs
(presented by Ulrich Rührmair)

Physically Unclonable Functions (PUFs) offer: a) identification; b) session key exchange; c) oblivious transfer. The contribution: an attack on the key-exchange protocol of Tuyls and Skoric, and a new kind of PUF. Setting: a bank-card scenario. The bank card contains a PUF and is inserted into many terminals. Goal 1: authenticate the bank card to the bank HQ. Goal 2: exchange a session key between the ATM card and the bank HQ. Assumption: even if the attacker gains access to the medium, he should not compromise other session keys; this is what was broken. The protocol TS'07: a) ATM + card + PUF; the card stores ID_puf, a counter n, and m = h^n(x); b) the bank sends some random challenges C_k to which it knows the PUF responses R_k; c) the card/PUF sends back the responses using an encryption key derived from m; d) the session key is generated from m and the PUF responses. The attack relies on the following assumptions: a) the attacker can eavesdrop on the communication; b) there are no digital keys on the card; c) the attacker gains access at least twice and can measure selected challenge-responses from the PUF on the card. Phase 1: the attacker gets access to the card and reads the values n and m at time T1. Phase 2: the attacker, knowing n(T1) and m(T1), obtains n(T2) and m(T2) = h^{n(T2)}(x), and derives K1(T2) = h(m(T2), ID_puf); he can then access the challenges. Phase 3: at time T3 the attacker replays the same challenges from time T2 and computes the session key K(T2) = h(K1(T2), R_k(T2)). Consequence: confidentiality is broken. The authors propose a new type of PUF to solve the problem: the erasable PUF, a special type of strong PUF where single responses can be erased without affecting the rest of the PUF. This is not possible with current architectures, since PUF responses are generated by the interaction of many components, so erasing one affects many responses. There is a recent construction, crossbar-based SHIC PUFs, which might be usable as erasable PUFs.
The authors fabricated several ALILE-based diodes, and experiments show that small voltage levels allow storing some information.
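The hash-chain step of the attack can be sketched as below. This is our reading of the notes, under the assumption that the counter n grows with use, so later card states lie forward along the chain; if so, an attacker who once read (n1, m1) can compute any later state without knowing the secret x.

```python
import hashlib

def h_iter(value, times):
    # m = h^n(x): the hash h applied n times, modelling the card state.
    for _ in range(times):
        value = hashlib.sha256(value).digest()
    return value

x = b"card secret x"        # hypothetical card secret, never seen by the attacker
n1, n2 = 3, 7

m1 = h_iter(x, n1)          # state (n1, m1) read from the card at time T1
m2_card = h_iter(x, n2)     # true card state at the later time T2

# Forward computation: h^{n2}(x) = h^{n2-n1}(h^{n1}(x)) = h^{n2-n1}(m1).
m2_attacker = h_iter(m1, n2 - n1)
assert m2_attacker == m2_card
```

From m(T2) the attacker derives K1(T2) as in Phase 2, which is exactly why erasing individual PUF responses (the erasable PUF) is proposed as the fix.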

Ian Goldberg: is the protocol actually broken, since you only have backwards access?
Answer: it is a matter of assumptions and perspective, but under our assumptions it is broken.

Question: the challenge for the erasable PUF must be harder than just selecting which x/y cell you target?
Answer: no, the crossbar is constructed in a way that lets you reach your target cell very quickly.

Omar: at time T3, hasn't the bank already started using another set of challenges, making the session key unusable?
Answer: no, because of the way the bank refreshes the challenge-response database.

Peeling away layers of an RFID system
(presented by Henryk Plotz)

Most RFID protocols are broken: Legic Prime, HID Prox, Hitag, Mifare Classic, and to a lesser extent Legic Advant and HID iClass. The focus here is Legic Prime, one of the oldest systems at 13.56 MHz; it was proposed for standardisation as annex F of ISO 14443, but was not accepted. Master token system control is used to create a hierarchy of privileges; segment protection gives read, write or no access depending on the privilege. The attack used a two-path approach: black-box analysis plus hardware reverse engineering. Hardware reverse engineering: a) remove the cover; b) acetone for the plastic; c) fuming nitric acid, or simmering in rosin, for the chip cover; d) polish down layer by layer with very fine sandpaper; e) analysis: photograph under an optical microscope, determine all the blocks (gate functions), follow all the paths between gates and see what happens. Protocol reverse engineering used the proxmark3 tool, with an oscilloscope connected directly to the proxmark device to analyse raw samples, following a loop of steps: assumption, sniffing, examination, verification. Simple experimentation yields the cipher initialization: it is actually obfuscation, not encryption, and all parameters are easy to find once the obfuscation is known. Analysing the master token design leads to completely breaking the master token system. A better design is needed, with a good random generator, mutual authentication, and message authentication for all exchanged messages.

Ian Goldberg: was the reason they didn't implement the security mechanism perhaps to limit costs?
Answer: maybe.

Steven Murdoch: what is the relative die size and cost of the Mifare Classic vs the Legic Prime?
Answer: not sure right now, but the Legic die is much smaller than the Classic's.

George Danezis: did you actually need to peel the card?
Answer: not really, had we known from the beginning that the obfuscation logic only uses a small number of states.

Ahmad: do you have any inside information on the risk analysis that was performed when the cards were produced?
Answer: no. NXP definitely has a security division, which made some tests on the Mifare Classic as well, but the cards were broken anyway.

Ian Goldberg: are you aware whether the hardware person might have given you wrong results?
Answer: probably not, since the results were verified in software.

Session: Banking security (chair John McHugh)

Might financial cryptography kill financial innovation? - The curious case of EMV
(presented by Omar Choudary)

This paper looks at the possibilities of extending current payment systems. The main interest is to allow micro-merchants, such as people in a city market, to accept non-cash payments. There is already an existing system in the US which allows payment with a magnetic-stripe card by connecting a reader to a phone. The authors of this paper look at the more complex system used in Europe and other countries, EMV. The main solutions proposed are: a) use the CAP mechanism to enable merchant authentication, though this can only be verified at the bank; b) use the Dynamic Data Authentication (DDA) mechanism to enable merchant authentication, as this can be verified by clients as well; c) use a non-bank payment system such as PayPal together with credit cards and mobile phones. The authors also propose using DDA-enabled EMV cards as authentication tokens.

Moti Yung: what is the answer to the question in the title?
Answer: it depends on how much banks or other organisations will be interested in trying this system.

Jon Callas: how can we interfere with regulation in order to get more open standards?
A: it seems hard with the current closed EMV system; in the US it is easier, considering the current market.

hPIN/hTAN: A lightweight and low cost e-banking solution against untrusted computers
(presented by Shujun Li)

Motivation: untrusted computers are a big problem for e-banking, and existing solutions suffer from a security-usability dilemma. The authors' solution, hPIN/hTAN, combines a simplistic design with an open framework. Two parts: hPIN for login and hTAN for transactions. Three h's: hardware + hashing + human. Three no's: no keypad + no OOB channel + no encryption. A proof of concept and a user study show better security-usability. Assumption: the attacker has full control of the user's computer. hPIN for login: text typed on the keyboard is encoded via a translation keypad shown on the USB token. E.g. to input 4318 you type ajwe on the computer; the banking system knows how to decode ajwe back to 4318. Security aspects: PIN confidentiality, user/server authentication. Usability tests show a success rate of 91%, a median login time of 27.5 s, and 70 s to complete a transaction using an ATmega32 @ 16 MHz. Changes are required to both client and server.
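The translation-keypad idea can be sketched in a few lines. This is a toy model, not the authors' protocol: in hPIN/hTAN the decoding happens on the trusted USB token, whereas this sketch keeps everything in one process just to show the encode/decode round trip.

```python
import secrets
import string

def make_mapping():
    """Token displays a fresh random table mapping each digit to a letter."""
    letters = secrets.SystemRandom().sample(string.ascii_lowercase, 10)
    return {str(d): letters[d] for d in range(10)}

def encode_pin(pin, mapping):
    """User reads the table on the token and types letters on the untrusted PC."""
    return "".join(mapping[d] for d in pin)

def decode_pin(typed, mapping):
    """The trusted side, which knows the table, maps letters back to digits."""
    inverse = {v: k for k, v in mapping.items()}
    return "".join(inverse[c] for c in typed)

mapping = make_mapping()
typed = encode_pin("4318", mapping)      # e.g. the "ajwe" of the talk
assert decode_pin(typed, mapping) == "4318"
```

Since the mapping is fresh per login, malware on the PC that captures the typed letters learns nothing reusable about the PIN.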


Ian Goldberg: how does your system actually verify the amount being entered?
Answer: you see clear text showing what you are typing, which also gets your attention.

Question: how does your user study population compare with the customer population?
Answer: do not know.

Rob: was there any measure to avoid politeness bias in your user study?
Answer: agreed; the participants (the authors' own students) were probably more polite.

Panel session: the future of banking security and financial transactions for the 21st century
(chair Jean Camp)

Ross Anderson.   Interesting times in the dynamics of payment systems. Two sides to the problem: a) you have to appeal to customers; b) big changes happen rarely. The US has a fight between retailers and card networks: Walmart vs VISA/MasterCard. There is an issue with contactless, where no PIN is used. How do we move towards a secure element in NFC technology? It would be useful in many situations and markets, such as those in Africa. There are no vendor plans to create secure applets that are inserted into the secure element of the phone; they just have an Oyster-card-like token tapped on the back. What we want: an easy and secure way to do transactions. Hardware needs to be able to cope with a new system, and maybe remove untraceability. Such payment schemes are going to push VISA/MasterCard out of the markets; they are very aware and afraid of this, so they will try to limit market changes and innovation. All of this will involve legal discussions and new laws. What do we do to make federated authentication work? Economic vision: how do you do it well when your phone has 5 card tokens inside? What happens when your phone is taken/stolen? How many banks/institutions do you need to call? Who is going to take responsibility? How do we change the business model so that banks and other institutions compete to be your friend?

Ahmad-Reza Sadeghi.   Based on discussions with German banks, security is not the issue; the business model is. Social networks will be the basis for financial systems across different communities, exchanging information, credit and other financial products. Banks are observing this market and will probably create new communities to target the social factors. Injecting fake information in order to profit from the new system will create a kind of anti-social network, where people will be afraid of all the social ads. We could see freedom-fighter vs terrorist patterns as users adopt social network technologies.

Steven M. Bellovin.  Not sure if banks are stupid, greedy or ignorant. Banks design by the method of "follow the money". Security analysis: fraud costs that much, security costs almost the same, so they really don't care. Banks don't seem to understand today's, or even yesterday's, cyber attacks. They only seem to understand physical attacks: they are very good at designing resistant vaults. The economic incentives of existing systems are wrong.

Lenore D. Zuck.   Focus on financial infrastructure in the US. There is very limited research and academic knowledge of the financial sector's infrastructure. Work is only possible if we can come up with a good funding proposal; the effort must come from the academic community, to push the funding organizations into moving money. Several areas have received a lot of funding, e.g. botnets: the FBI, NSF and others have been channeling money into this. NSF organised a workshop on the security of financial infrastructure (late 2009); the result was that the financial sector and the academic community speak different languages, and we need to get to a common understanding. US payment systems settle $4.3 trillion a day; VISA's total is $5.4 trillion a year. FBIIC and other organisations ran a two-day workshop; the financial community sees itself as facing no threat, with minor concerns on vanilla things: software correctness and insider threats. It is difficult for them to openly admit that they have problems. Yesterday there was an attack on Morgan Stanley by Chinese hackers. Generally the systems are not secure, but this is hard to analyse as they are closed systems. NIST and DSA are starting a funding program for financial security infrastructure, giving 2-5 million $, so basically nothing. Outcome: we need to apply more pressure to get more money.


Jon to Lenore: there are chunks of small amounts of money which are easy to get. Can this be adapted for the academic world: just make a couple of web pages and get the $10K?
Answer: it is not worth even doing the two web pages for just $10K.

Salvatore to Lenore: attacks high up in the network/infrastructure are the big thing; that is where banks focus, and they say academics don't have the knowledge/solutions.
Ross: I worked in banks 25 years ago and things haven't changed. The security can easily be broken if you are within their space. The only solution is to train PhD students who then go into the banks, chop up their systems, and try to evolve them.

Jean Camp: do banks actually talk to each other?
Answer: of course they do, but nothing more secure comes out of it. Ross: unfortunately the risk assessment passes through multiple layers of management, which remove any trace of real risk assessment from the lower layers (i.e. the engineers).

Ahmad: we produced good PhD students who got accepted by industry, but within 2 years they are completely brain-washed. I don't see the solution to this.

Lenore: it doesn't matter how well we train students, since banks use 3rd-party software in every solution. Banks believe that a PhD student can create secure software that works with their 3rd-party applications.

Ross: we need to be realistic about what important attacks can happen to the high-level financial institutions. The average insider cannot do much. You can do organizational damage, and maybe terrorists could mount a DDoS before Christmas to bring down the payment market.

Bernhard Esslinger: is there big competition among banks to have less fraud than the other banks? Ahmad: they have numbers internally, but there is not enough outside transparency to see whether this is the case. The real issue is: do we actually need banks at all?

Session: Privacy short session (chair Roger Dingledine)

Beyond Risk-Based Access Control: Towards Incentive-Based Access Control
(presented by Debin Liu)

Motivation: incentive-based access control that provides access flexibility. Assume your organization can manage user profiles; what is the risk budget and incentive? Game equilibrium? The user should be able to choose the risk mitigation effort, balanced against the reward of choosing one option or another. Example: a bank teller with read access to the customer database, charged 1000 tokens for access with the copy-paste function vs 200 tokens without copy-paste. Survey with tests on 36 students, asking: when you see a virus, what do you do with your computer? 61% scan their personal computer but only 52% scan the organisation's computer. Task: send 10 docs in 10 emails as attachments; participants could encrypt/sign messages. Result: a) a new incentive-based access control mechanism; b) it encourages users to make necessary accesses and discourages them from taking unnecessary risks. Questions.
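The token-budget mechanism behind the teller example can be sketched as follows (a minimal sketch; the class and cost names are mine, not the paper's):

```python
class RiskBudget:
    """Each user gets a budget of risk tokens; riskier access modes cost
    more, nudging users toward the less risky option when it suffices."""
    def __init__(self, tokens):
        self.tokens = tokens

    def request(self, cost):
        if cost > self.tokens:
            return False       # denied: this access would exceed the risk budget
        self.tokens -= cost    # charge the budget for the granted access
        return True

# Costs echo the talk's teller example: copy-paste access is 5x riskier.
COSTS = {"read_with_copy_paste": 1000, "read_only": 200}

teller = RiskBudget(1200)
assert teller.request(COSTS["read_only"])             # cheap, low-risk access
assert teller.request(COSTS["read_with_copy_paste"])  # spends most of the budget
assert not teller.request(COSTS["read_only"])         # budget exhausted
```

A rational user who only needs to read will pick the 200-token mode, which is exactly the incentive the mechanism aims for.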

John McHugh: you must be careful about drawing conclusions from data that is not statistically significant. Question: were you trying to see how quickly your computer would scan?
Answer: not really.

Roger Dingledine: you don't just need to give good rewards, but also to make sure each user does what he is supposed to do.
Answer: we observed all user behavior and applied the mechanism in an industry product.

Proximax: Measurement-driven Proxy Dissemination
(presented by Kirill Levchenko)

Motivation: censorship. Flavours of censorship: a) block non-permitted content (the focus of this talk); b) allow only permitted content. Proxies offer the same resource under a different name/location, but are circumvented by content-based filtering; counter this with encryption, obfuscation, etc. Problem: China blocks Tor repeatedly by enumerating Tor routers and shutting them down. Current stats: 600 direct users, 2000 via bridges. The authors focus on the proxy dissemination problem: how to provide a proxy service without getting the proxy blocked? If advertised too broadly, proxies are quickly censored; advertised too narrowly, we get just a few users. Solve by recasting this as an optimisation problem: maximise the number of user-hours of usage. Mechanism: let users decide how to disseminate addresses. Solution details: a) measurement of service; b) evaluate registered-user performance; c) compute yield: how many users a registered user has attracted, where multiple registered users may be advertising the same proxy. System overview: a) an administrator controls the database; b) registered users get proxy addresses; c) they then advertise the proxy addresses and attract users; d) attracted users use DNS to look up the proxy/website; e) feedback loop: the DNS server and proxy report usage stats. Using the measurements, the system compares the number of new users against the risk to the proxy; if the effect is good, the system allows the user to advertise, else it skips the user. Questions.
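One plausible reading of the yield measure (my sketch, not the paper's exact formula) is to attribute measured user-hours back to registered users, splitting credit when several of them advertise the same proxy:

```python
from collections import defaultdict

def compute_yield(usage_hours, advertisers):
    """usage_hours: proxy -> attracted user-hours (from DNS/proxy logs).
    advertisers: proxy -> registered users advertising that proxy.
    Credit for a proxy is split evenly among everyone advertising it."""
    credit = defaultdict(float)
    for proxy, hours in usage_hours.items():
        who = advertisers.get(proxy, [])
        for user in who:
            credit[user] += hours / len(who)
    return dict(credit)

y = compute_yield({"p1": 90.0, "p2": 30.0},
                  {"p1": ["alice", "bob", "carol"], "p2": ["alice"]})
# alice gets 90/3 + 30 = 60 user-hours of credit; bob and carol get 30 each
```

The system would then let high-yield users advertise more proxies, and skip low-yield (or risky) ones.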

John McHugh: any sensitivity analysis on how registration should be done?
Answer: there are several ways, including a private link, a blog, etc.; it's up to the users.

John McHugh: how do you make sure the disseminated proxies do not all get detected?
Answer: assumption is that users are motivated to increase security.

Tyler Moore: how does this defend against Chinese users participating in the system (an active adversary)?
Answer: in the ideal case the sets of registered users would be disjoint. But indeed that problem exists; this is just an idea.

Steven Murdoch: the advantage of Tor is getting users from many groups. But in this system you start to see certain nodes created by groups of people, and that can be bad: the censor spots one user and then takes out all the users connecting to that bridge.
Answer: yes, that is a weakness.

Roger Dingledine: there is a concern with counting user-hours. As an adversary I could pump up user-hours artificially. How do you fight against this?

Answer: it comes down to how to define a useful service. We still don't have a good answer to that.

BNymble: More Anonymous Blacklisting at Almost No Cost
(presented by Nick Hopper)

Problem with collaborative websites (e.g. Wikipedia): people posting unproductive (stupid) stuff on them. Counter-measures include blocking the IPs of such users. This is a problem in countries where access is limited, e.g. China because of firewalls; the solution there is to use Tor or other proxy methods to circumvent the restriction. Dilemma: what to do if a misbehaving user is using proxies? Currently blocking all proxies causes many issues, as legitimate Chinese users cannot contribute. Idea: use a kind of mask on top of Tor to prove to Wikipedia that so far the user hasn't done anything wrong; provide an anonymous system to ban users based on bad actions. We need unlinkability + blacklistability, and it must be very cheap. There is an existing solution, Nymble, which solves the problem by introducing a pair of trusted parties: a pseudonym manager (PM) and a Nymble manager (NM). The user gets an unlinkable id (Nym), sends the Nym to the NM and gets tokens (n1, ..., nw). The user then talks to Wikipedia using the tokens; if the user misbehaves, a token is sent to the NM, which extracts the future tokens n1...nw and sends them to Wikipedia to block the user. Steps: a) registration; b) Nymble issuing based on a hash function over a time sequence. There is a problem if the NM talks to the PM, as the user-action pairs are revealed. Newer solutions: a) BLAC: very private, not scalable; b) Nymbler/Jack: variants of Nymble with better privacy (no NM/PM collusion problem); c) the authors' new contribution: BNymble, just like Nymble but more scalable, with the Nym replaced by an asymmetric token to provide a signature. Questions.
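The "hash function over a time sequence" can be illustrated with a loose sketch: iterating a hash from a per-user seed yields one token per time period, and knowing the chain state at period i lets the NM recompute (and hence blacklist) all later tokens without linking earlier ones. This is only the flavour of Nymble; the real construction uses two interleaved keyed hash functions.

```python
import hashlib

def nymble_chain(seed, w):
    """Derive w per-period tokens from a per-user seed.
    Advancing h = H(h) and emitting H("nymble" || h) means a token alone
    does not reveal the chain state, but the state at period i yields
    tokens i..w (forward blocking, backward unlinkability)."""
    tokens, h = [], seed
    for _ in range(w):
        h = hashlib.sha256(h).digest()                         # advance the chain
        tokens.append(hashlib.sha256(b"nymble" + h).hexdigest())  # period token
    return tokens

tokens = nymble_chain(b"per-user-seed", 4)
assert len(set(tokens)) == 4   # tokens look unrelated across periods
```

BNymble keeps this token structure but replaces the PM-issued Nym with a user-generated public key, removing the need to trust the PM for unlinkability.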

John McHugh: you must record the token that was used in every action. So do you have forward linkability?
Answer: in the original Nymble the system takes care that you only get a good new id.

Towards Secure Bioinformatic Services
(presented by Stefan Katzenbeisser)

Issue: how to securely process genome data? Problems: genome data is very sensitive and can be used for many things in the future. Idea: process the genome data privately using a 2-party computation protocol. The physician does not need to reveal data to the bioinformatics institute; at the end of processing he gets the desired result using some kind of query. There is a lot of previous work, some describing how to encrypt genome data and recognise patterns to analyse it, but problems like cancer cannot be reliably detected using string matching and pattern analysis. Better tools include Hidden Markov Models (HMMs), where all transitions are probabilistic. There are proposed solutions using a secure computation of a probabilistic HMM query, but there are problems with the length of integer values, which are too short and lose data. The authors' contribution: secure computation with non-integer values, approximated using a logarithmic representation. This provides secure and efficient arithmetic operations with constant error. Implementation: code the floating point value as an integer triple (ρ, σ, τ). ρ: whether the value is 0; σ: sign; τ: logarithmic representation of the magnitude. Multiplication is simple, but addition is very complicated, so the authors create a table with all the possible values of the computation; the expectation is just a few thousand entries if the base is chosen correctly. With this trick, addition gets much more acceptable: 90-140 secure additions per second on a 2.1 GHz computer. Experiments show the method works, with high accuracy obtained despite heavy quantization. Questions.
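The logarithmic encoding can be illustrated in plain (non-secure) arithmetic; in the protocol these integer operations run under the 2-party computation. The base value and rounding here are my choices for illustration, not the paper's parameters:

```python
import math

BASE = 1.001  # a base close to 1 keeps the addition table small

def encode(x):
    """(rho, sigma, tau): zero flag, sign, integer log-magnitude."""
    if x == 0:
        return (0, 0, 0)
    return (1, 1 if x > 0 else -1, round(math.log(abs(x), BASE)))

def decode(t):
    rho, sigma, tau = t
    return 0.0 if rho == 0 else sigma * BASE ** tau

def mul(a, b):
    """Multiplication is easy: add the log parts, multiply the signs."""
    if a[0] == 0 or b[0] == 0:
        return (0, 0, 0)
    return (1, a[1] * b[1], a[2] + b[2])

def add_same_sign(a, b):
    """Addition uses log_b(x+y) = tau_hi + log_b(1 + b^(tau_lo - tau_hi));
    the protocol looks the correction term up in a precomputed table."""
    if a[0] == 0:
        return b
    if b[0] == 0:
        return a
    hi, lo = max(a[2], b[2]), min(a[2], b[2])
    corr = round(math.log(1 + BASE ** (lo - hi), BASE))
    return (1, a[1], hi + corr)
```

For example, decode(mul(encode(3), encode(5))) is approximately 15 and decode(add_same_sign(encode(3), encode(5))) is approximately 8, with the small constant relative error the quantized τ introduces.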

Moti Yung: there were two papers on this in the past, e.g. O2 - computing on rationals.
Answer: yes, they have been checked, but this is a different method.

Kirill Levchenko: why do you take the model where the physician's side is trusted and the other is not?
Answer: it's a matter of regulations and policy. We used this to analyse very personal data that you don't want to give to your physician.

3 March 2011

Session: Web security (chair Konstantin Beznosov)

Quo Vadis? A Study of the Evolution of Input Validation Vulnerabilities in Web Applications
(presented by Theodor Scholte)

The authors presented several methods to improve the security of web applications by providing a design model for developers. This gives an implicitly secure framework in which developers can design web applications without thinking about the security implications. Questions.

Tyler Moore: what percentage of servers are still vulnerable over time?
Answer: this study only focused on web applications installed on different hosts; we haven't looked at which websites on the internet are vulnerable.

Rainer Boehme: the database is updated and modified over time. Do you take this bias into consideration?
Answer: Yes. We took the vulnerabilities that had associated data in other sources. After collecting all the data we started classifying and extracted "clean" data.

Rainer Boehme: still, you might get some bias; shouldn't you compensate for it?
Answer: the bias was very small.

Rob Johnson: on the graph of vulnerability reports per year, did you make any attempt to measure the rate at which vulnerabilities are added to the database each year?
Answer: no.

Konstantin Beznosov: there have been different efforts to enforce web application security, and the conclusion is that there are still many problems. So do we need code review, or what other measures?
Answer: all the approaches are complementary, so we need multiple.

Question: why validation by design?
Answer: it seems that even after the time spent in training, developers still don't learn. Therefore, if we do things by design, the developers will be less involved in inserting vulnerabilities.

Re-evaluating the Wisdom of Crowds in Assessing Web Security
(presented by Pern Hui Chia)

Motivation: web security remains challenging. Online banking fraud losses in the UK were £59.7M (2009) and £24.9M (Jan-June 2010). The porn industry is worth more than $97 billion, and it introduces much malware. Some existing solutions divide sites into good and bad. Good sites: certification. Bad sites: blacklists. But we still have a large gray area: sites not blacklisted but carrying malware. Idea: can we rely on the crowds to improve security? Apply the "wisdom of crowds" to security: the many can be smarter than the few. Problems: reliability, incentives, gaming behavior. PhishTank is a possible framework for this, used by McAfee, Mozilla, Yahoo, etc. This paper looks at "Web of Trust" (WOT), which gives a confidence rating for a website. Default signaling: WOT displays a colored ring next to each site in search results, based on the trust and confidence level. Experiment with WOT: over 20k domains selected from the Alexa top million sites; ratings and comments counted over 50k users, plus 485k random comments overall. Coverage for general sites: WOT (51%), much less than SA, SBDP and SW (other similar solutions). Outcomes show 3% bad sites. Evaluation outcomes diverge: out of 950 sites, only 2 get the same evaluation results. WOT has high recall but low precision in identifying bad sites. Contribution pattern: skewed contribution ratios. Comments follow a power law, while ratings follow a non-power law. The numbers of unique contributors and comments increase over time, but the distribution also gets more skewed. Experiment looking at exploitability vs objectivity: in WOT more than 90% of comments are on sites with a low level of conflict, where conflict is defined as a positive comment given to a bad site. Limitations of the study: did not evaluate WOT's effectiveness against long- or short-lived malicious websites.
Conclusions: a) risk mitigation measures are useful; b) it remains to be asked whether the skewed participation ratio influences results; c) the details of the rating and reliability computation are hidden; d) WOT is more comprehensive than other tools in identifying "bad" domains; e) wisdom-of-crowds security can work with good design. Questions.
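The "high recall, low precision" finding can be made concrete with the standard definitions (the site names below are made up for illustration):

```python
def precision_recall(flagged, truly_bad):
    """Precision: fraction of flagged sites that really are bad.
    Recall: fraction of truly bad sites that got flagged."""
    flagged, truly_bad = set(flagged), set(truly_bad)
    tp = len(flagged & truly_bad)
    precision = tp / len(flagged) if flagged else 0.0
    recall = tp / len(truly_bad) if truly_bad else 0.0
    return precision, recall

# A crowd that flags aggressively: both bad sites caught, two false alarms.
p, r = precision_recall({"a.com", "b.com", "c.com", "d.com"},
                        {"a.com", "b.com"})
# p = 0.5, r = 1.0: no bad site missed, but half the flags are wrong
```

A rating system in this regime catches malicious sites but wrongly stigmatizes benign ones, which is the trade-off the paper highlights for WOT.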

John McHugh: might the overlap of websites in the divergent evaluation be analysed wrongly, e.g. if just 2 websites are rated by all users?
Answer: we also looked at the comments from users and they seem to be sparse.

Tyler Moore: this work doesn't give a convincing argument for using WOT and the wisdom of crowds mechanism. Do you have any way to look further down the tail?
Answer: it could be an interesting follow up.
Tyler Moore: also, on the differences, you could analyse in which markets the crowd does better than others and classify this more explicitly.

Paul Van Oorschot: how quickly does WOT detect that a good website has become bad because it was compromised?
Answer: the ratings are updated after several hours. Paul Van Oorschot: yes, but a manual rating can take a bit longer.

John McHugh: by tracing websites over time, do you see websites changing from good to bad?
A: we didn't analyse that.

Mercury: Recovering forgotten passwords using personal devices
(presented by Mohammad Mannan)

The focus is more on the problem side of recovering passwords. Background: the need for recovery; recovery vs reset. Reset sometimes leads to new but weak passwords being transmitted; how do we get our original strong password back? What if you forget your password? Some have it written on the desk. In a small environment you can get it from your administrator; in a large environment there are phone calls and PVQs (personal verification questions), which may be scalable but are insecure. State of the art: password managers, email, SMS, phone, ownership, PVQs, including Facebook social verification. Email password reset/recovery is widely used but requires a high level of trust. The Facebook social authentication system shows pictures of people and asks you to identify friends; but some people use bad images that are difficult to identify, while others just don't know all their "friends" too well. Blue Moon: preference-based authentication, where preferences (e.g. liking Chinese food) do not change quickly; better than PVQs but not used for password recovery. New idea: Mercury; use end-to-end encryption for safe password retrieval. The user generates a key pair for password recovery and shares the public key with a site during account setup; the site sends the encrypted password during recovery. Implementation of Mercury: using a mobile phone, with personal-level public keys and no PKI. Mercury key generation can be done using personal objects, see object-based passwords (HotSec '08); if you lose the image from one device you can still recover from backup media, or just use any other random generator source. The authors propose using the same seed so that you can restore the key from a backup. Key sharing with any site happens e.g. while registering; if using a personal device, this can send the key directly. Password recovery: the server sends the password, encrypted under the public key, to the personal device. If the server just stores a hash, then the server can send a reset password, or store the password encrypted (under the public key) on the website.
Features and advantages: a) secure recovery; b) no third parties; c) cheap sort-of-two-factor authentication. Limitations: the key is stored on a mobile device; if you lose this you give access to others. The user must handle key updates. Try it on:
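The Mercury message flow can be sketched with textbook RSA on tiny primes. This is deliberately insecure toy crypto (per-character encryption, no padding); it only shows who holds which key: the site keeps the password only under the user's public key, and the private key never leaves the personal device.

```python
# Insecure toy RSA, purely to show the Mercury message flow.
p, q, e = 61, 53, 17
n, phi = p * q, (p - 1) * (q - 1)
d = pow(e, -1, phi)   # private exponent stays on the personal device

def site_stores(pubkey, password):
    """At registration: the site keeps the password encrypted under the
    user's public key, so it never needs the plaintext for recovery."""
    n, e = pubkey
    return [pow(ord(c), e, n) for c in password]

def device_recovers(blob):
    """At recovery: the personal device decrypts with the private key."""
    return "".join(chr(pow(c, d, n)) for c in blob)

blob = site_stores((n, e), "s3cret")
assert device_recovers(blob) == "s3cret"
```

A real deployment would use proper hybrid encryption; the point is only that no third party sits between the site and the user's device.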


John McHugh: the key that you use for recovery is still something you need to keep. Why is this different from the password?
Answer: the expectation is that you use the private key in just a few situations.

Moti Yung: you can use the same recovery key across many websites. Kazue Sako: maybe avoid using the phone?
Answer: it might be simpler, but then you run into the trust issue.

Omar: if you have the private key, then just use that for authentication.
Answer: the goal is to avoid using it in public places.

Serge Egelman: this encourages people to store plain-text passwords on websites.
Answer: studies show that 25 per cent of web sites actually store plain-text passwords.

Closing talk (chair Burt Rosenberg)

Why Mobile Security is not like Traditional Security
(presented by Markus Jakobsson, Paypal)

There are several problems with mobile phones. Lack of crypto; social abuse: phones are used much more socially than PCs; power limitations: with AV software running on the phone, the battery will die in 30 minutes. Job time / CPU time increases linearly with the number of applications and things the AV must be aware of; AV software works very well in the absence of a real threat. Limited user interfaces: it is hard to read a captcha on a small screen, and silly rules (upper case, special character, etc.) make it harder. Password entry pain: it takes 2.5 times longer to enter passwords on the phone. Our own inertia: we just go along with the applications without examining how exactly things are done. Problems (from acceptable to very bad): lack of coverage, poor voice quality, small screen size, difficulty customising settings, entering passwords, slow web connection, short battery life (study while at Indiana). How it should be: develop secure and less annoying authentication/anti-virus methods. What we do includes software-based attestation: verify there is no active malware before running sensitive routines. Protocol: a) the phone connects to the server; b) the server checks the phone (avoiding malware confusion); c) and then makes the connection. How to detect malware using software-based attestation? We must know the RAM amount and other hardware parameters. Step 1: page everything out; malware might stay in memory. Step 2: overwrite all "free" RAM with pseudorandom content (from a seed); malware might try to stay there, but we can assume most of the RAM except the malware's location gets overwritten. Step 3: make a keyed digest of all RAM and send it to an external verifier; if malware was in memory, this gets detected. These steps can be executed over part of the RAM, for example using just the even memory cells. How UIs affect security: there is a need for password managers because users can't be bothered with passwords. Why not use error correction (e.g. Swype)? Dictionary-word passwords decrease security because expressions and short words will be quickly auto-corrected.
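Steps 2 and 3 of the attestation can be sketched as follows. This is a toy model (byte-array "RAM", SHA-256 as the pseudorandom filler, HMAC as the keyed digest; all of these are my choices): malware that refuses to be overwritten changes the digest and is caught.

```python
import hashlib, hmac, os

def fill_free_ram(ram, free_ranges, seed):
    """Step 2: overwrite all free RAM with pseudorandom content derived
    from the verifier's seed, leaving malware nowhere to hide."""
    for start, end in free_ranges:
        stream = hashlib.sha256(seed + start.to_bytes(4, "big")).digest()
        while len(stream) < end - start:
            stream += hashlib.sha256(stream[-32:]).digest()
        ram[start:end] = stream[: end - start]

def attest(ram, key):
    """Step 3: keyed digest over all of RAM, sent to the external verifier."""
    return hmac.new(key, bytes(ram), hashlib.sha256).hexdigest()

ram = bytearray(os.urandom(256))            # pretend state after paging out
fill_free_ram(ram, [(0, 128), (192, 256)], b"verifier-seed")
honest = attest(ram, b"shared-key")
ram[100] ^= 0xFF                            # malware persisting in "free" RAM
assert attest(ram, b"shared-key") != honest  # the verifier notices
```

The real scheme additionally times the response, since malware could try to recompute the expected filler on the fly, which costs measurable extra cycles.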
User study: how secure? Password average: 18 bits of entropy (3 words), measured based on the frequency of words within a trigram and the frequency of trigrams. Comparing 2/3 and 3/3 fastwords. Study: how fast? Mobile always takes 2-3 times longer than a PC. Issues to consider: a) pushing back on weak credentials; b) dealing with special cases (e.g. resets); c) discouraging credential reuse; d) phishing. Example attack (man in the screen): on a bad website the attacker replaces the web page with the Home image of the iPhone (removing the browser bar), asks for the iTunes password to "recover from a crash", then goes back to the attacker's website; the user has no idea what happened. Privacy issues: weak credentials. Big problem: identifying when it happens. Reset passwords are easy to guess, hard to remember; most people actually use the default questions. Another big issue: registration time; the slower it is, the fewer clients you get; slow registration = no registration. Idea: for password recovery, use a panel of images and have the user just select some of them, intentionally using items that you don't generally talk about on Facebook. Test: engaged people were not able to impersonate their partner well; friends/colleagues did even worse than an attacker. Avoiding credential reuse: why do people reuse passwords? Simply because they can; the idea of giving images to select from can restrict users from reuse. Phishing works when the attacker can spoof a site and make the user's reaction leak the password; most sites just tackle the first problem. Which methods can be used, and which are privacy intrusions: keyboard biometrics, calling behavior, location, face recognition?


Ahmad Sadeghi: what if we use a mafia-style (relay) attack?
Answer: we can use the SIM or another secure element to ensure a trusted timed computation. Look also at "software-based attestation" by Adrian.

Jolyon Clulow: are passwords broken because they are weak or because of password recovery?
Answer: the tests seem to say that many breaks come from weak passwords.

Jon Callas: I reuse a lot of passwords, but they are stored in a key-chain every time; 600 items in the key-chain now.
Answer: it's probably not going to get better.

4 March 2011 - Workshops day

Workshop on Real-Life Cryptographic Protocols
Invited talk by Moti Yung

He has worked at IBM, RSA and Google, and is visiting at Columbia. There are 3 types of crypto: a) abstract: models, quantifying the adversarial model, definition of implementation and proof of security; b) applied: looks at the systems context and applies a model to that system; c) actual: actually applying the cryptographic primitives to the models. Symmetric public key encryption (Crypto '85, Moti with Galil and Haber, as a student): a much more scalable method, with the public key only in the server; the principle applies to SSL/TLS. At IBM he worked on the DSA system, open systems, and single sign-on products. "The Systematic Design of Two-Party Authentication Protocols" was motivating work for protocol analysis (provable security) by Bellare and Rogaway. On the Internet stack (results from 94-97): "Scalability and Flexibility in Authentication Services" (Infocom); the first crypto protocols with a stateless per-connection server, using authenticated encryption before it was formally defined; extensively used in encrypted cookies. How to do crypto within engineering? The traditional order is: business needs, feasibility, functionality, realization, usability + performance, reliability, security, privacy (if they get there). Crypto is combined with other concerns: system functions, performance, usability; crypto is best applied when these conflicts do not apply and it is hidden from the end user. The goal is to make the crypto somehow help performance; the need for crypto might come from intrinsic sources. Important: get involved early, don't impose solutions, take time to analyse the system: your time will come. Design the crypto that is optimised for the task but which can be used for other possible future tasks. He then moved to consulting: the Greek national lottery project. He designed, implemented and inspected the working back-end; it has been working for several years, and first-year revenues went up 35 per cent. It then got used by Goldman Sachs.
Problem tackled: they needed a very secure system (security + crypto); they wanted a lottery every 15 minutes; you cannot just use the Unix random generator, as the Swedish did. Moti came to scrutinize the system, and made the design such that even the designers and internal people cannot bypass the system. The critical system is described in "Electronic National Lottery" (Financial Cryptography 2004); it was robust because the crypto was done right. Special requirements: cryptographic robustness; protection against various attacks, e.g. if the main randomness source gets cooled down, the system uses another source as well; concurrency analysis; several time constraints (not too early, not too late); availability. In the design he used whatever was needed from crypto, but not more! Recent project: AdX (at Google), used for advertisement exchange. Ads as a payment instrument: ads are the dual of micro-payments. The AdX system is responsible for displaying the ads on websites that want to display them for money. Other existing systems use a static policy, specifying in advance which ads to display. The new system, AdX, is a real-time, dynamic system where companies compete for which advertisements to put on the website. AdX model: a) a viewer goes to a publisher; b) the publisher sends a price to the ad exchange (AdX); c) via ad networks, all advertisers give a price they are willing to pay; d) an auction determines who wins; e) the publisher gets the ad and the advertiser pays based on the second-price model. There are many constraints in place: if the process takes too long, you cannot charge for the ad. It is the biggest exchange in the world transaction-wise: a billion auctions per day. Moti joined the project to work on the security of bids; there was a gap between the service agreement and the engineering. Crypto was the solution; the important issue was the price embedded in the iFrame given to the user. You cannot create multiple data flows because of time, resources, etc., and cannot use standard solutions (SSL) because of the time constraints.
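The second-price rule in step e) is easy to sketch: the highest bidder wins but pays the second-highest bid, which gives advertisers an incentive to bid their true value.

```python
def second_price_auction(bids):
    """bids: advertiser -> bid. The winner pays the runner-up's bid
    (or their own, in the degenerate single-bidder case)."""
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1] if len(ranked) > 1 else ranked[0][1]
    return winner, price

assert second_price_auction({"a": 5, "b": 9, "c": 7}) == ("b", 7)
```

In AdX this runs once per impression, a billion times a day, which is why the surrounding crypto has to be so cheap.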
The solution provides secrecy and integrity: use authenticated encryption, pseudo-random-function-based encryption, a one-time pad for each auction, and a MAC over the ciphertext. Advantages: fast, semantic security, flexible utility. Privacy and data liberation issue: at the end of the day/month, Google releases data to agencies. Solution: use crypto to refer to the user cookie within AdX, preventing agencies from extracting information about users. Michael Rabin, on sabbatical at Google, then designed with Moti an "Auction Verification System" for dispute resolution in AdX. Conclusions: a) the attacker is less clear; b) sometimes you must envision attacks that are not even there; c) crypto is not used to achieve ultimate security, but rather to shift exposure; d) many times we assume things that are really broken, e.g. public key infrastructure; e) we need to take exact costs into account, as well as all the communication patterns; f) being a good cryptographer helps; g) it is important to collaborate; h) crypto keeps the field very much alive.
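One plausible instantiation of "PRF-based one-time pad per auction, MAC over the ciphertext" looks like this (key names, lengths and the use of HMAC-SHA256 as the PRF are my assumptions, not Google's actual scheme):

```python
import hashlib, hmac

def seal_price(enc_key, mac_key, auction_id, price):
    """Derive a one-time pad from a PRF keyed per auction, XOR the price,
    then MAC the ciphertext (encrypt-then-MAC)."""
    msg = price.to_bytes(8, "big")
    pad = hmac.new(enc_key, auction_id, hashlib.sha256).digest()[:8]
    ct = bytes(a ^ b for a, b in zip(msg, pad))
    tag = hmac.new(mac_key, auction_id + ct, hashlib.sha256).digest()
    return ct, tag

def open_price(enc_key, mac_key, auction_id, ct, tag):
    """Verify the MAC first, then strip the pad to recover the price."""
    expect = hmac.new(mac_key, auction_id + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expect):
        raise ValueError("price tampered with")
    pad = hmac.new(enc_key, auction_id, hashlib.sha256).digest()[:8]
    return int.from_bytes(bytes(a ^ b for a, b in zip(ct, pad)), "big")

ct, tag = seal_price(b"k1", b"k2", b"auction-42", 1337)
assert open_price(b"k1", b"k2", b"auction-42", ct, tag) == 1337
```

Two symmetric operations per auction is cheap enough to fit the latency budget where a full SSL handshake would not.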

Session: Crypto (chair Ahmad-Reza Sadeghi)

Fast Elliptic Curve Cryptography in OpenSSL
(presented by Emilia Kasper)

In the SSL/TLS handshake (using RSA), 2 round trips must be completed before application data can be sent. Problem: if a server key is ever compromised, all sessions established with that key are compromised. We want forward secrecy, so we use the Diffie-Hellman key exchange. During the hello messages we choose which cipher to use; server and client then exchange DH keys authenticated with RSA. Server performance varies: an RSA handshake costs 1 private RSA operation; an Elliptic Curve (EC) DH handshake costs 1 private RSA operation (sign) + 2 EC DH group operations, so something is needed to improve performance. A set of named curves is recommended for TLS in RFC 4492. The problem is that we need to support all of them, so we try to make one of them faster. The OpenSSL elliptic curve library supports them all but is slow-ish; some fast implementations are available elsewhere. Emilia picked NIST P-224 as the curve to be made faster (implementations exist in OpenSSL and NSS): a 224-bit field gives 112-bit security, which is good and fits into 64-bit registers. The new NIST P-224 implementation targets 64-bit processors: redesigned field arithmetic; throw away the bignum library; use an unsigned integer representation; branch-free modular reduction; constant time, resistant to software side-channel attacks. Branches are eliminated using a select function based on a bitmask to choose between 2 possible results, which avoids different timing on the two branches; table lookups for the multiplication, which is otherwise not constant-time, were also made constant-time, with select() used repeatedly to avoid timing attacks. The result improves all ECDH operations and ECDSA verify (64-bit optimisation), and is twice as fast for RSA-1024 + ECDH-224 compared to the older implementation. The implementation will be released in OpenSSL 1.0.1.
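The branch-elimination idea can be illustrated with a toy bitmask select. A real implementation operates on fixed-width machine words in C or assembly; this Python sketch (with assumed function names) only shows the logic.

```python
MASK64 = (1 << 64) - 1  # simulate a 64-bit word

def ct_select(bit: int, a: int, b: int) -> int:
    """Return a if bit == 1 else b, without branching on the secret bit."""
    mask = (-bit) & MASK64                    # bit=1 -> all ones; bit=0 -> all zeros
    return (a & mask) | (b & ~mask & MASK64)  # data-independent control flow

def ct_lookup(table, idx: int) -> int:
    """Read table[idx] by scanning every entry, hiding the secret index."""
    result = 0
    for i, v in enumerate(table):
        # constant-time equality test: eq == 1 iff i == idx
        eq = (((i ^ idx) - 1) >> 64) & 1
        result = ct_select(eq, v, result)
    return result
```

Since every table entry is touched regardless of the secret index, the access pattern (and, on real hardware, the cache footprint) does not leak which entry was used.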

Security considerations include: a) session caching: SSL allows session caching on the server side, and the client resumes a session by sending the session ID; this can be bad, as an attacker can retrieve the session key from the cache, but the cache can be disabled. b) session tickets: TLS adds a session ticket mechanism for stateless session resumption, where the client stores the session information; but this now depends on the server's long-term key, which conflicts with forward secrecy: a server key compromise allows the attacker to decrypt the ticket and obtain the session key. A solution might be to disable session tickets. c) false start: an optimistic client that believes it has all the information from the server starts sending application data before the finished message. This is used to improve performance, but an attacker can impersonate the server and use a weak cipher; the client should refuse false start with a weak cipher. Firefox/Chrome are implementing this. d) snap start: predict all the server messages and certificates and start sending application data from the beginning; zero delay, but this gives no forward secrecy at all.

Towards real-life implementation of signature schemes from the strong RSA assumption
(presented by Ye)

Signature schemes based on the strong RSA assumption have had poor performance; here the authors present techniques for improving it. Digital signature schemes rely on 3 algorithms: key generation, signing, verification. There are two generally used security models. 1) The random oracle model (ROM), following Fiat and Shamir (1987). ROM proofs do not imply security in the standard model: see Canetti, Goldreich, and Halevi (1998) on what goes wrong no matter what real function is used in place of the random oracle. 2) The standard model: closer to the real world. Considerations: the traditional RSA scheme in use only has a security proof in the random oracle model. Signature schemes from the strong RSA family have security proofs in the standard model, but are not widely deployed, mainly because of performance. The authors propose a solution to this. The flexible RSA problem: given u, find v and e > 1 such that v^e = u mod n; the strong RSA assumption states that this problem is hard. The Camenisch-Lysyanskaya scheme vs the Yu-Tate scheme: verify v^e ?= a^m * b^s * c mod n. The contributions are improvements to the Yu-Tate scheme: a) relax the requirement on the s parameter, which was required to be 1344 bits (proof in the paper); b) the way to compute v: choose a scalar which is small but random, such that the exponentiation cannot be inverted. The new scheme, based on these two improvements over Yu-Tate, is faster than all other strong RSA schemes, and supports both offline and online signing.
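The verification relation v^e ?= a^m * b^s * c mod n can be checked with a toy example. The parameters below are tiny and completely insecure, chosen only to make the arithmetic visible; the signer can compute v because it knows the factorization of n.

```python
from math import gcd

# Toy primes (a real modulus would be ~2048 bits).
p, q = 1019, 1499
n = p * q
lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)  # Carmichael lambda(n)
e = 65537
assert gcd(e, lam) == 1
d = pow(e, -1, lam)        # e-th-root exponent, known only to the signer

a, b, c = 2, 3, 5          # public bases (illustrative values)
m, s = 42, 1000            # message and randomizer (illustrative values)

target = (pow(a, m, n) * pow(b, s, n) * c) % n
v = pow(target, d, n)      # signer computes the e-th root mod n

# Verification, as in the scheme: v^e == a^m * b^s * c (mod n)
assert pow(v, e, n) == target
```

The strong RSA assumption says that computing such a root v without the factorization of n is hard, which is what the security proof of these schemes rests on.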


Kazue Sako: what would be the things to do in order to make it real?
Answer: integrate into SSL.

Ahmad Sadeghi: CL signatures do not provide non-repudiation, unlike the other signature schemes, and you can blind CL signatures. That causes problems in legal settings, so maybe the title is too general?
Answer: we have more than 10 years of research in these schemes, we think these can provide all the necessary requirements.

Question: what was the target context of the application?
Answer: provide an efficient and secure way of doing signatures.

Detailed cost estimation of CNTW attack against EMV signature scheme
(presented by Yumi Sakemi)

The EMV signature scheme is based on ISO/IEC 9796-2 (RSA signatures). SDA-IPKD (static data authentication - issuer public key data) is used on most cards. The CNTW attack is a forgery method proposed by Coron, Naccache, Tibouchi and Weinmann at Crypto 2009. The attack can forge an ISO/IEC 9796-2 signature at practical cost: $800 for a 2048-bit RSA modulus using Amazon EC2; however, that figure did not reflect the actual cost of attacking EMV, since Coron et al. only roughly estimated the cost of forging an EMV signature. The contribution of the authors is to re-evaluate the cost of the CNTW attack in detail, estimating it under all cases where each of the D1-D7 fields of the signature is alterable or fixed. Overview of the CNTW attack: a) represent a signed message as a multiplication of multiple messages; b) the forged signature is obtained from correct signatures over a combination of these multiple messages, where some of them might not be genuine. The idea is to represent v = a*u - b*N, where u is the signature, and then find parameters a and b that accelerate the computation. Conclusion: an EMV signature can be forged for less than $2000.
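Step a) relies on the multiplicative structure of RSA signatures: a signature on a product of messages is the product of their signatures. The following toy sketch uses textbook (unpadded) RSA with illustrative parameters; the real CNTW attack additionally has to defeat the ISO/IEC 9796-2 padding, which is where its computational cost lies.

```python
from math import gcd

# Toy RSA key (illustrative; real moduli are ~2048 bits).
p, q, e = 1019, 1499, 65537
n = p * q
lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)
d = pow(e, -1, lam)

def sign(m: int) -> int:
    """Oracle: a legitimate signature on a chosen message."""
    return pow(m, d, n)

m1, m2 = 123, 456
s1, s2 = sign(m1), sign(m2)

# A valid signature on m1*m2 is composed without ever asking for it:
forged = (s1 * s2) % n
assert pow(forged, e, n) == (m1 * m2) % n
```

This multiplicativity is why a padded value v that factors into values with known signatures yields a forgery, as in the v = a*u - b*N representation above.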


Omar Choudary: does this help to break the bank transaction?
Answer: Not sure (contact main author).

Workshop on Ethics in Computer Security Research

Enforcing community standards for research on users of the tor anonymity network
(presented by Christopher Soghoian)

About Christopher: privacy researcher (Washington DC), advocate and PhD candidate, focusing on ISP/telco-assisted government surveillance. He was the first ever in-house technologist at the US Federal Trade Commission. In Tor, the communication between Alice and Bob goes via multiple servers/relays, such that no single organization can control all of them. Nasty people might be running Tor servers. There are active vs passive rogue servers; it is possible to detect active attacks (MITM servers) and Tor can block these, but there is nothing we can do about passive data collection. Governments will not disclose passive listening, hackers cannot really be stopped, but what about researchers who spy on Tor? There are bad examples of research studies. McCoy et al. (PETS '08), "Shining Light in Dark Places": they geolocated users via a Tor exit node they created. The researchers did not seek or obtain prior legal analysis of their study; they only asked a few minutes of a law professor's time, so the community was not very happy. Their university IRB (Institutional Review Board) said that no rules were violated. Castelluccia et al. ('10) made a study of private information disclosure from web searches using stolen session cookies (captured over a Wi-Fi network); they got data from 500-600 people/day from their network while sniffing. From 10 users they got opt-in consent to actively hack accounts. They observed 1803 distinct Google users, 46% of whom were logged into their accounts. The privacy of colleagues was given much more consideration than that of Tor users, so we are unsure why they thought it was OK to sniff people using Tor. Conclusions: a) the first study was specific to Tor, the second used Tor just as a shortcut to more data; b) there have been several other such studies since (some even awarded best paper); c) there is a problem here with the privacy of Tor users: something should be done; d) there is also a problem in that this kind of violation is not well documented. McCoy et al.
do not provide information on their web page about the negative community response. Should we discourage Tor snooping research? In Christopher's view, definitely yes. Should it be illegal if the FBI does it? Google has engaged in a massive campaign of car-driven Wi-Fi payload packet sniffing and claims this is not illegal, so of course the FBI thinks it is clearly OK. We should establish standards for ethical work, and minimize user data collection and retention. Research should be legal in the country where it is performed. How to enforce the standard: reject academic papers that do not respect ethical considerations, at least at the top conferences; e.g. SOUPS now requires this kind of enforcement.


Question: tcpdump is not actually illegal in all research organisations.
Answer: actually, in the US it is illegal, with minor exceptions related to the health of the network.

Ross Anderson: not sure this is entirely a positive thing. In medicine, health records are made public for research, although there are many counter-arguments on privacy grounds, and researchers make contributions based on that data. So there are disputes.
Answer: I think the situation we have now is even worse than in the Tor example.

George Danezis: I think the discussion is irrelevant, as it is based on the FBI and specific regulations, not a general ethical point of view.
Answer: we think there is a general ethical problem in the fact that researchers are sniffing for random reasons, regardless of local regulation.

John McHugh: there is a problem if the Tor exit node is on a weakly protected Wi-Fi link.

George: we published a paper here that shows how to break into facilities (RFID related). If we make such publications hard, this works against the principle of security disclosure for the goal of improving security.
Answer: if we don't establish research community standards, then people in DC will, and will do it in a bad way.

Tyler Moore: an easy way to enforce the standard might be to publish a set of guidelines on the Tor website.
Answer: pretty good idea. Maybe a few bullet points to read before starting to run a server.

Ethical Dilemmas in Take-down Research
(presented by Tyler Moore)

Tyler has made observational studies of phishing attacks, trying to understand how banks respond to them, in particular how websites are taken down; several papers from 2007 onwards with R. Clayton. Dilemmas: 1) should researchers notify affected parties in order to expedite take-down? They were initially just observing, then asked whom to tell about it, and finally decided to leave things as they were, since there is no clear path for that; is there any general research conclusion from this? 2) should researchers intervene to assist victims? Tyler and Richard reported 414 compromised users whose details were recovered from phishing websites. They stumbled into a situation where you feel obligated to act; therefore (because of the financial/time cost) you try to avoid situations where you get personal data. See Rod Rasmussen, Internet Identity; it is a big step from reporting credentials to giving them to the admin. 3) should researchers fabricate plausible content to conduct pure experiments on take-down? There were several attempts, using fake copyright material and then asking ISPs to act. The problem is that research experiments consume resources meant for dealing with real attacks. 4) should researchers collect world-readable data from private locations? Some websites publish private data in a publicly readable location; they decided to actively look for this kind of data in order to see how long it takes for authorities to take down such websites. In 2007 they got a nasty letter from one of the websites they wrote about, so from then on they collect data but anonymise it. But what if our analysis assists criminals, e.g. by revealing "what to avoid" in published papers? They did not reach any conclusions after much analysis of this problem; the goal is to help defenders by explaining how attackers work. 5) should investigatory techniques be revealed? If we reveal our methods, then the attacker can adapt its mechanisms.
An example is the case of Mr-Brain: stealing phish from fraudsters; they knew but didn't disclose it, and then another research group made the disclosure. The problem is that it is not very scientific to keep your methods hidden, so authors generally disclose most, but not all, of their methods. 6) when should datasets be made public or kept secret? See PhishTank: all phishing attempts get reported there. This is a classic battle: banks don't like it, as it provides a history of which banks have been compromised; so we need to think about the extent to which we disclose data; we need a metric and a system to determine which party the data helps more. Tyler and Richard ran a test on PhishTank to see whether attackers can re-compromise websites using PhishTank information. Result: websites that appeared on PhishTank were re-compromised less than others. Conclusions: a) disclosing more data leads to better security; b) sharing data between take-down companies would reduce phishing website lifetimes.


Question: when you start the research you know that you might come across private data, so you need some kind of mitigation?
Answer: yes, this is a hard problem and we need to find ways to mitigate it.

Nicolas Christin: are you deceiving users using the fake data approach?
Answer: I think there are separate issues, a bit orthogonal problems. Here you are just using some public resources (admin time, etc.).

George Danezis: how does the criminal network and the effect of disclosing such information affect important decisions? Example on military decisions vs bank phishing.
Answer: this is a general fact that happens in many attacks; see Stuxnet.

Question: should we try to make a judgement about which are the users we are trying to help? E.g. authentication is used by both good and bad guys. Ross Anderson: This is a different case and we should make a discrimination.

Lenore Zuck: what are the other, unmentioned dilemmas about?
Answer: suggesting solutions that speed up take-down.

Ethical considerations of sharing data for cyber security research
(presented by Darren Shou, Symantec)

Motivation: continuous sharing of real user data is necessary and useful for research, but the data is held by operators (ISPs, etc.), not researchers. Why should they share these data? What are the incentives? Problems: avoidance of error, negligence, or violation of ethical considerations. There are existing industry-academia data-sharing models: a) interns get shared data, but this is very limited in terms of the work they can do and the data they can access, and limited by financial factors; b) sponsored research, which allows good data but might yield an unfairly good publication just for the student with access to the data; c) clearing houses, which provide large amounts of data (DHS, the Internet Traffic Archive, DatCat, ISC SIE, PREDICT). Issues: is this data generated by these houses? Does the data depend on other factors we don't control? Ethical considerations on data sharing: a) openness, pursuit of knowledge; b) how does this influence a company's business model? c) should Symantec provide all data to McAfee? d) issues with trade secrets. Say Symantec wants to publish; there are consequences: a) the financial cost of making 300 TB of data public; b) fairness concerns and issues with competitors; c) intellectual property rights. Sharing proprietary data versus priority and recognition: should a researcher share this data with others? She might then lose the recognition, or there might be ethical issues. There is a problem of maintaining secrecy versus revealing research participants, and of competitive advantage versus efficiency of research. Some solutions: a) government funding; b) give access to the data only at the company site. Privacy: users have the right to preserve it. Examples of such data: IP addresses, user online behavior. HIPAA gives some guidelines on how to share data. Solutions rely on technology (anonymization), but there are limitations. Sound experimentation: avoid error, get independent confirmation.
There have been problems with papers that give false solutions, or that only work for particular scenarios, especially those that make strong claims using datasets others cannot access. It is therefore good to have an external review board to vet research, and to restrict access to on-site only. How to improve this: the WINE model. Conclusion: Symantec wants to provide incentives for sharing, but there is a need for a better model.


John McHugh: old data is no longer valuable for companies, but can it still be useful for researchers?
Answer: of course, but we would like to make current data available to researchers.

Question: maybe there is a conflict between your perspective and the research community's. We used to have a good model, like Bell Labs.
Answer: what if I make all the data available and you get to publish all you want?

Tyler Moore: what kind of data can you make available, and what about making it available for PIR research?
Answer: for example telemetry data would be very useful for that.

Question: are there existing publications based on this data-sharing scheme?
Answer: see Xin Hu, "Malware classification from function call graphs", and Polo Chau, "Tera-scale graph mining for malware detection".

Notes from the Workshop on Ethics in Computer Security Research

by Tyler Moore

Ethical Issues in E-Voting Security Analysis
J. Alex Halderman (The University of Michigan) and David G. Robinson (Information Society Project, Yale Law School)

Both Alex and David presented

Alex works primarily in systems analysis of deployed e-voting. He has done voting studies in California and India, including turning a US e-voting machine into a Pac-Man machine. He observes that there are lots of dangers associated with e-voting, but there is also concern about ethical issues. David Robinson is a philosophy major, with more interest and expertise in ethics; he has been disentangling and disambiguating human moral decisions. David was the first associate director of Princeton's CITP; he is a tech policy person with a background as a humanist. Part of the paper is an attempt to describe the experiences encountered, but the goal is to point out areas for improvement. There is a clear pattern for fielded voting systems: they are developed and deployed, followed by computer security experts kicking the tires and subsequently finding fault with the system. One claim is that pointing out particular faults is counterproductive, because flaws are so easy to find and yet it doesn't really fix anything, but it does undermine trust. Question raised by Alex and David: should a computer security expert consider the near-term political consequences of disclosing vulnerabilities? Disclosure could undermine trust in elections when the system is already about to be used and notification would not stop the election. This has happened in India. Disclosing the results upset the ruling party, because it provided ammunition to the opposition to undermine credibility, possibly unfairly. This is further complicated because the ruling party is more closely aligned with US foreign policy. A related hypothetical question: what if undermining the credibility of an election could lead to physical violence? There is something immodest about computer scientists having the power to swing an election one way or the other; this contradicts the one person, one vote principle.
Alex suggests that computer scientists should try to stay out of actively participating in politics, but should disclose if it advances a more secure system. Alex then raises a question that is an "elephant in the room": researchers may start out confident of finding vulnerabilities, but what if you don't discover any? Should we publish the fact that we didn't find any vulnerabilities (negative results)? It is impossible to determine whether the product is truly secure, and the vendors can use such a revelation to advertise false security.

The talk was highly interactive, and questions and comments were raised from the audience.

Ross Anderson (Cambridge): problems in banking are different than voting, because the regulators influence the outcomes.

Alex: bigger issue is whether there is transparency and verifiability.

Ross Anderson: in the long term, the customer is the political establishment. We need a long-term ongoing learning process of how to conduct elections

Alex: Complicating factor is that the relationship between election boards and computer security researchers has been adversarial.

John McHugh (UNC/Redjack): There is a community who believe that elections should be open and transparent. But there is also a larger community that wants elections to be closed and opaque. Even in US local elections, politicians want to retain power at all costs.

Alex: Disagrees on the extent of that problem in US at least, but recognizes this is a temptation that must be protected against.

Ross Anderson: how do you best advocate your position? Train the opposition on the threats and your solutions, and wait for them to take power.

David: often see claim that the goal is electronic voting to be "as secure" as paper voting. Is it an ethical problem to answer this question? Is it really even answerable? Computer security experts don't always have expertise on paper ballot frauds, and so are not in a position to judge the relative merits of threats to paper elections versus threats to electronic voting.

Michael Bailey (Michigan): Maybe computer security researchers are lazy by spending most effort on breaking bad systems, rather than proposing a reliable and secure system.

Alex: disputes claim that finding the faults is easy and designing a secure solution is hard. Also questions whether designing exclusively for security while neglecting transparency is the right way forward. He believes that you first need to convince others of the problem.

Michael: What fraction of the work in this area is dedicated to identifying risks versus proposing potential solutions?

Alex: Most publishing on solutions, but most attention from the press goes to problems.

Dave Dittrich (Washington): good questions I didn't catch.

Alex: Harm minimization problem. Is it OK to only tell the election officials of the problem (so they can cheat), but don't tell problems to citizens?

Dave Dittrich: should tell citizens too (watchdogs)

Ross Anderson: but this makes the discussions necessarily adversarial, which is a good thing

Dave Dittrich: agrees

Ross Anderson: why not use standard vuln disclosure practice (disclose after 60 days)?

Dave Dittrich: but you can't, because the lead times are too long.

Ross Anderson: but static certification doesn't work. We need to recognize that dynamic patching is how software development works, and we should move to a legal environment that works with engineering practice, rather than against it.

John McHugh: Alex had mentioned that e-voting should be held to the same standard as paper voting. John proposes a concrete measure of viability for e-voting versus paper voting: you need to be able to update e-voting at the same speed as paper balloting (which can be done in a few days).

John McHugh: is research into exploitability needed? How often are vulnerabilities actually exploited?

Alex: research on end-to-end verifiability which can help make it easier to identify exploits.

John McHugh: but corrupt officials can undermine these systems

Sven Dietrich: problem is technological (lack of) expertise of election officials.

Security Research with Human Subjects: Informed Consent, Risk, and Benefits.
Maritza Johnson, Steven Bellovin and Angelos Keromytis (Columbia University)

Maritza Johnson presented

The talk is about IRBs. Maritza traces their history from the 1947 Nuremberg Code through the Tuskegee Study (exposed in 1972), culminating in US Dept. of HHS regulation 45 CFR 46. IRBs are designed to avoid the problem of researchers saying "trust me".

IRB review principles include respect for persons, beneficence (do participants benefit from study), justice (e.g., are results distributed fairly across demographics).

Problems with IRB process
- Mandated by feds
- Formed in response to extreme events
- Process may appear arbitrary
- Very little evaluation of the effectiveness of the IRB process

This reminds her of the TSA: each of the above points applies. Two questions: why make this claim, and what should be done if it is true? Does the IRB understand the protocol well enough to really understand what the dangers might be?

Contribution: a survey of the IRBs at the top 40 US universities (according to US News). They checked whether the roster is online, and whether the members have a CS background.
- Only 17 list rosters online (surprising, since regulations require them to maintain this list and send it to US HHS)
- Of these, 5 have CS representation. These are from schools that have strong HCI departments.

In 2004, there were lots of accusations from journalism schools and oral history researchers of mission creep in IRBs, and a backlash.

In 2008, Garfinkel wrote a paper on IRBs, while Allman wrote a paper about what program committees should do when research may be unethical. This poses the question: maybe we should ask IRBs?

Examples of Maritza's IRB protocols

1. In-lab study with deceit (phishing study)

2. Remote study (didn't actually physically meet participants), software installation, remote data collection

a. Question from audience: do you think IRB understands the risk of collecting all data?

b. David Robinson: are you saying you think that the IRB didn't understand what was really going on, because they would have likely objected otherwise?

Answer: yes

c. Dave Dittrich: well, there is still a useful IRB function, in that it provides a connection between participants and the IRB. They can confirm that the study is consistent with the stated IRB protocol.

3. Facebook app, with non-expiring session. As usable security person, she knows that Facebook consent is suboptimal, but is that OK to still rely on it?

Other protocols that might cause IRB problems

1. In the wild password study - what if someone is using a very weak password, is there an obligation to intervene?

2. In-the-wild phishing experiments - participants haven't given consent

Open question: in security research, what are the extremes?

Informed consent: you are supposed to write instructions at an 8th-grade level. But what does that mean in terms of IT proficiency (and understanding of the consequent risks)?

Nicolas Christin (CMU) claims that often IRBs are actually tasked to reduce the liability and risk to the university, not the participants. This was disputed by Elizabeth Buchanan (Wisconsin-Stout).


Human Subjects, Agents, or Bots: Current Issues in Ethics and Computer Security Research.
Panel moderator: Elizabeth Buchanan

Panelists: Elizabeth Buchanan (University of Wisconsin-Stout), John Aycock (University of Calgary), Scott Dexter (Brooklyn College, CUNY) and Dave Dittrich (University of Washington).

1. Elizabeth Buchanan - ethicist who has experience with IRBs in Internet studies.

Providing some context: the general term for IRBs is human subjects review boards. Her talk is IRB-focused. Very few people serve on IRBs: Dave Dittrich found that 4 of 200 people at NDSS serve on one.

Key challenge is to quantify risks, so that researchers and IRBs can articulate these risks. She notes that many IRBs have slid into also considering university protection from legal harms, so it is no longer only about ethics.

As the distance between researcher and subject decreases, we are more likely to view as human subjects. Conversely, as distance increases, we are more likely to view as non-human subjects.

Also starting to see less concern about human subjects research and more concern with human-harming research.

Publicity of the data matters. Pay for access is viewed as less private. If a site is public, then that has greater distance?

Dave: What about stolen private data that is later published (ala Anonymous v. HBGary, Wikileaks)?

Most US models focus on minimizing, not eliminating, risks to research subjects. But what are the risks?

Dave Dittrich: Where does CS fit? Big question. Does computer security research fit in the IRB model?

Dave Dittrich is on an IRB, joined to understand the process. He has observed lots of ambiguous statements in IRB applications about data security statements.

Researchers often copy boilerplate statements ("No others will have access to the data") that were used in traditional studies, where information is stored on paper in a vault, onto electronic studies run on e.g. SurveyMonkey, which use machines not under researcher control, where the same guarantee doesn't apply. Even when the statements are more specific, they are not comprehensive (e.g., what is the password policy that protects the encrypted hard drive?).

Elizabeth notes that the NSF is requiring that applicants must have a data plan. Maritza noted that the requirements talk about applying "community standards", but we don't have that in security yet.

Dave mentioned a "certificate of confidentiality": when you are studying illegal behavior in a physical environment, you get legal protection from having to turn over the data. But when data collection is remote (e.g., an iPhone app), the certificate doesn't apply if the police arrest the participant and seize the evidence stored on the phone/PC.

Elizabeth notes the problem with terms of service on sites like surveymonkey and facebook.

Two mock IRB cases were discussed by the panel.

Case 1: conducting a research project on worm probes on a computer network. It collects network data automatically; is that personal data collection? It collects IP addresses, which may be personal. A final issue arises if you store the information in the cloud.

John McHugh: the issues are specific, so it depends on whether you need to link IP to customer address, or to ASN, or to other sources. He provocatively argues that all computer-generated data (network traces) is not personal.

John argues that for cloud storage, you should not store IPs. He outlines different levels of data sensitivity, and the security that each would require. Tiered access or consent models may be the way forward, for instance asking for consent to store data in the cloud versus locally.

The second case study is about deploying a worm in a controlled environment. The risk is that the worm might get out and cause additional harm.

Nicolas: the issue is that often the worm designer is the leading expert and has no peers who can adequately review the proposal. The answer is that they could bring in outside expertise, but then how do you find the appropriate experts to do the review? Nicolas worries that bringing in outside experts will reveal the idea to an expert, who rejects the proposal and then steals the idea. Dave Dittrich: encourage the proposer to do a literature review and make a more thorough argument, which is then passed on to a peer for review, who is expected to maintain confidentiality.

It was also brought up that if the experiment is truly closed, the real problems come in how it is disclosed.

John: we may be slightly ahead of the pack, but whether or not you publish won't affect the eventual discovery of the attack methods. Sven: sometimes in the past we did not disclose methods. Finally, where should we go from here?
1. Pedagogy ( = Academy + Industry) on ethics

a. There is slow institutional response to curricular needs and limited faculty experience, but pedagogical change only requires interested faculty. So demand for better ethics education may come from many sectors, not only security.

b. In social sciences, the prof who teaches research methods course could be given a slot on the IRB, in order to expose them (and consequently students) on how they work and how to be ethical.

2. Promote ethics among professional associations.

a. This is slower, but could be useful for promoting standards once they have coalesced. The associations could even come up with profession-wide "ethical clean bill of health" standards for publications, as the AMA does.

3. Use the government

a. An even slower method, but regulatory pressure could be used to effect change. The current NSA curricular guidelines have no ethics education component, so we could try to change that.

John worries that an IRB approval process will cause people to avoid that research path. Dave's response is tough luck: we need to raise the bar. But David Robinson wonders whether the bar can really be raised just a little bit at all. A lot of the resistance in the community is that moving to IRBs would be viewed as overkill. Rachel Greenstadt (Drexel) asked about the role of program committees. Others noted that there could be a preemptive effect if PCs do start to reject suspect papers.

Maritza says that she worries about inconsistency across and within IRBs and the negative effect that could have in preventing research.

Enforced Community Standards For Research on Users of the Tor Anonymity Network
Christopher Soghoian (Indiana University)

It is impossible to detect passive manipulation in Tor if you are a Tor node. Chris worries about three classes of those studying Tor: governments surveilling citizens, hackers who establish rogue servers, and academic researchers. What should we do about academic researchers who push the bounds by studying users of Tor?

Two case studies of Tor. One was McCoy et al. at PETS. The researchers did not seek out or obtain legal approval, and did not receive a warm welcome at PETS. Following bad press, their university launched an investigation into the ethics and legality of the study. After two days, the IRB cleared the researchers.

The second study was by Castelluccia et al. (2010). They discovered that stolen session cookies gathered from open WiFi networks enabled a reconstruction of user search history, and they wanted to assess the effectiveness of the attack. One part of this was to set up a rogue Tor server, where they collected data from the Tor network; 1800 distinct users were observed. The concern about this paper: the authors viewed the privacy of work colleagues as greater than the privacy of Tor users.

Chris notes that the McCoy paper was designed to analyze Tor traffic specifically, while Castelluccia used Tor as a means of analyzing large amounts of network traffic that they wouldn't otherwise have access to. Chris argues that unless something is done, we should expect to see more research that uses Tor this way.

While the reaction to McCoy et al. at PETS was overwhelmingly negative, there is no real paper trail of the dissatisfaction. Ditto for Castelluccia et al. So as an outsider, you only see the publications and believe the work has been blessed by the community. We need to write down what the guidelines are, otherwise they will be ignored. The goal of the talk is to kick off the discussion on how to set guidelines for using Tor in research.

The claim is that snooping on Tor users is worse than snooping on the general public, because they have signaled their privacy interest by using Tor. Ross wonders whether banning this is wise, because in other contexts IRBs and governance have been captured by the research community. So we must be wary of setting up research-driven institutions to evaluate ethics in computer security, because they will likely be captured.

George Danezis (Microsoft Research) made the point that in many non-US countries collecting data on Tor may not be illegal, and furthermore that the researchers above were not from the US.

Chris's proposal: Tor research should focus on the users of the Tor network itself, not treat Tor as just a source of Internet traffic generally. The standard should be enforced through the PCs at conferences.

Tyler Moore - Ethical Dilemmas in Take-Down Research
(notes by Nicolas Christin)

Tyler will talk about war stories from his research and their ethical implications, based on phishing studies (the series of Moore/Clayton papers).

Take-down of phishing sites is usually outsourced from banks to security/brand-protection firms. 9 ethical questions:

1. Should researchers notify affected parties in order to expedite take-down? Sites can remain online for months at a time. They decided against notifying because they weren't sure whom to notify (banks? site operators?) and had signed NDAs with take-down operators, so they couldn't really communicate anything. A clinical-trial approach was suggested: stop as soon as statistical significance is reached, but do not interfere before that.

2. Should researchers intervene to assist victims? What should we do when we know possible identities of victims (e.g., phished credentials stored in clear text at the miscreant site)? Contacting affected users may be difficult and complex for the researchers, as well as very expensive and time-consuming; there is little incentive for researchers to even look into it. Dave Dittrich's point: it should be expected that this kind of inconvenient data will be obtained, and researchers should take that into consideration when starting such research. Rod Rasmussen's story: it may be very difficult to act due to individual complications (a sysadmin in Iraq, a busy backup..., a machine used for 911 dispatch that no one wanted to touch).

3. Should researchers fabricate plausible content to conduct "pure" experiments of take-down? Most interesting research is observational, so fabrication of reports to study take-down is usually unethical.
Nicolas: questions whether testing through deception would be unethical.
Tyler: orthogonal problem.

4. Should researchers collect world-readable data from "private" locations? Example from Webalizer data, which is usually easy to obtain when people leave it in the default location (/Webalizer). They tried this in their study and acquired that information to figure out the traffic dynamics of phishing websites. Most people in the audience think it is dodgy; the point was made that it really depends on the type of data being collected. Roger makes the point that referrers can reveal a lot. Angelos makes the point that the potential danger lies in correlating several data sources.

5. What if our analysis will assist criminals? Equivalent to the suitability of full disclosure. Generally the costs outweigh the benefits.

6. Should investigatory techniques be revealed? When an investigation is revealed publicly, attackers can adapt. The Mr. Brain example: phishing kits had backdoors, known to researchers (who didn't write about it) and law enforcement. Someone at Netcraft published it in a blog post, which disrupted a criminal investigation. George: this is actually a broader question, not limited to interfering with law enforcement; the party interfered with may be different. Tyler: the criminal-enterprise case is the important one. George: such moral judgment is very difficult to make, and is kind of the question at hand.

7. When should datasets be made public or kept secret? Example of PhishTank; most other phishing reports are kept secret. Banks don't like PhishTank because they're exposed. Compromised sites are also exposed, so there is pushback on that side as well. The key measure: is the publication of that information beneficial? (A utilitarian approach.) Sites in PhishTank were recompromised more slowly, which seems like a better outcome here. Tyler generalized this to sharing data being a good thing to reduce phishers' success. Balance harm with benefits.

Ethical Considerations of Sharing Data for Cybersecurity Research
Darren Shou (Symantec)

- Continuous sharing of data is a big need for research, but there are obstacles: data sets may become stale, and data is held by operators rather than researchers.

- Existing data sharing. One way is to hire interns, but there is a limit to what you can achieve in a short time.

- Proprietary data can make or break a career, but this is unfair.

- One option is a data clearinghouse, but there is very little operator incentive to share. Existing data clearinghouses include PREDICT, ISC SIE, DatCat and ITA (the Internet Traffic Archive). Three factors for these: 1) does the clearinghouse generate data or rely on contributions, 2) preservation of the data over time, and 3) confirmation of data research. Unfortunately, much of the data is considered IP by operators. The fundamental problem with a clearinghouse is factor #1.

- Ethical considerations of data sharing

- 1. Openness: openness is difficult. When do you have to share (given IP concerns)? There is also a competitive-advantage issue: the data is viewed as a competitive advantage, so sharing it would undermine business models. Another issue is that there is a financial cost to sharing data. On top of all this, there are publication issues. Examples of openness dilemmas: sharing proprietary datasets versus priority and recognition; maintaining secrecy versus revealing research participants; competitive advantage versus efficiency of research.

o He discusses some compromises. Regarding competitive advantage, data can be shared with a small set of vetted third parties, who don't make it entirely public.

- 2. Privacy

o One particular issue is that the most useful data comes from real users and networks.

o Need to consider what the value is to users from collecting the data.

o One example is what should you do with IP addresses for compromised hosts.

- 3. Sound experimentation

o Often it is difficult to establish meaningful comparisons without access to privately held data, for example when assessing the comprehensiveness of antivirus products.


Possible approaches discussed:

1. An external review board including industry and academics, a la the PREDICT external relations board.

2. Restrict access to on-site only.

3. Operator data-hosting.

Shou then proposes a model called WINE for exchanging information, which attempts to balance the competing ethical considerations. They have several PB of data, getting around a TB. He claims that providing confirmation of research can have competitive value.

John McHugh: why not view data aging as a benefit, to counteract the competitive-advantage concern? Response: that's OK, but it would be better to share current data.

Symantec has chosen, for example, to share telemetry data. But even that is problematic: what to do with IP addresses? Polo Chau used some of this data for graph mining for malware detection.

Panel: Moving forward, building an ethics community
Panel moderator: Erin Kenneally (UC San Diego/CAIDA/Elchemy)
Panelists: John McHugh (RedJack/UNC), Angelos Stavrou (George Mason University), Ross Anderson (University of Cambridge), Nicolas Christin (Carnegie Mellon University)

The 3-legged stool of ethics: principles, applications, and implementation. The key problem with principles is the lack of shared community values. Researchers also lack guidance on ethical standards, as well as any enforcement capability. For any of this to work, we need the right incentives.

One thing we can fix as a community: the Menlo Report. One component of the Menlo Report is the EIA (ethical impact assessment), a self-assessment tool to help researchers assess ethical impact along the lines of the privacy impact assessment (PIA). On enforcement, there is an opportunity to self-regulate.

Ross: was on sabbatical in Silicon Valley, where there is an attempt to share security data across the industry. Current arrangements are chaotic; it would be better if they were transparent and accountable. Ross will focus on the governance of such a sharing body. Any given attack may be of interest to 3 or 4 of the 12 big players involved. Data warehouses are non-starters since operators want to keep data under their control. So instead, should there be a transnational body (a la CERT) or a single national body? How can you set up the governance rules to prevent industry capture? This is important because the MPAA might want to use the data-sharing arrangement to enforce copyright. There is a serious conflict between laws and norms: one cannot simply limit the body to "serious crime only", since IP violations carry greater prison sentences than computer misuse in the UK. One suggestion is that the body shouldn't be an outgrowth of existing efforts. A final warning from history: in the US wild west there was a shortage of sheriffs, so the Pinkertons were commissioned to carry out private law enforcement. But 20 years later, strikers were shot in Pittsburgh.

John McHugh: from the talks today he sees lots of contradictions and disagreements about what is ethical and what is not. Either we are an adjunct to law enforcement in this area, or we need to back away from this research. What we cannot do is be a "cowboy" and go out and collect lots of personal data from a botnet. One problem is that we don't have a real vehicle for getting involved with law enforcement when it would be useful to assist. If we are to do the "cowboy" things, we need to create a more explicit arrangement with law enforcement.

Chris: the problem is that we see the same people doing detection and enforcement. He claims that we need to enforce a stricter separation between the two roles. We need to develop ways of conducting research that are acceptable to our peers, but there is currently no consensus.

Nicolas: it is important not to take only a Western world view. Most people today have taken a utilitarian view: helping more than hurting. But there are other approaches, even in the West, and if you go East, you have Buddhist ethics, which appear to conflict with Western ones. So it is important to consider alternative ethical world views.

On the Japanese one-click fraud case, Nicolas thought he was in the clear on a vigilante study where no PII was involved. But someone in Japan claimed it was unethical because he identified some service providers and embarrassed them, and it wasn't his place to displace law enforcement.

Moderator: the speakers have attempted to operationalize ethics by institutionalizing research. But does that put the cart before the horse, by not waiting for community standards and agreement to emerge?

Another question: do people agree that there is a lot of research that stays below the radar?

But what can the community do to carry out enforcement? Sven: send a clear message to bad-apple researchers that what they are doing is wrong. We must clearly set the boundaries in order to carry out enforcement.

Roger: incentives are perverse here. People are rewarded for publishing unethical research.
Ross: people need to go to jail.

John: Sees a lot of wiggling here, but lots of ethical decisions do not have such wiggle room. Ethics are a more universal set of values than a particular list of rules. Bothered that professional societies would be given the task of making rules that are then subject to legalistic interpretations.

We are at a stage of trying to break up inertia, where tolerance of past misdeeds continues into the future. Roger: there is a growing trend of individuals taking action as program chairs, but the problem is that paper submission is a repeated game and not all PCs take action.

Rachel: has been talking with an anthropologist who believes ...

Ross: We have to be careful of using IRBs merely as ass-covering.

Moderator: important not to abrogate ethical responsibility by simply working with law enforcement.

Sven: one can imagine a scenario where the police work with a researcher so that the researcher engages in activities that are legal for the researcher but not legal for the police.

Maritza: why not publicly ostracize papers that fail the ethics test? This seems like a reasonable argument, but the problem is that we don't have clear guidelines, so the results must be made public.

Moderator survey: would it be helpful to have a tool to judge ethical decisions?

Ross: issue of jurisdiction matters. Dave: goes beyond just access to data, and into access to criminological information.

Dave: big problem is that principles aren't agreed upon, and it isn't clear how this maps onto activities.

One problem with IRBs is that if you want to compare the risks and benefits, you need to be able to quantify them, which is hard to do for security research.

Rachel: one aspect of IRB review is asking whether you can carry out the study without causing harm. This step often gets skipped in security research because participating in attacks is cool.

Elizabeth: suggests that we start engaging with an IRB association, and also look at the AoIR document on the ethics of Internet research, first created in 2002. WECSR 2011 concluded.