40th IEEE Symposium on
Security and Privacy


Accepted Papers

Asm2Vec: Boosting Static Representation Robustness for Binary Clone Search against Code Obfuscation and Compiler Optimization
Steven H. H. Ding (McGill University), Benjamin C. M. Fung (McGill University), Philippe Charland (Defence R&D; Canada - Valcartier, Canada)
Reverse engineering is a manually intensive but necessary technique for understanding the inner workings of new malware, finding vulnerabilities in existing systems, and detecting patent infringements in released software. An assembly clone search engine facilitates the work of reverse engineers by identifying those duplicated or known parts. However, it is challenging to design a robust clone search engine, since there exist various compiler optimization options and code obfuscation techniques that make logically similar assembly functions appear to be very different.

A practical clone search engine relies on a robust vector representation of assembly code. However, the existing clone search approaches, which rely on a manual feature engineering process to form a feature vector for an assembly function, fail to consider the relationships between features and identify those unique patterns that can statistically distinguish assembly functions. To address this problem, we propose to jointly learn the lexical semantic relationships and the vector representation of assembly functions based on assembly code. We have developed an assembly code representation learning model \emph{Asm2Vec}. It only needs assembly code as input and does not require any prior knowledge such as the correct mapping between assembly functions. It can find and incorporate rich semantic relationships among tokens appearing in assembly code. We conduct extensive experiments and benchmark the learning model with state-of-the-art static and dynamic clone search approaches. We show that the learned representation is more robust and significantly outperforms existing methods against changes introduced by obfuscation and optimizations.
Attack Directories, Not Caches: Side Channel Attacks in a Non-Inclusive World
Mengjia Yan (University of Illinois at Urbana Champaign), Read Sprabery (University of Illinois at Urbana Champaign), Bhargava Gopireddy (University of Illinois at Urbana Champaign), Christopher Fletcher (University of Illinois at Urbana Champaign), Roy Campbell (University of Illinois at Urbana Champaign), Josep Torrellas (University of Illinois at Urbana Champaign)
Although clouds have strong virtual memory isolation guarantees, cache attacks stemming from shared caches have proved to be a large security problem. However, despite the past effectiveness of cache attacks, their viability has recently been called into question on modern systems, due to trends in cache hierarchy design moving away from inclusive cache hierarchies. In this paper, we reverse engineer the structure of the directory in a sliced, non-inclusive cache hierarchy, and prove that the directory can be used to bootstrap conflict-based cache attacks on the last-level cache. We design the first cross-core Prime+Probe attack on non-inclusive caches. This attack works with minimal assumptions: the adversary does not need to share any virtual memory with the victim, nor run on the same processor core. We also show the first high-bandwidth Evict+Reload attack on the same hardware. We demonstrate both attacks by extracting key bits during RSA operations in GnuPG on a state-of-the-art non-inclusive Intel Skylake-X server.
Blind Certificate Authorities
Liang Wang (UW Madison), Gilad Asharov (Cornell Tech), Rafael Pass (Cornell Tech), Thomas Ristenpart (Cornell Tech), Abhi Shelat (Northeastern University)
We explore how to build a blind certificate authority (CA). Unlike conventional CAs, which learn the exact identity of those registering a public key, a blind CA can simultaneously validate an identity and provide a certificate binding a public key to it, without ever learning the identity. Blind CAs would therefore allow bootstrapping truly anonymous systems in which no party ever learns who participates. In this work we focus on constructing blind CAs that can bind an email address to a public key.

To do so, we first introduce secure channel injection (SCI) protocols. These allow one party (in our setting, the blind CA) to insert a private message into another party's encrypted communications. We construct an efficient SCI protocol for communications delivered over TLS, and use it to realize anonymous proofs of account ownership for SMTP servers. Combined with a zero-knowledge certificate signing protocol, we build the first blind CA that allows Alice to obtain a X.509 certificate binding her email address to a public key of her choosing without ever revealing ``alice'' to the CA. We show experimentally that our system works with standard email server implementations as well as Gmail.
Breaking LTE on Layer Two
David Rupprecht (Ruhr-University Bochum), Katharina Kohls (Ruhr-University Bochum), Thorsten Holz (Ruhr-University Bochum), Christina Pöpper (New York University Abu Dhabi)
Long Term Evolution (LTE) is the latest mobile communication standard and has a pivotal role in our information society: LTE combines performance goals with modern security mechanisms and serves casual use cases as well as critical infrastructure and public safety communications. Both scenarios are demanding towards a resilient and secure specification and implementation of LTE, as outages and open attack vectors potentially lead to severe risks. Previous work on LTE protocol security identified crucial attack vectors for both the physical (layer one) and network (layer three) layers. Data link layer (layer two) protocols, however, remain a blind spot in existing LTE security research. In this paper, we present a comprehensive layer two security analysis and identify three attack vectors. These attacks impair the confidentiality and/or privacy of LTE communication. More specifically, we first present a passive identity mapping attack that matches volatile radio identities to longer lasting network identities, enabling us to identify users within a cell and serving as a stepping stone for follow-up attacks. Second, we demonstrate how a passive attacker can abuse the resource allocation as a side channel to perform website fingerprinting that enables the attacker to learn the websites a user accessed. Finally, we present the A LTE R attack that exploits the fact that LTE user data is encrypted in counter mode (AES-CTR) but not integrity protected, which allows us to modify the message payload. As a proof-of-concept demonstration, we show how an active attacker can redirect DNS requests and then perform a DNS spoofing attack. As a result, the user is redirected to a malicious website. Our experimental analysis demonstrates the real-world applicability of all three attacks and emphasizes the threat of open attack vectors on LTE layer two protocols.
CaSym: Cache Aware Symbolic Execution for Side Channel Detection and Mitigation
Robert Brotzman (Pennsylvania State University), Shen Liu (Pennsylvania State University), Danfeng Zhang (Pennsylvania State University), Gang Tan (Pennsylvania State University), Mahmut Kandemir (Pennsylvania State University)
Cache-based side channels are becoming an important attack vector through which secret information can be leaked to malicious parties. implementations and Previous work on cache-based side channel detection, however, suffers from the code coverage problem or does not provide diagnostic information that is crucial for applying mitigation techniques to vulnerable software. We propose CaSym, a cache-aware symbolic execution to identify and report precise information about where side channels occur in an input program. Compared with existing work, CaSym provides several unique features: (1) CaSym enables verification against various attack models and cache models, (2) unlike many symbolic-execution systems for bug finding, CaSym verifies all program execution paths in a sound way, (3) CaSym uses two novel abstract cache models that provide good balance between analysis scalability and precision, and (4) CaSym provides sufficient information on where and how to mitigate the identified side channels through techniques including preloading and pinning. Evaluation on a set of crypto and database benchmarks shows that CaSym is effective at identifying and mitigating side channels, with reasonable efficiency.
Certified Robustness to Adversarial Examples with Differential Privacy
Mathias Lecuyer (Columbia University), Vaggelis Atlidakis (Columbia University), Roxana Geambasu (Columbia University), Daniel Hsu (Columbia University), Suman Jana (Columbia University)
Adversarial examples that fool machine learning models, particularly deep neural networks, have been a topic of intense research interest, with attacks and defenses being developed in a tight back-and-forth. Most past defenses are best effort and have been shown to be vulnerable to sophisticated attacks. Recently a set of certified defenses have been introduced, which provide guarantees of robustness to norm-bounded attacks. However these defenses either do not scale to large datasets or are limited in the types of models they can support. This paper presents the first certified defense that both scales to large networks and datasets (such as Google's Inception network for ImageNet) and applies broadly to arbitrary model types. Our defense, called PixelDP, is based on a novel connection between robustness against adversarial examples and differential privacy, a cryptographically-inspired privacy formalism, that provides a rigorous, generic, and flexible foundation for defense.
Characterizing Pixel Tracking through the Lens of Disposable Email Services
Hang Hu (Virginia Tech), Peng Peng (Virginia Tech), Gang Wang (Virginia Tech)
Disposable email services provide temporary email addresses, which allows people to register online accounts without exposing their real email addresses. In this paper, we perform the first measurement study on disposable email services with two main goals. First, we aim to understand what disposable email services are used for, and what risks (if any) are involved in the common use cases. Second, we use the disposable email services as a public gateway to collect a large-scale email dataset for measuring email tracking. Over three months, we collected a dataset from 7 popular disposable email services which contain 2.3 million emails sent by 210K domains. We show that online accounts registered through disposable email addresses can be easily hijacked, leading to potential information leakage and financial loss. By empirically analyzing email tracking, we find that third-party tracking is highly prevalent, especially in the emails sent by popular services. We observe that trackers are using various methods to hide their tracking behavior such as falsely claiming the size of tracking images or hiding real trackers behind redirections. A few top trackers stand out in the tracking ecosystem but are not yet dominating the market.
DEEPSEC: A Uniform Platform for Security Analysis of Deep Learning Model
Xiang Ling (Zhejiang University), Shouling Ji (Zhejiang University, Alibaba-Zhejiang University Joint Research Institute of Frontier Technologies), Jiaxu Zou (Zhejiang University), Jiannan Wang (Zhejiang University), Chunming Wu (Zhejiang University), Bo Li (UIUC), Ting Wang (Lehigh University)
Deep learning (DL) models are inherently vulnerable to adversarial examples - maliciously crafted inputs to trigger target DL models to misbehave - which significantly hinders the application of DL in security-sensitive domains. Intensive research on adversarial learning has led to an arms race between adversaries and defenders. Such plethora of emerging attacks and defenses raise many questions: Which attacks are more evasive, preprocessing-proof, or transferable? Which defenses are more effective, utility-preserving, or general? Are ensembles of multiple defenses more robust than individuals? Yet, due to the lack of platforms for comprehensive evaluation on adversarial attacks and defenses, these critical questions remain largely unsolved. In this paper, we present the design, implementation, and evaluation of DEEPSEC, a uniform platform that aims to bridge this gap. In its current implementation, DEEPSEC incorporates 16 state-of-the-art attacks with 10 attack utility metrics, and 13 state-of-the-art defenses with 5 defensive utility metrics. To our best knowledge, DEEPSEC is the first platform that enables researchers and practitioners to (i) measure the vulnerability of DL models, (ii) evaluate the effectiveness of various attacks/defenses, and (iii) conduct comparative studies on attacks/defenses in a comprehensive and informative manner. Leveraging DEEPSEC, we systematically evaluate the existing adversarial attack and defense methods, and draw a set of key findings, which demonstrate DEEPSEC's rich functionality, such as (1) the trade-off between misclassification and imperceptibility is empirically confirmed; (2) most defenses that claim to be universally applicable can only defend against limited types of attacks under restricted settings; (3) it is not necessary that adversarial examples with higher perturbation magnitude are easier to be detected; (4) the ensemble of multiple defenses cannot improve the overall defense capability, but can improve the lower bound of the defense effectiveness of individuals. Extensive analysis on DEEPSEC demonstrates its capabilities and advantages as a benchmark platform which can benefit future adversarial learning research.
Dangerous Skills: Understanding and Mitigating Security Risks of Voice-Controlled Third-Party Functions on Virtual Personal Assistant Systems
Nan Zhang (Indiana University, Bloomington), Xianghang Mi (Indiana University, Bloomington), Xuan Feng (Indiana University, Bloomington; Beijing Key Laboratory of IOT Information Security Technology, Institute of Information Engineering, CAS, China), XiaoFeng Wang (Indiana University, Bloomington), Yuan Tian (University of Virginia), Feng Qian (Indiana University, Bloomington)
Virtual personal assistants (VPA) (e.g., Amazon Alexa and Google Assistant) today mostly rely on the voice channel to communicate with their users, which however is known to be vulnerable, lacking proper authentication (from the user to the VPA). A new authentication challenge, from the VPA service to the user, has emerged with the rapid growth of the VPA ecosystem, which allows a third party to publish a function (called skill) for the service and therefore can be exploited to spread malicious skills to a large audience during their interactions with smart speakers like Amazon Echo and Google Home. In this paper, we report a study that concludes such remote, large-scale attacks are indeed realistic. We discovered two new attacks: voice squatting in which the adversary exploits the way a skill is invoked (e.g., ``open capital one''), using a malicious skill with a similarly pronounced name (e.g., ``capital won'') or a paraphrased name (e.g., ``capital one please'') to hijack the voice command meant for a legitimate skill (e.g., ``capital one''), and voice masquerading in which a malicious skill impersonates the VPA service or a legitimate skill during the user's conversation with the service to steal her personal information. These attacks aim at the way VPAs work or the user's misconceptions about their functionalities, and are found to pose a realistic threat by our experiments (including user studies and real-world deployments) on Amazon Echo and Google Home. The significance of our findings has already been acknowledged by Amazon and Google, and further evidenced by the risky skills found on Alexa and Google markets by the new squatting detector we built. We further developed a technique that automatically captures an ongoing masquerading attack and demonstrated its efficacy.
Data Recovery on Encrypted Databases with k-Nearest Neighbor Query Leakage
Evgenios M. Kornaropoulos (Brown University), Charalampos Papamanthou (University of Maryland), Roberto Tamassia (Brown University)
Recent works by Kellaris et al. (CCS'16) and Lacharite et al. (SP'18) demonstrated attacks of data recovery for encrypted databases that support rich queries such as range queries. In this paper, we develop the first data recovery attacks on encrypted databases supporting one-dimensional k-nearest neighbor (k-NN) queries, which are widely used in spatial data management. Our attacks exploit a generic k-NN query leakage profile: the attacker observes the identifiers of matched records. We consider both unordered responses, where the leakage is a set, and ordered responses, where the leakage is a k-tuple ordered by distance from the query point.

As a first step, we perform a theoretical feasibility study on exact reconstruction, i.e., recovery of the exact plaintext values of the encrypted database. For ordered responses, we show that exact reconstruction is feasible if the attacker has additional access to some auxiliary information that is normally not available in practice. For unordered responses, we prove that exact reconstruction is impossible due to the infinite number of valid reconstructions. As a next step, we propose practical and more realistic approximate reconstruction attacks so as to recover an approximation of the plaintext values. For ordered responses, we show that after observing enough query responses, the attacker can approximate the client's encrypted database with considerable accuracy. For unordered responses we characterize the set of valid reconstructions as a convex polytope in a k-dimensional space and present a rigorous attack that reconstructs the plaintext database with bounded approximation error.

As multidimensional spatial data can be efficiently processed by mapping it to one dimension via Hilbert curves, we demonstrate our approximate reconstruction attacks on privacy-sensitive geolocation data. Our experiments on real-world datasets show that our attacks reconstruct the plaintext values with relative error ranging from 2.9% to 0.003%.
Differentially Private Model Publishing for Deep Learning
Lei Yu (Georgia Institute of Technology), Ling Liu (Georgia Institute of Technology), Calton Pu (Georgia Institute of Technology), Mehmet Emre Gursoy (Georgia Institute of Technology), Stacey Truex (Georgia Institute of Technology)
Deep learning techniques based on neural networks have shown significant success in a wide range of AI tasks. Large-scale training datasets are one of the critical factors for their success. However, when the training datasets are crowdsourced from individuals and contain sensitive information, the model parameters may encode private information and bear the risks of privacy leakage. The recent growing trend of the sharing and publishing of pre-trained models further aggravates such privacy risks. To tackle this problem, we propose a differentially private approach for training neural networks. Our approach includes several new techniques for optimizing both privacy loss and model accuracy. We employ a generalization of differential privacy called concentrated differential privacy(CDP), with both a formal and refined privacy loss analysis on two different data batching methods. We implement a dynamic privacy budget allocator over the course of training to improve model accuracy. Extensive experiments demonstrate that our approach effectively improves privacy loss accounting, training efficiency and model quality under a given privacy budget.
Does Certificate Transparency Break the Web? Measuring Adoption and Error Rate
Emily Stark (Google), Ryan Sleevi (Google), Rijad Muminovic (University of Sarajevo), Devon O'Brien (Google), Eran Messeri (Google), Adrienne Porter Felt (Google), Brendan McMillion (Cloudflare), Parisa Tabriz (Google)
Certificate Transparency (CT) is an emerging system for enabling the rapid discovery of malicious or misissued certificates. Initially standardized in 2013, CT is now finally beginning to see widespread support. Although CT provides desirable security benefits, web browsers cannot begin requiring all websites to support CT at once, due to the risk of breaking large numbers of websites. We discuss challenges for deployment, analyze the adoption of CT on the web, and measure the error rates experienced by users of the Google Chrome web browser. We find that CT has so far been widely adopted with minimal breakage and warnings.

Security researchers often struggle with the tradeoff between security and user frustration: rolling out new security requirements often causes breakage. We view CT as a case study for deploying ecosystem-wide change while trying to minimize end user impact. We discuss the design properties of CT that made its success possible, as well as draw lessons from its risks and pitfalls that could be avoided in future large-scale security deployments.
Exploiting Unintended Feature Leakage in Collaborative Learning
Luca Melis (University College London), Congzheng Song (Cornell University), Emiliano De Cristofaro (University College London), Vitaly Shmatikov (Cornell Tech)
Collaborative machine learning and related techniques such as federated learning allow multiple participants, each with his own training dataset, to build a joint model by training locally and periodically exchanging model updates. We demonstrate that these updates leak unintended information about participants' training data and develop passive and active inference attacks to exploit this leakage. First, we show that an adversarial participant can infer the presence of exact data points -- for example, specific locations -- in others' training data (i.e., membership inference). Then, we show how this adversary can infer properties that hold only for a subset of the training data and are independent of the properties that the joint model aims to capture. For example, he can infer when a specific person first appears in the photos used to train a binary gender classifier. We evaluate our attacks on a variety of tasks, datasets, and learning configurations, analyze their limitations, and discuss possible defenses.
Fidelius: Protecting User Secrets from Compromised Browsers
Saba Eskandarian (Stanford University), Jonathan Cogan (Stanford University), Sawyer Birnbaum (Stanford University), Peh Chang Wei Brandon (Stanford University), Dillon Franke (Stanford University), Forest Fraser (Stanford University), Gaspar Garcia (Stanford University), Eric Gong (Stanford University), Hung T. Nguyen (Stanford University), Taresh K. Sethi (Stanford University), Vishal Subbiah (Stanford University), Michael Backes (CISPA Helmholtz Center for Information Security), Giancarlo Pellegrino (Stanford University/CISPA Helmholtz Center for Information Security), Dan Boneh (Stanford University)
Users regularly enter sensitive data, such as passwords, credit card numbers, or tax information, into the browser window. While modern browsers provide powerful client-side privacy measures to protect this data, none of these defenses prevent a browser compromised by malware from stealing it. In this work, we present Fidelius, a new architecture that uses trusted hardware enclaves integrated into the browser to enable protection of user secrets during web browsing sessions, even if the entire underlying browser and OS are fully controlled by a malicious attacker.

Fidelius solves many challenges involved in providing protection for browsers in a fully malicious environment, offering support for integrity and privacy for form data, JavaScript execution, XMLHttpRequests, and protected web storage, while minimizing the TCB. Moreover, interactions between the enclave and the browser, the keyboard, and the display all require new protocols, each with their own security considerations. Finally, Fidelius takes into account UI considerations to ensure a consistent and simple interface for both developers and users.

As part of this project, we develop the first open source system that provides a trusted path from input and output peripherals to a hardware enclave with no reliance on additional hypervisor security assumptions. These components may be of independent interest and useful to future projects.

We implement and evaluate Fidelius to measure its performance overhead, finding that Fidelius imposes acceptable overhead on page load and user interaction for secured pages and has no impact on pages and page components that do not use its enhanced security features.
Fuzzing File Systems via Two-Dimensional Input Space Exploration
Wen Xu (Georgia Institute of Technology), Hyungon Moon (Ulsan National Institute of Science and Technology), Sanidhya Kashyap (Georgia Institute of Technology), Po-Ning Tseng (Georgia Institute of Technology), Taesoo Kim (Georgia Institute of Technology)
File systems, a basic building block of an OS, are too big and too complex to be bug free. Nevertheless, file systems rely on regular stress-testing tools and formal checkers to find bugs, which are limited due to the ever-increasing complexity of both file systems and OSes. Thus, fuzzing, proven to be an effective and a practical approach, becomes a preferable choice, as it does not need much knowledge about a target. However, three main challenges exist in fuzzing file systems: mutating a large image blob that degrades overall performance, generating image-dependent file operations, and reproducing found bugs, which is difficult for existing OS fuzzers. Hence, we present JANUS, the first feedback-driven fuzzer that explores the two-dimensional input space of a file system, i.e., mutating metadata on a large image, while emitting image-directed file operations. In addition, JANUS relies on a library OS rather than on traditional VMs for fuzzing, which enables JANUS to load a fresh copy of the OS, thereby leading to better reproducibility of bugs. We evaluate JANUS on eight file systems and found 90 bugs in the upstream Linux kernel, 62 of which have been acknowledged. Forty-three bugs have been fixed with 32 CVEs assigned. In addition, JANUS achieves higher code coverage on all the file systems after fuzzing 12 hours, when compared with the state-of-the-art fuzzer Syzkaller for fuzzing file systems. JANUS visits 4.19x and 2.01x more code paths in Btrfs and ext4, respectively. Moreover, JANUS is able to reproduce 88-100% of the crashes, while Syzkaller fails on all of them.
HOLMES: Real-Time APT Detection through Correlation of Suspicious Information Flows
Sadegh Momeni Milajerdi (University of Illinois at Chicago), Rigel Gjomemo (University of Illinois at Chicago), Birhanu Eshete (University of Michigan-Dearborn), R. Sekar (Stony Brook University), V.N. Venkatakrishnan (University of Illinois at Chicago)
In this paper, we present HOLMES, a system that implements a new approach to the detection of Advanced and Persistent Threats (APTs). HOLMES is inspired by several case studies of real-world APTs that highlight some common goals of APT actors. In a nutshell, HOLMES aims to produce a detection signal that indicates the presence of a coordinated set of activities that are part of an APT campaign. One of the main challenges addressed by our approach involves developing a suite of techniques that make the detection signal robust and reliable. At a high-level, the techniques we develop effectively leverage the correlation between suspicious information flows that arise during an attacker campaign. In addition to its detection capability, HOLMES is also able to generate a high-level graph that summarizes the attacker's actions in real-time. This graph can be used by an analyst for an effective cyber response. An evaluation of our approach against some real-world APTs indicates that HOLMES can detect APT campaigns with high precision and low false alarm rate. The compact high-level graphs produced by HOLMES effectively summarizes an ongoing attack campaign and can assist real-time cyber-response operations.
Hard Drive of Hearing: Disks that Eavesdrop with a Synthesized Microphone
Andrew Kwong (University of Michigan), Wenyuan Xu (Zhejiang University), Kevin Fu (University of Michigan)
Security conscious individuals may take considerable measures to disable sensors in order to protect their privacy. However, they often overlook the cyberphysical attack surface exposed by devices that were never designed to be sensors in the first place. Our research demonstrates that the mechanical components in magnetic hard disk drives behave as microphones with sufficient precision to extract and parse human speech. These unintentional microphones sense speech with high enough fidelity for the Shazam service to recognize a song recorded through the hard drive. This proof of concept attack sheds light on the possibility of invasion of privacy even in absence of traditional sensors. We also present defense mechanisms, such as the use of ultrasonic aliasing, that can mitigate acoustic eavesdropping by synthesized microphones in hard disk drives.
How Well Do My Results Generalize? Comparing Security and Privacy Survey Results from MTurk, Web, and Telephone Samples
Elissa M. Redmiles (University of Maryland), Sean Kross (University of California San Diego), Michelle L. Mazurek (University of Maryland)
Security and privacy researchers often rely on data collected from Amazon Mechanical Turk (MTurk) to evaluate security tools, to understand users' privacy preferences and to measure online behavior. Yet, little is known about how well Turkers' survey responses and performance on security- and privacy-related tasks generalizes to a broader population. This paper takes a first step toward understanding the generalizability of security and privacy user studies by comparing users' self-reports of their security and privacy knowledge, past experiences, advice sources, and behavior across samples collected using MTurk (n=480), a census-representative web-panel (n=428), and a probabilistic telephone sample (n=3,000) statistically weighted to be accurate within 2.7% of the true prevalence in the U.S.

Surprisingly, the results suggest that: (1) MTurk responses regarding security and privacy experiences, advice sources, and knowledge are more representative of the U.S. population than are responses from the census-representative panel; (2) MTurk and general population reports of security and privacy experiences, knowledge, and advice sources are quite similar for respondents who are younger than 50 or who have some college education; and (3) respondents' answers to the survey questions we ask are stable over time and robust to relevant, broadly-reported news events. Further, differences in responses cannot be ameliorated with simple demographic weighting, possibly because MTurk and panel participants have more internet experience compared to their demographic peers. Together, these findings lend tempered support for the generalizability of prior crowdsourced security and privacy user studies; provide context to more accurately interpret the results of such studies; and suggest rich directions for future work to mitigate experience- rather than demographic-related sample biases.
Iodine: Fast Dynamic Taint Tracking Using Rollback-free Optimistic Hybrid Analysis
Subarno Banerjee (University of Michigan), David Devecsery (Georgia Institute of Technology), Peter M. Chen (University of Michigan), Satish Narayanasamy (University of Michigan)
Dynamic information-flow tracking (DIFT) is useful for enforcing security policies, but rarely used in practice, as it can slow down a program by an order of magnitude. Static program analyses can be used to prove safe execution states and elide unnecessary DIFT monitors, but the performance improvement from these analyses is limited by their need to maintain soundness.

In this paper, we present a novel optimistic hybrid analysis (OHA) to significantly reduce DIFT overhead while still guaranteeing sound results. It consists of a predicated whole-program static taint analysis, which assumes likely invariants gathered from profiles to dramatically improve precision. The optimized DIFT is sound for executions in which those invariants hold true, and recovers to a conservative DIFT for executions in which those invariants are false. We show how to overcome the main problem with using OHA to optimize live executions, which is the possibility of unbounded rollbacks. We eliminate the need for any rollback during recovery by tailoring our predicated static analysis to eliminate only safe elisions of noop monitors. Our tool, Iodine, reduces the overhead of DIFT for enforcing security policies to 9%, which is 4.4x lower than that with traditional hybrid analysis, while still being able to be run on live systems.
LBM: A Security Framework for Peripherals within the Linux Kernel
Dave Jing Tian (University of Florida), Grant Hernandez (University of Florida), Joseph I. Choi (University of Florida), Vanessa Frost (University of Florida), Peter C. Johnson (Middlebury College), Kevin R. B. Butler (University of Florida)
Modern computer peripherals are diverse in their capabilities and functionality, ranging from keyboards and printers to smartphones and external GPUs. In recent years, peripherals increasingly connect over a small number of standardized communication protocols, including USB, Bluetooth, and NFC. The host operating system is responsible for managing these devices; however, malicious peripherals can request additional functionality from the OS resulting in system compromise, or can craft data packets to exploit vulnerabilities within OS software stacks. Defenses against malicious peripherals to date only partially cover the peripheral attack surface and are limited to specific protocols (e.g., USB). In this paper, we propose Linux (e)BPF Modules (LBM), a general security framework that provides a unified API for enforcing protection against malicious peripherals within the Linux kernel. LBM leverages the eBPF packet filtering mechanism for performance and extensibility and we provide a high-level language to facilitate the development of powerful filtering functionality. We demonstrate how LBM can provide host protection against malicious USB, Bluetooth, and NFC devices; we also instantiate and unify existing defenses under the LBM framework. Our evaluation shows that the overhead introduced by LBM is within 1 ?s per packet in most cases, application and system overhead is negligible, and LBM outperforms other state-of-the-art solutions. To our knowledge, LBM is the first security framework designed to provide comprehensive protection against malicious peripherals within the Linux kernel.
Learning to Reconstruct: Statistical Learning Theory and Encrypted Database Attacks
Paul Grubbs (Cornell University), Marie-Sarah Lacharité (Royal Holloway, University of London), Brice Minaud (Ecole Normale Supérieure, CNRS, PSL University and Inria), Kenneth G. Paterson (Royal Holloway, University of London)
We show that the problem of reconstructing encrypted databases from access pattern leakage is closely related to statistical learning theory. This new viewpoint enables us to develop broader attacks that are supported by streamlined performance analyses. First, we address the problem of ϵ-approximate database reconstruction (ϵ-ADR) from range query leakage, giving attacks whose query cost scales only with the relative error ϵ, and is independent of the size of the database, or the number N of possible values of data items. This already goes significantly beyond the state-of-the-art for such attacks, as represented by Kellaris et al. (ACM CCS 2016) and Lacharité et al. (IEEE S&P; 2018). We also study the new problem of ϵ-approximate order reconstruction (ϵ-AOR), where the adversary is tasked with reconstructing the order of records, except for records whose values are approximately equal. We show that as few as O(ϵ^-1 log ϵ^-1) uniformly random range queries suffice. Our analysis relies on an application of learning theory to PQ-trees, special data structures tuned to compactly record certain ordering constraints. We then show that when an auxiliary distribution is available, ϵ-AOR can be enhanced to achieve ϵ-ADR; using real data, we show that devastatingly small numbers of queries are needed to attain very accurate database reconstruction. Finally, we generalize from ranges to consider what learning theory tells us about the impact of access pattern leakage for other classes of queries, focusing on prefix and suffix queries. We illustrate this with both concrete attacks for prefix queries and with a general lower bound for all query classes. We also show a very general reduction from reconstruction with known or chosen queries to PAC learning.
Measuring and Analyzing Search Engine Poisoning of Linguistic Collisions
Matthew Joslin (University of Texas at Dallas), Neng Li (Shanghai Jiao Tong University), Shuang Hao (University of Texas at Dallas), Minhui Xue (Macquarie University), Haojin Zhu (Shanghai Jiao Tong University)
Misspelled keywords have become an appealing target in search poisoning, since they are less competitive to promote than the correct queries and account for a considerable amount of search traffic. Search engines have adopted several countermeasure strategies, e.g., Google applies automated corrections on queried keywords and returns search results of the corrected versions directly. However, a sophisticated class of attack, which we term as linguistic-collision misspelling, can evade auto-correction and poison search results. Cybercriminals target special queries where the misspelled terms are existent words, even in other languages (e.g., "idobe", a misspelling of the English word "adobe", is a legitimate word in the Nigerian language).

In this paper, we perform the first large-scale analysis on linguistic-collision search poisoning attacks. In particular, we check 1.77 million misspelled search terms on Google and Baidu and analyze both English and Chinese languages, which are the top two languages used by Internet users. We leverage edit distance operations and linguistic properties to generate misspelling candidates. To more efficiently identify linguistic-collision search terms, we design a deep learning model that can improve collection rate by 2.84x compared to random sampling. Our results show that the abuse is prevalent: around 1.19% of linguistic-collision search terms on Google and Baidu have results on the first page directing to malicious websites. We also find that cybercriminals mainly target categories of gambling, drugs, and adult content. Mobile-device users disproportionately search for misspelled keywords, presumably due to small screen for input. Our work highlights this new class of search engine poisoning and provides insights to help mitigate the threat.
Neural Cleanse: Identifying and Mitigating Backdoor Attacks in Neural Networks
Bolun Wang (UC Santa Barbara), Yuanshun Yao (University of Chicago), Shawn Shan (University of Chicago), Huiying Li (University of Chicago), Bimal Viswanath (Virginia Tech), Haitao Zheng (University of Chicago), Ben Y. Zhao (University of Chicago)
Lack of transparency in deep neural networks (DNNs) make them susceptible to backdoor attacks, where hidden associations or triggers override normal classification to produce unexpected results. For example, a model with a backdoor always identifies a face as Bill Gates if a specific symbol is present in the input. Backdoors can stay hidden indefinitely until activated by an input, and present a serious security risk to many security or safety related applications, e.g. biometric authentication systems or self-driving cars.

We present the first robust and generalizable detection and mitigation system for DNN backdoor attacks. Our techniques identify backdoors and reconstruct possible triggers. We identify multiple mitigation techniques via input filters, neuron pruning and unlearning. We demonstrate their efficacy via extensive experiments on a variety of DNNs, against two types of backdoor injection methods identified by prior work. Our techniques also prove robust against a number of variants of the backdoor attack.
Perun: Virtual Payment Hubs over Cryptocurrencies
Stefan Dziembowski (University of Warsaw), Lisa Eckey (TU Darmstadt), Sebastian Faust (TU Darmstadt), Daniel Malinowski (University of Warsaw)
Payment channels emerged recently as an efficient method for performing cheap micropayments in cryptocurrencies. In contrast to traditional on-chain transactions, payment channels have the advantage that they allow for nearly unlimited number of transactions between parties without involving the blockchain. In this work, we introduce Perun, an off-chain channel system that offers a new method for connecting channels that is more efficient than the existing technique of ``routing transactions'' over multiple channels. To this end, Perun introduces a technique called ``virtual payment channels'' that avoids involvement of the intermediary for each individual payment. In this paper we formally model and prove security of this technique in the case of one intermediary, who can be viewed as a ``payment hub'' that has direct channels with several parties. Our scheme works over any cryptocurrency that provides Turing-complete smart contracts. As a proof of concept, we implemented Perun's smart contracts in Ethereum.
PrivKV: Key-Value Data Collection with Local Differential Privacy
Qingqing Ye (Renmin University of China), Haibo Hu (Hong Kong Polytechnic University), Xiaofeng Meng (Renmin University of China), Huadi Zheng (Hong Kong Polytechnic University)
Local differential privacy (LDP), where each user perturbs her data locally before sending to an untrusted data collector, is a new and promising technique for privacy-preserving distributed data collection. The advantage of LDP is to enable the collector to obtain accurate statistical estimation on sensitive user data (e.g., location and app usage) without accessing them. However, existing work on LDP is limited to simple data types, such as categorical, numerical, and set-valued data. To the best of our knowledge, there is no existing LDP work on key-value data, which is an extremely popular NoSQL data model and the generalized form of set-valued and numerical data. In this paper, we study this problem of frequency and mean estimation on key-value data by first designing a baseline approach PrivKV within the same "perturbation-calibration" paradigm as existing LDP techniques. To address the poor estimation accuracy due to the clueless perturbation of users, we then propose two iterative solutions PrivKVM and PrivKVM+ that can gradually improve the estimation results through a series of iterations. An optimization strategy is also presented to reduce network latency and increase estimation accuracy by introducing virtual iterations in the collector side without user involvement. We verify the correctness and effectiveness of these solutions through theoretical analysis and extensive experimental results.
Proof-of-Stake Sidechains
Peter Gaži (IOHK), Aggelos Kiayias (University of Edinburgh, IOHK), Dionysis Zindros (University of Athens, IOHK)
Sidechains have long been heralded as the key enabler of blockchain scalability and interoperability. However, no modeling of the concept or a provably secure construction has so far been attempted. We provide the first formal definition of what a sidechain system is and how assets can be moved between sidechains securely. We put forth a security definition that augments the known transaction ledger properties of liveness and safety to hold across multiple ledgers and enhance them with a new "firewall" security property which safeguards each blockchain from its sidechains, limiting the impact of an otherwise catastrophic sidechain failure. We then provide a sidechain construction that is suitable for proof-of-stake (PoS) sidechain systems. As an exemplary concrete instantiation we present our construction for an epoch- based PoS system consistent with Ouroboros (Crypto 2017), the PoS blockchain protocol used in Cardano which is one of the largest pure PoS systems by market capitalisation, and we also comment how the construction can be adapted for other protocols such as Ouroboros Praos (Eurocrypt 2018), Ouroboros Genesis (CCS 2018), Snow White and Algorand. An important feature of our construction is merged-staking that prevents "goldfinger" attacks against a sidechain that is only carrying a small amount of stake. An important technique for pegging chains that we use in our construction is cross-chain certification which is facilitated by a novel cryptographic primitive we introduce called ad-hoc threshold multisignatures (ATMS) which may be of independent interest. We show how ATMS can be securely instantiated by regular and aggregate digital signatures as well as succinct arguments of knowledge such as STARKs and bulletproofs with varying degrees of storage efficiency.
Razzer: Finding Kernel Race Bugs through Fuzzing
Dae R. Jeong (KAIST), Kyungtae Kim (Purdue University), Basavesh Shivakumar (Purdue University), Byoungyoung Lee (Seoul National University, Purdue University), Insik Shin (KAIST)
A data race in a kernel is an important class of bugs, critically impacting the reliability and security of the associated system. As a result of a race, the kernel may become unresponsive. Even worse, an attacker may launch a privilege escalation attack to acquire root privileges. In this paper, we propose Razzer, a tool to find race bugs in kernels. The core of Razzer is in guiding fuzz testing towards potential data race spots in the kernel. Razzer employs two techniques to find races efficiently: a static analysis and a deterministic thread interleaving technique. Using a static analysis, Razzer identifies over-approximated potential data race spots, guiding the fuzzer to search for data races in the kernel more efficiently. Using the deterministic thread interleaving technique implemented at the hypervisor, Razzer tames the non-deterministic behavior of the kernel such that it can deterministically trigger a race. We implemented a prototype of Razzer and ran the latest Linux kernel (from v4.16-rc3 to v4.18-rc3) using Razzer. As a result, Razzer discovered 30 new races in the kernel, with 16 subsequently confirmed and accordingly patched by kernel developers after they were reported.
Redactable Blockchain in the Permissionless Setting
Dominic Deuber (Friedrich-Alexander-University Erlangen-Nurnberg), Bernardo Magri (Aarhus University), Sri Aravinda Krishnan Thyagarajan (Friedrich-Alexander-University Erlangen-Nurnberg)
Bitcoin is an immutable permissionless blockchain system that has been extensively used as a public bulletin board by many different applications that heavily relies on its immutability. However, Bitcoin's immutability is not without its fair share of demerits. Interpol exposed the existence of harmful and potentially illegal documents, images and links in the Bitcoin blockchain, and since then there have been several qualitative and quantitative analysis on the types of data currently residing in the Bitcoin blockchain. Although there is a lot of attention on blockchains, surprisingly the previous solutions proposed for data redaction in the permissionless setting are far from feasible, and require additional trust assumptions. Hence, the problem of harmful data still poses a huge challenge for law enforcement agencies like Interpol (Tziakouris, IEEE S&P;'18).

We propose the first efficient redactable blockchain for the permissionless setting that is easily integrable into Bitcoin, and that does not rely on heavy cryptographic tools or trust assumptions. Our protocol uses a consensus-based voting and is parameterised by a policy that dictates the requirements and constraints for the redactions; if a redaction gathers enough votes the operation is performed on the chain. As an extra feature, our protocol offers public verifiability and accountability for the redacted chain. Moreover, we provide formal security definitions and proofs showing that our protocol is secure against redactions that were not agreed by consensus. Additionally, we show the viability of our approach with a proof-of-concept implementation that shows only a tiny overhead in the chain validation of our protocol when compared to an immutable one.
Resident Evil: Understanding Residential IP Proxy as a Dark Service
Xianghang Mi (Indiana University Bloomington), Xuan Feng (Indiana University Bloomington), Xiaojing Liao (Indiana University Bloomington), Baojun Liu (Tsinghua University), XiaoFeng Wang (Indiana University Bloomington), Feng Qian (Indiana University Bloomington), Zhou Li (IEEE member), Sumayah Alrwais (King Saud University), Limin Sun (Institute of Information Engineering, CAS), Ying Liu (Tsinghua University)
Abstract-An emerging Internet business is residential proxy (RESIP) as a service, in which a provider utilizes the hosts within residential networks (in contrast to those running in a datacenter) to relay their customers' traffic, in an attempt to avoid server- side blocking and detection. With the prominent roles the services could play in the underground business world, little has been done to understand whether they are indeed involved in Cybercrimes and how they operate, due to the challenges in identifying their RESIPs, not to mention any in-depth analysis on them. In this paper, we report the first study on RESIPs, which sheds light on the behaviors and the ecosystem of these elusive gray services. Our research employed an infiltration framework, including our clients for RESIP services and the servers they visited, to detect 6 million RESIP IPs across 230+ countries and 52K+ ISPs. The observed addresses were analyzed and the hosts behind them were further fingerprinted using a new profiling system. Our effort led to several surprising findings about the RESIP services unknown before. Surprisingly, despite the providers' claim that the proxy hosts are willingly joined, many proxies run on likely compromised hosts including IoT devices. Through cross-matching the hosts we discovered and labeled PUP (potentially unwanted programs) logs provided by a leading IT company, we uncovered various illicit operations RESIP hosts performed, including illegal promotion, Fast fluxing, phishing, malware hosting, and others. We also reverse engi- neered RESIP services' internal infrastructures, uncovered their potential rebranding and reselling behaviors. Our research takes the first step toward understanding this new Internet service, contributing to the effective control of their security risks.
Short Text, Large Effect: Measuring the Impact of User Reviews on Android App Security & Privacy
Duc Cuong Nguyen (CISPA, Saarland University), Erik Derr (CISPA, Saarland University), Michael Backes (CISPA Helmholtz Center i.G.), Sven Bugiel (CISPA Helmholtz Center i.G.)
Application markets streamline the end-users' task of finding and installing applications. They also form an immediate communication channel between app developers and their end-users in form of app reviews, which allow users to provide developers feedback on their apps. However, it is unclear to which extent users employ this channel to point out their security and privacy concerns about apps, about which aspects of apps users express concerns, and how developers react to such security- and privacy-related reviews. In this paper, we present the first study of the relationship between end-user reviews and security- & privacy-related changes in apps. Using natural language processing on 4.5M user reviews for the top 2,583 apps in Google Play, we identified 5,527 security and privacy relevant reviews (SPR). For each app version mentioned in the SPR, we use static code analysis to extract permission-protected features mentioned in the reviews. We successfully mapped SPRs to privacy-related changes in app updates in 60.77% of all cases. Using exploratory data analysis and regression analysis we are able to show that preceding SPR are a significant factor for predicting privacy-related app updates, indicating that user reviews in fact lead to privacy improvements of apps. Our results further show that apps that adopt runtime permissions receive a significantly higher number of SPR, showing that runtime permissions put privacy-jeopardizing actions better into users' minds. Further, we can attribute about half of all privacy-relevant app changes exclusively to third-party library code. This hints at larger problems for app developers to adhere to users' privacy expectations and markets' privacy regulations. Our results make a call for action to make app behavior more transparent to users in order to leverage their reviews in creating incentives for developers to adhere to security and privacy best practices, while our results call at the same time for better tools to support app developers in this endeavor.
Simple High-Level Code for Cryptographic Arithmetic - With Proofs, Without Compromises
Andres Erbsen (Massachusetts Institute of Technology), Jade Philipoom (Massachusetts Institute of Technology), Jason Gross (Massachusetts Institute of Technology), Robert Sloan (Massachusetts Institute of Technology), Adam Chlipala (Massachusetts Institute of Technology)
We introduce a new approach for implementing cryptographic arithmetic in short high-level code with machine-checked proofs of functional correctness. We further demonstrate that simple partial evaluation is sufficient to transform into the fastest-known C code, breaking the decades-old pattern that the only fast implementations are those whose instruction-level steps were written out by hand.

These techniques were used to build an elliptic-curve library that achieves competitive performance for 80 prime fields and multiple CPU architectures, showing that implementation and proof effort scales with the number and complexity of conceptually different algorithms, not their use cases. As one outcome, we present the first verified high-performance implementation of P-256, the most widely used elliptic curve. implementations from our library were included in BoringSSL to replace existing specialized code, for inclusion in several large deployments for Chrome, Android, and CloudFlare.
SoK: General Purpose Compilers for Secure Multi-Party Computation
Marcella Hastings (University of Pennsylvania), Brett Hemenway (University of Pennsylvania), Daniel Noble (University of Pennsylvania), Steve Zdancewic (University of Pennsylvania)
Secure multi-party computation (MPC) allows a group of mutually distrustful parties to compute a joint function on their inputs without revealing any information beyond the result of the computation. This type of computation is extremely powerful and has wide-ranging applications in academia, industry, and government. Protocols for secure computation have existed for decades, but only recently have general-purpose compilers for executing MPC on arbitrary functions been developed. These projects rapidly improved the state of the art, and began to make MPC accessible to non-expert users. However, the field is changing so rapidly that it is difficult even for experts to keep track of the varied capabilities of modern frameworks.

In this work, we survey general-purpose compilers for secure multi-party computation. These tools provide high-level abstractions to describe arbitrary functions and execute secure computation protocols. We consider eleven systems: EMP-toolkit, Obliv-C, ObliVM, TinyGarble, SCALE-MAMBA (formerly SPDZ), Wysteria, Sharemind, PICCO, ABY, Frigate and CBMC-GC. We evaluate these systems on a range of criteria, including language expressibility, capabilities of the cryptographic back-end, and accessibility to developers. We advocate for improved documentation of MPC frameworks, standardization within the community, and make recommendations for future directions in compiler development. Installing and running these systems can be challenging, and for each system, we also provide a complete virtual environment (Docker container) with all the necessary dependencies to run the compiler and our example programs.
SoK: Sanitizing for Security
Dokyung Song (University of California, Irvine), Julian Lettner (University of California, Irvine), Prabhu Rajasekaran (University of California, Irvine), Yeoul Na (University of California, Irvine), Stijn Volckaert (University of California, Irvine), Per Larsen (University of California, Irvine), Michael Franz (University of California, Irvine)
The C and C++ programming languages are notoriously insecure yet remain indispensable. Developers therefore resort to a multi-pronged approach to find security issues before adversaries. These include manual, static, and dynamic program analysis. Dynamic bug finding tools-henceforth "sanitizers"-can find bugs that elude other types of analysis because they observe the actual execution of a program, and can therefore directly observe incorrect program behavior as it happens. A vast number of sanitizers have been prototyped by academics and refined by practitioners. We provide a systematic overview of sanitizers with an emphasis on their role in finding security issues. Specifically, we taxonomize the available tools and the security vulnerabilities they cover, describe their performance and compatibility properties, and highlight various trade-offs.
SoK: Security Evaluation of Home-Based IoT Deployments
Omar Alrawi (Georgia Institute of Technology), Chaz Lever (Georgia Institute of Technology), Manos Antonakakis (Georgia Institute of Technology), Fabian Monrose (University of North Carolina at Chapel Hill)
Home-based IoT devices have a bleak reputation regarding their security practices. On the surface, the insecurities of IoT devices seem to be caused by integration problems that may be addressed by simple measures, but this work finds that to be a naive assumption. The truth is, IoT deployments, at their core, utilize traditional compute systems, such as embedded, mobile, and network. These components have many unexplored challenges such as the effect of over-privileged mobile applications on embedded devices.

Our work proposes a methodology that researchers and practitioners could employ to analyze security properties for home-based IoT devices. We systematize the literature for home-based IoT using this methodology in order to understand attack techniques, mitigations, and stakeholders. Further, we evaluate \numDevices devices to augment the systematized literature in order to identify neglected research areas. To make this analysis transparent and easier to adapt by the community, we provide a public portal to share our evaluation data and invite the community to contribute their independent findings.
SoK: The Challenges, Pitfalls, and Perils of Using Hardware Performance Counters for Security
Sanjeev Das (University of North Carolina at Chapel Hill), Jan Werner (University of North Carolina at Chapel Hill), Manos Antonakakis (Georgia Institute of Technology), Michalis Polychronakis (Stony Brook University), Fabian Monrose (University of North Carolina at Chapel Hill)
Hardware Performance Counters (HPCs) have been available in processors for more than a decade. These counters can be used to monitor and measure events that occur at the CPU level. Modern processors provide hundreds of hardware events that can be monitored, and with each new processor architecture more are added. Yet, there has been little in the way of systematic studies on how performance counters can best be utilized to accurately monitor events in real-world settings. Especially when it comes to the use of HPCs for security applications, measurement imprecisions or incorrect assumptions regarding the measured values can undermine the offered protection. To shed light on this issue, we embarked on a year-long effort to (i) study the best practices for obtaining accurate measurement of events using performance counters, (ii) understand the challenges and pitfalls of using HPCs in various settings, and (iii) explore ways to obtain consistent and accurate measurements across different settings and architectures. Additionally, we then empirically evaluated the way HPCs have been used throughout a wide variety of papers. Not wanting to stop there, we explored whether these widely used techniques are in fact obtaining performance counter data correctly. As part of that assessment, we (iv) extended the seminal work of Weaver and McKee from almost 10 years ago on non-determinism in HPCs, and applied our findings to 56 papers across various application domains. In that follow-up study, we found the acceptance of HPCs in security applications is in stark contrast to other application areas - especially in the last five years. Given that, we studied an additional representative set of 41 works from the security literature that rely on HPCs, to better elucidate how the intricacies we discovered can impact the soundness and correctness of their approaches and conclusions. Toward that goal, we (i) empirically evaluated how failure to accommodate for various subtleties in the use of HPCs can undermine the effectiveness of security applications, specifically in the case of exploit prevention and malware detection. Lastly, we showed how (ii) an adversary can manipulate HPCs to bypass certain security defenses.
Spectre Attacks: Exploiting Speculative Execution
Paul Kocher (Independent (, Jann Horn (Google Project Zero), Anders Fogh (G DATA Advanced Analytics), Daniel Genkin (University of Pennsylvania and University of Maryland), Daniel Gruss (Graz University of Technology), Werner Haas (Cyberus Technology), Mike Hamburg (Rambus, Cryptography Research Division), Moritz Lipp (Graz University of Technology), Stefan Mangard (Graz University of Technology), Thomas Prescher (Cyberus Technology), Michael Schwarz (Graz University of Technology), Yuval Yarom (University of Adelaide and Data61)
Modern processors use branch prediction and speculative execution to maximize performance. For example, if the destination of a branch depends on a memory value that is in the process of being read, CPUs will try to guess the destination and attempt to execute ahead. When the memory value finally arrives, the CPU either discards or commits the speculative computation. Speculative logic is unfaithful in how it executes, can access the victim's memory and registers, and can perform operations with measurable side effects.

Spectre attacks involve inducing a victim to speculatively perform operations that would not occur during correct program execution and which leak the victim's confidential information via a side channel to the adversary. This paper describes practical attacks that combine methodology from side channel attacks, fault attacks, and return-oriented programming that can read arbitrary memory from the victim's process. More broadly, the paper shows that speculative execution implementations violate the security assumptions underpinning numerous software security mechanisms, including operating system process separation, containerization, just-in-time (JIT) compilation, and countermeasures to cache timing and side-channel attacks. These attacks represent a serious threat to actual systems since vulnerable speculative execution capabilities are found in microprocessors from Intel, AMD, and ARM that are used in billions of devices.

While makeshift processor-specific countermeasures are possible in some cases, sound solutions will require fixes to processor designs as well as updates to instruction set architectures (ISAs) to give hardware architects and software developers a common understanding as to what computation state CPU implementations are (and are not) permitted to leak.
Stealthy Porn: Understanding Real-World Adversarial Images for Illicit Online Promotion
Kan Yuan (Indiana University Bloomington), Di Tang (Chinese University of Hong Kong), Xiaojing Liao (Indiana University Bloomington), XiaoFeng Wang (Indiana University Bloomington), Xuan Feng (Indiana University Bloomington/Chinese Academy of Sciences), Yi Chen (Indiana University Bloomington/Chinese Academy of Sciences), Menghan Sun (Chinese University of Hong Kong), Haoran Lu (Indiana University Bloomington), Kehuan Zhang (Chinese University of Hong Kong)
Recent years have witnessed the rapid progress in deep learning (DP), which also brings their potential weaknesses to the spotlights of security and machine learning studies. With important discoveries made by adversarial learning research, surprisingly little attention, however, has been paid to the real-world adversarial techniques deployed by the cybercriminal to evade image-based detection. Unlike the adversarial examples that induce misclassification using nearly imperceivable perturbation, real-world adversarial images tend to be less optimal yet equally effective. As a first step to understand the threat, we report in the paper a study on adversarial promotional porn images (APPIs) that are extensively used in underground advertising. We show that the adversary today's strategically constructs the APPIs to evade explicit content detection while still preserving their sexual appeal, even though the distortions and noise introduced are clearly observable to humans. To understand such real-world adversarial images and the underground business behind them, we develop a novel DP-based methodology called Male`na, which focuses on the regions of an image where sexual content is least obfuscated and therefore visible to the target audience of a promotion. Using this technique, we have discovered over 4,000 APPIs from 4,042,690 images crawled from popular social media, and further brought to light the unique techniques they use to evade popular explicit content detectors (e.g., Google Cloud Vision API, Yahoo Open NSFW model), and the reason that these techniques work. Also studied are the ecosystem of such illicit promotions, including the obfuscated contacts advertised through those images, compromised accounts used to disseminate them, and large APPI campaigns involving thousands of images. Another interesting finding is the apparent attempt made by cybercriminals to steal others' images for their advertising. The study highlights the importance of the research on real-world adversarial learning and makes the first step towards mitigating the threats it poses.
Tap 'n Ghost: A Compilation of Novel Attack Techniques against Smartphone Touchscreens
Seita Maruyama (Waseda University), Satohiro Wakabayashi (Waseda University), Tatsuya Mori (Waseda University / RIKEN AIP)
We present a novel attack named "Tap 'n Ghost", which aims to attack the touchscreens of NFC-enabled mobile devices such as smartphones. Tap 'n Ghost consists of two striking attack techniques --- "Tag-based Adaptive Ploy (TAP)" and "Ghost Touch Generator." First, using a NFC card emulator embedded in a common object such as table, a TAP system performs tailored attacks on the victim's smartphone by employing device fingerprinting; e.g., popping up a customized dialogue box asking whether or not to connect to an attacker's Bluetooth mouse. Further, Ghost Touch Generator forces the victim to connect to the mouse even if she or he aimed to cancel the dialogue by touching the "cancel" button; i.e., it alters the selection of a button on a screen. After the connection is established, the attacker can remotely take control of the smartphone, with the knowledge about the layout of the screen derived from the device fingerprinting. To evaluate the reality of the attack, we perform an online survey with 300 respondents and a user study involving 16 participants. The results demonstrate that the attack is realistic. We additionally discuss the possible countermeasures against the threats posed by Tap 'n Ghost.
Theory and Practice of Finding Eviction Sets
Pepe Vila (IMDEA Software Institute/Technical University of Madrid (UPM)), Boris Köpf (Microsoft Research), José F. Morales (IMDEA Software Institute)
Many micro-architectural attacks rely on the capability of an attacker to efficiently find small eviction sets: groups of virtual addresses that map to the same cache set. This capability has become a decisive primitive for cache side-channel, rowhammer, and speculative execution attacks. Despite their importance, algorithms for finding small eviction sets have not been systematically studied in the literature. In this paper, we perform such a systematic study. We begin by formalizing the problem and analyzing the probability that a set of random virtual addresses is an eviction set. We then present novel algorithms, based on ideas from threshold group testing, that reduce random eviction sets to their minimal core in linear time, improving over the quadratic state-of-the-art. We complement the theoretical analysis of our algorithms with a rigorous empirical evaluation in which we identify and isolate factors that affect their reliability in practice, such as adaptive cache replacement strategies and TLB thrashing. Our results indicate that our algorithms enable finding small eviction sets much faster than before, and under conditions where this was previously deemed impractical.
Threshold ECDSA from ECDSA Assumptions: The Multiparty Case
Jack Doerner (Northeastern University), Yashvanth Kondi (Northeastern University), Eysa Lee (Northeastern University), Abhi Shelat (Northeastern University)
Cryptocurrency applications have spurred a resurgence of interest in the computation of ECDSA signatures using threshold protocols---that is, protocols in which the signing key is secret-shared among n parties, of which any subset of size t must interact in order to compute a signature. Among the resulting works to date, that of Doerner et al. requires the most natural assumptions while also achieving the best practical signing speed. It is, however, limited to the setting in which the threshold is two. We propose an extension of their scheme to arbitrary thresholds, and prove it secure against a malicious adversary corrupting up to one party less than the threshold under only the Computational Diffie-Hellman assumption in the Random Oracle model, an assumption strictly weaker than those under which ECDSA is proven.

Whereas the best current schemes for threshold-two ECDSA signing use a Diffie-Hellman Key Exchange to calculate each signature's nonce, a direct adaptation of this technique to a larger threshold t would incur a round count linear in t; thus we abandon it in favor of a new mechanism that yields a protocol requiring log(t)+6 rounds in total. We design a new consistency check, similar in spirit to that of Doerner et al., but suitable for an arbitrary number of participants, and we optimize the underlying two-party multiplication protocol on which our scheme is based, reducing its concrete communication and computation costs.

We implement our scheme and evaluate it among groups of up to 256 of co-located and 128 geographically-distributed parties, and among small groups of embedded devices. We find that in the LAN setting, our scheme outperforms all prior works by orders of magnitude, and that it is efficient enough for use even on smartphones or hardware tokens. In the WAN setting we find that, despite its logarithmic round count, our protocol outperforms the best constant-round protocols in realistic scenarios.
Touching the Untouchables: Dynamic Security Analysis of the LTE Control Plane
Hongil Kim (Korea Institute of Science and Technology (KAIST)), Jiho Lee (Korea Institute of Science and Technology (KAIST)), Eunkyu Lee (Korea Institute of Science and Technology (KAIST)), Yongdae Kim (Korea Institute of Science and Technology (KAIST))
This paper presents our extensive investigation of the security aspects of control plane procedures based on dynamic testing of the control components in operational Long Term Evolution (LTE) networks. For dynamic testing in LTE networks, we implemented a semi-automated testing tool, named LTEFuzz, by using open-source LTE software over which the user has full control. We systematically generated test cases by defining three basic security properties by closely analyzing the standards. Based on the security property, LTEFuzz generates and sends the test cases to a target network, and classifies the problematic behavior by only monitoring the device-side logs. Accordingly, we uncovered 36 vulnerabilities, which have not been disclosed previously. These findings are categorized into five types: Improper handling of (1) unprotected initial procedure, (2) crafted plain requests, (3) messages with invalid integrity protection, (4) replayed messages, and (5) security procedure bypass. We confirmed those vulnerabilities by demonstrating proof-of-concept attacks against operational LTE networks. The impact of the attacks is to either deny LTE services to legitimate users, spoof SMS messages, or eavesdrop/manipulate user data traffic. Precise root cause analysis and potential countermeasures to address these problems are presented as well. Cellular carriers were partially involved to maintain ethical standards as well as verify our findings in commercial LTE networks.
Towards Automated Safety Vetting of PLC Code in Real-World Plants
Mu Zhang (Cornell University), Chien-Ying Chen (University of Illinois at Urbana-Champaign), Bin-Chou Kao (University of Illinois at Urbana-Champaign), Yassine Qamsane (University of Michigan), Yuru Shao (University of Michigan), Yikai Lin (University of Michigan), Elaine Shi (Cornell University), Sibin Mohan (University of Illinois at Urbana-Champaign), Kira Barton (University of Michigan), James Moyne (University of Michigan), Z. Morley Mao (University of Michigan)
Safety violations in programmable logic controllers (PLCs), caused either by faults or attacks, have recently garnered significant attention. However, prior efforts at PLC code vetting suffer from many drawbacks. Static analyses and verification cause significant false positives and cannot reveal specific runtime contexts. Dynamic analyses and symbolic execution, on the other hand, fail due to their inability to handle real-world PLC programs that are event-driven and timing sensitive. In this paper, we propose VetPLC, a temporal context-aware, program analysis-based approach to produce timed event sequences that can be used for automatic safety vetting. To this end, we (a) perform static program analysis to create timed event causality graphs in order to understand causal relations among events in PLC code and (b) mine temporal invariants from data traces collected in Industrial Control System (ICS) testbeds to quantitatively gauge temporal dependencies that are constrained by machine operations. Our VetPLC prototype has been implemented in 15K lines of code. We evaluate it on 10 real-world scenarios from two different ICS settings. Our experiments show that VetPLC outperforms state-of-the-art techniques and can generate event sequences that can be used to automatically detect hidden safety violations.
Towards Practical Differentially Private Convex Optimization
Roger Iyengar (Carnegie Mellon University), Joseph P. Near (University of California, Berkeley), Dawn Song (University of California, Berkeley), Om Thakkar (Boston University), Abhradeep Thakurta (University of California, Santa Cruz), Lun Wang (Peking University)
Building useful predictive models often involves learning from sensitive data. Training models with differential privacy can guarantee the privacy of such sensitive data. For convex optimization tasks, several differentially private algorithms are known, but none has yet been deployed in practice.

In this work, we make two major contributions towards practical differentially private convex optimization. First, we present Approximate Minima Perturbation, a novel algorithm that can leverage any off-the-shelf optimizer. We show that it can be employed without any hyperparameter tuning, thus making it an attractive technique for practical deployment. Second, we perform an extensive empirical evaluation of the state-of-the-art algorithms for differentially private convex optimization, on a range of publicly available benchmark datasets, and real-world datasets obtained through an industrial collaboration. We release open-source implementations of all the differentially private convex optimization algorithms considered, and benchmarks on as many as nine public datasets, four of which are high-dimensional.
Why Does Your Data Leak? Uncovering the Data Leakage in Cloud from Mobile Apps
Chaoshun Zuo (The Ohio State University), Zhiqiang Lin (The Ohio State University), Yinqian Zhang (The Ohio State University)
Increasingly, more and more mobile applications (apps for short) are using the cloud as the back-end, in particular the cloud APIs, for data storage, data analytics, message notification, and monitoring. Unfortunately, we have recently witnessed massive data leaks from the cloud, ranging from personally identifiable information to corporate secrets. In this paper, we seek to understand why such significant leaks occur and design tools to automatically identify them. To our surprise, our study reveals that lack of authentication, misuse of various keys (e.g., normal user keys and superuser keys) in authentication, or misconfiguration of user permissions in authorization are the root causes. Then, we design a set of automated program analysis techniques including obfuscation-resilient cloud API identification and string value analysis, and implement them in a tool called LeakScope to identify the potential data leakage vulnerabilities from mobile apps based on how the cloud APIs are used. Our evaluation with over 1.6 million mobile apps from the Google Play Store has uncovered 15, 098 app servers managed by mainstream cloud providers such as Amazon, Google, and Microsoft that are subject to data leakage attacks. We have made responsible disclosure to each of the cloud service providers, and they have all confirmed the vulnerabilities we have identified and are actively working with the mobile app developers to patch their vulnerable services.