I investigate data security/privacy, privacy-enhancing learning technologies, and data analytics.

## Selected Publications

- Top-k Query Processing on Encrypted Databases with Strong Security Guarantees

with Haohan Zhu, George Kollios, 34th IEEE ICDE, Paris, France, April 2018

Proceeding version arXiv version Poster.
- NED: An Inter-Graph Node Metric based on Edit Distance

with Haohan Zhu, George Kollios, PVLDB 10(6):697-708, 2017, Proceeding version Poster.
- GRECS: Approximate Shortest Distance Queries on Encrypted Graphs

with Seny Kamara, Kobbi Nissim, and George Kollios. 22nd ACM CCS, Denver, Colorado, USA, October 2015

Proceeding Version, Full version, PDF Slides, Poster
- Privacy-Preserving Similarity Evaluation of Time Series Data,

with Haohan Zhu, and George Kollios. 17th EDBT/ICDT, Athens, Greece, March 2014.

Proceeding Version, Poster, Code
- Computational Fuzzy Extractors

with Ben Fuller, and Leo Reyzin. 19th IACR ASIACRYPT, Bangalore, India, December 2013

Proceeding Version, Full version

The objective is to design provably secure and scalable schemes for encrypting large-scale databases without losing the ability to query them. Supported by The Modular Approach to Cloud Security (MACS).

Graph databases: I investigate graph encryption. Ideally, a graph encryption scheme should encrypt a graph with support for various graph queries. In this paper, we proposed **GRECS** that can support *approximate* shortest distance queries on the encrypted graph. See the blog entry. (Collaboration with Seny Kamara)

Relational databases: I also investigate encryption schemes that can provide more functionalities of encrypted relational databases. The challenge is to balance the three objectives of privacy/security, functionality/utility/accuracy, and performance/scalability. For example, in this we propose an efficient secure *top-k* queries processing on the encrypted database.

Multiparty computation: Motivated by the data privacy concerns in today's internet, I am exploring scalable privacy-preserving data mining/machine learning protocols where each party only learns the output of the algorithm and nothing else. In particular, I consider mostly data mining/learning algorithms that can scale to the massive amount of datasets. For example, I proposed a privacy-preserving protocol for computing the distance between two time-series (see here). I am also studying privacy-preserving clustering algorithms. (Collaboration with RSA labs)

Node similarity: I am interested in node similarity measures, a fundamental problem in graph data analytics that can be used for graph de-anonymization. Existing node similarity measurements focus on evaluating how similar between two nodes in the same graph. In this project, we investigate methods that measure the similarity between nodes in different graphs (inter-graph nodes). In particular, we are interested in defining new distance metrics that are based on the neighborhood topology of two nodes. (see here)

I'm presently a scientist at Amazon Inc. Previously, I worked at Apple Inc. from 2016-2017.

I received my Ph.D. at Boston University, where I was a member of DBLab and BUSec in the Department of Computer Science at BU. My Ph.D. dissertation is "Privacy-Preserving Queries On Encrypted Databases".

I spent some time at Microsoft Research in 2014. See my guest posting on discussing graph privacy: "Graph Encryption: Going Beyond Encrypted Keyword Search", on Seny Kamara's Blog

#### Journal/Conference Reviewer

IEEE TKDE 2014/2015, IEEE ICDE 2013/2014, SIGMOD 2013/2014

VLDB 2013/2014/2015, EDBT/ICDT 2015/2016

IACR EUROCRYPT 2013, IACR ASIACRYPT 2014...