Information Sciences Seminars - PSpec-SQL: Enabling Fine-Grained Control for Distributed Data Analytics
Topic: Information Sciences Seminars - PSpec-SQL: Enabling Fine-Grained Control for Distributed Data Analytics
Speaker: Professor Fei He (Tsinghua University)
Time: 2017-11-03 13:30-14:30
Venue: Room 1303, Sciences Building No. 1
Abstract: Business organizations regularly collect customer data to improve their services. To maximize the utility of that data, they may want to share it internally or even with third parties. Because business data contain a great deal of customer information, organizations must respect customers' privacy as required by privacy laws. In this paper, we present PSpec-SQL, a distributed data analytics system that automatically enforces privacy compliance for SQL queries. Our system provides a high-level language, PSpec, in which the data owner specifies her data usage policy. The data analyst queries the data as usual, but our system checks each query and executes only those that comply with the policy.
We have implemented a prototype of PSpec-SQL on top of Spark-SQL and carried out a case study on the TPC-DS benchmark. The results show that our system is practical and adds negligible overhead to query processing.