ECE 18-734: Foundations of Privacy
Homework 3
Due Wednesday, Oct. 21 at 11:59 pm ET/8:59 pm PT

1 Global differential privacy [12 pts]

Exercise 1. [4 pts] Consider a target function f : X → R. Prove that the Gaussian mechanism M (i.e., M(x) = f(x) + N(0, σ²)) is not ε-differentially private for any finite ε > 0.

Exercise 2. [4 pts] Suppose I have a database that contains the grade point average (ranging from 0.0 to 4.0) of each person in class: x1, . . . , xn. Now suppose I want to release the average GPA among all students with a GPA over 2.0. Design a differentially private mechanism for releasing this statistic.

Exercise 3. [4 pts] For the same database from the previous exercise, design a DP mechanism for releasing the median GPA of the class. How does the amount of noise compare with the amount you added in the previous exercise?

2 Using global differential privacy [14 pts]

Exercise 4. (Implementing Differential Privacy) During this exercise we will be using IBM's recently released differential privacy library, diffprivlib. The paper can be found at https://arxiv.org/pdf/1907.02444.pdf, while the GitHub repository is located at https://github.com/IBM/differential-privacy-library. The repository contains differentially private implementations of the k-Means, Logistic Regression, and Naive Bayes algorithms.

Install diffprivlib by typing the following into your terminal:

    > pip install diffprivlib

If you were having trouble getting webXray to run in the previous assignment, try using this instead:

    > python3 -m pip install diffprivlib

(a) [2 pts] What kinds of prediction problems is logistic regression suitable for? Looking at the UCI Adult dataset (as mentioned on page 4 of the paper), is logistic regression a suitable algorithm for this dataset?

(b) [2 pts] What kinds of problems would k-Means be suitable for? Would you use it for the same kinds of problems as logistic regression?
(c) [2 pts] A sample Jupyter notebook for working with diffprivlib's logistic regression implementation can be found here: https://github.com/IBM/differential-privacy-library/blob/master/notebooks/logistic_regression.ipynb. This problem and the following problems are based on this notebook. Downloading the notebook is not required: you can either compile the snippets into your own .py file or use the notebook itself. You will need to save and submit all code written for this exercise, whether built from the notebook or implemented on your own.

In cell [10] of the notebook, where the differentially private logistic regression model is trained, it is shown that setting the epsilon value to float("inf") will "produce the same result as the non-private logistic regression classifier". Why does this happen?

(d) [8 pts] Now we will train a new model on a different dataset and evaluate how a differentially private classifier compares to a regular classifier on this dataset. The dataset we will use is the Breast Cancer Wisconsin dataset. It is included with scikit-learn and can be loaded as below:

    from sklearn import datasets
    dataset = datasets.load_breast_cancer()

Implement both a non-private classifier using scikit-learn and a private classifier using diffprivlib. Refer to the following resources for help using this dataset:

https://scikit-learn.org/stable/datasets/index.html#breast-cancer-wisconsin-diagnostic-dataset
https://www.kaggle.com/leemun1/predicting-breast-cancer-logistic-regression

Choose a range of epsilon starting at a value less than 1 and ending above 1, and plot a graph comparing the accuracy of the two classifiers over that range. A similar graph and accompanying code can be found near the bottom of the example diffprivlib notebook. Save your graph (submission instructions at the end).

Describe your interpretation of your graph in a few sentences. Specifically, how do the two accuracies compare, and how does this relate to epsilon? Note: this will be different for every student depending on the range of epsilon chosen.
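As a warm-up for the epsilon sweep in part (d), the following is a minimal sketch of how epsilon controls noise in a much simpler setting. It is not part of the assignment or of diffprivlib. It assumes the Laplace mechanism applied to the mean of GPAs clamped to [0, 4], with the number of records n treated as public, so that changing one record moves the mean by at most 4/n. The names laplace_noise, private_mean, and mean_abs_error, the synthetic data, and the epsilon values are all illustrative assumptions.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two i.i.d. exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_mean(values, epsilon, rng, lower=0.0, upper=4.0):
    """Release the mean of `values` with Laplace noise calibrated to epsilon.

    Assumption: n = len(values) is public and neighbouring databases differ
    in the value of one record, so the sensitivity is (upper - lower) / n.
    """
    n = len(values)
    clamped = [min(max(v, lower), upper) for v in values]
    sensitivity = (upper - lower) / n
    return sum(clamped) / n + laplace_noise(sensitivity / epsilon, rng)


def mean_abs_error(values, epsilon, rng, trials=2000):
    """Average absolute error of the private mean over repeated releases."""
    true_value = sum(values) / len(values)
    total = 0.0
    for _ in range(trials):
        total += abs(private_mean(values, epsilon, rng) - true_value)
    return total / trials


rng = random.Random(0)  # fixed seed so the sweep is repeatable

# Synthetic GPAs (hypothetical data, not from any real class).
gpas = [round(rng.uniform(0.0, 4.0), 2) for _ in range(200)]
true_mean = sum(gpas) / len(gpas)

# Sweep epsilon from below 1 to above 1, mirroring the range asked for in (d).
errors = {eps: mean_abs_error(gpas, eps, rng) for eps in (0.1, 1.0, 10.0)}

for eps, err in errors.items():
    print(f"epsilon={eps:>5}: mean abs error = {err:.4f}")
```

In this sketch the average error should shrink roughly in proportion to 1/epsilon, which is the same kind of privacy/accuracy trade-off the plot in part (d) is meant to expose. diffprivlib's logistic regression may add noise in a different way, so treat this as intuition only.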