ECE 18-734: Foundations of Privacy
Homework 3
Due Wednesday, Oct. 21 at 11:59 pm ET/8:59 pm PT
1 Global differential privacy [12 pts]
Exercise 1. [4 pts] Consider a target function f : X → R. Prove that the Gaussian mechanism M (i.e.,
M(x) = f(x) + N(0, σ²)) is not ε-differentially private for any finite ε > 0.
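For concreteness, the mechanism in Exercise 1 can be sketched in a few lines of Python. This is only an illustration under assumptions: the names gaussian_mechanism, f, and sigma are hypothetical, the example query (a sum) is not part of the assignment, and the standard library's random.gauss stands in for a carefully implemented noise sampler.

```python
import random


def gaussian_mechanism(x, f, sigma, rng=random):
    """Return f(x) plus zero-mean Gaussian noise with standard deviation sigma.

    Hypothetical helper for illustration: M(x) = f(x) + N(0, sigma^2).
    """
    return f(x) + rng.gauss(0.0, sigma)


# Example query (an assumption, not from the assignment): the sum of the records.
database = [3.1, 2.4, 3.8]
rng = random.Random(0)  # seeded only so the sketch is repeatable
noisy_sum = gaussian_mechanism(database, sum, sigma=1.0, rng=rng)
print(noisy_sum)
```

Each call draws fresh noise, so repeated releases of the same query return different values centered on f(x).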
Exercise 2. [4 pts] Suppose I have a database that contains the grade point average (ranging from 0.0 to
4.0) of each person in class: x_1, ..., x_n. Now suppose I want to release the average GPA among all students
with a GPA over 2.0. Design a differentially-private mechanism for releasing this statistic.
Exercise 3. [4 pts] For the same database from the previous exercise, design a DP mechanism for releasing
the median GPA of the class. How does this compare to the amount of noise you added for the previous
exercise?
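One standard building block for numeric releases is the Laplace mechanism, which adds Laplace noise of scale sensitivity/ε to the true value. The sketch below is a minimal illustration, not a hardened implementation: laplace_noise and laplace_mechanism are hypothetical names, the sensitivity is left as an input that the designer must derive for the statistic in question, and the naive floating-point sampler is assumed to be acceptable for experimentation only.

```python
import random


def laplace_noise(scale, rng=random):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    # expovariate takes a rate, so rate 1/scale gives each draw mean `scale`.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_mechanism(true_value, sensitivity, epsilon, rng=random):
    """Release true_value with Laplace noise of scale sensitivity / epsilon."""
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    return true_value + laplace_noise(sensitivity / epsilon, rng)


# Usage: a counting query changes by at most 1 when one record changes,
# so its sensitivity is 1. The count and epsilon here are made-up values.
rng = random.Random(0)  # seeded only so the sketch is repeatable
noisy_count = laplace_mechanism(42, sensitivity=1.0, epsilon=0.5, rng=rng)
print(noisy_count)
```

Because the noise scale is sensitivity/ε, a statistic with larger sensitivity needs proportionally more noise at the same ε.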
2 Using global differential privacy [14 pts]
Exercise 4. (Implementing Differential Privacy)
During this exercise we will be utilizing IBM’s recently-released differential privacy library, or diffprivlib.
The paper can be found at https://arxiv.org/pdf/1907.02444.pdf, while the GitHub repository is
located at https://github.com/IBM/differential-privacy-library. The repository contains
differentially private implementations of the k-Means, Logistic Regression, and Naive Bayes algorithms.
Install diffprivlib by typing the following into your terminal:
> pip install diffprivlib
If you were having trouble getting webXray to run in the previous assignment, try using this instead:
> python3 -m pip install diffprivlib
(a) [2 pts] What kind of prediction problems is logistic regression suitable for? Looking at the UCI Adult
dataset (as mentioned on page 4 of the paper), is logistic regression a suitable algorithm for this dataset?
(b) [2 pts] What kind of problems would K-means be suitable for? Would you use it for the same kinds of
problems as logistic regression?
(c) [2 pts] A sample Jupyter Notebook for working with diffprivlib’s logistic regression implementation
can be found here:
https://github.com/IBM/differential-privacy-library/blob/master/notebooks/logistic_regression.ipynb.
This problem and the following problems will be based on
this notebook. Downloading the Jupyter notebook is not required. You can either compile the snippets
into your own .py file or use the notebook itself. You will need to save and submit all code written for
this exercise whether built from the notebook or implemented on your own.
In cell [10] of the Jupyter notebook, where the differentially private logistic regression model is
trained, the authors show that setting the epsilon value to float("inf") lets them “produce
the same result as the non-private logistic regression classifier”. Why does this happen?
(d) [8 pts] Now we will train a new model on a different dataset and evaluate how a differentially private
classifier compares to a regular classifier on this dataset.
The dataset we will use for this is the Breast Cancer Wisconsin dataset. It is included with scikit-learn
and can be initialized as below:
from sklearn import datasets
dataset = datasets.load_breast_cancer()
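As a quick check on the loaded object, the following sketch inspects the data and holds out a test set with scikit-learn. The variable names and the split parameters (test_size, random_state) are assumptions made for illustration, not values prescribed by the assignment.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split

dataset = datasets.load_breast_cancer()
X, y = dataset.data, dataset.target  # feature matrix and 0/1 labels

print(X.shape)               # (number of samples, number of features)
print(dataset.target_names)  # class names corresponding to labels 0 and 1

# Hold out 20% of the rows for evaluation; both arguments are arbitrary
# choices for this sketch.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
```

Fixing random_state keeps the split identical across runs, which makes accuracy comparisons between classifiers easier to reproduce.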
Implement both a non-private classifier using scikit-learn and a private classifier using diffprivlib.
Refer to the following resources for help using this dataset:
https://scikit-learn.org/stable/datasets/index.html#breast-cancer-wisconsin-diagnostic-dataset
https://www.kaggle.com/leemun1/predicting-breast-cancer-logistic-regression
Choose a range of epsilon starting at a value less than 1 and ending above 1, and plot a graph comparing
the accuracy of the above two classifiers. A similar graph and accompanying code can be found near
the bottom of the example diffprivlib notebook. Save your graph (submission instructions at the
end).
Describe in a few sentences your interpretation of your graph. Specifically, how do the two accuracies
compare and how does this relate to epsilon? Note: this will be different for every student depending
on the range of epsilon chosen.