TASK 3 - BAYESIAN OPTIMIZATION
Hyperparameter tuning with Bayesian optimization
1. TASK DESCRIPTION
TASK
In this task, you should use Bayesian optimization to tune one hyperparameter of a machine learning model subject to a constraint on a property of the model. Let θ ∈ Θ denote the hyperparameter of interest, e.g., the number of layers in a deep neural network. We aim at finding a network that makes accurate and fast predictions. To this end, we can train our model for a specific value of θ on a training data set. From this, we obtain a corresponding accuracy on the validation data set and an average prediction speed. In this context, our goal is to find θ*, i.e., the value of the hyperparameter that induces the highest possible validation accuracy, such that a requirement on the average prediction speed is satisfied.
More formally, let us denote with f : Θ → [0, 1] the mapping from the space of hyperparameters to the corresponding validation accuracy. When we train with a given θ, we observe y_f = f(θ) + ε_f, where ε_f ∼ N(0, σ_f²) is zero-mean Gaussian i.i.d. noise. Moreover, we denote with v : Θ → R₊ the mapping from the space of hyperparameters to the corresponding prediction speed. Similar to the accuracy, we observe a noisy value of the speed, which we denote with y_v = v(θ) + ε_v, with ε_v ∼ N(0, σ_v²) zero-mean Gaussian i.i.d. noise. Then, the problem is formalized as

    θ* ∈ argmax_{θ ∈ Θ, v(θ) ≥ κ} f(θ),

where κ is the minimum tolerated prediction speed. The objective of this problem does not admit an analytical expression, is computationally expensive to evaluate, and is only accessible through noisy evaluations. Therefore, it is well suited to Bayesian optimization (see [1] for further reading on Bayesian optimization for hyperparameter tuning).
In this project, you need to solve the hyperparameter tuning problem presented above with Bayesian optimization. Let θ_i be the hyperparameter evaluated at the i-th iteration of the Bayesian optimization algorithm. While running the Bayesian optimization algorithm, you may try hyperparameters for which the speed constraint is violated, i.e., v(θ_i) < κ. However, the final solution must satisfy v(θ) ≥ κ.
Remarks: In the motivating example above, θ takes discrete values (the number of layers is a natural number) and the objective and the constraint can be evaluated independently. However, to keep the problem simple, we let θ be continuous and we evaluate f and v simultaneously. Moreover, to avoid unfair advantages due to differences in computational power, the training of the neural network is simulated and, therefore, the time required for this step is platform independent. This task does not have a private score.
Below, you can find the quantitative details of this problem.
The domain is Θ = [0, 5].
The noise perturbing the observations is Gaussian with standard deviation σ_f = 0.15 and σ_v = 0.0001 for the accuracy and the speed, respectively.
The mapping f can be effectively modeled with a Matérn (https://en.wikipedia.org/wiki/Mat%C3%A9rn_covariance_function) kernel with variance 0.5, lengthscale 0.5, and smoothness parameter ν = 2.5.
The mapping v can be effectively modeled with a constant mean of 1.5 and a Matérn (https://en.wikipedia.org/wiki/Mat%C3%A9rn_covariance_function) kernel with variance √2, lengthscale 0.5, and smoothness parameter ν = 2.5.
The minimum tolerated speed is κ = v_min = 1.2. The unit of measurement is not relevant, as the training is only simulated.
SUBMISSION WORKFLOW
1. Install and start Docker (https://www.docker.com/get-started). Understanding how Docker works and how to use it is beyond the scope of the project.
Nevertheless, if you are interested, you could read about its use cases (https://www.docker.com/use-cases).
2. Download handout (/static/task3_handout.zip)
3. The handout contains the solution template solution.py. You should write your code right below the # TODO: enter your code here marker in the solution template.
4. You should use Python 3.8.5. You are free to use any other libraries that are not already imported in the solution template. Important: please make sure that you list these additional libraries together with their versions in the requirements.txt file provided in the handout (see the illustrative entries after this list).
5. Once you have implemented your solution, run bash runner.sh - this will run the local checker in Docker. Note: on some operating systems, you might
need to run sudo bash runner.sh if you see a Docker permission denied error.
6. If the checker fails, it will display an appropriate error message. If the check was successful, a file called results_check.byte will be generated. You
should upload this file together with your source code to the project server.
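For instance, if your solution additionally imports scikit-learn and SciPy, the corresponding requirements.txt entries might look as follows (the version numbers are only illustrative; pin the versions you actually use):

    scikit-learn==0.23.2
    scipy==1.5.2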
EVALUATION
In this kind of problem, we are interested in minimizing the normalized regret for not knowing the best hyperparameter value. Let θ̃ be the solution suggested by the algorithm, and define the regret as follows:

    r = (f(θ*) − f(θ̃)) / f(θ*)   if v(θ̃) ≥ κ,
    r = 1                          if v(θ̃) < κ.

That is, if the solution violates the minimum speed criterion, then r = 1. To evaluate your algorithm, 75 random hyperparameter tuning tasks as the one described above are generated, and your final score is computed as

    R = (1/75) Σ_{j=1}^{75} r_j.

You successfully pass the project if R < 0.17. Note that the r_j are random variables; however, the baseline has been set taking into account the variance of R. In other words, you don't need to worry about this randomness.
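As a concrete illustration of the scoring rule, here is a sketch of the regret computation for a single task (the function and argument names are hypothetical, not part of the handout):

    def regret(f_opt, f_sol, v_sol, kappa=1.2):
        """Normalized regret: 1 if the suggested solution is too slow,
        otherwise the relative accuracy gap to the constrained optimum."""
        if v_sol < kappa:
            return 1.0
        return (f_opt - f_sol) / f_opt

    # A feasible solution with accuracy 0.7 against an optimum of 0.8
    # yields r = (0.8 - 0.7) / 0.8 ≈ 0.125.
    print(regret(f_opt=0.8, f_sol=0.7, v_sol=1.3))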
Hint: This task is designed such that a correct implementation of a standard Bayesian optimization algorithm (i.e., one unaware of the speed constraint) might not be sufficient to pass. We strongly suggest taking the constraint on prediction speed into account: for example, by making the choice of θ_i dependent on how likely it is to lead to a fast network, you should obtain a better score. For further reading on the topic (in a more challenging setting than the one presented here), see [2].
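One way to act on this hint is to weight the expected improvement by the posterior probability that the speed constraint holds, in the spirit of [2]. The sketch below assumes fitted GP models gp_f and gp_v with a scikit-learn-style predict interface and a value f_best for the best feasible accuracy observed so far; none of these names come from the handout:

    import numpy as np
    from scipy.stats import norm

    KAPPA = 1.2  # minimum tolerated speed from the task description

    def constrained_ei(theta, gp_f, gp_v, f_best):
        """Expected improvement on f, weighted by the probability that
        the speed constraint v(theta) >= KAPPA is satisfied."""
        mu_f, std_f = gp_f.predict(theta.reshape(-1, 1), return_std=True)
        mu_v, std_v = gp_v.predict(theta.reshape(-1, 1), return_std=True)

        # Standard expected improvement for maximization.
        z = (mu_f - f_best) / np.maximum(std_f, 1e-9)
        ei = (mu_f - f_best) * norm.cdf(z) + std_f * norm.pdf(z)

        # Posterior probability that the candidate is fast enough.
        prob_fast = 1.0 - norm.cdf((KAPPA - mu_v) / np.maximum(std_v, 1e-9))
        return ei * prob_fast

The next evaluation point θ_i is then obtained by maximizing this acquisition function over Θ = [0, 5], e.g., on a dense grid or with a multi-start local optimizer.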
GRADING
When handing in the task, you need to select which of your submissions will get graded and provide a short description of your approach. This has to be
done individually by each member of the team. Your submission is graded as either pass or fail. A complete submission typically consists of the following
three components:
Submission file: The results_check.byte file generated by the runner.sh script which tries to execute your code and checks whether it fulfills the
requirements of the task.
Your code in the form of a .py or .zip file. The source code must be runnable and able to reproduce your uploaded results_check.byte file.
A description of your approach which is consistent with your code. If you do not hand in a description of your approach, you may obtain zero points
regardless of how well your submission performs.
To pass the task, your submission needs to be complete and outperform the baseline in terms of the public and private (if the task has one) scores. Some tasks have only a single score, on which you have to improve upon the baseline. Other tasks have a public and a private score; in order to pass such tasks, you need to achieve a higher compound score (the average of the public and private scores) than the baseline.
Make sure that you properly hand in the task, otherwise you may obtain zero points for this task. If you successfully completed the hand-in, you
should see the respective task on the overview page shaded in green.
FREQUENTLY ASKED QUESTIONS
WHICH PROGRAMMING LANGUAGE AM I SUPPOSED TO USE? WHAT TOOLS AM I ALLOWED TO USE?
You should implement your solutions in Python 3. You can use any publicly available code, but you should specify the source as a comment in your code.
AM I ALLOWED TO USE METHODS THAT WERE NOT TAUGHT IN THE CLASS?
Yes. Nevertheless, the baselines were designed to be solvable based on the material taught in the class up to the second week of each task.
IN WHAT FORMAT SHOULD I SUBMIT THE CODE?
If you changed only the solution file, you can submit it alone. If you changed other files too, then you should submit all changed files in a zip file of at most 1 MB.
You can assume that all files from the handout that you have not changed will be available to your code.
WILL YOU CHECK / RUN MY CODE?
We will check your code and compare it with other submissions. If necessary, we will also run your code. Please make sure that your code is runnable and
your results are reproducible (fix the random seeds, etc.). Provide a readme if necessary.
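For example, if all randomness in your solution comes from NumPy, fixing the global seed once at the top of solution.py is usually sufficient:

    import numpy as np

    np.random.seed(0)  # fix the seed so repeated runs produce identical results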
SHOULD I INCLUDE THE HANDOUT DATA IN THE SUBMISSION?
No. You can assume the data will be available under the same path as in the handout folder.
CAN YOU HELP ME SOLVE THE TASK? CAN YOU GIVE ME A HINT?
As the tasks are a graded part of the class, we cannot help you solve them. However, feel free to ask general questions about the course material during or
after the exercise sessions.
CAN YOU GIVE ME A DEADLINE EXTENSION?
We do not grant any deadline extensions!
CAN I POST ON PIAZZA AS SOON AS I HAVE A QUESTION?
This is highly discouraged. Remember that collaboration with other teams is prohibited. Instead,
Read the details of the task thoroughly.
Review the frequently asked questions.
If there is another team that solved the task, spend more time thinking.
Discuss it with your team-mates.
WHEN WILL I RECEIVE THE PROJECT GRADES?
We will publish the project grades before the exam at the latest.
REFERENCES
[1] Snoek, Jasper, Hugo Larochelle, and Ryan P. Adams. "Practical Bayesian Optimization of Machine Learning Algorithms." Advances in Neural Information Processing Systems. 2012. (https://papers.nips.cc/paper/4522-practical-bayesian-optimization-of-machine-learning-algorithms.pdf)
[2] Gelbart, Michael A., Jasper Snoek, and Ryan P. Adams. "Bayesian Optimization with Unknown Constraints." Proceedings of the Thirtieth Conference on Uncertainty in Artificial Intelligence. 2014. (https://www.cs.princeton.edu/~rpa/pubs/gelbart2014constraints.pdf)

