Midterm I
Logistics, Topics, Samples
(Midterm Note on September 21, 2020)
IDS575: Machine Learning and Statistical Methods
Moontae Lee
Update on Lecture Notes #04
– Based on audience feedback, we have updated the naming conventions.
– Updated in the Lecture Notes. (The Annotated Notes will not be updated because of the lecture recordings.)
– $\phi$ in $\mathrm{Ber}(\phi)$ or $(\mu, \sigma)$ in $\mathcal{N}(\mu, \sigma^2)$: standard parameters
– Used to be called user parameters or user-friendly parameters.
– $\eta$ when converting to an exponential family: natural parameters
– Could be called canonical parameters in other textbooks.
– $\theta$ in regression models: model parameters
– Linearly interact with features (independent variables).
(Lecture notes are updated to remove typos. Download them again if necessary.)
9/22/20 2
Logistics
– Date: September 26 (Sat), 2020
– Where: Online synchronous session
– Time: 9:00am
– Duration: Approximately 2 hours
(When and where)
Formats
– Closed book exam.
– Only a single cheat sheet is allowed. (letter size, double-sided)
– One additional blank sheet is allowed as scratch paper. (letter size, double-sided)
– Online software to use (through Blackboard)
– Respondus Monitor (you will be monitored while taking the exam)
– Lockdown browser (you will not be able to switch to other windows during exam hours)
– Zoom (ask your questions privately to the instructor only; the instructor will then answer you)
(How to take an exam)
Respondus Monitor
– Respondus Monitor
– Uses your webcam to detect suspicious activity.
– Preparation
– Secure your webcam and double check the camera.
– Mostly compatible with Windows and Mac computers.
– An iPad can be used with the dedicated app installed, but this is generally discouraged.
(What to prepare)
Lockdown Browser
– Lockdown browser
– Online proctoring environment working with Blackboard.
– You are unable to access other applications or websites.
– You cannot close the test until it is submitted.
– Read the articles at the following two links carefully.
– Download and install UIC’s version of Lockdown Browser
https://download.respondus.com/lockdown/download.php?id=344933365
– Confirm and follow the general guidelines in advance!
https://answers.uillinois.edu/uic/99742
(What to prepare)
Instructions (1)
1. Close all the windows and applications on your computer.
– Don’t try to connect to the Zoom session first.
2. Open the Lockdown browser.
– You should install it before the exam time.
3. Login to Blackboard
– If you properly installed UIC’s version, Blackboard should be the default webpage.
4. Open “Midterm I” in the exam section.
– Password: ids575!! (with two exclamation marks)
(Let’s learn step-by-step)
Instructions (2)
5. Follow the necessary steps provided by Respondus Monitor.
– This will take some time, but it is not complicated.
6. In the first setup question, click the Zoom link to open our Zoom session.
– You should join from the link in the exam question. Do not try to launch a separate Zoom application.
7. Turn off both video and audio in Zoom.
– During the main exam, only private chat with the instructor is allowed, for clarification questions.
8. Solve the exam.
– Take great care not to close the window before submission!
(Let’s learn step-by-step)
Guidelines
– Before taking the exam
– Select a location where you will not be interrupted.
– Make sure you have a stable internet connection.
– Turn off all other mobile/electronic devices once the main “Midterm I” starts.
– Clear your area except for a single cheat sheet, one additional sheet of scratch paper, and pencils.
(Prior to taking an exam)
Guidelines
– During the main exam
– Remain seated at your desk/workstation for the entire duration of the test.
– Respondus Monitor will alert the instructors to any suspicious activity.
– Lockdown browser will prevent you from accessing other websites or applications.
– Type your questions privately to the instructor in the concurrent Zoom session.
– Watch out!
– Do not close your exam window or browser before submission! (you may lose your progress)
(While taking an exam)
Formats
– True/False
– Each true/false question (the answer is either true or false) will be followed by a short answer question.
– If you think the answer is true → justify your rationale briefly in the following short answer.
– If you think the answer is false → provide a simple counterexample in the following short answer.
– Multiple choice
– Choose every option you think is appropriate.
– Some questions will be followed by a short answer asking for your justification.
– Short answers
– Write the best answer concisely.
(What type of questions will be there)
Basic machine learning setting
– Identify instances, labels, and examples.
– Define the input and output spaces.
– A hypothesis = a mathematical function from the input space to the output space.
– Hypothesis space = the set of all hypotheses given the input and the output space.
Q: Questions about a basic property of mathematical functions.
Q: Counting the size of a hypothesis space under a certain condition.
(Exam topics and sample questions to study)
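As a rough illustration of the counting question: when both spaces are finite, a hypothesis assigns exactly one output to each input, so there are |Y|^|X| hypotheses in total. The sketch below enumerates them for a toy case. The specific spaces (two binary features, a binary label) are assumptions chosen only for illustration.

```python
from itertools import product

# Hypothetical toy spaces: two binary features and a binary label.
input_space = list(product([0, 1], repeat=2))  # 4 possible instances
output_space = [0, 1]                          # 2 possible labels

# A hypothesis is a function X -> Y; here it is stored as a dict
# mapping every instance to one label.
hypotheses = [
    dict(zip(input_space, labels))
    for labels in product(output_space, repeat=len(input_space))
]

num_hypotheses = len(hypotheses)
print(num_hypotheses)  # 16, i.e. |Y| ** |X| = 2 ** 4
```

Adding a condition (for example, requiring a particular instance to be labeled 1) simply filters this list, which is one way to sanity-check a hand count.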
k-Nearest Neighbor
– Understand the concept of lazy learning.
– Understand the formulas to represent different kNN hypotheses.
– Understand how to draw decision boundaries and how to make actual predictions.
Q: Worst-case scenario.
Q: Given simple training data, draw the decision boundaries.
Q: Make a prediction on toy examples.
(Exam topics and sample questions to study)
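A minimal sketch of making kNN predictions on toy examples, assuming squared Euclidean distance and a majority vote. The function name `knn_predict` and the data points are hypothetical, not taken from the lecture notes.

```python
from collections import Counter

def knn_predict(train, query, k):
    """Predict the label of `query` by majority vote among its k nearest
    training points. `train` is a list of (point, label) pairs."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    # Lazy learning: there is no training step; all work happens at query time.
    neighbors = sorted(train, key=lambda pl: sq_dist(pl[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical toy data in 2D with two classes.
train = [((0, 0), "A"), ((1, 0), "A"), ((0, 1), "A"),
         ((3, 3), "B"), ((4, 3), "B"), ((3, 4), "B")]

print(knn_predict(train, (1, 1), k=1))  # A
print(knn_predict(train, (3, 2), k=3))  # B
```

Ties in distance or in votes are broken here by the order of the training list, which is an arbitrary choice; an exam answer should state whatever tie-breaking rule the question specifies.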
Linear regression
– Understand the least-squares objective (loss) function.
– Gradient-based training algorithms.
– Basic residual and normal equations. (no intricate proofs or heavy computations)
– Probabilistic formulation (what assumptions do we make?) and training via MLE.
Q: Formulate a linear regression with a few features. Derive a gradient-descent algorithm.
Q: Formulate a linear regression with toy real data. Answer conceptual questions.
(Exam topics and sample questions to study)
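A sketch of batch gradient descent on the least-squares loss, assuming the loss is written as J(theta) = 1/2 * sum_i (theta^T x_i - y_i)^2, so that the partial derivative with respect to theta_j is sum_i (theta^T x_i - y_i) * x_ij. The function names, step size, and toy data are hypothetical.

```python
def predict(theta, x):
    return sum(t * xi for t, xi in zip(theta, x))

def gradient(theta, X, y):
    """Gradient of J(theta) = 1/2 * sum_i (theta^T x_i - y_i)^2."""
    grad = [0.0] * len(theta)
    for xi, yi in zip(X, y):
        residual = predict(theta, xi) - yi
        for j, xij in enumerate(xi):
            grad[j] += residual * xij
    return grad

def gradient_descent(X, y, lr=0.05, steps=2000):
    theta = [0.0] * len(X[0])
    for _ in range(steps):
        g = gradient(theta, X, y)
        # Move against the gradient to decrease the loss.
        theta = [t - lr * gj for t, gj in zip(theta, g)]
    return theta

# Toy data generated from y = 1 + 2x; the first feature is a constant 1 (intercept).
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]
theta = gradient_descent(X, y)
print([round(t, 4) for t in theta])  # [1.0, 2.0]
```

The step size is tuned to this toy data; a larger value can make the iterates diverge.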
Logistic regression
– Role of sigmoid (logistic) link function and the power of function composition.
– Understand how to get the optimal parameters via Maximum Likelihood Estimation.
– Think about what type of decision boundaries will be made.
Q: Why is it called regression even though it is used for classification?
Q: Formulate a logistic regression with a few features.
Q: Choose proper decision boundaries.
(Exam topics and sample questions to study)
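A minimal sketch of logistic regression trained by gradient ascent on the log-likelihood, assuming labels in {0, 1} and the model p(y=1 | x) = sigmoid(theta^T x). The decision boundary is where theta^T x = 0, which is linear in the features. Function names, step size, and data are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(theta, x):
    # Function composition: a linear score passed through the sigmoid link.
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))

def fit_logistic(X, y, lr=0.05, steps=5000):
    """Maximize the log-likelihood; its gradient is sum_i (y_i - p_i) * x_i."""
    theta = [0.0] * len(X[0])
    for _ in range(steps):
        grad = [0.0] * len(theta)
        for xi, yi in zip(X, y):
            err = yi - predict_proba(theta, xi)
            for j, xij in enumerate(xi):
                grad[j] += err * xij
        theta = [t + lr * g for t, g in zip(theta, grad)]
    return theta

# Toy 1D data (not linearly separable); the first feature is a constant 1.
X = [[1.0, float(v)] for v in range(6)]
y = [0, 0, 1, 0, 1, 1]
theta = fit_logistic(X, y)

# With one real feature, the boundary theta_0 + theta_1 * x = 0 is a single point.
boundary = -theta[0] / theta[1]
print(round(boundary, 3))  # 2.5
```

The toy labels are mirror-symmetric around x = 2.5, which is why the fitted boundary lands there.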
Exponential family and Generalized Linear Models
– Convert a distribution to an exponential family.
– Understand why the least-squares loss is a natural choice for linear regression.
– Understand why the sigmoid function is a natural link function for logistic regression.
Q: Concept of statistics and sufficient statistics.
Q: Distinguish among standard, natural, and model parameters.
Q: Simple conceptual questions.
(Exam topics and sample questions to study)
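A worked sketch of converting a distribution to an exponential family, using the Bernoulli distribution as the example. The template $p(y;\eta) = b(y)\exp(\eta\,T(y) - a(\eta))$ and the symbol names $b$, $T$, $a$ are assumptions following one common textbook convention; the lecture notes may use different letters.

```latex
\begin{align*}
p(y;\phi) &= \phi^{y}(1-\phi)^{1-y}, \qquad y \in \{0, 1\} \\
          &= \exp\!\Big( y \log\frac{\phi}{1-\phi} + \log(1-\phi) \Big) \\
          &= b(y)\,\exp\!\big( \eta\,T(y) - a(\eta) \big),
\end{align*}
where
\begin{align*}
\eta = \log\frac{\phi}{1-\phi}, \qquad
T(y) = y, \qquad
a(\eta) = -\log(1-\phi) = \log\!\big(1 + e^{\eta}\big), \qquad
b(y) = 1 .
\end{align*}
Inverting the natural parameter recovers the standard parameter:
\begin{align*}
\phi = \frac{1}{1 + e^{-\eta}} .
\end{align*}
```

Here $\phi$ is the standard parameter, $\eta$ is the natural parameter, and $T(y) = y$ is the sufficient statistic. Setting $\eta = \theta^{\top} x$ with model parameters $\theta$ gives exactly the sigmoid form of logistic regression.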
Sample question (1)
– NOTE: Decision trees are not a topic of our class. This is for illustration purposes only.
– Q: Do decision trees always achieve zero training error if we keep splitting on attributes?
– Options: (a) True, (b) False
– Justification
– False. If a dataset includes two training examples that have the same instance values but different
label values, there must be a training error no matter how finely the individual attributes are split.
(True/false + short justification)
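The counterexample in the justification can be checked mechanically. The sketch below computes the smallest number of training mistakes that any deterministic hypothesis can make: for each distinct instance, the best option is to predict its majority label. The function name and the toy dataset are hypothetical.

```python
from collections import Counter, defaultdict

def min_training_errors(examples):
    """Smallest number of training mistakes any deterministic function
    from instances to labels can make on `examples`."""
    labels_by_instance = defaultdict(Counter)
    for instance, label in examples:
        labels_by_instance[instance][label] += 1
    # Per instance, every example outside the majority label is a forced error.
    return sum(sum(c.values()) - max(c.values())
               for c in labels_by_instance.values())

# Two examples share the instance (1, 0) but disagree on the label.
examples = [((1, 0), "yes"), ((1, 0), "no"), ((0, 1), "yes")]
print(min_training_errors(examples))  # 1
```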
Sample question (2)
– Use plain alphanumeric symbols available on a standard keyboard.
– Superscript by ^
– Subscript by _
– Q: What is the gradient of $y = \|x\|_2^2$ for $x \in \mathbb{R}^n$?
– Options: (a) $1$ (b) $2x$ (c) $2\|x\|_2$ (d) $2x_1, \ldots, 2x_n$ (e) $x^2$
– Justification
– It is y = (x_1^2 + x_2^2 + … + x_n^2). So ∂y/∂x_j = 2x_j.
(Multiple choice + short justification)
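One way to double-check a hand-derived gradient is a finite-difference comparison. The sketch below does this for the sum of squares y = x_1^2 + ... + x_n^2; the helper name `numerical_gradient` and the test point are hypothetical.

```python
def f(x):
    # y = x_1^2 + x_2^2 + ... + x_n^2
    return sum(xi ** 2 for xi in x)

def numerical_gradient(func, x, eps=1e-6):
    """Central-difference approximation of the gradient of `func` at `x`."""
    grad = []
    for j in range(len(x)):
        plus = list(x)
        minus = list(x)
        plus[j] += eps
        minus[j] -= eps
        grad.append((func(plus) - func(minus)) / (2 * eps))
    return grad

x = [1.0, -2.0, 0.5]
approx = numerical_gradient(f, x)
exact = [2 * xi for xi in x]  # analytic partial derivatives 2 * x_j
print([round(a, 4) for a in approx])  # [2.0, -4.0, 1.0]
```

Note that the gradient has one component per coordinate of x, so its dimension matches that of x.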
Sample question (3)
– You are trying to estimate housing prices in the Loop area of Chicago. Given 100 training data
points, your model is $y = \theta_1 x_1 + \theta_2 x_2 + \theta_3 x_3$, where $x_1$ is the number of bedrooms, $x_2$ is the
distance to the closest station, and $x_3$ is the square footage.
– Q: What will be the expected signs of $\theta_1, \theta_2, \theta_3$?
– Q: Which of the following is the correct update formula for SGD?
– Q: Is this Linear Regression?
– Q: This may not be a reasonable model. Why?
– Q: What happens if you have only 2 data points rather than 100?
(Short answers or multiple choice + short justification)
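A small sketch related to the last question, under the assumption that the model has three parameters and no intercept: with only two data points, there are fewer equations than unknowns, so many different parameter vectors fit the training data exactly. All numbers below are made up for illustration.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def cross(a, b):
    """Cross product of two 3D vectors; the result is orthogonal to both."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

# Hypothetical features: (bedrooms, distance to station, square feet).
x_a = [2, 1, 900]
x_b = [3, 2, 1200]

# Hypothetical parameters used to generate the two prices.
theta = [10, -5, 1]
y_a, y_b = dot(theta, x_a), dot(theta, x_b)

# Shifting theta along a direction orthogonal to both feature vectors
# leaves both predictions unchanged.
direction = cross(x_a, x_b)
theta_alt = [t + d for t, d in zip(theta, direction)]

print(theta_alt)                                # [-590, 295, 2]
print(dot(theta_alt, x_a), dot(theta_alt, x_b))  # 915 1220
```

Both `theta` and `theta_alt` achieve zero training error here, yet they disagree even on the signs of the first two parameters, so two points cannot pin down the model.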
Overall advice
– Try to figure out the dimension of each mathematical symbol. (scalar, vector, matrix)
– Our lectures taught various statistical methods in general settings.
– In the homework, you applied them to mid-size specific cases.
– In the exam, you will apply them to toy-level problems but with more concrete numbers.
– Make sure you can answer the questions within each slide.
– Don’t spend too much time reading or watching external material.
– Don’t try to copy-paste something from the lecture notes.
– In many cases, exam questions ask for concrete numbers rather than general forms and formulas.
(The exam must be an extension of the lecture. Enjoy it rather than suffering.)