CSC 503/SENG 474: Assignment 1

Due on: May 27th at 23:59 PST
Where: Brightspace (https://bright.uvic.ca/d2l/home/136102)

Instructions:

You must complete this assignment entirely on your own. In other words, you should come up with the solution yourself, write the code yourself, conduct the experiments yourself, analyze the results yourself, and finally, write it all up solely by yourself. The university policies on academic dishonesty (a.k.a. cheating) will be taken very seriously.

This does not mean that you need to go to a cave and self-isolate while preparing the assignment. You are allowed to have high-level discussions with your classmates about the course material. You are also more than welcome to use Piazza or come to office hours and ask questions. If in doubt, ask! We are here to help.

If you are still stuck, you can use books and published online material (i.e., material that has a fixed URL). However, you must explicitly credit all sources. You are also not allowed to copy-paste online materials. Woe to you if we catch you copy-pasting uncredited sources!

Why “if stuck”? Assignments are designed to develop your practical ML skills and make you strong. If you do the assignments well, the project will feel like a piece of cake. So give it your best. On the other hand, do not waste a whole week on a single question: if you are stuck on a question for a few days, ask (us) for help!

If you cannot make the deadline, you can use a maximum of two grace days per assignment. They are not free, though: each grace day comes with a 25% mark penalty (so submitting on Monday evening would reduce your score by 25%; submitting on Tuesday would reduce it by a total of 50%). No other accommodations will be provided unless explicitly approved by the instructor at least 7 days before the deadline.

These assignments are supposed to be really hard! Start early! You will need at least two weeks to complete them!
If you do not feel challenged enough, please let me know, and I’ll think of something.

Remember: you will need to gather at least one-third of all points during the assignments to pass the course. If you don’t, you will get an F!

Make sure to follow the technical requirements outlined below. TAs have the full power to take 50% off your grade if you disregard some of them.

Be sure that your answers are clear and easy for TAs to understand. They can penalize you if your solutions lack clarity or are convoluted (in a non-algebraic way), even if they are nominally correct.

We will try to grade your assignments within seven (7) days of the initial submission deadline. If you think there is a problem with your grade, you have one week to raise a concern after the grades go public. Grading TAs will be holding office hours during those seven days to address any such problems. After that, your grade is set in stone.

Technical matters:

You must type up your analysis and solutions electronically and submit them as a self-contained Jupyter notebook. Jupyter notebooks can contain code, its output, and images. They can also be used to type math and proofs in LaTeX mode.

You must use LaTeX mode to type formulas. Typing “a^2=sqrt(3)+b1” is a pretty good way to lose 50% of your grade for no good reason.

Each problem should be submitted as a separate file. Each file should be named SurnameInitial_N.ipynb, where N is the two-digit, zero-padded problem number. Correct: SmithJ_05.ipynb. Incorrect: JohnSmith_V12345 Problem 1.ipynb, prob1.pdf, etc.

Zip all ipynb files and submit them as assignment1.zip to Brightspace. Do not submit RAR, TAR, 7zip, SHAR and whatnot; just use good ol’ ZIP. Do not include other files.

The first cell of each Jupyter notebook must start with your name and V number. See the attached notebook for the details.

Your notebook should be organized sequentially according to the problem statement.
Use sections (with the appropriate numbers and labels) within the notebook. Figures and relevant code should be placed in the proper location in the document.

Notebook code must be runnable! Ideally, all answers will be the output of a code cell.

You must use Python 3 to complete the assignments. Feel free to use NumPy and pandas as you see fit. Use SciPy, scikit-learn, and other non-standard libraries only when explicitly allowed to do so.

Your first executable cell should set the random seed to 1337 to ensure the reproducibility of your results. For NumPy/SciPy and pandas, use numpy.random.seed(1337); otherwise, use random.seed(1337).

Document your code! Use either Markdown cells or Python comments to let us know what you have done!

Finally, be concise! We do not appreciate long essays that amount to basically nothing.

This assignment consists of 11 problems. Some are intended only for graduate students (those taking CSC 503), and are labeled as such. Some contain bonus sections: you can use bonus points to improve your overall homework score. Bonus points cannot be transferred to other assignments or the final project. Any graduate-level problem counts as a bonus problem for undergraduate students.

Some problems are interconnected: you cannot solve Problem 2 without solving Problem 1.

Some problems are purposefully open-ended. Whatever you think a correct answer is, make sure to support it with code and data.

If all this feels dull, read this for some motivation (credits to M. Schmidt): https://www.quora.com/Why-should-one-learn-machine-learning-from-scratch-rather-than-just-learning-to-use-the-available-libraries

Problem 1. The American Job [50 points]

The never-ending reality show called “The US Presidential Elections” has a saving grace: it leaves a long trail of data that we can use to come up with all sorts of nebulous “scientific” and “data-driven” conclusions.
These conclusions can be, in turn, used to annoy our Twitter followers or Facebook friends. Here is one such dataset: https://raw.githubusercontent.com/kkehoe1985/ga_data_science_final_project/master/combined_data.csv. As you can see, this dataset is quite messy: 82 features in total! Let’s clean it up!
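
As a starting point, the first-cell seeding requirement and the dataset link above could be combined roughly as follows. This is only a sketch, not part of the official handout: the helper names `set_seed`, `load_dataset`, and `summarize` are hypothetical, and the tiny in-memory CSV uses made-up values so the cell also runs offline.

```python
import io

import numpy as np
import pandas as pd

SEED = 1337  # seed value required by the assignment text

# URL copied from the problem statement.
DATA_URL = (
    "https://raw.githubusercontent.com/kkehoe1985/"
    "ga_data_science_final_project/master/combined_data.csv"
)


def set_seed(seed: int = SEED) -> None:
    """Seed NumPy's global generator (hypothetical helper)."""
    np.random.seed(seed)


def load_dataset(source) -> pd.DataFrame:
    """Read a CSV from a URL, a local path, or a file-like object."""
    return pd.read_csv(source)


def summarize(df: pd.DataFrame) -> dict:
    """Collect a few quick facts worth checking before any cleaning."""
    return {
        "n_rows": df.shape[0],
        "n_columns": df.shape[1],
        "columns_with_missing": [c for c in df.columns if df[c].isna().any()],
    }


set_seed()

# Made-up stand-in data so the sketch does not need network access.
# In the notebook, pass DATA_URL to load_dataset instead.
sample = io.StringIO("state,votes,income\nA,10,\nB,20,5.5\n")
df = load_dataset(sample)
print(summarize(df))
```

Running `summarize(load_dataset(DATA_URL))` in the notebook is one quick way to check the column count against the figure quoted above and to see which columns have missing values before deciding how to clean them.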