


QBUS2820 Assignment 1 (30 marks)

August 23, 2024

1 Background

Developing a predictive model for building heating load is essential in energy efficiency management. Suppose you work for an energy efficiency consulting firm, and your task is to optimize the heating system operations of buildings by predicting their daily heating load requirements. The variable HeatingLoad in the dataset HeatingLoad_training.csv represents the daily energy required to maintain comfortable indoor temperatures in buildings. This data includes several predictors that influence heating load, such as building characteristics, environmental conditions, and occupancy. The response variable and covariates are detailed in the table below.

Variable              Description
HeatingLoad           Total daily heating energy required (in kWh)
BuildingAge           Age of the building (in years)
BuildingHeight        Height of the building (in meters)
Insulation            Insulation quality (1 = Good, 0 = Poor)
AverageTemperature    Average daily temperature (in °C)
SunlightExposure      Solar energy received per unit area (in W/m²)
WindSpeed             Wind speed at the building’s location (in m/s)
OccupancyRate         Proportion of the building that is occupied (percentage)

Table 1: Description of Variables

Your task is to develop a regression model to predict HeatingLoad based on these covariates. Additionally, you are provided with the dataset HeatingLoad_test_without_HL.csv, which is the real test dataset HeatingLoad_test.csv with the HeatingLoad column removed. The test dataset HeatingLoad_test.csv (not provided) has the same structure as the training data HeatingLoad_training.csv.

1.1 Test Error

To measure prediction accuracy, please use mean squared error (MSE) on the test data. Let $\hat{y}_i$ be the prediction of $y_i$, where $y_i$ is the i-th HeatingLoad in the test data. The test error is computed as follows:

$$\text{Test error} = \frac{1}{n_{\text{test}}} \sum_{y_i \in \text{test data}} (\hat{y}_i - y_i)^2,$$

where $n_{\text{test}}$ is the number of observations in the test data.

2 Submission Instructions

1.
Please submit THREE files (or more if necessary) via the Canvas site:
   • A document file named SID_Assignment1_document.pdf, reporting your data analysis procedure and results. You should replace “SID” with your student ID.
   • A Python file named SID_Assignment1_implementation.ipynb that implements your data analysis procedure and produces the test error. You may submit additional files if needed, following the format SID_Assignment1_.
   • A CSV file SID_Assignment1_HL_prediction.csv containing the predictions of HeatingLoad for the dataset HeatingLoad_test_without_HL.csv. This CSV file should have only one column, named HeatingLoad, which holds the predicted values.

2. Regarding your document file SID_Assignment1_document.pdf:
   • Detail your data analysis procedure: how the Exploratory Data Analysis (EDA) was conducted, the methods/predictors used, and the reasoning behind them. The description should be thorough enough for other data scientists in your field to understand and replicate the task. All numerical results should be reported to four decimal places.
   • Present relevant graphs and tables clearly and appropriately.
   • The page limit is 15 pages, including everything: appendices, computer output, graphs, tables, etc.

3. The Python file must be written using Jupyter Notebook, assuming all necessary data files (HeatingLoad_training.csv and HeatingLoad_test.csv) are in the same folder as the Python file.
   • The Python file SID_Assignment1_implementation.ipynb must include the following code in the last code cell:

         import pandas as pd
         HeatingLoad_test = pd.read_csv("HeatingLoad_test.csv")
         # YOUR CODE HERE: code that produces the test error test_error
         print(test_error)

     The marker expects to see the same test error you would obtain if you were provided with the complete test data. The file should contain enough explanations for the marker to run your code.
   • Use only the methods covered in the lectures and tutorials.
     You are free to use any Python libraries to implement your models as long as they are publicly available.

3 Marking Criteria

This assignment is worth 30 marks in total, with 18 marks allocated to the content of SID_Assignment1_document.pdf and 12 marks to the Python implementation. The marking breakdown is as follows:

1. Prediction accuracy: Your test error will be compared against the smallest test error among all submissions, including the teaching team. The marker first runs SID_Assignment1_implementation.ipynb.
   • If the file runs smoothly and produces a test error, up to 12 marks will be awarded based on prediction accuracy relative to the smallest MSE and the appropriateness of your implementation.
   • If the marker cannot run SID_Assignment1_implementation.ipynb or if no test error is produced, partial marks (maximum 4) may be awarded based on the appropriateness of the file.

2. Report described in SID_Assignment1_document.pdf: Up to 18 marks are allocated based on:
   • The appropriateness of the chosen prediction method.
   • The detail, discussion, and explanation of your data analysis procedure. See the Marking Criteria for more details.

3. CSV File Submission: Up to 2 marks will be deducted if you fail to upload the CSV file in the correct format.

4 Errors

If you believe there are errors in this assignment, please contact the teaching team.
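
As an illustration of the test error defined in Section 1.1 and the single-column CSV format requested in the submission instructions, here is a minimal sketch using only the Python standard library. The helper names mean_squared_error and write_predictions, the in-memory buffer, and the numbers are assumptions made for this example; they are not part of the assignment and do not come from its data.

```python
import csv
import io


def mean_squared_error(y_true, y_pred):
    """Average squared difference between predictions and observed values."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def write_predictions(predictions, handle):
    """Write a single-column CSV whose header is HeatingLoad."""
    writer = csv.writer(handle)
    writer.writerow(["HeatingLoad"])
    for value in predictions:
        writer.writerow([value])


# Made-up numbers for illustration only; they are not from the assignment data.
y_true = [10.0, 12.0, 15.0]
y_pred = [11.0, 12.0, 13.0]

test_error = mean_squared_error(y_true, y_pred)
print(f"{test_error:.4f}")  # formatted to four decimal places

buffer = io.StringIO()  # stand-in for an open file handle
write_predictions(y_pred, buffer)
csv_lines = buffer.getvalue().splitlines()
```

In a notebook, the buffer would typically be replaced by a real file handle opened with newline="", or the same layout could be produced with a pandas DataFrame holding one HeatingLoad column and written with to_csv(index=False).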