Assignment 1
URBAN 5103 Transport Planning Methods
1. Download the travel survey data and codebook from:
https://www.psrc.org/travel-surveys-spring-2014-household-survey
a) Calculate the total travel distance (find the distance variable in the codebook)
for each person (personID) and each household (hhid), using the hhsurvey-trips file.
Save the results as distance_person and distance_household, respectively. Provide
summary statistics of total travel distance for persons and for households. (Please
remove all missing or unexpected data.)
b) Create a new data set (household) that includes household id, number of vehicles,
household size, and number of workers in the household. Please remove all missing
values (including "prefer not to answer"). Display summary statistics of all
variables in the data except id. Report the dimensions of the data set.
c) Display the data contained in the first 10 rows and in all columns except column
4.
d) Merge the new data (household) with the total travel distance for each household
(distance_household). What is the correlation between total travel distance and
household size? Draw a histogram of log(total travel distance). Order the data by
total travel distance, smallest to largest. (For this problem, please search online
to find the proper functions. You can also find a way to add a title and axis labels.)
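
The steps in question 1 can be sketched on a toy data set as follows. This is only a minimal illustration using the Python standard library, not a solution to the assignment: the rows are made up, the column names distance, vehicles, hhsize and workers are hypothetical, and the codes used here for missing values (None, negative numbers, -99) are assumptions. The real variable names and missing-value codes must be taken from the codebook.

```python
import math
from collections import defaultdict
from statistics import mean, median

# Made-up trip rows; "distance" is a hypothetical name for the codebook's distance variable.
trips = [
    {"hhid": 1, "personID": 101, "distance": 2.5},
    {"hhid": 1, "personID": 101, "distance": 4.0},
    {"hhid": 1, "personID": 102, "distance": None},   # missing value
    {"hhid": 1, "personID": 102, "distance": 1.5},
    {"hhid": 2, "personID": 201, "distance": 10.0},
    {"hhid": 2, "personID": 201, "distance": -1.0},   # unexpected (negative) value
    {"hhid": 3, "personID": 301, "distance": 3.0},
    {"hhid": 4, "personID": 401, "distance": 20.0},
    {"hhid": 4, "personID": 402, "distance": 5.0},
]

# Made-up household rows; -99 is an assumed code for "prefer not to answer".
households = [
    {"hhid": 1, "vehicles": 1, "hhsize": 2, "workers": 1},
    {"hhid": 2, "vehicles": 2, "hhsize": 1, "workers": 1},
    {"hhid": 3, "vehicles": -99, "hhsize": 4, "workers": 2},
    {"hhid": 4, "vehicles": 2, "hhsize": 4, "workers": 2},
    {"hhid": 5, "vehicles": 0, "hhsize": 3, "workers": 0},
]

def is_valid(x):
    """Treat None and negative codes as missing or unexpected (an assumption)."""
    return x is not None and x >= 0

def total_by(rows, key, value="distance"):
    """Sum the valid values of `value` within each group defined by `key`."""
    totals = defaultdict(float)
    for row in rows:
        if is_valid(row[value]):
            totals[row[key]] += row[value]
    return dict(totals)

def summarize(values):
    """Basic summary statistics for a collection of numbers."""
    values = sorted(values)
    return {"n": len(values), "min": values[0], "median": median(values),
            "mean": mean(values), "max": values[-1]}

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# a) Total distance per person and per household, with summaries.
distance_person = total_by(trips, "personID")
distance_household = total_by(trips, "hhid")
person_summary = summarize(distance_person.values())
household_summary = summarize(distance_household.values())

# b) Keep only households with no missing or refused answers; (rows, columns).
household = [row for row in households if all(is_valid(v) for v in row.values())]
dimensions = (len(household), len(household[0]))

# c) First 10 rows, all columns except the 4th (1-based position).
columns = list(household[0])
keep = columns[:3] + columns[4:]
first10 = [{c: row[c] for c in keep} for row in household[:10]]

# d) Inner merge on hhid, correlation, log distances, and ordering.
merged = [dict(row, total_distance=distance_household[row["hhid"]])
          for row in household if row["hhid"] in distance_household]
r = pearson([row["hhsize"] for row in merged],
            [row["total_distance"] for row in merged])
log_distance = [math.log(row["total_distance"]) for row in merged
                if row["total_distance"] > 0]   # log is undefined at zero
merged.sort(key=lambda row: row["total_distance"])
```

The histogram itself is not drawn here. The values in log_distance could be passed to a plotting function such as matplotlib's hist, with a title and axis labels added through that library's own labelling functions.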
2. Please run a multiple linear regression of your own choosing. You can choose any
dependent variable that you are interested in.
a) Write your equation
b) Provide a detailed explanation of your variable choice (i.e., independent variables)
with the expected signs of these variables
c) Test the validity of the model (F-test) and calculate R2. Explain the theoretical
foundation of the test and R2 based on the formula
d) Check the model assumptions
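
As a reference for question 2, the standard textbook forms of the regression equation, R2, and the overall F-test are sketched below. The notation is an assumption rather than something fixed by the assignment: n is the number of observations, k is the number of independent variables, and the model includes an intercept.

```latex
% Multiple linear regression with k independent variables
y_i = \beta_0 + \beta_1 x_{i1} + \dots + \beta_k x_{ik} + \varepsilon_i,
\qquad i = 1, \dots, n

% Sums of squares (\hat{y}_i: fitted value, \bar{y}: sample mean of y)
SST = \sum_{i=1}^{n} (y_i - \bar{y})^2, \quad
SSR = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2, \quad
SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2, \quad
SST = SSR + SSE

% Coefficient of determination
R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}

% Overall F-test of H_0: \beta_1 = \dots = \beta_k = 0
F = \frac{SSR / k}{SSE / (n - k - 1)}
  = \frac{R^2 / k}{(1 - R^2) / (n - k - 1)}
  \sim F_{k,\, n-k-1} \quad \text{under } H_0
```

A large F value (small p-value) leads to rejecting the hypothesis that all slope coefficients are zero. The F distribution holds exactly under the classical assumption of normally distributed errors with constant variance, which is one reason the assumption checks in part d) matter.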