Assignment 3: Model Selection & Inference Pipeline
Due Sep 16 by 3am
Points 100
Submitting a website URL
PREREQUISITES: Review the Metrics & Model Selection and Deployment & Post-Deployment lectures.
OBJECTIVES: Develop models, evaluate alternative models, and design an inference pipeline. Place all
files in your provisioned repository under the directory securebank/ (e.g.,
securebank/modules/model.py). All saved artifacts must be written in a directory called
securebank/storage/ (e.g., securebank/storage/models/). For each function or method, provide a default
value for every argument not listed under its "minimum arguments."
Task 1: In a Python notebook called analysis/model_performance.ipynb, write scripts to train and
evaluate models for model selection using data generated by your Data Pipeline (see Assignment 2).
Your notebook should train and store three sklearn models (e.g., Logistic Regression, Support Vector
Machines, Random Forest, etc.). Feel free to modify your previous submissions if you see fit. Store your
models in storage/models/artifacts/.
NOTE: For the case study submission, you will be asked to thoroughly defend the methods (i.e., metrics,
data partitioning, analysis, etc.) used to evaluate and select your models.
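The Task 1 workflow could be sketched roughly as follows. This is a minimal illustration, not a required design: the synthetic, imbalanced dataset from make_classification stands in for the output of your Assignment 2 pipeline, and the function name train_and_store, the .pkl file names, the use of pickle, and the choice of precision, recall, and F1 are all assumptions made for the example.

```python
import pickle
from pathlib import Path

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Assumption: synthetic, imbalanced stand-in for the Assignment 2 pipeline output.
X, y = make_classification(
    n_samples=2000, n_features=8, weights=[0.95, 0.05], random_state=0
)
# Stratify so the rare "fraud" class appears in both partitions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Three candidate sklearn models; the hyperparameters are illustrative only.
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000, class_weight="balanced"),
    "svm": SVC(class_weight="balanced"),
    "random_forest": RandomForestClassifier(
        n_estimators=100, class_weight="balanced", random_state=0
    ),
}


def train_and_store(models, artifact_dir="storage/models/artifacts"):
    """Fit each model, score it on the held-out split, and pickle it to disk."""
    out_dir = Path(artifact_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        y_pred = model.predict(X_test)
        results[name] = {
            "precision": precision_score(y_test, y_pred, zero_division=0),
            "recall": recall_score(y_test, y_pred, zero_division=0),
            "f1": f1_score(y_test, y_pred, zero_division=0),
        }
        with open(out_dir / f"{name}.pkl", "wb") as fh:
            pickle.dump(model, fh)
    return results
```

Calling train_and_store(candidates) returns one metrics dictionary per model, which can serve as a starting point for the comparison you will later need to defend.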
Task 2: Develop an inference pipeline for your system. In a Python module called pipeline.py, write a
"Pipeline" class that holds all the processes and logic your system needs to address the requirements.
Feel free to modify or add functionality to your previous submissions if you see fit.
This class will have at least FOUR methods:
__init__() initializes the inference object and loads the default model.
  minimum arguments:
    version: str (e.g., storage/models/artifacts/{version_name})
predict() uses the specified model to perform predictions.
  minimum arguments:
    input_data: Dict
  returns:
    prediction_output: bool (i.e., 0: legitimate, 1: fraud)
select_model() loads a model from a catalog of pre-trained models in storage/models/artifacts/.
  minimum arguments:
2024/9/19 20:44Assignment 3: Model Selection & Inference Pipeline
https://jhu.instructure.com/courses/82966/assignments/878708?return_to=https%3A%2F%2Fjhu.instructure.com%2Fcalendar%23view_name%3Dmo...1/3
    version: str (e.g., storage/models/artifacts/{version_name})
  returns:
    None
get_history() returns information on previous system predictions.
  minimum arguments:
    None
  returns:
    history: Dict
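One possible shape for the four methods listed above is sketched here. It is an illustration under several assumptions: version is treated as a file name inside artifact_dir, the loader argument and the load_pickle helper are hypothetical additions that make the model source swappable, the layout of the history dictionary is invented for the example, and _to_features is only a placeholder for your real feature engineering.

```python
import pickle
from datetime import datetime, timezone
from pathlib import Path


def load_pickle(path):
    """Hypothetical default loader: unpickle a trained model from disk."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


class Pipeline:
    """Sketch of the four required methods; feature handling is a placeholder."""

    def __init__(self, version, artifact_dir="storage/models/artifacts",
                 loader=load_pickle):
        # `version` is a minimum argument; the other arguments have defaults.
        self.artifact_dir = Path(artifact_dir)
        self.loader = loader
        self.history = {"predictions": []}
        self.select_model(version)

    def select_model(self, version):
        # Assumption: `version` names a file inside artifact_dir.
        self.model = self.loader(self.artifact_dir / version)
        self.version = version

    def _to_features(self, input_data):
        # Hypothetical placeholder for the Assignment 2 feature engineering.
        return [[float(input_data["amt"]),
                 float(input_data["merch_lat"]),
                 float(input_data["merch_long"])]]

    def predict(self, input_data):
        raw = self.model.predict(self._to_features(input_data))[0]
        prediction_output = bool(raw)  # False: legitimate, True: fraud
        # Record every call so get_history() can report it later.
        self.history["predictions"].append({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "version": self.version,
            "input": dict(input_data),
            "prediction": prediction_output,
        })
        return prediction_output

    def get_history(self):
        return self.history
```

Passing the loader in as an argument keeps the class testable without a trained artifact on disk; whether that indirection is worth it for your system is a design choice to weigh.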
The input_data should be formatted as a dictionary with the following keys:
'trans_date_trans_time'
'cc_num'
'unix_time'
'merchant'
'category'
'amt'
'merch_lat'
'merch_long'
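A small check on the eight keys listed above might look like the sketch below. The helper name validate_input and the error messages are assumptions for illustration, and the assignment does not state how missing or extra keys should be handled, so dropping extras here is a choice made for the example.

```python
REQUIRED_KEYS = (
    "trans_date_trans_time",
    "cc_num",
    "unix_time",
    "merchant",
    "category",
    "amt",
    "merch_lat",
    "merch_long",
)


def validate_input(input_data):
    """Return a copy holding only the required keys, or raise if any is missing."""
    if not isinstance(input_data, dict):
        raise TypeError("input_data must be a dictionary")
    missing = [key for key in REQUIRED_KEYS if key not in input_data]
    if missing:
        raise ValueError(f"input_data is missing keys: {missing}")
    # Assumption: keys outside the specification are silently dropped.
    return {key: input_data[key] for key in REQUIRED_KEYS}
```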
Task 3: In a Python script called app.py, implement a Flask server (using the code you've developed in
the previous assignments) with the following endpoint:
predict/, which classifies a transaction as legitimate or fraudulent. The request will carry the
transaction information as a JSON string with the following keys:
'trans_date_trans_time'
'cc_num'
'unix_time'
'merchant'
'category'
'amt'
'merch_lat'
'merch_long'
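The predict/ endpoint could be sketched as below. Several details are assumptions, since the assignment does not fix them: the endpoint is taken to accept POST requests, the response shape {"prediction": 0 or 1} is invented for the example, and classify is a hypothetical stand-in for a call to your Pipeline.predict().

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

REQUIRED_KEYS = (
    "trans_date_trans_time", "cc_num", "unix_time", "merchant",
    "category", "amt", "merch_lat", "merch_long",
)


def classify(transaction):
    # Hypothetical stand-in for Pipeline.predict(); replace with the real pipeline.
    return float(transaction["amt"]) > 1000.0


@app.route("/predict/", methods=["POST"])  # Assumption: POST with a JSON body.
def predict():
    payload = request.get_json(silent=True)
    if not isinstance(payload, dict):
        return jsonify({"error": "request body must be a JSON object"}), 400
    missing = [key for key in REQUIRED_KEYS if key not in payload]
    if missing:
        return jsonify({"error": "missing keys", "missing": missing}), 400
    # 0: legitimate, 1: fraud
    return jsonify({"prediction": int(classify(payload))})


# To serve locally, call app.run(host="0.0.0.0", port=5000);
# the host and port are assumptions, not part of the assignment.
```

Returning a 400 response for malformed input is one reasonable convention; the assignment leaves error handling to you.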
Provide a test.json file in your submission.
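One way to produce test.json is to write it from Python so the keys always match the specification. Every value in this sketch is a made-up placeholder (including the card number, merchant, and coordinates), and the helper name write_test_json is hypothetical; substitute a record drawn from your own data.

```python
import json
from pathlib import Path

# All values are illustrative placeholders, not real transaction data.
SAMPLE_TRANSACTION = {
    "trans_date_trans_time": "2020-01-01 12:00:00",
    "cc_num": 1234567890123456,
    "unix_time": 1577880000,
    "merchant": "example_merchant",
    "category": "example_category",
    "amt": 42.50,
    "merch_lat": 39.33,
    "merch_long": -76.62,
}


def write_test_json(path="test.json", transaction=SAMPLE_TRANSACTION):
    """Serialize one transaction to disk as indented JSON and return its path."""
    target = Path(path)
    target.write_text(json.dumps(transaction, indent=2))
    return target
```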
NOTE: Begin considering which other endpoints you will need in order to address the requirements (see
Case Study Introduction).
Task 4: Write a Dockerfile to build a Docker image and run a Docker container for users to interact with
the system.
SUBMISSION: You will need to check in the following files and any supporting Python modules:
securebank/analysis/model_performance.ipynb
securebank/pipeline.py
securebank/app.py
securebank/test.json
securebank/Dockerfile
Provide the GitHub URL of your model_performance.ipynb file via Canvas to get credit for this
submission.
Assignment 3: Model Selection & Inference Pipeline
Criteria | Ratings | Pts
Task 1 | Submission provided a strong argument for the selected model. | 40 pts
Task 2 | Submission implemented all four well-designed functions and capabilities. | 40 pts
Task 3 | Submission includes a working implementation of the predict endpoint. | 10 pts
Task 4 | Submission includes a working Dockerfile implementation. | 10 pts
Total Points: 100