Diabetes 130-US hospitals for years 1999-2008
Data
The dataset represents 10 years (1999-2008) of clinical care at 130 US hospitals and
integrated delivery networks. It includes over 50 features representing patient and hospital
outcomes. Information was extracted from the database for encounters that satisfied the
following criteria.
● It is an inpatient encounter (a hospital admission).
● It is a diabetic encounter, that is, one during which any kind of diabetes was entered
into the system as a diagnosis.
● The length of stay was at least 1 day and at most 14 days.
● Laboratory tests were performed during the encounter.
● Medications were administered during the encounter.
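The length-of-stay criterion above can double as a sanity check once the data is loaded. The following is a minimal sketch using only the standard library; the column name `time_in_hospital` and the sample rows are assumptions made for illustration and may not match the actual file.

```python
import csv
import io

# Hypothetical sample rows; the real file and its column names may differ.
SAMPLE_CSV = """encounter_id,time_in_hospital
1001,1
1002,14
1003,0
1004,15
1005,7
"""

def stay_out_of_range(rows, column="time_in_hospital", low=1, high=14):
    """Return the rows whose length of stay falls outside [low, high] days."""
    return [row for row in rows if not low <= int(row[column]) <= high]

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
violations = stay_out_of_range(rows)
print([row["encounter_id"] for row in violations])
```

If the file really satisfies the stated criteria, the list of violations should be empty; any rows that do appear are worth discussing in the notebook.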

This data has been prepared to analyze factors related to readmission as well as other
outcomes pertaining to patients with diabetes.

The dataset is in .csv format and has 100000 instances. Each instance corresponds to one
admission of a patient. The task is to predict whether a patient is readmitted in ‘less than
30 days’, in ‘more than 30 days’, or not at all. The dataset has 50 features.
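A natural first step for a three-class target like this is to check how the classes are distributed. The sketch below counts the labels with the standard library only; the column name `readmitted`, the label spellings, and the sample rows are assumptions for illustration.

```python
import csv
import io
from collections import Counter

# Hypothetical sample; the column name `readmitted` and the label spellings
# "<30", ">30", "No" are assumptions about how the file encodes the target.
SAMPLE_CSV = """encounter_id,readmitted
1,No
2,>30
3,No
4,<30
5,No
6,>30
"""

def class_distribution(rows, target="readmitted"):
    """Return {label: (count, fraction)} for the target column."""
    counts = Counter(row[target] for row in rows)
    total = sum(counts.values())
    return {label: (n, n / total) for label, n in counts.items()}

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
distribution = class_distribution(rows)
for label, (n, fraction) in sorted(distribution.items()):
    print(f"{label}: {n} ({fraction:.1%})")
```

If the real distribution turns out to be imbalanced, that is a reason to report per-class metrics rather than accuracy alone.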


| Feature name | Type | Description and values |
| --- | --- | --- |
| Encounter ID | Numeric | Unique identifier of an encounter |
| Patient number | Numeric | Unique identifier of a patient |
| Race | Nominal | Values: Caucasian, Asian, African American, Hispanic, and other |
| Gender | Nominal | Values: male, female, and unknown/invalid |
| Age | Nominal | Grouped in 10-year intervals: [0, 10), [10, 20), …, [90, 100) |
| Weight | Numeric | Weight in pounds |
| Admission type | Nominal | Integer identifier corresponding to 9 distinct values, for example, emergency, urgent, elective, newborn, and not available |
| Discharge disposition | Nominal | Integer identifier corresponding to 29 distinct values, for example, discharged to home, expired, and not available |
| Admission source | Nominal | Integer identifier corresponding to 21 distinct values, for example, physician referral, emergency room, and transfer from a hospital |
| Time in hospital | Numeric | Integer number of days between admission and discharge |
| Payer code | Nominal | Integer identifier corresponding to 23 distinct values, for example, Blue Cross/Blue Shield, Medicare, and self-pay |
| Medical specialty | Nominal | Integer identifier of a speciality of the admitting physician, corresponding to 84 distinct values, for example, cardiology, internal medicine, family/general practice, and surgeon |
| Number of lab procedures | Numeric | Number of lab tests performed during the encounter |
| Number of procedures | Numeric | Number of procedures (other than lab tests) performed during the encounter |
| Number of medications | Numeric | Number of distinct generic names administered during the encounter |
| Number of outpatient visits | Numeric | Number of outpatient visits of the patient in the year preceding the encounter |
| Number of emergency visits | Numeric | Number of emergency visits of the patient in the year preceding the encounter |
| Number of inpatient visits | Numeric | Number of inpatient visits of the patient in the year preceding the encounter |
| Diagnosis 1 | Nominal | The primary diagnosis (coded as first three digits of ICD9); 848 distinct values |
| Diagnosis 2 | Nominal | Secondary diagnosis (coded as first three digits of ICD9); 923 distinct values |
| Diagnosis 3 | Nominal | Additional secondary diagnosis (coded as first three digits of ICD9); 954 distinct values |
| Number of diagnoses | Numeric | Number of diagnoses entered to the system |
| Glucose serum test result | Nominal | Indicates the range of the result or if the test was not taken. Values: “>200,” “>300,” “normal,” and “none” if not measured |
| A1c test result | Nominal | Indicates the range of the result or if the test was not taken. Values: “>8” if the result was greater than 8%, “>7” if the result was greater than 7% but less than 8%, “normal” if the result was less than 7%, and “none” if not measured |
| Change of medications | Nominal | Indicates if there was a change in diabetic medications (either dosage or generic name). Values: “change” and “no change” |
| Diabetes medications | Nominal | Indicates if there was any diabetic medication prescribed. Values: “yes” and “no” |
| 24 features for medications | Nominal | For the generic names: metformin, repaglinide, nateglinide, chlorpropamide, glimepiride, acetohexamide, glipizide, glyburide, tolbutamide, pioglitazone, rosiglitazone, acarbose, miglitol, troglitazone, tolazamide, examide, sitagliptin, insulin, glyburide-metformin, glipizide-metformin, glimepiride-pioglitazone, metformin-rosiglitazone, and metformin-pioglitazone, the feature indicates whether the drug was prescribed or there was a change in the dosage. Values: “up” if the dosage was increased during the encounter, “down” if the dosage was decreased, “steady” if the dosage did not change, and “no” if the drug was not prescribed |
| Readmitted | Nominal | Days to inpatient readmission. Values: “<30” if the patient was readmitted in less than 30 days, “>30” if the patient was readmitted in more than 30 days, and “No” for no record of readmission |
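Several of the nominal features described above carry ordered or composite information that can be unpacked before modelling. The sketch below shows one possible encoding, not a prescribed one: the age interval is mapped to its midpoint, and each drug feature is split into a "prescribed" flag and a signed dosage change. The helper names `age_midpoint` and `encode_drug` are hypothetical, and the exact string formats in the file are assumed to match the descriptions in the table.

```python
def age_midpoint(interval):
    """Map an interval string such as "[70, 80)" to its midpoint (75.0).

    Assumes the bounds are separated by a comma, as written in the table;
    the actual file may use a different separator.
    """
    low, high = interval.strip("[]()").split(",")
    return (float(low) + float(high)) / 2

def encode_drug(value):
    """Split a drug value into (prescribed, dosage_change).

    prescribed is 0 for "no" and 1 otherwise; dosage_change is +1 for "up",
    -1 for "down", and 0 for "steady" or "no".
    """
    value = value.strip().lower()
    prescribed = 0 if value == "no" else 1
    change = {"up": 1, "down": -1}.get(value, 0)
    return prescribed, change

print(age_midpoint("[70, 80)"))
print([encode_drug(v) for v in ["no", "steady", "up", "down"]])
```

Splitting the drug value into two numbers avoids imposing an arbitrary order on “no”, “down”, “steady”, and “up”; whether that helps is something to justify in the notebook.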


Instructions:
● You should create a model that predicts the target class from the given features.
● You are free to use any methods or techniques to analyze the data, train models, and
evaluate the results. However, you may use only the standard Python library plus SciPy
and pandas. No other third-party libraries are allowed.
● You must deliver your project in the form of a Jupyter notebook.
● Jupyter notebooks are all about telling a story with the data, so build your notebook
that way: present everything clearly and give it a good flow.
● We will not mark you on the final accuracy you get. We will mark you on how well you
explain the decisions you make at every stage of your project, the reasoning behind
them, and your data representation, evaluation, and analysis methods. All of these
must be shown within the notebook itself.
● Make sure you attend the labs and complete the assignments; they will help you a lot.
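Because the allowed libraries do not include a ready-made metrics package, evaluation helpers may need to be written by hand. The sketch below is one minimal way to do that with the standard library: a confusion matrix for the three classes plus macro-averaged recall. The label spellings and the toy predictions are assumptions for illustration, not real model output.

```python
from collections import Counter

# Assumed label spellings, following the "Readmitted" feature description.
LABELS = ["<30", ">30", "No"]

def confusion_matrix(y_true, y_pred, labels=LABELS):
    """Return a nested dict: matrix[true_label][predicted_label] -> count."""
    pairs = Counter(zip(y_true, y_pred))
    return {t: {p: pairs[(t, p)] for p in labels} for t in labels}

def macro_recall(matrix):
    """Average the per-class recall so that each class counts equally."""
    recalls = []
    for true_label, row in matrix.items():
        total = sum(row.values())
        recalls.append(row[true_label] / total if total else 0.0)
    return sum(recalls) / len(recalls)

# Toy labels for illustration only.
y_true = ["No", "No", "No", "No", ">30", ">30", "<30", "<30"]
y_pred = ["No", "No", "No", ">30", ">30", "No", "No", "<30"]

matrix = confusion_matrix(y_true, y_pred)
accuracy = sum(matrix[label][label] for label in LABELS) / len(y_true)
print(matrix)
print(f"accuracy={accuracy:.3f}, macro recall={macro_recall(matrix):.3f}")
```

Reporting the full matrix alongside a single score makes it easier to explain in the notebook which classes the model confuses.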

