This topic describes how to use logistic regression to build a performance prediction model. The model predicts the examination performance of students based on their family background and their behavior at school. You can also use the model to identify the key factors that affect the examination performance of students.
Background information
After you obtain the performance prediction model that is described in this topic, you can import your data to a MaxCompute table to perform offline prediction.
Dataset
The dataset that is used in this topic contains 25 feature fields and one target field (g3).
The following figure shows the sample data that is used in the experiment.
The following table describes the fields in the dataset.
Field | Type | Description |
---|---|---|
sex | STRING | The gender of the student. Valid values: F and M. F indicates that the student is female. M indicates that the student is male. |
address | STRING | The type of area where the student lives. Valid values: U and R. U indicates that the student lives in an urban area. R indicates that the student lives in a rural area. |
famsize | STRING | The number of family members. Valid values: LE3 and GT3. LE3 indicates that the number of family members is less than or equal to three. GT3 indicates that the number of family members is greater than three. |
pstatus | STRING | Indicates whether the student lives with parents. Valid values: T and A. T indicates that the student lives with parents. A indicates that the student does not live with parents. |
medu | STRING | The education level of the mother. Valid values: 0 to 4. A greater value indicates that the mother is better educated. |
fedu | STRING | The education level of the father. Valid values: 0 to 4. A greater value indicates that the father is better educated. |
mjob | STRING | The job of the mother. For example, the mother may work in the education, health, or services industry. |
fjob | STRING | The job of the father. For example, the father may work in the education, health, or services industry. |
guardian | STRING | The guardian of the student. Valid values: mother, father, and other. |
traveltime | DOUBLE | The travel time from home to school, in minutes. |
studytime | DOUBLE | The study time per week, in hours. |
failures | DOUBLE | The number of failed examinations. |
schoolsup | STRING | Indicates whether the student receives additional training in study. Valid values: yes and no. |
fumsup | STRING | Indicates whether the student has a tutor. Valid values: yes and no. |
paid | STRING | Indicates whether the student receives additional training for passing the examination. Valid values: yes and no. |
activities | STRING | Indicates whether the student receives extracurricular training courses. Valid values: yes and no. |
higher | STRING | Indicates whether the student pursues higher education. Valid values: yes and no. |
internet | STRING | Indicates whether the Internet is available for the student at home. Valid values: yes and no. |
famrel | DOUBLE | The family relationship of the student. Valid values: 1 to 5. A greater value indicates a better family relationship. |
freetime | DOUBLE | The free time available for the student. Valid values: 1 to 5. A greater value indicates a greater amount of free time. |
goout | DOUBLE | Indicates how often the student hangs out with friends. Valid values: 1 to 5. A greater value indicates that the student hangs out with friends more often. |
dalc | DOUBLE | Indicates how much the student drinks per day. Valid values: 1 to 5. A greater value indicates that the student drinks more. |
walc | DOUBLE | Indicates how much the student drinks per week. Valid values: 1 to 5. A greater value indicates that the student drinks more. |
health | DOUBLE | The health status of the student. Valid values: 1 to 5. A greater value indicates that the student has a better health status. |
absences | DOUBLE | The number of times that the student is absent from school. Valid values: 0 to 93. |
g3 | DOUBLE | The performance in the final examination. The performance is scored at a maximum of 20 points. |

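Logistic regression works on numeric inputs and a binary label, so the STRING fields in the preceding table must be converted to numbers before training. The following Python code is a minimal sketch of one way to do this. It is not the preprocessing that the experiment itself performs. The `encode_row` helper, the 0/1 codes, the sample row, and the `PASS_THRESHOLD` value of 10 are assumptions made for illustration only.

```python
# Hypothetical encoding sketch: the codes and the threshold are assumptions,
# not values taken from the experiment.
BINARY_CODES = {
    "sex": {"F": 0.0, "M": 1.0},
    "address": {"R": 0.0, "U": 1.0},
    "famsize": {"LE3": 0.0, "GT3": 1.0},
    "pstatus": {"A": 0.0, "T": 1.0},
}
YES_NO_FIELDS = ["schoolsup", "fumsup", "paid", "activities", "higher", "internet"]
PASS_THRESHOLD = 10.0  # assumed cut-off on the 20-point g3 score


def encode_row(row):
    """Convert selected fields of one record to numbers and derive a 0/1 label."""
    encoded = {}
    # Two-valued STRING fields become 0.0 or 1.0.
    for field, codes in BINARY_CODES.items():
        encoded[field] = codes[row[field]]
    # yes/no fields become 1.0 or 0.0.
    for field in YES_NO_FIELDS:
        encoded[field] = 1.0 if row[field] == "yes" else 0.0
    # medu and fedu are stored as STRING but hold the ordinal levels 0 to 4.
    for field in ("medu", "fedu"):
        encoded[field] = float(row[field])
    # Binarize the final score so that it can serve as the label.
    encoded["label"] = 1.0 if float(row["g3"]) >= PASS_THRESHOLD else 0.0
    return encoded


# Made-up record for illustration; it is not a row of the real dataset.
sample_row = {
    "sex": "F", "address": "U", "famsize": "GT3", "pstatus": "T",
    "medu": "4", "fedu": "2",
    "schoolsup": "yes", "fumsup": "no", "paid": "no",
    "activities": "yes", "higher": "yes", "internet": "no",
    "g3": 12.0,
}
encoded = encode_row(sample_row)
print(encoded)
```

Multi-valued fields such as mjob, fjob, and guardian are left out of this sketch because they need a different scheme, such as one-hot encoding.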
Procedure
- Go to the Machine Learning Studio console.
  - Log on to the PAI console.
  - In the left-side navigation pane, choose .
  - On the PAI Visualization Modeling page, find the project in which you want to create an experiment and click Machine Learning in the Operation column.
- Create an experiment.
- Run the experiment and view the result.
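
To illustrate what the experiment computes, the following Python code is a minimal sketch of logistic regression trained with batch gradient descent, followed by a ranking of features by the absolute value of their weights. It is not the implementation that Machine Learning Studio uses. The function names, the hyperparameters, and the eight training rows are assumptions made up for illustration; only the field names higher and internet come from the dataset.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(x, weights, bias):
    """Return the predicted probability that the label is 1."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_logistic_regression(features, labels, learning_rate=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_samples = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            error = predict_proba(x, weights, bias) - y
            for j in range(n_features):
                grad_w[j] += error * x[j]
            grad_b += error
        weights = [w - learning_rate * g / n_samples
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias


# Made-up rows for illustration: [higher, internet] encoded as 0/1.
feature_names = ["higher", "internet"]
X = [[1, 0], [1, 1], [1, 0], [0, 1], [0, 0], [0, 1], [1, 1], [0, 0]]
y = [1, 1, 1, 0, 0, 0, 1, 0]

weights, bias = train_logistic_regression(X, y)

# Rank features by absolute weight as a rough indicator of influence.
ranked = sorted(zip(feature_names, weights),
                key=lambda pair: abs(pair[1]), reverse=True)
for name, weight in ranked:
    print(name, round(weight, 3))
```

Weights are comparable in this way only when the features are on similar scales, which is the case here because both inputs are 0/1 values. Fields with wider ranges, such as absences, would need to be normalized first.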