One of the questions often asked by Civitas Learning partners is, “How do I know I can trust a prediction and be confident in taking action?” This is an extremely important question, because trust and confidence are key to moving to action and, ultimately, to improvements in student outcomes. If we don’t have trust and confidence, we won’t take action. If we don’t take action, we won’t get to improved outcomes.
While Civitas Learning data scientists and engineers go through rigorous processes to ensure model quality and accuracy when building and testing models, the methods they use are often inscrutable to those of us who aren’t data scientists. ROC (receiver operating characteristic) curves are their common language, but the rest of us need something more understandable.
I recently presented at our partner Summit a simplified perspective on the power of using early and accurate prediction scores. In case you missed that session, this series provides an overview of the predictions, their high accuracy on day one, how you can be confident in them, and examples of how to use the scores effectively.
What is a Prediction?
Let’s start by better understanding predictions and how they help us with student success. A prediction is a signal, a strong signal, created by a model built from the stories (data) of students who came before. It is a quantification of the likelihood that something will happen in the future (based on what has happened in the past). In this case, it is the likelihood that a student will persist based on what happened to similar students in the past. It is not deterministic – it is what is likely to happen, not what absolutely will happen. This is what makes a prediction powerful – we have the ability to change what is likely to happen by taking action.
Our predictive models assign each student a likelihood to persist each day. It’s important to note this is not a yes/no declaration of which students are destined to fail and which are destined to succeed. It is a score between zero and 100 that quantifies the probability that each student will persist without intervention.
These scores are then broken up into five groups — from very low to very high likelihood to persist. For institutions, this provides the ability to create precise strategies for each group and intervene differently based on each group’s overall likelihood to persist.
It is important to note that even in the lowest bucket there are students who will persist without intervention, and in the top bucket there are students who will not persist.
That said, if you think about this in the context of an intervention and outreach to students in the bottom bucket, you know that at least eight out of ten of those students are likely not to persist without your outreach efforts. On the other hand, if you reach out to students in the top bucket, nine out of ten are likely to return without extra intervention.
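The five-group segmentation described above can be sketched as a simple bucketing function. The first three cut points (20, 50, 70) come from the partner-institution chart discussed later in this piece; the top two groups' boundary (90) is an assumption for illustration, as are the group labels and student IDs.

```python
def bucket(score):
    """Map a 0-100 likelihood-to-persist score to one of five groups.

    Cut points 20/50/70 follow the chart discussed in the text;
    the 90 boundary is assumed for illustration.
    """
    if score < 20:
        return "very low"
    elif score < 50:
        return "low"
    elif score < 70:
        return "moderate"
    elif score < 90:
        return "high"
    else:
        return "very high"

# Hypothetical daily scores keyed by student ID
scores = {"A123": 15, "B456": 42, "C789": 88, "D012": 95}
groups = {sid: bucket(s) for sid, s in scores.items()}
# groups -> {"A123": "very low", "B456": "low", "C789": "high", "D012": "very high"}
```

Segmenting this way lets each group get its own outreach strategy rather than a single yes/no risk flag.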
As our partners engage with this, they use the scores not to focus only on one group of students, but rather, to think about how each group deserves their own approach and strategy to help them stay and thrive.
Segmenting students by prediction score helps provide the right intervention to the right student and better personalize their journey.
Trusting the Prediction From Day One
Now that we have a frame for understanding the prediction scores and their purpose, how do we understand model performance in action at our partner institutions? We assess how the models performed by looking at what actually happened compared to what was predicted by the model.
To do this for each of our partners, we ran an analysis after their first term with daily prediction scores. The analysis looked at what was predicted on Day 1 of the term and compared it to actual persistence post-census in the target term. The predictions on Day One were highly accurate and detected most at-risk students.
To illustrate this, the chart above from one of our partner institutions shows the prediction scores versus actual persistence. The colored bars are the actual persistence rates organized by what was predicted. Starting on the left, where the model gave students in the red bucket a 20% or lower likelihood to persist on Day One of the term, 20% of students in that bucket persisted post-census in the target term. Next, in the orange group, where the model said 20-50% would persist, 37% persisted. In the yellow bucket, where the model said 50-70% of students would persist, 61% persisted, and so on. In each group, on Day One of the term, the model provided highly accurate scores.
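The back-test described above amounts to grouping students by their Day One bucket and comparing each bucket's predicted range with the persistence rate that actually occurred. Here is a minimal sketch of that comparison; the records below are made up for illustration and are not partner data.

```python
from collections import defaultdict

# Hypothetical (day_one_bucket, persisted) records
records = [
    ("red", False), ("red", False), ("red", False), ("red", False), ("red", True),
    ("orange", False), ("orange", True), ("orange", False),
    ("yellow", True), ("yellow", True), ("yellow", False),
]

totals = defaultdict(int)      # students per bucket
persisted = defaultdict(int)   # persisters per bucket
for group, outcome in records:
    totals[group] += 1
    persisted[group] += outcome  # True counts as 1

# Actual persistence rate per predicted bucket
actual_rates = {g: persisted[g] / totals[g] for g in totals}
# actual_rates["red"] == 0.2 -- inside the red bucket's predicted
# "20% or less likelihood to persist" range
```

If each bucket's actual rate lands inside its predicted range, as in the chart, the model is well calibrated.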
In addition, looking at the red, orange, and yellow buckets, the model caught 78% of the non-persisting students for this institution on Day One. This level of detection is highly consistent across our partner institutions. In fact, when we expanded the lens to look across 50 institutions, our models caught, on average, 82% of the non-persisting students on Day One.
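The detection rate described above is a recall-style measure: of all students who did not persist, what share fell in a flagged bucket on Day One? A minimal sketch, using hypothetical students and bucket names:

```python
def detection_rate(students, flagged=frozenset({"red", "orange", "yellow"})):
    """Share of non-persisting students whose Day One bucket was flagged."""
    non_persisters = [s for s in students if not s["persisted"]]
    caught = [s for s in non_persisters if s["bucket"] in flagged]
    return len(caught) / len(non_persisters)

# Hypothetical records; "green" stands in for an unflagged higher bucket
students = [
    {"bucket": "red", "persisted": False},
    {"bucket": "orange", "persisted": False},
    {"bucket": "yellow", "persisted": False},
    {"bucket": "green", "persisted": False},  # non-persister the flag misses
    {"bucket": "green", "persisted": True},
]
rate = detection_rate(students)  # 3 of 4 non-persisters caught -> 0.75
```

The same calculation, run against a simple trigger such as cumulative GPA instead of bucket membership, is how the comparison in the next paragraph is made.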
When we compared this detection rate to cumulative GPA, a commonly used risk trigger, cumulative GPA caught, on average, only 21% of non-persisting students. Our Day One prediction is dramatically more powerful than simple triggers or models for detecting persistence risk.