I wanted to take this opportunity to investigate the important role of digital signal processing and machine learning in improving our ability to harness student success knowledge from SIS, LMS, and student touchpoint data in an observational setting.
In machine learning, many tend to focus on learning algorithms, but the real magic comes from feature engineering (FE). FE is a set of processes by which a student’s time-series event data is converted into more meaningful variables that help us glean insights into a student’s institutional journey. This journey is filled with academic and non-academic factors that can influence the student’s success measured in terms of course grades, persistence, completion, employment, and salary.
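As a rough illustration of what FE means in practice, the sketch below collapses a raw event stream into a few per-student variables. The event log, the event type names, and the chosen features (`n_logins`, `first_active_day`, `active_span_days`) are all hypothetical and are not taken from any real SIS or LMS schema.

```python
from collections import defaultdict

# Hypothetical event log: (student_id, day_of_term, event_type).
EVENTS = [
    ("s1", 1, "lms_login"),
    ("s1", 3, "lms_login"),
    ("s1", 10, "assignment_submit"),
    ("s2", 8, "lms_login"),
    ("s2", 30, "assignment_submit"),
]


def engineer_features(events):
    """Collapse each student's event stream into a few summary variables."""
    by_student = defaultdict(list)
    for student_id, day, event_type in events:
        by_student[student_id].append((day, event_type))

    features = {}
    for student_id, rows in by_student.items():
        days = sorted(day for day, _ in rows)
        logins = sum(1 for _, kind in rows if kind == "lms_login")
        features[student_id] = {
            "n_events": len(rows),                   # overall activity volume
            "n_logins": logins,                      # LMS engagement
            "first_active_day": days[0],             # early vs. late starter
            "active_span_days": days[-1] - days[0],  # how long activity lasted
        }
    return features


FEATURES = engineer_features(EVENTS)
```

Real feature sets would be far richer, but the shape of the transformation is the same: many timestamped events in, a handful of interpretable variables per student out.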
In the previous article, we discussed the importance of the fully blocked randomized controlled trial (RCT). The fundamental concept behind a fully blocked RCT is randomization at the pair level, meaning that students are matched “somehow” before randomization. The word “somehow” requires further explanation.
There are two schools of thought here. The first posits that we should not engage in data snooping, preferring to throw every available variable into matching via propensity score, Mahalanobis distance, or coarsened exact matching algorithms.
The second school of thought comes from the machine learning community. Given historical data, we can determine which baseline student variables matter for the student success metrics that we attempt to measure. Instead of matching students on a large number of student variables, many of which may be irrelevant to predicting student success metrics, why not build an effective matching system that leverages the minimum number of highly relevant student variables known to influence student success? Figure 1 illustrates this point.
Now the key question is: which student variables?
Having started my career in defense R&D and processed all kinds of sensor data, from satellite imagery to underwater sonar, I saw early on the value of linked events and stimulus-response patterns in inferring the intents and trajectories of hostile threats or high-value moving targets. The same idea carries over to students. For example, data on when students enroll for the next term and what kinds of courses they take says a lot about whether they are proactive or procrastinating, whether they are staying on the right pathway or wandering around, how well prepared they are, and so on. LMS activity trends in relation to high-value events can be quite insightful in helping us infer a student’s social and psychological factors. Student response patterns from well-designed micro surveys, along with insights derived from passive-sensing data collected in ways that respect student privacy, can help us design more empathetic and nuanced multidimensional approaches to helping students do better.
The main purpose of digital signal processing (DSP) is to improve the signal-to-noise ratio (SNR) of weak signals using various signal transformation algorithms so that they can be detected and turned into linked events for deeper insights. While highly sophisticated DSP algorithms may be overkill in higher education, some DSP concepts in time-series event linking and pattern learning become useful when we treat student-level time-series touchpoint event data as data points along the student’s academic journey within and across institutions.
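One of the simplest DSP ideas that carries over is smoothing: averaging over a short window suppresses day-to-day noise so that a sustained change in activity stands out. The sketch below applies a moving average to a hypothetical series of daily LMS logins and flags the first window where smoothed activity falls to half of its peak or lower. The data, the window length, and the 50% threshold are illustrative assumptions, not values from the article.

```python
def moving_average(series, window):
    """Return the mean of each full window of `window` consecutive values."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]


def first_drop_index(smoothed, fraction=0.5):
    """Index of the first smoothed value at or below `fraction` of the peak."""
    threshold = fraction * max(smoothed)
    for i, value in enumerate(smoothed):
        if value <= threshold:
            return i
    return None  # activity never dropped that far


# Hypothetical daily LMS login counts for one student.
DAILY_LOGINS = [4, 6, 5, 7, 3, 2, 1, 3, 2, 1]

SMOOTHED = moving_average(DAILY_LOGINS, window=3)
DROP_AT = first_drop_index(SMOOTHED)
```

The flagged window can then be linked to nearby high-value events, such as an exam or a registration deadline, which is where the event-linking step would begin.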
Next, let’s consider data compression for efficient matching. Again, there are two schools of thought. On one hand, we have the lossless CD school of thought: match students in the high-dimensional vector space spanned by the student variables, using the Mahalanobis distance. On the other hand, many follow the work of Rubin and Rosenbaum in building propensity score models so that matching can take place in the one-dimensional propensity score space, where the propensity score is defined as the probability of the student participating in treatment given the student’s variables.
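To make the one-dimensional case concrete, here is a minimal sketch of greedy 1:1 nearest-neighbor matching on propensity scores with a caliper. It assumes the scores have already been estimated by some model; the student IDs, the scores, and the 0.05 caliper are hypothetical, and greedy matching is only one of several matching strategies, not necessarily the one used in our experiments.

```python
def match_on_propensity(treated, control, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    `treated` and `control` map student IDs to propensity scores.
    A treated student stays unmatched if the closest remaining
    control is farther away than `caliper`.
    """
    available = dict(control)  # controls not yet used in a pair
    pairs = []
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        c_id = min(available, key=lambda cid: abs(available[cid] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs


# Hypothetical propensity scores (probability of joining the program).
TREATED = {"t1": 0.30, "t2": 0.62, "t3": 0.90}
CONTROL = {"c1": 0.28, "c2": 0.60, "c3": 0.45, "c4": 0.10}

PAIRS = match_on_propensity(TREATED, CONTROL)
```

Here `t3` is left unmatched because no control falls within the caliper. Dropping such students is the usual price of keeping matched pairs comparable.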
We experimented with several different yet complementary matching algorithms on a number of intervention programs in our evidence-based student success knowledge base. Our finding is that, whichever school of thought one follows, the most important factor is that good modeling practices be followed. For both Mahalanobis distance matching and propensity score matching (PSM), we saw biases in the predicted success rates of matched pilot and control groups. Only once those biases were addressed via good predictive modeling techniques, such as regularization, feature ranking, and well-calibrated, highly accurate models, did we see results that made sense in real-world situations. Following such intervention insights has led to continuously improving ROI.
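One way such bias might be surfaced is a balance check on the predicted success rates of the two matched groups, for example with a standardized mean difference (SMD). The sketch below is only an illustration of that diagnostic; the predicted probabilities are made up, and it is not a description of the exact procedure we used.

```python
from math import sqrt
from statistics import mean, variance


def standardized_mean_difference(pilot, control):
    """Difference in group means divided by the pooled standard deviation."""
    pooled_sd = sqrt((variance(pilot) + variance(control)) / 2)
    return (mean(pilot) - mean(control)) / pooled_sd


# Hypothetical predicted success probabilities for matched students.
PILOT_PRED = [0.70, 0.80, 0.75, 0.65]
CONTROL_PRED = [0.60, 0.70, 0.65, 0.55]

SMD = standardized_mean_difference(PILOT_PRED, CONTROL_PRED)
```

A value far from zero, as in this made-up example, would suggest that the matched groups differed in expected outcomes before any intervention took place, so a naive comparison of their actual outcomes would be biased.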
Given what we have discussed, we will next talk about how to become Sherlock Holmes in investigating reported impact results.