CAIMS Code of Conduct
“All participants are expected to comply with the CAIMS-SCMAI Code of Conduct”
Continuous Semi-Supervised Nonnegative Matrix Factorization
Nonnegative matrix factorization (NMF) is an unsupervised method to detect topics within a corpus. It amounts to an approximation of a nonnegative matrix as the product of two nonnegative matrices of lower rank. In certain applications, it is desirable to extract topics and use them to predict quantitative outcomes. In this talk, we show how NMF can be combined with regression on a continuous response variable through a weighted penalty function. We test our method on synthetic data and on real data coming from Rate My Professors reviews to predict an instructor’s rating from their comments. When used as a dimensionality reduction method, the method performs better than doing regression after topics are identified and it retrains interpretability.