Doubly Calibrated Estimator for Recommendation on Data Missing Not At Random

May 13–17, 2024 | Wonbin Kweon, Hwanjo Yu
This paper proposes the Doubly Calibrated Estimator (DCE) for recommender systems trained on data that is missing not at random (MNAR). Traditional doubly robust (DR) estimators rely on imputed errors and propensity scores, both of which may be miscalibrated, leading to bias and variance in the resulting estimates. The authors argue that existing DR estimators are limited because they depend on rudimentary models for imputation and propensity estimation, which tend to produce overconfident predictions.

To address this, DCE introduces calibration experts that account for the different logit distributions across users, yielding better-calibrated imputation and propensity models. The method also proposes a tri-level joint learning framework that optimizes the calibration experts together with the prediction and imputation models. Calibrating both the imputed errors and the propensity scores leads to more accurate and unbiased recommendations.

DCE is validated on real-world datasets and outperforms existing methods, improving unbiased recommendation performance by up to 8.37% in MSE on the Coat dataset. It reduces both the bias and the variance of DR estimators and requires no unbiased data for training. The paper also provides theoretical insights into how miscalibrated imputation and propensity models limit the effectiveness of DR estimators, together with an analysis showing that calibrated imputed errors and propensity scores tighten the upper bounds on the bias and variance of DR estimators. The proposed approach is orthogonal to existing DR estimators and can be seamlessly integrated with them, and it outperforms other debiasing methods in settings where the data is missing not at random.
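As background for why miscalibration matters, the standard DR estimator that this line of work builds on averages the imputed error plus an inverse-propensity-weighted correction on observed entries. The sketch below is a minimal NumPy illustration under our own variable names (it is not the authors' code); it shows that both the imputed errors and the propensity scores enter the estimate directly, so miscalibration in either one distorts the result.

```python
import numpy as np

def doubly_robust_estimate(pred_error, imputed_error, observed, propensity):
    """Doubly robust estimate of the prediction loss over all user-item pairs.

    pred_error    : e_ui, the prediction error (only meaningful where observed == 1)
    imputed_error : e_hat_ui, the error produced by the imputation model
    observed      : o_ui in {0, 1}, indicator that the rating is observed
    propensity    : p_hat_ui, estimated probability that the rating is observed
    """
    # Inverse-propensity-weighted correction, applied only to observed entries.
    correction = observed * (pred_error - imputed_error) / np.clip(propensity, 1e-6, 1.0)
    # Average of imputed error plus correction over every user-item pair.
    return float(np.mean(imputed_error + correction))

# Tiny synthetic usage example (all numbers are made up for illustration).
rng = np.random.default_rng(0)
n = 10_000
propensity = rng.uniform(0.05, 0.9, n)
observed = (rng.uniform(size=n) < propensity).astype(float)
pred_error = rng.normal(1.0, 0.3, n)                  # hypothetical per-pair errors
imputed_error = pred_error + rng.normal(0.0, 0.2, n)  # noisy imputation of those errors
print(doubly_robust_estimate(pred_error, imputed_error, observed, propensity))
```

If the imputed errors are systematically overconfident or the propensity scores are poorly calibrated, the correction term no longer cancels the imputation error on average, which is exactly the failure mode the calibration experts in DCE are meant to address.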
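The summary does not spell out the exact form of the calibration experts, but one plausible reading is a small set of temperature-scaling experts combined by a per-user gate over their logit statistics. The sketch below is an illustrative assumption along those lines (the names, shapes, and gating scheme are ours, not the paper's), intended only to show how user-dependent calibration of a propensity logit could work.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def calibrated_propensity(logit, user_feat, temps, gate_w, gate_b):
    """Illustrative mixture of temperature-scaling calibration experts.

    logit     : raw propensity logit for one user-item pair (scalar)
    user_feat : per-user features used to weight the experts, shape (d,)
    temps     : learned temperature of each expert, shape (K,)
    gate_w    : gate weights, shape (d, K); gate_b : gate bias, shape (K,)
    """
    gate = softmax(user_feat @ gate_w + gate_b)           # (K,) expert weights for this user
    expert_probs = 1.0 / (1.0 + np.exp(-logit / temps))   # (K,) sigmoid at each temperature
    return float(gate @ expert_probs)                     # convex combination = calibrated p_hat_ui
```

In the paper, the calibration experts are trained jointly with the prediction and imputation models through the tri-level learning framework; in a sketch like this, they would simply be additional learnable parameters (temps, gate_w, gate_b) updated alongside the base models.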