Ten Words Only Still Help: Improving Black-Box AI-Generated Text Detection via Proxy-Guided Efficient Re-Sampling

14 Feb 2024 | Yuhui Shi, Qiang Sheng, Juan Cao, Hao Mi, Beizhe Hu, Danding Wang
This paper proposes POGER, a proxy-guided efficient re-sampling method for black-box AI-generated text (AIGT) detection. POGER estimates word generation probabilities by re-sampling multiple times, which improves AIGT detection in black-box settings. To reduce sampling costs while maintaining detection performance, it re-samples only a small subset of representative words (e.g., 10 words). Proxy models guide the selection toward words with low probabilities and low estimation errors, and contextual feature compensation enhances detection performance. In experiments on datasets containing texts from humans and seven LLMs, POGER outperforms existing baselines in macro F1 under black-box, partial white-box, and out-of-distribution settings, covering both binary and multiclass AIGT detection, while keeping re-sampling costs lower than those of existing methods.
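
The two ideas in the summary, estimating a word's generation probability by repeated re-sampling and restricting re-sampling to a few proxy-selected words, can be illustrated with a minimal sketch. This is not the authors' implementation. The names `estimate_word_prob`, `select_positions`, and `toy_sampler` are hypothetical, the sampler is a stand-in for a black-box LLM call, and the selection uses only the low-proxy-probability criterion because the summary does not define how estimation error is computed.

```python
import random


def estimate_word_prob(sample_next_word, prefix, target, n_samples=100):
    """Monte Carlo estimate of P(target | prefix) from a black-box sampler.

    `sample_next_word` is a hypothetical callable standing in for one
    black-box generation call that returns a single next word.
    """
    hits = sum(sample_next_word(prefix) == target for _ in range(n_samples))
    return hits / n_samples


def select_positions(proxy_probs, k=10):
    """Return the k word positions with the lowest proxy-model probability.

    Assumption: only the low-probability criterion is applied here; the
    low-estimation-error criterion mentioned in the summary is omitted.
    """
    ranked = sorted(range(len(proxy_probs)), key=lambda i: proxy_probs[i])
    return sorted(ranked[:k])


# Toy stand-in for a black-box model: a fixed next-word distribution.
_rng = random.Random(0)


def toy_sampler(prefix):
    return _rng.choices(["cat", "dog", "bird"], weights=[0.6, 0.3, 0.1])[0]


if __name__ == "__main__":
    # Pick the 2 least likely positions according to made-up proxy scores.
    print(select_positions([0.9, 0.1, 0.5, 0.05], k=2))  # [1, 3]
    # Estimate P("cat" | "The") from 200 re-samples of the toy sampler.
    print(estimate_word_prob(toy_sampler, "The", "cat", n_samples=200))
```

The cost trade-off is visible in the call counts: re-sampling every word of a text needs `n_samples` calls per word, while re-sampling only `k` selected positions caps the total at `k * n_samples` calls regardless of text length.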