Parameter-Efficient Detoxification with Contrastive Decoding

13 Jan 2024 | Tong Niu, Caiming Xiong, Semih Yavuz*, Yingbo Zhou*
The paper introduces DETOXIGEN, a parameter-efficient framework that detoxifies text generation at inference time. DETOXIGEN pairs a pre-trained language model (the *generator*) with a *detoxifier* that is trained on toxic data to reproduce the undesired style. During generation, the *detoxifier* proposes the tokens it considers likely, and the *generator* is discouraged from producing them. This contrast significantly reduces toxicity while maintaining generation quality. Because the *detoxifier* shares its backbone weights with the *generator*, the framework adds few extra parameters. On the REALTOXICITYPROMPTS benchmark, DETOXIGEN outperforms previous methods on detoxification metrics. The framework is also evaluated with several language models, demonstrating its versatility and efficiency. Key contributions include improved performance over existing detoxification methods, parameter-efficient learning, and the ability to handle multiple attributes.
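The contrast between the two models can be illustrated with a small sketch. The summary above does not state the exact combination rule, so the version below makes an assumption: the detoxifier's next-token probabilities are scaled by a hypothetical strength parameter `alpha`, subtracted from the generator's, clipped at zero, and renormalized. The function names, the toy vocabulary, and the logit values are illustrative only and are not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def contrastive_next_token_probs(gen_logits, detox_logits, alpha=1.0):
    """Down-weight tokens that the detoxifier considers likely.

    Assumed rule (not confirmed by the summary):
        p = normalize(max(p_gen - alpha * p_detox, 0))
    """
    p_gen = softmax(gen_logits)
    p_detox = softmax(detox_logits)
    adjusted = [max(g - alpha * d, 0.0) for g, d in zip(p_gen, p_detox)]
    total = sum(adjusted)
    if total == 0.0:
        # Every token was suppressed: fall back to the generator alone.
        return p_gen
    return [a / total for a in adjusted]


# Toy three-token vocabulary; all values are made up for illustration.
vocab = ["kind", "rude", "neutral"]
gen_logits = [1.0, 1.0, 0.5]     # generator is undecided between "kind" and "rude"
detox_logits = [-2.0, 3.0, -1.0]  # detoxifier strongly prefers "rude"

probs = contrastive_next_token_probs(gen_logits, detox_logits, alpha=0.5)
for token, p in zip(vocab, probs):
    print(f"{token}: {p:.3f}")
```

In this toy case the token favored by the detoxifier loses all of its probability mass, and the remaining mass is redistributed over the other tokens. A larger `alpha` suppresses more aggressively, at the risk of also removing tokens the generator needs for fluent text.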