CTRL: A Conditional Transformer Language Model for Controllable Generation


20 Sep 2019 | Nitish Shirish Keskar, Bryan McCann, Lav R. Varshney, Caiming Xiong, Richard Socher
CTRL is a conditional transformer language model with 1.63 billion parameters designed for controllable text generation. It conditions generation on control codes derived from the structure of raw text, giving explicit control over style, content, and task-specific behavior. Because the codes are attached to the training data, they can also be used in reverse: given a sequence, the model can predict which parts of the training data are most likely to have produced it.

CTRL is trained on a diverse collection of text, including Wikipedia, Project Gutenberg, Amazon Reviews, and several question-answering datasets. Prompted with the appropriate control codes, it can generate text conditioned on domain, subdomain, style, topic, date, entities, relationships between entities, and task-specific behavior such as question answering and machine translation.

Text is tokenized with BPE using a vocabulary of roughly 250,000 tokens. The model is implemented in TensorFlow and trained on a Cloud TPU v3 Pod with a global batch size of 1,024 for 800,000 iterations.

Beyond generation, the correlation between a sequence and the training-data domains enables source attribution: ranking control codes by how likely they make a given text (sketches of both uses follow below). CTRL is released as a public resource to encourage further research into controllable text generation and natural language understanding; its design balances controllability and generalization so it can be applied across a range of NLP tasks. The release also includes a code of conduct addressing ethical considerations around the use of large language models.
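As a concrete illustration of how control codes steer generation, the sketch below prepends a control code to a prompt and samples a continuation. It is a minimal sketch assuming the Hugging Face transformers port of CTRL rather than the original TensorFlow release, and the specific control code ("Reviews"), prompt, and sampling settings are illustrative choices, not values prescribed by the summary above.

```python
# Minimal sketch: controllable generation by prepending a control code.
# Assumes the Hugging Face `transformers` port of CTRL (not the original
# TensorFlow release); "Reviews" is used here as an example domain code.
from transformers import CTRLTokenizer, CTRLLMHeadModel

tokenizer = CTRLTokenizer.from_pretrained("Salesforce/ctrl")
model = CTRLLMHeadModel.from_pretrained("Salesforce/ctrl")

# The control code is simply the first token of the prompt; everything after
# it is ordinary text for the model to continue.
prompt = "Reviews This laptop exceeded my expectations"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

output = model.generate(
    input_ids,
    max_length=80,
    do_sample=True,
    temperature=0.7,
    repetition_penalty=1.2,  # penalized sampling, commonly used with CTRL
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Swapping the leading control code (for example "Wikipedia" instead of "Reviews") while keeping the rest of the prompt fixed is what shifts the domain and style of the continuation.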
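The summary also mentions source attribution: ranking training-data domains by how likely each control code makes a given text. The sketch below, again assuming the Hugging Face transformers port, scores a query sentence under each candidate control code using the model's per-token loss and ranks the codes from best to worst fit. The candidate code list and the uniform-prior assumption are illustrative, not taken from the release.

```python
# Minimal sketch of source attribution: rank candidate control codes by how
# probable they make a query text. Assumes the Hugging Face `transformers`
# port of CTRL; the candidate codes below are an assumed, illustrative subset.
import torch
from transformers import CTRLTokenizer, CTRLLMHeadModel

tokenizer = CTRLTokenizer.from_pretrained("Salesforce/ctrl")
model = CTRLLMHeadModel.from_pretrained("Salesforce/ctrl")
model.eval()

query = "The court ruled that the statute was unconstitutional."
candidate_codes = ["Wikipedia", "Reviews", "Books", "News"]  # assumed subset

scores = {}
with torch.no_grad():
    for code in candidate_codes:
        # Condition on the control code by prepending it, then measure the
        # sequence's average negative log-likelihood under the model.
        ids = tokenizer(f"{code} {query}", return_tensors="pt").input_ids
        loss = model(ids, labels=ids).loss  # mean NLL per token
        scores[code] = -loss.item()         # higher means a better fit

# With a uniform prior over codes, ranking by p(text | code) is equivalent
# to ranking by p(code | text).
for code, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{code:>10}: {score:.3f}")
```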