Jul 2024 | Xiaohan Peng, Janin Koch, Wendy E Mackay
**DesignPrompt: Using Multimodal Interaction for Design Exploration with Generative AI**
**Authors:** Xiaohan Peng, Janin Koch, Wendy E Mackay
**Abstract:**
Visually oriented designers often struggle to write effective generative AI (GenAI) prompts. A preliminary study identified problems in composing and fine-tuning prompts, as well as a need to translate intentions accurately into rich input. To address these challenges, the authors developed *DesignPrompt*, a moodboard tool that lets designers combine multiple modalities (images, colors, and text) into a single GenAI prompt and refine the results. A comparative structured observation study with 12 professional designers examined intent expression, expectation alignment, and perceived transparency when using *DesignPrompt* versus text-input GenAI. Multimodal prompt input encouraged designers to explore and express themselves more effectively. Designers' preferences varied with their sense of control over the GenAI and with whether they sought inspiration or specific images. Designers also developed innovative uses of *DesignPrompt*, including elaborate multimodal prompts and a pattern that maximizes novelty while ensuring consistency.
**Key Contributions:**
1. A preliminary study that investigates how general audiences use GenAI applications for moodboarding and identifies four design implications.
2. *DesignPrompt*, a GenAI-powered moodboard tool with designer-centered multimodal input design.
3. Insights from a study with 12 professional designers.
**Design Goal and Research Questions:**
- Does using multimodal input to create GenAI prompts allow designers to explore and express their intents better?
- Does revealing system interpretation of user prompts help users produce results that are more aligned with their expectations?
- Does interactive and controllable GenAI input let users perceive the system as more transparent and useful for design practice?
**Design and Implementation:**
*DesignPrompt* is a digital moodboard system built as a web application with Vue 3 and Express.js. It includes a search engine, a moodboard canvas, a color palette, a semantic and color metadata display, an AI tool panel, an interactive prompt editor, and a GenAI history feature. The system supports different levels of image abstraction and semantics, helps users translate abstract intentions into richer prompts, helps them identify the impact of prompts, and lets them control and manipulate images in an engaging way.
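To make the idea of combining images, colors, and text into a single prompt more concrete, here is a minimal TypeScript sketch. It is not taken from the DesignPrompt code base: the `PromptPart` type, the `weight` field, and the `describe` and `composePrompt` functions are all hypothetical names, and the assumption that each modality is first reduced to a short text description is ours, not the authors'.

```typescript
// Hypothetical data model: one entry per item the designer adds to the prompt.
// None of these names come from the DesignPrompt source.
type PromptPart =
  | { kind: "text"; value: string; weight: number }
  | { kind: "color"; hex: string; name: string; weight: number }
  | { kind: "image"; caption: string; weight: number };

// Turn a single part into a text fragment (assumed representation).
function describe(part: PromptPart): string {
  switch (part.kind) {
    case "text":
      return part.value;
    case "color":
      return `${part.name} (${part.hex}) color palette`;
    case "image":
      return `in the style of "${part.caption}"`;
  }
}

// Drop parts the designer has switched off (weight 0), put the most
// heavily weighted parts first, and join everything into one prompt string.
function composePrompt(parts: PromptPart[]): string {
  return parts
    .filter((part) => part.weight > 0)
    .sort((a, b) => b.weight - a.weight)
    .map(describe)
    .join(", ");
}

const prompt = composePrompt([
  { kind: "image", caption: "sunlit loft interior", weight: 0.8 },
  { kind: "text", value: "cozy reading nook", weight: 1 },
  { kind: "color", hex: "#1E90FF", name: "blue", weight: 0.5 },
]);
console.log(prompt);
```

Keeping each modality as a separate, weighted entry is one way an interactive prompt editor could let a designer adjust or remove a single influence and regenerate, rather than rewriting a whole text prompt.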
**Study Methodology:**
A comparative structured observation study compared text-based and multimodal prompt strategies. Participants performed multiple tasks with each design variant and gave feedback through questionnaires and interviews.
**Results:**
- **Questionnaire Results:** Both prompt strategies improved system expressivity and users' understanding. Participants preferred the multimodal condition for exploration and expressivity, and the text condition for ease of steering and for generating meaningful images.
- **Multimodal Input:** Most designers enjoyed the multimodal input feature