The supplementary material provides detailed information about the study on balancing preference and performance through adaptive personalized explainability in human-robot interaction. The study consists of two main components: a population study and a personalization study.
1. **Additional Domain Details**:
- Participants completed various surveys before and after the studies, including demographics, negative attitudes towards robots, the mini-IPIP personality survey, and prior experience with driving, robots, and decision trees.
- Post-study, participants completed additional surveys on robot trust, perceived workload, anthropomorphism, social competence, and explainability.
2. **Overall Study Flow**:
- **Population Study**: Participants completed consent forms, pre-surveys, demographic information, and training on a simulator. They then performed three tasks with each xAI modality (language, feature-importance, decision-trees) and provided preference rankings.
- **Personalization Study**: Participants completed consent forms, pre-surveys, demographic surveys, training, and calibration tasks. They were randomly assigned to either the adaptive personalization strategy or a baseline condition and completed three tasks with each strategy, providing preference rankings after each.
3. **Providing Incorrect Suggestions and Explanations**:
- Incorrect suggestions were provided as the opposite of the correct direction, with participants warned about this at the beginning.
- Incorrect explanations included "red-herring" features like weather, radio, sky, traffic, rush hour, or the president’s motorcade.
- Correct explanations focused on the shortest path, optimal route, and relevant details about the goal or obstacles; a sketch of how such suggestion and explanation pairs could be assembled follows below.
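The sketch below illustrates, under assumptions, how an errant suggestion and its explanation could be paired. The direction encoding, helper names, and explanation phrasings are illustrative stand-ins, not the study's actual implementation.

```python
import random

# Hypothetical direction encoding; the study's simulator may use a different one.
OPPOSITE = {"left": "right", "right": "left", "forward": "backward", "backward": "forward"}

# Red-herring features named in the supplementary material.
RED_HERRINGS = ["the weather", "the radio", "the sky", "traffic",
                "rush hour", "the president's motorcade"]

def make_suggestion(correct_direction: str, give_incorrect: bool) -> tuple:
    """Return a (direction, explanation) pair for one navigation step."""
    if give_incorrect:
        # Incorrect suggestion: the opposite of the correct direction,
        # justified with an irrelevant red-herring feature.
        direction = OPPOSITE[correct_direction]
        explanation = f"I chose this because of {random.choice(RED_HERRINGS)}."
    else:
        # Correct suggestion: justified by route-relevant features.
        direction = correct_direction
        explanation = "This is the shortest path toward the goal."
    return direction, explanation

print(make_suggestion("left", give_incorrect=True))
```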
4. **Task Orderings**:
- In the population study, participants completed nine navigation tasks, rotating between explanation modalities (a sketch of one such rotation scheme appears after this list).
- In the personalization study, participants completed six test tasks, with a balanced order of explanation selection strategies.
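A minimal sketch of a rotated, Latin-square-style ordering for the population study's nine tasks is below; the specific rotation rule is an assumption for illustration and may differ from the orderings actually used.

```python
# The three xAI modalities described in the study.
MODALITIES = ["language", "feature-importance", "decision-tree"]

def task_ordering(participant_index: int, n_tasks: int = 9) -> list:
    """Assign one explanation modality per task, rotating the starting
    modality across participants to balance order effects."""
    offset = participant_index % len(MODALITIES)
    return [MODALITIES[(offset + t) % len(MODALITIES)] for t in range(n_tasks)]

# Each participant sees every modality three times, in a rotated order.
for p in range(3):
    print(p, task_ordering(p))
```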
5. **Statistical Analyses**:
- ANOVA and Friedman’s tests were used to analyze the effects of the different conditions on the measured metrics; an illustrative invocation of both tests is sketched below.
- Significant differences were found in explanation modality rankings, inappropriate compliance, consecutive mistakes, and consideration time.
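For concreteness, the snippet below shows how the two omnibus tests named above can be run with SciPy; the arrays are random placeholders standing in for per-participant measurements, not the study's data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Placeholder per-participant scores for the three explanation modalities.
language = rng.normal(2.0, 0.5, size=30)
feature_importance = rng.normal(2.1, 0.5, size=30)
decision_tree = rng.normal(1.9, 0.5, size=30)

# One-way ANOVA across the three modalities.
f_stat, p_anova = stats.f_oneway(language, feature_importance, decision_tree)

# Friedman's test for repeated (within-participant) measures such as rankings.
chi2, p_friedman = stats.friedmanchisquare(language, feature_importance, decision_tree)

print(f"ANOVA: F={f_stat:.2f}, p={p_anova:.3f}")
print(f"Friedman: chi2={chi2:.2f}, p={p_friedman:.3f}")
```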
6. **Additional Results**:
- Full pairwise comparisons between conditions were presented, showing significant differences in preference rankings, inappropriate compliance, and steps above optimal; one possible way to run such post-hoc pairwise tests is sketched below.
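The sketch below shows one way full pairwise comparisons could be carried out as post-hoc tests; the choice of Wilcoxon signed-rank tests with a Bonferroni correction, along with the condition names and data, are assumptions for illustration, not necessarily the procedure used in the paper.

```python
from itertools import combinations

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Placeholder paired scores for three hypothetical conditions.
conditions = {
    "adaptive": rng.normal(1.5, 0.4, size=24),
    "baseline-A": rng.normal(2.0, 0.4, size=24),
    "baseline-B": rng.normal(2.5, 0.4, size=24),
}

pairs = list(combinations(conditions, 2))
alpha = 0.05 / len(pairs)  # Bonferroni-adjusted significance threshold

for a, b in pairs:
    stat, p = stats.wilcoxon(conditions[a], conditions[b])
    print(f"{a} vs {b}: W={stat:.1f}, p={p:.4f}, significant={p < alpha}")
```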
7. **Participant Briefing**:
- Participants were informed about the study, the role of the self-driving car, the different types of explanations, and the importance of following the correct directions.
The study aimed to understand how users can identify errant decisions from a digital assistant and how adaptive personalization can improve user experience.