June 16-17, 2016 | Saif M. Mohammad, Svetlana Kiritchenko, Parinaz Sobhani, Xiaodan Zhu, Colin Cherry
The paper presents a shared task on detecting stance from tweets: given a tweet and a target entity, the goal is to determine whether the tweeter is in favor of the target, against it, or neutral towards it. The task has two parts. Task A is a supervised classification task that uses 70% of the annotated data for training. Task B is a weakly supervised task on a new target for which no training data is provided. The highest classification F-score obtained was 67.82 for Task A and 56.28 for Task B. Systems found it especially challenging to infer stance towards the target when the tweet expresses an opinion about a different entity. The paper describes the dataset creation, the annotation process, and the evaluation metrics, and it highlights nuances of stance detection such as neutral stance and the relationship between stance and sentiment. It also gives an overview of the systems and results for both tasks, noting that most teams used standard text classification features and word embeddings, while some employed deep neural networks. The paper concludes with a discussion of related work and future research directions.
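To make the reported F-scores more concrete, the sketch below shows one way such a score could be computed. It assumes the score is the mean of the per-class F1 for the favor and against classes, scaled to 0-100; the summary above does not spell out the metric, so this averaging scheme, the label strings `FAVOR`, `AGAINST`, and `NONE`, and the function names are all illustrative assumptions rather than the official scorer.

```python
def f1_for_label(gold, pred, label):
    """F1 for a single class, computed from true/false positives and false negatives."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    if tp == 0:
        # No correct predictions for this class: precision or recall is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def average_f_score(gold, pred, labels=("FAVOR", "AGAINST")):
    """Mean of the per-class F1 over `labels`, scaled to 0-100.

    Assumption: only the favor and against classes are averaged. Tweets
    labeled NONE still affect the score, because predicting FAVOR or
    AGAINST for them counts as a false positive.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    return 100.0 * sum(f1_for_label(gold, pred, lab) for lab in labels) / len(labels)


if __name__ == "__main__":
    # Toy labels, invented purely for illustration.
    gold = ["FAVOR", "FAVOR", "AGAINST", "AGAINST", "NONE"]
    pred = ["FAVOR", "AGAINST", "AGAINST", "AGAINST", "FAVOR"]
    print(f"{average_f_score(gold, pred):.2f}")
```

Under this assumed scheme, a system that ignores the neutral class entirely is still penalized, since every neutral tweet it labels as favor or against lowers the precision of that class.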