11 Apr 2024 | Dawn Lawrie,† Sean MacAvaney,‡ James Mayfield,† Paul McNamee,† Douglas W. Oard,†‡ Luca Soldaini,∗ Eugene Yang†
The TREC NeuCLIR track, in its second year, focuses on studying the impact of neural approaches in cross-language information retrieval (CLIR). The track includes four main tasks: ranked retrieval of news in Chinese, Persian, and Russian using English topics, a multilingual information retrieval (MLIR) task, a pilot technical documents CLIR task, and a monolingual retrieval task. The news collections remain the same as in 2022, but new topics were developed to optimize evaluation in the MLIR task. The track also introduced a new technical document CLIR pilot task, which aims to search Chinese dissertation abstracts using English topics. The results show that CLIR systems outperformed monolingual retrieval systems, and the GPT-4 model was effective in reranking documents. The technical documents task highlights the challenges of handling technical vocabulary and the need for specialized assessor expertise. The track's future directions include expanding the technical document task to a full task, introducing a pilot for automatic cross-language report generation, and delaying the submission deadline to encourage more participation.