This study presents a comparative benchmark analysis of six in situ gene expression profiling methods, including both commercially available and academically developed technologies, using publicly accessible mouse brain datasets. The authors find that standard sensitivity metrics, such as the number of unique molecules detected per cell, are not directly comparable across datasets due to significant differences in off-target molecular artifacts affecting specificity. To address these challenges, they explore various sources of molecular artifacts, develop novel metrics to control for them, and use these metrics to evaluate and compare different in situ technologies. The study demonstrates that molecular false positives can seriously confound spatially-aware differential expression analysis, emphasizing the need for caution in interpreting downstream results. The analysis provides guidance for selecting, processing, and interpreting in situ spatial technologies, highlighting the importance of considering both sensitivity and specificity. The results suggest that Vizgen's MERSCOPE dataset exhibits the best performance, with optimal trade-offs between sensitivity and specificity, followed by MERFISH and Molecular Cartography. The study also discusses the limitations of the analysis and the need for improved segmentation methods to address non-specific signal issues in imaging-based spatial transcriptomic datasets.