The paper introduces V2X-Real, a large-scale real-world dataset designed for Vehicle-to-Everything (V2X) cooperative perception research. The dataset comprises 33K LiDAR frames, 171K camera images, and over 1.2 million annotated 3D bounding boxes spanning 10 object categories in challenging urban scenarios. It was collected using two connected automated vehicles and two smart infrastructure units, each equipped with multi-modal sensors. Based on collaboration mode and ego perspective, the dataset is divided into four sub-datasets: V2X-Real-VC (Vehicle-Centric), V2X-Real-IC (Infrastructure-Centric), V2X-Real-V2V (Vehicle-to-Vehicle), and V2X-Real-I2I (Infrastructure-to-Infrastructure). The paper provides comprehensive benchmarks for multi-class multi-agent V2X cooperative perception and details the data acquisition, annotation, and processing pipeline. The benchmark results demonstrate that V2X collaboration enhances perception capabilities, with intermediate (feature-level) fusion methods performing best. The dataset and benchmark code will be released to facilitate future research in V2X cooperative perception.
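To make the fusion terminology concrete, the sketch below illustrates what "intermediate fusion" means in cooperative perception: agents exchange intermediate BEV feature maps rather than raw point clouds (early fusion) or final detection boxes (late fusion), and the ego agent merges them before the detection head. This is a minimal illustrative sketch, not the paper's actual fusion operator; the function name, the max-pooling fusion rule, and the tensor shapes are assumptions for demonstration only.

```python
import numpy as np

def intermediate_fusion(bev_features: list[np.ndarray]) -> np.ndarray:
    """Fuse per-agent BEV feature maps element-wise (here: max pooling).

    Each array has shape (C, H, W) and is assumed to be already warped
    into the ego coordinate frame. Max fusion is one common choice in
    the literature; attention-based fusion is another.
    """
    stacked = np.stack(bev_features, axis=0)  # (N_agents, C, H, W)
    return stacked.max(axis=0)                # (C, H, W)

# Hypothetical usage: the ego vehicle, one connected vehicle, and one
# roadside unit each contribute a BEV feature map of identical shape.
rng = np.random.default_rng(0)
ego_feat = rng.standard_normal((64, 128, 128)).astype(np.float32)
cav_feat = rng.standard_normal((64, 128, 128)).astype(np.float32)
rsu_feat = rng.standard_normal((64, 128, 128)).astype(np.float32)

fused = intermediate_fusion([ego_feat, cav_feat, rsu_feat])
print(fused.shape)  # (64, 128, 128) -> fed to the detection head
```

The appeal of this design, and a plausible reason intermediate fusion leads the benchmarks, is its bandwidth/accuracy trade-off: compressed feature maps are far smaller than raw LiDAR sweeps, yet retain richer scene context than post-hoc box merging.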