This article introduces Multimodal Image Semantic Compression (MISC), an image compression method that uses Large Multimodal Models (LMMs) to reach ultra-low bitrates while preserving both consistency with the original image and perceptual quality. MISC has four components: an LMM encoder that extracts semantic information from the image, a map encoder that locates the regions corresponding to that information, an image encoder that performs extreme compression, and a decoder that reconstructs the image from these outputs. The method is designed for both traditional Natural Sense Images (NSIs) and emerging AI-Generated Images (AIGIs), and it is reported to achieve the best consistency and perception results while reducing the bitrate by 50%. The work also contributes a high-quality AIGI database for evaluating compression performance. By incorporating semantic information and text descriptions, MISC balances consistency and perception and can reconstruct high-quality images even at very low bitrates. Experiments across different bitrates and image types show that MISC outperforms existing compression algorithms on both criteria at ultra-low bitrates, particularly for AIGIs, which highlights the potential of LMMs in image compression.
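To make the data flow between the four components concrete, the following is a minimal sketch in Python, not the authors' implementation. Every name in it (`MiscPayload`, `lmm_encode`, `map_encode`, `image_encode`, `decode`, `bits_per_pixel`) is hypothetical. The LMM encoder is replaced by a fixed caption, the map encoder by a thresholded block mask, the image encoder by block averaging, and the decoder by nearest-neighbor upsampling; the block size is assumed to be shared side information that is not counted in the bitrate.

```python
from dataclasses import dataclass


@dataclass
class MiscPayload:
    """Hypothetical container for the three streams described in the summary."""
    text: bytes        # semantic description (LMM encoder)
    region_map: bytes  # coarse region localization (map encoder)
    image: bytes       # heavily compressed image (image encoder)

    def total_bits(self) -> int:
        return 8 * (len(self.text) + len(self.region_map) + len(self.image))


def block_means(pixels, block):
    """Average each block x block tile of a grayscale image (list of rows)."""
    h, w = len(pixels), len(pixels[0])
    means = []
    for top in range(0, h, block):
        row = []
        for left in range(0, w, block):
            cells = [pixels[y][x]
                     for y in range(top, min(top + block, h))
                     for x in range(left, min(left + block, w))]
            row.append(sum(cells) // len(cells))
        means.append(row)
    return means


def pack_bits(bits):
    """Pack a list of 0/1 values into bytes, most significant bit first."""
    out = bytearray((len(bits) + 7) // 8)
    for i, bit in enumerate(bits):
        if bit:
            out[i // 8] |= 1 << (7 - i % 8)
    return bytes(out)


def lmm_encode(pixels):
    # Stand-in: a real system would query a large multimodal model here.
    return "a bright square on a dark background".encode("utf-8")


def map_encode(pixels, block, threshold=128):
    # Stand-in: one bit per tile, set when the tile is bright.
    bits = [int(m >= threshold) for row in block_means(pixels, block) for m in row]
    return pack_bits(bits)


def image_encode(pixels, block):
    # Stand-in for extreme compression: one byte (the mean) per tile.
    return bytes(m for row in block_means(pixels, block) for m in row)


def decode(payload, height, width, block):
    # Stand-in: nearest-neighbor upsampling of the tile means. The text and
    # region map are unused here; a real decoder would condition on them.
    cols = (width + block - 1) // block
    means = list(payload.image)
    return [[means[(y // block) * cols + (x // block)] for x in range(width)]
            for y in range(height)]


def bits_per_pixel(payload, height, width):
    return payload.total_bits() / (height * width)


# Toy 32x32 image: a 16x16 bright square centered on a dark background.
SIZE, BLOCK = 32, 8
pixels = [[255 if 8 <= y < 24 and 8 <= x < 24 else 0 for x in range(SIZE)]
          for y in range(SIZE)]

payload = MiscPayload(
    text=lmm_encode(pixels),
    region_map=map_encode(pixels, BLOCK),
    image=image_encode(pixels, BLOCK),
)
recon = decode(payload, SIZE, SIZE, BLOCK)
bpp = bits_per_pixel(payload, SIZE, SIZE)
print(f"text={len(payload.text)}B map={len(payload.region_map)}B "
      f"image={len(payload.image)}B bpp={bpp:.4f}")
```

The point of the sketch is the accounting: the bitrate is the sum of the text, map, and image streams divided by the pixel count, so at ultra-low bitrates the short text description can be a meaningful share of the total budget. The toy image is tile-aligned, so the stand-in decoder happens to recover it exactly; that would not hold for real images.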