IndoCulture is a novel dataset designed to evaluate cultural commonsense reasoning across eleven Indonesian provinces. It was constructed manually by local residents from a set of predefined topics, which helps ensure cultural relevance and geographical accuracy. The dataset contains 2,429 instances covering 12 cultural topics, including food, weddings, family relationships, pregnancy, and religious holidays. The study evaluates 23 language models, both open-source and closed-source, on their ability to reason about these cultural contexts. Even the best open-source model struggles, reaching only 53.2% accuracy. The closed-source models GPT-3.5 and GPT-4 perform better, achieving 61.7% and 75.8% accuracy, respectively. Including location context significantly improves performance, especially for larger models such as GPT-4. The study also highlights the importance of geographical context in cultural reasoning and the challenge of cultural bias in language models. Overall, IndoCulture provides an evaluation of cultural commonsense reasoning in Indonesian contexts and underscores the need for diverse, culturally informed datasets in language model development.