This paper introduces Temporal Collaborative Attention (TCOAT), a novel data-driven method for wind power forecasting. TCOAT captures both temporal and spatial dependencies in wind power generation data, as well as long-term and short-term patterns. It uses attention mechanisms to dynamically adjust the weight of each input variable and time step according to its contextual relevance for forecasting. Collaborative attention units assimilate directional and global information from the input data, while self-attention and cross-attention mechanisms explicitly model the interactions and correlations among different variables or time steps. To integrate long-term and short-term information, TCOAT incorporates a temporal fusion layer that applies concatenation and mapping operations together with hierarchical feature extraction and aggregation. The method is validated through extensive experiments on a real-world wind power generation dataset from Greece and compared against twenty-two state-of-the-art methods. The results show that TCOAT outperforms existing methods in both accuracy and robustness. A generality study on an additional real-world dataset with different climate conditions and wind power characteristics shows that TCOAT achieves comparable or better performance than the state-of-the-art methods, supporting its generalization ability. The model consists of four main components: a temporal encoder, a spatial encoder, a collaborative attention module, and a transformer decoder. It handles both sequential and spatial data, captures long-term and short-term dependencies, provides attention maps, and achieves state-of-the-art performance.
TCOAT is an end-to-end model that can learn directly from raw wind power data without any preprocessing or post-processing steps.
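
The attention and fusion ideas described above can be illustrated with a minimal NumPy sketch. It is not the TCOAT implementation: the function names (`attention`, `fuse`), the window lengths, the feature size, and the random mapping matrix `w_map` are all assumptions chosen for illustration. The sketch only shows how a short-term window might attend to itself (self-attention) and to a longer history (cross-attention), after which the two results are concatenated and passed through a linear mapping.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the maximum along the axis for numerical stability.
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(query, key, value):
    """Scaled dot-product attention; returns (output, weights)."""
    d = query.shape[-1]
    scores = query @ key.T / np.sqrt(d)
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ value, weights

def fuse(long_term, short_term, w_map):
    """Hypothetical fusion step: attend, concatenate, then map linearly."""
    # Cross-attention: short-term steps query the long-term history.
    context, weights = attention(short_term, long_term, long_term)
    # Self-attention within the short-term window.
    local, _ = attention(short_term, short_term, short_term)
    # Concatenate along the feature axis, then project back to d features.
    fused = np.concatenate([local, context], axis=-1)
    return fused @ w_map, weights

# Illustrative sizes only (assumed, not taken from the paper).
rng = np.random.default_rng(0)
d = 8
long_term = rng.normal(size=(48, d))   # 48 historical time steps
short_term = rng.normal(size=(6, d))   # 6 recent time steps
w_map = rng.normal(size=(2 * d, d)) / np.sqrt(2 * d)

fused, weights = fuse(long_term, short_term, w_map)
print(fused.shape, weights.shape)
```

The returned `weights` matrix has one row per short-term step and one column per long-term step, which is the kind of attention map that can be inspected to see which historical steps influence a forecast.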