The study examines the distribution and abundance of microsatellites, also known as simple sequence repeats (SSRs), in several eukaryotic taxonomic groups: primates, rodents, other mammals, nonmammalian vertebrates, arthropods, *Caenorhabditis elegans*, plants, yeast, and other fungi. The analysis compares exons, introns, and intergenic regions, grouping SSRs by the length of the repeated unit. Key findings include:
1. **Tri- and Hexanucleotide Repeats**: These are prevalent in protein-coding exons across all taxa, while their abundance varies in intergenic regions and introns.
2. **Taxon-Specific Variations**: The distribution of SSRs shows significant differences among taxa, with specific motifs being more common in certain groups.
3. **CCG Trinucleotide Repeats**: (CCG)n•(CGG)n trinucleotide repeats are strikingly abundant in intergenic regions of all vertebrates but absent from their introns.
4. **Dinucleotide Repeats**: Dinucleotide repeats are most abundant in rodents and least frequent in fungi. They are more common in intergenic regions and introns than in exons.
5. **Tetranucleotide Repeats**: Tetranucleotide repeats are more frequent in vertebrate introns and intergenic regions than in exons.
6. **Pentanucleotide Repeats**: Pentanucleotide repeats are more abundant in nonvertebrate taxa and in introns and intergenic regions of mammals.
7. **Hexanucleotide Repeats**: Hexanucleotide repeats are the second most common type after trinucleotide repeats in exons and are more frequent in intergenic regions and introns of nonvertebrates.
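
The repeat classes listed above can be made concrete with a small sketch. The Python below is an illustration only, not the study's pipeline: the function names (`find_ssrs`, `canonical_motif`, `tally_by_unit_length`) and the 12 bp minimum tract length are assumptions chosen for the example, and only perfect (uninterrupted) repeats are detected. It also shows why (CCG)n and (CGG)n are written as a pair: they are the same repeat read from opposite DNA strands.

```python
from collections import Counter

# Names for repeat-unit lengths 1..6 (mono- through hexanucleotide).
UNIT_NAMES = {1: "mono", 2: "di", 3: "tri", 4: "tetra", 5: "penta", 6: "hexa"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def is_primitive(motif):
    """True if motif is not itself a repetition of a shorter unit (e.g. 'CACA')."""
    return motif not in (motif + motif)[1:-1]


def canonical_motif(motif):
    """Smallest string among all rotations of the motif and of its reverse
    complement, so that strand and phase do not split one repeat class."""
    revcomp = motif.translate(_COMPLEMENT)[::-1]
    rotations = [m[i:] + m[:i] for m in (motif, revcomp) for i in range(len(m))]
    return min(rotations)


def find_ssrs(seq, min_total_len=12, max_unit=6):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats.

    min_total_len is a hypothetical threshold for this sketch; the summary
    above does not state the cutoffs used in the study.
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_unit + 1):
        i = 0
        while i + k <= len(seq):
            motif = seq[i:i + k]
            # Skip ambiguous bases and motifs that belong to a shorter unit class.
            if not set(motif) <= set("ACGT") or not is_primitive(motif):
                i += 1
                continue
            copies = 1
            while seq[i + copies * k:i + (copies + 1) * k] == motif:
                copies += 1
            if copies * k >= min_total_len:
                hits.append((i, motif, copies))
                i += copies * k  # jump past the tract just recorded
            else:
                i += 1
    return sorted(hits)


def tally_by_unit_length(hits):
    """Count detected SSRs per repeat-unit class (di, tri, ...)."""
    return Counter(UNIT_NAMES[len(motif)] for _, motif, _ in hits)


if __name__ == "__main__":
    example = "TT" + "CCG" * 5 + "AT" + "CA" * 7 + "GG"
    found = find_ssrs(example)
    print(found)
    print(tally_by_unit_length(found))
    print(canonical_motif("CGG"), canonical_motif("CCG"))
```

Running the scan separately on exon, intron, and intergenic sequences and comparing the tallies would mirror the kind of comparison the summary describes, although a real survey would also need to handle imperfect repeats and normalize counts by sequence length.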
The study also discusses the limitations of strand-slippage theories in explaining microsatellite distribution and suggests that other factors, such as DNA repair mechanisms and selective pressures, may play a significant role in the observed patterns. The results highlight the complexity of SSR distribution and the importance of taxon-specific analyses in understanding genome evolution.