The chapter introduces the field of comparative genomics, emphasizing its role in understanding how genetic information leads to observable traits and behaviors. It highlights that while a complete genome sequence provides the ultimate genetic map, it does not directly explain how genetic information functions. The focus shifts to the next phase of the Human Genome Project, which aims to identify and utilize all functional parts of genome sequences to improve health. Comparative genomics is a key tool in this effort, involving the analysis of DNA sequences to understand conserved and divergent features between species.
The principles of comparative genomics are straightforward: common features between species are often encoded in conserved DNA sequences. These sequences include those responsible for functions that have been conserved since the last common ancestor, as well as those controlling gene expression. Conversely, divergent sequences encode or control differences between species.
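As an illustration of how conserved sequence might be flagged in practice, the sketch below scores percent identity in sliding windows across two sequences that are assumed to be already aligned. The function name `window_identity`, the window size, and the toy sequences are hypothetical and do not come from the chapter.

```python
def window_identity(seq_a, seq_b, window=5):
    """Percent identity in each sliding window of two pre-aligned sequences.

    Assumes both sequences have equal length and use '-' for gaps.
    A gap column never counts as a match.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    scores = []
    for start in range(len(seq_a) - window + 1):
        pairs = zip(seq_a[start:start + window], seq_b[start:start + window])
        matches = sum(1 for a, b in pairs if a == b and a != "-")
        scores.append(100.0 * matches / window)
    return scores


# Toy example: a single mismatch at position 4 lowers every window that covers it.
print(window_identity("ACGTACGT", "ACGTTCGT", window=4))
```

Windows that stay near 100% identity across species would be candidates for conserved function, while low-scoring windows point to divergence. Real analyses use statistical models of neutral evolution rather than a fixed threshold.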
The chapter discusses how comparisons at different phylogenetic distances address different questions. At very long distances, broad insights about gene types can be gained, while at moderate distances, both functional and nonfunctional DNA can be identified. At very close distances, such as between humans and chimpanzees, comparisons can reveal the key sequence differences that may account for differences between the organisms.
Alignment of DNA sequences is a core process in comparative genomics, and several algorithms have been developed to align sequences. However, the computational power required for aligning billions of nucleotides between species exceeds what is typically available in individual laboratories. Precomputed alignments are made available through servers or browsers, such as EnteriX, VISTA, UCSC Genome Browser, Ensembl, and GALA.
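The chapter summary does not name a specific alignment algorithm. As a minimal sketch of the idea, the example below computes a global alignment score by dynamic programming in the style of Needleman-Wunsch. The scoring values (`match`, `mismatch`, `gap`) are arbitrary assumptions chosen for illustration, and whole-genome aligners rely on faster heuristics than this quadratic-time approach.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score of the best global alignment of a and b (linear gap penalty).

    Keeps only one previous row of the dynamic-programming table,
    so memory use is proportional to len(b).
    """
    # Row 0: aligning an empty prefix of a against prefixes of b costs only gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # Column 0: prefix of a against an empty prefix of b.
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


print(global_alignment_score("ACGT", "AGT"))
```

The running time grows with the product of the two sequence lengths, which gives a sense of why aligning billions of nucleotides is impractical in an individual laboratory and why precomputed alignments are useful.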
The chapter also explores what comparative genomics reveals about genome evolution and function. Large-scale gene organization and order have been preserved between humans and mice, and about 60% of protein-coding C. elegans genes have clear homologs in C. briggsae. At the nucleotide level, about 40% of the human genome aligns with the mouse genome; the remaining 60% arises from lineage-specific insertions, deletions, and other mechanisms.
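A figure such as "about 40% of the human genome aligns with the mouse genome" is, in essence, a coverage calculation: the number of bases covered by at least one alignment, divided by the genome length. The sketch below shows one way to compute it from a list of aligned intervals. The function name, the half-open interval convention, and the toy numbers are assumptions for illustration, not data from the chapter.

```python
def aligned_fraction(intervals, genome_length):
    """Fraction of a genome covered by at least one aligned interval.

    Intervals are half-open (start, end) pairs; overlapping intervals
    are counted once.
    """
    covered = 0
    current_end = 0  # rightmost position already counted
    for start, end in sorted(intervals):
        start = max(start, current_end)  # skip bases counted earlier
        if end > start:
            covered += end - start
            current_end = end
    return covered / genome_length


# Toy genome of 100 bases with two overlapping alignments.
print(aligned_fraction([(0, 30), (20, 40)], 100))
```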
The chapter concludes by discussing the prospects for future research, including the increasing availability of genome sequences from closely related species and the potential for more powerful functional predictions through multiple sequence comparisons. It emphasizes the importance of large-scale experimental tests to validate predictions from comparative genomics.