1984 | John Devereux, Paul Haeberli* and Oliver Smithies
The University of Wisconsin Genetics Computer Group (UWGCG) has developed a comprehensive set of sequence analysis programs for the Digital Equipment Corporation VAX computer using the VMS operating system. These programs are designed to be used together as a coherent system and can be easily modified. The programs are written in Fortran 77. Most programs can be used with only a terminal, although several require a Hewlett-Packard plotter. The software has been installed at eight different institutions, and a simple method has been developed for transferring and maintaining the system on other VAX computers.
The UWGCG program design is based on the "software tools" approach, where each program performs a simple function and is easy to use. Programs can be used independently or in different combinations to solve complex problems. New programming is simplified because less effort is required to bridge gaps between existing programs. The software is designed to be maintained and modified at sites other than the University of Wisconsin. The program manual is extensive and the source code is organized to make modification convenient. Scientists are encouraged to use existing programs as a framework for developing new ones. Copyright can be removed from any program modified by more than 25% of the original effort.
The programs available from UWGCG include comparisons, mapping and searching, secondary structure analysis, analysis of composition and location of genetic domains, sequence manipulation, sequence publication, and general features. The software is designed to be interactive, with each program run by simply typing its name. Parameters are obtained interactively, and special features can be obtained by using an extra word next to the program's name. Data from the NIH-GenBank and EMBL nucleotide sequence data libraries are available "on-line" to any UWGCG program. A Search utility locates sequences in the libraries by keyword, and a Find utility locates library entries containing any specified sequence. Programs are available to install new data sent periodically from GenBank and EMBL to update their data libraries.
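The two library lookups described above can be illustrated with a minimal Python sketch. It is not the UWGCG implementation (the original programs are Fortran 77 on VMS). The function names, the in-memory `LIBRARY` dictionary, its entry names, and the use of exact substring matching are all assumptions made for illustration.

```python
# Illustrative sketch only: an in-memory stand-in for a sequence library.
# Entry names, descriptions, and sequences are hypothetical.
LIBRARY = {
    "ENTRY1": {
        "description": "Hypothetical globin gene fragment",
        "sequence": "ATGGTGCACCTGACT",
    },
    "ENTRY2": {
        "description": "Hypothetical cloning vector",
        "sequence": "GAATTCGGATCCATGG",
    },
}


def search_by_keyword(library, keyword):
    """Return names of entries whose description contains the keyword."""
    wanted = keyword.lower()
    return sorted(
        name
        for name, entry in library.items()
        if wanted in entry["description"].lower()
    )


def find_by_sequence(library, query):
    """Return names of entries whose sequence contains the query exactly."""
    wanted = query.upper()
    return sorted(
        name
        for name, entry in library.items()
        if wanted in entry["sequence"].upper()
    )


if __name__ == "__main__":
    print(search_by_keyword(LIBRARY, "globin"))
    print(find_by_sequence(LIBRARY, "GGATCC"))
```

Keeping the two lookups as separate small functions mirrors the "software tools" idea of simple programs that can be combined, for example by intersecting the two result lists.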
All data in the system are stored in text files that can be read and modified easily. Every data file has an English heading describing the contents. The data files may be copied by each user for analysis or modification. Programs recognize and read user-modified input data automatically. Data files can be modified with any text editor. Sequences are maintained in files that allow documentation and numbering both above and within the sequence. This file format is compatible with both of the nucleic acid sequence libraries and has been adopted as the standard sequence file format by the database project at the European Molecular Biology Laboratory. Because genetic manipulations commonly involve linking several molecules of known sequence, UWGCG sequence files are designed to support concatenation by allowing comments to appear within the sequences at any location. Coding sequences or the boundaries between cloning vector and insert, for instance, can be marked within the sequence itself for immediate identification.
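As a rough illustration of how a program might read such a file, the Python sketch below separates the heading from the sequence and ignores numbering and embedded comments. The text above does not specify the delimiters, so the `..` line that ends the heading and the angle brackets around comments are hypothetical conventions chosen for this example, as is the function name `parse_sequence_file`.

```python
import re


def parse_sequence_file(text):
    """Split a sequence file into (heading, comments, sequence).

    Assumed conventions (hypothetical, for illustration only):
    - the heading ends at the first ".." that closes a line;
    - comments inside the sequence are wrapped in angle brackets;
    - digits and whitespace in the sequence area are numbering or layout.
    """
    heading, divider, body = text.partition("..\n")
    if not divider:
        raise ValueError("no heading divider found")
    # Collect the in-sequence comments before removing them.
    comments = [c.strip() for c in re.findall(r"<([^>]*)>", body)]
    without_comments = re.sub(r"<[^>]*>", "", body)
    # Keep letters only. The five additional symbols mentioned in the
    # text are not named here, so this sketch does not handle them.
    sequence = re.sub(r"[^A-Za-z]", "", without_comments)
    return heading.strip(), comments, sequence


EXAMPLE = """Hypothetical vector plus insert, for illustration only.
..
   1 GAATTC <vector ends> ATGGCC
  13 AAATAG <stop codon>
"""

if __name__ == "__main__":
    heading, comments, sequence = parse_sequence_file(EXAMPLE)
    print(heading)
    print(comments)
    print(sequence)
```

Because comments and numbering are discarded only at read time, two sequence bodies written this way can be joined in a text editor and the annotations marking each boundary stay in place.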
The UWGCG symbol set includes all alphabetic characters plus five additional