BioPerl

1. Use E-utilities or BioPerl to read a file containing GI numbers and retrieve the latest version of the corresponding DNA sequences in GenBank format (problem 5 from the last homework).
2. A GenBank record may contain more than the coding sequence, e.g. UTRs. Use BioPerl to scan each GenBank record and extract only the CDS. Then translate the coding sequence into a protein sequence and write it in FASTA format.
3. BLAST the protein sequences against the nr database.
4. Use BioPerl to parse the BLAST output and print the top 3 hits for each protein sequence.
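
The steps above could be sketched roughly as follows. This is one possible outline, not a reference solution. The file names (gi_list.txt, proteins.fasta, blast.out) are hypothetical. Step 3 is assumed to be run outside the script, for example with the BLAST+ command shown in the comments, and the report is assumed to be in the default pairwise text format. The script needs BioPerl installed and network access to NCBI.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Bio::DB::GenBank;   # remote GenBank retrieval through NCBI
use Bio::SeqIO;         # sequence file reading and writing
use Bio::SearchIO;      # BLAST report parsing

# Hypothetical file names, chosen only for this sketch.
my $gi_file    = 'gi_list.txt';
my $fasta_file = 'proteins.fasta';
my $blast_file = 'blast.out';

# Step 1: read one identifier per line, skipping blank lines.
open my $fh, '<', $gi_file or die "Cannot open $gi_file: $!";
my @ids;
while (my $line = <$fh>) {
    $line =~ s/^\s+|\s+$//g;
    push @ids, $line if length $line;
}
close $fh;

# Fetch the records; the returned stream yields one sequence object per id.
my $gb     = Bio::DB::GenBank->new;
my $stream = $gb->get_Stream_by_id(\@ids);

# Step 2: extract each CDS feature, translate it, and write FASTA.
my $out = Bio::SeqIO->new(-file => ">$fasta_file", -format => 'fasta');
while (my $seq = $stream->next_seq) {
    my $n = 0;
    for my $feat (grep { $_->primary_tag eq 'CDS' } $seq->get_SeqFeatures) {
        # spliced_seq joins the pieces of a join(...) location and
        # reverse-complements features on the minus strand.
        my $cds = $feat->spliced_seq;

        # codon_start (1, 2 or 3) marks the first complete codon; assume 1 if absent.
        my ($codon_start) = $feat->has_tag('codon_start')
            ? $feat->get_tag_values('codon_start')
            : (1);
        my $prot = $cds->translate(-frame => $codon_start - 1);

        # Drop the trailing stop symbol so the FASTA holds residues only.
        (my $aa = $prot->seq) =~ s/\*$//;
        $prot->seq($aa);
        $prot->display_id($seq->accession_number . '_cds' . ++$n);
        $out->write_seq($prot);
    }
}
$out->close;

# Step 3 (assumed to be run separately), for example with BLAST+:
#   blastp -db nr -remote -query proteins.fasta -out blast.out
unless (-e $blast_file) {
    print "Wrote $fasta_file. Run BLAST to create $blast_file, then parse it.\n";
    exit;
}

# Step 4: print the first three hits reported for each query.
my $report = Bio::SearchIO->new(-file => $blast_file, -format => 'blast');
while (my $result = $report->next_result) {
    print "Query: ", $result->query_name, "\n";
    my $rank = 0;
    while (my $hit = $result->next_hit) {
        last if ++$rank > 3;
        printf "  %d. %s  E=%s  %s\n",
            $rank, $hit->name, $hit->significance, $hit->description;
    }
}
```

Hits are printed in the order the report lists them, which for a standard BLAST report is best hit first. Rerunning the script after BLAST finishes repeats the download in this sketch; splitting retrieval and parsing into two scripts would avoid that.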