Scenario 8 Study Questions

Study Question 1: The file motif.dat contains information similar to that in motif.hits. The first item in each record is a name like "Nostoc"; the second is a score like 25.5. The third and fourth items are the starting and ending coordinates of the sequence matching the motif. (This differs from motif.hits, where the fourth item is the matching sequence itself.)

The file is a binary file. Each record is 26 bytes long.

Write a program to print out the file in a human-readable form.

Hint: The program print-binary.pl will print out an exploratory version of the 7120DB.DAT file, as discussed near the bottom of the second page of the Notes. Modify it to print out an exploratory version of motif.dat. Then modify it again to print out the information from motif.dat.
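To make "exploratory version" concrete, here is a minimal sketch of one way to look at a fixed-length binary file before you know how its fields are laid out. It is not print-binary.pl itself; the names explore_record and dump_file are made up for this illustration. The only fact taken from the question is the 26-byte record length. Each record is shown as hex bytes, followed by the same bytes as text with unprintable characters replaced by dots.

```perl
use strict;
use warnings;

my $RECORD_LENGTH = 26;    # record size stated in the question

# Format one record as hex bytes plus a printable-text view.
# Makes no assumption about where the fields begin or end.
sub explore_record {
    my ($record) = @_;
    my $hex = join ' ', map { sprintf '%02x', $_ } unpack 'C*', $record;
    (my $text = $record) =~ s/[^\x20-\x7e]/./g;   # dots for unprintable bytes
    return "$hex  |$text|";
}

# Read the file one record at a time and print the exploratory view.
sub dump_file {
    my ($path) = @_;
    open my $fh, '<', $path or die "Can't open $path: $!\n";
    binmode $fh;               # treat the data as raw bytes
    my ($record, $n) = ('', 0);
    while (my $got = read($fh, $record, $RECORD_LENGTH)) {
        warn "Record $n is only $got bytes long\n" if $got < $RECORD_LENGTH;
        printf "%4d: %s\n", $n++, explore_record($record);
    }
    close $fh;
}

# Usage (file name given on the command line): perl explore.pl motif.dat
dump_file($ARGV[0]) if @ARGV;
```

Readable text in the right-hand column suggests where a name is stored; the remaining bytes are candidates for the numeric fields, which you can then decode with a suitable unpack template.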

Study Question 2: In find-context.pl we see the following piece of code:

      if ($o == @orfs) { # site is to the right of every ORF
         die "Can't yet print context without right-hand ORF\n";
      }
      elsif ($$site[1] < $orfs[$o][0]) { # Site is to the left of $orfs[$o]
         if ($o == 0) { die "Can't yet print context without left-hand ORF\n" }
         else { print_pair($site, $orfs[$o-1], $orfs[$o]) }
      }

The two die statements could be inconvenient: one poorly placed sequence and all our work is lost. Modify the program so that:

SQ2a: It doesn't die.
SQ2b: It reports the pair of ORFs closest to the possible binding site, even if the distance to one of them is reported incorrectly.
SQ2c: It reports the pair of closest ORFs, and correctly reports the distance from each ORF to the possible binding site.

You can use test-motif.hits as a replacement for motif.hits to test the modified program.
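
As a starting point for SQ2a only, the sketch below shows one way to handle a missing neighbor without dying: return an undefined value for the absent ORF and let the reporting routine cope with it. It does not attempt SQ2b or SQ2c. Everything here is an assumption made for illustration, not the structure of find-context.pl: a site and an ORF are each taken to be [start, end], the ORF list is taken to be sorted by start coordinate with no ORF overlapping the site, and the names flanking_orfs and describe_context are hypothetical.

```perl
use strict;
use warnings;

# Assumed layout (hypothetical): site = [start, end], ORF = [start, end],
# ORFs sorted by start coordinate and none overlapping the site.
# Returns (left ORF, right ORF); either may be undef instead of dying.
sub flanking_orfs {
    my ($site, $orfs) = @_;
    my $o = 0;
    $o++ while $o < @$orfs && $orfs->[$o][0] <= $site->[1];
    my $left  = $o > 0      ? $orfs->[$o - 1] : undef;
    my $right = $o < @$orfs ? $orfs->[$o]     : undef;
    return ($left, $right);
}

# Report the gap to each neighbor, or "none" when a neighbor is missing.
sub describe_context {
    my ($site, $left, $right) = @_;
    my $l = defined $left  ? ($site->[0] - $left->[1])  . ' bp' : 'none';
    my $r = defined $right ? ($right->[0] - $site->[1]) . ' bp' : 'none';
    return "left ORF: $l, right ORF: $r";
}

# Made-up coordinates, chosen to exercise all three cases.
my @demo_orfs = ([100, 200], [500, 700]);
for my $demo_site ([10, 20], [300, 310], [800, 810]) {
    print describe_context($demo_site, flanking_orfs($demo_site, \@demo_orfs)), "\n";
}
# prints:
# left ORF: none, right ORF: 80 bp
# left ORF: 100 bp, right ORF: 190 bp
# left ORF: 100 bp, right ORF: none
```

In the real program the same idea would mean passing an undefined ORF to print_pair and teaching print_pair what to print in that case.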