Biol 591 
Introduction to Bioinformatics
Scenarios
Fall 2003 
Identification of a possible regulatory site in genomic DNA

Scientific story (html)

In brief: You happen upon a sequence upstream of an interesting gene that looks suspiciously like a regulatory sequence. How can you assess whether your find is meaningful or merely a chance occurrence?
Bioinformatic tools
Simple simulation
     Make up random sequences. Count how many times a regulatory sequence arises by chance.
Pattern recognition
     Scan the entire genome, counting sequences that satisfy the criteria for a regulatory sequence.
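
The two tools listed above can be sketched together in a few lines. The sketch below is in Python rather than the course's Perl, and it is only an illustration under stated assumptions: the motif (GTA, any 8 bases, TAC) is a hypothetical placeholder, the function names are invented for this example, and each base is assumed to be equally likely. The same count_matches function that scores random sequences could also be applied to a real genome sequence for comparison.

```python
import random
import re

# Hypothetical motif, chosen only for illustration: GTA, any 8 bases, TAC.
# The lookahead makes each match zero-width, so overlapping sites are counted.
MOTIF = re.compile(r"(?=GTA[ACGT]{8}TAC)")

def count_matches(sequence):
    """Count motif occurrences in a sequence, including overlapping ones."""
    return sum(1 for _ in MOTIF.finditer(sequence))

def random_sequence(length, rng):
    """Build a sequence in which each base is equally likely (an assumption)."""
    return "".join(rng.choice("ACGT") for _ in range(length))

def simulate(length, trials, seed=0):
    """Return the motif count found in each of `trials` random sequences."""
    rng = random.Random(seed)
    return [count_matches(random_sequence(length, rng)) for _ in range(trials)]

if __name__ == "__main__":
    counts = simulate(length=10_000, trials=20)
    print("mean matches per random sequence:", sum(counts) / len(counts))
```

If a real genome shows many more matches than random sequences of the same length, that hints at the nonrandom nature of biological DNA; equal base frequencies are a simplification that a more careful simulation would replace with the genome's actual composition.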
Molecular biology concepts
Regulation of gene expression
DNA-binding proteins
Nonrandom nature of biological DNA
Perl focus: Loops and arrays

Programs

Simple simulation: Dice roll (DiceRoll.pl)
     A model program just to get across the idea of a simulation.
Simple simulation/Pattern recognition: Sequence search through simulated genome
     Explores the notion of randomness. We will write this program by combining the strategy of DiceRoll
     with the tools from FindMatches.
Pattern recognition: Sequence search through entire genome (FindMatches.pl)
     For comparison with the search through the simulated genome above
     Data: Small sequence file for testing (AnabSeq.nt)
     Data: Full sequence file for real thing (AnabaenaChromosome.nt)
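
To give a feel for what a dice-roll simulation involves, here is a minimal sketch in Python. It is not the course's DiceRoll.pl, whose contents are not reproduced on this page; the choice of two six-sided dice, the function name roll_dice, and the tally layout are all assumptions made for illustration. It uses only a loop and an array (a Python list), the two constructs this scenario focuses on.

```python
import random

def roll_dice(rolls, seed=0):
    """Roll two six-sided dice `rolls` times and tally each possible sum."""
    rng = random.Random(seed)
    tally = [0] * 13  # indexes 2..12 are used; 0 and 1 stay at zero
    for _ in range(rolls):
        total = rng.randint(1, 6) + rng.randint(1, 6)
        tally[total] += 1
    return tally

if __name__ == "__main__":
    rolls = 6000
    tally = roll_dice(rolls)
    # Print each sum, how often it came up, and its observed frequency.
    for total in range(2, 13):
        print(total, tally[total], round(tally[total] / rolls, 3))
```

The same pattern (repeat a random trial many times, tally the outcomes in an array) carries over to the simulated-genome search, with random bases in place of dice.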
Notes
Regulatory proteins and their binding sites (PDF) (Questions)
Arrays and loops (html) (Questions)
Implementing a simulation (html) (Questions)
Problem Set - Molecular biology (PDF)
Problem Set - Programming (html)