BNFO 653 – Pattern Recognition and Gene Finding (2016)          

Projects: Computation to Solve Problems

The following projects each guide you through a problem that requires computer programming to solve. Choose whichever one (or more) strikes your fancy. It is better to do one well than to attempt too many.

  1. What determines the beginning of a gene?
    This is a reasonable choice for those with no experience with BioBIKE or programming, unless one of the other topics particularly attracts you.
     
  2. Where in a bacterial genome are viruses integrated?
    An example of data mining that combines a search of text with a search of protein sequences.
     
  3. Determination of short tandem repeats (STRs)
    STRs are commonly used in forensic applications such as DNA profiling.
     
  4. Analysis of gene expression data
    Extracting useful insights from microarray data sliced by metabolic pathways.
     
  5. CRISPRs in enteric bacteria
    How to find them by repetitive structure, and how to compare their locations in different genomes.
     
  6. Finding targets for DNA-binding proteins given known binding sites
    Using position-specific scoring matrices, built from the experimentally known binding sites of a DNA-binding protein, to find previously unknown sites elsewhere in the genome. Uses a well-studied cyanobacterial transcription factor as an example.
     
  7. Finding targets for DNA-binding proteins given known target genes
    Certain genes are known to be, or suspected of being, co-regulated. Perhaps the genes share a common upstream sequence that is the target of a transcription factor. Uses a motif-finding program (MEME) to investigate, with Streptococcus genes as the raw material.
     
  8. Identifying proteins by pattern
    A family of proteins does not show great sequence similarity, except within certain amino acid motifs. Can these motifs be used to find additional family members? Uses a plant protein involved in floral symmetry as an example.
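
To give a feel for the kind of computation these projects involve, here is a minimal sketch of the position-specific scoring matrix idea behind project 6. It is written in plain Python rather than BioBIKE, and it is not the method used in the project itself. The binding sites, the test sequence, the pseudocount, the uniform background frequency, the score threshold, and all function names are assumptions made up for illustration.

```python
import math

BASES = "ACGT"

# Hypothetical aligned binding sites (invented for illustration, not real data).
KNOWN_SITES = ["TGTAAC", "TGTAAC", "TGTTAC", "TGTAAT", "AGTAAC"]


def build_pssm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds matrix: one dict per column, mapping base -> score.

    Assumes all sites have the same length and that every base is equally
    likely in the background (a simplification for real genomes).
    """
    width = len(sites[0])
    total = len(sites) + pseudocount * len(BASES)
    pssm = []
    for col in range(width):
        column = [site[col] for site in sites]
        scores = {}
        for base in BASES:
            # The pseudocount keeps unseen bases from scoring minus infinity.
            freq = (column.count(base) + pseudocount) / total
            scores[base] = math.log2(freq / background)
        pssm.append(scores)
    return pssm


def score_window(pssm, window):
    """Sum the per-column scores for one candidate site."""
    return sum(column[base] for column, base in zip(pssm, window))


def scan(pssm, sequence, threshold):
    """Slide the matrix along the sequence; return (start, site, score) hits."""
    width = len(pssm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing ambiguous characters such as N
        score = score_window(pssm, window)
        if score >= threshold:
            hits.append((start, window, score))
    return hits


if __name__ == "__main__":
    pssm = build_pssm(KNOWN_SITES)
    # Hypothetical sequence and threshold, chosen only to exercise the code.
    for start, site, score in scan(pssm, "CCCTGTAACGGGTGTTACAA", threshold=5.0):
        print(start, site, round(score, 2))
```

The sketch scans only one strand. A real search would also scan the reverse complement and would choose the threshold from the score distribution of the known sites rather than by hand.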