CS 177: Introduction to Bioinformatics
 
Course Homepage: Fall 2005
 
CS 177: Mondays 3:20-5:55 pm
405 Tompkins Hall
 
Announcements:
Please get a computer account for Tompkins 405 by filling out an application in the TA room on
the 4th floor of Tompkins.
 
Instructors:
Drs. Anna Panchenko and Tom Madej, National Center for Biotechnology Information (NCBI).
Emails: hcnap2003@yahoo.com, tom_ncbi@yahoo.com
Office hours: Before or after class, check with the instructors.
 
Prerequisites: Permission of the instructor.  To get permission, please show up on the first day of class.
 
Course Description:
This course will provide a broad introduction to the area of bioinformatics.  Topics include: molecular
biology background, protein structure and function, sequence alignment algorithms, protein structure
prediction, structure-structure alignment, public sequence/structure databases and search tools,
introductory phylogenetic analysis, and systems biology.
 
Textbook: 
David W. Mount (2001). Bioinformatics: Sequence and Genome Analysis. Cold Spring Harbor Laboratory Press.
 
Grading: Homework 50%, Mid-term Exam 20%, Final Exam 30%
 
Homework: Homework assignments will be given out in class.  They are due by the next class period
(i.e. the next Monday).  An assignment that is turned in after it is due but no more than one week late will
be penalized by 20%.  No credit will be given for homework that is more than one week late.
 
Lecture schedule:  
 
Lecture 1 (Sep 12): Introduction  (Tom Madej) 
               PowerPoint slides
               Motivating problem (protein sequence example).
               History: traditional biology vs. new information-based biology.
               Bioinformatics: a new approach to deal with the complexity of biological data.
               Motivating problem (protein structure example).
               Course overview, grading, etc.
               Molecular basis of cellular processes.
 
Lecture 2 (Sep 19): General principles of DNA/RNA structure and stability (Anna Panchenko)
               PowerPoint slides
               Homework (due Sep 26)
               Physico-chemical properties of nucleic acids.
               RNA-folding and structure prediction.
               Gene identification.
               Genome analysis.
 
Lecture 3 (Sep 26): General principles of protein structure and stability (AP)
               PowerPoint slides
               Homework (due Oct 3)
               Sequence for homework
               Physico-chemical properties of proteins.
               Prediction of protein secondary structure.
               Protein domains and prediction of domain boundaries.
               Protein structure-function relationships.
 
Lecture 4 (Oct 3): Sequence alignment algorithms (TM)
               PowerPoint slides
               Homework (due Oct 10)
               The alignment problem.
               Pairwise sequence alignment algorithms.
               Multiple sequence alignment algorithms.
               Sequence profiles and profile alignment methods.
               Alignment statistics.
 
Lecture 5 (Oct 10): Computational aspects of protein structure, part I (AP)
               PowerPoint slides
               Homework (due Oct 17)
               Protein folding problem.
               Problem of protein structure prediction.
               Homology modeling.
               Protein design.
               Prediction of functionally important sites.
 
Lecture 6 (Oct 17): Computational aspects of protein structure, part II (TM)
               PowerPoint slides
               Homework (due Oct 31)
               PDB file for homework
               Structure-structure alignment algorithms.
               Significance of structure-structure similarity.
               Protein structure classification.
 
Mid-term Exam (Oct 24)
 
Lecture 7 (Oct 31): Bioinformatics databases (TM)
               PowerPoint slides
               Homework (due Nov 7)
               EST for homework
               Sequence and sequence alignment formats, data exchange.
               Public sequence databases.
               Sequence retrieval and examples.
               Public protein structure databases.
               Lab exercises.
 
Lecture 8 (Nov 7): Bioinformatics database search tools (TM)
               PowerPoint slides
               Homework
               Sequence database search tools.
               Structure database search tools.
               Assessment of results, ROC analysis.
               Lab exercises.
 
Lecture 9 (Nov 14): Protein threading and design (AP)
               PowerPoint slides
               Homework (due Nov 21)
 
Lecture 10 (Nov 21): SNPs and genetic variation; and experimental techniques (TM)
               PowerPoint slides for HapMap
               PowerPoint slides for Experimental Methods
               Homework (due Nov 28)
               The HapMap project.
               Human genetic variation.
               Sequencing, PCR.
               Microarrays.
               Protein crystallography.
 
Lecture 11 (Nov 28): Phylogenetic analysis, part I (AP)
               PowerPoint slides
               Homework (due Dec 5)
               Molecular basis of evolution.
               Taxonomy and phylogenetics.
               Phylogenetic trees and phylogenetic inference.
               Software tools for phylogenetic analysis.
 
Lecture 12 (Dec 5): Phylogenetic analysis, part II (AP)
               PowerPoint slides
               Homework (due Dec 12)
               Accuracies and statistical tests of phylogenetic trees.
               Genome comparisons.
               Protein structure evolution.
 
Review (Dec 7)
               AP review
               TM review
 
Final Exam - Monday, Dec. 19, 5:20-7:20 pm at Tompkins Hall 405