<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://info.gersteinlab.org/index.php?action=history&amp;feed=atom&amp;title=Cs545-07</id>
	<title>Cs545-07 - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://info.gersteinlab.org/index.php?action=history&amp;feed=atom&amp;title=Cs545-07"/>
	<link rel="alternate" type="text/html" href="https://info.gersteinlab.org/index.php?title=Cs545-07&amp;action=history"/>
	<updated>2026-04-25T01:53:22Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.42.6</generator>
	<entry>
		<id>https://info.gersteinlab.org/index.php?title=Cs545-07&amp;diff=29&amp;oldid=prev</id>
		<title>Infoadmin: Created page with &#039;This page contains general information for the class:  &#039;&#039;CPSC445/CPSC545/MBB334/MBB545/CBB545&#039;&#039;  &#039;&#039;&#039;Introduction to Data Mining&#039;&#039;&#039;  == Course websites ==  * Main website ** http:…&#039;</title>
		<link rel="alternate" type="text/html" href="https://info.gersteinlab.org/index.php?title=Cs545-07&amp;diff=29&amp;oldid=prev"/>
		<updated>2010-06-03T23:12:09Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;#039;This page contains general information for the class:  &amp;#039;&amp;#039;CPSC445/CPSC545/MBB334/MBB545/CBB545&amp;#039;&amp;#039;  &amp;#039;&amp;#039;&amp;#039;Introduction to Data Mining&amp;#039;&amp;#039;&amp;#039;  == Course websites ==  * Main website ** http:…&amp;#039;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;This page contains general information for the class:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;CPSC445/CPSC545/MBB334/MBB545/CBB545&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Introduction to Data Mining&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
== Course websites ==&lt;br /&gt;
&lt;br /&gt;
* Main website&lt;br /&gt;
** http://www.gersteinlab.org/courses/545/&lt;br /&gt;
* Course wiki&lt;br /&gt;
** http://wiki.gersteinlab.org/pubinfo/index.php/Cs545-07&lt;br /&gt;
* Yale classes server&lt;br /&gt;
** http://classes.yale.edu/&lt;br /&gt;
&lt;br /&gt;
* 2008 Wiki&lt;br /&gt;
** http://lab.zoo.cs.yale.edu/cs445-wiki/index.php/Main_Page&lt;br /&gt;
** [[local-copy-cs445 | local cached copy]] &lt;br /&gt;
&lt;br /&gt;
== Homework ==&lt;br /&gt;
&lt;br /&gt;
* HW2: Decision trees.&lt;br /&gt;
** http://www.gersteinlab.org/courses/545/07-spr/hw/hw2.pdf&lt;br /&gt;
** Due Feb 20th, 2007.&lt;br /&gt;
&lt;br /&gt;
* HW3: Multilinear regression.&lt;br /&gt;
** http://www.gersteinlab.org/courses/545/07-spr/hw/hw3.pdf&lt;br /&gt;
** http://www.gersteinlab.org/courses/545/07-spr/hw/hw3_lr.r&lt;br /&gt;
** Due Feb 27th, 2007.&lt;br /&gt;
&lt;br /&gt;
* HW4: SVM.&lt;br /&gt;
** http://www.gersteinlab.org/courses/545/07-spr/hw/hw4.pdf&lt;br /&gt;
** http://www.gersteinlab.org/courses/545/07-spr/hw/bc_data.txt&lt;br /&gt;
** Due Mar 11th, 2007.&lt;br /&gt;
&lt;br /&gt;
== Final Project ==&lt;br /&gt;
&lt;br /&gt;
* Suggested Term Projects. Project due May 7th, 2007; project description due March 27th, 2007.&lt;br /&gt;
** http://www.gersteinlab.org/courses/545/07-spr/project.html&lt;br /&gt;
** http://www.gersteinlab.org/courses/545/07-spr/proj/Term_Projects_2007_final.pdf&lt;br /&gt;
&lt;br /&gt;
* List of proposed projects&lt;br /&gt;
** http://spreadsheets.google.com/pub?key=pXgR9Xs-YQoFfc5bTmbwZLQ&lt;br /&gt;
&lt;br /&gt;
== Slides ==&lt;br /&gt;
&lt;br /&gt;
=== Week 1 ===&lt;br /&gt;
* Introduction to Bioinformatics&lt;br /&gt;
** http://www.gersteinlab.org/courses/545/07-spr/slides/cbb545b-spr07-bioinfo1-intro.ppt&lt;br /&gt;
&lt;br /&gt;
=== Week 3 ===&lt;br /&gt;
* Decision tree&lt;br /&gt;
** http://www.gersteinlab.org/courses/545/07-spr/slides/DM_DecisionTree.ppt&lt;br /&gt;
* Ensemble methods&lt;br /&gt;
** http://www.gersteinlab.org/courses/545/07-spr/slides/DM_Ensemble.ppt&lt;br /&gt;
&lt;br /&gt;
=== Week 4 ===&lt;br /&gt;
* Multilinear and logistic regression&lt;br /&gt;
** http://www.gersteinlab.org/courses/545/07-spr/slides/DM_multiple_regression.ppt&lt;br /&gt;
&lt;br /&gt;
=== Week 5 ===&lt;br /&gt;
* SVM&lt;br /&gt;
** http://www.gersteinlab.org/courses/545/07-spr/slides/DM_SVM.ppt&lt;br /&gt;
** http://www.gersteinlab.org/courses/545/07-spr/slides/DM_SVM-law.ppt&lt;br /&gt;
* Perceptron Models&lt;br /&gt;
** http://www.gersteinlab.org/courses/545/07-spr/slides/DM_perceptron.ppt&lt;br /&gt;
* Links&lt;br /&gt;
** http://ranger.uta.edu/~cook/ai1/lectures/l9/l9.html&lt;br /&gt;
&lt;br /&gt;
=== Week 6 ===&lt;br /&gt;
* Molecular Networks as Application of Mining&lt;br /&gt;
** http://www.gersteinlab.org/courses/545/07-spr/slides/cbb545b-spr07-bioinfo1-intro.ppt&lt;br /&gt;
** http://www.gersteinlab.org/courses/545/07-spr/slides/cbb545b-spr07-bioinfo2-nets.ppt&lt;br /&gt;
&lt;br /&gt;
* Unsupervised Learning and Clustering&lt;br /&gt;
** http://www.gersteinlab.org/courses/545/07-spr/slides/DM_clustering.ppt&lt;br /&gt;
** http://www.gersteinlab.org/courses/545/07-spr/slides/DM_KNN.ppt&lt;br /&gt;
** http://www.gersteinlab.org/courses/545/07-spr/slides/DM_cluster-tan.ppt&lt;br /&gt;
* Nice interactive k-means demo&lt;br /&gt;
** http://www.elet.polimi.it/upload/matteucc/Clustering/tutorial_html/AppletKM.html&lt;br /&gt;
&lt;br /&gt;
=== Week 7 ===&lt;br /&gt;
* Predicting Networks Through Bayesian Inference&lt;br /&gt;
** http://www.gersteinlab.org/courses/545/07-spr/slides/cbb545b-spr07-bioinfo3-bayes1.ppt&lt;br /&gt;
** http://www.gersteinlab.org/courses/545/07-spr/slides/cbb545b-spr07-bioinfo4-bayes2.ppt&lt;br /&gt;
&lt;br /&gt;
=== Week 8 ===&lt;br /&gt;
* Applications of Spectral methods (PCA/SVD)&lt;br /&gt;
** http://www.gersteinlab.org/courses/545/07-spr/slides/cbb545b-spr07-bioinfo5-svd1.ppt&lt;br /&gt;
** http://www.gersteinlab.org/courses/545/07-spr/slides/cbb545b-spr07-bioinfo6-svd2.ppt&lt;br /&gt;
&lt;br /&gt;
=== Week 9, 10 ===&lt;br /&gt;
Spring break&lt;br /&gt;
&lt;br /&gt;
=== Week 11 ===&lt;br /&gt;
* Determining Method of Action in Drug Discovery Using Affymetrix Microarray Data&lt;br /&gt;
** Dr. Max Kuhn, Pfizer Research Lab&lt;br /&gt;
** http://www.gersteinlab.org/courses/545/07-spr/slides/StaphYale.pdf&lt;br /&gt;
** http://www.gersteinlab.org/courses/545/07-spr/slides/caretYale.pdf&lt;br /&gt;
&lt;br /&gt;
* An Introduction to Text Mining with an Application to the Life Sciences.&lt;br /&gt;
** Professor Michael Krauthammer, Yale Medical School&lt;br /&gt;
** http://www.gersteinlab.org/courses/545/07-spr/slides/text_mining.mk.ppt&lt;br /&gt;
&lt;br /&gt;
=== Week 12 ===&lt;br /&gt;
&lt;br /&gt;
* A(n) (extremely) brief/crude introduction to minimum description length (MDL) principle&lt;br /&gt;
** Jiang Du&lt;br /&gt;
** http://www.gersteinlab.org/courses/545/07-spr/slides/mdl.jdu.ppt&lt;br /&gt;
&lt;br /&gt;
* Kernel PCA&lt;br /&gt;
** Edo Liberty&lt;br /&gt;
&lt;br /&gt;
== Readings ==&lt;br /&gt;
&lt;br /&gt;
=== Week 1 ===&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Intro. to Data Mining, Overview of Data Mining in Bioinformatics&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
* About Data Mining&lt;br /&gt;
** http://www.twocrows.com/about-dm.htm&lt;br /&gt;
* Data Mining Applications&lt;br /&gt;
** http://www.twocrows.com/applics.htm&lt;br /&gt;
* Data Mining In Depth: Description is Not Prediction&lt;br /&gt;
** http://www.dmreview.com/article_sub.cfm?articleId=6388&lt;br /&gt;
&lt;br /&gt;
=== Week 2 ===&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Data Mining Workflow, Data Preprocessing and Cleaning, Intro. to R and Rattle&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
* Data mining workflow and preprocessing&lt;br /&gt;
** http://www.gersteinlab.org/courses/545/07-spr/reading/aimag-kdd-overview-1996-Fayyad.pdf&lt;br /&gt;
* Rattle&lt;br /&gt;
** http://datamining.togaware.com/survivor/Data_Mining1.html&lt;br /&gt;
* R&lt;br /&gt;
** Primary web page for R.&lt;br /&gt;
*** http://www.r-project.org&lt;br /&gt;
** Guide&lt;br /&gt;
*** http://cran.r-project.org/doc/contrib/Owen-TheRGuide.pdf&lt;br /&gt;
&lt;br /&gt;
=== Week 3 ===&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Intro. to Classification, Decision Trees&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
* Classification: Basic Concepts, Decision Trees and Model Evaluation&lt;br /&gt;
** http://www.gersteinlab.org/courses/545/07-spr/reading/tan-2005-chpt04.pdf&lt;br /&gt;
&lt;br /&gt;
=== Week 4 ===&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Multilinear and Logistic Regression, Support Vector Machines (SVM)&amp;#039;&amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
The following chapters are available online, courtesy of the authors and publishers, for your personal use in this course. You may print a personal copy, but redistribution is prohibited.&lt;br /&gt;
&lt;br /&gt;
* Chapter 6 of Jiawei Han and Micheline Kamber (2005). Data Mining: Concepts and Techniques. Morgan Kaufmann. 2nd Ed.&lt;br /&gt;
** http://www.gersteinlab.org/courses/545/07-spr/reading/han-2005-chpt06.pdf&lt;br /&gt;
* Chapter 5 of Pang-Ning Tan, Michael Steinbach, and Vipin Kumar (2005). Introduction to Data Mining. Addison Wesley.&lt;br /&gt;
** http://www.gersteinlab.org/courses/545/07-spr/reading/tan-2005-chpt05.pdf&lt;br /&gt;
&lt;br /&gt;
rpart&lt;br /&gt;
&lt;br /&gt;
* rpart R package vignette&lt;br /&gt;
** http://cran.r-project.org/src/contrib/Descriptions/rpart.html&lt;br /&gt;
* An Introduction to Recursive Partitioning Using the RPART Routines. Terry M. Therneau and Elizabeth J. Atkinson.&lt;br /&gt;
** http://mayoresearch.mayo.edu/mayo/research/biostat/upload/rpartmini.pdf&lt;br /&gt;
&lt;br /&gt;
Logistic regression&lt;br /&gt;
* MIT Sloan Lecture on Logistic Regression&lt;br /&gt;
** http://ocw.mit.edu/NR/rdonlyres/Sloan-School-of-Management/15-062Data-MiningSpring2003/B2EC3803-F8A7-46CF-8B9E-D0D080E52A6B/0/logreg.pdf&lt;br /&gt;
&lt;br /&gt;
=== Week 6 ===&lt;br /&gt;
&lt;br /&gt;
* Modern trends in data mining&lt;br /&gt;
** http://www.gersteinlab.org/courses/545/07-spr/reading/hastie_moderndatamining.pdf&lt;br /&gt;
&lt;br /&gt;
== Slides for courses based on reference text books ==&lt;br /&gt;
&lt;br /&gt;
* Ian Witten and Eibe Frank&lt;br /&gt;
** http://books.elsevier.com/us//mk/us/subindex.asp?maintarget=companions/defaultindividual.asp&lt;/div&gt;</summary>
		<author><name>Infoadmin</name></author>
	</entry>
</feed>