CS 43105 Data Mining Techniques

Fall 2019

 

Instructor: Xiang Lian

Office Location: Mathematics and Computer Science Building, Room 264

Office Phone Number: (330) 672-9063

Web: http://www.cs.kent.edu/~xlian/index.html

Email: xlian@kent.edu

Course: Data Mining Techniques

CRN: 12631

Prerequisites: CS 33007 Introduction to Database System Design

Time: 12:30pm - 1:45pm, MW

Classroom Location: Mathematics and Computer Science Building (MSB), 115

Course Webpage: http://www.cs.kent.edu/~xlian/course_archive/2019Fall_CS43105.html

 

Instructor's Office Hours: 10:00am - 12:30pm, TR; or by appointment

 

Graduate Assistant: N/A

Office: N/A

E-mail: N/A

Phone: N/A

TA's Office Hours: N/A


Enrollment/Official Registration of this Class

The official registration deadline for this course is 08/28/2019. University policy requires all students to be officially registered in each class they are attending. Students who are not officially registered for a course by published deadlines should not be attending classes and will not receive credit or a grade for the course. Each student must confirm enrollment by checking his/her class schedule (using Student Tools in FlashLine) prior to the deadline indicated. Registration errors must be corrected prior to the deadline.

http://www.kent.edu/registrar/calendars-deadlines

 

For registration deadlines, enter the requested information for a Detailed Class Search from the Schedule of Classes Search found at:

https://keys.kent.edu:44220/ePROD/bwlkffcs.P_AdvUnsecureCrseSearch?term_in=201680

 

After locating your course/section, click on the Registration Deadlines link on the far right side of the listing.

 

Last day to withdraw: 10/30/2019

 


Textbook

Ian H. Witten, Eibe Frank, and Mark A. Hall. Data Mining: Practical Machine Learning Tools and Techniques, 3rd Edition. Publisher: Elsevier Science. Publication date: 1/20/2011. ISBN-13: 9780123748560.

Reference Books

Jiawei Han, Micheline Kamber and Jian Pei. Data Mining: Concepts and Techniques. The Morgan Kaufmann Series in Data Management Systems, Morgan Kaufmann Publishers, July 2011. ISBN 978-0123814791. https://hanj.cs.illinois.edu/bk3/bk3_slidesindex.htm

Mohammed J. Zaki and Wagner Meira. Data Mining and Analysis: Fundamental Concepts and Algorithms. 1st Edition. Cambridge University Press, May 12, 2014. ISBN-13: 978-0521766333.

 

Resources of Reading Materials

Online sources of research papers, including conference papers (e.g., SIGKDD, ICDM, SIGMOD, PVLDB, and ICDE).

 

o   SIGKDD: https://dblp.uni-trier.de/db/conf/kdd/index.html

o   ICDM: https://dblp1.uni-trier.de/db/conf/icdm/index.html

o   SIGMOD: http://dblp.uni-trier.de/db/conf/sigmod/

o   VLDB: http://www.vldb.org/pvldb/, or http://dblp.uni-trier.de/db/journals/pvldb/index.html

o   ICDE: http://ieeexplore.ieee.org/xpl/conhome.jsp?punumber=1000178, or http://dblp.uni-trier.de/db/conf/icde/

o   Datasets and Source Code

     Spatial data sets and index source code: http://chorochronos.datastories.org/

     Road network and stream data: https://www.cs.utah.edu/~lifeifei/datasets.html

     DBpedia RDF data: http://www.dbpedia.org

     Freebase RDF data: https://developers.google.com/freebase/

     YAGO1, YAGO2s, YAGO3 RDF data: https://www.mpi-inf.mpg.de/departments/databases-and-information-systems/research/yago-naga/yago/archive/ (YAGO2 paper: https://people.mpi-inf.mpg.de/~kberberi/publications/2010-mpii-tra.pdf)

o   Apache Hadoop: http://hadoop.apache.org/

o   Amazon AWS: https://aws.amazon.com/

o   Tutorial: https://www.lynda.com/ (Sign in with the organization portal)

 

 


 

Catalog Description

Concepts and techniques of data mining. Data mining is the process of discovering information from large databases. This course takes a database perspective on data mining and covers topics including association rule mining, clustering, classification, and web mining. It also covers the basics of some important theoretical foundations for data mining, including linear regression, Bayesian inference, information theory, and Markov chain random walks.

Students are expected to do a research project on new problems or solutions and write reports. Students will also give presentations to the class to demonstrate their outcomes. It is also expected that the resulting reports can be extended to data mining conference/journal papers.

 


 

Tentative Schedule

Week | Topic | Notes
Week 2 (Aug. 26) | Introduction | Please form study groups, each with 3-4 members, and send your IDs, names, and emails to me (xlian@kent.edu); Due on Sept. 4
Week 2 (Aug. 28) | |
Week 3 (Sept. 2) | -- | Labor Day; No classes
Week 3 (Sept. 4) | Input: Data Basics | Homework 1 (Due on Sept. 18)
Week 4 (Sept. 9) | |
Week 4 (Sept. 11) | Output: Knowledge |
Week 5 (Sept. 16) | Frequent Itemset Mining |
Week 5 (Sept. 18) | | Homework 2 (Due on Oct. 2)
Week 6 (Sept. 23) | Association Rule Mining |
Week 6 (Sept. 25) | Classification (1) |
Week 7 (Sept. 30) | |
Week 7 (Oct. 2) | |
Week 8 (Oct. 7) | Classification (2) |
Week 8 (Oct. 9) | Project Q/A |
Week 9 (Oct. 14) | Clustering (1) |
Week 9 (Oct. 16) | Project Q/A | Project Report (template); Homework 3 (Due on Oct. 30)
Week 10 (Oct. 21) | | Submission of Sections 1-4 of Project Report (Due on Oct. 30)
Week 10 (Oct. 23) | Clustering (2) |
Week 11 (Oct. 28) | |
Week 11 (Oct. 30) | | Last Day to Withdraw: 10/30/2019
Week 12 (Nov. 4) | Anomaly/Outlier |
Week 12 (Nov. 6) | | Homework 4 (Due on Nov. 20)
Week 13 (Nov. 11) | Project Q/A | Veterans Day; No class
Week 13 (Nov. 13) | Project Q/A |
Week 14 (Nov. 18) | |
Week 14 (Nov. 20) | Project Q/A | Homework 5 (Due on Dec. 4)
Week 15 (Nov. 25) | Project Q/A |
Week 15 (Nov. 27) | -- | Nov. 27 - Dec. 1, 2019, Thanksgiving Break; No classes
Week 16 (Dec. 2) | Presentations & Demos for Projects: Group #1, Group #2, Group #3 | Course Evaluation
Week 16 (Dec. 4) | Presentations & Demos for Projects: Group #5, Group #6 | Preparation for Project Reports; Deadline for submitting the project report (Hard deadline: Dec. 5; only one member of each group submits to the Blackboard the project report, source code, data sets, presentation slides, and demos in a single zip package)
Week 17 (Dec. 9-15) | No Final Exam |

 

 

Academic calendar: https://www.kent.edu/academic-calendar

Final exam schedule: https://www.kent.edu/registrar/fall-final-exam-schedule

NOTE: Presentation dates and deadlines are tentative. Exact dates will be announced in class!!!


Scoring and Grading

5% - Attendance & Questions

50% - 5 Homeworks (10 points each)

45% - Research Projects & Presentations

o   Research project report (including introduction, related works, problem definition, solutions, experiments, and conclusions) (30%)

o   Presentation and demonstration for the proposed research project (15%)

5% - Bonus Points, rated by other team members

10% - (Optional) Bonus for presenting research papers

A = 90 or higher

B = 80 - 89

C = 70 - 79

D = 60 - 69

F = <60
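As a rough illustration of how the weights and letter-grade cutoffs above combine, the following Python sketch computes a course score and its letter grade. It rests on several assumptions the syllabus does not state: each component is given as the fraction earned (0 to 1), bonus points are simply added on top of the total, and scores are neither capped nor rounded (so 89.9 maps to B). The function and parameter names are hypothetical.

```python
def final_score(attendance, homework, report, presentation,
                peer_bonus=0.0, paper_bonus=0.0):
    """Return the weighted course score in points.

    Each argument is the fraction (0.0 to 1.0) earned on that component.
    Assumption: bonuses are added directly to the total, with no cap.
    """
    parts = (attendance, homework, report, presentation, peer_bonus, paper_bonus)
    if any(not 0.0 <= p <= 1.0 for p in parts):
        raise ValueError("each component must be a fraction between 0 and 1")
    return (5 * attendance        # Attendance & Questions
            + 50 * homework       # 5 homework assignments
            + 30 * report         # research project report
            + 15 * presentation   # presentation and demonstration
            + 5 * peer_bonus      # bonus rated by other team members
            + 10 * paper_bonus)   # optional bonus for presenting papers


def letter_grade(score):
    """Map a numeric score to a letter grade using the listed cutoffs.

    Assumption: no rounding, so anything below 90 is at most a B.
    """
    if score >= 90:
        return "A"
    if score >= 80:
        return "B"
    if score >= 70:
        return "C"
    if score >= 60:
        return "D"
    return "F"


if __name__ == "__main__":
    score = final_score(attendance=1.0, homework=0.9, report=0.8, presentation=1.0)
    print(score, letter_grade(score))
```

How bonuses and borderline scores are actually treated is at the instructor's discretion; the sketch only shows the arithmetic implied by the percentages.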

 


 

Guidelines for Surveys/Papers/Projects

 

All surveys/papers/projects will be submitted electronically only. Instructions are given separately.

 

     Assignments must be submitted to Blackboard by the due date.

     An assignment or project report turned in within two weeks after the due date is considered late and loses part of its grade: 10% if submitted during the first week, and a further 20% (30% in total) if submitted during the second week.

     No assignment will be accepted for grading more than two weeks after the due date.

     Late submission requires the prior consent of the instructor.
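The late-submission rules above can be sketched as a small Python function. This is only one reading of the policy, under stated assumptions: lateness is counted in whole calendar days, 1-7 days late costs 10%, 8-14 days late costs 30%, anything later scores zero, and a late submission without the instructor's prior consent also scores zero. The function and parameter names are hypothetical.

```python
from datetime import date


def late_adjusted_score(raw_score, due, submitted, has_consent=True):
    """Apply the late penalty to a raw score.

    Assumptions (not stated in the syllabus): lateness is measured in
    whole calendar days, and a late submission without prior consent
    receives no credit.
    """
    days_late = (submitted - due).days
    if days_late <= 0:
        return raw_score              # on time: no penalty
    if not has_consent or days_late > 14:
        return 0.0                    # not accepted for grading
    penalty = 0.10 if days_late <= 7 else 0.30
    return raw_score * (1 - penalty)


if __name__ == "__main__":
    due = date(2019, 9, 18)           # Homework 1 due date from the schedule
    for submitted in (date(2019, 9, 18), date(2019, 9, 20),
                      date(2019, 9, 30), date(2019, 10, 3)):
        print(submitted, late_adjusted_score(10, due, submitted))
```

Exact cutoffs (for example, whether a submission a few hours late counts as a day late) should be confirmed with the instructor.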


Lecture Attendance Policy

Attendance in the lecture is mandatory. Students are expected to attend lectures, study the text, and contribute to discussions. You need to write your name on attendance sheets throughout the course, so please attend every lecture.

Students are expected to attend all scheduled classes and may be dropped from the course for excessive absences. Legitimate reasons for an "excused" absence include, but are not limited to, illness and injury, disability-related concerns, military service, death in the immediate family, religious observance, academic field trips, participation in an approved concert or athletic event, and direct participation in university disciplinary hearings.

Even though any absence can potentially interfere with the planned development of a course, and the student bears the responsibility for fulfilling all course requirements in a timely and responsible manner, instructors will, without prejudice, provide students returning to class after a legitimate absence with appropriate assistance and counsel about completing missed assignments and class material. Neither academic departments nor individual faculty members are required to waive essential or fundamental academic requirements of a course to accommodate student absences. However, each circumstance will be reviewed on a case-by-case basis.

For more details, please refer to University policy 3-01.2: http://www.kent.edu/policyreg/administrative-policy-regarding-class-attendance-and-class-absence.


Make-up Presentation Policy

No make-up presentation will be given except for university-sanctioned excused absences. If you miss a presentation (for a good reason), it is your responsibility to contact me before the presentation, or as soon after the presentation as possible.


Academic Dishonesty Policy

The University expects a student to maintain a high standard of individual honor in his/her scholastic work. Unless otherwise required, each student is expected to complete his or her assignment individually and independently (even within a team, the workload should be divided among team members so that each part is completed individually). Although studying together is encouraged, the work handed in for grading by each student is expected to be his or her own. Any form of academic dishonesty is strictly forbidden and will be punished to the maximum extent. Copying an assignment from another student (or team) in this class, or obtaining a solution from some other source, will lead to automatic failure of this course and to disciplinary action. Allowing another student to copy one's work will be treated as an act of academic dishonesty, leading to the same penalty as copying.

University policy 3-01.8 deals with the problem of academic dishonesty, cheating, and plagiarism. None of these will be tolerated in this class. The sanctions provided in this policy will be used to deal with any violations. If you have any questions, please read the policy at http://www.kent.edu/policyreg/administrative-policy-regarding-student-cheating-and-plagiarism and/or ask.


Students with Disabilities

University policy 3-01.3 requires that students with disabilities be provided reasonable accommodations to ensure their equal access to course content. If you have a documented disability and require accommodations, please contact the instructor at the beginning of the semester to make arrangements for necessary classroom adjustments. Please note, you must first verify your eligibility for these through Student Accessibility Services (contact 330-672-3391 or visit www.kent.edu/sas for more information on registration procedures).


Statements for the Course

This course may be used to satisfy the University Diversity requirement. Diversity courses provide opportunities for students to learn about such matters as the history, culture, values and notable achievements of people other than those of their own national origin, ethnicity, religion, sexual orientation, age, gender, physical and mental ability, and social class. Diversity courses also provide opportunities to examine problems and issues that may arise from differences, and opportunities to learn how to deal constructively with them.

 

This course may be used to satisfy the Writing Intensive Course (WIC) requirement. The purpose of a writing-intensive course is to assist students in becoming effective writers within their major discipline. A WIC requires a substantial amount of writing, provides opportunities for guided revision, and focuses on writing forms and standards used in the professional life of the discipline.

 

This course may be used to fulfill the university's Experiential Learning Requirement (ELR) which provides students with the opportunity to initiate lifelong learning through the development and application of academic knowledge and skills in new or different settings. Experiential learning can occur through civic engagement, creative and artistic activities, practical experiences, research, and study abroad/away.

 


 

Disclaimer

The instructor reserves the right to alter this syllabus as necessary.