CS 43105 Data Mining Techniques
Fall 2019
Instructor: Xiang Lian
Office Location: Mathematics and Computer Science Building, Room 264
Office Phone Number: (330) 672-9063
Web: http://www.cs.kent.edu/~xlian/index.html
Email: xlian@kent.edu
Course: Data Mining Techniques
CRN: 12631
Prerequisites: CS 33007 Introduction to Database System Design
Time: 12:30pm - 1:45pm, MW
Classroom Location: Mathematics and Computer Science Building (MSB), 115
Course Webpage: http://www.cs.kent.edu/~xlian/course_archive/2019Fall_CS43105.html
Instructor's Office Hours: 10:00am - 12:30pm, TR; or by appointment
Graduate Assistant: N/A
Office: N/A
E-mail: N/A
Phone: N/A
TA's Office Hours: N/A
The official
registration deadline for this course is 08/28/2019. University policy requires all
students to be officially registered in each class they are attending. Students
who are not officially registered for a course by published deadlines should
not be attending classes and will not receive credit or a grade for the course.
Each student must confirm enrollment by checking his/her class schedule (using
Student Tools in FlashLine)
prior to the deadline indicated. Registration errors must be corrected prior to
the deadline.
http://www.kent.edu/registrar/calendars-deadlines
For registration deadlines, enter the requested information
for a Detailed Class Search from the Schedule of Classes Search found at:
https://keys.kent.edu:44220/ePROD/bwlkffcs.P_AdvUnsecureCrseSearch?term_in=201680
After locating your course/section, click on the Registration
Deadlines link on the far right side of the listing.
Last day to withdraw: 10/30/2019
Textbook
Ian H. Witten, Eibe Frank, and Mark A. Hall. Data Mining:
Practical Machine Learning Tools and Techniques: 3rd Edition. ISBN-13:
9780123748560,
Publisher: Elsevier Science, Publication date: 1/20/2011.
Reference Books
Jiawei Han, Micheline Kamber, and Jian Pei. Data Mining: Concepts and Techniques. The Morgan
Kaufmann Series in Data Management Systems, Morgan Kaufmann Publishers, July
2011. ISBN 978-0123814791. https://hanj.cs.illinois.edu/bk3/bk3_slidesindex.htm
Mohammed J. Zaki and Wagner Meira. Data Mining and Analysis: Fundamental Concepts and
Algorithms. 1st Edition. Cambridge University Press, May 12, 2014. ISBN-13:
978-0521766333.
Resources of Reading Materials
Online resources of research papers, including conference
papers (e.g., SIGKDD, ICDM, SIGMOD, PVLDB, ICDE, etc.).
o SIGKDD: https://dblp.uni-trier.de/db/conf/kdd/index.html
o ICDM: https://dblp1.uni-trier.de/db/conf/icdm/index.html
o SIGMOD: http://dblp.uni-trier.de/db/conf/sigmod/
o VLDB: http://www.vldb.org/pvldb/, or http://dblp.uni-trier.de/db/journals/pvldb/index.html
o ICDE: http://ieeexplore.ieee.org/xpl/conhome.jsp?punumber=1000178, or http://dblp.uni-trier.de/db/conf/icde/
o Datasets and Source Code
  ❖ Spatial data sets and index source code: http://chorochronos.datastories.org/
  ❖ Road network and stream data: https://www.cs.utah.edu/~lifeifei/datasets.html
  ❖ DBpedia RDF data: http://www.dbpedia.org
  ❖ Freebase RDF data: https://developers.google.com/freebase/
  ❖ YAGO1, YAGO2s, YAGO3 RDF data: https://www.mpi-inf.mpg.de/departments/databases-and-information-systems/research/yago-naga/yago/archive/ (YAGO2 paper: https://people.mpi-inf.mpg.de/~kberberi/publications/2010-mpii-tra.pdf)
o Apache Hadoop: http://hadoop.apache.org/
o Amazon AWS: https://aws.amazon.com/
o Tutorial: https://www.lynda.com/ (Sign in with the organization portal)
Catalog Description
Concepts and techniques of data
mining. Data mining is a process of discovering information from a set of large
databases. This course takes a database perspective on data mining, covering a
set of interesting topics, including association rule mining, clustering,
classification, web mining, etc. It covers the basics of some important
theoretical foundations for data mining, including linear regression, Bayesian
inference, information theory, and Markov chain random walks.
Students are expected to do a research
project on new problems or solutions and write reports. Students will also give
presentations to the class to demonstrate their outcomes. It is also expected
that the resulting reports can be extended to data mining conference/journal
papers.
Tentative Schedule
Week | Topic | Notes |
Week 2 (Aug. 26) | Please form study groups, each with 3-4 members, and send your IDs, names, and emails to me (xlian@kent.edu); Due on Sept. 4 | |
Week 2 (Aug. 28) | | |
Week 3 (Sept. 2) | -- | Labor Day; No classes |
Week 3 (Sept. 4) | Homework 1 (Due on Sept. 18) | |
Week 4 (Sept. 9) | | |
Week 4 (Sept. 11) | | |
Week 5 (Sept. 16) | | |
Week 5 (Sept. 18) | | Homework 2 (Due on Oct. 2) |
Week 6 (Sept. 23) | | |
Week 6 (Sept. 25) | | |
Week 7 (Sept. 30) | | |
Week 7 (Oct. 2) | | |
Week 8 (Oct. 7) | | |
Week 8 (Oct. 9) | Project Q/A | |
Week 9 (Oct. 14) | | |
Week 9 (Oct. 16) | Project Q/A | Project Report (template); Homework 3 (Due on Oct. 30) |
Week 10 (Oct. 21) | | Submission of Sections 1-4 of Project Report (Due on Oct. 30) |
Week 10 (Oct. 23) | | |
Week 11 (Oct. 28) | | |
Week 11 (Oct. 30) | | Last Day to Withdraw: 10/30/2019 |
Week 12 (Nov. 4) | | |
Week 12 (Nov. 6) | | Homework 4 (Due on Nov. 20) |
Week 13 (Nov. 11) | Project Q/A | Veterans Day; No class |
Week 13 (Nov. 13) | Project Q/A | |
Week 14 (Nov. 18) | | |
Week 14 (Nov. 20) | Project Q/A | Homework 5 (Due on Dec. 4) |
Week 15 (Nov. 25) | Project Q/A | |
Week 15 (Nov. 27) | -- | Nov. 27 - Dec. 1, 2019, Thanksgiving Break; No classes |
Week 16 (Dec. 2) | Presentations & Demos for Projects Group #1: Group #2: Group #3: | Course Evaluation |
Week 16 (Dec. 4) | Presentations & Demos for Projects Group #5: Group #6: Preparation for Project Reports | Deadline for submitting the project report (Hard deadline: Dec. 5; only one member of each group submits to the Blackboard the project report, source code, data sets, presentation slides, and demos in a single zip package) |
Week 17 (Dec. 9-15) | No Final Exam | |
Academic calendar: https://www.kent.edu/academic-calendar
Final exam schedule: https://www.kent.edu/registrar/fall-final-exam-schedule
NOTE: Presentation dates and
deadlines are tentative. Exact dates will be announced in class!!!
5% - Attendance & Questions
50% - 5 Homeworks (10 points each)
45% - Research Projects & Presentations
o Research project report (including introduction, related works, problem definition, solutions, experiments, and conclusions) (30%)
o Presentation and demonstration for the proposed research project (15%)
5% - Bonus Points, rated by other team members
10% - (Optional) Bonus for presenting research papers
A = 90 or higher
B = 80 - 89
C = 70 - 79
D = 60 - 69
F = <60
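As a rough illustration of how the weights and letter cutoffs above combine, here is a minimal Python sketch. It assumes that each component is scored directly in points (so the maximums equal the listed percentages), that bonus points are simply added to the total, and that fractional totals fall into the band of the nearest lower cutoff; none of these details are stated in the syllabus, and the function names are hypothetical.

```python
# Maximum points per component, taken from the grading breakdown above.
MAX_ATTENDANCE = 5
MAX_HOMEWORK_EACH = 10   # 5 homeworks x 10 points = 50
MAX_REPORT = 30
MAX_PRESENTATION = 15
MAX_PEER_BONUS = 5
MAX_PAPER_BONUS = 10


def total_points(attendance, homeworks, report, presentation,
                 peer_bonus=0, paper_bonus=0):
    """Sum the points earned per component (assumed to be simply additive)."""
    if len(homeworks) != 5:
        raise ValueError("expected scores for exactly 5 homeworks")
    capped = [
        min(attendance, MAX_ATTENDANCE),
        sum(min(score, MAX_HOMEWORK_EACH) for score in homeworks),
        min(report, MAX_REPORT),
        min(presentation, MAX_PRESENTATION),
        min(peer_bonus, MAX_PEER_BONUS),
        min(paper_bonus, MAX_PAPER_BONUS),
    ]
    return sum(capped)


def letter_grade(total):
    """Map a point total to a letter using the cutoffs listed above."""
    if total >= 90:
        return "A"
    if total >= 80:
        return "B"
    if total >= 70:
        return "C"
    if total >= 60:
        return "D"
    return "F"


if __name__ == "__main__":
    points = total_points(attendance=5, homeworks=[10, 9, 8, 10, 9],
                          report=25, presentation=12, peer_bonus=3)
    print(points, letter_grade(points))
```

The actual computation used for final grades is at the instructor's discretion; this sketch only shows one plausible reading of the breakdown.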
Guidelines for Surveys/Papers/Projects
All surveys/papers/projects will be
submitted electronically only. Instructions are given separately.
➢ Assignments must be submitted to Blackboard by the due date.
➢ An assignment or project report turned in within two weeks after the due date is considered late and loses up to 30% of its grade: 10% if submitted during the first week, and 20% more (30% in total) if submitted during the second week.
➢ No assignment will be accepted for grading more than two weeks after the due date.
➢ Late submission requires the prior consent of the instructor.
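The late-submission rules above can be sketched in Python as follows. This is only one reading of the policy: it assumes lateness is counted in whole calendar days, that days 1-7 after the due date cost 10% and days 8-14 cost 30% in total, and that anything later is not accepted (represented here as a 100% deduction). The function name `late_penalty_percent` is hypothetical, and the instructor's prior consent is still required in every case.

```python
from datetime import date


def late_penalty_percent(due, submitted):
    """Return the percentage deducted, or 100 if the work is no longer accepted."""
    days_late = (submitted - due).days
    if days_late <= 0:
        return 0      # on time
    if days_late <= 7:
        return 10     # first week late
    if days_late <= 14:
        return 30     # second week late: 10% + 20% more
    return 100        # more than two weeks late: not graded


if __name__ == "__main__":
    due = date(2019, 9, 18)  # Homework 1 due date from the tentative schedule
    for submitted in (date(2019, 9, 18), date(2019, 9, 20),
                      date(2019, 9, 30), date(2019, 10, 3)):
        print(submitted, late_penalty_percent(due, submitted))
```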
Attendance in the lecture is
mandatory. Students are expected to attend lectures, study the text, and
contribute to discussions. You need to write your name on attendance sheets
throughout the course, so please attend every lecture.
Students are expected to attend all
scheduled classes and may be dropped from the course for excessive absences.
Legitimate reasons for an "excused" absence include, but are not
limited to, illness and injury, disability-related concerns, military service,
death in the immediate family, religious observance, academic field trips,
participation in an approved concert or athletic event, and direct
participation in university disciplinary hearings.
Even though any absence can
potentially interfere with the planned development of a course, and the student
bears the responsibility for fulfilling all course requirements in a timely and
responsible manner, instructors will, without prejudice, provide students
returning to class after a legitimate absence with appropriate assistance and
counsel about completing missed assignments and class material. Neither
academic departments nor individual faculty members are required to waive
essential or fundamental academic requirements of a course to accommodate
student absences. However, each circumstance will be reviewed on a case-by-case
basis.
For more details, please refer to
University policy 3-01.2: http://www.kent.edu/policyreg/administrative-policy-regarding-class-attendance-and-class-absence.
No make-up
presentation will be given except for university-sanctioned excused absences.
If you miss a presentation (for a good reason), it is your responsibility to
contact me before the presentation, or as soon after the presentation as possible.
The University expects a student to
maintain a high standard of individual honor in his/her scholastic work. Unless
otherwise required, each student is expected to complete his or her assignment
individually and independently (even within a team, the workload should be
distributed among team members to complete individually). Although studying
together is encouraged, the work handed in for grading by each student is
expected to be his or her own. Any form of academic dishonesty is strictly
forbidden and will be punished to the maximum extent. Copying an assignment
from another student (team) in this class or obtaining a solution from some
other source will lead to an automatic failure for this course and to
disciplinary action. Allowing another student to copy one's work will be
treated as an act of academic dishonesty, leading to the same penalty as
copying.
University policy 3-01.8 deals with
the problem of academic dishonesty, cheating, and plagiarism. None of these
will be tolerated in this class. The sanctions provided in this policy will be
used to deal with any violations. If you have any questions, please read the
policy at http://www.kent.edu/policyreg/administrative-policy-regarding-student-cheating-and-plagiarism and/or ask.
University policy 3-01.3 requires
that students with disabilities be provided reasonable accommodations to ensure
their equal access to course content. If you have a documented disability and
require accommodations, please contact the instructor at the beginning of the
semester to make arrangements for necessary classroom adjustments. Please note,
you must first verify your eligibility for these through Student Accessibility Services (contact 330-672-3391 or visit www.kent.edu/sas for more information on registration procedures).
This course may be used to satisfy
the University Diversity requirement. Diversity courses provide opportunities
for students to learn about such matters as the history, culture, values and
notable achievements of people other than those of their own national origin,
ethnicity, religion, sexual orientation, age, gender, physical and mental
ability, and social class. Diversity courses also provide opportunities to
examine problems and issues that may arise from differences, and opportunities
to learn how to deal constructively with them.
This course may be used to satisfy
the Writing Intensive Course (WIC) requirement. The purpose of a
writing-intensive course is to assist students in becoming effective writers
within their major discipline. A WIC requires a substantial amount of writing,
provides opportunities for guided revision, and focuses on writing forms and
standards used in the professional life of the discipline.
This course may be used to fulfill
the university's Experiential Learning Requirement (ELR) which provides
students with the opportunity to initiate lifelong learning through the
development and application of academic knowledge and skills in new or
different settings. Experiential learning can occur through civic engagement,
creative and artistic activities, practical experiences, research, and study
abroad/away.
The
instructor reserves the right to alter this syllabus as necessary.