CSCI 307
    Data Mining

    College of the Holy Cross, Fall 2023




    Instructor
    Farhad Mohsin


    Lecture times
    TuTh 9:30AM - 10:45AM

    Location
    Swords 328


    Office hours
    Office hours location: Swords 339

    • Wed: 12-1 PM
    • Thu: 2-3:30 PM

    Canvas
    We'll use Canvas for assignment submission, sharing lecture notes, etc. All written reports for assignments must be submitted in PDF format. I would prefer a digital file (written in Word or LaTeX); however, it is fine to include pictures of handwritten work, as long as it is legible. Assignments with coding components must be done in Jupyter Notebook and then exported as PDFs. We will go through the procedure for this in class.


    Course description
    This course provides an introduction to Data Mining and will examine data techniques for the discovery, interpretation and visualization of patterns in large collections of data. Topics covered in this course include data mining methods such as classification, rule-based learning, decision trees, association rules, and data visualization. The work discussed originates in the fields of artificial intelligence, machine learning, statistical data analysis, data visualization, databases, and information retrieval.


    Prerequisites
    The prerequisite for this class is CSCI 132, Data Structures.
    Also note that you'll have assignments that require programming in Python. The first couple of lectures will help you brush up on Python syntax and introduce (possibly new) Python libraries that are common in data mining.


    Textbook
    Data Mining and Machine Learning: Fundamental Concepts and Algorithms, Second Edition

    by Mohammed J. Zaki and Wagner Meira, Jr
    Cambridge University Press, March 2020
    ISBN: 978-1108473989

    The textbook covers fundamental algorithms in data mining and machine learning. It will be referenced especially for the theoretical concepts related to the course.

    You are not required to buy the textbook, as a free online version is available at https://dataminingbook.info/book_html/.


    Exams
    Midterm:
    There will be one midterm exam held on the following date:

      Midterm Exam: (Tentative) Oct 24

    Final exam:
    A cumulative final exam will be held during finals week as scheduled.

      Final Exam: TBD

    The class might have some in-class pop quizzes, mostly focusing on conceptual questions.


    Homework Assignments
    There will be several homework assignments during the semester. These problem sets will include questions that require written answers about concepts, and also problems that involve use of the Python libraries introduced in class.


    Term Project
    Students will work in groups of two to complete a term project. Details regarding the project will be discussed on the first day of class.


    Grading

    • Homework: 25%
    • Term Project: 25%
    • Midterm exam: 20%
    • Final exam: 30%
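
    As an illustration of how the weights above combine, here is a minimal sketch in Python. It assumes each component is scored on a 0-100 scale and uses only the four components listed; the names WEIGHTS and course_grade are hypothetical, and this is not an official grade calculator.

```python
# Weights in percent, taken from the grading list above.
WEIGHTS = {"homework": 25, "term_project": 25, "midterm": 20, "final": 30}


def course_grade(scores: dict) -> float:
    """Weighted average of component scores (assumption: each is on a 0-100 scale)."""
    total = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    return total / 100


if __name__ == "__main__":
    example = {"homework": 90, "term_project": 80, "midterm": 70, "final": 100}
    print(course_grade(example))
```

    The weights sum to 100, so a perfect score in every component gives a course grade of 100.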


    Late Policy
    Assignments are due before the beginning of class on the assigned due date. Late assignments will be marked down 10% for each day late. That is, assignments turned in up to 24 hours after they are due will be marked down 10%, assignments turned in between 24 and 48 hours after the due date will be marked down 20%, and so on. The penalty is determined by the time at which the assignment is physically handed to the instructor or submitted online (whichever is the submission method for that particular assignment). Late work will not be accepted after the graded assignment has been returned to the class.
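
    The per-day markdown described above can be sketched as a small Python function. This is only an illustration under stated assumptions: any started 24-hour period counts as a full day late, a submission exactly 24 hours late falls in the 10% bracket, and the penalty is capped at 100%. The policy text does not spell out these details, and the function name late_penalty_percent is hypothetical.

```python
import math


def late_penalty_percent(hours_late: float, per_day: int = 10) -> int:
    """Percentage marked down for a submission `hours_late` hours past the deadline."""
    if hours_late <= 0:
        return 0  # submitted on time: no penalty
    # Assumption: any started 24-hour period counts as one full day late.
    days_late = math.ceil(hours_late / 24)
    # Assumption: the penalty cannot exceed 100%.
    return min(100, per_day * days_late)


if __name__ == "__main__":
    for hours in (0, 0.5, 30, 72.5):
        print(hours, late_penalty_percent(hours))
```

    For example, under these assumptions a submission 30 hours late falls in the 24-to-48-hour bracket and is marked down 20%.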


    Collaboration Policy
    You are allowed to discuss strategies for solving homework problems with other students; however, any work you turn in must be your own (i.e., you may not simply copy another student's answers and turn them in as your own).

    For the group project, you will work together, but the contribution of each student should be clearly stated in the final project report. You must clearly indicate the names of any students you work with on each assignment.

    You may consult public literature (books, articles, etc.) for information, but you must cite each source of ideas you adopt.

    Please familiarize yourself with the Math and CS Department's policy on Academic Integrity as well as the College's Academic Integrity Policy.


    Excused Absence Policy
    Class attendance is expected and will be counted toward the participation part of the grade. If you have a confirmed reason why you cannot attend an exam at the day or time it is given, you must contact your instructor well ahead of time to arrange to take it at another time. Please see the College Policy on excused absences.


    Reasonable Accommodations and Accessibility Services

    The instructor is committed to providing students with disabilities equal access to the educational opportunities associated with this course. For details or to request accommodation, please refer to College procedures on Requests for Reasonable Accommodations and the Office of Accessibility Services.


    Class Recordings

    Consistent with applicable federal and state law, this course may be video/audio recorded as an accommodation only with permission from the Office of Accessibility Services.



    Last modified: Sep 4, 2023