Holy Cross Mathematics and Computer Science
Mathematics 375 -- Probability and Statistics I
Syllabus Fall 2005
Professor: John Little
Office: Swords 335
Office Phone: 793-2274
Office Hours: MWF 1 - 3pm, TR 11 am - noon, and by appointment
Email: little@mathcs.holycross.edu (preferred), or jlittle@holycross.edu
Course Homepage: http://mathcs.holycross.edu/~little/ProbStat0506/PS1.html
Course Description
Statistics is the branch of the mathematical sciences
that deals with the collection, analysis, and
interpretation of data. It is used very widely today in the physical,
life, social, and management sciences for making decisions and predictions
in the presence of uncertainty. Some typical examples are:
- Demonstrating evidence for an empirical relationship
between different quantities in a physics experiment, where measurement
errors or fundamental elements of randomness (e.g. quantum physical
effects) may be present,
- Analyzing the results of treating patients with a new drug
in a clinical trial, where the reaction of an individual patient
depends on too many different types of factors to be readily predictable,
- Predicting the outcome of an election by sampling voter preferences
(see below for a fuller discussion of this case),
- Estimating how to price an insurance policy based on likely
risks to the persons covered. (This is a large part of what
actuaries do in insurance companies; Probability and
Statistics is the subject of one of the first in the series of
exams actuarial trainees must pass to become qualified.)
Indeed, statistical reasoning is probably the most common use
of mathematics in real-world applications at present.
On the most basic level, "descriptive statistics" -- quantities
such as the mean (average), variance, and standard deviation
of a collection of numbers (e.g. experimental measurements,
blood pressure readings from patients, etc.) plus other types of summary
information about a data set -- describe the "shape" or
distribution of the data (for example, how "spread out" the
numbers are around their "middle point"). These descriptive
statistics are also used as the basis for making inferences
or predictions about the overall distribution of a quantity
based on a random sample. Some of you may have studied this
aspect of statistics in previous high school or college
coursework in mathematics, economics, sociology, psychology, or other
areas.
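As a small illustration of the descriptive statistics named above, here is a minimal Python sketch. The blood pressure readings are invented for the example, and the population versions of the variance and standard deviation (dividing by n) are an assumption made here; the sample versions divide by n - 1 instead.

```python
import statistics

# Hypothetical systolic blood pressure readings (mmHg), invented for illustration.
readings = [118, 125, 130, 122, 135, 128, 120, 126]

mean = statistics.mean(readings)            # the "middle point" of the data
variance = statistics.pvariance(readings)   # average squared deviation from the mean
std_dev = statistics.pstdev(readings)       # square root of the variance

print(f"mean = {mean:.3f}, variance = {variance:.3f}, std dev = {std_dev:.3f}")
```

The standard deviation is in the same units as the data, which is why it is the usual measure of how "spread out" the numbers are.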
We will use these ideas too, but we will also go considerably
farther than you probably have done in other courses to study
the theoretical basis for comparing patterns observed
in samples from a population to the patterns of the whole population.
In other words, in this class we will learn not only how to apply
statistical tests, but also how and why statistical tests work
(with proofs), and how statistical tests for new situations
might be designed. The mathematical tools and underpinnings here
are provided by the theory of probability.
The following is a brief preliminary discussion of how probability
enters into these questions.
If our data included every possible
measurement, then appropriate descriptive statistics applied
to the data would presumably generate the information we seek. But
of course that kind of "completeness" is virtually never
available. Think of the election polling example. Here the statistic
of interest is most often simply the fraction of the total
population of voters favoring a particular candidate. Prior to
the actual election, it would be too difficult and costly to ask
every possible voter which candidate they prefer to determine
that fraction. Moreover, if people can choose whether or
not to vote, it might not be possible ahead of time to determine exactly
who will cast ballots in a given election.
So instead, pollsters select a sample (that is, a subset)
of the whole population of voters and determine the preferences
of those in the sample. The goal is to make inferences about the
preferences of the whole population from the preferences of those
in the sample. If the sample were perfectly representative of the
whole population, the results would definitely be correct. But of
course, that is also virtually never true. There will
always be some amount of randomness involved in the selection
of the sample. That means that the fraction of voters favoring
candidate X in the sample will most likely differ from the true
fraction to some extent. We would like to be able to estimate
the error -- for instance, to say that 55% of the voters in our
sample favored X, and that the fraction among all voters is the
same to within plus or minus 3 percentage points. Then we must
address questions such as: How large a sample do we need so that
we can be reasonably certain that the error is that small? What
does "reasonably certain" mean, and can we quantify it? And so forth.
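To give a feel for the sample-size question, the following Python sketch uses one standard textbook approximation, not a result stated in this syllabus: the normal approximation to the sampling distribution of a proportion. The function name `sample_size`, the 95% confidence level, and the worst-case value p = 0.5 are all assumptions chosen for illustration.

```python
import math
from statistics import NormalDist

def sample_size(margin, confidence=0.95, p=0.5):
    """Approximate sample size so that a sample proportion lies within
    `margin` of the true proportion with probability `confidence`.

    Assumes the normal approximation to the binomial; p = 0.5 is the
    worst case because it maximizes p * (1 - p).
    """
    # z is the standard normal quantile leaving (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil(z * z * p * (1 - p) / margin ** 2)

n = sample_size(0.03)  # margin of plus or minus 3 percentage points
print(n)
```

Under these assumptions, roughly a thousand respondents suffice for a 3-point margin, and the required size grows like 1 / margin^2: cutting the margin to a third multiplies the sample size by about nine.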
The basis for answering this type of question is the theory of
probability, so we will start there this semester. We
will study:
- Sample spaces, events, the concepts of density and distribution
functions in the discrete and continuous cases
- Discrete and continuous random variables with a given
density function, expected value, mean, variance, etc.
- Binomial, geometric, and Poisson discrete random variables
- Normal, Gamma, Beta, and other continuous random variables
- Functions of random variables and multivariable density and
distribution functions
- The Central Limit Theorem (CLT) (The form we will prove
states that if we sample repeatedly and independently
from any suitable distribution, then the distribution of the
suitably standardized sample mean tends to a normal distribution
as the sample size goes to infinity. This is one justification
for the central role of normal distributions in statistics.)
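The Central Limit Theorem in the last item can be explored numerically. The sketch below is only a simulation, not a proof. It draws repeated samples from an exponential distribution with mean 1 (a deliberately non-normal choice), standardizes each sample mean, and checks what fraction falls within 1.96 standard errors of the true mean. The distribution, the sample size of 50, the 5000 trials, and the function name are assumptions made for illustration.

```python
import math
import random

def standardized_sample_means(n, trials, seed=0):
    """Return `trials` standardized means of samples of size n from Exponential(1)."""
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    mu, sigma = 1.0, 1.0        # mean and standard deviation of Exponential(1)
    results = []
    for _ in range(trials):
        xbar = sum(rng.expovariate(1.0) for _ in range(n)) / n
        # Subtract the true mean and divide by the standard error sigma / sqrt(n).
        results.append((xbar - mu) / (sigma / math.sqrt(n)))
    return results

z_values = standardized_sample_means(n=50, trials=5000)
fraction_within = sum(abs(z) <= 1.96 for z in z_values) / len(z_values)
print(f"fraction of standardized means within 1.96: {fraction_within:.3f}")
```

For a standard normal random variable the corresponding probability is about 0.95, so a simulated fraction close to that value is consistent with the theorem, even though the underlying exponential distribution is strongly skewed.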
The major tools here will be counting techniques for subsets,
permutations, and combinations, single- and multi-variable calculus,
and ideas about infinite series from Principles of Analysis.
This course is the first half of a full-year sequence; we will turn
to statistics per se next semester.
There is a week-by-week schedule at the end of this syllabus with
more information on the topics we will cover.
Text
The text for both semesters of the course is
Mathematical Statistics with Applications, 6th ed
by D. Wackerly, W. Mendenhall, and R. Scheaffer. We plan
to cover Chapters 1-6 and the section in Chapter 7 on
the Central Limit Theorem this semester and the rest of
Chapter 7 and Chapters 8-11 and 13 next semester.
Course Assignments and Grading
The assignments for the course will consist of:
- Two in-class midterm exams, each worth 20% of
the course grade. Tentative dates: Friday, October 7 and Friday,
November 18.
- Final Examination, worth
25% of the course grade. (Scheduled date:
Saturday, December 17 at 8:30am.)
- Weekly problem sets, worth 25% of the course
grade. Notes:
- Because of the large size of this class, in order for me to
return your work with constructive comments in a timely manner, it
may become necessary to grade only selected problems on each assignment.
If that happens, I will always try to select a representative sample of
computational and theoretical problems to be evaluated from that assignment.
But the selection will not be announced beforehand, and you
will be expected to do all of the problems in any case.
- I will put complete solutions of all assigned
problems on reserve in the Science Library after class on the date
the assignment is due. You may consult these and photocopy them
for your own use at any time if you wish.
- Because of the availability of these complete solutions,
because every effort will be made to return your graded problem
sets in a timely fashion, and for reasons of fairness,
no problem sets will be accepted for credit after the announced due date,
except in the case of a verified medical excuse. If you
are authorized to hand in a problem set late, I will ask you
to sign a statement that you have not consulted the reserve
solutions in preparing your work.
- Group reports from discussion class meetings,
together worth 10% of the course grade.
If you ever have a question about the grading policy, or about your
standing in the course, please feel free to consult with me.
Schedule
The following is an approximate schedule. Some rearrangement,
expansion, or contraction of topics may become necessary. I will announce
any changes in class, and on the course homepage.
Week | Dates | Class Topics | Reading (WMS)
---|---|---|---
1 | 8/31, 9/2 | Course introduction | Chapter 1, 2.1-2.3
2 | 9/5,7,9 | Sample spaces, events, probabilities | 2.4-2.6
3 | 9/12,14,16 | Conditional probabilities, independence | 2.7-2.9
4 | 9/19,21,23 | Discrete random variables, expected values | 2.10-3.3
5 | 9/26,28,30 | Binomial, Geometric and related random variables | 3.4-3.7
6 | 10/3,5 | Poisson processes, moment generating functions | 3.8-3.9
 | 10/7 | Exam 1 (Chapters 1, 2, and 3.1-3.7) |
 | 10/10 | No Class -- Columbus Day Break |
7 | 10/12,14 | Continuous Random Variables | 4.1-4.4
8 | 10/17,19,21 | Normal, Gamma distributions | 4.5-4.6
9 | 10/24,26,28 | More on continuous random variables | 4.7-4.10
10 | 10/31, 11/2,4 | Multivariate distributions, independence | 5.1-5.4
11 | 11/7,9,11 | Expected value, covariance | 5.5-5.9
12 | 11/14,16 | Functions of Random Variables | 6.1-6.3
 | 11/18 | Exam 2 (Chapters 4,5) |
13 | 11/21 | Transformations | 6.4
 | 11/23,25 | No Class -- Thanksgiving Break |
14 | 11/28,30, 12/2 | Moment generating functions, CLT | 6.5, 7.3-7.4
15 | 12/5 | Semester wrap-up |
The final examination for this class will be held on Saturday, December 17, at
8:30 am.
Departmental Statement on Academic Integrity
Why is academic integrity important?
All education is a cooperative enterprise between teachers and
students. This cooperation works well only when there is trust and
mutual respect between everyone involved.
One of our main aims as a department is to help students become
knowledgeable and sophisticated learners, able to think and work
both independently and in concert with their peers. Representing another
person's work as your own in any form (plagiarism or "cheating"),
and providing or receiving unauthorized assistance on assignments (collusion)
are lapses of academic integrity because they subvert the learning process
and show a fundamental lack of respect for the educational enterprise.
How does this apply to our courses?
You will encounter a variety of types of assignments and examination
formats in mathematics and computer science courses. For instance,
many problem sets in mathematics classes and laboratory assignments
in computer science courses are individual assignments.
While some faculty members
may allow or even encourage discussion among
students during work on problem sets, it is the expectation that the
solutions submitted by each student will be that student's own work,
written up in that student's own words. When consultation with other
students or sources other than the textbook occurs, students should
identify their co-workers, and/or cite their sources as they would for
other writing assignments. Some courses also make use of collaborative
assignments; part of the evaluation in that case may be a rating of each
individual's contribution to the group effort.
Some advanced classes may use take-home
examinations, in which case the ground rules will usually allow no
collaboration or consultation.
In many computer science classes, programming projects are
strictly individual assignments; the ground rules
do not allow any collaboration or consultation here either.
What are the responsibilities of faculty?
It is the responsibility of faculty in the department to
lay out the guidelines to be followed for specific assignments in
their classes as clearly and fully as possible, and to
offer clarification and advice concerning those guidelines
as needed as students work on those assignments.
The Department of Mathematics and Computer Science upholds the
College's policy on academic honesty.
We advise all students taking mathematics or computer science courses
to read the statement in the current College catalog carefully and
to familiarize themselves with the procedures which may be
applied when infractions are determined to have occurred.
What are the responsibilities of students?
A student's main responsibility is to follow the guidelines laid down
by the instructor of the course. If there is some point about the
expectations for an assignment that is not clear, the student is responsible
for seeking clarification. If such clarification is not immediately available,
students should err on the side of caution and follow the strictest possible
interpretation of the guidelines they have been given.
It is also a student's responsibility to protect his/her
own work to prevent unauthorized use of exam papers, problem solutions,
computer accounts and files, scratch paper, and any other materials used in
carrying out an assignment. We expect students to have the integrity to say
"no" to requests for assistance from other students when offering that
assistance would violate the guidelines for an assignment.
Specific Guidelines for this Course
Because of the large size of this class, examinations will be given
in class, and the other assignments will be weekly individual problem
sets. Some examinations may be given as open book and/or open notes
tests. No sharing of information of any form with other students will
be permitted during exams. On the problem sets, discussion of the
questions with other students in the class, and with me during office
hours is allowed, even encouraged. Consultation of other probability and
statistics texts in the library for ideas leading to a problem solution
will also be allowed. If you do take advantage of any of these
options, you will be required to state that fact in a "footnote"
accompanying the problem solution. Failure to follow this rule
will be treated as a violation of the College's Academic
Integrity policy.