
Holy Cross Mathematics and Computer Science
MATH 375-6, Probability and Statistics Maple Package
Downloading the package
The Maple package is contained in a single file called MSP.map.
These routines can be used on any computer with Maple 7 or later
installed. This includes the Sun lab machines in Swords 219, and
the PC lab machines in HA 408.
To get a copy for yourself, click on the link for the package
from the course homepage in your web browser. This will display
the Maple source code as text. From the FILE menu, select SAVE AS,
and supply the path and filename you want for your copy.
- If you are at one of the HA 408 PCs and plan to use the package there,
save the file to your campus network P: drive.
- If you are on a SunRay in Swords 219 and plan to use the package there,
save the file to your Sun network account.
Reading the package into Maple
Suppose you have saved the package in a file
called StatPac.txt in your top-level directory (folder) on
either the campus network or the department Sun network.
After you have logged on and launched Maple, execute a command
of the following form to make the procedures available:
- read "p:StatPac.txt"; on the campus network (HA 408)
- read "StatPac.txt"; on the Sun network (SW 219)
If you are on the Sun network and you haven't saved the package,
you can also load it directly from Prof. Little's web directory by entering:
- read "http://mathcs.holycross.edu/~little/ProbStat/MaplePackage/MSP.map";
Unfortunately, this does not appear to work from the PCs on the
campus Novell network.
Some relevant Maple information
Many of the procedures in the package take lists of numbers as input,
produce them as output, or both.
A Maple list is an ordered sequence of items, enclosed in square brackets
([ and ]), with the items separated by commas. For instance,
[2.3,2.4,5.6,1.2,0.9] is a Maple list with 5 entries.
A list can be treated as a single object, assigned to a variable
(to give the list a "name"), etc.
For instance the Maple command
XList:=[2.3,2.4,5.6,1.2,0.9];
assigns the list from before to the "name" XList. To
refer to that list afterward, you can just use the name XList.
The items within a list can be accessed using subscript notation.
For example, XList[3] is the third item in the list, the number
5.6.
The number of items in a list can be determined with the built-in
Maple function nops. For instance, if we executed the command
nops(XList);, Maple would print the output: 5.
For operations involving some or all of the items in a list,
the Maple for loop structure is extremely useful. This
loop is similar to the count-controlled loops provided in almost
all programming languages (BASIC, Pascal, C++, etc.). The syntax is
for "counter" from "start" to "finish" by "increment" do "body of loop" end do;
- The words for, from, to, by, do, end do are Maple "reserved words";
they have particular fixed meanings and must appear exactly as shown.
- The counter is a Maple variable (it can be named anything you want).
- start, finish, and increment can be fixed numbers or Maple expressions
that evaluate to fixed numbers. For simplicity in the following
explanation, we will assume start <= finish and increment > 0.
That is not necessary, though.
- The "body of loop" can be any sequence of Maple commands.
- The loop works as follows: the counter is set to
the value given by start. If counter <= finish, the commands
in the body of the loop are executed (once), and the value of
counter is changed to its previous value plus increment.
Then the test counter <= finish is repeated, and if it is true
the commands in the body of the loop are executed (once more).
The loop continues until counter exceeds finish.
- from "start" is optional. If you don't specify
start, the value start = 1 is assumed.
- by "increment" is also optional. If you don't specify
that, the value increment = 1 is assumed.
For example, suppose we wanted to find the sum of the entries
in XList. We could say:
total:=0; for i to 5 do total:=total+XList[i]; end do;
When the loop is complete, the variable total will
contain the sum of the entries. (The name Sum is not used for the
variable because Maple protects it for its inert summation command.)
If we didn't know how many
entries XList had, but we knew we wanted to sum
them all, we could say
total:=0; for i to nops(XList) do total:=total+XList[i]; end do;
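The optional from and by clauses described above can be seen in a small variation on this loop. The following is only a sketch; the variable name oddTotal is made up for the illustration.

```maple
XList := [2.3, 2.4, 5.6, 1.2, 0.9]:

# Add the entries in positions 1, 3, 5 by stepping the counter by 2.
oddTotal := 0:
for i from 1 to nops(XList) by 2 do
    oddTotal := oddTotal + XList[i];
end do:

oddTotal;   # 2.3 + 5.6 + 0.9 = 8.8
```

Ending a statement with a colon rather than a semicolon tells Maple to suppress the printed output, which keeps the intermediate values of the loop off the screen.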
Many other features of Maple programming are illustrated
by the procedures in the package. If you're interested, take
a look. I will be happy to explain anything you are
curious about.
The procedures currently in the package
- Some first descriptive statistics
- Range -- computes the range of a list of numbers (maximum minus minimum)
- Mean -- computes the mean of a list
- Variance -- computes the variance of a list
- StandardDeviation -- computes the standard deviation of a list
- Skewness -- computes the normalized 3rd moment about the mean
- Kurtosis -- computes the normalized 4th moment about the mean
The procedures in this first batch all work the same way. To use one,
you put the name of the procedure in a Maple command, followed by the
list of numbers you want to apply it to (or its name), in parentheses.
Usage example: Variance(XList);. The output will be
the corresponding statistic for the input list.
- Percentile -- computes the 100*pth percentile of the
data in a list of numbers. The input is the name of the list, followed
by the value of p (a number between 0 and 1).
Usage example: Percentile(XList,.75); computes the 75th percentile
value for the data.
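A short session with these procedures might look like the following sketch. It assumes the package was saved as StatPac.txt on the Sun network, as in the example above. The hand computations use only built-in Maple commands and are included so the results can be compared.

```maple
read "StatPac.txt":               # file name assumed, as in the earlier example

XList := [2.3, 2.4, 5.6, 1.2, 0.9]:

Range(XList);                     # maximum minus minimum
max(op(XList)) - min(op(XList));  # the same quantity by hand: 5.6 - 0.9 = 4.7

Mean(XList);
add(XList[i], i = 1..nops(XList)) / nops(XList);   # by hand: 12.4/5 = 2.48

Variance(XList);
StandardDeviation(XList);
Percentile(XList, .75);           # 75th percentile
```

This documentation does not say whether Variance divides by n or by n-1, so if the distinction matters it is worth checking the output against a hand computation on a small list like this one.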
- Random numbers and samples from given distributions
- RandomNumbers -- input a positive integer n; output a
list of n uniformly distributed (pseudo-)random numbers
in the interval [0,1]. Usage example: RandomNumbers(10);
- DieRoll -- input two positive integers, n = number of rolls and
m = number of faces on the die; output a list of numbers in the set
{1,...,m} representing n rolls of a fair m-sided die.
Usage example: DieRoll(10,6);
gives 10 rolls of a standard 6-sided die.
- UniformSample -- input the endpoints a,b of an interval on
the real line, and a number n of points. Output will be a
sample of size n from the uniform distribution on [a,b].
Usage example: UniformSample(1,4,100); gives a
sample of 100 points from the uniform distribution on [1,4].
- NormalSample -- input the mean and standard deviation
of a normal (Gaussian) distribution, and a number of points n.
Output will be a sample of size n from the normal distribution.
Usage example: NormalSample(0,1,100); gives a
sample of 100 points from the normal distribution with mean
0 and standard deviation 1.
- ExponentialSample -- similar to NormalSample,
input is the parameter lambda of the exponential distribution
and n. Usage example: ExponentialSample(2,100); gives a
sample of 100 points from the exponential distribution with
parameter lambda=2.
- ChiSquareSample -- similar to NormalSample,
input is the number of degrees of freedom, nu, of the
chi-square distribution, and the number n.
Usage example: ChiSquareSample(2,100); gives a
sample of 100 points from the chi-square distribution
with 2 degrees of freedom.
- GammaSample -- similar to NormalSample,
input is the parameters alpha and beta of the
gamma distribution, and the number n.
Usage example: GammaSample(2,4,100); gives a
sample of 100 points from the gamma distribution
with alpha = 2 and beta = 4.
- HypergeometricSample -- input the parameters N, n, r
of the hypergeometric distribution (in that order), followed by
a number of samples.
Usage example: HypergeometricSample(20,6,8,1000); gives
a sample of 1000 points from the hypergeometric distribution
with parameters N=20,n=6,r=8. The output is a list
of integers in the range 0..n, though in case r < n,
no values bigger than r will be generated(!)
- Frequencies -- input a list X, the endpoints a,b of an interval,
and a number n of subintervals. The interval [a,b] is
divided into n equal pieces and the number of entries
from the list X in each piece is counted. Output is the list
of frequencies.
Usage example: Frequencies(XList,0,2,10);
subdivides [0,2] into 10 equal subintervals and counts frequencies.
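The sampling procedures and Frequencies can be combined to check that a sample behaves as expected. The following is a sketch only. It assumes the package was saved as StatPac.txt, and that Frequencies returns the raw counts as described above. The names rolls, faceCounts, and S are made up for the illustration.

```maple
read "StatPac.txt":               # file name assumed, as in the earlier example

# 600 rolls of a fair six-sided die
rolls := DieRoll(600, 6):
nops(rolls);                      # one entry per roll

# Six subintervals of [0.5, 6.5], each containing exactly one face value
faceCounts := Frequencies(rolls, 0.5, 6.5, 6);
add(faceCounts[i], i = 1..nops(faceCounts));   # should account for all rolls

# A sample from the standard normal distribution
S := NormalSample(0, 1, 1000):
Mean(S);                          # should be near 0
StandardDeviation(S);             # should be near 1
Frequencies(S, -3, 3, 12);
```

Using half-integer endpoints for the die rolls keeps every face value strictly inside a subinterval, so the counts do not depend on how the procedure treats points that fall exactly on a boundary.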
- PDF's (probability density functions) and CDF's (cumulative
distribution functions)
The following PDF's and CDF's are currently implemented:
BetaPDF,BinomialPDF,ChiSquarePDF,ExponentialPDF,FPDF,
GammaPDF,HypergeometricPDF,PoissonPDF,TPDF,UniformPDF,WeibullPDF,
BetaCDF,BinomialCDF,ChiSquareCDF,ExponentialCDF,FCDF,
GammaCDF,HypergeometricCDF,PoissonCDF,TCDF,UniformCDF.
Each takes one or more inputs: the parameters of the
distribution (these always come first), followed by
the independent variable (last). For example, the inputs
for ChiSquarePDF are the number of degrees of freedom,
nu, and the independent variable x. A call to that procedure
like ChiSquarePDF(4,3.4) gives the value of the
density function with nu = 4 at x = 3.4. If you want to plot
one of the PDF's or CDF's, use the following method.
Usage example: plot(x -> NormalPDF(0,2,x),-4..4);
will generate a plot of the normal density function with
mean = 0, standard deviation = 2, on the interval -4..4.
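As a sketch of how these procedures can be checked and plotted, the following uses the chi-square distribution with 4 degrees of freedom, whose density and distribution function have the simple closed forms x*exp(-x/2)/4 and 1 - exp(-x/2)*(1 + x/2) for x >= 0. The file name StatPac.txt is assumed, as in the earlier example.

```maple
read "StatPac.txt":               # file name assumed, as in the earlier example

# Values from the package
ChiSquarePDF(4, 3.4);
ChiSquareCDF(4, 3.4);

# The same values from the closed forms for nu = 4
3.4*exp(-3.4/2)/4;
1 - exp(-3.4/2)*(1 + 3.4/2);

# Plots, using the same pattern as the usage example above
plot(x -> ChiSquarePDF(4, x), 0..15);
plot(x -> ChiSquareCDF(4, x), 0..15);
```

If the package values and the closed-form values disagree, the most likely cause is a mix-up in the order of the inputs: parameters first, independent variable last.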
- Graphical routines
- Hist -- plots a relative frequency histogram of the data
in a list, on a given interval, with a given number of
"bins". Usage example: Hist(XList,0,4,7);
generates the histogram for XList on [0,4] with
7 equal "bins" (subdivisions). Note: if some of the
data points are outside the interval, a warning is generated
and only the points in the interval are used.
- ScatterPlot -- generates a scatterplot (point plot) for the
data points represented by two input lists: the first is the list of
x-coordinates (abscissas), the second is the list of
y-coordinates (ordinates). Usage example:
ScatterPlot(XList,YList). Note: The plotting
window is determined automatically and will always be large
enough to show all the given points.
- PlotEmpiricalPDF -- generates an approximation to the
density function for a given input list X of numbers. (This is
essentially the same as the relative frequency histogram, but
scaled vertically so the total area under the graph is 1.)
Usage example: PlotEmpiricalPDF(XList);
The endpoints of the interval plotted and the number of subintervals
can also be specified: PlotEmpiricalPDF(XList,-3,3,20);
uses the interval [-3,3] and 20 "bins" for the histogram.
- PlotEmpiricalCDF -- generates an approximation to the
cumulative distribution function for a given input list X of numbers,
on the interval [a,b]. Usage example: PlotEmpiricalCDF(XList,-3,3);
- BoxWhisker -- generates a "box-and-whisker" plot for one or
more lists. For each list, a graphical display is generated
showing a thicker central "box" with vertical line segments
drawn at the 25th, 50th, and 75th percentile values, and thin
"whiskers" extending past the box to the minimum and the maximum
of the data values. If more than one list is given as input, the
box-and-whisker plots will be stacked vertically in the plot,
with the first list at the bottom, etc. Usage example:
BoxWhisker(XList,YList,ZList); would generate three
stacked box-and-whisker plots.
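The graphical routines are easiest to try on simulated data. The following sketch assumes the package was saved as StatPac.txt and uses NormalSample to build two lists of equal length. The sample sizes, intervals, and numbers of bins are arbitrary choices for the illustration.

```maple
read "StatPac.txt":               # file name assumed, as in the earlier example

XList := NormalSample(0, 1, 200):
YList := NormalSample(1, 2, 200):

Hist(XList, -4, 4, 16);               # relative frequency histogram
PlotEmpiricalPDF(XList, -4, 4, 16);   # same shape, scaled to total area 1
PlotEmpiricalCDF(XList, -4, 4);
ScatterPlot(XList, YList);            # both lists have 200 entries
BoxWhisker(XList, YList);             # XList at the bottom, YList above it
```

Because the samples are random, an occasional point may fall outside [-4,4], in which case Hist issues the warning described above.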
- Confidence Intervals
- MeanLSCI -- computes the endpoints of an approximate
large-sample (1-alpha) x 100% confidence interval for the mean of a
population, based on the sample mean of the data supplied. Usage example:
MeanLSCI(XList,.05); finds a 95% confidence interval for the mean.
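For comparison, the usual textbook large-sample interval is the sample mean plus or minus z*s/sqrt(n), where z is approximately 1.96 for a 95% interval. The sketch below computes that interval by hand next to the package's answer. Whether MeanLSCI uses exactly this formula is an assumption, not something this documentation states, and the file name StatPac.txt is assumed as before.

```maple
read "StatPac.txt":               # file name assumed, as in the earlier example

S := NormalSample(10, 2, 200):    # simulated data with true mean 10

n    := nops(S):
xbar := Mean(S):
s    := StandardDeviation(S):
z    := 1.96:                     # approximate 0.975 quantile of the standard normal

# Textbook interval, computed by hand
[xbar - z*s/evalf(sqrt(n)), xbar + z*s/evalf(sqrt(n))];

# The package's interval for alpha = .05
MeanLSCI(S, .05);
```

The two intervals should be close. Small differences could come from the value used for z or from the divisor used in the standard deviation.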
- Games of Chance
- Craps -- Simulates any number of games of the dice game
Craps. There are two inputs: n the number of games,
and verbose which controls how much output
is generated. The output is a list of n 0s and 1s
(0 = loss in one game, 1 = win in one game). If verbose
is true, the list of rolls in each game is shown; otherwise
only the list of outcomes is printed. Usage example:
Craps(10,true); simulates 10 games and prints out the
rolls in each of the 10 games.
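Since the output is a list of 0s and 1s, the fraction of wins in a long run of games gives an estimate of the probability of winning. The following sketch assumes the package was saved as StatPac.txt and that Craps returns the list of outcomes as its value. The names games and wins are made up for the illustration.

```maple
read "StatPac.txt":               # file name assumed, as in the earlier example

games := Craps(1000, false):      # verbose = false: outcomes only
wins  := add(games[i], i = 1..nops(games)):

evalf(wins / nops(games));        # compare with 244/495, about 0.4929
```

The exact probability of winning a game of Craps is 244/495, so the estimate should settle near 0.493 as the number of games grows.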
As more procedures are added through the year, this documentation
will be updated.
Last modified: March 16, 2004