> restart;

Solution Key

This solution key is due to Charlie Hanley, Spring 2002

Maple Project #2

Data Sets and Commands Subsection

> with(stats):with(stats[statplots]):with(random):with(statevalf):with(student):with(stats[describe]):
with(plots):
The following procedure improves the histogram program. It allows specification of bin ranges along with the data.
histo:=proc(data,ranges) local n,i ; n:=nops(ranges);
histogram(stats[transform,scaleweight[1/nops(data)]]
(stats[transform,tallyinto['extras']]
(data,[seq(ranges[i]..ranges[i+1],i=1..(n-1))]))): end:
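As a rough cross-check of what histo plots, the relative frequency of each bin can be tallied with core Maple commands alone. This is only a sketch and not part of the original worksheet: the names inbin and binfreq and the sample list are hypothetical, and it assumes each bin includes its left endpoint and excludes its right one.

```maple
# Hypothetical helper: true when lo <= x < hi.
inbin := (x, lo, hi) -> evalb(lo <= x and x < hi):

# Hypothetical helper: fraction of the data falling in each bin
# [ranges[i], ranges[i+1]).  select passes the two bin edges on to inbin.
binfreq := proc(data, ranges)
  local i;
  [seq(evalf(nops(select(inbin, data, ranges[i], ranges[i+1])) / nops(data)),
       i = 1 .. nops(ranges) - 1)];
end proc:

# Illustrative values only (not taken from data2).
sample := [2.1, 2.6, 2.7, 3.4, 3.9]:
binfreq(sample, [2, 2.5, 3, 3.5, 4]);
```

If the fractions add up to less than 1, some data points lie outside the chosen ranges, which is a hint to widen them.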
Here is the simulated data:
normaldataset := [.8441162075, .1675078072, -.6676777329, -.4995752138e-1, 1.502155118, -.5568011924, 1.707902213, -1.074078825, .4199611949, -.9577805554e-2, .1697413106e-1, -.4995798757, -.6337893330, -.5650013958, .1744136103, -.6279057848, .8220447994, .6040674919, .6442369504, -1.327601386, 1.669060099, 1.013599091, -1.206729633, -.9813553812, -1.227766719, -.6001401143, -.3683659020, .1020625770e-2, -.5613021077, -1.237767572, .5907551397, .1688781208e-1, -.1614290579, -.2631929959, -.2682826964, .9013466929, -.1423211060, 1.575322000, 1.968681058, -.7095172514, .2476207045, -.1310654436, -.1627628270, -.9528732852, .2177892855e-1, .2941735009, -.2621070194, -1.670050594, .8984387349, 1.063402533, .1685053465, 1.233355953, .4445350600, 2.030521026, -1.291595763, -2.618596553, 2.035686170, -.2022766430, .1000936624, .5039054945, -.4095714084, .2492422437e-1, -1.050241620, -1.532729703, -.4053617765, -.2600474204, 1.743665944, 1.421809036, -.5183661918, -1.261056576, -.8719509654, -.2055324937, 1.269669973, .7711251851, -2.356327134, .4964135772, -1.159187869, -.2856297104, .1910937487, .7958866242, .6278507547, -.7620564877, .2706999082, -.7669122519, .5143784655, -.3783585777, -.3311418580, -.8202443397, .3956496264, .9380667145, .7823346678, 1.226855457, .4931329422, -1.673055015, .4068140517, -.8048465492, -.2822776423, .9585722598, -.4911639986, -.2011264576e-1]:
expdataset := [7.885590925, 4.702060218, 2.490185865, 3.242240653, 1.340675124, 3.661654543, .8143198115, .3448392635, 1.762374708, 1.596616308, 3.045972928, .1048340351, .8122487665, 1.312042270, .3210708275, .6063731865, 8.025586013, .3103939710, .4247798420, .7083070305, 2.557072468, 11.75270413, 5.613973343, 1.030900203, 3.763702180, .8359464300e-1, 2.840576003, 1.431463221, 1.451136681, 5.397233638, 3.871891675, 3.712009483, 2.366286344, .1507242052, 1.529106515, 4.819369353, .6740114550, 4.372344810, .2772995553, 8.880719618, 1.559404730, .2309232676, .4691102838, 3.527392048, 2.034073253, 4.462702103, 2.176072523, .6497600298, .1098657740, 4.683669508, 2.383357198, .4602255753, 2.876770555, .2158049156, 3.315624530, 2.321797189, .3351986210, .7641193595, 8.389897488, 1.712383348, 2.294743645, .2008303222, .7349989260, .9877686680, 3.034921273, 2.246621681, .6304389260, .8058789315, .6728238965, 1.042509567, 5.434997463, .8238934258, 4.493314183, .4664634878, 5.888566448, 2.228198160, 4.002202828, 1.348413776, 1.153161079, .1611020836, 2.339653543, 2.711198408, .2406473455e-1, 4.704403665, .1179580323e-1, 3.091762885, .1785056384e-2, 1.861998363, 1.958990347, .7647498905e-1, 5.275310273, 1.719625418, 9.575994495, 2.764027188, .2264810046, 5.890674828, .9424533133, .6675375813, 1.558428403, .2506159995]: data1:=[seq(1.5*normaldataset[i]+12, i=1..nops(normaldataset))]:
data2:=[seq(0.5*(data1[i]-4),i=11..90)]:
expdata := [.1120369893, 2.334139781, 4.678167130, .4705372194, 2.328780876, 12.34825991, 9.014209867, 5.436187016, 9.913748387, 1.540165607, 19.76166247, 4.701175160, .9907759420, 2.174687245, 16.78005340, 45.34618402, 1.535887240, .4824757234, 14.30887167, 6.463480786, 17.40341021, 16.93452753, 4.394551810, 3.732709666, 23.10634982, 22.25558379, 2.659917653, 13.93220691, 19.25247833, .8331147647, 3.710851430, .6267695523, .6518215197, .9739978074e-1, 4.576383582, 21.61790771, 4.504740148, 29.06304041, 2.020316093, 3.631875673, 19.87235464, .5335717365, 1.566086725, 1.963076922, 6.959548880, 2.490692867, .2907234955, 8.316983454, 16.80554524, 4.334713517, 2.205953864, 13.50022949, 6.937006914, 7.468602780, 4.926504421, 5.453933197, .1436084329, 25.36055519, 3.686491281, 3.961540008, 14.60758545, 16.16638024, 11.23419654, 2.563366288, .2886807971, 4.362667264, 3.398720113, 9.702074154, 6.361943260, 4.944042345, 26.19876319, .5792690188, 10.60850810, 8.331003234, 21.06726548]:
data3:=expdata:

```
Warning, the name poisson has been redefined
Warning, the name changecoords has been redefined
```

Problem 1. The list data2 contains laboratory measurements of the lengths, in centimeters, of 80 earthworms.

a) Look at the data to decide on appropriate histogram and plotting ranges and then plot at least two histograms.

b) Find a normal density which gives good smoothing approximations to your histograms. (Determine suitable values for m and s by trial and error.) Plot your density on the same axes as one of your histograms.

c) Plot the corresponding c.d.f.

d) Assuming that your density and c.d.f. accurately reflect the population of all earthworms, determine the following

i) The probability that a randomly selected earthworm has length between 4 and 5 centimeters.

ii) The probability that a randomly selected earthworm has length more than 4.5 centimeters.

iii) The 60th percentile, x60, of earthworm lengths.

> sort(data2);

Part a: Plot two histograms that appropriately describe the data set.

> ranges1:=[seq(2+0.5*i,i=0..8)];hist1data2:=histo(data2,ranges1):display(hist1data2);

> ranges2:=[seq(2+0.125*i,i=0..32)];hist2data2:=histo(data2,ranges2):display(hist2data2);

Part b: Find a normal density which gives good smoothing approximations to the histograms. This is accomplished by defining the normal density function (given by normalp), and fitting this function to the data set by adjusting the m and s parameters.

> normalp:=(x,m,s)->1/(sqrt(2*Pi)*s)*exp(-((x-m)/s)^2/2);
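Before fitting by eye, it may help to confirm two properties of this density: the total area is 1, and the peak height at x = m is 1/(s*sqrt(2*Pi)), so a smaller s gives a taller, narrower curve. The sketch below redefines normalp so that it runs on its own; the parameter values are the two trial guesses used in this part.

```maple
# Same density as in the worksheet, redefined so the snippet is self-contained.
normalp := (x, m, s) -> 1/(sqrt(2*Pi)*s)*exp(-((x - m)/s)^2/2):

# Total area under the density; should evaluate to 1 for any s > 0.
evalf(int(normalp(x, 4, 0.5), x = -infinity .. infinity));

# Peak heights for the two trial guesses: about 0.798 and 0.570.
evalf(normalp(4, 4, 0.5));
evalf(normalp(3.75, 3.75, 0.7));
```

Comparing the peak height with the tallest histogram bar gives a quick way to choose the next guess for s.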

> display([hist1data2,plot(normalp(x,4,0.5),x=2..6,color=black,thickness=2)]);

Now I will adjust the guess and see if that improves the fit. I'll try m=3.75 and s=0.7.

> display([hist1data2,plot(normalp(x,3.75,0.7),x=2..6,color=black,thickness=2)]);

That seems to be a decent fit to the data. Now I will use Maple to determine the values for the mean and the standard deviation, and then use these values to plot the normal density against the two histograms.

> mean:=mean(data2);std_dev:=evalf(standarddeviation(data2)); # after this line the name mean holds a number, not the describe[mean] function

> display([hist1data2,plot(normalp(x,mean,std_dev),x=2..6,color=black,thickness=2)]);display([hist2data2,plot(normalp(x,mean,std_dev),x=2..6,color=black,thickness=2)]);

Part c: Plot the corresponding c.d.f. To do this, first define the c.d.f. in Maple (given by Fnormal).

> Fnormal:=(x,m,s)->int(normalp(t,m,s),t=-infinity..x);

> plot(Fnormal(x,mean,std_dev),x=2..6,color=black,thickness=2,title="The Cumulative Distribution Function F(x)");
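The integral defining the normal c.d.f. can also be written in closed form with the error function erf, which gives an independent check on Fnormal. The following is a minimal sketch: the name Ferf and the test values are hypothetical, chosen only for illustration.

```maple
# Definitions repeated from the worksheet so the snippet runs on its own.
normalp := (x, m, s) -> 1/(sqrt(2*Pi)*s)*exp(-((x - m)/s)^2/2):
Fnormal := (x, m, s) -> int(normalp(t, m, s), t = -infinity .. x):

# Closed form of the normal c.d.f. in terms of erf (hypothetical name).
Ferf := (x, m, s) -> (1 + erf((x - m)/(s*sqrt(2))))/2:

# The two values should agree up to rounding; parameters are illustrative.
evalf(Fnormal(4.5, 3.75, 0.7));
evalf(Ferf(4.5, 3.75, 0.7));
```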

Part d: Determine the following probabilities.

i) The probability that a randomly selected earthworm has length between 4 and 5 centimeters.

ii) The probability that a randomly selected earthworm has length more than 4.5 centimeters.

iii) The 60th percentile, x60, of earthworm lengths.

> part_i:=int(normalp(x,mean,std_dev),x=4..5);part_ii:=1-Fnormal(4.5,mean,std_dev);part_iii:=fsolve(Fnormal(x,mean,std_dev)=.60,x=2..6);

Thus, for part i, there is a 40.683% probability that a randomly selected earthworm will have a length between 4 and 5 centimeters. For part ii, there is a 22.839% probability that a randomly selected earthworm will have a length greater than 4.5 centimeters. For part iii, the 60th percentile of earthworm lengths is 4.148 centimeters.
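A percentile of any normal density can also be obtained by standardizing: find the percentile z of the standard normal once, then rescale it as m + s*z. The sketch below assumes illustrative parameters m0 = 3.75 and s0 = 0.7 (the trial-and-error guess from part b, not the computed mean and standard deviation), and the names Ferf, z60, m0, s0 and x60 are hypothetical.

```maple
# Closed form of the normal c.d.f. via the error function.
Ferf := (x, m, s) -> (1 + erf((x - m)/(s*sqrt(2))))/2:

# 60th percentile of the standard normal (mean 0, standard deviation 1).
z60 := fsolve(Ferf(z, 0, 1) = 0.60, z = 0 .. 1);

# Rescale to a normal density with the illustrative parameters.
m0 := 3.75: s0 := 0.7:
x60 := m0 + s0*z60;

# Substituting back should return approximately 0.60.
evalf(Ferf(x60, m0, s0));
```

Substituting the result back into the c.d.f. is a cheap way to confirm that fsolve converged to the intended root.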

The list data3, which we display below, consists of 75 measurements of the time elapsed, in minutes, between the arrival of phone calls at a certain 800 phone number. There are theoretical reasons to assume that the distribution of such times is given by an exponential density.

> sort(data3);

Problem 2.

a) Plot some histograms, starting with ranges [0,2,4,...,30] and find an exponential density that appears to smooth the histograms. Give your value for the parameter a. (Hint: Start with a=.5. Plot over the interval [0,20].)

b) Assuming that the density you found in a) is reflective of the typical times between phone calls, answer the following questions.

i) What is the probability that at least 8 minutes elapse between the most recent phone call and the next?

ii) What is the probability that the time between two successive phone calls is between 2.5 and 6.5 minutes?

iii) What is the mean time between calls?

iv) What is the median time between calls?

v) What is the 75th percentile of the elapsed time between calls?

Part a: Plot some histograms that appropriately describe the data set.

> ranges3:=[seq(0+6*i,i=0..5)];hist1data3:=histo(data3,ranges3):display(hist1data3);

> ranges3:=[seq(0+2*i,i=0..15)];hist2data3:=histo(data3,ranges3):display(hist2data3);

Now I will find an exponential density which gives good smoothing approximations to the histograms. This is accomplished by defining the exponential density function (given by exponp), and fitting this function to the data set by adjusting the a parameter. I will start with the initial guesses of a=0.5, a=0.25, and a=0.125.

> exponp:=(x,a)->a*exp(-a*x);

> display([hist1data3,plot([exponp(x,0.5),exponp(x,0.25),exponp(x,0.125)],x=0..30,labels=["",""],color=black,thickness=2,linestyle=[2,3,4],legend=["a=0.5","a=0.25","a=0.125"],title="hist1data3 vs. three values of 'a'")]);

> display([hist2data3,plot([exponp(x,0.5),exponp(x,0.25),exponp(x,0.125)],x=0..30,labels=["",""],color=black,thickness=2,linestyle=[2,3,4],legend=["a=0.5","a=0.25","a=0.125"],title="hist2data3 vs. three values of 'a'")]);

It would seem that setting the a parameter equal to 0.125 provides a good smoothing approximation.
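The visual choice can be cross-checked numerically. This is a different technique from the trial-and-error fit used here: since the density a*exp(-a*x) has mean 1/a, the reciprocal of the sample mean gives an estimate of a. The sketch below uses a short hypothetical list so that it runs on its own; in the worksheet the list data3 would take its place.

```maple
# Hypothetical inter-arrival times in minutes (not taken from data3).
sample := [2.0, 5.5, 11.0, 0.5, 21.0]:

# Sample mean, then the estimate a = 1/(sample mean).
xbar := add(x, x = sample)/nops(sample);
a_hat := 1/xbar;
```

For these illustrative values the sample mean is 8 minutes, so the estimate is 0.125.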

Part b: Assuming that the density you found in a) is reflective of the typical times between phone calls, answer the following questions. This will require defining the c.d.f. for the exponential density function in Maple (given by Fexpon).

i) What is the probability that at least 8 minutes elapse between the most recent phone call and the next?

ii) What is the probability that the time between two successive phone calls is between 2.5 and 6.5 minutes?

iii) What is the mean time between calls?

iv) What is the median time between calls?

v) What is the 75th percentile of the elapsed time between calls?

> Fexpon:=(x,a)->int(exponp(t,a),t=0..x);

> part_i:=evalf(1-Fexpon(8,0.125));part_ii:=int(exponp(x,0.125),x=2.5..6.5);part_iii:=evalf(int(x*exponp(x,0.125),x=0..infinity));part_iv:=fsolve(Fexpon(x,0.125)=0.5,x=0..46);part_v:=fsolve(Fexpon(x,0.125)=0.75,x=0..46);

Thus, for part i, there is a 36.788% probability that at least 8 minutes elapse between the most recent phone call and the next. This is the complement of the c.d.f. value Fexpon(8,0.125) = 63.212%, which is the probability that the next call arrives within 8 minutes. For part ii, there is a 28.787% probability that the time between two successive phone calls is between 2.5 and 6.5 minutes. As given in parts iii and iv, the mean and median times between phone calls are 8 and 5.545 minutes, respectively. For part v, the 75th percentile of elapsed time between phone calls is 11.090 minutes.
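For an exponential density the c.d.f. has the closed form F(x) = 1 - exp(-a*x), so every quantity in part b can be checked without numerical integration or fsolve. In particular, "at least t minutes" is the complement 1 - F(t) = exp(-a*t). The sketch below assumes a = 0.125 from part a; the names F and Q are hypothetical.

```maple
a := 0.125:   # the rate chosen in part a

# c.d.f. and its inverse (quantile function) for the density a*exp(-a*x), x >= 0.
F := x -> 1 - exp(-a*x):
Q := p -> -ln(1 - p)/a:

evalf(1 - F(8));          # P(X >= 8) = exp(-1), about 0.36788
evalf(F(6.5) - F(2.5));   # P(2.5 <= X <= 6.5), about 0.28787
1/a;                      # mean: 8 minutes
Q(0.5);                   # median: ln(2)/a, about 5.5452
Q(0.75);                  # 75th percentile: ln(4)/a, about 11.0904
```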

Problem 3. Consider the function, DEFINED WHEN x > 0, by

and defined to be 0 when x < 0.

a) Determine a value for k so that the function DEFINED WHEN x > 0 by

and defined to be 0 when x < 0, IS a density. (Remember that densities must have integral 1.) Put the value you find for k into the definition of p and use it for the remainder of this problem.

b) Plot the density function p(x) and the corresponding c.d.f., for x>0 of course, on the same axes if you can.

c) Use Maple to find the corresponding mean and median.

d) If a random value has this density, find the probability that the value lies between 1.5 and 3.2.

Part a: The first step is to compute the integral of pnotadensity over 0 < x < infinity; this value is called pnot below.

Now we solve for the value of k that makes pnotadensity a density (i.e., with an integral of 1).

> k:=1/pnot;
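The same normalization step can be sketched on a stand-in function. The function g below is hypothetical and is NOT the function from the problem statement; it only illustrates how the constant k follows from the requirement that the integral equal 1. The names g, g_int, k_demo and p_demo are likewise hypothetical.

```maple
# Hypothetical unnormalized function on x > 0 (not the function from Problem 3).
g := x -> x*exp(-2*x):

# Area under g over x > 0; a density needs area 1, so scale by 1/area.
g_int := int(g(x), x = 0 .. infinity);
k_demo := 1/g_int;
p_demo := x -> k_demo*g(x):

# The scaled function should integrate to 1.
int(p_demo(x), x = 0 .. infinity);
```

For this stand-in the area is 1/4, so the scaling constant is 4.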

The next step is to define the c.d.f. in Maple for pisadensity (given by Pcdf).

Part b: Plot the density function pisadensity(x) and its corresponding c.d.f. for x > 0 on the same axes.