How To Calculate Proportion In R
- New methods
- Table 7.1-1 and Figure 7.1-1. Binomial distribution with n = 27 and p = 0.25
- Figure 7.1-2. Sampling distribution of a binomial proportion
- Example 7.3. Sex and the X chromosome
- Example 7.2. Radiologists' missing sons
Note: This document was converted to R-Markdown from this page by M. Drew LaMar. You can download the R-Markdown here.
Download the R code on this page as a single file here (make sure to install the "binom" package before running).
New methods
Hover over a function argument for a short description of its meaning. The variable names are plucked from the examples further below.
Calculate binomial probabilities:
dbinom(6, size = 27, prob = 0.25)
Binomial test:
binom.test(10, n = 25, p = 0.061)
Install the binom package, which is used to calculate a confidence interval for a proportion.
install.packages("binom", dependencies = TRUE)
Agresti-Coull 95% confidence interval for the proportion using the binom
package.
binom.confint(30, n = 87, method = "ac")
Other new methods:
Sampling distribution of a proportion by repeated sampling from a known population.
Table 7.1-1 and Figure 7.1-1. Binomial distribution with n = 27 and p = 0.25
Table and histogram of binomial probabilities. Uses the data from Chapter 6 on the genetics of mirror-image flowers.
Calculate a binomial probability, the probability of obtaining \(X\) successes in \(n\) trials when trials are independent and the probability of success \(p\) is the same for every trial. The probability of getting exactly 6 left-handed flowers when \(n = 27\) and \(p = 0.25\) is
dbinom(6, size = 27, prob = 0.25)
## [1] 0.171883
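As a check on what dbinom computes, the same probability can be built by hand from the binomial formula \(\binom{n}{X} p^X (1-p)^{n-X}\). This is only a minimal sketch; the names nTrials, pSuccess, xObs, byHand and builtIn are illustrative and do not appear on the original page.

```r
# Probability of exactly 6 successes in 27 trials, computed two ways
nTrials <- 27
pSuccess <- 0.25
xObs <- 6

# Binomial formula: choose(n, X) * p^X * (1 - p)^(n - X)
byHand <- choose(nTrials, xObs) * pSuccess^xObs * (1 - pSuccess)^(nTrials - xObs)

# R's built-in binomial probability
builtIn <- dbinom(xObs, size = nTrials, prob = pSuccess)

all.equal(byHand, builtIn)  # TRUE when the two calculations agree
```

Both values should match the 0.171883 printed above.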
Table of probabilities for all possible values for the number of left-handed flowers out of 27.
xsuccesses <- 0:27
probx <- dbinom(xsuccesses, size = 27, prob = 0.25)
probTable <- data.frame(xsuccesses, probx)
probTable
##    xsuccesses        probx
## 1           0 4.233057e-04
## 2           1 3.809751e-03
## 3           2 1.650892e-02
## 4           3 4.585812e-02
## 5           4 9.171623e-02
## 6           5 1.406316e-01
## 7           6 1.718830e-01
## 8           7 1.718830e-01
## 9           8 1.432358e-01
## 10          9 1.007956e-01
## 11         10 6.047736e-02
## 12         11 3.115500e-02
## 13         12 1.384667e-02
## 14         13 5.325641e-03
## 15         14 1.775214e-03
## 16         15 5.128395e-04
## 17         16 1.282099e-04
## 18         17 2.765311e-05
## 19         18 5.120947e-06
## 20         19 8.085705e-07
## 21         20 1.078094e-07
## 22         21 1.197882e-08
## 23         22 1.088984e-09
## 24         23 7.891188e-11
## 25         24 4.383993e-12
## 26         25 1.753597e-13
## 27         26 4.496403e-15
## 28         27 5.551115e-17
Histogram of binomial probabilities for the number of left-handed flowers out of 27. This illustrates the full binomial distribution when \(n = 27\) and \(p = 0.25\).
barplot(height = probx, names.arg = xsuccesses, space = 0, las = 1, ylab = "Probability", xlab = "Number of left-handed flowers")
Figure 7.1-2. Sampling distribution of a binomial proportion
Compare sampling distributions for the proportion based on n = 10 and n = 100.
Take a large number of random samples of \(n = 10\) from a population having probability of success \(p = 0.25\). Convert to proportions by dividing by the sample size. Do the same for the larger sample size \(n = 100\). The following commands use 10,000 random samples.
successes10 <- rbinom(10000, size = 10, prob = 0.25)
proportion10 <- successes10 / 10
successes100 <- rbinom(10000, size = 100, prob = 0.25)
proportion100 <- successes100 / 100
Plot and visually compare the sampling distributions of the proportions based on \(n = 10\) and \(n = 100\). The par(mfrow = c(2,1)) command sets up a graph window that will plot both graphs arranged in 2 rows and 1 column.
par(mfrow = c(2,1))
hist(proportion10, breaks = 10, right = FALSE, xlim = c(0,1), xlab = "Sample proportion")
hist(proportion100, breaks = 20, right = FALSE, xlim = c(0,1), xlab = "Sample proportion")
par(mfrow = c(1,1))
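The narrowing seen in the histograms can also be checked numerically. The sketch below compares the standard deviation of simulated proportions with the theoretical value \(\sqrt{p(1-p)/n}\). The seed and the names simProp10, simProp100, theory10 and theory100 are assumptions made for illustration, and the samples are drawn afresh rather than reused from above.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible
pSuccess <- 0.25

# Simulated sample proportions for the two sample sizes
simProp10 <- rbinom(10000, size = 10, prob = pSuccess) / 10
simProp100 <- rbinom(10000, size = 100, prob = pSuccess) / 100

# Theoretical standard deviation of a sample proportion: sqrt(p(1 - p)/n)
theory10 <- sqrt(pSuccess * (1 - pSuccess) / 10)
theory100 <- sqrt(pSuccess * (1 - pSuccess) / 100)

c(simulated = sd(simProp10), theory = theory10)
c(simulated = sd(simProp100), theory = theory100)
```

With a tenfold larger sample, the spread of the sampling distribution should shrink by a factor of about \(\sqrt{10}\).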
Commands for a fancier plot:
oldpar <- par(no.readonly = TRUE) # make backup of default graph settings
par(mfrow = c(2,1), oma = c(4, 0, 0, 0), mar = c(1, 6, 4, 1)) # adjust margins
saveHist10 <- hist(proportion10, breaks = 10, right = FALSE, plot = FALSE)
saveHist10$counts <- saveHist10$counts/sum(saveHist10$counts)
plot(saveHist10, col = "firebrick", las = 1, cex.lab = 1.2, ylim = c(0,0.3), xlim = c(0,1), ylab = "Relative frequency", xlab = "", main = "")
text(x = 1, y = 0.25, labels = "n = 10", adj = 1, cex = 1.1)
saveHist100 <- hist(proportion100, breaks = 40, right = FALSE, plot = FALSE)
saveHist100$counts <- saveHist100$counts/sum(saveHist100$counts)
plot(saveHist100, col = "firebrick", las = 1, cex.lab = 1.2, ylim = c(0,0.1), xlim = c(0,1), ylab = "Relative frequency", xlab = "", main = "")
text(x = 1, y = 0.08, labels = "n = 100", adj = 1, cex = 1.1)
mtext("Proportion of successes", side = 1, outer = TRUE, padj = 2)
par(oldpar) # Revert to backup graph settings
Example 7.3. Sex and the X chromosome
The binomial test, used to test whether spermatogenesis genes in the mouse genome occur with unusual frequency on the X chromosome.
Read and inspect the data. Each row in the data file represents a different spermatogenesis gene.
mouseGenes <- read.csv(url("http://whitlockschluter.zoology.ubc.ca/wp-content/data/chapter07/chap07e2SexAndX.csv"))
head(mouseGenes)
##   chromosome onX
## 1          4  no
## 2          4  no
## 3          6  no
## 4          6  no
## 5          6  no
## 6          7  no
Tabulate the number of spermatogenesis genes on the X-chromosome and the number not on the X-chromosome.
table(mouseGenes$onX)
##
##  no yes
##  15  10
Calculate the binomial probabilities of all possible outcomes under the null hypothesis (Table 7.2-1). Under the binomial distribution with \(n = 25\) and \(p = 0.061\), the number of successes can be any integer between 0 and 25.
xsuccesses <- 0:25
probx <- dbinom(xsuccesses, size = 25, prob = 0.061)
data.frame(xsuccesses, probx)
##    xsuccesses        probx
## 1           0 2.073193e-01
## 2           1 3.367007e-01
## 3           2 2.624760e-01
## 4           3 1.307255e-01
## 5           4 4.670757e-02
## 6           5 1.274386e-02
## 7           6 2.759585e-03
## 8           7 4.865905e-04
## 9           8 7.112305e-05
## 10          9 8.727323e-06
## 11         10 9.071211e-07
## 12         11 8.035781e-08
## 13         12 6.090306e-09
## 14         13 3.956429e-10
## 15         14 2.203032e-11
## 16         15 1.049510e-12
## 17         16 4.261188e-14
## 18         17 1.465509e-15
## 19         18 4.231266e-17
## 20         19 1.012696e-18
## 21         20 1.973624e-20
## 22         21 3.052667e-22
## 23         22 3.605629e-24
## 24         23 3.055193e-26
## 25         24 1.653947e-28
## 26         25 4.297797e-31
Use these probabilities to calculate the \(P\)-value corresponding to an observed 10 spermatogenesis genes on the X chromosome. Remember to multiply the probability of 10 or more successes by 2 for the two-tailed test result.
2 * sum(probx[xsuccesses >= 10])
## [1] 1.987976e-06
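The same tail probability can be obtained without building the full table, using the cumulative distribution function pbinom. This is a minimal sketch; the name upperTail is illustrative only.

```r
# P(X >= 10) is the upper tail beyond 9 successes, i.e. P(X > 9)
upperTail <- pbinom(9, size = 25, prob = 0.061, lower.tail = FALSE)

# Double the one-tailed probability for the two-tailed P-value
2 * upperTail
```

The result should agree with the doubled sum of table probabilities computed above.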
For a faster result, try R's built-in binomial test. The resulting \(P\)-value is slightly different from our calculation. In the book, we get the two-tailed probability by multiplying the one-tailed probability by 2. As we say on page 188, computer programs may calculate the probability of extreme results at the "other" tail with a different method. The output of binom.test includes a confidence interval for the proportion using the Clopper-Pearson method, which is more conservative than the Agresti-Coull method.
library(binom) # Load the binom package
binom.test(10, n = 25, p = 0.061)
##
##  Exact binomial test
##
## data:  10 and 25
## number of successes = 10, number of trials = 25, p-value =
## 9.94e-07
## alternative hypothesis: true probability of success is not equal to 0.061
## 95 percent confidence interval:
##  0.2112548 0.6133465
## sample estimates:
## probability of success
##                    0.4
Example 7.2. Radiologists' missing sons
Standard error and 95% confidence interval for a proportion using the Agresti-Coull method for the confidence interval.
Read and inspect the data.
radiologistKids <- read.csv(url("http://whitlockschluter.zoology.ubc.ca/wp-content/data/chapter07/chap07e3RadiologistOffspringSex.csv"))
head(radiologistKids)
##   offspringSex
## 1         male
## 2         male
## 3         male
## 4         male
## 5         male
## 6         male
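To see where the binom.test numbers come from, the sketch below reproduces them by hand. It assumes, based on our reading of how binom.test behaves, that the two-sided \(P\)-value sums the probabilities of all outcomes that are no more likely than the observed one, and that the Clopper-Pearson limits are quantiles of beta distributions. The variable names are illustrative only.

```r
xObs <- 10
nTrials <- 25
p0 <- 0.061

# Null probabilities of every possible outcome, and of the observed one
allProbs <- dbinom(0:nTrials, size = nTrials, prob = p0)
obsProb <- dbinom(xObs, size = nTrials, prob = p0)

# Two-sided P-value: total probability of outcomes no more likely than
# the observed one (the small tolerance guards against floating-point ties)
pTwoSided <- sum(allProbs[allProbs <= obsProb * (1 + 1e-7)])

# Clopper-Pearson 95% limits from beta quantiles
cpLower <- qbeta(0.025, xObs, nTrials - xObs + 1)
cpUpper <- qbeta(0.975, xObs + 1, nTrials - xObs)

pTwoSided
c(lower = cpLower, upper = cpUpper)
```

Here no outcome in the lower tail is as improbable as 10 successes, so this \(P\)-value is just the one-tailed probability, which is why it is about half of the doubled value computed earlier.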
Frequency table of female and male offspring number.
table(radiologistKids$offspringSex)
##
## female   male
##     57     30
Calculate the estimated proportion of offspring that are male, and the total number of offspring in the sample.
n <- sum(table(radiologistKids$offspringSex))
n
## [1] 87
pHat <- 30 / n
pHat
## [1] 0.3448276
Standard error of the sample proportion.
sqrt( (pHat * (1 - pHat))/n )
## [1] 0.0509588
Agresti-Coull 95% confidence interval for the population proportion.
pPrime <- (30 + 2)/(n + 4)
pPrime
## [1] 0.3516484
lower <- pPrime - 1.96 * sqrt( (pPrime * (1 - pPrime))/(n + 4) )
upper <- pPrime + 1.96 * sqrt( (pPrime * (1 - pPrime))/(n + 4) )
c(lower = lower, upper = upper)
##     lower     upper
## 0.2535425 0.4497542
Agresti-Coull 95% confidence interval for the population proportion using the binom package. To use this package you will need to install it (this needs to be done only once per computer) and load it using the library command (this needs to be done once per R session). The confidence interval from the binom package will be very slightly different from the one you calculated above because the formula we use takes a slight shortcut.
binom.confint(30, n = 87, method = "ac")
##          method  x  n      mean     lower     upper
## 1 agresti-coull 30 87 0.3448276 0.2532164 0.4495625
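The small difference between the two intervals can be traced to rounding in the hand formula. The sketch below assumes the full Agresti-Coull adjustment adds \(z^2/2\) successes and \(z^2\) trials, where \(z\) is the exact 97.5% normal quantile, instead of the rounded values 2, 4 and 1.96. The names xMales, nKids, nTilde, pTilde and halfWidth are illustrative only.

```r
xMales <- 30
nKids <- 87

z <- qnorm(0.975)                      # about 1.96
nTilde <- nKids + z^2                  # adjusted number of trials
pTilde <- (xMales + z^2 / 2) / nTilde  # adjusted proportion
halfWidth <- z * sqrt(pTilde * (1 - pTilde) / nTilde)

c(lower = pTilde - halfWidth, upper = pTilde + halfWidth)
```

Under that assumption the limits should match the binom.confint output rather than the shortcut calculation.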
Source: https://rstudio-pubs-static.s3.amazonaws.com/154137_6fe804f283e44a598e90a9f6a01ea7a8.html