How To Calculate Proportion In R

  • New methods
  • Table 7.1-1 and Figure 7.1-1. Binomial distribution with n = 27 and p = 0.25
  • Figure 7.1-2. Sampling distribution of a binomial proportion
  • Example 7.3. Sex and the X chromosome
  • Example 7.2. Radiologists' missing sons

Note: This document was converted to R-Markdown from this page by M. Drew LaMar. You can download the R-Markdown here.

Download the R code on this page as a single file here (make sure to install the "binom" package before running).

New methods

The variable names are plucked from the examples further below.

Calculate binomial probabilities:

dbinom(6, size = 27, prob = 0.25)

Binomial test:

binom.test(10, n = 25, p = 0.061)

Install an R package, here the binom package, to calculate a confidence interval for a proportion.

install.packages("binom", dependencies = TRUE)

Agresti-Coull 95% confidence interval for the proportion using the binom package.

binom.confint(30, n = 87, method = "ac")

Other new methods:
Sampling distribution of a proportion by repeated sampling from a known population.

Table 7.1-1 and Figure 7.1-1. Binomial distribution with n = 27 and p = 0.25

Table and histogram of binomial probabilities. Uses the data from Chapter 6 on the genetics of mirror-image flowers.

Calculate a binomial probability, the probability of obtaining \(X\) successes in \(n\) trials when trials are independent and the probability of success \(p\) is the same for every trial. The probability of getting exactly 6 left-handed flowers when \(n = 27\) and \(p = 0.25\) is

            dbinom(6, size = 27, prob = 0.25)          
            ## [1] 0.171883
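
The same probability can be worked out from the binomial formula \(\Pr[X] = \binom{n}{X} p^X (1 - p)^{n - X}\). The following is a minimal sketch of that arithmetic; the names `byHand` and `builtIn` are introduced here only for illustration.

```r
# Binomial probability of exactly x successes, computed two ways.
n <- 27    # number of trials
x <- 6     # number of successes
p <- 0.25  # probability of success on each trial

# By hand: number of ways to choose x successes, times the
# probability of any one such sequence.
byHand <- choose(n, x) * p^x * (1 - p)^(n - x)

# Built-in binomial probability function.
builtIn <- dbinom(x, size = n, prob = p)

c(byHand = byHand, builtIn = builtIn)
```

Both entries should agree with the value returned by dbinom above.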

Table of probabilities for all possible values for the number of left-handed flowers out of 27.

            xsuccesses <- 0:27
            probx <- dbinom(xsuccesses, size = 27, prob = 0.25)
            probTable <- data.frame(xsuccesses, probx)
            probTable
            ##    xsuccesses        probx
            ## 1           0 4.233057e-04
            ## 2           1 3.809751e-03
            ## 3           2 1.650892e-02
            ## 4           3 4.585812e-02
            ## 5           4 9.171623e-02
            ## 6           5 1.406316e-01
            ## 7           6 1.718830e-01
            ## 8           7 1.718830e-01
            ## 9           8 1.432358e-01
            ## 10          9 1.007956e-01
            ## 11         10 6.047736e-02
            ## 12         11 3.115500e-02
            ## 13         12 1.384667e-02
            ## 14         13 5.325641e-03
            ## 15         14 1.775214e-03
            ## 16         15 5.128395e-04
            ## 17         16 1.282099e-04
            ## 18         17 2.765311e-05
            ## 19         18 5.120947e-06
            ## 20         19 8.085705e-07
            ## 21         20 1.078094e-07
            ## 22         21 1.197882e-08
            ## 23         22 1.088984e-09
            ## 24         23 7.891188e-11
            ## 25         24 4.383993e-12
            ## 26         25 1.753597e-13
            ## 27         26 4.496403e-15
            ## 28         27 5.551115e-17
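
A few quick checks on a table like this can catch mistakes. The sketch below recomputes the probabilities so that it runs on its own; the names `totalProb`, `meanSuccesses`, `probAtMost6` and `probAtMost6Check` are illustrative and do not appear in the original code.

```r
# Sanity checks on the full binomial distribution with n = 27, p = 0.25.
xsuccesses <- 0:27
probx <- dbinom(xsuccesses, size = 27, prob = 0.25)

# The probabilities of all possible outcomes should sum to 1.
totalProb <- sum(probx)

# The mean number of successes should equal n * p.
meanSuccesses <- sum(xsuccesses * probx)

# A cumulative probability can be summed from the table
# or obtained directly with pbinom().
probAtMost6 <- sum(probx[xsuccesses <= 6])
probAtMost6Check <- pbinom(6, size = 27, prob = 0.25)

c(totalProb = totalProb, meanSuccesses = meanSuccesses,
  probAtMost6 = probAtMost6, probAtMost6Check = probAtMost6Check)
```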

Histogram of binomial probabilities for the number of left-handed flowers out of 27. This illustrates the full binomial distribution when \(n = 27\) and \(p = 0.25\).

            barplot(height = probx, names.arg = xsuccesses, space = 0, las = 1,
                ylab = "Probability", xlab = "Number of left-handed flowers")

Figure 7.1-2. Sampling distribution of a binomial proportion

Compare sampling distributions for the proportion based on n = 10 and n = 100.

Take a large number of random samples of \(n = 10\) from a population having probability of success \(p = 0.25\). Convert to proportions by dividing by the sample size. Do the same for the larger sample size \(n = 100\). The following commands use 10,000 random samples.

            successes10 <- rbinom(10000, size = 10, prob = 0.25)
            proportion10 <- successes10 / 10
            successes100 <- rbinom(10000, size = 100, prob = 0.25)
            proportion100 <- successes100 / 100

Plot and visually compare the sampling distributions of the proportions based on \(n = 10\) and \(n = 100\). The par(mfrow = c(2,1)) command sets up a graph window that will plot both graphs arranged in 2 rows and 1 column.

            par(mfrow = c(2,1))
            hist(proportion10, breaks = 10, right = FALSE, xlim = c(0,1), xlab = "Sample proportion")
            hist(proportion100, breaks = 20, right = FALSE, xlim = c(0,1), xlab = "Sample proportion")

            par(mfrow = c(1,1))          

Commands for a fancier plot:

            oldpar <- par(no.readonly = TRUE) # make backup of default graph settings
            par(mfrow = c(2,1), oma = c(4, 0, 0, 0), mar = c(1, 6, 4, 1)) # adjust margins
            saveHist10 <- hist(proportion10, breaks = 10, right = FALSE, plot = FALSE)
            saveHist10$counts <- saveHist10$counts/sum(saveHist10$counts)
            plot(saveHist10, col = "firebrick", las = 1, cex.lab = 1.2,
                ylim = c(0,0.3), xlim = c(0,1), ylab = "Relative frequency",
                xlab = "", main = "")
            text(x = 1, y = 0.25, labels = "n = 10", adj = 1, cex = 1.1)
            saveHist100 <- hist(proportion100, breaks = 40, right = FALSE, plot = FALSE)
            saveHist100$counts <- saveHist100$counts/sum(saveHist100$counts)
            plot(saveHist100, col = "firebrick", las = 1, cex.lab = 1.2,
                ylim = c(0,0.1), xlim = c(0,1), ylab = "Relative frequency",
                xlab = "", main = "")
            text(x = 1, y = 0.08, labels = "n = 100", adj = 1, cex = 1.1)
            mtext("Proportion of successes", side = 1, outer = TRUE, padj = 2)

            par(oldpar) # Revert to backup graph settings          
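
The narrowing of the sampling distribution with larger \(n\) can also be checked numerically. The sketch below compares the standard deviation of the simulated proportions with the theoretical standard error \(\sqrt{p(1 - p)/n}\). The seed and the names `simulatedSD` and `theoreticalSE` are assumptions added for this illustration and are not part of the original code.

```r
# Compare the spread of simulated sample proportions with the
# theoretical standard error sqrt(p * (1 - p) / n).
set.seed(1)  # arbitrary seed, used only to make the sketch reproducible
p <- 0.25

proportion10 <- rbinom(10000, size = 10, prob = p) / 10
proportion100 <- rbinom(10000, size = 100, prob = p) / 100

simulatedSD <- c(n10 = sd(proportion10), n100 = sd(proportion100))
theoreticalSE <- c(n10 = sqrt(p * (1 - p) / 10),
                   n100 = sqrt(p * (1 - p) / 100))

rbind(simulatedSD, theoreticalSE)
```

The two rows should be close, with the spread for \(n = 100\) roughly \(\sqrt{10}\) times smaller than for \(n = 10\).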

Example 7.3. Sex and the X chromosome

The binomial test, used to test whether spermatogenesis genes in the mouse genome occur with unusual frequency on the X chromosome.

Read and inspect the data. Each row in the data file represents a different spermatogenesis gene.

            mouseGenes <- read.csv(url("http://whitlockschluter.zoology.ubc.ca/wp-content/data/chapter07/chap07e2SexAndX.csv"))
            head(mouseGenes)
            ##   chromosome onX
            ## 1          4  no
            ## 2          4  no
            ## 3          6  no
            ## 4          6  no
            ## 5          6  no
            ## 6          7  no

Tabulate the number of spermatogenesis genes on the X chromosome and the number not on the X chromosome.

            table(mouseGenes$onX)          
            ## 
            ##  no yes 
            ##  15  10

Calculate the binomial probabilities of all possible outcomes under the null hypothesis (Table 7.2-1). Under the binomial distribution with \(n = 25\) and \(p = 0.061\), the number of successes can be any integer between 0 and 25.

            xsuccesses <- 0:25
            probx <- dbinom(xsuccesses, size = 25, prob = 0.061)
            data.frame(xsuccesses, probx)
            ##    xsuccesses        probx
            ## 1           0 2.073193e-01
            ## 2           1 3.367007e-01
            ## 3           2 2.624760e-01
            ## 4           3 1.307255e-01
            ## 5           4 4.670757e-02
            ## 6           5 1.274386e-02
            ## 7           6 2.759585e-03
            ## 8           7 4.865905e-04
            ## 9           8 7.112305e-05
            ## 10          9 8.727323e-06
            ## 11         10 9.071211e-07
            ## 12         11 8.035781e-08
            ## 13         12 6.090306e-09
            ## 14         13 3.956429e-10
            ## 15         14 2.203032e-11
            ## 16         15 1.049510e-12
            ## 17         16 4.261188e-14
            ## 18         17 1.465509e-15
            ## 19         18 4.231266e-17
            ## 20         19 1.012696e-18
            ## 21         20 1.973624e-20
            ## 22         21 3.052667e-22
            ## 23         22 3.605629e-24
            ## 24         23 3.055193e-26
            ## 25         24 1.653947e-28
            ## 26         25 4.297797e-31

Use these probabilities to calculate the \(P\)-value corresponding to an observed 10 spermatogenesis genes on the X chromosome. Remember to multiply the probability of 10 or more successes by 2 for the two-tailed test result.

            2 * sum(probx[xsuccesses >= 10])
            ## [1] 1.987976e-06
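
The tail probability can also be obtained without building the full table. This is a minimal sketch using pbinom(); note that with lower.tail = FALSE it returns the probability of strictly more than the given number of successes, so "10 or more" is requested as "more than 9". The names `oneTail` and `pValue` are illustrative.

```r
# Two-tailed P-value by doubling the upper-tail probability.
# pbinom(q, ..., lower.tail = FALSE) gives Pr[X > q], so q = 9
# corresponds to Pr[X >= 10].
oneTail <- pbinom(9, size = 25, prob = 0.061, lower.tail = FALSE)
pValue <- 2 * oneTail
pValue
```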

For a faster result, try R's built-in binomial test. The resulting \(P\)-value is slightly different from our calculation. In the book, we get the two-tailed probability by multiplying the one-tailed probability by 2. As we say on page 188, computer programs may calculate the probability of extreme results at the "other" tail with a different method. The output of binom.test includes a confidence interval for the proportion using the Clopper-Pearson method, which is more conservative than the Agresti-Coull method.

            library(binom) # Load the binom package
            binom.test(10, n = 25, p = 0.061)
            ## 
            ##  Exact binomial test
            ## 
            ## data:  10 and 25
            ## number of successes = 10, number of trials = 25, p-value =
            ## 9.94e-07
            ## alternative hypothesis: true probability of success is not equal to 0.061
            ## 95 percent confidence interval:
            ##  0.2112548 0.6133465
            ## sample estimates:
            ## probability of success 
            ##                    0.4
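
To see where the difference comes from, here is a sketch of one common way to handle the "other" tail: sum the probabilities of every outcome that is no more likely than the observed one. This is offered as an assumption about the approach, not as a description of binom.test's exact internals; the small tolerance factor and the names `probAll`, `probObserved`, `pSketch` and `pBuiltIn` are introduced here for illustration.

```r
# Two-sided P-value as the total probability of all outcomes that are
# no more probable than the observed count (an assumed approach).
x <- 10     # observed number of successes
n <- 25     # number of trials
p0 <- 0.061 # null hypothesis proportion

probAll <- dbinom(0:n, size = n, prob = p0)
probObserved <- dbinom(x, size = n, prob = p0)

# The factor (1 + 1e-7) is an assumed tolerance for floating-point ties.
pSketch <- sum(probAll[probAll <= probObserved * (1 + 1e-7)])

pBuiltIn <- binom.test(x, n = n, p = p0)$p.value

c(sketch = pSketch, builtIn = pBuiltIn)
```

Here no outcome in the lower tail (0 or 1 successes) is as improbable as 10 successes, so only the upper tail contributes and the result is about half the doubled value calculated by hand.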

Example 7.2. Radiologists' missing sons

Standard error and 95% confidence interval for a proportion using the Agresti-Coull method for the confidence interval.

Read and inspect the data.

            radiologistKids <- read.csv(url("http://whitlockschluter.zoology.ubc.ca/wp-content/data/chapter07/chap07e3RadiologistOffspringSex.csv"))
            head(radiologistKids)
            ##   offspringSex
            ## 1         male
            ## 2         male
            ## 3         male
            ## 4         male
            ## 5         male
            ## 6         male

Frequency table of female and male offspring number.

            table(radiologistKids$offspringSex)          
            ## 
            ## female   male 
            ##     57     30

Calculate the estimated proportion of offspring that are male, and the total number of offspring.

            n <- sum(table(radiologistKids$offspringSex))
            n
            ## [1] 87
            pHat <- 30 / n
            pHat
            ## [1] 0.3448276          
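
The proportion can also be taken straight from the frequency table rather than typing the count by hand. The sketch below rebuilds the offspring data from the counts in the table above (57 female, 30 male) so that it runs without downloading the file; the names `sexTable` and `proportions` are illustrative.

```r
# Proportions computed directly from a frequency table.
# The data vector is reconstructed from the counts shown above.
offspringSex <- rep(c("female", "male"), times = c(57, 30))

sexTable <- table(offspringSex)
n <- sum(sexTable)                  # total number of offspring
proportions <- prop.table(sexTable) # each count divided by the total

pHat <- as.numeric(proportions["male"])
pHat
```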

Standard error of the sample proportion.

            sqrt( (pHat * (1 - pHat))/n )
            ## [1] 0.0509588          

Agresti-Coull 95% confidence interval for the population proportion.

            pPrime <- (30 + 2)/(n + 4)
            pPrime
            ## [1] 0.3516484
            lower <- pPrime - 1.96 * sqrt( (pPrime * (1 - pPrime))/(n + 4) )
            upper <- pPrime + 1.96 * sqrt( (pPrime * (1 - pPrime))/(n + 4) )
            c(lower = lower, upper = upper)
            ##     lower     upper 
            ## 0.2535425 0.4497542

Agresti-Coull 95% confidence interval for the population proportion using the binom package. To use this package you will need to install it (this needs to be done only once per computer) and load it using the library command (this needs to be done once per R session). The confidence interval from the binom package will be very slightly different from the one you calculated above because the formula we use takes a slight shortcut.

            binom.confint(30, n = 87, method = "ac")          
            ##          method  x  n      mean     lower     upper
            ## 1 agresti-coull 30 87 0.3448276 0.2532164 0.4495625
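
The shortcut can be made explicit. The sketch below assumes that the Agresti-Coull interval without the shortcut adds \(z^2/2\) successes and \(z^2\) trials, where \(z\) is the exact 97.5% normal quantile, in place of the rounded values 2, 4 and 1.96. It uses base R only; the names `z`, `nTilde`, `pTilde`, `lowerAC` and `upperAC` are introduced here for illustration.

```r
# Agresti-Coull interval without rounding z to 1.96 and z^2 to 4
# (assumed to be the source of the small difference noted above).
x <- 30  # number of male offspring
n <- 87  # total number of offspring

z <- qnorm(0.975)            # about 1.96
nTilde <- n + z^2            # the shortcut uses n + 4
pTilde <- (x + z^2 / 2) / nTilde  # the shortcut uses (x + 2)/(n + 4)

halfWidth <- z * sqrt(pTilde * (1 - pTilde) / nTilde)
lowerAC <- pTilde - halfWidth
upperAC <- pTilde + halfWidth

c(lower = lowerAC, upper = upperAC)
```

Under that assumption, the limits should agree with the binom.confint output shown above to about six decimal places.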

Source: https://rstudio-pubs-static.s3.amazonaws.com/154137_6fe804f283e44a598e90a9f6a01ea7a8.html
