Bar chart with relative and cumulative frequency

I am trying to make a bar chart with the counts on top of the bars, the relative frequency on the left axis, and the cumulative frequency on the right axis, but I am not succeeding.

The data is:

dput(x2)
    c(1L, 5L, 3L, 3L, 5L, 3L, 4L, 1L, 2L, 2L, 7L, 3L, 2L, 2L, 3L, 
    3L, 2L, 1L, 5L, 4L, 4L, 3L, 5L, 2L, 6L, 2L, 1L, 2L, 5L, 5L, 5L, 
    3L, 6L, 4L, 5L, 4L, 6L, 7L)

The frequency distribution:

table(x2)
x2
1 2 3 4 5 6 7 
4 8 8 5 8 3 2 

Relative frequencies

prop.table(table(x2))
x2
         1          2          3          4          5          6          7 
0.10526316 0.21052632 0.21052632 0.13157895 0.21052632 0.07894737 0.05263158 

What I'm trying to produce looks exactly like the image below:

[image]

Author: Roland, 2017-04-27

1 answer

This can be solved with the ggplot2 package. First, I build a data frame with everything that needs to be plotted, using column names that are meaningful in this context:

dados <- c(1L, 5L, 3L, 3L, 5L, 3L, 4L, 1L, 2L, 2L, 7L, 3L, 2L, 2L, 3L, 
           3L, 2L, 1L, 5L, 4L, 4L, 3L, 5L, 2L, 6L, 2L, 1L, 2L, 5L, 5L, 5L, 
           3L, 6L, 4L, 5L, 4L, 6L, 7L)

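# Combine counts, relative frequencies and cumulative frequencies
dados.plot <- data.frame(table(dados), table(dados)/sum(table(dados)),
  cumsum(prop.table(table(dados))))
# Both tables contribute a category column; drop the duplicate (column 3)
dados.plot <- dados.plot[, -3]
names(dados.plot) <- c("Categoria", "FreqAbsoluta", "FreqRelativa", 
  "FreqCumulativa")
# Express the relative frequency as a percentage
dados.plot$FreqRelativa <- dados.plot$FreqRelativa*100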
dados.plot
  Categoria FreqAbsoluta FreqRelativa FreqCumulativa
1         1            4    10.526316      0.1052632
2         2            8    21.052632      0.3157895
3         3            8    21.052632      0.5263158
4         4            5    13.157895      0.6578947
5         5            8    21.052632      0.8684211
6         6            3     7.894737      0.9473684
7         7            2     5.263158      1.0000000   
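
As a possible alternative, the same table can be built column by column, which avoids having to drop the duplicated category column afterwards. This is only a sketch: the intermediate object freq is a name introduced here for illustration, and the result is meant to match the data frame shown above.

```r
dados <- c(1L, 5L, 3L, 3L, 5L, 3L, 4L, 1L, 2L, 2L, 7L, 3L, 2L, 2L, 3L,
           3L, 2L, 1L, 5L, 4L, 4L, 3L, 5L, 2L, 6L, 2L, 1L, 2L, 5L, 5L, 5L,
           3L, 6L, 4L, 5L, 4L, 6L, 7L)

# Count each category once and reuse the result ('freq' is an illustrative name)
freq <- table(dados)

dados.plot <- data.frame(
  Categoria      = factor(names(freq), levels = names(freq)),
  FreqAbsoluta   = as.vector(freq),                         # counts
  FreqRelativa   = as.vector(prop.table(freq)) * 100,       # percent
  FreqCumulativa = as.vector(cumsum(prop.table(freq)))      # proportion, 0 to 1
)
dados.plot
```

Naming each column where it is computed makes it easier to see which quantity is a percentage and which is a proportion.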

With the data frame dados.plot prepared, I create a bar chart from the column FreqRelativa. Next, I put the values of FreqAbsoluta above the bars. Finally, I use the sec_axis function to add a second axis. Notice that I rescaled FreqCumulativa so that the line ends at the same height as the tallest bar: the line is drawn at FreqCumulativa multiplied by the maximum of FreqRelativa, and the secondary axis undoes this by multiplying by 100 (the maximum cumulative percentage) and dividing by the maximum of FreqRelativa.

library(ggplot2)

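ggplot(dados.plot, aes(x=Categoria, y=FreqRelativa)) +
  # bars: relative frequency (%)
  geom_bar(stat="identity") + 
  # line: cumulative frequency, rescaled to the height of the tallest bar
  geom_line(aes(y=FreqCumulativa*max(FreqRelativa), group=1)) +
  labs(x="Categoria", y="Frequência Relativa (%)") + 
  # counts above the bars
  geom_text(aes(label=FreqAbsoluta), vjust=-0.8) +
  # right-hand axis: converts heights on the left axis back to cumulative %
  scale_y_continuous(
    sec.axis=sec_axis(trans=~ .*100/(max(dados.plot$FreqRelativa)), 
    name = "Frequência Cumulativa (%)"))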

[image]
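
To check the rescaling used for the line and the secondary axis, the arithmetic can be reproduced without ggplot2. This is a minimal sketch: the counts are taken from table(x2) in the question, and the names freq_rel, freq_cum, line_height and right_axis are introduced here only for illustration.

```r
# Counts from table(x2); 38 observations in total
freq_rel <- c(4, 8, 8, 5, 8, 3, 2) / 38 * 100   # relative frequency, in percent
freq_cum <- cumsum(freq_rel) / 100              # cumulative frequency, 0 to 1

# Height at which each point of the line is drawn on the left axis
line_height <- freq_cum * max(freq_rel)

# Value the right axis shows at those heights (the sec_axis transformation)
right_axis <- line_height * 100 / max(freq_rel)

# The line tops out at the height of the tallest bar
all.equal(max(line_height), max(freq_rel))

# The right axis recovers the cumulative frequency as a percentage
all.equal(right_axis, freq_cum * 100)
```

Both comparisons should return TRUE, which is why the line and the right-hand axis stay consistent with each other.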

Colors, captions, and other chart characteristics can be adjusted by consulting the ggplot2 help.
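
As an illustration of such adjustments, here is one possible variation. It is a sketch, not the only way to do it: the colors, the title text, the helper variable max_rel and the choice of theme_minimal() are all assumptions made for this example, and the data frame is rebuilt from the counts so that the snippet runs on its own.

```r
library(ggplot2)

# Rebuild the plotting data from the counts in table(x2)
dados.plot <- data.frame(
  Categoria    = factor(1:7),
  FreqAbsoluta = c(4, 8, 8, 5, 8, 3, 2)
)
dados.plot$FreqRelativa   <- dados.plot$FreqAbsoluta / sum(dados.plot$FreqAbsoluta) * 100
dados.plot$FreqCumulativa <- cumsum(dados.plot$FreqRelativa) / 100

# Scaling factor shared by the line and the secondary axis (illustrative name)
max_rel <- max(dados.plot$FreqRelativa)

p <- ggplot(dados.plot, aes(x = Categoria, y = FreqRelativa)) +
  geom_col(fill = "steelblue") +                        # colored bars
  geom_line(aes(y = FreqCumulativa * max_rel, group = 1),
            colour = "firebrick") +                     # colored cumulative line
  geom_point(aes(y = FreqCumulativa * max_rel),
             colour = "firebrick") +                    # markers on the line
  geom_text(aes(label = FreqAbsoluta), vjust = -0.8) +  # counts above the bars
  scale_y_continuous(
    # transformation passed as the first argument of sec_axis
    sec.axis = sec_axis(~ . * 100 / max_rel, name = "Frequência Cumulativa (%)")
  ) +
  labs(
    title = "Distribuição de frequências",              # illustrative title
    x = "Categoria",
    y = "Frequência Relativa (%)"
  ) +
  theme_minimal()

p
```

Here geom_col() is used as a shorthand for geom_bar(stat="identity"); the rest of the structure follows the plot above.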

Author: Marcus Nunes, 2017-04-27 17:52:46