Bar chart with relative and cumulative frequency
I am trying to make a bar chart with the counts on top of the bars and the relative and cumulative frequencies on the left and right axes, respectively, but I am not succeeding.
The data is:
dput(x2)
c(1L, 5L, 3L, 3L, 5L, 3L, 4L, 1L, 2L, 2L, 7L, 3L, 2L, 2L, 3L,
3L, 2L, 1L, 5L, 4L, 4L, 3L, 5L, 2L, 6L, 2L, 1L, 2L, 5L, 5L, 5L,
3L, 6L, 4L, 5L, 4L, 6L, 7L)
The frequency distribution
table(x2)
x2
1 2 3 4 5 6 7
4 8 8 5 8 3 2
Relative frequencies
prop.table(table(x2))
x2
         1          2          3          4          5          6          7 
0.10526316 0.21052632 0.21052632 0.13157895 0.21052632 0.07894737 0.05263158 
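Cumulative frequencies

The cumulative frequencies wanted for the right-hand axis are not computed above. A minimal sketch, assuming the same vector x2 (the names rel and cum are only illustrative), takes the running sum of the relative frequencies:

```r
x2 <- c(1L, 5L, 3L, 3L, 5L, 3L, 4L, 1L, 2L, 2L, 7L, 3L, 2L, 2L, 3L,
        3L, 2L, 1L, 5L, 4L, 4L, 3L, 5L, 2L, 6L, 2L, 1L, 2L, 5L, 5L, 5L,
        3L, 6L, 4L, 5L, 4L, 6L, 7L)

rel <- prop.table(table(x2))  # relative frequencies, as above
cum <- cumsum(rel)            # running total of the proportions; the last value is 1
round(cum, 4)
```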
What I'm trying to do is exactly like in the image below
1 answer
This can be solved with the ggplot2 package. First, I build a data frame with everything that needs to be plotted, using names that are meaningful in this context:
dados <- c(1L, 5L, 3L, 3L, 5L, 3L, 4L, 1L, 2L, 2L, 7L, 3L, 2L, 2L, 3L,
3L, 2L, 1L, 5L, 4L, 4L, 3L, 5L, 2L, 6L, 2L, 1L, 2L, 5L, 5L, 5L,
3L, 6L, 4L, 5L, 4L, 6L, 7L)
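# Two tables side by side give four columns (category, count, category again,
# proportion); the cumulative proportions become the fifth column.
dados.plot <- data.frame(table(dados), table(dados)/sum(table(dados)),
                         cumsum(prop.table(table(dados))))
# Drop the duplicated category column.
dados.plot <- dados.plot[, -3]
names(dados.plot) <- c("Categoria", "FreqAbsoluta", "FreqRelativa",
                       "FreqCumulativa")
# Express the relative frequency as a percentage; FreqCumulativa stays a proportion.
dados.plot$FreqRelativa <- dados.plot$FreqRelativa*100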
dados.plot
  Categoria FreqAbsoluta FreqRelativa FreqCumulativa
1         1            4    10.526316      0.1052632
2         2            8    21.052632      0.3157895
3         3            8    21.052632      0.5263158
4         4            5    13.157895      0.6578947
5         5            8    21.052632      0.8684211
6         6            3     7.894737      0.9473684
7         7            2     5.263158      1.0000000
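The same table can also be built column by column, which avoids having to drop the duplicated category column afterwards. This is only an equivalent sketch, not the construction used in this answer; the names tab and freq are hypothetical:

```r
dados <- c(1L, 5L, 3L, 3L, 5L, 3L, 4L, 1L, 2L, 2L, 7L, 3L, 2L, 2L, 3L,
           3L, 2L, 1L, 5L, 4L, 4L, 3L, 5L, 2L, 6L, 2L, 1L, 2L, 5L, 5L, 5L,
           3L, 6L, 4L, 5L, 4L, 6L, 7L)

tab <- table(dados)  # counts per category

freq <- data.frame(
  Categoria      = factor(names(tab), levels = names(tab)),
  FreqAbsoluta   = as.vector(tab),                         # counts
  FreqRelativa   = as.vector(prop.table(tab)) * 100,       # percent
  FreqCumulativa = as.vector(cumsum(prop.table(tab)))      # proportion, from 0 to 1
)
freq
```

Naming each column explicitly makes it easier to see that FreqRelativa is stored as a percentage while FreqCumulativa is stored as a proportion.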
With the data frame dados.plot prepared, I create a bar chart from the column FreqRelativa. Next, I put the values of FreqAbsoluta above the bars. Finally, I use the sec_axis function to add a second axis. Notice that I rescaled FreqCumulativa so that the line ends at the same height as the tallest bar: the line is drawn at FreqCumulativa multiplied by the maximum of FreqRelativa, and the secondary axis converts those heights back to percentages by multiplying by 100 (the maximum cumulative percentage) and dividing by the maximum of FreqRelativa.
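The rescaling can be checked numerically. The sketch below uses the counts from the frequency table; the names rel, cum, k, line_y and right_axis are illustrative and do not appear in the plotting code:

```r
rel <- c(4, 8, 8, 5, 8, 3, 2) / 38 * 100  # FreqRelativa, in percent
cum <- cumsum(rel) / 100                  # FreqCumulativa, as a proportion

k <- max(rel)                    # height of the tallest bar, about 21.05
line_y <- cum * k                # heights at which the line is drawn on the left axis
right_axis <- line_y * 100 / k   # values the secondary axis shows for those heights

all.equal(right_axis, cum * 100)   # the right axis reads cumulative percent
all.equal(max(line_y), max(rel))   # the line ends level with the tallest bar
```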
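library(ggplot2)
ggplot(dados.plot, aes(x=Categoria, y=FreqRelativa)) +
  # Bars: relative frequency in percent.
  geom_bar(stat="identity") +
  # Line: cumulative proportion rescaled to end at the tallest bar.
  geom_line(aes(y=FreqCumulativa*max(FreqRelativa), group=1)) +
  labs(x="Categoria", y="Frequência Relativa (%)") +
  # Counts printed just above each bar.
  geom_text(aes(label=FreqAbsoluta), vjust=-0.8) +
  scale_y_continuous(
    # Right axis: convert bar-scale heights back to cumulative percent.
    # The formula is passed positionally because the argument was renamed
    # from `trans` to `transform` in ggplot2 3.5.0.
    sec.axis=sec_axis(~ .*100/(max(dados.plot$FreqRelativa)),
                      name = "Frequência Cumulativa (%)"))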
Colors, captions, and other chart characteristics can be adjusted by consulting the ggplot2 help.
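As one possible illustration of such adjustments, the sketch below rebuilds the data and adds a fill color, a colored line with points, a caption and a lighter theme. The color choices, the caption text and the names tab, freq, k and p are assumptions made for this example, not part of the original chart:

```r
library(ggplot2)

dados <- c(1L, 5L, 3L, 3L, 5L, 3L, 4L, 1L, 2L, 2L, 7L, 3L, 2L, 2L, 3L,
           3L, 2L, 1L, 5L, 4L, 4L, 3L, 5L, 2L, 6L, 2L, 1L, 2L, 5L, 5L, 5L,
           3L, 6L, 4L, 5L, 4L, 6L, 7L)
tab <- table(dados)
freq <- data.frame(
  Categoria      = names(tab),
  FreqAbsoluta   = as.vector(tab),
  FreqRelativa   = as.vector(prop.table(tab)) * 100,   # percent
  FreqCumulativa = as.vector(cumsum(prop.table(tab)))  # proportion
)
k <- max(freq$FreqRelativa)  # scaling factor between the two axes

p <- ggplot(freq, aes(x = Categoria, y = FreqRelativa)) +
  geom_col(fill = "steelblue") +  # same bars as geom_bar(stat = "identity")
  geom_line(aes(y = FreqCumulativa * k, group = 1), colour = "firebrick") +
  geom_point(aes(y = FreqCumulativa * k), colour = "firebrick") +
  geom_text(aes(label = FreqAbsoluta), vjust = -0.8) +
  scale_y_continuous(
    sec.axis = sec_axis(~ . * 100 / k, name = "Frequência Cumulativa (%)")) +
  labs(x = "Categoria", y = "Frequência Relativa (%)", caption = "n = 38") +
  theme_minimal()
p
```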