apply, sapply, mapply, lapply, vapply, rapply, tapply, replicate, aggregate, by and related functions in R: when and how to use them?
What is the difference between the functions apply, sapply, mapply, lapply, vapply, rapply, tapply, replicate, aggregate, by and related functions in R?
When and how should each one be used?
Are there other packages that do something similar or that can replace these functions?
2 answers
Translated from here.
R has many *apply functions, and they are well explained in the help pages (e.g. ?apply). Because there are so many, novice users may have difficulty deciding which one fits their situation, or even remembering them all.
-
apply - when you want to apply a function to the rows or columns of a matrix (or to the margins of a higher-dimensional array).
# Two-dimensional matrix
M <- matrix(seq(1, 16), 4, 4)

# Apply min to the rows
apply(M, 1, min)
[1] 1 2 3 4

# Apply max to the columns
apply(M, 2, max)
[1]  4  8 12 16

# Three-dimensional array
M <- array(seq(32), dim = c(4, 4, 2))

# Apply sum over each M[*, , ], i.e. sum across the 2nd and 3rd dimensions
apply(M, 1, sum)
# The result is one-dimensional
[1] 120 128 136 144

# Apply sum over each M[*, *, ], i.e. sum across the 3rd dimension
apply(M, c(1, 2), sum)
# The result is two-dimensional
     [,1] [,2] [,3] [,4]
[1,]   18   26   34   42
[2,]   20   28   36   44
[3,]   22   30   38   46
[4,]   24   32   40   48
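One detail worth knowing, sketched here with the same 4 x 4 matrix: when the function returns more than one value per row, apply stores each result as a column, so the output can look transposed. The name row_ranges is only illustrative.

```r
# Same 4 x 4 matrix as in the example above
M <- matrix(seq(1, 16), 4, 4)

# range() returns two values (min, max) for each row.
# apply() stores each result as a COLUMN, so the output is 2 x 4, not 4 x 2.
row_ranges <- apply(M, 1, range)
dim(row_ranges)

# Transpose to get one row of results per input row
t(row_ranges)
```

If you need one output row per input row, wrapping the call in t() as above is a common workaround.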
-
lapply - when you want to apply a function to each element of a list and get a list back.
This is the workhorse behind many of the other *apply functions.
x <- list(a = 1, b = 1:3, c = 10:100)
lapply(x, FUN = length)
$a
[1] 1

$b
[1] 3

$c
[1] 91

lapply(x, FUN = sum)
$a
[1] 1

$b
[1] 6

$c
[1] 5005
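lapply also accepts anonymous functions, and any extra arguments given after FUN are passed on to the function. A minimal sketch reusing the same list x (the doubling and the choice of quantile are only for illustration):

```r
x <- list(a = 1, b = 1:3, c = 10:100)

# Anonymous function: double every value in every list entry
doubled <- lapply(x, function(v) v * 2)
doubled$b

# Extra arguments after FUN are passed on to it:
# here probs = 0.5 goes to quantile(), giving the median of each entry
medians <- lapply(x, quantile, probs = 0.5)
str(medians)
```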
-
sapply - when you want to apply a function to each element of a list, but want a vector back instead of a list.
Instead of using unlist(lapply(...)), consider using sapply.
x <- list(a = 1, b = 1:3, c = 10:100)
# Compare with the lapply output above: a named vector, not a list
sapply(x, FUN = length)
 a  b  c
 1  3 91

sapply(x, FUN = sum)
   a    b    c
   1    6 5005
In more advanced uses, sapply will try to coerce the result into a multi-dimensional array where appropriate. For example, if our function returns vectors of the same length, sapply uses them as the columns of a matrix:
sapply(1:5, function(x) rnorm(3, x))
If our function returns a 2-dimensional matrix, sapply does essentially the same thing, treating each returned matrix as a single long vector:
sapply(1:5, function(x) matrix(x, 2, 2))
Unless we specify simplify = "array", in which case it uses the individual matrices to build a multi-dimensional array:
sapply(1:5, function(x) matrix(x, 2, 2), simplify = "array")
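To see what sapply builds in each of these cases, it can help to inspect the dimensions of the results. A small sketch (the names m1, m2, m3 and r are illustrative); note that when the results have different lengths, sapply cannot simplify and falls back to a list:

```r
# Each call returns 3 values, so they become the columns of a 3 x 5 matrix
m1 <- sapply(1:5, function(x) rnorm(3, x))
dim(m1)

# Each 2 x 2 matrix is flattened into a column of length 4: a 4 x 5 matrix
m2 <- sapply(1:5, function(x) matrix(x, 2, 2))
dim(m2)

# With simplify = "array" the matrices are kept intact: a 2 x 2 x 5 array
m3 <- sapply(1:5, function(x) matrix(x, 2, 2), simplify = "array")
dim(m3)

# Results of different lengths cannot be simplified, so a list comes back
r <- sapply(1:3, function(x) seq_len(x))
is.list(r)
```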
-
vapply - for when you want to use sapply but perhaps need faster code.
With vapply, you give R an example of the kind of value your function will return, which can improve performance.
x <- list(a = 1, b = 1:3, c = 10:100)
# Note that since the advantage here is mainly speed, this
# example is only for illustration. We're telling R that
# everything returned by length() should be an integer of
# length 1.
vapply(x, FUN = length, FUN.VALUE = 0L)
 a  b  c
 1  3 91
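Besides speed, vapply is stricter than sapply, which tends to make it safer inside functions and scripts. A sketch of two differences, using the same list x: the result type stays predictable for empty input, and a mismatch with FUN.VALUE raises an error instead of silently changing the type of the result.

```r
x <- list(a = 1, b = 1:3, c = 10:100)

# Empty input: sapply returns an empty list, vapply an empty integer vector
empty_s <- sapply(list(), length)
empty_v <- vapply(list(), length, FUN.VALUE = integer(1))
class(empty_s)
class(empty_v)

# Type mismatch: the function returns character, but we promised an integer,
# so vapply stops with an error (caught here so the script keeps running)
res <- tryCatch(
  vapply(x, function(v) "oops", FUN.VALUE = integer(1)),
  error = function(e) "vapply raised an error"
)
res
```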
-
mapply - for when you have several data structures (e.g. vectors, lists) and you want to apply a function to the first elements of each, then to the second elements, and so on, coercing the result into a vector or array as in sapply.
In this case your function must accept multiple arguments.
# Sum the 1st elements, the 2nd elements, etc.
mapply(sum, 1:5, 1:5, 1:5)
[1]  3  6  9 12 15

# To do rep(1, 4), rep(2, 3), etc.
mapply(rep, 1:4, 4:1)
[[1]]
[1] 1 1 1 1

[[2]]
[1] 2 2 2

[[3]]
[1] 3 3

[[4]]
[1] 4
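Two arguments that often come up with mapply are MoreArgs, for values that should stay the same on every call, and SIMPLIFY = FALSE, which always returns a list. A minimal sketch with made-up inputs (the argument name scale is hypothetical):

```r
# Vectorise over x and y; 'scale' is held fixed through MoreArgs
scaled <- mapply(function(x, y, scale) (x + y) * scale,
                 1:3, 4:6,
                 MoreArgs = list(scale = 10))
scaled

# SIMPLIFY = FALSE keeps the result as a list even when it could be a vector
as_list <- mapply(function(x, y) x + y, 1:3, 4:6, SIMPLIFY = FALSE)
str(as_list)
```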
-
rapply - for when you want to apply a function recursively to each element of a nested list.
# Append "!" to a string, otherwise increment
myFun <- function(x) {
  if (is.character(x)) {
    return(paste(x, "!", sep = ""))
  } else {
    return(x + 1)
  }
}

# Example nested list structure
l <- list(a = list(a1 = "Boo", b1 = 2, c1 = "Eeek"),
          b = 3,
          c = "Yikes",
          d = list(a2 = 1, b2 = list(a3 = "Hey", b3 = 5)))

# The result is a named vector, coerced to character
rapply(l, myFun)

# The result is a nested list like l, with the values altered
rapply(l, myFun, how = "replace")
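rapply also has a classes argument that restricts the function to elements of a given class, which can replace the type check done inside myFun. A sketch using a smaller made-up list (nested and incremented are hypothetical names):

```r
nested <- list(a = list(a1 = "Boo", b1 = 2),
               b = 3,
               c = "Yikes")

# Only numeric elements are passed to the function;
# with how = "replace", everything else is left untouched
incremented <- rapply(nested, function(x) x + 1,
                      classes = "numeric", how = "replace")
str(incremented)
```

Note that classes = "numeric" matches doubles such as 2 and 3; integer elements such as 1:3 have class "integer" and would need to be listed separately.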
-
tapply - for when you want to apply a function to subsets of a vector, where the subsets are defined by another vector, usually a factor.
A vector:
x <- 1:20
A factor (of the same length!) defining the groups:
y <- factor(rep(letters[1:5], each = 4))
Add up the values in x within each subgroup defined by y:
tapply(x, y, sum)
 a  b  c  d  e
10 26 42 58 74
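One way to picture what tapply does is to split the vector by the factor and then apply the function to each piece. A sketch of that equivalence with the same x and y (by_group, pieces and by_hand are illustrative names):

```r
x <- 1:20
y <- factor(rep(letters[1:5], each = 4))

# tapply in one step
by_group <- tapply(x, y, sum)

# The same computation done by hand: split x by y, then sum each piece
pieces <- split(x, y)
by_hand <- sapply(pieces, sum)

# Same values, same group names
all(by_group == by_hand)
names(by_hand)
```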
-
aggregate and by - it is relatively easy to collapse data in R using one or more BY variables and a defined function.
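# Aggregate mtcars by group, returning means for the numeric variables.
# The grouping variables cyl and vs are an assumption: the original call
# was truncated in this copy of the answer.
attach(mtcars)
aggdata <- aggregate(mtcars, by = list(cyl, vs), FUN = mean, na.rm = TRUE)
print(aggdata)
detach(mtcars)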
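For comparison, a sketch of the formula interface to aggregate and of by, both on the built-in mtcars data and both avoiding attach. The choice of the mpg, hp and cyl columns is only for illustration:

```r
# Mean mpg for each number of cylinders, using the formula interface
agg <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
agg

# by() splits the data frame by the grouping variable and applies the
# function to each sub-data-frame; here it averages two columns per group
res <- by(mtcars[, c("mpg", "hp")], mtcars$cyl, colMeans)
res
```

aggregate returns an ordinary data frame, which is convenient for further processing, while by returns a list-like object with one entry per group.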
I think the best way to discover anything in R is to learn by experimentation, using embarrassingly trivial data and functions.
If you open your R console, type ??apply and scroll down to the functions in the base package, you'll see something like this:
1: base::apply    Apply Functions Over Array Margins
2: base::by       Apply a Function to a Data Frame Split by Factors
3: base::eapply   Apply a Function Over Values in an Environment
4: base::lapply   Apply a Function over a List or Vector
5: base::mapply   Apply a Function to Multiple List or Vector Arguments
6: base::rapply   Recursively Apply a Function to a List
7: base::tapply   Apply a Function Over a Ragged Array
Example using eapply:
# a new environment
e <- new.env()
# two environment variables, a and b
e$a <- 1:10
e$b <- 11:20
# mean of the variables
eapply(e, mean)
$b
[1] 15.5
$a
[1] 5.5
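In the same experimental spirit, replicate (also mentioned in the question) can be tried on trivial data. It is a convenience wrapper around sapply for evaluating the same expression several times, typically one that involves random numbers. A minimal sketch, where the seed and sample sizes are arbitrary choices:

```r
set.seed(42)  # arbitrary seed, only to make the run reproducible

# Evaluate the expression 5 times; each run returns one number,
# so the result is a numeric vector of length 5
means <- replicate(5, mean(rnorm(10)))
length(means)

# When each run returns 3 values, the results are simplified
# into a 3 x 5 matrix, as sapply would do
draws <- replicate(5, rnorm(3))
dim(draws)
```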