What is Width First and Depth First?

Question

What is Width First and Depth First?

When we are dealing with trees and graphs we find these terms. What do they mean and why are they important in the use of data structures of these types and algorithms that manipulate them? What do I gain or lose when one or the other occurs?

13

algoritmo terminologia estrutura-de-dados grafo árvore

Author: Maniero, 2019-01-07

Source

1 answers

score 8 · Accepted Answer

width first and depth first are two similar but distinct search strategies. Usually you find the full term: BFS or DFS, where the acronym S means search.

Its meanings are:

width first search : search for Width first
depth first search : search for depth first

Yes, there are other search alternatives. Djikstra's algorithm, for example, is a search by priority first. I will not go into sordid details about other types of Graph search, I will only be left with this quote

Before going into more detail, an extremely important point to choose which algorithm to use: in decision problems . The search in width is guaranteed to find a solution (if any) in a problem RE, in-depth search does not; in-depth search finds a problem much more quickly in class R (which is a subclass of RE).

Set definition of Class R:

The RE class consists of all the decision problems which exist a Turing machine can define, in finite time, if the answer is "yes"; this is the class of treatable problems (for answer "yes"). In it, we have the problem of stop and the problem of recognizing whether a word belongs to the language of an unrestricted grammar.

The co-RE class, however, is composed of the problems for which some Turing machine would answer "No" in a finite time; this is the class of treatable problems (parava answer "no"). An example of a co-RE problem is the Domino problem using Wang tiles periodically (more details).

The R class is composed of the problems that one always has the answer, whether it is "yes" or "no", in a finite time (so it is the intersection between RE and co-RE); this is the class of Decidable problems. One problem that is contained in R (and also contained in NP, a very small and restricted subset of R) is the Hamiltonian Path problem. Your decision version looks like this: given a graph G (with weighted edges) and a number k, I can go through all the vertices of G only once so that my path has Weight less than or equal to k?

What is a search in a graph?

When working with graphs (and consequently with trees), we need to find some condition. There are several types of conditions to be found, but they usually fall into the following:

find a vertex with a given property
find a path (set of edges) with some particular property
find a single edge with a particular property

Many algorithms start from a single point, while others start from several points ( Floyd-Warshall starts from all vertices at the same time \ _ (ツ)_/).

For those departing from some single origin, he needs to determine how to visit the neighboring vertices in order to then continue the search. Then you need to go to the neighbors of the neighbors and so on.

The search in width visits at a time, all vertices of a given depth (related ). Already in depth, No, goes to the end of a path to then try an alternative path.

One problem to visualize this is to find the output in a maze.

Given a map, where . means free space, # means impassable wall, s the exit and i its initial position, is it possible to get out of the maze?

Example:
#######
#...#.#
#.#...#
#.###.#
#.#s#.#
#i....#
#######

First: is this a graph problem? Yeah, he is. Except, here, we don't explicitly define the vertex set and the edge set. We have, here, 19 vertices (all tiles with: . or i or s) and 18 non-directed edges.

So we need to find a vertex with a special property, starting from a defined origin. Below, you check the execution of the algorithm in depth leading in consider the following:

while I build a path, I do not go back (we already visited will not be revisited in a single descent)
if I have more than one path to go, I go first to the left, then up, then right, and finally down
if I happen to hit a path that has already been traveled, then that means I need to do backtrack and I can abort that descent

I'll score points now visited that particular descent with a X. When it is necessary to do the backtracking , I will mark the points of the old path with B. When I reach the target, I will score with success with Y:

Now, let's see the problem being solved with a width search. Here, we will avoid making any kind of walk over the path already traveled (as we did in bisca in depth). I'll put it in the bottom corner left of the animation what depth level I'm expanding:

If we were to use another search, the search a* would answer and be even better than the search in width. But it is a heuristic , not a purely systematic search.

Another interesting problem about graphs is the problem of the zombie bridge .

The problem consists of the following statement:

A group of N people released a horde of zombies into the laboratory from the top of the mountains. To escape this Horde, they need to cross a bridge. However, there is a problem here: there is only one person with a flashlight and it is too dark. Then it is only possible to cross the bridge with one person holding the flashlight and another following it, just behind. And there is something else, each person has a different time to cross the bridge, and when 2 people crossing at the same time, they need to walk at the pace of the slowest person.

What is the shortest possible time to cross all the people on the bridge?

The input will be provided in two lines. In the first, we have a number N with the number of people. In the second line, you will have N numbers with the time (in minutes) that each person takes to cross the bridge.

Example:
4
1 2 5 10

First, this problem is a problem of Graphos? Well, oddly enough, it is rather a graph problem... but as in the other problem, the graph is implicit .

Let's take the Example Case: 4 people, Times 1, 2, 5 and 10. We can cross in 19 minutes if we go through the graph in one of these ways:

For this particular case, it has a heuristic that defines an upper bound:

the fastest person always keeps up with all the others people
if you still have someone missing to cross, the faster you return the bridge alone

This ensures that I guarantee that in at least soma todos - menor + (quantidade - 1)*menor minutes I can cross the bridge (in the example in question this value is 18 - 1 + (4-1)*1 = 17 + 3 = 20). With this, I can do an in-depth search, and if by chance my search has passed through a path that is more than 20 minutes, I can abort this descent. So in this case, a deep search goes well.

By the way, I found 17 minutes for everyone's crossing, and it seems to be the best case

Finally, I would like to mention the Nim game (more details). This game (in the chosen variation) consists of the following:

There are 2 players, Alice and Bob, who play alternately. There is a stack with n sticks, with each player having to remove 1, 2 or 3 sticks in their turn. The player who removes the last toothpick loses.

You must determine the winning strategy for Alice, with the problem input consisting of a positive integer n and which player will start playing.

This, again, is a graph. Each vertex of the graph consists of 2 information:

number of sticks in the stack
the turn of who should remove the sticks

This generates a bipartite acyclic directed graph whose size is limited only by n maximum; since nothing has been described about n other than being integer and positive, this graph has the potential to be infinite.

Thus, we have the following vertices that define victory:

1, Alice; since Alice needs to take out at least one toothpick, and only has 1 toothpick to be removed, then Bob is the winner
1, Bob; analogous to the previous one, with Alice being the winner

So how to find out the winning strategy? Making a set of searches by depth. If the vertex represents Alice's turn, then it is enough for one of the possible edges to return victory to Alice; if the turn is Bob's, all possible edges need to return victory to Alice for it to be a winning strategy. The advantage of using deep search is that we can use memoization : if I find out that k, J is guaranteed victory for Alice, then we have to, if by chance I find k, J again, I already know the response of this situation and then I don't have to navigate through it again.

We know that 1, Bob is guaranteed victory for Alice. What about the vertex 5, Bob? Well, from here we have 3 possibilities: 4, Alice, 3, Alice and 2, Alice. If, by chance, all these options result in guaranteed victory for Alice, then 5, Bob is victory for Alice.

Let's go to 2, Alice first. An alternative is to remove 2 sticks, resulting in Alice's defeat. However, there is another alternative: remove 1 toothpick. To pull out a toothpick, we go to the vertex 1, Bob, so Alice is the winner.

Now in 3, Alice. We have 3 alternatives: 1, Bob, 2, Bob and 3, Bob. 2, Bob and 3, Bob do not generate a guaranteed victory for Alice, but 1, Bob does. So 3, Alice is also victory.

Finally, we have 4, Alice. From here, the alternatives are: 3, Bob, 2, Bob and 1, Bob. We have already found in the previous navigation that 3, Bob and 2, Bob can generate victory for Bob, but 1, Bob generates victory for Alice.

Therefore, how 2, Alice, 3, Alice and 4, Alice guarantee victory for Alice, 5, Bob also guarantees. And 6, Alice? Well, from 6, Alice we can go to 5, Bob, which guarantees victory. And 6, Bob? From 6, Bob we fall into 5, Alice, 4, Alice (already proven winning Moose) and 3, Alice (already proven winning Alice). So from 5, Alice can we guarantee Alice's victory?

For this we must see 4, Bob, 3, Bob (proven that Bob can win) and 2, Bob (proven that Bob can win). From 4, Bob we can go to 1, Alice, which guarantees victory to Bob. Therefore, in 6, Bob Bob can win.

And 7, Alice? Well, from 7, Alice we can go to 5, Bob, which is guaranteed victory for Alice. And so we can move on to any time, using the in-depth search memoizing the results.

The search in width, however, would not favor in the case of memoization. It would add in the queue of vertices to be visited the same Vertex several times, increasing the maximum memory load. Not to mention that to catch all Bob's moves from a certain Vertex k,Bob, the search in width will not make it easy to get the answer after finishing the navigation.

For deep searching, a stack is said to be used; for wide searching, a queue is said to be used. They just don't talk about which objects are stacked or queued.

In the case of width search, the vertices themselves are queued. More or less like this:

def busca_largura(origem, estrutura_auxiliar):
  fila = Fila()
  fila.push(origem)

  while not fila.empty():
    sendo_visitado = fila.pop()
    if not estrutura_auxiliar.ja_visitou(sendo_visitado):
      estrutura_auxiliar.marcar_visitado(sendo_visitado)
      [fila.push(v) for v in sendo_visitado.vizinhos() if not estrutura_auxiliar.ja_visitou(v)]

Already the search in depth, although it is said that a stack is used, usually this stack is delegated to the recursive call of functions:

def busca_profundidade(sendo_visitado, estrutura_auxiliar):
  estrutura_auxiliar.marcar_visitado(sendo_visitado)
  [busca_profundidade(v) for v in sendo_visitado.vizinhos() if not estrutura_auxiliar.ja_visitou(v)]

If you want to implement this iteratively, well, it's possible. A naive implementation would be to stack the vertices. It is almost equal to the search in width, it only changes the data structure used:

def busca_profundidade_iterativa_naive(origem, estrutura_auxiliar):
  pilha = Pilha()
  pilha.push(origem)

  while not pilha.empty():
    sendo_visitado = pilha.pop()
    if not estrutura_auxiliar.ja_visitou(sendo_visitado):
      estrutura_auxiliar.marcar_visitado(sendo_visitado)
      [pilha.push(v) for v in sendo_visitado.vizinhos() if not estrutura_auxiliar.ja_visitou(v)]

However, to be honest, in this case you are using more memory than the recursive function. That's why I called this solution naive. To recursive solution keeps in the stack the node being visited and also an indicator of which child being visited. So here I would stack a couple of data: the vertex and indica of the next child to be visited. Starting from 0, always. It looks something like this:

def busca_profundidade_iterativa(origem, estrutura_auxiliar):
  pilha = Pilha()
  pilha.push((origem, 0))

  while not pilha.empty():
    sendo_visitado, index_filho = pilha.pop()
    # index_filho == 0 indica a primeira vez que se vem nesse vértice

    pop_valido = False
    if index_filho == 0 and not estrutura_auxiliar.ja_visitou(sendo_visitado):
      estrutura_auxiliar.marcar_visitado(sendo_visitado)
      pop_valido = True
    if index_filho > 0:
      pop_valido = True

    if pop_valido:
      vizinhos = sendo_visitado.vizinhos()
      while index_filho < len(vizinhos):
        proximo_vizinho = vizinhos[index_filho]
        if not estrutura_auxiliar.ja_visitou(proximo_vizinho):
          pilha.push((sendo_visitado, index_filho + 1))
          pilha.push((proximo_vizinho, 0))
          break
        index_filho  += 1

About resource usage

Each problem has a specific search that fits best. However, it is worth highlighting some points here:

deep search allows you to do memoization in some problems (see Nim/zombie Bridge problem)
the maximum DFS stack size is the depth of the graph
the maximum queue size of the BFS is O(|av|*n), where |av| is the average of edges per Vertex and n is the depth level being investigated; therefore, in cases of full (or simply not sparse) graphs, BFS occupies more memory
In DAGs, the same Vertex can belong to several depths of the graph; for example, in the maze, the output has 2 depths: 4 and 16; this is one of the factors that blow up the queue size in a BFS
in infinite graphs (even DAGs), an in-depth search may not return the answer, as in RE problems
if the answer is on a finite depth Vertex, even if the graph is infinite, the search in width will always find the answer when exploring the smallest of its depths

Normally, the choice by search in width or depth depends on the domain of the problem, including some cases in which the choice of a search can generate a better use of memory or allow optimizations (such as memoization or the existence of some cut heuristic).