Table normalization to 2nd Normal Form

A relation is in 2FN if and only if it is in 1FN and not contains partial dependencies.

From this definition I am normalizing a BD to 2fn. This database has 2 tables, which illustrates the figure with their respective attributes:

Current status of tables

To candidate key cpf of the Table cliente, it's causing partial dependence on this format, right? For only id already the reference.

Rearranging the table cliente according to 2FN I ended up defining it like this, is the central idea of my normalization thinking correct?

Is there any other better way for me to normalize it?

New model

Author: Maniero, 2017-10-28

3 answers

Your tables Cliente and Agência are in the first normal form because no field is multivalued, so let's focus on the second normal form.

There are two candidate keys in the table Cliente: id and cpf. Clearly from the id we can get any other field and from the cpf as well. Nor can we use only a part of id or cpf for this purpose.

Already in the table Agencia, the only candidate key is id and the other fields are defined based on id and not just part of id.

However, the fields estado are determined by the fields cidade which in turn are determined by the fields rua in Table Cliente and cep in Table Agencia. That is, the fields cidade and estado are violations of the second normal form because they do not depend on the primary key, but on some other field. Your attempt to normalize does not correct this problem completely, but it is already a step in some direction.

The solution Series:

  • Create a table Estado with the fields codigo and nome.

    The codigo is the primary key, while the nome is another candidate key because we cannot have two states with the same name.

  • Create a table Cidade with the fields codigo_estado, codigo_cidade and nome.

    The codigo_cidade is the primary key, the codigo_estado is the foreign key for the table Estado.

    The fields codigo_estado and nome are also a candidate key because they do not there may be two cities with the same name in the same state.

    The field nome alone is not a candidate key because there may be cities with the same name in different states (such as Rattlesnake, name of a city in Paraná and one in Ceará).

  • Create a table Logradouro with the fields cep, codigo_cidade, bairro and nome.

    cep it's the primary key.

    codigo_cidade it is the foreign key for the table Cidade.

    One would imagine that codigo_cidade and nome together were a key candidate, and in almost every city that would be true, but it is not always. In São Paulo, for example, there are three different streets called "Rua Piracicaba". Therefore, the candidate key is codigo_cidade, nome and bairro.

  • Create a table Endereco with the fields id, cep, numero_logradouro and complemento.

    id it's the primary key.

    cep it is the foreign key for the table Logradouro.

    Interestingly, the number of the plot is not always numerical. For example: "Rua João da Silva, 148-B , casa dos fundos ".

  • In the tables Agencia and Cliente, put a foreign key for Endereco and remove the other fields for addresses.

This model is not yet perfect, because the same street can have different ZIP codes and maybe it was the case to have a table of neighborhoods and also a table of types of logradouros (Avenue, Street, Square, alameda, transom, Highway, court, etc.). However, for its purpose, this should be enough.

 3
Author: Victor Stafusa, 2017-10-28 12:20:30

Normalization essentially exists to resolve redundancies. See any redundancy there?

Address

None shown. Is it possible for the customer to have more than one address? Is it possible for more than one customer to have the same address? If you can, maybe makes sense to apply normalization in this case. With only one address it makes no sense to make this separation. Neither the second, nor any normal form applies.

Note that in the second example you created a table called Cliente Endereço. Why are you putting an agency's address on it? It doesn't make sense.

Let's assume that it is actually a general address table. You could even put the address there. But for what?

Nothing prevents you from having a table only for addresses, but if there is no repetition of the data you are not doing this by normalization. This makes sense if entities can have more than one address or more than one of them has the same address.

I did not enter the table normalization address because it does not seem to be the focus of the question and in the form presented may even be normalized, it has no data indicating what are the columns, not even the types, it could very well be ids of the normalized data. It may also be that you do not want to normalize this.

Agency

A table has been created to separate the agency it belongs to. Again, do you have more than one agency that the client can have? It's the same question of the address. What gain do I think you got doing this? That problem you think you solved? Normalization needs to solve problems, not cause new ones. I saw no advantage.

Best Model

Even these cases can be questioned in modern databases. It doesn't cost so much to keep space for more than one address in the entity itself. It is not always a problem to have a very possible repetition of registration of the address when two entities are in the same address. It is a matter of pragmatism.

If you can have these cases, strictly speaking, it should normalize, but an experienced developer will analyze how much the effort is worth because it complicates the model hindering all the code and performance. So you have to think if it compensates, if it is so necessary.

On the other hand the whole model may be wrong. This idea of separating entities into customer, supplier, Bank, etc. it's wrong by nature. At least in the presented form.

Entities are natural and legal persons (separate). Customer, supplier, bank, carrier, seller, employee, etc. these are roles that these people play in this organization. In these tables should only have data on the papers. The data of the person himself should be in the tables of natural and legal persons. This is the most correct, but again it is possible pragmatically to do otherwise if it makes sense. But you need to choose another shape because it's the best and not because it's the only shape you know. Not all cases make sense to separate so, but in general is the most correct.

I could point out other possible errors in this model. But it would be speculation because it is only right or wrong knowing all the requirements. The errors I see would be according to my experience, not with the actual case, are could be right or not.

Normalization

To know how to normalize you need to know the goals. Be the most formally correct, be the fastest to develop, be the easiest to maintain, be the most performatic, to be what the teacher taught or the boss ordered even if it is not the best, or whatever. You can't blindly normalize. How far to normalize and where to stop is something you will learn over time.

 4
Author: Maniero, 2020-06-24 16:54:18

Based on the conversion examples in the book, below is an example of 1FN to 2fn and 2FN to 3fn.

insert the description of the image here

Based on the drawing of the book, I made a drawing to try to help with your question.

insert the description of the image here

Since your question limited the conversion only up to 2FN , I did not convert from 2fn to 3FN as the example in the book shows.

Consider the primary keys the words with full underscore, e.g. (CPF) and the foreign keys like (_CEP) with the underline before the word.

I added some non-key attributes to the CLIENT_AGENCIA relationship table just to make the illustration more coherent, since you did not cite any relevant attributes to compose that table.

Source: database systems 6th edition, authors: Elmasri, Ramez Navathe, Shamkant B. year: 2011

 2
Author: Jonathas B. C., 2020-06-11 14:45:34