A regular expression to detect Brazilian Highway acronyms

I am trying to detect if a given address corresponds to a Brazilian Highway.

For example, br-101 matches.

My initial plan was to list the state acronyms (mg, sp, rn ...) plus the acronym br, and write something like /acronym1-[0-9]{3}|acronym2-[0-9]{3}.../.

But a search on Wikipedia surprised me: there are other prefixes besides the state ones (for example, prc, in https://pt.wikipedia.org/wiki/Rodovias_do_Paran%C3%A1 ).

So I ask: what is the most reliable way to detect highways?

We could take (two_or_three_letters)-(three_digits), for example. Does the part before the hyphen necessarily have two or three letters? Can the part after the hyphen have fewer than three digits?

Would anyone happen to have a list of the possible acronyms that may come before the hyphen?

Author: josinalvo, 2017-08-07

2 answers

I found the question interesting and looked into how the naming of Brazilian highways works.

According to the federal government's website on federal highways, there is a standard for naming federal highways. From what I researched, this standard is also adopted for state highways, but there are exceptions.

The first digit of the highway number, for example the 3 in BR-307, has a meaning and ranges from 0 to 6. This also applies to state highways.

  • Radial highways: BR-0XX, highways that depart from the federal capital towards the edges of the country
  • Longitudinal highways: BR-1XX, highways that cross the country in the north-south direction
  • Transversal highways: BR-2XX, highways that cross the country in the east-west direction
  • Diagonal highways: BR-3XX, highways that run in one of two orientations: northwest-southeast or northeast-southwest
  • Connecting highways: BR-4XX, highways that run in any direction. There are also highways starting with BR-6XX, but they are few and short.

It would be worth confirming this information so that the regex is more accurate. For example:

  • we know that the first part consists of 2 to 3 uppercase letters: [A-Z]{2,3}
  • there is a hyphen between the letters and the digits: -
  • the first digit ranges from 0 to 6: [0-6]
  • and it ends with two more digits: [0-9]{2}

In the end, your regex would look like this: [A-Z]{2,3}-[0-6][0-9]{2}
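As a minimal sketch of how this pattern could be applied to an address, the Python snippet below searches a free-text string for a highway code. The function name `find_highway`, the `re.IGNORECASE` flag (the question's example is the lowercase br-101) and the `\b` word boundaries are assumptions added for illustration; they are not part of the regex above.

```python
import re

# Pattern from the answer, wrapped in word boundaries so it does not match
# inside longer tokens. Case-insensitive because addresses are often typed
# in lowercase (an assumption, not part of the original regex).
HIGHWAY_RE = re.compile(r"\b[A-Z]{2,3}-[0-6][0-9]{2}\b", re.IGNORECASE)


def find_highway(address):
    """Return the first highway code found in the address (uppercased), or None."""
    match = HIGHWAY_RE.search(address)
    return match.group(0).upper() if match else None


print(find_highway("Rodovia br-101, km 12"))  # BR-101
print(find_highway("Rua das Flores, 123"))    # None
```

Using `search` rather than `fullmatch` lets the code appear anywhere in the address; if the input is already an isolated code, `fullmatch` would be the stricter choice.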

Author: DNick, 2017-08-08 12:02:21

You can build two regexes: a more generic one that validates only the format of the highway code, and a more specialized one that makes it more likely the highway actually exists.

From the research I did, some highways receive a C after the state acronym because they are coincident, that is, a stretch of a federal highway overlaps a stretch of a state highway and the state is responsible for its maintenance. I did not find any centralized list, though; each state maintains its own.

Not all states have coincident highways, so the second regex matches invalid values like BRC-000 or ACC-999. Further treatment is therefore needed in the application, such as a list of exceptions, or you can find out which states have these highways and refine the regex further.

A generic one would be:

[A-Z]{2,3}-[0-9]{3}

Entries:

BR-101 //OK
ABC-100 //OK
ZZ-000 //OK
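To reproduce the entries above, a short Python sketch can apply the generic pattern with `re.fullmatch`, which requires the whole string to match. That anchoring is an assumption about the intended use (the pattern itself is unanchored), and the helper name `has_highway_format` is hypothetical.

```python
import re

# Generic pattern from the answer: 2 or 3 uppercase letters, a hyphen, 3 digits.
GENERIC_RE = re.compile(r"[A-Z]{2,3}-[0-9]{3}")


def has_highway_format(code):
    """Return True if the whole string looks like LETTERS-DIGITS, e.g. BR-101."""
    return GENERIC_RE.fullmatch(code) is not None


# The first three entries come from the answer; the last two are extra
# negative cases added for illustration.
for code in ["BR-101", "ABC-100", "ZZ-000", "B-101", "BR-10"]:
    print(code, has_highway_format(code))
```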


The other would be the list of state acronyms, followed by an optional C for coincident highways, followed by a hyphen and three digits.

(AC|AL|AP|AM|BA|CE|DF|ES|GO|MA|MT|MS|MG|PA|PB|PR|PE|PI|RJ|RN|RS|RO|RR|SC|SP|SE|TO|BR)C?-[0-9]{3}

Entries:

BR-101 //OK
ABC-000 //does not match the pattern
ZZZ-999 //does not match the pattern
PRC-280 //OK
RSC-453 //OK
BRC-000 //matches, but is invalid
ACC-999 //matches, but is invalid
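The further treatment mentioned earlier could be sketched in Python roughly as follows. The regex is the one from this answer, only split into named groups. The set `COINCIDENT_PREFIXES` is a hypothetical allowlist holding just the two prefixes that appear in the examples (PRC and RSC); a real application would have to fill it from each state's own list. The function name `is_known_highway` is also an assumption.

```python
import re

# State acronyms from the regex above.
STATES = ("AC AL AP AM BA CE DF ES GO MA MT MS MG PA PB PR PE PI "
          "RJ RN RS RO RR SC SP SE TO").split()

# Same pattern as above, with named groups for each part.
SPECIFIC_RE = re.compile(
    r"(?P<prefix>" + "|".join(STATES + ["BR"]) + r")"
    r"(?P<coincident>C?)-(?P<number>[0-9]{3})"
)

# Hypothetical allowlist of prefixes known to have coincident highways.
# Only the two prefixes seen in the examples are listed here.
COINCIDENT_PREFIXES = {"PRC", "RSC"}


def is_known_highway(code):
    """Check the format first, then reject C-suffixed prefixes not in the allowlist."""
    match = SPECIFIC_RE.fullmatch(code)
    if match is None:
        return False
    if match.group("coincident"):
        return match.group("prefix") + "C" in COINCIDENT_PREFIXES
    return True


for code in ["BR-101", "PRC-280", "RSC-453", "BRC-000", "ACC-999", "ABC-000"]:
    print(code, is_known_highway(code))
```

Keeping the exceptions in a set rather than in the regex means the list can grow without touching the pattern.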


Author: rray, 2017-08-08 13:05:09