Difference between 'regular language 'to'irregular language'

Reading a question about basic programming concepts, I come across the following statement:

Let A and B be two languages about the binary Alphabet, that is, about the Alphabet composed only of 0's and 1's. let A be the language in which a quantity of 0's and 1's is equal. Let B be the language where no 0 occurs after a character 1, only B is a regular language .

Searching here on the site I find the answer: What is a context-free language? however, even then, it was not clear.

My questions:

  1. What are the main differences from a regular language to a irregular language? In the above statement Why is only language B regular?
  2. What are the main examples of the two languages?
Author: Victor Stafusa, 2019-01-21

1 answers

Well, I think the answer here may not be very pleasing, but it's the correct one: a regular language is one that can be generated by a regular grammar.

And so what would a regular grammar be? Well, it would be a grammar that has only regular productions. Well, that I have already explained in this answer .

Reiterating what was spoken in the answer above: a regular language is closely related to a Markov chain, where for each language regular, there is a Markov chain that represents it in form and content.


About irregular grammar... well, we are dealing here with a semantic problem. I'm taking this prefix i- as the negation of the following radical, in which case it would be regular. In this sense, you would have the "non-regular"languages. Which means any language that cannot be described through a regular language.

However, let's go back one step. What is the definition of number irrational ? Would the numbers complement the rational ones (i.e. the not-rational ones)? Well, the square root of 2 and even pi are not rational numbers, so I can call them irrational, check? Well, so far quiet, but what about the fourth root of -1? Would this number be irrational ? We know that no, this number is a complex number, and the complex number is neither rational nor irrational. So, the definition of number irrational cannot be simply "that number that is not rational," even more so by defining the real numbers as being the Union of the rational with the irrational.


Does non-regular language mean that has no specific form ? Well, actually, no. It can have an extremely specific shape .

For example, the language D formed by a word of a binary Alphabet followed by itself? This language is defined more or minus like this: D = { WW, para todo W em {a,b}* }. This language has a very specific form, but it is not regular (I will prove this here after proving the languages A and B).


About regular expressions and regular languages. In the beginning, regex represented yes regular languages. But they began to add things that are not regular. For example, they put the possibility of having a rearview mirror in the regex query. In some search engines, I can recognize exactly D:

([ab]*)\1

Also has the options of look-ahead and negative look-ahead to make a better path on the state machine and recognize/deny a word. However, this "look ahead" means that we are no longer dealing with a Markov chain. That is, not regular. The look-behind means the existence of memory, which means, again, not being a Markov chain.

You can see more about regex vs regular languages in this question , in your comments and in the accepted answer.


@anonimo commented on Chomsky's hierarchy. However, he put both recursive and recursively enumerable languages at Level 0. But that has trouble.

A decision problem, in computer science, is equivalent to recognizing whether the input belongs to a given language, and who defines which valid language is the problem. By for example, the stop problem is so equivalent to checking if, by chance, a word belongs to an unrestricted grammar; This is a problem of the RE Class.

However, I can define, on top of any language, its complement language. For example, the lock problem (which is the negation of the Stop problem, which should answer yes if the program crashes). He is the complement of the Stop problem. It is considered a problem of the co-re Class. Or that is, the latch problem is equivalent to detecting whether a word belongs to a co-re Language.

I go into more detail of these complexity class in the answer about search in width vs search in depth .


I have already answered What is the difference between one regular language and the others. Everything else is a consequence of this.

One of these consequences is that regular languages obey the pumping motto. There may be languages not regulars who also obey this motto, but non-obedience necessarily implies being non-regular.

Read more in this answer .

Read more

So how to use to prove that A is not regular?

Consider the operator . as the word concatenation operator (receives as operating two words, returns their concatenation; "abc" . "def" == "abcdef"), and the operator ^ as the repeat of words (receives a word and a natural number and returns that word repeated the number Times; ""ba"."na"^3 == "ba"."nanana" == "bananana"; "ba"."na"^0 = "ba").

We need to find a word in the format x.y.z belonging to A and a number p such that:

  • |y| >= 1
  • |x.y| <= p

So that when pumping y a few times, the new word x.y^n.z no longer belongs to A.

So let's take the word formed by all 0s to the left of all 1 s, and both with the same amount. Since p is arbitrary, then we can take the word "0"^p . "1"^p. What does this imply? Like |x.y| <= p, it is not possible for y to have the character 1. That is, y = "0"^m, 1 <= m <= p.

Let's then pump y an additional time. That is, from x.y.z we will get x.y^2.z, which is identical to x.y.y.z. So what about the amount of 0 s in that string? Well, how x.y has exactly p 0s, this means that x.y.y.z will have p + m 0s, but the amount of 1 s it is not changed. That is, if x.y.y.z belongs to A, it necessarily implies p + m = p; however, we know that 0 < 1 <= m <= p, which means that p < 1 + p <= m + p <= 2*p. Therefore, p < p + m. Therefore, x.y.y.z does not belong to A. Therefore, A is not regular because it does not meet the pumping motto.


To prove that B is regular, just write a regex (pure regex, without look-arounds , mirrors or anything other than regular). Basically, we are left with the lists, denied lists, groupings, choice (usually indicated by the bar-vertical |), quantifiers (optional, Kleene star, Kleene cross).

In B, No 0 can be preceded by 1. That is, if it has the subword 10, it no longer belongs to B. This means that words with only 0 s (words z) or with 1s (words u) are valid. Also, I can make a new word with z.u without any problem.

Who is the set of words z? It's 0*. What about the set of words u? It's 1*. Therefore, z.u is the regular expression 0*1*.

That is, B can be written as 0*|1*|0*1*. However this is trivially simplified to 0*1*. Since we have a regular expression, B is a regular language.

Your grammar:

S -> '0'.S
S -> '0'
S ->
S -> '1'.U
U -> '1'.U
U -> '1'

Your finite state automaton (with lambda transaction, whereas all states are acceptance states):

 +---+       +---+
>| q0|>--+-->| q1|>--+
 +---+   |   +---+   |
   ^     |     ^     |
   |     |     |     |
   +- 0 -+     +- 1 -+

Let's prove that D Not regular? She is the language of the duplicate binary word.

We choose the word "a"^p . "b"^p . "a"^p . "b"^p. It is exactly the word "a"^p . "b"^p twice, so it belongs to D. This means that necessarily y will be composed purely of aS. So what? And hence, pumping y means getting the following word:

"a"^n . ("a"^p . "b"^p)^2

Therefore, this new word is not a word duplication. Therefore, it does not belong to D. Therefore, D is not regular.

 4
Author: Jefferson Quesado, 2020-06-11 14:45:34