Why does TLS encryption use prime numbers?

Question

Why does TLS encryption use prime numbers?

Why are prime numbers used in TLS encryption, and in general in cryptography? Why not use any of them?

8

алгоритм ssl шифрование криптография безопасность

Author: Павел Игорев, 2017-11-03

Source

5 answers

TLS allows the use of various encryption and key exchange protocols. Different protocols use prime numbers for completely different reasons:

RSA

One of the most popular encryption protocols, RSA, is based on the fact that the problem of factorizing a number - decomposing a composite number into prime factors (most likely) is not solvable with polynomial complexity on ordinary computers (but is solvable in polynomial time on quantum computers)., crypto-apocalypse is coming).

I.e. if you have two huge primes p and q, then the one who only knows n = p * q, will spend quite a lot of time trying to decompose n back into p and q. Naturally, there are reservations that allow us to cut off the known subexponential algorithms, for example, p and q must differ in order by at least a couple of digits, but in general, we can assume that large p and q will make solving the factorization problem n wildly long.

At the same time, it is quite easy to find two large prime numbers.

So using large primes is a way to get a huge and hard-to-factorize composite number.

RSA describes the selection/generation of numbers e, d, n, such that for any value m will be valid:

The generation method assumes the choice of n = p * q, e = 65,537 (or any other small number). I.e. they are found quickly. The pair (n, e) is a public key.

It is proved that if we take d as the result of solving the equation ed = 1 (mod ϕ(n)), then the triple (e, d, n) will meet the requirement above.

ϕ(n) - this is the Euler function - equal to the number of natural numbers less than n and mutually prime with it. To find it you need to factorize n. That is, if you only know (n, e) , then finding d to solve this equation will take forever.

But φ(p) = p - 1, φ(q) = q-1. And for simple p and q:

φ(n) = φ(pq) = (p - 1) * (q - 1).

So you take two large primes, quickly compute φ(n) and n, and get d from the equation above. Prime numbers and φ(n) throw it away and don't show it to anyone. (n, d) is your private key, and no one will be able to find it out in a reasonable amount of time if they only know (n, e) .

After that, anyone who has your public key can encrypt the message for you:

But only you can decipher it:

From the interchangeability of e and d and the sameness for the steps for encryption/decryption, the second follows how to use RSA - only you can encrypt a known text so that it can only be decrypted using a public key.

If you add to the end of the message its checksum, encrypted with the private key (while receiving an "electronic signature"), then anyone can count the same checksum, "decrypt" the signature, and compare the two values. If they match , it means that the message arrived without changes, and it was sent exactly you.

Diffie-Hellman

The DH key exchange protocol relies not on the complexity of factorization, but on the complexity of solving the discrete logarithm problem. It is based on the fact that you cannot quickly calculate the key

Knowing only g^a, g^b, g and p.

But it also allows the parties to easily calculate this shared key by sending it over an open channel. g^a, g^b, if one side comes up with its secret a, and the second-its secret b.

Elliptic Curve Diffie-Hellman

Even the basics of elliptic cryptography will not fit in one SO post, so it is only worth mentioning that ECDHE also relies not on the complexity of factorization, but on the complexity of solving the discrete logarithm problem.

For the discrete time problem logarithm is a subexponential Polyg-Hellman algorithm that reduces the solution of the problem for n points to subproblems for multipliers n-p₁, p₂, p₃, p₄. Accordingly, the number of points of the curve is chosen so that it is divisible by some large prime number p, comparable in length to n. That is, prime numbers in ECDHE are used only indirectly, as a "limiter of simplicity". For the same reasons values of p, g in DH , the corresponding constraints are imposed.

9

Author: PashaPash, 2020-06-12 12:52:24

In fact, there is nothing super complicated here.

1) TLS is based on asymmetric cryptography, in which there is a concept of private and public keys

2) If you do not go into the mathematical subtleties of asymmetric cryptography, like the theorems of Euler, Fermat, Galois, and so on. mastodons then these keys (for example, in the RSA algorithm) are calculated based on the integer p, q (optional by the way and prime):

PublicKey={e(p,q), p*q} //e(p,q) - некая целочисленная функция (публичная экспонента)
PrivateKey={d(p,q), p*q) //d(p,q) - также некая целочисленная функция (приватная экспонента)

Roughly speaking, if you know these 2 numbers p, q then you can calculate the keys. Now attention, the attacker always knows their product p*q - from the public key, which, as its name implies, is known to everyone.

3) That is, to decrypt it, you just need to decompose a certain number n into 2 multipliers p, q. The problem is called factorization - there are a lot of algorithms, but in general everything is very sad (in terms of speed).

4) We come to the most important thing - for what the whole circus with horses was started: why is it required that p, q were simple? There are 2 answers here:

Functions e(p, q) and d(p, q) - do not work with all numbers, that is, if you give them a non-simple number as input, the functions may not give anything. But if p, q is simple, then the functions work flawlessly.
To complicate factorization n: decomposing a large number into prime factors becomes more complicated as the length of the number grows as n*log(n), that is, you decompose a 1024-bit number, then the complexity roughly speaking will be 1024*2^1024, but if the number suddenly if it turns out to be even, then its complexity will already be 512*2^1023 - that is, the computational complexity will fall by 4 times-accordingly, simplicity guarantees an increase in computational complexity.

5

Author: Barmaley, 2017-11-03 22:07:32

I think I figured out this topic. Using the example of Diffie-Hellman. To find the key K, you need to solve the equation K=g^a*b mod p, where "g", "p" is known. The product of primes (very large) is really hard to pick up. And it is necessary to use prime numbers because there is no database among the large primes (and there is no database because it is unrealistic to iterate over such large numbers), otherwise it would greatly speed up the solution of the problem, as with all numbers. But this is in theory, and in practice serious hackers are unlikely to iterate over the product of 2 primes. Because to find the secret key K, you can solve another equation, which is also equal to the secret key. K=B^a mod p, where "p" is known and "B" can be intercepted (since this information is transmitted over the network in an unsecured video). And then it will be necessary to pick up only "a". And if we take into account that "a" is at least odd and huge (on the order of 10 ^ 100) - discard even numbers and a range of too small numbers at the end. Although even with all at the same time, you will be tormented by picking up "a" )

1

Author: Павел Игорев, 2017-11-07 07:34:17

TLS uses the RSA cryptographic algorithm to generate keys most often. It is based on the assumption that the factorization of the product of prime numbers is a computationally complex operation.

It is easy to multiply two prime numbers, but it is very difficult to get the original numbers by having only the product. To date, there is no effective algorithm that allows you to do this quickly.

0

Author: Talleyran, 2017-11-04 10:00:58

score 0 · Accepted Answer

One reason is that is easy to multiply 2 large primes, such as

393050634124102232869567034555427371542904833 * 170141183460469231731687303715884105727

Gives immediately

66874100049762646240147492397977579549553083399872470020691839934725993584071278591

(Python, is practically impossible - there is no general fast enough algorithm (see Factorization algorithms.)

In general, this is a job for several years even for super computers.