How do I make a sentence from the letters entered by the user?

At the entrance to the program, a set of English letters is provided. There is a dictionary of words.

How to generate such sentences from words, so that each of the sentences consists only of those letters that were submitted to the input, taking into account the number of repetitions.

Example

Incoming character set: hellomyfriend

Program output:

Feed Hilly Morn
Feed Hilly Norm
Feed Horny Mill
Freehold Nil My
Refilled Hon My
Defiler Hymn Lo
и т.д.

I managed to select words that match the input set of characters. But quickly you can't make sentences out of them.

I did it by going through all the words. For the phrase myfavoritegame, the script worked for about 5 minutes. On the site, offers are given instantly.

Can you give me any advice?

Author: LEQADA, 2015-08-28

2 answers

Suppose you just need to find all possible combinations of words from a given dictionary that satisfy the condition. Without being distracted by the specifics of the language.

Important for search:

  1. the presence of letters in the word. We exclude letters that contain letters outside of the specified ones. We exclude, as we compose phrases that have already been used.
  2. counting the letters. At any time of composing a phrase, it is known how long the words are suitable, or exactly do not fit.
  3. repeats letters'.

Not important:

  1. the order of the letters in the dictionary word.
  2. the meaning of the word.

For speed, you need to make the search for important features as fast as possible, if necessary, removing unimportant aspects.

To quickly find words with a suitable set of letters, but without taking into account repetitions, you can make a bit index, as suggested by @Mirdin: 26 letters of the English alphabet = 26 bits. Number the words in the dictionary (or just count the number as an index lines) and make a separate index in two columns: the word id is the bit mask of the available letters. For a dictionary of less than 65k words, such an index will "weigh" 6 bytes per word, less than 400k. Can be kept in RAM for almost-instant search. So you can quickly find, for example, the first word of a phrase-just to make sure that the bits of "extra" letters are turned off.

It is worth making a copy of the dictionary, where the letters of the word are sorted alphabetically, and the words are sorted alphabetically. separate index: сортированные_буквы - id_слова. This index will be heavier than the dictionary itself by (number of words * 2 or 3 bytes). In this index, you can quickly find the right words and discard exactly the wrong ones.

The algorithm is something like this. Looking for the first word. I want to find the first word of the largest possible length. Iterate over the length, from larger to smaller. Eats a valid set of letters and length. We found the first word, updated the allowed set of letters and word lengths – we are looking for the next word.

 3
Author: Sergiks, 2015-08-28 21:17:37
  1. You make a structure (a table in the database, a hash table, a dictionary, and so on that there is in the PCP) from two fields per record: hash and the actual word. We fill this structure with words.
  2. The hash is roughly done as follows: the index of the letter in the alphabet is a power of two, all the letters of the word are added (32 bits should be enough for the Russian language). This is the simplest option, you may need to come up with a more complex function.
  3. Having received the word that needs to be instagrammed, we calculate its hash and search for it in our structure

UPD: This is for a single word, with phrases it will be more difficult, but the principle is the same

 2
Author: Mirdin, 2015-08-28 08:05:38