Reverse engineering

Hello!

I have a binary file in an unknown format. I also have the data that this file contains, but in text form. The task is to work out how the data is stored in the binary file. I need a guide, or at least a set of tips, on how to approach such a problem and what to check first.

The text data can be extracted from this binary file by a program written in Qt. Maybe it makes sense to try decompiling that program, but then how do I find the place in the code where the file is read?

Author: Asem, 2019-09-06

2 answers

As requested, a somewhat unordered set of tips:

  1. It is worth opening the file in any hex editor first. The first thing I do is open the file in Total Commander's built-in viewer (hotkey F3). This immediately shows whether the file contains text data, for example. As already suggested in the comments, you can also try opening the file with an archiver.
  2. You can load the binary file into IDA Pro (even the free version is enough) and try to mark up the data manually. Initially, everything is shown as an array of bytes, with comments where a byte falls into the printable character range.
    • On what looks like text, press a if it is ASCII rather than Unicode. If it is not ASCII but clearly text, press Alt+A and select the desired encoding.
    • On what looks like 4-byte integers (for example), press d three times (once for single-byte values, twice for two-byte values). In general, basic IDA skills will come in handy here.
  3. Try loading the converter program into IDA Pro, open the import table, look for functions related to opening files, and follow the cross-references (xrefs) to them. One of these references (there may be hundreds) will be the code that opens the file you are interested in, and sometimes going through hundreds of references does bring the desired result. You can also search for the function that shows the open-file dialog.
    • In general, if you have worked with Qt, the names of the functions (methods) will tell you something (unless, of course, the Qt library is linked statically).
    • If some text is displayed on screen immediately before or after the file is loaded, you can search for it in the executable, then again follow the references to that text to reach the code that uses it. Be prepared for the decompiler to be of little help, so you need to know assembly language at least at a basic level.
    • This is the general principle: look for something you can latch onto (a known function or text), and work back from it to the code.
    • All of this assumes that the executable is not packed or encrypted; for a beginner, that would be an almost insurmountable obstacle.
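
The first tip (check for readable text, try an archiver) can also be scripted. Below is a minimal Python sketch, not part of the original answer: the helper names `guess_container` and `ascii_strings` are made up for illustration, and the signature table only covers a handful of well-known formats.

```python
import re
from typing import Iterator, Optional, Tuple

# A few well-known signatures (magic numbers) found at offset 0.
# This table is illustrative and far from complete.
MAGIC_SIGNATURES = {
    b"PK\x03\x04": "ZIP archive",
    b"\x1f\x8b": "gzip stream",
    b"7z\xbc\xaf\x27\x1c": "7-Zip archive",
    b"Rar!\x1a\x07": "RAR archive",
    b"SQLite format 3\x00": "SQLite database",
}


def guess_container(data: bytes) -> Optional[str]:
    """Return a label if the data starts with a known signature, else None."""
    for magic, label in MAGIC_SIGNATURES.items():
        if data.startswith(magic):
            return label
    return None


def ascii_strings(data: bytes, min_len: int = 4) -> Iterator[Tuple[int, str]]:
    """Yield (offset, text) for every run of printable ASCII bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("ascii")
```

To use it, read the whole file with `open(path, "rb").read()` and pass the bytes to both functions. The offsets of the strings found give a first idea of where the text fields sit relative to the binary data around them.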
 2
Author: insolor, 2019-09-06 19:55:19

Download any binary file editor, for example https://www.hhdsoftware.com/free-hex-editor. Such editors let you search for standard numeric types (int, short, float, double, etc.) and for text strings. Look for numbers or strings that you know must be there (you know them from the text representation you have). Then sketch the storage structure of the file on a piece of paper and think about how to proceed from there. But all this will only work if the file is not compressed, not encrypted, and not otherwise transformed.
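
The same search for known values can be sketched in Python. This is only an illustration under assumptions: `candidate_encodings` and `find_value` are hypothetical helpers, and the list of encodings tried is a guess at what is common. Both byte orders and UTF-16 are included because a Qt program may well write its data through `QDataStream`, which is big-endian by default and stores `QString` as UTF-16, but nothing in the question confirms that this is how the file was produced.

```python
import struct


def candidate_encodings(value):
    """Map a known value to byte patterns it might be stored as."""
    patterns = {}
    if isinstance(value, str):
        for codec in ("utf-8", "utf-16-le", "utf-16-be"):
            patterns[codec] = value.encode(codec)
    elif isinstance(value, int):
        for name, fmt in (("int16", "h"), ("int32", "i"), ("int64", "q")):
            for order, prefix in (("le", "<"), ("be", ">")):
                try:
                    patterns[f"{name}-{order}"] = struct.pack(prefix + fmt, value)
                except struct.error:
                    pass  # value does not fit into this width
    elif isinstance(value, float):
        for name, fmt in (("float32", "f"), ("float64", "d")):
            for order, prefix in (("le", "<"), ("be", ">")):
                try:
                    patterns[f"{name}-{order}"] = struct.pack(prefix + fmt, value)
                except (struct.error, OverflowError):
                    pass  # value is out of range for this width
    return patterns


def find_value(data: bytes, value):
    """Return a sorted list of (offset, encoding) hits for value in data."""
    hits = []
    for name, pattern in candidate_encodings(value).items():
        start = data.find(pattern)
        while start != -1:
            hits.append((start, name))
            start = data.find(pattern, start + 1)
    return sorted(hits)
```

Run `find_value` for several values taken from the text export. Small integers match in many places and at several widths, so long strings and unusual floating-point values are more reliable anchors. If none of the known values are found in any encoding, the file is probably compressed or encrypted.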

 1
Author: Andrey Sokolov, 2019-09-06 17:57:54