Audio signal FFT (C++ and SFML)

I have a question about the FFT of an audio signal. I want to draw a graph based on the audio signal, but I'm not good at this topic. I'd like advice from more experienced people on where to start.

I have a few questions: 1) What are samples? For example, with the SFML multimedia library, the following construct is used to get the samples:

sf::SoundBuffer Buffer;
Buffer.loadFromFile("sound.wav");
const sf::Int16 *input = Buffer.getSamples();

So, as I understand it, samples are a binary representation of an audio file? Am I right that the code above is the same as:

    #include <cstdio>

    typedef short int16;
    int16 *load()
    {
      FILE *fp;
      if ((fp = fopen("sound.wav", "rb")) == NULL) {
        printf("Error opening file.\n");
        return NULL;
      }
      fseek(fp, 0, SEEK_END);
      long N = ftell(fp);   // file size in bytes
      fseek(fp, 0, SEEK_SET);
      int16 *A = new int16[N];
      for (long i = 0; i < N; i++)
        if (fread(&A[i], sizeof(A[i]), 1, fp) != 1) {
          if (feof(fp)) break;
          printf("Error reading file.\n");
        }

      fclose(fp);
      return A;
    }

    int16 *input = load();

I.e., is getting the samples the same as getting a binary representation of the file as 16-bit values? Or do I misunderstand?

2) I know that there is the fast FFTW library for FFT. There is also a C++ implementation of the FFT algorithm on Wikibooks (link to the algorithm: https://ru.wikibooks.org/wiki/Implementation_algorithms/fast_furrier transformation_#C. 2B. 2B). The question is, what exactly is fed to the input of this algorithm: an array with the binary representation of the file, or the samples obtained with SFML? The algorithm takes an array with the analyzed data and an array for the transformed data as parameters. What is meant by the analyzed data? And what if the size of the array is not a power of two? Is that a prerequisite? Do I understand correctly that the algorithm at the link works on the sample array, only converted to double?

int16 input[N];                  // N samples
double *in = new double[N];
for (int i = 0; i < N; i++)
    in[i] = input[i];            // convert each sample to double
double *out = new double[N];
FFTAnalysis(in, out, N, N);

In this example, as I understand it, N is the size of the array of samples, i.e. sizeof(input);

In SFML, the number of samples is obtained as follows:

unsigned long long N = Buffer.getSampleCount();

I also don't understand why the size of the sample array is not equal to the number of samples, for example, when I do this:

sf::SoundBuffer Buffer;
Buffer.loadFromFile("sound.wav");
unsigned long long N = Buffer.getSampleCount();
const sf::Int16 *raw = new sf::Int16[N];
raw = Buffer.getSamples();
printf("%zu", sizeof(raw)); // == 4 - why, if N is a six-digit number?

The function void FFTAnalysis(in, out, N, N) is passed an array of analyzed data and an array into which the transformed data is written (double in and double out), and N is the size of these arrays (the number of samples). So, by that condition, N must always be a power of two. But what if the number of samples is not a power of two?

3) Once we have the array of transformed data, which data should the graph be drawn from? Should the coordinates of its points be taken from the imaginary part of the spectrum (i.e., the elements of the transformed array) or from the power spectrum? In other words, which data set should be used as the coordinates of the points of the sine wave?

The result should be a graph, but I do not understand what to use as the coordinates of its points. When I use the SFML library to get the number of samples and the samples themselves, and print them to the console in a loop via printf ("%hu", raw[i]);, I see almost nothing but zeros, with an occasional one. How can I draw a shape from this data? Do the samples need to be pre-processed before they are used as point coordinates?

Author: ZeusBios, 2017-10-23

1 answer

If we are talking about an uncompressed audio file, then a sample is a single value obtained when the signal was digitized, i.e. simply the instantaneous amplitude of the analog signal. From these samples you can build your "sine wave", i.e. the representation of the signal in the time domain. The FFT is the fast Fourier transform; performing it gives a representation of the signal in the frequency domain (its frequency decomposition). I do not know how the specific library is implemented, but the FFT algorithm needs an input length that is a power of 2. You can also pad with zeros at the end, but that is already DSP territory. I recommend that you first get acquainted with the theory so that you clearly understand what you are doing. Good luck!
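To make the chain "samples, zero padding, FFT, frequency domain" concrete, here is a minimal sketch in C++. It is neither the Wikibooks FFTAnalysis routine nor FFTW, just a textbook iterative radix-2 FFT written for illustration. The names nextPowerOfTwo, fft and magnitudeSpectrum are made up for this example, and scaling the samples by 32768 is only one possible convention.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Smallest power of two that is >= n (returns 1 for n == 0).
std::size_t nextPowerOfTwo(std::size_t n) {
    std::size_t p = 1;
    while (p < n) p <<= 1;
    return p;
}

// In-place iterative radix-2 FFT; a.size() must be a power of two.
void fft(std::vector<std::complex<double>>& a) {
    const std::size_t n = a.size();
    const double pi = std::acos(-1.0);
    // Reorder the elements into bit-reversed index order.
    for (std::size_t i = 1, j = 0; i < n; ++i) {
        std::size_t bit = n >> 1;
        for (; j & bit; bit >>= 1) j ^= bit;
        j ^= bit;
        if (i < j) std::swap(a[i], a[j]);
    }
    // Combine butterflies of length 2, 4, 8, ... up to n.
    for (std::size_t len = 2; len <= n; len <<= 1) {
        const double angle = -2.0 * pi / static_cast<double>(len);
        const std::complex<double> wlen(std::cos(angle), std::sin(angle));
        for (std::size_t i = 0; i < n; i += len) {
            std::complex<double> w(1.0, 0.0);
            for (std::size_t k = 0; k < len / 2; ++k) {
                const std::complex<double> u = a[i + k];
                const std::complex<double> v = a[i + k + len / 2] * w;
                a[i + k] = u + v;
                a[i + k + len / 2] = u - v;
                w *= wlen;
            }
        }
    }
}

// Scale 16-bit samples to doubles in [-1, 1), zero-pad to a power of two,
// and return the magnitude of every bin from 0 up to the Nyquist bin.
std::vector<double> magnitudeSpectrum(const std::int16_t* samples,
                                      std::size_t count) {
    const std::size_t n = nextPowerOfTwo(count);
    std::vector<std::complex<double>> data(n);  // padding is already zero
    for (std::size_t i = 0; i < count; ++i)
        data[i] = samples[i] / 32768.0;
    fft(data);
    std::vector<double> mag(n / 2 + 1);
    for (std::size_t k = 0; k < mag.size(); ++k)
        mag[k] = std::abs(data[k]);
    return mag;
}
```

With SFML this could be called as magnitudeSpectrum(Buffer.getSamples(), Buffer.getSampleCount()), assuming sf::Int16 and std::int16_t are the same type on your platform and the file is mono (SFML interleaves the channels of a multi-channel file). Bin k of the result corresponds to the frequency k * sampleRate / n, where n is the padded length, so plotting mag[k] against that frequency gives a spectrum graph, while plotting the raw samples against time gives the waveform.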

 1
Author: Denis Scherbakov, 2017-10-23 16:51:01