Remove the fish-eye effect

There is a video from an action camera that records with a fish-eye effect. The area that should be rectangular has a slightly rounded shape; it is marked in red in the snippet below. I would like to stretch it back to the blue rectangle. I can draw the two vertical lines, or even all four, that can be used to determine the distortion. How do I perform the transformation?

html, body, svg { height: 100%; margin: auto; display: block; }
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 256 171" version="1.1">
  <rect x="0.5" y="0.5" width="255" height="170" style="fill:none;stroke:#0000ff;stroke-width:1;" />
  <path style="fill:none;stroke:#ff0000;stroke-width:1;" d="M 9.2899892,9.8523004 C 0.36470721,58.788143 -1.8968968,109.53857 3.9756332,162.39638 97.464286,177.03692 179.23564,170.77632 254.10649,152.20674 257.52686,105.38064 253.68709,59.452773 242.81437,11.823413 163.07157,-1.6658033 92.718066,-2.2737144 9.2899892,9.8523004 Z" />
</svg>

If I understand correctly, you can use the lenscorrection filter in ffmpeg, but I do not know how to choose its parameters. I tried to take one frame and play with it in GIMP, but nothing came of it.
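
For reference, lenscorrection takes the relative coordinates of the distortion center (cx, cy) and two polynomial coefficients (k1, k2), so an invocation would look something like this, but the numbers here are placeholders, not values that actually work:

ffmpeg -i input.mp4 -vf "lenscorrection=cx=0.5:cy=0.5:k1=-0.2:k2=-0.02" output.mp4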

In principle, an algorithm would also satisfy me, or rather a formula by which, for each pixel of the new image, you can calculate the coordinates in the original one. In this case, the input data are the Bézier curves that I can draw.

It is guaranteed that the frame will look exactly like this: a polyline of 4 points with control handles (ideally, the solution should manage with only the two vertical lines):

[screenshot with guides]

Another example:

html, body, svg { height: 100%; margin: auto; display: block; }
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 201 106" version="1.1" >
  <path style="fill:none;stroke:#000000;stroke-width:1;" d="M 1.7008913,10.872024 C -0.67393974,43.547945 0.43942626,72.73567 0.94494026,102.53125 63.525193,106.64485 143.73295,106.1533 200.51635,98.940476 201.24176,66.541427 200.09818,34.516174 195.79166,3.1235117 135.27441,-2.4071433 83.446773,-0.03728631 1.7008913,10.872024 Z" />
</svg>

The answer to the question can be any option from the following:

  • A ready-made program that performs the task on a video or a single frame.
  • A way to select the correct parameters for ffmpeg.
  • Formula for calculating the old pixel coordinates from the coordinate in the result image.
  • Another kind of graphical transformation.
  • ...
Author: Qwertiy, 2019-06-02

2 answers

This problem can be solved by transforming the texture coordinates in the fragment shader.

let inputs = ['fisheye:321', 'cX:495', 'cY:334', 'rY:258', 'rZ:562', 'zoom:581']
let input = (id, val) => `<label for="${id}"></label>
<input id="${id}" type="range" min="0" max="1000" value="${val}" onmousemove="draw()"/>`
inputs.forEach(i => inp.innerHTML += input(...i.split(':')))
let gl = canvas.getContext('webgl');
let loader = new Image();
loader.crossOrigin = "anonymous";
loader.src = "https://i.imgur.com/G9H683l.jpg";
loader.onload = function() { 
    canvas.width = loader.width;
    canvas.height = loader.height;
    pid = gl.createProgram();
    shader(`
        float perspective = 1.0;          
        attribute vec2 coords;
        uniform float rY; 
        varying vec2 uv;
        void main(void) {
          mat3 rotY = mat3(vec3(cos(rY),  0.0, sin(rY)), 
                           vec3(0.0,      1.0,     0.0),
                           vec3(-sin(rY), 0.0, cos(rY)));
          vec3 p =  vec3(coords.xy, 0.) * rotY;
          uv = coords.xy*0.5 + 0.5;   
          gl_Position = vec4(p, 1.0 + p.z * perspective);
        }
    `, gl.VERTEX_SHADER);
    shader(`
      precision highp float;
      const vec2 res = vec2(${canvas.width}., ${canvas.height}.);  
      varying vec2 uv;
      uniform float fisheye;
      uniform float cX;
      uniform float cY;
      uniform float rZ; 
      uniform float zoom; 
      uniform sampler2D texture;

      // http://stackoverflow.com/questions/6030814
      void main(void) {
        float prop = res.x / res.y;        
        vec2 center = vec2(cX, cY);
        vec2 p = vec2(uv.x,uv.y/prop); 
        vec2 m = vec2(0.5, 0.5 / prop);
        vec2 d = p - m;
        float r = sqrt(dot(d, d));
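        // correction strength derived from the fisheye slider; bind below is
        // the radius used to normalize the remapping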
        float power = (2.0 * 3.141592 / (2.0 * sqrt(dot(m, m)))) * fisheye; 
        float bind;            
        if (power > 0.0) {                
          bind = sqrt(dot(m, m)); 
        } else {                          
          if (prop < 1.0) bind = m.x; 
          else bind = m.y; 
        } 
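        // remap the radius through tan (positive power) or atan (negative power)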
        vec2 uv = p;                     
        if (power > 0.0) 
          uv = m + normalize(d) * tan(r * power) * bind / tan( bind * power);
        else if (power < 0.0)         
          uv = m + normalize(d) * atan(r * -power * 10.0) * bind / atan(-power * bind * 10.0);
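        // shift the origin to the image center, rotate around Z, zoom,
        // offset by the chosen center, then shift back before sampling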
        uv -= vec2(0.5, 0.5/prop); 
        vec2 sc = vec2(sin(rZ), cos(rZ));
        uv *= mat2(sc.y, -sc.x, sc.xy);
        uv *= zoom+1.; 
        uv -= center;
        uv += vec2(0.5, 0.5/prop);
        uv = vec2(uv.x, 1.-uv.y * prop);
        gl_FragColor = texture2D(texture, uv);
      }
    `, gl.FRAGMENT_SHADER);
    gl.linkProgram(pid);
    gl.useProgram(pid);
    gl.bindBuffer(gl.ARRAY_BUFFER, gl.createBuffer());
    gl.bufferData(gl.ARRAY_BUFFER, new Float32Array([-1,-1,1,-1,-1,1,-1,1,1,-1,1,1]), gl.STATIC_DRAW);
    let al = gl.getAttribLocation(pid, "coords");
    gl.vertexAttribPointer(al, 2, gl.FLOAT, false, 0, 0);
    gl.enableVertexAttribArray(al);
    let texture = gl.createTexture();
    gl.bindTexture(gl.TEXTURE_2D, texture);
    gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, gl.RGBA, gl.UNSIGNED_BYTE, loader);
    gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.NEAREST);
    gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_S, gl.CLAMP_TO_EDGE);
    gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_T, gl.CLAMP_TO_EDGE);
    gl.uniform1i(gl.getUniformLocation(pid, "texture"), 0);
    inputs = inputs.map(i => document.querySelector('#' + i.split(':')[0]))
    inputs.forEach(i => i.uniform = gl.getUniformLocation(pid, i.id))
    draw();
}
  
function draw() {
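  // push each slider's value to its uniform, mapping [0, 1000] to [-0.1, 0.1]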
  inputs.forEach(i => {
    gl.uniform1f(i.uniform, i.value/5000-0.1);
    document.querySelector(`label[for="${i.id}"]`)
            .textContent = `${i.value} ${i.id}: ${(i.value/5000-0.1).toFixed(4)}`
  })
  gl.viewport(0, 0, gl.drawingBufferWidth, gl.drawingBufferHeight);
  gl.clearColor(0, 0, 0, 0);
  gl.drawArrays(gl.TRIANGLES, 0, 6);
}

function shader(src, type) {
  let sid = gl.createShader(type);
  gl.shaderSource(sid, src);
  gl.compileShader(sid);
  var message = gl.getShaderInfoLog(sid);
  gl.attachShader(pid, sid);
  if (message.length > 0) {
    console.log(src.split('\n').map(function (str, i) {
      return ("" + (1 + i)).padStart(4, "0") + ": " + str
    }).join('\n'));
    throw message;
  }
}
input{width:calc(100% - 190px)}
label{display:inline-block;width:180px}
<span id="inp"></span>
<canvas id="canvas" style="zoom:0.6"></canvas>

PS: after some research, I managed to build ffmpeg with this shader bolted onto it; more details here

Code on github


Result (I did not try hard with the coefficients)

Before

[image]

After

[image]

UPD: the snippet has been updated

Score: 21
Author: Stranger in the Q, 2019-06-14 20:23:16

Two hundred years ago, a book was published about how to write your own Wolfenstein. Its calculations ran into a similar problem: if you do everything strictly by the formulas, you get a fish-eye effect.

To correct it, the author suggested multiplying the distance to the wall by the cosine of the angle from the center of the image. It seems that a similar problem was encountered not only there.

[an illustration for further discussion]

I took your illustration and extended it a little to show what I mean. So we have a point A that should actually land at point B.

If you believe the formula used to get rid of the fish eye in the raycasting algorithm, you can correct the distortion by dividing the ordinate of A by cos λ and its abscissa by cos φ.

These are the vertical and horizontal angles from the center of the image. The trouble is that they depend on the camera's focal length: a camera with a short focus sees a wide angle both horizontally and vertically, which is what creates the fish-eye effect.

In the picture above, the center of the screen corresponds to the coordinates (0; 0), and the positive axis directions point right and up, as in school textbooks.

We have:

yB = yA/cos(λ)
xB = xA/cos(φ)

From here:

cos(λ) = yA/yB and λ = arccos(yA/yB)
cos(φ) = xA/xB and φ = arccos(xA/xB)

So we can calculate the limit angles λ and φ.
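
For example, if the guides give the center-relative coordinates of one corner A in the distorted frame and of its target B, the limit angles follow directly (the numbers below are placeholders, not measured values):

// Placeholder center-relative coordinates of a measured corner A
// and its desired position B (take real values from the drawn guides).
const xA = 114.8, yA = 73.7;           // corner in the distorted frame
const xB = 127.5, yB = 85.0;           // corner after correction
const lambdaMax = Math.acos(yA / yB);  // limit vertical angle
const phiMax = Math.acos(xA / xB);     // limit horizontal angle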

Now the algorithm itself. First, we calculate the limit angles in each quadrant. Judging by your picture, the camera is not very symmetrical, and the image may be distorted in different ways in different quadrants.

Next, in each quadrant, we go through all the points vertically and horizontally.

At point B with coordinates (xB, yB), we want to place a point that in the distorted image corresponds to point A with coordinates (xA, yA).

xA = xB * cos(φ)
yA = yB * cos(λ)

We take the color of the pixel (xA; yA) in the distorted image and paint the pixel (xB; yB) in the target image with it.

Now move to the next point to the left. It will have a slightly smaller angle φ. As far as I understand, tg(φ) = xB/fH, where fH is the horizontal focal length of the camera. In an ideal world, the horizontal focus would be the same as the vertical one, but it looks like the distortions are a little more complex. We can assume that we have different foci in different quadrants, hoping that this will improve the accuracy of the calculations in each quadrant. Having calculated fH from tg(φ), we can then compute a new value of φ at each step:

tg(φ') = (xB - 1)/fH

Hence

φ' = atan2(xB - 1, fH)
xA' = (xB - 1) * cos(φ')

On this row the ordinate y remains unchanged, but on the next row we will need to recalculate it in the same way as we did for the abscissa.
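
To make the procedure concrete, here is a minimal JavaScript sketch of this inverse mapping, assuming a single pair of focal lengths fH and fV that have already been estimated from the guides (a real implementation would switch them per quadrant, as described above; defish and its arguments are my own names):

// Minimal sketch of the inverse mapping described above.
// src, dst: ImageData objects of the same size; fH, fV: horizontal and
// vertical focal lengths in pixels (assumed already estimated from guides).
function defish(src, dst, fH, fV) {
  const w = src.width, h = src.height;
  const cx = w / 2, cy = h / 2;             // optical center assumed at image center
  for (let y = 0; y < h; y++) {
    for (let x = 0; x < w; x++) {
      const xB = x - cx, yB = y - cy;       // target coordinates, center origin
      const phi = Math.atan2(xB, fH);       // horizontal angle from center
      const lambda = Math.atan2(yB, fV);    // vertical angle from center
      const xA = Math.round(xB * Math.cos(phi) + cx);    // source pixel
      const yA = Math.round(yB * Math.cos(lambda) + cy);
      if (xA < 0 || xA >= w || yA < 0 || yA >= h) continue;
      const s = (yA * w + xA) * 4, d = (y * w + x) * 4;
      dst.data[d]     = src.data[s];        // copy RGBA
      dst.data[d + 1] = src.data[s + 1];
      dst.data[d + 2] = src.data[s + 2];
      dst.data[d + 3] = src.data[s + 3];
    }
  }
}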

Score: 3
Author: Mark Shevchenko, 2019-06-11 12:56:18