Why is the size of an image encoded in Base64 different from the original?

I am using the JavaScript File API to read images with the readAsDataURL method, and I noticed that the encoded images were always larger than the originals, in one case by as much as 1 MB.

This is the code I'm using:

$(function() {
  'use strict';
  $('#source').on('change', function(evt) {
    var file = evt.target.files[0];
    var reader = new FileReader();

    // reader.result is the data URL string, so its length counts characters
    reader.onloadend = function() {
      $('#encoded').text('The encoded size is ' + reader.result.length / 1024 + ' kb');
    };

    if (file) {
      // file.size is the size of the original binary file in bytes
      $('#original').text('The original size is ' + file.size / 1024 + ' kb');
      reader.readAsDataURL(file);
    }
  });
});
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<input id="source" type="file">
<div id="original"></div>
<div id="encoded"></div>

I don't know whether I'm missing something here or whether there is a reason for this behavior. Does anyone know what the cause might be?

 15
Author: devconcept, 2016-04-22

3 answers

Base64 is a representation of binary content using printable characters. To make it printable, Base64 uses a 64-character alphabet, so each character carries 6 bits.

The "waste" appears because Base64 text is stored and transported in bytes (like everything else: disk, network, memory, etc.), yet each Base64 character carries only 6 bits of data while occupying a full byte. So every original byte needs one Base64 character plus 2 bits of the next one.

A diagram says more than a thousand words:

 original | Base64 characters (6 bits each)
 -------- + ---------------------------
 1 byte   | 1 character + 2 bits
 2 bytes  | 2 characters + 4 bits
 3 bytes  | 3 characters + 6 bits = 4 characters (remember, only 6 bits each)

In other words, every block of 3 bytes of the original file becomes 4 bytes in Base64, so the size grows by a factor of at least 4/3, to about 133% of the original. On top of that come the padding (file sizes are not always multiples of 3) and any line breaks inserted into the transfer encoding.
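The 3-to-4 regrouping can be sketched in a few lines. This is only an illustration, assuming the standard Base64 alphabet and padded output; `encodeTriple` and `base64Length` are hypothetical helper names, not part of any browser API:

```javascript
// The standard Base64 alphabet: 64 printable characters, 6 bits each.
const ALPHABET =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Regroup three 8-bit bytes (24 bits) into four 6-bit values.
function encodeTriple(b0, b1, b2) {
  const bits = (b0 << 16) | (b1 << 8) | b2;
  return (
    ALPHABET[(bits >> 18) & 63] +
    ALPHABET[(bits >> 12) & 63] +
    ALPHABET[(bits >> 6) & 63] +
    ALPHABET[bits & 63]
  );
}

// Length of padded Base64 output for a given number of input bytes.
function base64Length(byteCount) {
  return 4 * Math.ceil(byteCount / 3);
}

console.log(encodeTriple(77, 97, 110)); // bytes of "Man" -> "TWFu"
console.log(base64Length(3)); // 4
console.log(base64Length(4)); // 8 (one byte over costs a whole extra block)
```

Because the output always comes in blocks of 4 characters, an input that is not a multiple of 3 bytes is rounded up, which is where the padding overhead comes from.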

 11
Author: rnrneverdies, 2016-04-22 20:13:45

I just tried with an image of the following sizes:

  • Size: 16.8 KB (17,276 bytes)
  • Size on disk: 20.0 KB (20,480 bytes).

Making use of your example, it turns out:

  • the original size is 16.87109375 kb.
  • the encoded size is 22.517578125 kb.
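These two figures can be reproduced by hand. The sketch below assumes the data URL holds padded Base64 with no line breaks, and that the image had a 9-character MIME type such as `image/png`; the answer does not state the image type, so that part is a guess that happens to match the numbers. `dataUrlLength` is a hypothetical helper:

```javascript
// Characters in a data URL built from `byteCount` bytes of binary data.
// `mimeType` is an assumption: the answer does not say what the image type was.
function dataUrlLength(byteCount, mimeType) {
  const prefix = 'data:' + mimeType + ';base64,';
  const payload = 4 * Math.ceil(byteCount / 3); // padded Base64, no line breaks
  return prefix.length + payload;
}

const original = 17276; // bytes, from the file properties above
const encoded = dataUrlLength(original, 'image/png');

console.log(original / 1024); // 16.87109375
console.log(encoded);         // 23058
console.log(encoded / 1024);  // 22.517578125
```

Under those assumptions the encoded string is about 133.5% of the original, which is the 4/3 ratio plus a 22-character prefix and a little padding.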

Let's turn to the English Wikipedia: https://en.wikipedia.org/wiki/Base64

Thus, the actual length of MIME-compliant Base64-encoded binary data is usually about 137% of the original data length, though for very short messages the overhead can be much higher due to the overhead of the headers.

And from the Spanish Wikipedia: https://es.wikipedia.org/wiki/Base64

MIME does not specify a fixed size for lines encoded in Base64, but it does require a maximum size of 76 characters. In addition, any character that does not belong to the alphabet must be ignored by decoders, although many implementations use the CR/LF (carriage return and line feed) characters to delimit the encoded lines. As a result, the actual size of MIME-encoded data is typically 140% of the original size.

Whether 137% or 140%, the size will increase when using Base64. Greetings!

 4
Author: fredyfx, 2016-04-22 19:08:04

FileReader transforms the image into a Base64 text string:

data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQABAAD//gA8Q1JF...

Your variable file, on the other hand, is an object of type File that contains the image in binary.

For more information about the File object you can read this documentation
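To compare like with like, the number of bytes the string decodes back to can be recovered from the data URL itself. A minimal sketch, assuming the payload is padded standard Base64 with no line breaks; `decodedByteLength` is a hypothetical helper, not part of the File API:

```javascript
// Number of binary bytes represented by a Base64 data URL.
function decodedByteLength(dataUrl) {
  // Everything after the first comma is the Base64 payload.
  const base64 = dataUrl.slice(dataUrl.indexOf(',') + 1);

  // Each trailing '=' marks one byte of the last 3-byte block that is absent.
  let padding = 0;
  if (base64.endsWith('==')) {
    padding = 2;
  } else if (base64.endsWith('=')) {
    padding = 1;
  }

  return (base64.length / 4) * 3 - padding;
}

// "hello" is 5 bytes and encodes to "aGVsbG8=".
console.log(decodedByteLength('data:text/plain;base64,aGVsbG8=')); // 5
```

Applied to `reader.result` in the question's code, this value should match `file.size`, while `reader.result.length` counts the characters of the encoded text.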

 3
Author: Avara, 2016-04-22 19:26:00