Get text in quotes with regex in JavaScript

I am capturing what the user enters in a textarea. My goal is to extract from it anything that is between double quotes or single quotes.
Note: a quoted string is assumed not to contain line breaks.

For example, if the textarea contains the following:

aaa 'bbb' ccc "ddd"

Then the regex should capture

bbb
ddd

The regex I'm using:

/((\".*?\")|(\'.*?\'))/g

It works fine for the example above; however, the console shows an "Unterminated group" error when the textarea contains the following:

aaa 'rgba(255,255,255,'

What I need is for any quoted string, regardless of what it contains, to be captured just as the strings in the first example were.

 2
Author: Mariano, 2016-10-15

3 answers

Three ways to capture text in quotes (single or double)

1. Simple

To get the text in single or double quotes, we use two capturing groups. In each match, only one of the two groups contains the quoted text, and that is the value we use. This way we get only the text inside the quotes (without the quotes themselves).

/"([^"]*)"|'([^']*)'/g

function obtenerTextoEnComillas() {
    const regex = /"([^"]*)"|'([^']*)'/g,
    	  texto = document.getElementById("ingreso").value;
    var   grupo,
          resultado = [];
    
    while ((grupo = regex.exec(texto)) !== null) {
        // if the match used double quotes, the content is in grupo[1]
        //   and grupo[2] is undefined, and vice versa
        resultado.push(grupo[1] !== undefined ? grupo[1] : grupo[2]);
    }
    
    // resultado is an array with all the matches;
    // we display the values separated by line breaks
    document.getElementById("resultado").innerText = resultado.join("\n");
}
<textarea id="ingreso" style="width:100%" rows="4">
aaa 'bbb' ccc "ddd"a
aaa 'rgba(255,255,255,'
</textarea>
<input type="button" value="Get text in quotes" onclick="obtenerTextoEnComillas()">
<pre id="resultado"></pre>
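
To try the same idea outside the browser, it can be sketched as a pure function that takes a string instead of reading the textarea. The name extractQuoted is hypothetical (it is not part of the answer), and String.prototype.matchAll (ES2020) is assumed to be available:

```javascript
// Hypothetical helper (an assumption for illustration): same regex, no DOM access.
function extractQuoted(text) {
    const regex = /"([^"]*)"|'([^']*)'/g;
    const result = [];
    for (const match of text.matchAll(regex)) {
        // Exactly one of the two groups participates in each match and the
        // other is undefined. Checking for undefined (rather than relying on
        // truthiness) keeps empty strings such as "" in the result.
        result.push(match[1] !== undefined ? match[1] : match[2]);
    }
    return result;
}

console.log(extractQuoted(`aaa 'bbb' ccc "ddd"`));      // ["bbb", "ddd"]
console.log(extractQuoted(`aaa 'rgba(255,255,255,'`));  // ["rgba(255,255,255,"]
```

Because the function returns an array, the caller decides how to display it, for example with join("\n") as in the snippet above.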


2. All in one

We can always get the searched text in the same group (grupo[2]).
At the end of the expression, we use \1, a backreference to Group 1, to ensure that the match ends with the same character that was captured at the beginning (the quote used to open).

/(["'])(.*?)\1/g

function obtenerTextoEnComillas() {
    const regex = /(["'])(.*?)\1/g,
    	  texto = document.getElementById("ingreso").value;
    var   grupo,
          resultado = [];
    
    while ((grupo = regex.exec(texto)) !== null) {
        // group 1 contains the quote character that was used
        // group 2 is the text inside the quotes
        resultado.push(grupo[2]);
    }
    
    // resultado is an array with all the matches;
    // we display the values separated by line breaks
    document.getElementById("resultado").innerText = resultado.join("\n");
}
Text:
<textarea id="ingreso" style="width:100%" rows="4">
aaa 'rgba(255,255,255,'
"text with 'single' quotes included" ... 'and "vice versa"'
</textarea>
<input type="button" value="Get text in quotes" onclick="obtenerTextoEnComillas()">
<pre id="resultado"></pre>
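
The role of the backreference can be seen in a small console sketch. The sample string and variable names here are illustrative assumptions; the regex is the one from this section:

```javascript
// Group 1 is the opening quote; \1 requires the same character to close,
// so the other kind of quote may appear inside the match.
const quotedRegex = /(["'])(.*?)\1/g;
const sample = `"text with 'single' quotes" ... 'and "vice versa"'`;

const found = [];
for (const match of sample.matchAll(quotedRegex)) {
    found.push({ quote: match[1], text: match[2] });
}

console.log(found);
// first entry:  quote is "  and text is  text with 'single' quotes
// second entry: quote is '  and text is  and "vice versa"
```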


Or, to allow line breaks inside the quotes, replace the dot with [\s\S]:

/(["'])([\s\S]*?)\1/g

function obtenerTextoEnComillas() {
    const regex = /(["'])([\s\S]*?)\1/g,
    	  texto = document.getElementById("ingreso").value;
    var   grupo,
          resultado = [];
    
    while ((grupo = regex.exec(texto)) !== null) {
        // group 1 contains the quote character that was used
        // group 2 is the text inside the quotes
        resultado.push(grupo[2]);
    }
    
    // resultado is an array with all the matches;
    // we display the values separated by blank lines
    document.getElementById("resultado").innerText = resultado.join("\n\n");
}
Text:
<textarea id="ingreso" style="width:100%" rows="4">
aaa 'rgba(255,
255,255,'
"text with 'single' quotes included" ... 'and "vice versa"'
</textarea>
<input type="button" value="Get text in quotes" onclick="obtenerTextoEnComillas()">
<pre id="resultado"></pre>
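
The difference between the dot and [\s\S] can be checked with a short sketch (variable names are assumptions for illustration). Without the s flag, the dot does not match line terminators, so a quoted string that spans two lines is only found by the second regex:

```javascript
const multiline = "aaa 'rgba(255,\n255,255,'";

// The dot stops at "\n", so the closing quote is never reached.
const withDot = /(["'])(.*?)\1/g;
// [\s\S] matches any character, including "\n".
const withAnyChar = /(["'])([\s\S]*?)\1/g;

const dotMatches = Array.from(multiline.matchAll(withDot), m => m[2]);
const anyMatches = Array.from(multiline.matchAll(withAnyChar), m => m[2]);

console.log(dotMatches.length); // 0
console.log(anyMatches);        // one entry: "rgba(255,\n255,255,"
```

In engines that support the s (dotAll) flag from ES2018, /(["'])(.*?)\1/gs should behave like the [\s\S] version.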


Also, you will often want to allow structures more elaborate than .*? inside the quotes. The following expression is only slightly less efficient than the previous one, but it is often more effective with more complex structures (such as the regex shown later).

/(["'])([^"']*(?:(?!\1)["'][^"']*)*)\1/g
  • We define the first group to match either of the two types of quotes: (["'])
  • At the end of the expression, we use \1 as a backreference to Group 1 (the quote used to open).
  • In the middle is Group 2, ([^"']*(?:(?!\1)["'][^"']*)*), which will contain the searched text. It matches:

    • any text without either of the two types of quotes, [^"']*, followed (optionally) by
    • a quote that is not the one captured in Group 1, (?!\1)["'], followed by more allowed text, [^"']*
      (?!...) is a negative lookahead.


    This structure uses a technique known as Unrolling the Loop, which follows the format normal* (?: special normal* )*.
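
As a quick sanity check, the unrolled expression can be compared with the lazy one on a few single-line inputs. This is only a sketch: the helper name contents and the sample strings are assumptions, and the two regexes are not equivalent on multi-line input, because [^"'] also matches line breaks while the dot does not.

```javascript
const lazy = /(["'])(.*?)\1/g;
const unrolled = /(["'])([^"']*(?:(?!\1)["'][^"']*)*)\1/g;

// Hypothetical helper: returns the content of group 2 for every match.
function contents(regex, text) {
    return Array.from(text.matchAll(regex), m => m[2]);
}

const samples = [
    `aaa 'bbb' ccc "ddd"`,
    `aaa 'rgba(255,255,255,'`,
    `"text with 'single' quotes" ... 'and "vice versa"'`,
];

for (const s of samples) {
    // Both calls return the same array for each of these samples.
    console.log(contents(lazy, s), contents(unrolled, s));
}
```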



3. "With \"escapes\""

We can also treat quotes escaped with a backslash, \", as valid content (as most languages do).
In this case, we use the /y (sticky) flag, which forces each match to start either at the beginning of the text or exactly where the previous match ended, and thus ensures that the quotes are balanced. Note: check browser compatibility for this flag.

/[^'"\\]*(?:\\.[^'"\\]*)*(["'])([^"'\\]*(?:(?:(?!\1)["']|\\.)[^"'\\]*)*)\1/gy


Description:

/
[^'"\\]*                    # Text before the quotes
(?:                         # Non-capturing group
    \\.[^'"\\]*             #   A \escape and more text
)*                          # repeated 0 or more times
(["'])                      # Opening quote (group 1)
(                           # Group 2: text between the quotes
    [^"'\\]*                #   Characters that are neither quotes nor \
    (?:                     #   Non-capturing group
        (?:(?!\1)["']|\\.)  #     A quote other than the opening one, or a \escape
        [^"'\\]*            #     Followed by more allowed characters
    )*                      #   repeated 0 or more times (unrolling the loop)
)                           # end of group 2
\1                          # Closing quote (\1 is the text captured in group 1)
/gy                         # Flags: g (all matches), y (sticky, anchored)
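
How the sticky flag anchors a match can be seen in isolation with a much simpler regex. This is a minimal sketch with made-up names and data, not part of the expression above:

```javascript
const stickyDigits = /\d+/y;
const stickyText = "12ab34";

stickyDigits.lastIndex = 0;
console.log(stickyDigits.exec(stickyText)[0]); // "12" (the match starts exactly at index 0)
console.log(stickyDigits.lastIndex);           // 2

// At index 2 the text continues with "ab". A sticky regex does not skip
// ahead to "34": the match fails and lastIndex is reset to 0.
console.log(stickyDigits.exec(stickyText));    // null
console.log(stickyDigits.lastIndex);           // 0

// Starting exactly at index 4 succeeds.
stickyDigits.lastIndex = 4;
console.log(stickyDigits.exec(stickyText)[0]); // "34"
```

This is why the expression above begins with [^'"\\]*(?:\\.[^'"\\]*)*: since a sticky match cannot skip characters, the regex itself has to consume the text that precedes each opening quote.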


Code:

function obtenerTextoEnComillas() {
    const regex = /[^'"\\]*(?:\\.[^'"\\]*)*(["'])([^"'\\]*(?:(?:(?!\1)["']|\\.)[^"'\\]*)*)\1/gy,
    	  texto = document.getElementById("ingreso").value;
    var   grupo,
          resultado = [];
    
    while ((grupo = regex.exec(texto)) !== null) {
        // group 1 contains the quote character that was used
        // group 2 is the text inside the quotes
        resultado.push(grupo[2]);
    }
    
    // resultado is an array with all the matches;
    // we display the values separated by line breaks
    document.getElementById("resultado").innerText = resultado.join("\n");
}
Text:
<textarea id="ingreso" style="width:100%" rows="4">
aaa 'bbb' ccc "ddd"a
aaa 'rgba(255,255,255,'
here "we \"allow\" 'quotes' with escapes"
</textarea>
<input type="button" value="Get text in quotes" onclick="obtenerTextoEnComillas()">
<pre id="resultado"></pre>
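
The same expression can be exercised without the DOM. The sketch below wraps it in a hypothetical function, extractQuotedWithEscapes (an assumed name), and uses String.raw so that the backslashes in the sample text are literal:

```javascript
// Hypothetical helper (an assumption for illustration): same regex as above.
function extractQuotedWithEscapes(text) {
    const regex = /[^'"\\]*(?:\\.[^'"\\]*)*(["'])([^"'\\]*(?:(?:(?!\1)["']|\\.)[^"'\\]*)*)\1/gy;
    const result = [];
    let match;
    while ((match = regex.exec(text)) !== null) {
        result.push(match[2]); // group 2: text between the quotes, escapes kept as typed
    }
    return result;
}

const escaped = String.raw`here "are \"escaped\" 'quotes'" and 'it\'s fine'`;
console.log(extractQuotedWithEscapes(escaped));
// two entries:  are \"escaped\" 'quotes'   and   it\'s fine

// The opening ' is never closed, so the sticky match fails at index 0
// and nothing is returned, even though " def " looks quoted on its own.
console.log(extractQuotedWithEscapes(`'abc" def "ghi"`)); // []
```

Note that the escape sequences are returned as typed (\" stays \"); turning them into plain quotes would be a separate step.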
 3
Author: Mariano, 2017-01-17 06:26:08

Try doing it like this:

var str = 'aaa \'bbb\' ccc "ddd" aaa \'b"bb\' ccc "d\'dd"',
  re = /"[^"]*"|'[^']*'/g, // g flag: each exec() resumes after the previous match
  match;
while ((match = re.exec(str)) !== null) {
  console.log(match[0]); // full match, quotes included
}

Console output:

'bbb'
"ddd"
'b"bb'
"d'dd"
 0
Author: Marcos, 2016-10-15 18:27:45
(?:'|")(.+)(?:'|")

This captures whatever content is found between single or double quotes.

https://regex101.com/r/ZOLlyd/1

You can try it here with all your cases
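
One thing worth checking with this pattern is how it behaves when a line holds more than one quoted string. A small sketch follows; the sample line is the one from the question, and the lazy variant is an illustrative assumption, not part of the answer:

```javascript
const line = `aaa 'bbb' ccc "ddd"`;

// Pattern from the answer: .+ is greedy, so it runs up to the last quote on the line.
const greedy = /(?:'|")(.+)(?:'|")/;
console.log(line.match(greedy)[1]); // bbb' ccc "ddd

// Illustrative variant (an assumption): a lazy .+? with the g flag stops at the
// first closing quote. It still does not check that the closing quote is the
// same kind as the opening one.
const lazyVariant = /(?:'|")(.+?)(?:'|")/g;
console.log(Array.from(line.matchAll(lazyVariant), m => m[1])); // ["bbb", "ddd"]
```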

 -2
Author: Daniel Arias, 2016-10-16 10:29:35