How to join several text files into one?

Does anyone know how to select all the text files from the same directory and merge the information of all of them into one final text file?

Example: in Folder X, I have the files 1.txt, 2.txt and 3.txt. I need to merge the contents of all in just one text file.

I tried this code, which compiles, but when it runs it throws an IndexOutOfRangeException.

string[] stringArray = Directory.GetFiles(@"C:\InventX", "*.txt");
System.Text.StringBuilder stringBuilder = new System.Text.StringBuilder();
for (int i = 0; i <= stringArray.Count(); i++)
{
    stringBuilder.Append(System.IO.File.ReadAllText(stringArray[i]));
}
string bulidOutput = stringBuilder.ToString();
string newFilePath = @"C:\Lala.txt";
System.IO.File.WriteAllText(newFilePath, bulidOutput);
Author: Maniero, 2014-05-22

4 answers

The error in your code is due to this condition:

for (int i = 0; i <= stringArray.Count(); i++)

Should be

for (int i = 0; i < stringArray.Count(); i++)

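As written, the last iteration runs with i == stringArray.Count(). Since arrays are zero-indexed, the last valid index is stringArray.Count() - 1, so accessing stringArray[i] throws IndexOutOfRangeException. For arrays, stringArray.Length gives the same value without needing LINQ.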
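One way to avoid this kind of off-by-one error altogether is to let foreach walk the array instead of managing the index by hand. The following is a minimal sketch; the class MergeExample and the method MergeWithForeach are hypothetical names, not part of the original code, and the sort is added only because Directory.GetFiles does not guarantee an order.

```csharp
using System;
using System.IO;
using System.Text;

public static class MergeExample
{
    // Hypothetical helper: concatenates every file in sourceDir that matches
    // pattern and writes the result to targetPath.
    public static void MergeWithForeach(string sourceDir, string pattern, string targetPath)
    {
        string[] files = Directory.GetFiles(sourceDir, pattern);

        // GetFiles does not promise any ordering, so sort for a predictable result.
        Array.Sort(files, StringComparer.Ordinal);

        var builder = new StringBuilder();

        // foreach only ever visits valid elements, so no index can run past the end.
        foreach (string file in files)
        {
            builder.Append(File.ReadAllText(file));
        }

        File.WriteAllText(targetPath, builder.ToString());
    }
}
```

As in the question, this assumes the target file lives outside the source directory (C:\Lala.txt versus C:\InventX), so it never shows up among the files being read.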

In addition, an efficient way to join files is to read them in chunks and write each chunk as soon as it is read, so no file has to fit in memory at once. You can change the buffer size and measure the performance gains or losses to see what best suits your scenario.

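public void UnirFicheiros(string directorio, string filtro, string ficheiroUnido)
{
    // Fail early when the source directory does not exist
    if (!Directory.Exists(directorio))
        throw new DirectoryNotFoundException();

    const int bufferSize = 1 * 1024;
    string caminhoUnido = Path.Combine(directorio, ficheiroUnido);
    using (var outputFile = File.Create(caminhoUnido))
    {
        foreach (string file in Directory.GetFiles(directorio, filtro))
        {
            // The output file lives in the same directory; skip it if it matches the filter
            if (string.Equals(file, caminhoUnido, StringComparison.OrdinalIgnoreCase))
                continue;

            using (var inputFile = File.OpenRead(file))
            {
                var buffer = new byte[bufferSize];
                int bytesRead;
                // Read returns 0 at the end of the file
                while ((bytesRead = inputFile.Read(buffer, 0, buffer.Length)) > 0)
                {
                    outputFile.Write(buffer, 0, bytesRead);
                }
            }
        }
    }
}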
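If a hand-written read/write loop is not required, Stream.CopyTo (available since .NET Framework 4) does the same chunked copy internally. This is only a sketch of that alternative, not the answer's method; the names StreamMerge and MergeWithCopyTo are hypothetical, and the case-insensitive path comparison assumes Windows-style paths like those in the question.

```csharp
using System;
using System.IO;

public static class StreamMerge
{
    // Hypothetical helper: appends the raw bytes of each matching file to targetPath.
    public static void MergeWithCopyTo(string sourceDir, string pattern, string targetPath)
    {
        string targetFull = Path.GetFullPath(targetPath);

        string[] files = Directory.GetFiles(sourceDir, pattern);
        // GetFiles does not guarantee an order, so sort for a predictable result.
        Array.Sort(files, StringComparer.Ordinal);

        using (FileStream output = File.Create(targetFull))
        {
            foreach (string file in files)
            {
                // Skip the target itself if an earlier run left it in the source directory.
                if (string.Equals(Path.GetFullPath(file), targetFull, StringComparison.OrdinalIgnoreCase))
                    continue;

                using (FileStream input = File.OpenRead(file))
                {
                    // CopyTo reads and writes in chunks using an internal buffer.
                    input.CopyTo(output);
                }
            }
        }
    }
}
```

Because the copy works on bytes, any byte-order mark at the start of a source file is copied into the middle of the output as well, which may matter for text files saved with a BOM.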
Author: Omni, 2014-12-04 16:38:58

Here is a simple example:

static void Main(string[] args)
{
    string diretorio = @"C:\teste";
    string caminhoArquivoDestino = @"C:\teste\saida.txt";

    String[] listaDeArquivos = Directory.GetFiles(diretorio);

    if (listaDeArquivos.Length > 0)
    {
        List<String> linhasDestino = new List<string>();

        foreach (String caminhoArquivo in listaDeArquivos)
        {
            // Skip the output file left by a previous run so it is not merged into itself
            if (String.Equals(caminhoArquivo, caminhoArquivoDestino, StringComparison.OrdinalIgnoreCase))
                continue;

            linhasDestino.AddRange(File.ReadAllLines(caminhoArquivo));
        }

        // WriteAllLines creates the file, or overwrites it if it already exists
        File.WriteAllLines(caminhoArquivoDestino, linhasDestino.ToArray());
    }
}

Play with the methods and adapt them to your needs.

Author: Reiksiel, 2014-05-22 14:29:06

Since the approach doesn't seem to be good, I decided to make a compilable example that would solve the problem in a generic way.

using System;
using System.IO;
using Util.IO;

public class MergeFiles {
    public static void Main(string[] args) {
        int bufferSize;
        FileUtil.MergeTextFiles(args[0], args[1], args[2], (int.TryParse(args[3], out bufferSize) ? bufferSize : 0));
    }
}

namespace Util.IO {
    public static class FileUtil {
        public static void MergeTextFiles(string targetFileName, string sourcePath, string searchPattern = "*.*", int bufferSize = 0) {
            // Fall back to the current directory when no source path is given
            if (string.IsNullOrEmpty(sourcePath)) {
                sourcePath = Directory.GetCurrentDirectory();
            }
            if (sourcePath.IndexOfAny(System.IO.Path.GetInvalidPathChars()) != -1) {
                throw new ArgumentException("The specified source directory contains invalid characters", "sourcePath");
            }
            if (string.IsNullOrEmpty(targetFileName)) {
                throw new ArgumentException("The target file name must be specified", "targetFileName");
            }
            if (targetFileName.IndexOfAny(System.IO.Path.GetInvalidFileNameChars()) != -1) {
                throw new ArgumentException("The target file name contains invalid characters", "targetFileName");
            }
            var targetFullFileName = Path.Combine(sourcePath, targetFileName);
            if (bufferSize == 0) {
                File.Delete(targetFullFileName);
                foreach (var file in Directory.GetFiles(sourcePath, searchPattern)) {
                    if (file != targetFullFileName) {
                        File.AppendAllText(targetFullFileName, File.ReadAllText(file));
                    }
                }
            } else {
                using (var targetFile = File.Create(targetFullFileName, bufferSize)) {
                    foreach (var file in Directory.GetFiles(sourcePath, searchPattern)) {
                        if (file != targetFullFileName) {
                            using (var sourceFile = File.OpenRead(file))    {
                                var buffer = new byte[bufferSize];
                                int bytesRead;
                                while ((bytesRead = sourceFile.Read(buffer, 0, buffer.Length)) > 0) {
                                    targetFile.Write(buffer, 0, bytesRead);
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}

I put it on GitHub for future reference.

The Main() method is there only to make a quick test easy; it is not production-ready. The MergeTextFiles() method is quite reasonable to use. It is not 100%: I did not write unit tests for it, I did not document it, and I did not think through every possible situation, but it is already well on its way.

You can choose a buffer size if you want more control over how the copy is done. If you think you will never need this, you can remove that part of the method. But it does not hurt to leave it in, since the default is to copy each file in full, in whatever way the current .NET implementation handles it.

Possible improvements

Some improvements can be made to make it more generic or to add features. You could, for example, add a last parameter extraNewLineOptions extraNewLineOption = extraNewLineOptions.NoExtraNewLine and an enumeration enum extraNewLineOptions { NoExtraNewLine, SelectiveExtraNewLine, AlwaysExtraNewLine }.

This would allow an extra line break to be placed at the end of each file to ensure that the texts do not run together. It can be useful, but in most cases it is not necessary, so it would be disabled by default. I leave the implementation to each reader's creativity, mainly SelectiveExtraNewLine, which would add a line break only when the file does not already end with one; that is not so trivial to implement. You can create an overload to improve parameter usage.
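As a rough illustration of the three options, here is a sketch of the simpler whole-file variant (no custom buffer). It is an assumption about how the suggestion could look, not the answer's implementation: the enum is renamed to PascalCase ExtraNewLineOptions, the class NewLineMerge and its Merge method are hypothetical, and an empty file is treated as needing no break.

```csharp
using System;
using System.IO;
using System.Text;

public enum ExtraNewLineOptions { NoExtraNewLine, SelectiveExtraNewLine, AlwaysExtraNewLine }

public static class NewLineMerge
{
    // Hypothetical helper: merges sourceFiles into targetPath, optionally
    // adding a line break after each file's contents.
    public static void Merge(string targetPath, string[] sourceFiles,
        ExtraNewLineOptions option = ExtraNewLineOptions.NoExtraNewLine)
    {
        var builder = new StringBuilder();

        foreach (string file in sourceFiles)
        {
            string text = File.ReadAllText(file);
            builder.Append(text);

            // A file "ends with a break" if its last character is \n or \r.
            bool endsWithBreak = text.EndsWith("\n", StringComparison.Ordinal)
                || text.EndsWith("\r", StringComparison.Ordinal);

            bool addBreak =
                option == ExtraNewLineOptions.AlwaysExtraNewLine
                || (option == ExtraNewLineOptions.SelectiveExtraNewLine
                    && text.Length > 0
                    && !endsWithBreak);

            if (addBreak)
            {
                builder.Append(Environment.NewLine);
            }
        }

        File.WriteAllText(targetPath, builder.ToString());
    }
}
```

Doing the same with a fixed-size buffer is harder, because only the last byte of each source file matters and the check then has to account for the file's encoding.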

Another improvement is to allow copying to be done asynchronously. Very useful if you have large volumes of files.
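A possible shape for the asynchronous version, assuming .NET 4.5 or later so that async/await and Stream.CopyToAsync are available. The names AsyncMerge and MergeAsync are hypothetical, and the default buffer size of 81920 bytes is just an illustrative choice.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class AsyncMerge
{
    // Hypothetical helper: copies each source file into targetPath without
    // blocking the calling thread while the I/O is in progress.
    public static async Task MergeAsync(string targetPath, string[] sourceFiles, int bufferSize = 81920)
    {
        // useAsync: true asks the OS for asynchronous file I/O where supported.
        using (var output = new FileStream(targetPath, FileMode.Create, FileAccess.Write,
            FileShare.None, bufferSize, useAsync: true))
        {
            foreach (string file in sourceFiles)
            {
                using (var input = new FileStream(file, FileMode.Open, FileAccess.Read,
                    FileShare.Read, bufferSize, useAsync: true))
                {
                    // Files are still appended one after another, in the order given.
                    await input.CopyToAsync(output, bufferSize);
                }
            }
        }
    }
}
```

A caller would typically write await AsyncMerge.MergeAsync(target, files); from another async method. The copy itself stays sequential, so the benefit is a free calling thread rather than a faster merge.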

The method could also be broken into smaller parts.

Depending on .NET version

I used only features that run on virtually any version of .NET. If the code is guaranteed to be used only on newer versions, the parameter checks can be swapped for Contract.Requires(). It is even possible to remove them entirely, since all of these problems are also checked by the methods being called. Of course, you would then lose the information about exactly where the error originated.

Unfortunately there is no public method to check the validity of the wildcard pattern in advance. But if necessary, you can check how it is implemented in the .NET sources (and possibly in the Mono and .NET Core sources as well).

If you have C# 6 or later (through Roslyn), some improvements can be made.

You could use a using static Util.IO.FileUtil; and then call the method directly: MergeTextFiles("combo.txt", ".", "*.txt").

In addition, the declaration int bufferSize; in the method Main() could be made inline in the TryParse() call: int.TryParse(args[3], out var bufferSize). This out var form was part of the C# 6 previews but only shipped in C# 7. The equivalent inline declaration of int bytesRead inside the while condition never shipped, so that declaration has to stay on its own line.

See the C# 6 example on ideone and on .NET Fiddle.

Author: Maniero, 2020-09-03 13:56:40

With StreamWriter

String[] arquivos = Directory.GetFiles(@".\Txts", "*.txt");
// using flushes and disposes the writer even if reading a file fails midway
using (StreamWriter strWriter = new StreamWriter(".\\Final.txt"))
{
    foreach (String arquivo in arquivos)
    {
        // WriteLine adds a line break after each file's contents
        strWriter.WriteLine(File.ReadAllText(arquivo));
    }
}


Author: , 2014-05-22 14:42:31