Read an Excel file by column name with Apache POI in Java

Good afternoon experts. I have a project that reads an Excel file and validates its data column by column. Using the Apache POI library, I can already read the file and run the validations (with regular expressions) without problems. The catch is that I can only identify columns by index (index 0 = column 1, index 1 = column 2, and so on) with the method getColumnIndex(). I need to identify them by column name instead, because the columns in the files I will read tend to change position. Is there a method that solves this? I have tried:

columnIndex = cell.getSheet().getRow(rowIndex).getCell(0).getRichStringCellValue().toString();

But with this I can only read that one row, not every column and row.

I attach the code I use to read the files:

    Workbook workbook = WorkbookFactory.create(new FileInputStream("C:\\archivo.xlsx"));

    Sheet sheet = workbook.getSheetAt(0);
    totalRows = sheet.getPhysicalNumberOfRows();
    System.out.println("Total number of rows: " + totalRows);
    Iterator<Row> rowIterator = sheet.iterator();

    while (rowIterator.hasNext()) {
        Row row = rowIterator.next();
        rowIndex = row.getRowNum();

        int rowIndex2 = rowIndex +1;

        if (rowIndex2 < 8) {
            continue;
        }
        Iterator<Cell> cellIterator = row.cellIterator();

        while (cellIterator.hasNext()) {

            cell = cellIterator.next();
            columnIndex = cell.getColumnIndex();


            switch (columnIndex) {
                case 0:
                    columnName = "Column 1";
                    //System.out.println(columnName+" -> " + rowIndex);
                    break;
                case 1:
                    columnName = "Column 2";
                    break;
                case 2:
                    columnName = "Column 3";
                    break;

            }


            value = this.getValue(cell);
            valid = this.isValid(columnIndex, value);

            // Strings are compared with isEmpty()/equals(); != only compares references
            if (valid && !value.isEmpty()) {
                continue;
            }


            if (!value.isEmpty())
            {
            System.out.print("Invalid value: " + columnName + " - " + rowIndex2);
            System.out.println(" -> invalid value: " + value);
            }

        }

    }

    return procesarBTR();
}

private String getValue(Cell cell) {
    switch (cell.getCellType()) {
        case Cell.CELL_TYPE_BLANK:
            // return columnName;
            return "";

        case Cell.CELL_TYPE_BOOLEAN:
             return "CELL_TYPE_BOOLEAN";
           // return cell.getStringCellValue();

        case Cell.CELL_TYPE_ERROR:
            return "CELL_TYPE_ERROR";

        case Cell.CELL_TYPE_FORMULA:

            /*switch (cell.getCachedFormulaResultType()) {
                case Cell.CELL_TYPE_STRING:
                    return cell.getStringCellValue();
                case Cell.CELL_TYPE_NUMERIC:
                    return cell.getNumericCellValue() + "";
                default:
                    return "";}*/
            return cell.getStringCellValue();


        case Cell.CELL_TYPE_NUMERIC:
            return fmt.formatCellValue(cell);

        case Cell.CELL_TYPE_STRING:
            return cell.getStringCellValue();

        default:
            return "unknown value";

    }

}

Code that performs validations . . .

Author: ricardo, 2016-03-09

1 answer

As I mentioned in comments (adapted):

In Excel you cannot name a column; there is no such concept. Therefore, other tools have no way to read columns by their "name" either. POI works by reading rows through the Row class and the cells in each row through the Cell class. Based on what you describe, your best bet would be:

  1. Declare a map that serves as metadata, relating each column name to its column location (number).
  2. In the Excel file and sheet to analyze, go to the row that holds the column names.
  3. For each cell in that row:

    3.1. If the cell has text and is not empty, store a new entry in the map. The entry uses the cell's text value as the key and the column number as the value.

  4. Once you have analyzed the file's metadata, parse the rest of the data in Excel using only the columns you want to read, relying on the metadata map. This step assumes you have a collection or structure holding the "column names" you are going to process.

An example of the algorithm described above follows (important: it assumes that the cells are not merged and that the column names occupy exactly one row)

import java.io.File;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;

//step 0. Define a collection with the names of the columns to process
//note that you could read this from a configuration file,
//user input or any other source
List<String> columnas = Arrays.asList("Apellido", "Prima");
//step 1.
Map<String, Integer> mapNombresColumnas = new HashMap<>();
//step 2.
//number of the row that holds the column names
//remember that POI uses 0-based indexes
//if your names are in row 1, then you should initialize this
//variable with 0.
final int filaNombresColumnas = ...;
//location of the Excel file to process
File archivoExcel = new File("/ruta/de/archivo/excel.xlsx");
//open the file with POI
Workbook workbook = WorkbookFactory.create(archivoExcel);
//go to the sheet you are going to process
//if it is the first sheet, pass 0
Sheet sheet = workbook.getSheetAt(...);
//access the row with the column names
Row filaNombresColumna = sheet.getRow(filaNombresColumnas);
//step 3.
//using Java 8's Iterator.forEachRemaining
filaNombresColumna.cellIterator().forEachRemaining(cell -> {
    //step 3.1.
    //assumes the header cells contain text; getStringCellValue()
    //throws on numeric cells
    String valorCelda = cell.getStringCellValue().trim();
    if (!valorCelda.isEmpty()) {
        mapNombresColumnas.put(valorCelda, cell.getColumnIndex());
    }
});
//step 4.
//the values to process are assumed to start in the row
//right after the row that holds the column names
int indiceDatos = filaNombresColumnas + 1;
Row filaDatos = null;
//walk through all the rows with data
while ((filaDatos = sheet.getRow(indiceDatos++)) != null) {
    //only the cells under the requested column "names" are processed
    for (String col : columnas) {
       //the result of mapNombresColumnas.get(col) is
       //the number of the column to read
       //in this case, the result is only printed
       //you can replace this with however you need to process the data
       System.out.print(filaDatos.getCell(mapNombresColumnas.get(col)) + " ");
    }
    System.out.println();
}

For an Excel file where the first sheet has the following structure:

        A           B         C      D
1    Nombre   | Apellido  | Edad | Prima
2    Luis     | Hernández |   28 | 150.43
3    José     | Zavala    |   35 | 170.21
4    Mercedes | Velásquez |   17 | 112.86

And using:

final int filaNombresColumnas = 0;
//...
Sheet sheet = workbook.getSheetAt(0);

The following result is obtained:

Hernández 150.43 
Zavala 170.21 
Velásquez 112.86 
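
To connect this with the validation in the question, here is a minimal sketch of how the header map could drive regex validation by column name instead of by index. It is not POI code: rows are modelled as lists of strings so the example runs on its own, on the assumption that each cell would first be turned into text (for instance with the fmt.formatCellValue(cell) call the question already uses). The class name ColumnNameValidation, the two helper methods and the regex rules are hypothetical, and the third row of the sample data is altered on purpose (comma instead of dot) to trigger a failure.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

class ColumnNameValidation {

    // Step 3 of the algorithm: map each non-empty header text to its column index.
    public static Map<String, Integer> buildHeaderMap(List<String> headerRow) {
        Map<String, Integer> map = new LinkedHashMap<>();
        for (int i = 0; i < headerRow.size(); i++) {
            String name = headerRow.get(i) == null ? "" : headerRow.get(i).trim();
            if (!name.isEmpty()) {
                map.put(name, i);
            }
        }
        return map;
    }

    // Step 4: check every data row against the rules, locating columns by name.
    // Returns one message per invalid cell; empty cells are skipped, as in the question.
    public static List<String> validate(List<List<String>> rows, int headerRowIndex,
                                        Map<String, Pattern> rules) {
        Map<String, Integer> header = buildHeaderMap(rows.get(headerRowIndex));
        List<String> errors = new ArrayList<>();
        for (Map.Entry<String, Pattern> rule : rules.entrySet()) {
            Integer col = header.get(rule.getKey());
            if (col == null) {
                // Report the missing column instead of unboxing null into an int.
                errors.add("Missing column: " + rule.getKey());
                continue;
            }
            for (int r = headerRowIndex + 1; r < rows.size(); r++) {
                List<String> row = rows.get(r);
                String value = (col < row.size() && row.get(col) != null) ? row.get(col) : "";
                if (!value.isEmpty() && !rule.getValue().matcher(value).matches()) {
                    errors.add(rule.getKey() + " - row " + (r + 1) + " -> invalid value: " + value);
                }
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        List<List<String>> rows = Arrays.asList(
            Arrays.asList("Nombre", "Apellido", "Edad", "Prima"),
            Arrays.asList("Luis", "Hernández", "28", "150.43"),
            Arrays.asList("José", "Zavala", "35", "170,21"), // altered on purpose
            Arrays.asList("Mercedes", "Velásquez", "17", "112.86"));

        // Hypothetical rules, keyed by column name rather than by index.
        Map<String, Pattern> rules = new LinkedHashMap<>();
        rules.put("Edad", Pattern.compile("\\d{1,3}"));
        rules.put("Prima", Pattern.compile("\\d+\\.\\d{2}"));

        // Prints: Prima - row 3 -> invalid value: 170,21
        validate(rows, 0, rules).forEach(System.out::println);
    }
}
```

Because the rules are keyed by name, moving the "Prima" column to another position in the sheet would not require any change to the validation code; only the header row is read differently. A requested column that is absent from the header is reported rather than causing a NullPointerException.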
Author: , 2016-04-03 06:08:25