Apache POI – Is there a way to count occurrence of a string in a pivot table?

Tags: , ,



I am using version 4.1.2 of Apache Poi and I have this dataset:

String[] headers = new String[] { "Company", "Status" };

Object[][] sheetData = {
                {"Company 1", "OK"},
                {"Company 1", "NG"},
                {"Company 2", "NG"},
                {"Company 1", "OK"},
                {"Company 3", "OK"},
                {"Company 1", "NG"},
        };

I’m trying to create a pivot table using Apache POI that groups and counts the occurrence of strings from the 2nd column. I’ve tried:

pivotTable.addRowLabel(0);
pivotTable.addColumnLabel(DataConsolidateFunction.COUNT, 1);

But it somehow still counts the occurrences from the first column.

The pivot table I’m trying to create:

enter image description here

and the pivot table that is being generated:

enter image description here

Answer

The pivot table you are showing as the one you are trying to create shows the column 1 = B (Status) as a column label using a DataConsolidateFunction as well as column label used for labeling columns. So one column has two different properties in the pivot table here. That makes it complicated.

The DataConsolidateFunction column label is done already using pivotTable.addColumnLabel(DataConsolidateFunction.COUNT, 1). This also sets dataField setting for the column.

For the column label used for labeling columns apache poi 4.1.2 provides the method XSSFPivotTable.addColLabel. But this method removes the dataField setting. So we need set it new using the low level ooxml-shemas classes.

And the order of the commands is important here because they effect the same column. First do pivotTable.addColumnLabeland then do pivotTable.addColLabel. Else addColumnLabel will set dataField setting but will remove axis="axisCol" setting from that column. But because of the two different properties in the pivot table both settings are needed for that column.

Complete example:

import java.io.FileOutputStream;

import org.apache.poi.ss.SpreadsheetVersion;
import org.apache.poi.ss.util.AreaReference;
import org.apache.poi.ss.util.CellReference;
import org.apache.poi.ss.usermodel.DataConsolidateFunction;
import org.apache.poi.xssf.usermodel.*;

class CreatePivotTable {

 public static void main(String[] args) throws Exception {

  String[] headers = new String[] { "Company", "Status" };
  Object[][] sheetData = {
   {"Company 1", "OK"},
   {"Company 1", "NG"},
   {"Company 2", "NG"},
   {"Company 1", "OK"},
   {"Company 3", "OK"},
   {"Company 1", "NG"},
  };

  try (XSSFWorkbook workbook = new XSSFWorkbook(); 
       FileOutputStream fileout = new FileOutputStream("./ExcelResult.xlsx") ) {

   XSSFSheet dataSheet = workbook.createSheet("Data");
   XSSFRow row;
   XSSFCell cell;
   int r = 0;
   row = dataSheet.createRow(r++);
   int c = 0;
   for (String header : headers) {
    cell = row.createCell(c++);
    cell.setCellValue(header);
   }
   for (Object[] dataRow : sheetData) {
    row = dataSheet.createRow(r++);
    c = 0;
    for (Object value : dataRow) {
     cell = row.createCell(c++);
     if (value instanceof String) {
      cell.setCellValue((String)value);
     } //else if...
    }
   } 

   XSSFSheet pivotSheet = workbook.createSheet("Pivot");

   AreaReference areaReference = new AreaReference(
    new CellReference(0, 0),
    new CellReference(sheetData.length, headers.length-1),
    SpreadsheetVersion.EXCEL2007);

   XSSFPivotTable pivotTable = pivotSheet.createPivotTable(areaReference, new CellReference("A4"), dataSheet);

   pivotTable.addRowLabel(0);
   pivotTable.addColumnLabel(DataConsolidateFunction.COUNT, 1);

   pivotTable.addColLabel(1);
   //Method addColLabel removes the dataField setting. So we need set it new.
   pivotTable.getCTPivotTableDefinition().getPivotFields().getPivotFieldArray(1)
    .setDataField(true);

   workbook.write(fileout);

  }

 }
}


Source: stackoverflow