I’m trying to convert a CSV file into an ARFF file using the following code.
var csvFile = new File("/path/to/input/file.csv");
var arffOutputFile = new File("/path/to/output/file.arff");
var loader = new CSVLoader();
loader.setSource(csvFile);
var instances = loader.getDataSet();
var saver = new ArffSaver();
saver.setInstances(instances);
saver.setFile(arffOutputFile);
saver.writeBatch();
This code works, but there is a problem. My attribute list contains a nominal attribute with the values {yes, no}, and I need the ARFF header to list yes as the first value. To be clearer, I need @attribute nominal_attr {yes,no} and not @attribute nominal_attr {no,yes} in the ARFF output header. The problem is that the order is determined by the value in the first Instance in instances: if the first row of the CSV input file has the value no, the header will contain @attribute nominal_attr {no,yes}.
Is there a way to force the ArffSaver to use a certain order in the header without changing the order of the Instances?
Answer
Instead of fixing the output (i.e., ArffSaver), it would be easier to fix the input (i.e., CSVLoader). The -L command-line option (the nominalLabelSpecs property in the GUI) lets you specify the labels for nominal attributes. That way, you can force both the order and the set of available labels, even if one of the CSV files doesn’t contain all of them.
The following filters can be used as well to change the order of your labels: