I try to generate csv file from json type data. These are my json test data.
{ "realtime_start":"2020-09-25", "realtime_end":"2020-09-25",, "units": "Percent", "seriess": [ { "name": "James", "age": 29, "house": "CA" }, { "name": "Jina", "age": 39, "house": "MA", "notes": "Million tonne punch" }, }
The problem is json array type "seriess
” does not contain "notes"
node in all every nodes.
I made the below java codes to change this json data to csv file with header columns
JSONObject json = getJsonFileFromURL(...) JSONArray docsArray = json.getJSONArray("seriess"); docsArray.put(json.get("realtime_start")); docsArray.put(json.get("realtime_end")); docsArray.put(json.get("units")); JsonNode jsonTree = new ObjectMapper().readTree(docsArray.toString()); Builder csvSchemaBuilder = CsvSchema.builder(); for(JsonNode node : jsonTree) { node.fieldNames().forEachRemaining(fieldName -> {csvSchemaBuilder.addColumn(fieldName);} ); } CsvSchema csvSchema = csvSchemaBuilder.build().withHeader(); CsvMapper csvMapper = new CsvMapper(); csvMapper.writerFor(JsonNode.class).with(csvSchema).writeValue(new File("test.csv"), jsonTree);
But the incorrect results are shown like below,
realtime_start,realtime_end,units,names,age,house,realtime_start,realtime_end,units,names,age,house,notes, realtime_start,.....
The generated header columns does not contain distinct values. The header columns are added in duplicate. How can I generate the distinct header like below
realtime_start,realtime_end,units,names,age,house, notes
Any idea?
Update Part
I try to extract data from the FRED (FEDERAL RESERVE BANK OF ST. LOUIS). FRED provide simple and convenient Python api like below,
from fredapi import Fred import pandas as pd fred = Fred(api_key='abcdefghijklmnopqrstuvwxyz0123456789') data_unemploy = fred.search('Unemployment Rate in California') data_unemploy.to_csv("test_unemploy.csv")
But the java apis are deprecated, so I have to develop simple Java api which convert json values to csv file. I found the below Java codes with googling
JSONObject json = getJsonFileFromURL("https://api.stlouisfed.org/fred/series/search?search_text=Unemployment+Rate+in+California&api_key=abcdefghijklmnopqrstuvwxyz0123456789&file_type=json"); JSONArray docsArray = json.getJSONArray("seriess"); docsArray.put(json.get("realtime_start")); docsArray.put(json.get("realtime_end")); JsonNode jsonTree = new ObjectMapper().readTree(docsArray.toString()); JsonNode firstObject = jsonTree.elements().next(); // I am struggling with this line firstObject.fieldNames().forEachRemaining(fieldName -> {csvSchemaBuilder.addColumn(fieldName);} ); CsvSchema csvSchema = csvSchemaBuilder.build().withHeader(); CsvMapper csvMapper = new CsvMapper(); csvMapper.writerFor(JsonNode.class).with(csvSchema).writeValue(new File("test.csv"), jsonTree);
To extract columns from json data JsonNode firstObject = jsonTree.elements().next();
return the first json node. But this line does not return notes
column. because the first line does not contain the notes
key value.
So I change this code line to following lines
for(JsonNode node : jsonTree) { node.fieldNames().forEachRemaining(fieldName -> { csvSchemaBuilder.addColumn(fieldName); } ); }
But these lines throws the results which I do not expect. The repeated duplicated columns like below
realtime_start,realtime_end,units,names,age,house,realtime_start,realtime_end,units,names,age,house,notes, realtime_start,.....
I am totally stuck with this part.
Advertisement
Answer
Most probably it is easiest to write a bin type class like below :
public class CsvVo { private String realtime_start; private String realtime_end; private String units; private String name; private String age; private String house; private String notes; public void setRealtime_start(String realtime_start) { this.realtime_start = realtime_start; } //Other getters and Setters
Then you can Write :
public class ConvertJsonToCSVTest { public static void main(String[] args) throws JSONException { String jsonArrayString = "{n" + "t"realtime_start": "2020-09-25",n" + "t"realtime_end": "2020-09-25",n" + "t"units": "Percent",n" + "t"seriess": [{n" + "ttt"name": "James",n" + "ttt"age": 29,n" + "ttt"house": "CA"n" + "tt},n" + "tt{n" + "ttt"name": "Jina",n" + "ttt"age": 39,n" + "ttt"house": "MA",n" + "ttt"notes": "Million tonne punch"n" + "tt}n" + "t]n" + "}"; JSONObject inJson; List<CsvVo> list = new ArrayList<>(); inJson = new JSONObject(jsonArrayString); JSONArray inJsonSeries = inJson.getJSONArray("seriess"); for (int i = 0, size = inJsonSeries.length(); i < size; i++){ CsvVo line = new CsvVo(); line.setRealtime_start(inJson.get("realtime_start").toString()); line.setRealtime_end(inJson.get("realtime_end").toString()); line.setUnits(inJson.get("units").toString()); JSONObject o = (JSONObject)inJsonSeries.get(i); line.setName(o.get("name").toString()); line.setAge(o.get("age").toString()); line.setHouse(o.get("house").toString()); try { line.setNotes(o.get("notes").toString()); }catch (JSONException e){ line.setNotes(""); } list.add(line); } String[] cols = {"realtime_start", "realtime_end", "units", "name", "age", "house", "notes"}; CsvUtils.csvWriterUtil(CsvVo.class, list, "in/EmpDetails.csv", cols); } }
csvWriterUtil is like below :
public static <T> void csvWriterUtil(Class<T> beanClass, List<T> data, String outputFile, String[] columnMapping){ try{ Writer writer = new BufferedWriter(new FileWriter(outputFile)); ColumnPositionMappingStrategy<T> strategy = new ColumnPositionMappingStrategy<>(); strategy.setType(beanClass); strategy.setColumnMapping(columnMapping); StatefulBeanToCsv<T> statefulBeanToCsv =new StatefulBeanToCsvBuilder<T>(writer) .withMappingStrategy(strategy) .build(); writer.write(String.join(",",columnMapping)+"n"); statefulBeanToCsv.write(data); writer.close(); } catch (IOException e) { e.printStackTrace(); } catch (CsvRequiredFieldEmptyException e) { e.printStackTrace(); } catch (CsvDataTypeMismatchException e) { e.printStackTrace(); } }
Full example is available in GitRepo