Skip to content
Advertisement

Partition a Set into smaller Subsets and process as batch

I have a continuous running thread in my application, which consists of a HashSet to store all the symbols inside the application. As per the design at the time it was written, inside the thread’s while true condition it will iterate the HashSet continuously, and update the database for all the symbols contained inside HashSet.

The maximum number of symbols that might be present inside the HashSet will be around 6000. I don’t want to update the DB with all the 6000 symbols at once, but divide this HashSet into different subsets of 500 each (12 sets) and execute each subset individually and have a thread sleep after each subset for 15 minutes, so that I can reduce the pressure on the database.

This is my code (sample code snippet)

How can I partition a set into smaller subsets and process (I have seen the examples for partitioning ArrayList, TreeSet, but didn’t find any example related to HashSet)

package com.ubsc.rewji.threads;

import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;
import java.util.concurrent.PriorityBlockingQueue;

public class TaskerThread extends Thread {
    private PriorityBlockingQueue<String> priorityBlocking = new PriorityBlockingQueue<String>();
    String symbols[] = new String[] { "One", "Two", "Three", "Four" };
    Set<String> allSymbolsSet = Collections
            .synchronizedSet(new HashSet<String>(Arrays.asList(symbols)));

    public void addsymbols(String commaDelimSymbolsList) {
        if (commaDelimSymbolsList != null) {
            String[] symAr = commaDelimSymbolsList.split(",");
            for (int i = 0; i < symAr.length; i++) {
                priorityBlocking.add(symAr[i]);
            }
        }
    }

    public void run() {
        while (true) {
            try {
                while (priorityBlocking.peek() != null) {
                    String symbol = priorityBlocking.poll();
                    allSymbolsSet.add(symbol);
                }
                Iterator<String> ite = allSymbolsSet.iterator();
                System.out.println("=======================");
                while (ite.hasNext()) {
                    String symbol = ite.next();
                    if (symbol != null && symbol.trim().length() > 0) {
                        try {
                            updateDB(symbol);

                        } catch (Exception e) {
                            e.printStackTrace();
                        }
                    }
                }
                Thread.sleep(2000);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

    public void updateDB(String symbol) {
        System.out.println("THE SYMBOL BEING UPDATED IS" + "  " + symbol);
    }

    public static void main(String args[]) {
        TaskerThread taskThread = new TaskerThread();
        taskThread.start();

        String commaDelimSymbolsList = "ONVO,HJI,HYU,SD,F,SDF,ASA,TRET,TRE,JHG,RWE,XCX,WQE,KLJK,XCZ";
        taskThread.addsymbols(commaDelimSymbolsList);

    }

}

Advertisement

Answer

Do something like

private static final int PARTITIONS_COUNT = 12;

List<Set<Type>> theSets = new ArrayList<Set<Type>>(PARTITIONS_COUNT);
for (int i = 0; i < PARTITIONS_COUNT; i++) {
    theSets.add(new HashSet<Type>());
}

int index = 0;
for (Type object : originalSet) {
    theSets.get(index++ % PARTITIONS_COUNT).add(object);
}

Now you have partitioned the originalSet into 12 other HashSets.

Advertisement