
Need help parsing this table

public String getTotalInfected() {
    try {
        Document doc = Jsoup.parse("<tr class=\"total_row_world odd\" role=\"row\">\r\n" +
                "<td></td>\r\n" +
                "<td style=\"text-align:left;\">World</td>\r\n" +
                "<td class=\"sorting_1\">4,815,439</td>\r\n" +
                "<td>+16,173</td>\r\n" +
                "<td>316,853</td>\r\n" +
                "<td>+333</td>\r\n" +
                "<td>1,863,306</td>\r\n" +
                "<td>2,635,280</td>\r\n" +
                "<td>44,817</td>\r\n" +
                "<td>618</td>\r\n" +
                "<td>40.6</td>\r\n" +
                "<td></td>\r\n" +
                "<td></td>\r\n" +
                "<td></td>\r\n" +
                "<td style=\"display:none\" data-continent=\"all\">All</td>\r\n" +
                "</tr>");
        Elements tr = doc.select("tr");
        System.out.println("tr elements in html: " + tr.size());
        Elements td = tr.select("td");
        System.out.println(td.text());
        return null;

    } catch (Exception ex) {
        return "Error in website linkage";
    }
}

I'm trying to scrape the numbers from the td tags, but for some reason nothing is scraped. I'm pretty new to the Jsoup library, and scraping this table is driving me crazy. Thanks in advance for the help!

Answer

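You are missing <table>...</table>. Jsoup parses the input the way a browser's HTML5 parser would, and that parser discards <tr> and <td> tags that appear outside a table, so doc.select("tr") matches nothing. Wrap the row in <table>...</table> and the selectors work.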

Demo:

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public class JSoupPrj {
    public static void main(String[] args) {
        String html = "<table><tr class=\"total_row_world odd\" role=\"row\">\r\n" + "<td></td>\r\n"
                + "<td style=\"text-align:left;\">World</td>\r\n" + "<td class=\"sorting_1\">4,815,439</td>\r\n"
                + "<td>+16,173</td>\r\n" + "<td>316,853</td>\r\n" + "<td>+333</td>\r\n" + "<td>1,863,306</td>\r\n"
                + "<td>2,635,280</td>\r\n" + "<td>44,817</td>\r\n" + "<td>618</td>\r\n" + "<td>40.6</td>\r\n"
                + "<td></td>\r\n" + "<td></td>\r\n" + "<td></td>\r\n"
                + "<td style=\"display:none\" data-continent=\"all\">All</td>\r\n" + "</tr></table>";
        Document doc = Jsoup.parse(html);
        Elements tr = doc.select("tr");
        System.out.println("tr elements in html: " + tr.size());
        Elements td = tr.select("td");
        System.out.println(td.text());
    }
}

Output:

tr elements in html: 1
World 4,815,439 +16,173 316,853 +333 1,863,306 2,635,280 44,817 618 40.6    All
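
Since you said you want the numbers themselves: the cell text still comes back as formatted strings such as "4,815,439" or "+16,173". Here is a minimal sketch of turning one cell's text into a long, using only the JDK. CellNumbers and parseCount are hypothetical names of mine, not part of Jsoup, and the sketch assumes the commas are always thousands separators.

```java
public class CellNumbers {

    // Hypothetical helper (not part of Jsoup): converts cell text such as
    // "4,815,439" or "+16,173" to a long.
    // Assumption: commas are thousands separators, never decimal marks.
    public static long parseCount(String cellText) {
        // Drop surrounding whitespace and the grouping commas.
        String cleaned = cellText.trim().replace(",", "");
        // Long.parseLong accepts an optional leading '+' or '-'.
        // It throws NumberFormatException for empty or non-numeric text,
        // so empty cells need to be skipped by the caller.
        return Long.parseLong(cleaned);
    }

    public static void main(String[] args) {
        System.out.println(parseCount("4,815,439"));
        System.out.println(parseCount("+16,173"));
    }
}
```

In the demo above, the world total is the third cell, so you would pass it td.get(2).text(). Cells like "40.6" are not whole numbers and would need Double.parseDouble instead.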
User contributions licensed under: CC BY-SA