public String getTotalInfected() {
    try {
        Document doc = Jsoup.parse("<tr class=\"total_row_world odd\" role=\"row\">\r\n"
                + "<td></td>\r\n"
                + "<td style=\"text-align:left;\">World</td>\r\n"
                + "<td class=\"sorting_1\">4,815,439</td>\r\n"
                + "<td>+16,173</td>\r\n"
                + "<td>316,853</td>\r\n"
                + "<td>+333</td>\r\n"
                + "<td>1,863,306</td>\r\n"
                + "<td>2,635,280</td>\r\n"
                + "<td>44,817</td>\r\n"
                + "<td>618</td>\r\n"
                + "<td>40.6</td>\r\n"
                + "<td></td>\r\n"
                + "<td></td>\r\n"
                + "<td></td>\r\n"
                + "<td style=\"display:none\" data-continent=\"all\">All</td>\r\n"
                + "</tr>");
        Elements tr = doc.select("tr");
        System.out.println("tr elements in html: " + tr.size());
        Elements td = tr.select("td");
        System.out.println(td.text());
        return null;
    } catch (Exception ex) {
        return "Error in website linkage";
    }
}
I'm trying to scrape the numbers from the td tags, but for some reason nothing is scraped. I'm pretty new to the Jsoup library, and scraping tables is driving me crazy. Thanks in advance for the help!
Answer
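You are missing <table>...</table>. Jsoup's default HTML parser follows the HTML5 parsing rules, under which <tr> and <td> tags that appear outside a table are dropped, so doc.select("tr") matches nothing. Wrap the row in a table and the selectors work.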
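The effect of the missing wrapper can be checked directly. The following is a minimal sketch, not part of the original answer: the class name TableContextCheck and its helper methods are made up for illustration, and the row is shortened to two cells. It counts the <tr> matches for a bare row, for the same row wrapped in <table>, and for the bare row parsed with Jsoup's XML parser (Parser.xmlParser()), which does not apply the HTML table rules and may be an option when the fragment cannot be wrapped.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.parser.Parser;

// Hypothetical helper class, used only to illustrate the point above.
public class TableContextCheck {

    // Counts <tr> elements after parsing with Jsoup's default HTML parser.
    static int countRowsHtml(String html) {
        Document doc = Jsoup.parse(html);
        return doc.select("tr").size();
    }

    // Counts <tr> elements after parsing with the XML parser,
    // which keeps tags as written instead of applying HTML table rules.
    static int countRowsXml(String html) {
        Document doc = Jsoup.parse(html, "", Parser.xmlParser());
        return doc.select("tr").size();
    }

    public static void main(String[] args) {
        // Shortened sample row (an assumption; the real row has more cells).
        String row = "<tr><td>World</td><td>4,815,439</td></tr>";

        System.out.println(countRowsHtml(row));                          // 0
        System.out.println(countRowsHtml("<table>" + row + "</table>")); // 1
        System.out.println(countRowsXml(row));                           // 1
    }
}
```

Wrapping the fragment in <table> is usually the simpler fix when the rest of the code expects an HTML document; the XML parser route leaves the fragment untouched but also skips other HTML normalisation.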
Demo:
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public class JSoupPrj {
    public static void main(String[] args) {
        // The row is wrapped in <table>...</table> so the HTML parser keeps <tr> and <td>.
        String html = "<table><tr class=\"total_row_world odd\" role=\"row\">\r\n"
                + "<td></td>\r\n"
                + "<td style=\"text-align:left;\">World</td>\r\n"
                + "<td class=\"sorting_1\">4,815,439</td>\r\n"
                + "<td>+16,173</td>\r\n"
                + "<td>316,853</td>\r\n"
                + "<td>+333</td>\r\n"
                + "<td>1,863,306</td>\r\n"
                + "<td>2,635,280</td>\r\n"
                + "<td>44,817</td>\r\n"
                + "<td>618</td>\r\n"
                + "<td>40.6</td>\r\n"
                + "<td></td>\r\n"
                + "<td></td>\r\n"
                + "<td></td>\r\n"
                + "<td style=\"display:none\" data-continent=\"all\">All</td>\r\n"
                + "</tr></table>";
        Document doc = Jsoup.parse(html);
        Elements tr = doc.select("tr");
        System.out.println("tr elements in html: " + tr.size());
        Elements td = tr.select("td");
        System.out.println(td.text());
    }
}
Output:
tr elements in html: 1
World 4,815,439 +16,173 316,853 +333 1,863,306 2,635,280 44,817 618 40.6 All
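Since the goal is to get the numbers out of the cells, the cell text still has to be converted once the selectors work. Here is a small sketch of that step using only the standard library. The class name CellNumbers and the method parseCount are hypothetical, and it assumes the cell text looks like the sample row ("4,815,439", "+16,173"); in the demo above, such a string could come from td.get(2).text(), assuming the total sits in the third cell as it does in the sample.

```java
// Hypothetical helper, shown only as an illustration of the conversion step.
public class CellNumbers {

    // Converts cell text such as "4,815,439" or "+16,173" to a long.
    // Throws NumberFormatException for empty or non-integer text (e.g. "40.6").
    static long parseCount(String cellText) {
        String cleaned = cellText.trim().replace(",", "");
        return Long.parseLong(cleaned); // a leading '+' is accepted by Long.parseLong
    }

    public static void main(String[] args) {
        System.out.println(parseCount("4,815,439")); // 4815439
        System.out.println(parseCount("+16,173"));   // 16173
    }
}
```

Empty cells and decimal values such as 40.6 would need their own handling, for example a check for an empty string before parsing.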