I’m trying to get just the values from an XPath query for href attributes, but I can’t figure out how to write the query. At best I get my hrefs back as a list of DomAttr objects, and I then have to call getValue() on each one to get the actual link.
My very simple set-up is the following:
WebClient webClient = new WebClient();
HtmlPage page = webClient.getPage(siteRef);
var hrefs = page.getByXPath("//@href"); // Returns a list of DomAttr
Edit: This returns the value, but only for the first element it finds:
var hrefs = page.getByXPath("string(//@href)");
Answer
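I guess you are right: there is no way to get an array (or List) of String values directly from getByXPath. In XPath 1.0, string() applied to a node-set returns the string value of only the first node in document order, which is why string(//@href) yields a single link.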
Nevertheless, you can achieve that behavior by using Java streams. Streams also give you additional ways to work with the result list (e.g. filtering it, or further processing such as calling toLowerCase on the Strings):
var hrefs = page.getByXPath("//@href")
        .stream()
        .filter(o -> o instanceof DomAttr)  // to be sure you have the correct type
        .map(o -> ((DomAttr) o))            // cast the stream from Object to DomAttr
        .map(DomAttr::getValue)             // get the value of every DomAttr
        .collect(Collectors.toList());      // collect the values into a list
hrefs now contains a List<String>.
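To try the same idea without an HtmlUnit dependency, here is a minimal sketch that uses the JDK's built-in javax.xml.xpath API instead of HtmlUnit. It is not the HtmlUnit code from above: the class name HrefValues, its helper methods and the sample markup are assumptions made for illustration, and the input has to be well-formed XML/XHTML. It shows both behaviours discussed here: //@href evaluated as a node-set gives every attribute, which is then mapped to its value, while string(//@href) keeps only the first one.

```java
import java.io.StringReader;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; a JDK-only stand-in for the HtmlUnit page.
final class HrefValues {

    // Parses well-formed XML/XHTML from a string (real-world HTML needs an HTML parser).
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Evaluates //@href as a node-set and maps every attribute node to its value.
    static List<String> allHrefs(Document doc) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList attrs = (NodeList) xpath.evaluate("//@href", doc, XPathConstants.NODESET);
        return IntStream.range(0, attrs.getLength())
                .mapToObj(attrs::item)      // NodeList is not a Collection, so index into it
                .map(Node::getNodeValue)    // for an attribute node this is the attribute value
                .collect(Collectors.toList());
    }

    // string(//@href) converts the node-set to a string, which keeps only the first node.
    static String firstHref(Document doc) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate("string(//@href)", doc);
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(
                "<html><body><a href=\"https://example.com/a\">A</a>"
                + "<a href=\"/b\">B</a></body></html>");
        System.out.println(allHrefs(doc));
        System.out.println(firstHref(doc));
    }
}
```

The JDK API returns a NodeList rather than a List, hence the IntStream over indices; with HtmlUnit the .stream() call on the returned List plays the same role.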
Instead of collecting the results in the last step, you can keep working with the stream.
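As a rough illustration of continuing the pipeline, the sketch below starts from a plain List<String> that stands in for the values produced by .map(DomAttr::getValue). The class HrefPostProcessing, its method names and the sample links are hypothetical, and the choice of steps (lower-casing, keeping absolute http(s) links, removing duplicates) is only one possible example.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper class; the input list stands in for the mapped href values.
final class HrefPostProcessing {

    // Intermediate steps that could follow .map(DomAttr::getValue) in the pipeline.
    private static Stream<String> absolute(List<String> hrefs) {
        return hrefs.stream()
                .map(h -> h.toLowerCase(Locale.ROOT))  // normalise case
                .filter(h -> h.startsWith("http://") || h.startsWith("https://"))
                .distinct();                           // drop duplicates, keep first occurrence
    }

    // Terminal operation: collect the remaining links into a list.
    static List<String> absoluteLinks(List<String> hrefs) {
        return absolute(hrefs).collect(Collectors.toList());
    }

    // Alternative terminal operation: only count them, without building a list.
    static long countAbsoluteLinks(List<String> hrefs) {
        return absolute(hrefs).count();
    }

    public static void main(String[] args) {
        List<String> hrefs = List.of(
                "HTTPS://Example.com/A", "/b", "https://example.com/a", "http://example.org");
        System.out.println(absoluteLinks(hrefs));
        System.out.println(countAbsoluteLinks(hrefs));
    }
}
```

Lower-casing whole URLs is used here only because the answer mentions toLowerCase; URL paths can be case-sensitive, so a real pipeline may want a different normalisation.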