In my video software, I am processing animations that run in a web page. I am currently using Selenium 4 with Chrome and would like to achieve better performance.
I could not find a way to use the Page.startScreencast method via Selenium, so my current approach is simply to take periodic screenshots using:
    byte[] screenshotAs = ((TakesScreenshot) d).getScreenshotAs(OutputType.BYTES);
    BufferedImage bi = ImageIO.read(new ByteArrayInputStream(screenshotAs));
Profiling with VisualVM shows that a significant amount of time is spent in getScreenshotAs, which seems to talk to Chrome via HTTP even if I use a non-remote driver instance.
In addition, the screenshot is returned as a Base64-encoded string, which is decoded into a byte[] and then read by ImageIO. As I need to work with the raw image, this is very inefficient, and I wonder if there is a better, lower-level way to work with the image data. I’d like to eliminate the PNG round trip if possible, and ideally the Base64 one as well.
I am also not notified when changes happen (as opposed to how startScreencast works), so I cannot avoid taking unnecessary screenshots when nothing has changed.
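While polling, one partial workaround is to skip the decode step whenever the encoded bytes match the previous capture. The following is a minimal sketch of that idea, not something Selenium provides: the class name FrameChangeDetector and its method are hypothetical, and it assumes Chrome produces byte-identical PNG output for an unchanged page, which is not guaranteed. It does not reduce the cost of getScreenshotAs itself, only the cost of ImageIO.read for repeated frames.

```java
import java.util.zip.CRC32;

// Hypothetical helper: remembers a checksum of the last encoded frame
// so that identical consecutive captures can be skipped before decoding.
final class FrameChangeDetector {
    private long lastChecksum;
    private int lastLength = -1; // -1 means no frame has been seen yet

    /** Returns true if the encoded bytes differ from those of the previous call. */
    boolean hasChanged(byte[] encodedFrame) {
        CRC32 crc = new CRC32();
        crc.update(encodedFrame);
        long checksum = crc.getValue();
        // Compare length as well, to make accidental checksum collisions less likely
        boolean changed = encodedFrame.length != lastLength || checksum != lastChecksum;
        lastChecksum = checksum;
        lastLength = encodedFrame.length;
        return changed;
    }
}
```

In the polling loop, the bytes returned by getScreenshotAs(OutputType.BYTES) would be passed to hasChanged, and ImageIO.read would only be called when it returns true.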
Is there any better way to do this that I am missing? Maybe JCEF would allow me to do this more efficiently? (I could not find this in their docs)
I was also looking at how OBS does it with CEF in its browser-source implementation, but unfortunately I cannot really see how the image data is taken out of the CEF browser, as I am not familiar with C++ at that level.
Answer
I too have struggled to call startScreencast through the Selenium interface. However, there are other libraries that allow you to do this; Selenium 4 is just a helper wrapper that abstracts away some of the complexity.
    ChromeDriver driver = new ChromeDriver();
    // chromedriver reports the DevTools endpoint as "host:port" in its capabilities
    String addr = ((Map<String, String>) driver.getCapabilities().getCapability("goog:chromeOptions")).get("debuggerAddress");
    String[] hostAndPort = addr.split(":");
    ChromeService cs = new ChromeServiceImpl(hostAndPort[0], Integer.parseInt(hostAndPort[1]));
    final ChromeTab tab = cs.getTabs().get(0);
    final ChromeDevToolsService devToolsService = cs.createDevToolsService(tab);
    final Page page = devToolsService.getPage();
    // Register the handler before starting the screencast so that no frame is missed
    page.onScreencastFrame(event -> {
        // Do your stuff...
        // Acknowledge receipt of the frame
        page.screencastFrameAck(event.getSessionId());
    });
    // JPEG at quality 20, at most 1000x1000 pixels, every 5th frame
    page.startScreencast(StartScreencastFormat.JPEG, 20, 1000, 1000, 5);
This code uses https://github.com/kklisura/chrome-devtools-java-client
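The DevTools protocol delivers each screencast frame as a Base64-encoded image, so the handler still has to decode it before working with pixels. Here is a minimal sketch of that step using only the JDK; the class name ScreencastFrameDecoder is hypothetical, and it assumes the frame payload is read from the event with event.getData() in the library above.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.Base64;
import javax.imageio.ImageIO;

// Hypothetical helper: turns a Base64-encoded JPEG or PNG payload into a BufferedImage.
final class ScreencastFrameDecoder {
    static BufferedImage decode(String base64Data) throws IOException {
        // Undo the Base64 transport encoding first
        byte[] encoded = Base64.getDecoder().decode(base64Data);
        // ImageIO picks the JPEG or PNG reader based on the file header
        BufferedImage image = ImageIO.read(new ByteArrayInputStream(encoded));
        if (image == null) {
            // ImageIO.read returns null when no registered reader accepts the data
            throw new IOException("Payload is not a supported image format");
        }
        return image;
    }
}
```

Inside the handler this would be called as ScreencastFrameDecoder.decode(event.getData()) before acknowledging the frame. It does not remove the Base64 and JPEG decode steps, but frames are pushed only when Chrome has new content, so unchanged pages no longer cost a screenshot.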