
Selenium screenshot performance

I am capturing animations that run in a web page so that I can process them in video software. I am currently using Selenium 4 with Chrome and I'd like to achieve better performance.

I could not find a way to leverage the Page.startScreencast method via Selenium, so my current approach is simply taking periodic screenshots using

screenshotAs = ((TakesScreenshot) d).getScreenshotAs(OutputType.BYTES);
BufferedImage bi = ImageIO.read(new ByteArrayInputStream(screenshotAs));

Having profiled this with VisualVM, I can see that a significant share of the time is spent in getScreenshotAs, which seems to talk to Chrome via HTTP even if I use a non-remote driver instance.

On top of that, the screenshot is returned as a Base64-encoded string, which is then decoded into a byte[] that ImageIO has to read. As I need to work with the raw image, this is very inefficient, and I wonder if there is a better, lower-level way to get at the image data. I'd like to eliminate the PNG round trip if possible, and ideally the Base64 one as well.

I am also not notified when changes happen (unlike with startScreencast), so I cannot avoid taking unnecessary screenshots when nothing has changed.

Is there any better way to do this that I am missing? Maybe JCEF would allow me to do this more efficiently? (I could not find this in their docs)

I was also looking at how OBS does it with CEF in its browser-source implementation, but unfortunately I cannot really see how the image data is taken out of the CEF browser, as I am not familiar with C++ at that level.



I too have struggled to call startScreencast through the Selenium interface. However, there are other libraries that allow you to do this; Selenium 4 is just a helper wrapper that abstracts away some of the complexity.

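    ChromeDriver driver = new ChromeDriver();

    // Chrome reports its DevTools endpoint as "host:port" in the driver capabilities.
    String addr = ((Map<String, String>) driver.getCapabilities().getCapability("goog:chromeOptions")).get("debuggerAddress");
    String[] hostAndPort = addr.split(":");

    ChromeService cs = new ChromeServiceImpl(hostAndPort[0], Integer.parseInt(hostAndPort[1]));

    final ChromeTab tab = cs.getTabs().get(0);
    final ChromeDevToolsService devToolsService = cs.createDevToolsService(tab);
    final Page page = devToolsService.getPage();

    // Assumption: the original snippet is cut off around the handler, so the
    // registration call and the acknowledgement below are a best reconstruction.
    // The handler is registered before starting so that no frame is missed.
    page.onScreencastFrame(event -> {
        // Do your stuff...
        page.screencastFrameAck(event.getSessionId());
    });

    // JPEG, quality 20, at most 1000x1000 pixels, every 5th frame.
    page.startScreencast(StartScreencastFormat.JPEG, 20, 1000, 1000, 5);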

This code uses
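
As a rough illustration of what the "do your stuff" part could look like, here is a minimal sketch that turns one screencast frame into a BufferedImage using only the JDK. It assumes the handler gives you the frame as a Base64 string, which is how the DevTools protocol's Page.screencastFrame event delivers its data field. The class ScreencastFrameDecoder and its method decodeIfChanged are hypothetical names made up for this example, and the byte comparison is just a cheap guard against re-processing an identical frame, not something the protocol requires.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Base64;
import javax.imageio.ImageIO;

// Hypothetical helper; not part of Selenium or of any DevTools client library.
final class ScreencastFrameDecoder {

    // Encoded bytes of the last frame that was actually decoded.
    private byte[] previous = new byte[0];

    /**
     * Decodes a Base64-encoded JPEG or PNG frame.
     * Returns null when the frame is byte-for-byte identical to the previous one.
     */
    BufferedImage decodeIfChanged(String base64Frame) {
        byte[] encoded = Base64.getDecoder().decode(base64Frame);
        if (Arrays.equals(encoded, previous)) {
            return null; // nothing changed, skip the expensive image decode
        }
        previous = encoded;
        try {
            BufferedImage image = ImageIO.read(new ByteArrayInputStream(encoded));
            if (image == null) {
                throw new IllegalArgumentException("Frame is not a readable image");
            }
            return image;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a tiny JPEG in memory to stand in for a real screencast frame.
        BufferedImage source = new BufferedImage(4, 3, BufferedImage.TYPE_INT_RGB);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ImageIO.write(source, "jpg", out);
        String frame = Base64.getEncoder().encodeToString(out.toByteArray());

        ScreencastFrameDecoder decoder = new ScreencastFrameDecoder();
        BufferedImage first = decoder.decodeIfChanged(frame);
        System.out.println("first frame: " + first.getWidth() + "x" + first.getHeight());
        System.out.println("second frame skipped: " + (decoder.decodeIfChanged(frame) == null));
    }
}
```

Inside the frame handler you would pass the frame's data string to decodeIfChanged and ignore a null result. This still pays for one Base64 decode and one JPEG decode per frame, so it reduces the overhead from the question rather than removing it.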
