Java Playwright using connect with Proxy for browserless

I want to use Playwright's connect() method with a proxy to consume Browserless. According to the Browserless docs:

https://docs.browserless.io/docs/playwright.html

Playwright official website

The standard connect method uses playwright’s built-in browser-server to handle the connection. This, generally, is a faster and more fully-featured method since it supports most of the playwright parameters (such as using a proxy and more). However, since this requires the usage of playwright in our implementation, things like ad-blocking and stealth aren’t supported. In order to utilize those, you’ll need to see our integration with connectOverCDP.

I assumed connect() would have a .setProxy() option, like launch() does:

browserType.launch(new BrowserType.LaunchOptions().setProxy(proxy));

But the connect method has two variations:

default Browser connect(String wsEndpoint) {
   return connect(wsEndpoint, null);
}
Browser connect(String wsEndpoint, ConnectOptions options);

So I figured I would pick connect with ConnectOptions, assuming it surely has a .setProxy as well, but it doesn't.

class ConnectOptions {
 public Map<String, String> headers;
 public Double slowMo;
 public Double timeout;

 public ConnectOptions setHeaders(Map<String, String> headers) {
   this.headers = headers;
   return this;
 }

 public ConnectOptions setSlowMo(double slowMo) {
   this.slowMo = slowMo;
   return this;
 }
 public ConnectOptions setTimeout(double timeout) {
   this.timeout = timeout;
   return this;
 }
}

I have tried this:

final Browser.NewContextOptions browserContextOptions = new Browser.NewContextOptions().setProxy(proxy);
Browser browser = playwright.chromium()
            .connect("wss://&--proxy-server=http://myproxyserver:1111")
            .newContext(browserContextOptions)
            .browser();
browser.newPage("resource");

But the proxy responds that authentication is required.

I’m confused now. Browserless says that .connect can be used with a proxy, but how? Is Browserless wrong, or am I missing something? I’m new to this technology.

I have also tried using page.setExtraHTTPHeaders:

private void applyProxyToPage(final Page page, final String userPassCombination) {
    final String value = "Basic " + Base64.getEncoder().encodeToString(userPassCombination.getBytes(Charset.forName("UTF-8")));
    page.setExtraHTTPHeaders(Collections.singletonMap("Authorization", value));
    // page.setExtraHTTPHeaders(Collections.singletonMap("Proxy-Authorization", value)); // Not working either
}
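
For reference, the header value built in that attempt can be checked in isolation. The following is a minimal sketch using only the JDK; the class name BasicAuthHeader and the sample credentials are made up for illustration. It only shows how a Basic value is derived from a "user:password" string, and says nothing about whether a remote browser forwards such a header to the proxy.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper (not part of Playwright or Browserless).
final class BasicAuthHeader {
    private BasicAuthHeader() {}

    // Builds a Basic auth header value from a "user:password" string.
    static String basic(String userPassCombination) {
        byte[] raw = userPassCombination.getBytes(StandardCharsets.UTF_8);
        return "Basic " + Base64.getEncoder().encodeToString(raw);
    }

    public static void main(String[] args) {
        // Sample credentials, for illustration only.
        System.out.println(basic("user:pass")); // prints "Basic dXNlcjpwYXNz"
    }
}
```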

Answer

With the help of my friend Alejandro Loyola at Browserless, I am now able to connect. I will post the snippet:

private String navigateWithPlaywrightInBrowserlessWithProxy(final String token, final String proxyHost, final String userName, final String userPass, final String url) {
    final Browser.NewContextOptions browserContextOptions = new Browser.NewContextOptions()
            .setProxy(new Proxy(proxyHost)
                    .setUsername(userName)
                    .setPassword(userPass)); // Raw password, not encoded in any way
    try (final Playwright playwright = Playwright.create();
         final Browser browser = playwright.chromium().connectOverCDP("wss://chrome.browserless.io?token=" + token);
         final BrowserContext context = browser.newContext(browserContextOptions)) {
        Page page = context.newPage();
        page.route("**/*.svg", Route::abort);
        page.route("**/*.png", Route::abort);
        page.route("**/*.jpg", Route::abort);
        page.route("**/*.jpeg", Route::abort);
        page.route("**/*.css", Route::abort);
        page.route("**/*.scss", Route::abort);
        page.navigate(url, new Page.NavigateOptions()
                .setWaitUntil(WaitUntilState.DOMCONTENTLOADED));
        return page.innerHTML("body");
    }
}
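
As a side note, the six page.route calls in the snippet could probably be collapsed into one, since Page.route in Playwright for Java also accepts a java.util.regex.Pattern. The sketch below is an assumption-laden illustration rather than part of the original answer: the class name BlockedResources and the regular expression are made up here, and unlike the globs above, the expression also matches URLs that carry a query string after the extension.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not part of Playwright): one pattern for the blocked extensions.
final class BlockedResources {
    private BlockedResources() {}

    // Matches URLs ending in .svg, .png, .jpg, .jpeg, .css or .scss,
    // optionally followed by a query string (an addition over the original globs).
    static final Pattern BLOCKED = Pattern.compile(".*\\.(svg|png|jpe?g|s?css)(\\?.*)?$");

    static boolean isBlocked(String url) {
        return BLOCKED.matcher(url).matches();
    }

    public static void main(String[] args) {
        System.out.println(isBlocked("https://example.com/logo.png"));
        System.out.println(isBlocked("https://example.com/index.html"));
    }
}
```

With Playwright on the classpath, the pattern would be registered as page.route(BlockedResources.BLOCKED, Route::abort); in place of the individual glob routes.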

My gotchas were as follows.

I was using:

"wss://chrome.browserless.io/playwright?token="

Instead of:

"wss://chrome.browserless.io?token="

And I had to use:

connectOverCDP
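
To keep the two endpoints from getting mixed up again, the URL can be built in one place. This is a small sketch using only the JDK; the class name BrowserlessEndpoint and the sample tokens are hypothetical, and URL-encoding the token is an extra precaution that the snippet above does not take (it concatenates the raw token).

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of Playwright or Browserless).
final class BrowserlessEndpoint {
    private BrowserlessEndpoint() {}

    // Base endpoint taken from the answer: no "/playwright" path segment.
    static final String BASE = "wss://chrome.browserless.io";

    // Builds the WebSocket URL to pass to connectOverCDP.
    static String cdp(String token) {
        return BASE + "?token=" + URLEncoder.encode(token, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Sample token, for illustration only.
        System.out.println(cdp("abc123")); // prints "wss://chrome.browserless.io?token=abc123"
    }
}
```

The result would then be used as playwright.chromium().connectOverCDP(BrowserlessEndpoint.cdp(token)).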
User contributions licensed under: CC BY-SA