Python Playwright
last modified January 29, 2024
In this article we show how to automate browsers in Python with Playwright.
Playwright
Playwright is a cross-broser automation library created by Microsoft. It supports all modern rendering engines including Chromium, WebKit, and Firefox.
Playwright can be used in Node, Python, .NET and JVM.
Playwright allows to use a browser in a headless mode (the default mode), which works without the UI. This is great for scripting.
$ pip install --upgrade pip $ pip install playwright $ playwright install
We install Playwright library and the browser drivers.
Python Playwright get title
In the first example, we get the title of a web page.
#!/usr/bin/python
from playwright.sync_api import sync_playwright
with sync_playwright() as playwright:
webkit = playwright.webkit
browser = webkit.launch()
page = browser.new_page()
url = 'http://webcode.me'
page.goto(url)
title = page.title()
print(title)
browser.close()
The example retrieves and prints the title of a small webpage.
from playwright.sync_api import sync_playwright with sync_playwright() as playwright: ...
We use Playwright in a synchronous manner.
webkit = playwright.webkit
We use the Webkit driver.
browser = webkit.launch() page = browser.new_page()
We launch the browser and create a new page. The default browser mode is headless; that is, no UI is shown.
url = 'http://webcode.me' page.goto(url)
We navigate to the specified URL.
title = page.title() print(title)
We get the title and print it.
$ ./main.py My html page
Python Playwright create screenshot
In the following example we create a screenshot of a web page.
#!/usr/bin/python
from playwright.sync_api import sync_playwright
with sync_playwright() as playwright:
webkit = playwright.webkit
browser = webkit.launch()
page = browser.new_page()
url = 'http://webcode.me'
page.goto(url)
page.screenshot(path='shot.png')
browser.close()
The screenshot is created with the screenshot function; the
path attribute specifies the file name.
Python Playwright async example
The next example is an asynchronous version of the previous one.
#!/usr/bin/python
import asyncio
from playwright.async_api import async_playwright
async def main():
async with async_playwright() as playwright:
webkit = playwright.webkit
browser = await webkit.launch()
page = await browser.new_page()
url = 'http://webcode.me'
await page.goto(url)
await page.screenshot(path='shot.png')
await browser.close()
asyncio.run(main())
For the asynchronous version, we use the async/await keywords and
the asyncio module.
Python Playwright set HTTP headers
With the set_extra_http_headers function, we can specify HTTP
headers for the client.
#!/usr/bin/python
from playwright.sync_api import sync_playwright
with sync_playwright() as playwright:
webkit = playwright.webkit
browser = webkit.launch()
page = browser.new_page()
page.set_extra_http_headers({"User-Agent": "Python program"})
url = 'http://webcode.me/ua.php'
page.goto(url)
content = page.text_content('*')
print(content)
browser.close()
We set the User-Agent header to the request and navigate to the
http://webcode.me/ua.php URL, which returns the User-Agent header
back to the client.
$ ./main.py Python program
Python Playwright click on element
In the next example, we click on the button element with
click. After clicking on the button, a text message is displayed
in the output div element.
#!/usr/bin/python
import time
from playwright.sync_api import sync_playwright
with sync_playwright() as playwright:
webkit = playwright.webkit
browser = webkit.launch(headless=False)
page = browser.new_page()
url = 'http://webcode.me/click.html'
page.goto(url)
time.sleep(2)
btn = page.locator('button');
btn.click()
output = page.locator('#output');
print(output.text_content())
time.sleep(1)
browser.close()
The example starts the browser.
browser = webkit.launch(headless=False)
To start the UI, we set the headless option to
False.
time.sleep(2)
We slow down the program a bit.
btn = page.locator('button');
btn.click()
We locate the button element with locator and click on it with
click.
output = page.locator('#output');
print(output.text_content())
We locate the output element and get its text content.
$ ./main.py Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 ...
Python Playwright locating elements
In the next example, we find elements with locator.
#!/usr/bin/python
from playwright.sync_api import sync_playwright
with sync_playwright() as playwright:
webkit = playwright.webkit
browser = webkit.launch()
page = browser.new_page()
url = 'http://webcode.me/os.html'
page.goto(url)
els = page.locator('ul li').all();
for e in els:
print(e.text_content())
browser.close()
The program finds all li tags and prints their content.
$ ./main.py Solaris FreeBSD Debian NetBSD Windows
Source
Python Playwright documentation
In this article we have worked with the Python Playwright library.
Author
List all Python tutorials.