I've been using Playwright to generate a PDF document from HTML, with a table of contents corresponding to the H1-H6 tags I'm using. I was hoping that bookmarks in the PDF would be generated from those tags, as WeasyPrint does automatically.

But I can't see or find a way to have them generated with Playwright. Is there any?

Here's what I'd like my PDF to look like:

(Image: PDF opened in a reader, with a list of bookmarks corresponding to the different titles)

I'm using Playwright for Python through pytest-playwright.

I've tried the "tagged" option of the page.pdf function, but it seems to generate only the accessibility tags, not the bookmarks.

page.pdf(path=path, tagged=True) cf. playwright.dev

I've been looking up "Playwright PDF bookmark" and "Playwright PDF table of content" with no success.

Comments

  • Did you try the outline option playwright.dev/python/docs/api/…? Commented Oct 9 at 13:47
  • Thanks! I feel embarrassed now. It seems that it needs both tagged and outline to work as expected. Commented Oct 9 at 16:06
  • Feel free to answer your own question 😉 Commented Oct 9 at 17:05
  • Please don't edit answers into questions--I rolled back the 'Solved: Using both "tagged=True" and "outline=True" is doing the trick!'. Please add a self answer. Thanks! Commented Oct 9 at 21:43
  • Solved thanks to @phuzi: Using both "tagged=True" and "outline=True" is doing the trick! page.pdf(outline=True, tagged=True) Commented Oct 11 at 8:25

2 Answers

As mentioned in the comments, page.pdf(outline=True, tagged=True) will embed the document outline (the bookmarks) in the PDF. Per the docs:

outline bool (optional) Added in: v1.42

Whether or not to embed the document outline into the PDF. Defaults to false.

Here's a minimal, complete example:

from playwright.sync_api import sync_playwright  # playwright==1.55.0


# The nested headings below are what end up as bookmarks in the PDF outline.
html = """<!DOCTYPE html><html><body>
<h1>header 1</h1>
<h2>header 2.1</h2>
<h3>header 3.1</h3>
<h3>header 3.2</h3>
<h2>header 2.2</h2>
</body></html>"""

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.set_content(html)
    # Per the comments, both outline and tagged are needed for the bookmarks.
    page.pdf(path="out.pdf", outline=True, tagged=True)
    browser.close()
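Since the question uses pytest-playwright, the same call can be wrapped in a small helper that takes the `page` fixture. This is only a sketch: `save_pdf_with_bookmarks` is a hypothetical name, not part of Playwright, and it assumes nothing more than an object with a `pdf()` method accepting `path`, `outline`, and `tagged`.

```python
def save_pdf_with_bookmarks(page, path):
    """Export `page` to `path` as a PDF with bookmarks.

    Hypothetical helper (not a Playwright API): `page` is expected to be a
    Playwright Page, for example the `page` fixture from pytest-playwright.
    """
    # Per the comments on the question, both flags are needed:
    # `outline` embeds the document outline, `tagged` adds the tag structure.
    return page.pdf(path=path, outline=True, tagged=True)
```

In a pytest-playwright test this could be called as `save_pdf_with_bookmarks(page, tmp_path / "out.pdf")` after `page.set_content(html)`.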

Result viewed in Acrobat:

(Image: screenshot of Adobe Acrobat showing the bookmarks alongside the PDF)
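If no PDF reader is at hand, a rough check on the generated file is to look for an `/Outlines` entry in its raw bytes. The sketch below is a crude heuristic, not a PDF parser: `has_outline_marker` is a hypothetical helper, and it assumes the document catalog is stored uncompressed, so it can give a false negative and it does not confirm that the outline has any entries.

```python
def has_outline_marker(pdf_bytes: bytes) -> bool:
    """Return True if the raw PDF bytes contain an /Outlines name.

    Heuristic only (an assumption, not a parser): it inspects the raw
    bytes and will miss an outline stored inside a compressed object.
    """
    return b"/Outlines" in pdf_bytes
```

For the file written by the example above, this could be called as `has_outline_marker(open("out.pdf", "rb").read())`.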


The whole task can be done with one command line in Windows, without Chrome dev or Selenium/Playwright. While MS Edge v141 currently has some problems running headless, chrome-headless-shell is the Chromium team's preferred suggestion.

To use the current MS Edge without any other app, you need this format:

start msedge --headless=new --user-data-dir="%cd%\temp" --log-level=3 --run-all-compositor-stages-before-draw --virtual-time-budget=80000 --no-pdf-header-footer --generate-pdf-document-outline --print-to-pdf="%cd%\out.pdf" "in.html"

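With chrome-headless-shell, the equivalent is a batch file kept in the same folder as the executable and given the HTML file as its first argument: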

"%~dp0chrome-headless-shell.exe" --generate-pdf-document-outline --run-all-compositor-stages-before-draw --virtual-time-budget=80000 --no-pdf-header-footer --print-to-pdf="%~dpn1.pdf" "%~1"
pause
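For anyone who would rather drive chrome-headless-shell from Python than from a batch file, the same invocation can be sketched with `subprocess`. The flags are copied from the batch file above; the function names `build_pdf_command` and `html_to_pdf` are hypothetical, and the executable path is whatever applies on your machine.

```python
import subprocess
from pathlib import Path


def build_pdf_command(exe: str, html: str, pdf: str) -> list[str]:
    """Build the argument list, mirroring the flags of the batch file above."""
    return [
        exe,
        "--generate-pdf-document-outline",  # the flag that embeds the bookmarks
        "--run-all-compositor-stages-before-draw",
        "--virtual-time-budget=80000",
        "--no-pdf-header-footer",
        f"--print-to-pdf={pdf}",
        html,
    ]


def html_to_pdf(exe: str, html: str) -> Path:
    """Write the PDF next to `html` with a .pdf suffix, like %~dpn1.pdf."""
    pdf = Path(html).with_suffix(".pdf")
    # check=True raises CalledProcessError on a non-zero exit code.
    subprocess.run(build_pdf_command(exe, html, str(pdf)), check=True)
    return pdf
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.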

Much as in a LaTeX workflow, you can use SumatraPDF to show HTML text edits live*, then press one key to convert the live HTML view into a live preview of the PDF result. So one MS Notepad window and two SumatraPDF previews, with chrome-headless-shell or Edge, would do it.

  • SumatraPDF does have limitations when viewing more complex HTML (unless in simple ePub format), but it is good for small, simple files, either local or on the web.

Helper script to download and unpack the 32-bit chrome-headless-shell into a work folder (part of a suite of my fetchers):

:: Create working directory as required & fetch binary
echo No chrome-headless-shell found. Fetching binary
md "%~dp0..\CHShell"
cd /d "%~dp0..\CHShell"
set "download=chrome-headless-shell-win32"
set "variant=win32/%download%.zip"
curl -o LATEST_RELEASE_STABLE.txt https://googlechromelabs.github.io/chrome-for-testing/LATEST_RELEASE_STABLE
set /p version=<LATEST_RELEASE_STABLE.txt
curl -O https://storage.googleapis.com/chrome-for-testing-public/%version%/%variant%
tar -xf "%download%.zip"
xcopy /E /H /K /Y %download% %cd%
rd /s /q %download%

:: Verify
if exist "chrome-headless-shell.exe" (
    echo chrome-headless-shell downloaded successfully.
REM for debug test pause
    exit /b 0
) else (
    echo ERROR: chrome-headless-shell.exe not found.
    pause & exit /b 1
)
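The URL handling in the script above can also be sketched in Python. Both URLs are the ones used by the batch file; `build_download_url` and `fetch_latest_version` are hypothetical names, and the sketch stops at building the download URL rather than unpacking the archive.

```python
import urllib.request

# Both endpoints are taken from the batch script above.
BASE = "https://storage.googleapis.com/chrome-for-testing-public"
LATEST = "https://googlechromelabs.github.io/chrome-for-testing/LATEST_RELEASE_STABLE"


def build_download_url(version: str, platform: str = "win32") -> str:
    """Return the chrome-headless-shell zip URL for a version and platform."""
    name = f"chrome-headless-shell-{platform}"
    return f"{BASE}/{version.strip()}/{platform}/{name}.zip"


def fetch_latest_version() -> str:
    """Read the latest stable version string (requires network access)."""
    with urllib.request.urlopen(LATEST) as resp:
        return resp.read().decode("ascii").strip()
```

Calling `build_download_url(fetch_latest_version())` gives the same URL that the `curl -O` line in the script requests.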
