Allow passing arbitrary HTML and getting an image of the rendered webpage #21

minimaxir · 2020-07-25T19:19:23Z

I am interested in using Kaleido as a backend for my imgmaker package, which currently uses Selenium + stable Chrome. The benefits listed in the README for this package would be well suited toward improving UX for mine.

However, the current scopes are more hardcoded toward Plotly's implementation (at the C level, which means it can't be addressed at the Python level). A Scope that simply takes in arbitrary HTML and returns an image of the generated webpage (including using width/height parameters to resize the browser) would be extremely useful for a number of image automation tasks, not just my own.

jonmmease · 2020-07-27T11:29:58Z

Hi @minimaxir, thanks for reaching out. imgmaker looks really cool.

Yeah, this is a good use case for Kaleido. Looks like the Chromium C++ layer has a captureScreenshot method that can export to PNG or JPG. There is also the PrintToPDF method that we're already using, which can generate a PDF from the current page.

I'm thinking through whether there is a clean way to have this fit in the current architecture, or whether we would want to create a new processing path for this kind of thing.

This actually isn't too different from the current PDF processing path. For PDF, the JavaScript side of Plotly scope generates an SVG of the plot and then sets this as the src of an <img> tag. The C++ side then uses PrintToPDF to convert this SVG into a PDF.

Rather than having the JavaScript side create an SVG and set it as the src of an <img> tag, I wonder if it would work for the JavaScript side to accept the raw HTML content and set this as the srcdoc attribute of an <iframe> with the requested width/height. Then the C++ side could snapshot this using either captureScreenshot for PNG/JPG, or PrintToPDF for PDF. With the PDF format, you might even be able to get the text portion of your images in a vector format.
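To make the iframe idea concrete, here is a minimal sketch in Python of how a host page could be built around the raw HTML. The helper name `wrap_in_iframe` and the zero-margin styling are assumptions made for illustration; nothing like this exists in Kaleido today. The one real detail it shows is that `srcdoc` is an attribute value, so the HTML has to be escaped before it is embedded.

```python
import html


def wrap_in_iframe(raw_html: str, width: int, height: int) -> str:
    """Return a host page whose only content is an <iframe> showing raw_html.

    Hypothetical helper for illustration only; not part of Kaleido.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # srcdoc is an attribute value, so &, <, > and quotes must be escaped.
    srcdoc = html.escape(raw_html, quote=True)
    return (
        "<!DOCTYPE html>\n"
        '<html><body style="margin:0">\n'
        f'<iframe srcdoc="{srcdoc}" width="{width}" height="{height}" '
        'style="border:0"></iframe>\n'
        "</body></html>\n"
    )
```

For example, `wrap_in_iframe("<h1>Hello</h1>", 800, 600)` returns a page that the snapshot step could then capture at the requested size.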

One general tricky/annoying part of handling arbitrary HTML is how to decide when the JavaScript on the page has actually finished loading. Is this an issue for you? Or does imgmaker deal with HTML/CSS only? I suppose there could just be an optional load-time argument that would wait a certain amount of time after the page loads before taking the snapshot.
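As a rough sketch of the optional load-time idea: wait until the page reports that it has loaded, pause for an extra settle period, then take the snapshot. Everything here is hypothetical, including the names `wait_then_snapshot`, `page_loaded`, `take_snapshot`, and `load_time`. In Kaleido the real check and capture would live on the JavaScript/C++ side, so this only illustrates the control flow.

```python
import time
from typing import Callable


def wait_then_snapshot(
    page_loaded: Callable[[], bool],
    take_snapshot: Callable[[], bytes],
    load_time: float = 0.0,
    timeout: float = 30.0,
    poll_interval: float = 0.05,
) -> bytes:
    """Poll until the page has loaded, wait load_time seconds, then capture.

    Hypothetical sketch; page_loaded and take_snapshot stand in for whatever
    the browser layer would actually expose.
    """
    deadline = time.monotonic() + timeout
    while not page_loaded():
        if time.monotonic() >= deadline:
            raise TimeoutError("page did not finish loading in time")
        time.sleep(poll_interval)
    # Extra settle time so scripts on the page can finish rendering.
    if load_time > 0:
        time.sleep(load_time)
    return take_snapshot()
```

A fixed delay is simple but blunt: too short and the snapshot is incomplete, too long and every export pays the cost, which is why it would make sense as an opt-in argument.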

jonmmease · 2020-07-28T11:30:12Z

Here's a long-term/long-shot idea that would make this, and many other use-cases, possible: #28

rubmz · 2020-07-28T12:46:05Z

Kaleido with Plotly does an amazing job of creating PDFs in very low-resource environments, like our headless Linux server (I do the dev on Windows, where it also performs perfectly out of the box, without any configuration, which just isn't the case with other PDF packages!). I think that if you opened the package up for arbitrary HTML -> PDF, there would be buyers 👍

jonmmease mentioned this issue Jul 28, 2020

Make available for general chart/html/report conversion to pdf #27

Closed

jonmmease added the feature something new label Jul 28, 2020

brunotjuliani mentioned this issue May 10, 2021

ValueError: Failed to start Kaleido subprocess #90

Closed

god7i11a mentioned this issue Nov 16, 2022

(just) importing plotly.express breaks kaleido (on some machines) #152

Open

gvwilson self-assigned this Jul 26, 2024

gvwilson removed their assignment Aug 3, 2024

gvwilson added the P3 not needed for current cycle label Aug 14, 2024
