Allow passing arbitrary HTML and getting an image of the rendered webpage #21

Open
minimaxir opened this issue Jul 25, 2020 · 3 comments
Labels
feature (something new), P3 (not needed for current cycle)

Comments

@minimaxir

minimaxir commented Jul 25, 2020

I am interested in using Kaleido as a backend for my imgmaker package, which currently uses Selenium + stable Chrome. The benefits listed in the README for this package would be well suited toward improving UX for mine.

However, the current scopes are more hardcoded toward Plotly's implementation (at the C level, which means it can't be addressed at the Python level). A Scope that simply takes in arbitrary HTML and returns an image of the generated webpage (including using width/height parameters to resize the browser) would be extremely useful for a number of image automation tasks, not just my own.

@jonmmease
Collaborator

Hi @minimaxir, thanks for reaching out. imgmaker looks really cool.

Yeah, this is a good use case for Kaleido. Looks like the Chromium C++ layer has a captureScreenshot method that can export to PNG or JPG. There is also the PrintToPDF method that we're already using, which can generate a PDF from the current page.

I'm thinking through whether there is a clean way to have this fit in the current architecture, or whether we would want to create a new processing path for this kind of thing.

This actually isn't too different from the current PDF processing path. For PDF, the JavaScript side of Plotly scope generates an SVG of the plot and then sets this as the src of an <img> tag. The C++ side then uses PrintToPDF to convert this SVG into a PDF.

Rather than having the JavaScript side create an SVG and set it as the src of an <img> tag, I wonder if it would work for the JavaScript side to accept the raw HTML content and set this as the srcdoc attribute of an <iframe> with the requested width/height. Then the C++ side could snapshot this using either captureScreenshot for PNG/JPG, or PrintToPDF for PDF. With the PDF format, you might even be able to get the text portion of your images in a vector format.
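
To make the srcdoc idea concrete, here is a minimal sketch in Python of how a host page could be built around arbitrary HTML. The function name `wrap_in_iframe` and the page layout are assumptions for illustration only; nothing like this exists in Kaleido today, and the real version would more likely live on the JavaScript side. The main point is that srcdoc is an attribute value, so the raw HTML has to be escaped before it is embedded.

```python
from html import escape


def wrap_in_iframe(raw_html: str, width: int, height: int) -> str:
    """Return a host page that embeds raw_html in a fixed-size iframe.

    Hypothetical helper, not part of Kaleido.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # srcdoc is an attribute value, so &, <, > and quotes must be escaped.
    srcdoc = escape(raw_html, quote=True)
    return (
        "<!DOCTYPE html>"
        '<html><body style="margin:0">'
        f'<iframe srcdoc="{srcdoc}" width="{width}" height="{height}" '
        'style="border:0"></iframe>'
        "</body></html>"
    )


if __name__ == "__main__":
    print(wrap_in_iframe('<p class="a">Hi & bye</p>', 800, 600))
```

The resulting page could then be handed to whichever snapshot call matches the requested format.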

One general tricky/annoying part of handling arbitrary HTML is how to decide when the JavaScript on the page has actually finished loading. Is this an issue for you? Or does imgmaker deal with HTML/CSS only? I suppose there could just be an optional load-time argument that would wait a certain amount of time after the page loads before taking the snapshot.
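
As a rough sketch of what an optional load-time argument could look like (in Python, with hypothetical names): `snapshot_after_delay` and `take_snapshot` are made up for illustration and are not Kaleido APIs. The callback stands in for whatever actually captures the page. A fixed delay is the simplest option; it does not detect whether the page's JavaScript has really finished.

```python
import time
from typing import Callable


def snapshot_after_delay(
    take_snapshot: Callable[[], bytes],
    load_time: float = 0.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Wait load_time seconds after page load, then take the snapshot.

    Hypothetical helper: take_snapshot is a placeholder for the real
    capture call, and sleep is injectable so the delay can be tested.
    """
    if load_time < 0:
        raise ValueError("load_time must be non-negative")
    if load_time > 0:
        # Give scripts on the page a chance to finish rendering.
        sleep(load_time)
    return take_snapshot()
```

For HTML/CSS-only pages the default of 0 would skip the wait entirely, so only pages with scripts pay the extra time.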

@jonmmease
Collaborator

Here's a long-term/long-shot idea that would make this, and many other use-cases, possible: #28

@rubmz

rubmz commented Jul 28, 2020

Kaleido with Plotly does an amazing job at creating PDFs in very low-resource environments, like our headless Linux server (I do the dev on Windows, where it also performs perfectly out of the box, without any configuration, which just isn't the case with other PDF packages!). I think that if you opened the package up for arbitrary HTML -> PDF, there would be buyers 👍
