Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] core(driver): Implement a "pierce iframes" option in settings & CLI #9566

Closed
wants to merge 16 commits into from

Conversation

warrengm
Copy link
Contributor

Summary
Introduce a --pierce-iframes flag to the CLI and pierceIframes option to config which allows Lighthouse to use FCP events from iframes.

Setting this flag affects two cases:

  1. Pages without any content elements (text, images, other media) in the top-level but have an iframe. With this flag, LH will report the earliest FCP time of any iframe instead of failing with a NO_FCP error. In practice, this case has applications to measuring ad performance in isolation.
  2. Pages where an iframe paints some content element before any top-level content elements. With this flag, LH will report the FCP time of that iframe (relative to navigationStart of the top level) rather than the FCP time of the top level. This case may have practical applications to portal-based pages.

In progress
Tests

Open questions
Is it possible to (easilly) wait on FCP inside an iframe? The Page.lifecycleEvent in CRDP only applies to the top level page so there doesn't seem to be a good way to check for an iframes FCP. I briefly tried waiting on a firstContentfulPaint trace event to fire, but doing so would entangle much of the tracing code in driver.js. In the absence of a good solution, setting this flag disables FCP checks so the driver solely relies on waiting on load, network idleness, and CPU idleness.

Copy link
Collaborator

@patrickhulce patrickhulce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks very much for the contribution! :)

I might be missing a little context here, but why can't the ad be audited directly with perhaps different emulation settings for the correct ad size? i.e. what is the end goal of using this flag.

There are a lot of things that won't be accounted for with the change as it is (all the main thread identification stuff for example will be wrong for OOPIFs which I expect ads to mostly be) which means it's essentially only FCP that will work for iframes with this change and none of the other metrics. Is that sufficient? If it is, could we solve it another way to just identify and surface the FCP values of actual child frames rather than every FCP trace event we saw?

const frameEvents = keyEvents.filter(e => e.args.frame === mainFrameIds.frameId);
// Filter to just events matching the main (top-level) frame ID for sanity, unless we're
// piercing iframes
const frameEvents = settings.pierceIframes ?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little nervous about all the other assumptions that might fall through with this change, the flag feels a little less like pierceIframes and a little more like --look-at-all-events, which if the user is always running just a single tab in their chrome instance is probably the same thing, but definitely dangerous in other situations.

Maybe this just means we very carefully document usage of this flag?

@paulirish
Copy link
Member

I wanted to explore another alternative here.

My understanding is you want to know FCP/FP values for various iframes.

Perhaps instead of the trace.. we get that data via paint timing API.. very roughly:

$$('iframe[id^=google]')
    .map(i => i.contentWindow.performance.getEntriesByType('paint'))

Has this been considered? I believe these times would be baselined against iframe creation, rather than the parent page's timeOrigin... so (assuming you want that full duration) resolving that would be a little tricky. But is using Paint Timing vs the trace a solution to consider?

@paulirish
Copy link
Member

paulirish commented Aug 16, 2019

Introduce a --pierce-iframes flag to the CLI

A slight alternative to this approach would be that TraceOfTab also includes a eventsByFrameId property which basically the same object shape, but keyed by frameId.

something roughly like this:
image

we could just make this change, no special flag necessary.

@warrengm
Copy link
Contributor Author

This change isn't intended to be used for publisher ads audits, but for testing creatives. That said, that approach could work well for #8979.

This change is primarily intended to help users (i.e. advertisers or creative agencies, not publishers) who want to test individual ads in isolation (by running the ad in a top-level context, without any publisher ad tag like GPT). Lighthouse can work as-is for most of these purposes, but many ads load assets inside yet another child iframe. Additionally those users would be interested in seeing FCP metrics that reflect assets loaded inside of iframes. However, many audits fail with NO_FCP errors because there won't be a top-level firstContentfulPaint when those ad creatives run.

Since these users are interested in FCP and other core performance metrics, I feel like this change belongs in core LH (behind a flag) rather than introducing a new plugin. But maybe there should be an ad creative performance plugin?

I haven't fully thought through all of the implications on other metrics yet, but I can see potential impact there. I think most audits today already do not consider iframe boundaries (e.g. byte-efficiency audits). Additionally, some audits should fundamentally not pierce iframes. For example, Main Thread Work should not pierce OOP iframes by definition (I think I need to fix the code upon second examination).

I'd be comfortable changing the scope of the metric to something like --metrics-pierce-iframes.

@paulirish
Copy link
Member

Ah gotcha. The usecase definitely helps. :)

We've also noticed that websites that use <frameset> fail... eg. http://abehiroshi.la.coocan.jp/

Thinking about the core problem here.. perhaps we should consider the first FCP in the renderer thread, regardless of frameid. That would solve the frameset problem and perhaps your situation.

Actually... we could broaden that same approach to include any FCP in the frame tree (same process or OOP).

Being that we're just trying to get the first paint within the page (ignoring any other pages, etc), i think the semantics here are still good. @patrickhulce that sound right to you?

@patrickhulce
Copy link
Collaborator

Being that we're just trying to get the first paint within the page (ignoring any other pages, etc), i think the semantics here are still good. @patrickhulce that sound right to you?

Yeah, that sounds right. I'm definitely onboard for making lighthouse not throw NO_FCP if it's auditing a page with just an iframe that has an FCP. It feels like that's kind of a bug in FCP and we're just working around it 👍

As for the rest of the performance metrics though, I'm not sure how to handle this situation, but I'm fairly convinced just fixing FCP isn't sufficient. If we're explicitly wanting to understand the performance of an ad, then we should be ignoring the OOPIF boundary and doing something more like "treat all events in frame tree as main thread events", right? I've made the joke on several occasions that the quickest way to a 100 Lighthouse score is to just stick that page in an OOPIF. This obviously doesn't improve the user experience for that page whatsoever yet all the blocking work now happens "off the main thread", it's just a different thread that becomes the important one :)

Does that make sense or am I misunderstanding the test setup for these ads?

@warrengm
Copy link
Contributor Author

That sounds right, @patrickhulce. Do you have any references for the discussions around main thread work?

It sounds like we're agreed about limiting this change to paint metrics. Do you still want this behind a flag (--pierce-iframes-for-paint?). Two issues to consider are:

  1. I think speed index metric effectively ignores iframes already
  2. The First Contentful Paint definition in the Paint Timing spec explicitly ignores iframes: https://w3c.github.io/paint-timing/#sec-terminology

Given (2) I'm inclined to keep this change behind a flag but I can see how the behavior is not obvious for many Lighthouse users.

@patrickhulce
Copy link
Collaborator

Do you have any references for the discussions around main thread work?

I'm not sure I understand what you mean, which discussions are you referring to? :)

I think speed index metric effectively ignores iframes already

I don't think this is true. Auditing https://lh-iframe-screenshots-test.glitch.me/ shows the iframe in the screenshots and we base our speed index calculation directly off of those. The simulated version of speed index relies somewhat on main thread activity and so would be less affected by iframe activity but still uses observed speed index as an input so they definitely aren't ignored entirely.

The First Contentful Paint definition in the Paint Timing spec explicitly ignores iframes: https://w3c.github.io/paint-timing/#sec-terminology

Yeah this makes complete sense from a spec perspective since it would be impossible to determine the per-frame FCP if it were already merged at the browser level but it's simple for consumers like us to arrive at the merged version. I could see an argument being made for making this new proposal the default LH behavior but agreed behind a flag is safer 👍

It sounds like we're agreed about limiting this change to paint metrics.

I think so. If we're talking about making Lighthouse not completely fail in the presence of solo iframes, I agree we only need to limit a change to the paint metrics. If we're talking about presenting performance metrics on ads with a solo iframe test harness, then I feel strongly that we need to adjust all the metrics and not just FCP. I feel like stopping with FCP would lead to a misleadingly rosy picture of the performance impact of ads.

@warrengm
Copy link
Contributor Author

Thanks for the feedback!

I'm not sure I understand what you mean, which discussions are you referring to? :)

I'm curious if your joke came up in any previous conversations or issues. I'm mainly interested if there is any more context I can benefit from.

I don't think this is true. Auditing https://lh-iframe-screenshots-test.glitch.me/ shows the iframe in the screenshots and we base our speed index calculation directly off of those.

Ah, sorry for not being clear--I meant Lighthouse ignores iframe boundaries for speed index. I'm also basing this assumption on the fact that iframes show up on screenshots.

Yeah this makes complete sense from a spec perspective since it would be impossible to determine the per-frame FCP if it were already merged at the browser level but it's simple for consumers like us to arrive at the merged version. I could see an argument being made for making this new proposal the default LH behavior but agreed behind a flag is safer 👍

I'm OK with the merged version.

It does introduce a possibility of discrepancies between RUM data and LH data. This would only happen if the iframe's contentful paint happens before the page's contentful paint--which I don't anticipate happening on real pages. Additional, users should presumably be aware of this issue if the change is gated behind a flag.

I think so. If we're talking about making Lighthouse not completely fail in the presence of solo iframes, I agree we only need to limit a change to the paint metrics. If we're talking about presenting performance metrics on ads with a solo iframe test harness, then I feel strongly that we need to adjust all the metrics and not just FCP. I feel like stopping with FCP would lead to a misleadingly rosy picture of the performance impact of ads.

Totally agree that CPU costs need to be highlighted. The simplest way I see of doing this is to use trace.traceEvents rather than mainThreadEvents in computed/main-thread-tasks.js when the flag is set. I don't like that the behavior mismatches the name, but it's a nice way to control this centrally. (Maybe we should deal with naming separately?)

But this seems like an issue for pages as well. I hope that Lighthouse highlights the cost of having 10 rich media ads on a page. Was this taken into account when introducing MainThreadTasks? (It might not be an issue for FID metrics but seems like an issue for bootup-time and other audits)

Regarding your earlier comment, but I want to emphasize many many ads are still rendered in a same-origin friendly iframe, so they won't be OOPIF. Whether an ad is rendered in a cross-domain is largely based on ad selection so it's not always deterministic.

@patrickhulce
Copy link
Collaborator

I'm curious if your joke came up in any previous conversations or issues. I'm mainly interested if there is any more context I can benefit from.

Oh haha, no, sadly no paper trail for this one other than some internal chats here and there, sorry! (You might be able to speak to @jburger424 about that though?)

I meant Lighthouse ignores iframe boundaries for speed index. I'm also basing this assumption on the fact that iframes show up on screenshots.

Oh I see now! Yes, gotcha we were talking about the same thing. It "ignores" in that it treats it as part of page content my bad :)

It does introduce a possibility of discrepancies between RUM data and LH data. This would only happen if the iframe's contentful paint happens before the page's contentful paint--which I don't anticipate happening on real pages. Additional, users should presumably be aware of this issue if the change is gated behind a flag.

I think I tend to agree with you, but my gut says this move would be objectionable to several other perf folks... we'll see I suppose when it's proposed I guess 😆

The simplest way I see of doing this is to use trace.traceEvents rather than mainThreadEvents in computed/main-thread-tasks.js when the flag is set. I don't like that the behavior mismatches the name, but it's a nice way to control this centrally.

The problem I'm referring to is that pretty much everything in LH is based on activity on a single thread so lots of assumptions during processing won't hold up (durations and selfTime for tasks would probably be some NaNs or nonsense numbers). I think the best approach we could do immediately is select the busiest thread as the "main" thread and report the costs for that.

But this seems like an issue for pages as well. I hope that Lighthouse highlights the cost of having 10 rich media ads on a page. Was this taken into account when introducing MainThreadTasks? (It might not be an issue for FID metrics but seems like an issue for bootup-time and other audits)

It definitely is an issue for pages, hence my joke :) I think one of the shortcomings of the current state of performance metrics is failing to account for an OOPIF world and the fact that off main thread !== free. It's less of a problem there though because presumably what you care about measuring is the ability to interact specifically with the content of the page itself and not its iframes, so the fact that there's a 1 second blocking task off thread doesn't immediately mean you won't be able to interact with the page for 1 second. In the ad case we're talking about here, the page is explicitly meaningless and we want to know about the users' ability to interact with the iframe. A 1 second blocking task in the iframe will definitely prevent the user from interacting with the iframe for 1 second.

Was this taken into account when introducing MainThreadTasks? (It might not be an issue for FID metrics but seems like an issue for bootup-time and other audits)

In short, no it was not taken into account. All audits and all CPU activity exclusively analyzes main thread activity.

I want to emphasize many many ads are still rendered in a same-origin friendly iframe, so they won't be OOPIF. Whether an ad is rendered in a cross-domain is largely based on ad selection so it's not always deterministic.

Ah, this is great to know thanks! Even more important to ensure test practices for them always accurately reflect the potential same-process costs then :)

@warrengm
Copy link
Contributor Author

Based on the feedback I refactored this PR a bit:

  • Trace of tab now recurses into child frames
  • When pierceIframes is true, computed artifacts will search for FCP in child iframes
  • When pierceIframes is true, MainThreadTasks will include tasks from OOP child frames

PTAL

@patrickhulce
Copy link
Collaborator

Thanks for your work here @warrengm! You can check out the internal doc attached to #11311 (comment) for our broader assessment of how to approach these issues.

tl;dr we're probably going to take a different approach for "page performance" vs. "frame performance"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants