Compiler service: file parsing should use caching. #14848

safesparrow · 2023-03-04T20:24:26Z

Is your feature request related to a problem? Please describe.

IDE interactions are slow, partly because the same files with unmodified contents are parsed multiple times, with no caching mechanism.

Given a project with three files: A.fs, B.fs, C.fs, every time I:

Modify A.fs,
Open C.fs,
file B.fs is parsed again. Files A.fs and C.fs are probably parsed again too, although I'm not sure exactly when they are, and the main point is that all files for which typechecking is performed, are parsed again.

Note that I've only tested this in Rider.

Describe the solution you'd like

Introduce a caching mechanism with size configurable by consumers of FCS, which avoids parsing the same files again if nothing relevant has changed.

Additional context

Note that there is a parsing cache, but it's only utilised by the dedicated file parsing endpoint, and is not used when the typechecking endpoint is called instead.

Also note that there seem to be multiple ways in which parsing is called, with different stack traces.
The cache should address all of them if possible.

Also note that if this ends up using SourceText.GetHashCode, that method is quite slow as it has to scan the whole source.
It might be worth considering calculating the hash once upon creation of SourceText.

Related discussion I started: #14199

Related: Idea of parallel parsing in the FCS.
I briefly looked at the possibility of parsing project files in parallel in FCS. However, the complexity of IncrementalBuilder makes it rather difficult.
Also, once we have a caching mechanism with a large-enough size, the marginal gain of parallel parsing would be limited to speeding up the first-time typechecking after starting the IDE.

The text was updated successfully, but these errors were encountered:

majocha · 2023-03-05T01:33:49Z

Hi @safesparrow, can you try to run my draft #14849? This is VS only now.

safesparrow · 2023-03-05T10:03:47Z

@majocha Thanks for looking.

I'm using Rider with a locally built FCS, so can't easily test with VS.
AFAICS the PR only increases size of existing caches, but doesn't change the way they are used. As mentioned above, the existing parsing cache is only used in one case, when a dedicated parseFile request is received by the checker. However, when type-checking a file is requested, that will trigger type-checking of all preceeding files, which will in turn trigger parsing of those files - and that mechanism does not involve any caching. The parsing happens in this line: https://github.com/JetBrains/fsharp/blob/35e742b334353de45b11d1d5c4cfecb02e6e58fe/src/Compiler/Service/IncrementalBuild.fs#L484 . I think there is also a third way parsing can be triggered, but haven't looked that closely.
I understand that there was some worry about large(r) cache sizes vs memory usage and GC pressure. I don't necessarily think that 100 ASTs take up a significant amount of memory, but would be nice to do some tests and make an informed decision. Maybe size of 1000 is fine too, which would be great - but if not, leaving it to be configured by end-users/IDEs might be best.

majocha · 2023-03-05T10:56:26Z

AFAICS the PR only increases size of existing caches, but doesn't change the way they are used.

It does do a little bit more 😉, cache sizes are increased just to test it.

~~Anyway, looks like Rider doesn't use FCS endpoints like TryGetRecentCheckResultsForFile but maintains an internal version of the service and does it's own thing with it?~~

majocha · 2023-03-05T11:13:48Z

It should reuse cached parse results throughout now when TryGetRecentCheckResultsForFile receives an already parsed file version.

safesparrow · 2023-03-05T11:24:34Z

It does do a little bit more 😉, cache sizes are increased just to test it.

Can you elaborate on whether it affects this particular callsite from point 2. above:
https://github.com/JetBrains/fsharp/blob/35e742b334353de45b11d1d5c4cfecb02e6e58fe/src/Compiler/Service/IncrementalBuild.fs#L484
and deeper down this piece of code: https://github.com/JetBrains/fsharp/blob/35e742b334353de45b11d1d5c4cfecb02e6e58fe/src/Compiler/Service/IncrementalBuild.fs#L157

As I understand and as tested in Rider, this callsite is responsible for most indirect parsing actions when type-checking of a chain of files is triggered.
Note that I'm describing a situation when the IDE asks to type-check a file somewhere down in the project - through the BoundModel infrastructure this effectively does sequential parsing and type-checking of all files above it that are deemed no longer valid.
So again, given three files A.fs, B.fs, C.fs:

a change to A.fs invalidates all three files.
When ParseAndCheckFileInProject(C.fs) is then called, it will go through BoundModels of all the invalidated files and parse (here ) and type-check them.

Maybe I'm missing something, but I don't see anything in the PR that would use the cache in 2.

majocha · 2023-03-05T11:32:05Z

@safesparrow I'll have to look into it when I'm back at my PC 🙂. From what I see , what I tinkered with was much closer to the surface..

majocha · 2023-03-05T12:03:36Z

So IIUC the parse cache we have in service.fs is almost useless, because GetCheckResultsBeforeFileInProject sends us on a path that doesn't use it, and we need caching somewhere at a deeper level?

safesparrow · 2023-03-05T16:19:13Z

So IIUC the parse cache we have in service.fs is almost useless, because GetCheckResultsBeforeFileInProject sends us on a path that doesn't use it, and we need caching somewhere at a deeper level?

Almost, yes. Specifically, Rider uses the ParseAndCheckFileInProject endpoint instead and that triggers the cache-free parsing. Not sure if GetCheckResultsBeforeFileInProject results in the same, probably yes.

majocha · 2023-03-05T23:56:27Z

This caching mechanism does not seem to be working at all:
Compiler/Service/IncrementalBuild.fs#L116

This is never ever hit
Compiler/Service/IncrementalBuild.fs#L174

Instead a new SyntaxTree is being instantiated on every edit.

Some parse calls are there for diagnostics and there is a lot of setting things up involved (side-effects, too, I believe). It does not look practical to cache every call in a centralized way.

safesparrow · 2023-03-06T08:13:59Z

Instead a new SyntaxTree is being instantiated on every edit.
Some parse calls are there for diagnostics and there is a lot of setting things up involved (side-effects, too, I believe). It does not look practical to cache every call in a centralized way.

I think we need to get on the same page.
This issue is primarily about parsing files as part of a chain of parse+typecheck operations. Please have a look at the example in the description with the three file project.

Parsing on edit is different. There is little point caching those requests as the chance of seeing exactly the same source code upon multiple edits is small.
Compare it to parsing all files in the project after only one of them has changed - all but one request could be cached.
I believe the "active file edit" scenario is also why some endpoints have the cache argument which is sometimes set to false by the caller.

auduchinok · 2023-03-06T11:00:00Z

It would be nice if ParseAndCheckFileInProject could cache previous files trees in some cache indeed.

I believe the "active file edit" scenario is also why some endpoints have the cache argument which is sometimes set to false by the caller.

Yes, exactly. We've added it when started doing additional parsing that we didn't want to be cached anywhere.

majocha · 2023-03-06T12:39:38Z

This issue is primarily about parsing files as part of a chain of parse+typecheck operations.

@safesparrow that's exactly the scenario I stepped through in the debugger. I modify file A and see SyntaxTree ctored for file B each time. weakCache is always None in Parse method, although I'd expect it to be used as it is being set for some reason.

safesparrow · 2023-03-06T13:18:26Z

SyntaxTree object might be constructed, but hopefully the actual parsing (SyntaxTree.Parse) happens only if I open B.fs or C.fs and doesn't happen on every edit of A.fs - right? That's what I observed. Parsing all the files every time A.fs is modified would be even worse.

I don't think the weakCache mechanism is relevant here.
It's not really a cache if I understand it correctly. It can only be used if nothing above changed as it is not keyed in any way and the whole SyntaxTree object, alongside the weakCache, is destroyed when things above change.

You'll want the SyntaxTree.Parse method to use a cache similar to the one in service.fs, keyed only by file's contents and certain parsing options, but not dependent upon contents of files above it.

majocha · 2023-03-06T13:27:45Z

SyntaxTree object might be constructed, but hopefully the actual parsing (SyntaxTree.Parse) happens only if I open B.fs or C.fs and doesn't happen on every edit of A.fs - right? That's what I observed. Parsing all the files every time A.fs is modified would be even worse.

Yep, that's what I meant.

You'll want the SyntaxTree.Parse method to use a cache similar to the one in service.fs, keyed only by file's contents and certain parsing options, but not dependent upon contents of files above it.

Yes, I just tried this. I was hoping to use FSharpSource.TimeStamp to avoid hashing file contents, but it's useless. It changes on every call, not on actual source change like one would expect.

majocha · 2023-03-06T13:43:48Z

@safesparrow I did a quick and dirty draft to start something. This should already work.

OK, maybe it's placebo but editing service.fs no longer feels like wading through mud.

safesparrow added Feature Request Needs-Triage labels Mar 4, 2023

github-actions bot added this to the Backlog milestone Mar 4, 2023

vzarytovskii added this to F# Compiler and Tooling Mar 4, 2023

github-project-automation bot moved this to Not Planned in F# Compiler and Tooling Mar 4, 2023

safesparrow mentioned this issue Mar 5, 2023

Too many parsing requests for the active file on every edit/focus. JetBrains/resharper-fsharp#492

Open

majocha mentioned this issue Mar 5, 2023

Enable more caching of parse results in FCS #14849

Closed

safesparrow mentioned this issue Mar 5, 2023

Compiler service: parallel parsing. #14851

Open

majocha mentioned this issue Mar 6, 2023

Cache parsing results in incremental build #14852

Merged

3 tasks

safesparrow mentioned this issue Mar 6, 2023

Eagerly parse all files in the project after project load. JetBrains/resharper-fsharp#495

Open

0101 removed the Needs-Triage label Apr 3, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Compiler service: file parsing should use caching. #14848

Compiler service: file parsing should use caching. #14848

safesparrow commented Mar 4, 2023 •

edited

Loading

majocha commented Mar 5, 2023

safesparrow commented Mar 5, 2023 •

edited

Loading

majocha commented Mar 5, 2023 •

edited

Loading

majocha commented Mar 5, 2023

safesparrow commented Mar 5, 2023 •

edited

Loading

majocha commented Mar 5, 2023

majocha commented Mar 5, 2023

safesparrow commented Mar 5, 2023

majocha commented Mar 5, 2023

safesparrow commented Mar 6, 2023 •

edited

Loading

auduchinok commented Mar 6, 2023

majocha commented Mar 6, 2023 •

edited

Loading

safesparrow commented Mar 6, 2023

majocha commented Mar 6, 2023

majocha commented Mar 6, 2023 •

edited

Loading

Compiler service: file parsing should use caching. #14848

Compiler service: file parsing should use caching. #14848

Comments

safesparrow commented Mar 4, 2023 • edited Loading

majocha commented Mar 5, 2023

safesparrow commented Mar 5, 2023 • edited Loading

majocha commented Mar 5, 2023 • edited Loading

majocha commented Mar 5, 2023

safesparrow commented Mar 5, 2023 • edited Loading

majocha commented Mar 5, 2023

majocha commented Mar 5, 2023

safesparrow commented Mar 5, 2023

majocha commented Mar 5, 2023

safesparrow commented Mar 6, 2023 • edited Loading

auduchinok commented Mar 6, 2023

majocha commented Mar 6, 2023 • edited Loading

safesparrow commented Mar 6, 2023

majocha commented Mar 6, 2023

majocha commented Mar 6, 2023 • edited Loading

safesparrow commented Mar 4, 2023 •

edited

Loading

safesparrow commented Mar 5, 2023 •

edited

Loading

majocha commented Mar 5, 2023 •

edited

Loading

safesparrow commented Mar 5, 2023 •

edited

Loading

safesparrow commented Mar 6, 2023 •

edited

Loading

majocha commented Mar 6, 2023 •

edited

Loading

majocha commented Mar 6, 2023 •

edited

Loading