feat(linter): a new multi-file analysis runtime #9383
base: main
Conversation
CodSpeed Performance Report: merging #9383 will not alter performance.
Branch updated from 5774fc2 to ca69203 with the following commits:
- dedup pending paths
- Use OsStr instead of Path for faster hashing
- move file reads to io thread pool
- Revert "move file reads to io thread pool" (reverts commit f4deb10759bd252b0d4bf20c630ed517fd2e582d)
- simplify module task scheduling group
- wip
- wip
- wip
- code cleanup
```rust
// deeper paths are more likely to be leaf modules (src/very/deep/path/baz.js is likely to have
// fewer dependencies than src/index.js).
// This heuristic is not always true, but it works well enough for real world codebases.
self.paths.par_sort_unstable_by(|a, b| Path::new(b).cmp(Path::new(a)));
```
Sorting by `Path` is going to be slow. Maybe sort by the number of `/` or `\` with `sort_by_cached_key`?
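A rough sketch of that suggestion (standalone, not code from this PR; the `Arc<OsStr>` element type and the descending order are taken from the surrounding diff, and `depth` is a hypothetical helper):

```rust
use std::cmp::Reverse;
use std::ffi::OsStr;
use std::sync::Arc;

/// Count path separators; far cheaper than a lexicographic `Path` comparison.
fn depth(path: &OsStr) -> usize {
    path.as_encoded_bytes().iter().filter(|&&b| b == b'/' || b == b'\\').count()
}

fn main() {
    let mut paths: Vec<Arc<OsStr>> = ["src/index.js", "src/very/deep/path/baz.js", "src/a/b.js"]
        .into_iter()
        .map(|p| Arc::from(OsStr::new(p)))
        .collect();
    // `sort_by_cached_key` computes the key once per element; deepest paths come
    // first, mirroring the descending sort in the diff. Rayon offers a parallel
    // counterpart, `par_sort_by_cached_key`, if the parallel sort is kept.
    paths.sort_by_cached_key(|p| Reverse(depth(p)));
    assert_eq!(&*paths[0], OsStr::new("src/very/deep/path/baz.js"));
}
```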
```rust
/// A module ready for linting. A `EntryModule` is generated for each path in `runtime.paths`
///
/// It's basically the same as `ProcessedModule`, except `content` is non-Option.
struct EntryModule {
```
`Entry` is a really vague term ... We need a longer and more descriptive name.
```rust
// only stores entry modules in current group
let mut entry_modules: Vec<EntryModule> = Vec::with_capacity(entry_group_size);

// downgrade self to immutable reference so it can be shared among spawned tasks.
```
The wording "downgrade" feels weird, maybe "set to ..."?
```rust
///
/// Modules with special extensions such as .vue could contain multiple source sections (see `PartialLoader::PartialLoader`).
/// Plain ts/js modules have one section. Using `SmallVec` to avoid allocations for plain modules.
section_module_records: SmallVec<[Result<ResolvedModuleRecord, Vec<OxcDiagnostic>>; 1]>,
```
The term `section` is vague; what are "source sections"?
```rust
let mut encountered_paths =
    FxHashSet::<Arc<OsStr>>::with_capacity_and_hasher(me.paths.len(), FxBuildHasher);

let mut module_relationships =
```
What is `module_relationships`? It probably needs a better name and a comment.
```rust
{
    let mut loaded_modules = record.loaded_modules.write().unwrap();
    for (specifier, dep_path) in requested_module_paths {
        // TODO: revise how to store multiple sections in loaded_modules
```
Let's figure out the TODO now; it's unlikely we'll revisit this part of the code and fix it later.
```rust
.par_bridge()
.map_with(self.resolver.as_ref().unwrap(), |resolver, specifier| {
    resolver.resolve(dir, specifier).ok().map(|r| (specifier, r))
.par_iter()
```
`par_iter` is probably not required anymore, since there's no further processing other than `resolver.resolve`. Spawning a new thread here may be more expensive than `resolver.resolve` itself. We need the threads for heavy duties such as parse + lint.
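For illustration only, a sequential version might look like the sketch below; it reuses `resolver` and `dir` from the diff, while `specifiers` and `resolved` are hypothetical names standing in for the iterated specifiers and the collected results:

```rust
// Sketch only: resolution is in-memory work per specifier, so a plain iterator
// avoids the par_bridge/par_iter dispatch overhead and leaves the rayon worker
// threads free for the heavy parse + lint stages.
let resolved: Vec<_> = specifiers
    .iter()
    .filter_map(|specifier| {
        resolver.resolve(dir, specifier).ok().map(|resolution| (specifier, resolution))
    })
    .collect();
```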
```rust
let mut module_relationships =
    Vec::<(Arc<OsStr>, SmallVec<[Vec<(/*specifier*/ CompactStr, Arc<OsStr>)>; 1]>)>::new();

let (tx_resolve_output, rx_resolve_output) = mpsc::channel::<ModuleProcessOutput>();
```
I think it's best to write an algorithm overview here, because there are so many loops and so much multi-threaded logic.
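For what it's worth, an overview pieced together only from what is visible in this diff might start like the sketch below; any step that does not match the real flow should be corrected by the author:

```rust
// Rough overview (assumed from the visible diff, to be verified by the author):
// 1. `self.paths` is sorted deepest-first and processed in groups of
//    `entry_group_size` entry modules.
// 2. Each entry module's imports are resolved; per-module results come back
//    over the `tx_resolve_output` / `rx_resolve_output` channel.
// 3. The receiving loop records each edge in `module_relationships`
//    (path -> per-section lists of (specifier, resolved path)).
// 4. Once the group is done, `module_relationships` is drained and each
//    record's `loaded_modules` map is populated.
```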
```rust
}

// Writing to `loaded_modules` based on `module_relationships`
module_relationships.par_drain(..).for_each(|(path, requested_module_paths)| {
```
Why `par_drain`? I'm not seeing any heavy processing; spawning threads may be slower than the actual work done here.
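A sketch of the suggested sequential alternative, reusing the names from the diff with the closure body elided:

```rust
// Sketch only: writing a handful of `loaded_modules` entries per module is cheap,
// so a plain `drain` keeps the work on the current thread and skips rayon dispatch.
module_relationships.drain(..).for_each(|(path, requested_module_paths)| {
    // ... same body as the existing `par_drain` closure ...
});
```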
```rust
    return;
}
let records = &modules_by_path[&path];
assert_eq!(records.len(), requested_module_paths.len());
```
Using `assert_eq!` will crash in release builds; we probably need a panic message here.
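One possible shape, assuming `path` is the `Arc<OsStr>` used elsewhere in this diff and with an illustrative message:

```rust
// Keep the invariant check but make a release-build failure self-describing.
assert_eq!(
    records.len(),
    requested_module_paths.len(),
    "module record count does not match requested module path count for {}",
    Path::new(&path).display()
);
```

If aborting in release builds is undesirable altogether, a `debug_assert_eq!` plus a recoverable diagnostic would be the alternative.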
Fixes #9077, #7996, #8631.

Results of running `oxlint -c oxlint.json` on AFFiNE@97cc814a: 1.3x slower, takes 0.86x memory.

Before

After