-
Notifications
You must be signed in to change notification settings - Fork 28.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] [SPARK-51097] [SS] Split apart SparkPlan metrics and instance metrics #50157
base: master
Are you sure you want to change the base?
Conversation
@cloud-fan I was recommended to get your advice on this change, as this is related to #49816 and was merged in recently |
/** | ||
* @return All instance metrics of this SparkPlan. | ||
*/ | ||
def instanceMetrics: Map[String, SQLMetric] = Map.empty |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we add this method to the base class? It looks specific to the StateStoreWriter
operartor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, it looks like that part wasn't needed. Reduced the scope, thanks!
@@ -434,12 +447,12 @@ trait StateStoreWriter | |||
} | |||
|
|||
protected def setStoreInstanceMetrics( | |||
instanceMetrics: Map[StateStoreInstanceMetric, Long]): Unit = { | |||
instanceMetrics.foreach { | |||
newInstanceMetrics: Map[StateStoreInstanceMetric, Long]): Unit = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this is because its conflicting with the var name ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it's a conflict with the variable names, I'll rename this to otherStoreInstanceMetrics
to clear this up
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
What changes were proposed in this pull request?
SPARK-51097
This PR aims to fix the issue of instance metrics being propagated all the way to Spark UI, even when uninitialized.
Marked as WIP since the original RocksDB PR's changes were reverted, so I'll be making a separate PR to reintroduce both changes.
Why are the changes needed?
Previously, Spark UI would show every possible instance metric that we pre-allocated for every partition. This led to the SQL tab showing a lot of metrics that are unused.
Does this PR introduce any user-facing change?
This should eliminate instance metrics being shown on Spark UI, leaving only the instance metrics visible from Streaming query progress reports.
How was this patch tested?
Tests were reverified in RocksDBStateStoreIntegrationSuite
I ran some stateful queries and checked on SparkUI to make sure these metrics aren't showing up. Will upload screenshots soon.
Was this patch authored or co-authored using generative AI tooling?
No