
[feat][storage] Add SpanKind support for badger #6376

Open · wants to merge 7 commits into main
Conversation

Manik2708
Contributor

Which problem is this PR solving?

Description of the changes

  • Queries filtered by span kind are now supported for Badger storage

How was this change tested?

  • Writing unit tests

Checklist

@Manik2708 Manik2708 requested a review from a team as a code owner December 17, 2024 07:43
@Manik2708 Manik2708 requested a review from jkowall December 17, 2024 07:43
@dosubot dosubot bot added enhancement storage/badger Issues related to badger storage labels Dec 17, 2024
@Manik2708
Contributor Author

Manik2708 commented Dec 17, 2024

I have changed the structure of the cache, which raises these concerns:

  1. Will a 3D map be a viable option for production?
  2. The cache will never be able to retrieve operations from old data. When the user does not send a kind, all operations from new data will be returned. A probable solution: introduce a boolean which, when true, loads the cache from old data (the old index key) and marks all such spans with kind UNSPECIFIED.
  3. To maintain consistency, we must take the service name from the newly created index, but extracting the service name from serviceName+operationName+kind is the challenge. The solution I have in mind is reserving the last 7 places of the new index for len(serviceName)+len(operationName)+kind (a sketch of this encoding follows after this comment). The drawback is that we would have to limit the length of serviceName and operationName to 999. This way we could also get rid of the c.services map. Removing that map is optional and a matter of discussion, because it is a trade-off between storage and iteration: removing it would lead to extra iterations in GetServices. I also thought of a solution for this:
data map[string]serviceEntry // keyed by service name; serviceEntry is illustratively defined as:

type serviceEntry struct {
	expiryTime uint64
	operations map[trace.SpanKind]map[string]uint64 // kind -> operation -> expiry
}

Once the correct approach is agreed on, I will handle some more edge cases and make the e2e tests pass (setting GetOperationsMissingSpanKind: false).
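For illustration, here is a minimal sketch of the fixed-width suffix idea from point 3. The helper names and the exact layout are assumptions for discussion, not this PR's actual code; it assumes both names are capped at 999 characters so each length fits in 3 digits.

import (
	"fmt"
	"strconv"
)

// encodeOperationIndex appends a 7-byte suffix to serviceName+operationName:
// 3 digits for len(serviceName), 3 digits for len(operationName), and 1 digit
// for the span kind. The 3-digit fields are what force the 999-character limit.
func encodeOperationIndex(service, operation string, kind uint8) string {
	return fmt.Sprintf("%s%s%03d%03d%d", service, operation, len(service), len(operation), kind)
}

// decodeOperationIndex recovers all components from the fixed-width suffix,
// so the service name can be extracted without a separate c.services map.
func decodeOperationIndex(v string) (service, operation string, kind uint8) {
	n := len(v)
	svcLen, _ := strconv.Atoi(v[n-7 : n-4])
	opLen, _ := strconv.Atoi(v[n-4 : n-1])
	k, _ := strconv.Atoi(v[n-1:])
	return v[:svcLen], v[svcLen : svcLen+opLen], uint8(k)
}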


codecov bot commented Dec 17, 2024

Codecov Report

Attention: Patch coverage is 95.85492% with 8 lines in your changes missing coverage. Please review.

Project coverage is 96.02%. Comparing base (06cc410) to head (e6cadda).

Files with missing lines | Patch % | Lines
internal/storage/v1/badger/spanstore/reader.go | 93.47% | 4 Missing and 2 partials ⚠️
internal/storage/v1/badger/spanstore/kind.go | 89.47% | 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6376      +/-   ##
==========================================
- Coverage   96.03%   96.02%   -0.01%     
==========================================
  Files         364      365       +1     
  Lines       20690    20823     +133     
==========================================
+ Hits        19870    19996     +126     
- Misses        626      631       +5     
- Partials      194      196       +2     
Flag Coverage Δ
badger_v1 10.19% <51.81%> (+0.37%) ⬆️
badger_v2 1.86% <0.00%> (-0.03%) ⬇️
cassandra-4.x-v1-manual 14.66% <0.00%> (-0.20%) ⬇️
cassandra-4.x-v2-auto 1.85% <0.00%> (-0.03%) ⬇️
cassandra-4.x-v2-manual 1.85% <0.00%> (-0.03%) ⬇️
cassandra-5.x-v1-manual 14.66% <0.00%> (-0.20%) ⬇️
cassandra-5.x-v2-auto 1.85% <0.00%> (-0.03%) ⬇️
cassandra-5.x-v2-manual 1.85% <0.00%> (-0.03%) ⬇️
elasticsearch-6.x-v1 18.94% <0.00%> (-0.26%) ⬇️
elasticsearch-7.x-v1 19.02% <0.00%> (-0.26%) ⬇️
elasticsearch-8.x-v1 19.19% <0.00%> (-0.26%) ⬇️
elasticsearch-8.x-v2 1.86% <0.00%> (-0.03%) ⬇️
grpc_v1 10.72% <0.00%> (-0.15%) ⬇️
grpc_v2 7.76% <0.00%> (-0.11%) ⬇️
kafka-3.x-v1 9.98% <0.00%> (-0.14%) ⬇️
kafka-3.x-v2 1.86% <0.00%> (-0.03%) ⬇️
memory_v2 1.86% <0.00%> (-0.03%) ⬇️
opensearch-1.x-v1 19.07% <0.00%> (-0.26%) ⬇️
opensearch-2.x-v1 19.07% <0.00%> (-0.26%) ⬇️
opensearch-2.x-v2 1.86% <0.00%> (-0.03%) ⬇️
tailsampling-processor 0.47% <0.00%> (-0.01%) ⬇️
unittests 94.92% <95.85%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown.


@Manik2708
Contributor Author

@yurishkuro Please review the approach and problems!

@Manik2708
Contributor Author

@yurishkuro I have added more changes which reduce the prefill iterations to a single pass, but they limit the serviceName length to 999. Please review!

@Manik2708
Contributor Author

Manik2708 commented Dec 19, 2024

I have an idea for handling old data without a migration script: we can store the old data in two additional cache data structures (without kind). The only question that arises then is: what should be returned when no span kind is given by the user? Operations from new data across all kinds, operations from old data (with kind marked as unspecified), or the union of both?

@yurishkuro yurishkuro added the changelog:new-feature Change that should be called out as new feature in CHANGELOG label Dec 20, 2024
@yurishkuro
Member

What should be returned when no span kind is given by the user?

Then we should return all operations regardless of span kind.

@Manik2708
Contributor Author

What should be returned when no span kind is given by the user?

Then we should return all operations regardless of span kind.

Does that also mean including all spans from old data (whose kind is not in the cache)?

@Manik2708 Manik2708 marked this pull request as draft December 22, 2024 14:04
@Manik2708 Manik2708 marked this pull request as ready for review December 22, 2024 19:16
@dosubot dosubot bot added the area/storage label Dec 22, 2024
@Manik2708
Contributor Author

My current approach is causing failures in the factory_test.go unit tests. Badger keeps throwing this error repeatedly:

runtime.goexit
	/usr/local/go/src/runtime/asm_amd64.s:1700, retrying
badger 2024/12/23 01:12:11 ERROR: error flushing memtable to disk: error while creating table err: while creating table: /tmp/badger116881967/000002.sst error: open /tmp/badger116881967/000002.sst: no such file or directory
unable to open: /tmp/badger116881967/000002.sst
github.com/dgraph-io/ristretto/v2/z.OpenMmapFile

This is probably because f.Close is called before prefill completes, which implies that creating the new index for old data is slow. So I think we have only one way to skip even an automatic migration, and that is to use this function:

// getSpanKind probes the tag index for the span.kind tag of a given service
// and timestamp+traceID, trying each of the six possible kinds. It returns
// the first kind whose index key exists, or Unspecified if none does.
func getSpanKind(txn *badger.Txn, service string, timestampAndTraceId string) model.SpanKind {
	for i := 0; i < 6; i++ {
		value := service + model.SpanKindKey + model.SpanKind(i).String()
		valueBytes := []byte(value)
		operationKey := make([]byte, 1+len(valueBytes)+8+sizeOfTraceID)
		operationKey[0] = tagIndexKey // index prefix byte
		copy(operationKey[1:], valueBytes) // service + span.kind tag + kind value
		copy(operationKey[1+len(valueBytes):], timestampAndTraceId) // 8-byte timestamp + trace ID
		_, err := txn.Get(operationKey)
		if err == nil {
			return model.SpanKind(i)
		}
	}
	return model.SpanKindUnspecified
}

The only problem is that during prefill up to 6 * NumberOfOperations Get queries will be issued. @yurishkuro please review this approach; I think we need to discuss whether to auto-create the new index, or skip creating any new index and use the function given above.
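To make the cost explicit, here is a rough sketch of how prefill could call this helper; prefillKinds and its arguments are hypothetical stand-ins for the actual cache-loading code, not part of this PR:

// For every operation discovered in the operation index, probe the tag index
// with getSpanKind. Each call issues at most 6 point lookups, so a service
// with N operations costs at most 6*N Get queries during prefill.
func prefillKinds(txn *badger.Txn, service string, tsAndTraceIDByOperation map[string]string) map[string]model.SpanKind {
	kinds := make(map[string]model.SpanKind, len(tsAndTraceIDByOperation))
	for operation, tsAndTraceID := range tsAndTraceIDByOperation {
		kinds[operation] = getSpanKind(txn, service, tsAndTraceID)
	}
	return kinds
}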

@Manik2708 Manik2708 requested a review from yurishkuro December 23, 2024 19:28
@Manik2708 Manik2708 marked this pull request as draft December 26, 2024 02:07
@Manik2708 Manik2708 marked this pull request as ready for review December 26, 2024 05:22
@Manik2708
Contributor Author

@yurishkuro I finally got rid of the migration and I think it's now ready for review! Please ignore my previous comments; the current commit has no link to them.

@Manik2708 Manik2708 requested a review from yurishkuro December 26, 2024 16:35
@Manik2708
Contributor Author

`make test` passes locally; should we rerun it in CI?

@Manik2708
Contributor Author

@yurishkuro This PR is ready for review; I have added dual lookups and backward-compatibility tests.

}
err := writer.writeSpanWithOldIndex(&oldSpan)
require.NoError(t, err)
traces, err := reader.FindTraces(context.Background(), &spanstore.TraceQueryParameters{
Member

not sure I follow this test. What does FindTraces have to do with span kind in the operations retrieval? Also, backwards compatibility test only makes sense when it is executed against old and new code.

Contributor Author

We have changed the key, but we need to make sure that traces are also fetched from the old key when dual lookup is turned on. Note that the operation key is used for fetching traces as well as for filling the cache. If you look at this code, we first write a span with the old key and then test whether the reader can fetch the traces associated with that key (please see L42).

}
*/
// The uint64 value is the expiry time of operation
operations map[string]map[model.SpanKind]map[string]uint64
Member

To clarify, CacheStore is used to avoid expensive scans when loading services and operations, correct? In other words, it's an entirely in-memory structure. In that case, why can we not just change the value of the map to be a combo {kind, expiration} instead of changing the structure? When loading, scanning everything for a given service is still going to be a negligible amount of data.

Contributor Author

I don't quite understand. Are you suggesting we keep these structures?

services   map[string]uint64 // already in the cache
operations map[string]map[string]kind

type kind struct {
	kind   SpanKind
	expiry uint64
}

If yes, then how do we handle a query that fetches all operations for a given service and kind? Should we iterate over all operations and skip the ones that are not of the required kind? (We use a similar approach currently, i.e. iterating over all kinds and skipping the ones not requested, but that was justified because there can be at most 6 kinds, whereas the number of operations is unbounded, so is this option viable?)
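For reference, a minimal sketch of what that read path could look like under the two-level structure; kindEntry and getOperations are illustrative names, not this PR's actual code:

type kindEntry struct {
	kind   model.SpanKind
	expiry uint64
}

// getOperations filters by kind only at read time: it walks every operation
// of the service and keeps the ones whose stored kind matches, returning all
// of them when no kind is requested.
func getOperations(operations map[string]map[string]kindEntry, service string, wanted model.SpanKind) []string {
	var names []string
	for name, entry := range operations[service] {
		if wanted == model.SpanKindUnspecified || entry.kind == wanted {
			names = append(names, name)
		}
	}
	return names
}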

Member

Yes, this structure.

Contributor Author

So iterating over all operations and skipping the kinds that are not required would be the right approach?

Member

Yes

Contributor Author

While working on this, I came to the conclusion that this approach leads to the same problem: spans with the same operation and service name but a different kind will end up overriding each other's data. So I don't think this structure is the correct approach; the only viable option I can think of is the 3D map. Should we move forward with the 3D map, or is there a better idea?
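To illustrate the point, here is a sketch of the 3D layout with kind as its own map level; operationsMap and update are illustrative names, not the PR's actual cache code:

// 3D cache layout: service -> span kind -> operation name -> expiry time.
type operationsMap map[string]map[model.SpanKind]map[string]uint64

// update cannot collide entries for the same service and operation that carry
// different kinds, because kind is a separate map level.
func (ops operationsMap) update(service, operation string, kind model.SpanKind, expiry uint64) {
	if ops[service] == nil {
		ops[service] = make(map[model.SpanKind]map[string]uint64)
	}
	if ops[service][kind] == nil {
		ops[service][kind] = make(map[string]uint64)
	}
	ops[service][kind][operation] = expiry
}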

@Manik2708 Manik2708 changed the title SpanKind support for badger [feat][storage] Add SpanKind support for badger Jan 21, 2025
yurishkuro pushed a commit that referenced this pull request Jan 21, 2025
…er (#6575)

## Which problem is this PR solving?
Comment:
#6376 (comment)

## Description of the changes
- The cache was directly contacting the DB to prefill itself, which is not a good design; this responsibility is now given to the reader, which reads from Badger and fills the cache.

## How was this change tested?
- Unit and e2e tests

## Checklist
- [x] I have read
https://github.com/jaegertracing/jaeger/blob/master/CONTRIBUTING_GUIDELINES.md
- [x] I have signed all commits
- [x] I have added unit tests for the new functionality
- [x] I have run lint and test steps successfully
  - for `jaeger`: `make lint test`
  - for `jaeger-ui`: `npm run lint` and `npm run test`

---------

Signed-off-by: Manik2708 <[email protected]>
@Manik2708 Manik2708 marked this pull request as draft January 22, 2025 05:18
@Manik2708 Manik2708 marked this pull request as ready for review January 22, 2025 09:05
@Manik2708 Manik2708 requested a review from yurishkuro January 22, 2025 09:12
@Manik2708
Contributor Author

@yurishkuro A humble reminder to review this PR!

Signed-off-by: Manik2708 <[email protected]>
}

// This method is to test backward compatibility for old index key
func (w *SpanWriter) writeSpan(span *model.Span, writeOldIndex bool) error {
Contributor Author

I'm not sure about this approach, but my aim was to keep the diff small.


store *badger.DB
Contributor Author

I can't find any use for store in the cache once the responsibility for filling the cache is given to the reader.

"jaeger.badger.dualLookUp",
featuregate.StageBeta, // enabled by default
featuregate.WithRegisterFromVersion("v2.2.0"),
featuregate.WithRegisterToVersion("v2.5.0"),
Contributor Author

I am confused about two things:

  1. Which versions to use
  2. Whether the linked reference should be the issue or the pull request, since the issue does not talk about this change directly (see the sketch below)
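For discussion, a hedged sketch of what a complete registration could look like; the description text, the choice to reference this PR's URL rather than the issue, and the version strings are assumptions rather than settled decisions:

var dualLookupGate = featuregate.GlobalRegistry().MustRegister(
	"jaeger.badger.dualLookUp",
	featuregate.StageBeta, // enabled by default while in beta
	featuregate.WithRegisterDescription("Reads operations from both the old and the new Badger index keys"),
	featuregate.WithRegisterReferenceURL("https://github.com/jaegertracing/jaeger/pull/6376"),
	featuregate.WithRegisterFromVersion("v2.2.0"),
	featuregate.WithRegisterToVersion("v2.5.0"),
)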

Signed-off-by: Manik2708 <[email protected]>
@Manik2708
Contributor Author

@yurishkuro Sorry for the disturbance, but could you please review this PR and resolve these doubts?

@mahadzaryab1 mahadzaryab1 self-requested a review February 13, 2025 14:19
@yurishkuro yurishkuro requested a review from Copilot February 27, 2025 22:21
@yurishkuro
Member

/gemini review

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

The pull request introduces SpanKind support for Badger, which is a valuable enhancement. The changes seem well-structured and include a backward compatibility test. However, there are a few areas that could benefit from further review and refinement.

Summary of Findings

Merge Readiness

The pull request appears to be in good shape overall, but I recommend addressing the comments provided below before merging. I am unable to directly approve this pull request, so please ensure that other reviewers also examine the changes and provide their approval before proceeding with the merge.

@Copilot Copilot AI left a comment

PR Overview

This PR adds support for SpanKind in Badger storage by updating index keys, reader/writer logic, the cache structure, and integration tests.

  • Introduces new type mappings and conversion functions for SpanKind.
  • Updates factory, reader, writer, and cache layers to create and query indexes that incorporate span kind.
  • Adds and adjusts unit and integration tests to verify backward compatibility and new functionality.

Reviewed Changes

internal/storage/v1/badger/spanstore/backward_compatibility_test.go: Adds tests to ensure backward compatibility after the index changes.
internal/storage/v1/badger/spanstore/kind.go: Introduces new type and mapping functions for span kind conversion.
internal/storage/v1/badger/factory.go: Registers a new feature gate and updates cache initialization and reader construction.
internal/storage/v1/badger/spanstore/reader.go: Updates TraceReader to accept a dual lookup flag and prefill operations by span kind.
internal/storage/v1/badger/spanstore/rw_internal_test.go: Adjusts tests to use the new TraceReader and CacheStore signatures.
internal/storage/v1/badger/spanstore/writer.go: Modifies index key creation to incorporate SpanKind for operations.
internal/storage/v1/badger/spanstore/cache.go: Updates cache structure for operations to be keyed by span kind.
internal/storage/integration/badgerstore_test.go, cmd/jaeger/internal/integration/badger_test.go: Remove legacy flags and update integration tests to reflect SpanKind support.

Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

internal/storage/v1/badger/spanstore/reader.go:60

  • [nitpick] Consider renaming 'dualLookUp' to 'dualLookup' to follow common naming conventions.
dualLookUp bool

Labels
area/storage · changelog:new-feature (Change that should be called out as new feature in CHANGELOG) · enhancement · storage/badger (Issues related to badger storage)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Badger storage plugin: query service to support spanKind when retrieve operations for a given service.
2 participants