Add interface for prefix cache indexer #657

Merged: 3 commits, Feb 14, 2025

varungup90 (Collaborator)

No description provided.

type PrefixCacheIndexer interface {
	// MatchPrefix matches the longest prefix sequence for input request (passed as input tokens)
	// and returns matched prefix (as tokens), remaining unmatched input request (as tokens) and pods matching the prefix
	MatchPrefix(inputTokens []int, model string, pods []*v1.Pod) (matchedTokes []int, unMatchedTokens []int, matchedPods []*v1.Pod)
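
To make the contract in the comment above concrete, here is a minimal sketch of one way an implementation could behave. It is not the implementation from this PR. `Pod` stands in for `*v1.Pod` so the sketch needs no Kubernetes imports, and `toyIndexer`, `newToyIndexer`, `key`, and `AddPrefix` are hypothetical names introduced only for illustration.

```go
package main

import "fmt"

// Pod is a hypothetical stand-in for *v1.Pod.
type Pod struct{ Name string }

// toyIndexer (hypothetical) records, per model, which pods hold each token prefix.
type toyIndexer struct {
	seen map[string]map[string]bool // key(model, prefix) -> set of pod names
}

func newToyIndexer() *toyIndexer {
	return &toyIndexer{seen: map[string]map[string]bool{}}
}

// key builds a map key from the model name and a token prefix.
func key(model string, tokens []int) string {
	return fmt.Sprintf("%s|%v", model, tokens)
}

// AddPrefix records that podName holds every leading prefix of tokens.
// This helper is an assumption; it does not appear in the snippet above.
func (t *toyIndexer) AddPrefix(tokens []int, model, podName string) {
	for i := 1; i <= len(tokens); i++ {
		k := key(model, tokens[:i])
		if t.seen[k] == nil {
			t.seen[k] = map[string]bool{}
		}
		t.seen[k][podName] = true
	}
}

// MatchPrefix returns the longest recorded prefix of inputTokens held by at
// least one candidate pod, the remaining tokens, and the pods holding it.
func (t *toyIndexer) MatchPrefix(inputTokens []int, model string, pods []*Pod) (matched, unmatched []int, matchedPods []*Pod) {
	for i := len(inputTokens); i > 0; i-- {
		holders := t.seen[key(model, inputTokens[:i])]
		for _, p := range pods {
			if holders[p.Name] {
				matchedPods = append(matchedPods, p)
			}
		}
		if len(matchedPods) > 0 {
			return inputTokens[:i], inputTokens[i:], matchedPods
		}
	}
	return nil, inputTokens, nil
}
```

In this sketch, after `AddPrefix([]int{1, 2, 3}, "m", "pod-a")`, a call to `MatchPrefix` with tokens `1 2 3 4 5` and candidate pods `pod-a` and `pod-b` returns `1 2 3` as matched, `4 5` as unmatched, and only `pod-a`.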

@DwyaneShi please help review the routing indexer interface.

@gangmuk You may want to review this as well, since you will be joining the indexer implementation.

Typo, I think: matchedTokes -> matchedTokens

What is unMatchedTokens used for? Is it better to return it directly, or to return only the matched tokens and let the caller derive the unmatched ones?

Collaborator Author

For the hash table use case, we only need to add the unmatched tokens, hence the returned unMatchedTokens.

Technically, the caller can derive the unmatched tokens from the returned matched tokens, but that adds extra computation.
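
For reference, here is a minimal sketch of the caller-side derivation discussed in this thread. It assumes the matched tokens are always a leading prefix of the input, which the interface comment suggests but this thread does not confirm. `unmatchedSuffix` is a hypothetical helper and is not part of the PR.

```go
package main

// unmatchedSuffix (hypothetical) derives the unmatched tokens on the caller
// side, assuming matchedTokens is a leading prefix of inputTokens.
func unmatchedSuffix(inputTokens, matchedTokens []int) []int {
	if len(matchedTokens) > len(inputTokens) {
		// Inconsistent input: there is nothing meaningful left to return.
		return nil
	}
	// Reslicing shares the backing array, so no tokens are copied.
	return inputTokens[len(matchedTokens):]
}
```

Under that assumption the derivation is a single reslice. If an implementation matches on something other than a plain token prefix, the caller may need more work, and returning the unmatched tokens directly keeps that detail inside the indexer.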

@Jeffwan Jeffwan requested review from DwyaneShi and gangmuk February 12, 2025 23:59
@varungup90 varungup90 merged commit 3fc68b7 into main Feb 14, 2025
10 checks passed
@varungup90 varungup90 deleted the prefix-cache-refactoring branch February 14, 2025 00:07
varungup90 added a commit that referenced this pull request Feb 20, 2025
* Add interface for prefix cache indexer

* address review comments

* refactor dir layout

Signed-off-by: Varun Gupta <[email protected]>

4 participants