-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support for dynamic dictionaries of buffers #52
Conversation
Signed-off-by: Michael X. Grey <[email protected]>
Signed-off-by: Michael X. Grey <[email protected]>
Signed-off-by: Michael X. Grey <[email protected]>
Signed-off-by: Michael X. Grey <[email protected]>
Signed-off-by: Michael X. Grey <[email protected]>
Signed-off-by: Michael X. Grey <[email protected]>
Signed-off-by: Michael X. Grey <[email protected]>
Signed-off-by: Michael X. Grey <[email protected]>
Signed-off-by: Michael X. Grey <[email protected]>
Signed-off-by: Michael X. Grey <[email protected]>
Signed-off-by: Michael X. Grey <[email protected]>
Signed-off-by: Michael X. Grey <[email protected]>
Signed-off-by: Michael X. Grey <[email protected]>
Signed-off-by: Michael X. Grey <[email protected]>
Signed-off-by: Michael X. Grey <[email protected]>
Signed-off-by: Michael X. Grey <[email protected]>
Signed-off-by: Michael X. Grey <[email protected]>
Signed-off-by: Michael X. Grey <[email protected]>
Signed-off-by: Michael X. Grey <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel there is quite a lot of duplicated code between Buffer
, AnyBuffer
and JsonBuffer
. In particular, I'm not sure how useful JsonBuffer
would be over Buffer<serde_json::Value>
(possibly with conversion to AnyBuffer
).
I can see how AnyBuffer
would be useful for diagrams, but it doesn't address the issue that diagrams can only create Buffer<AnyMessage>
and converting that to AnyBuffer
loses a lot of the advantages of AnyBuffer
.
join_into
seems like to would help as well, but I don't see how I can use it in diagrams because buffer_access
, listen
and join
do not know the buffers or buffer maps at compile time. We can register join impls but the issue is that the workflow builder does not do any forward lookups atm, so it doesn't know which impl to use. Thats not a hard blocker but if it does support forward lookups, then join
should work as well as join_into
.
I agree and I'm happy to iterate on some aspects of the implementation, now that we have an MVP. Those three structs in particular should be reasonable to refactor. But I think most of the duplication between the modules (in terms of lines of code) comes from the various
For diagrams every message type would auto-register a vtable to create a
Due to the tragic absence of specialization we'll need to introduce a registration function along the lines of I haven't finished the pieces that will be needed for buffer access and listening, but they will work pretty similarly to
You're right that this will probably require forward lookups. We'll probably need a two-stage pass over all the operations where the first pass identifies the message types for all operations where that's fixed and then the second pass infers the message types for the rest based on their connections.
That's true, we don't strictly need The real value of this new approach is that we can assign explicit names to each item that is part of a join or part of a group of buffers. Now the json format can use explicit dictionaries of buffers instead of fragile arrays of buffers. That's the key advantage that I'm trying to unlock with this PR. |
Forgot to comment on this one. There's a very important difference: There's also the fact that |
Signed-off-by: Michael X. Grey <[email protected]>
Signed-off-by: Michael X. Grey <[email protected]>
Signed-off-by: Michael X. Grey <[email protected]>
Signed-off-by: Michael X. Grey <[email protected]>
Signed-off-by: Michael X. Grey <[email protected]>
Signed-off-by: Michael X. Grey <[email protected]>
Signed-off-by: Michael X. Grey <[email protected]>
Signed-off-by: Michael X. Grey <[email protected]>
Signed-off-by: Michael X. Grey <[email protected]>
Signed-off-by: Teo Koon Peng <[email protected]> Signed-off-by: Michael X. Grey <[email protected]> Co-authored-by: Michael X. Grey <[email protected]>
Signed-off-by: Michael X. Grey <[email protected]>
Signed-off-by: Michael X. Grey <[email protected]>
@koonpeng I've added support for using I've also added explicit support for
The advantage of this Now the only thing left before I take this PR out of draft is introducing buffer key structs for the |
src/buffer/buffer_map.rs
Outdated
} | ||
} | ||
|
||
impl From<&'static str> for BufferIdentifier<'static> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it intended that there is no impl<'a> From<usize> for BufferIdentifier<'a>
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea, added: 530c92f
pub trait AddBufferToMap { | ||
/// Convenience function for inserting items into a [`BufferMap`]. This | ||
/// automatically takes care of converting the types. | ||
fn insert_buffer<I: Into<BufferIdentifier<'static>>, B: AsAnyBuffer>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a potential problem with this is that index based keys are not guaranteed to be sequential, I don't know if this is good or not in practice, but I think we can change insert_buffer
to only accept the string variant, and have a extend_buffer
that takes an iterator and inserts the index keys sequentially.
I also thought of making BufferMap
an enum
pub enum BufferMap<'a> {
String(HashMap<Cow<'a, str>, AnyBuffer>),
Index(Vec<AnyBuffer>),
}
But I guess there is probably a reason for the current impl.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My expectation is that no one will do ordered iteration using this key. If someone wants to iterate over indices then they will .get(&BufferIdentifier::Index(i))
.
Alternatively they may match the key to BufferIdentifier::Index(index)
and then use index
to align with the layout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I guess there is probably a reason for the current impl.
Mostly I found that funneling everything into the key made it much easier to have a simple and consistent error message. This also theoretically supports maps that can have a mix of names and indices but I don't think we're likely to need that.
#[cfg(test)] | ||
mod tests { | ||
use crate::{prelude::*, testing::*, BufferMap}; | ||
use crate::{prelude::*, testing::*, AddBufferToMap, BufferMap}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should AddBufferToMap
be in prelude?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit torn on this and open to opinions. I'm leaning towards no because realistically I think it'll only be used in tests and maybe in the diagram module. I think it's very unlikely that downstream users will ever use a BufferMap
directly.
Currently, to do this, you need to join multiple outputs with the serialize flag on, it would produce a {
type: "transform",
cel: "{ \"hello\": node_a, \"world\": node_b.msg }",
next: ...
} So |
The problem with
If I understand the proposal correctly, we would need to parse the CEL string, identify any node or buffer references that exist inside the string, build up the buffer map based on those references, and then serialize+join the values from the buffers before passing them to the
You're right about this advantage of {
"joining": {
"type": "serialize_join",
"buffers": {
"foo": "buffer_a",
"bar": "buffer_b"
},
"next": "transforming"
},
"transforming": {
"type": "transform",
"cel": "{ \"hello\": request.foo, \"world\": request.bar.msg }",
"next": { "builtin": "terminate" }
}
} If we're concerned about user convenience we can have widgets in the GUI that represent both of these operations combined. |
Signed-off-by: Michael X. Grey <[email protected]>
Just to confirm if my understanding is correct, {
"buffer_a": {
"type": "buffer"
},
"buffer_b": {
"type": "buffer"
}
"join": {
"type": "join",
"buffers": {
"foo": "buffer_a",
"bar": "buffer_b"
}
"next": ...
},
"join_vec": {
"type": "join",
"buffers": ["buffer_a", "buffer_b"],
"next": ...
},
"serialize_join": {
"type": "serialize_join",
"buffers": {
"foo": "buffer_a",
"bar": "buffer_b"
},
"next": ...
}
}
|
The problem here is that the next operation won't necessarily be a node. E.g. I don't mind if we make it optional to specify the
To be clear, even when using the Rust API, outputs are always converted into buffers when being joined. But the Rust API can do this implicitly via the When it comes to
If we do type inference for If inferring from the inputs is feasible, I think we should go with that. Otherwise there's nothing wrong with having an |
I think we should combine We should just stop using |
For |
Signed-off-by: Michael X. Grey <[email protected]>
Signed-off-by: Michael X. Grey <[email protected]>
Signed-off-by: Michael X. Grey <[email protected]>
Signed-off-by: Michael X. Grey <[email protected]>
Signed-off-by: Michael X. Grey <[email protected]>
Taking this out of the draft stage since all the needed features are implemented and all current feedback has been addressed. I may circle back to add more documentation in some places. |
This PR is to address issues in the core library that are creating roadblocks for #51
This is a hefty PR but adds some important layers of flexibility to buffers. The highlights are:
AnyBuffer
can represent a buffer of any message type, and is now a first-class citizen of the libraryAnyBufferKey
to access any buffer through theWorld
JsonBuffer
can represent a buffer of any message type that supports serialization and deserializationJsonBufferKey
to access the JSON data in the buffer through theWorld
BufferMap
can now be used withBuilder::join_into
to join a dictionary ofAnyBuffer
instances into a value that implementsJoinedValue
I'm leaving this PR as a draft because there are a few more items to finish:
Joined
macro that implements theJoinedValue
andBufferMapLayout
traits for any structBufferKeyMap
macro that implements theBufferKeyMap
andBufferMapLayout
traits for any struct that consists entirely ofBufferKey<T>
, and/orAnyBufferKey
fieldsBuilder::try_listen
to convert aBufferMap
into a specific struct that implementsBufferKeyMap
.