Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Composition Feature #44

Closed
JoranHonig opened this issue Jun 8, 2024 · 14 comments · Fixed by #63
Closed

Composition Feature #44

JoranHonig opened this issue Jun 8, 2024 · 14 comments · Fixed by #63

Comments

@JoranHonig
Copy link

I've browsed a bit through the available docs, but couldn't find whether this is possible.

So this is a question / maybe feature request.

Is there some way of composing multiple sets of predicates? Ideally you'd be able to multiple blocks of interdependent ascent code spread over multiple files.

@regexident
Copy link
Contributor

I too would like to be able to share Datalog code between ascent programs!

Right now there is a strong tension between keeping one's code modular, clean and by that testable, and the lack of support for code-sharing.

But given ascent's nature of being a compile-time proc-macro that generates a very specific Datalog runtime tailored to the very specific program that was passed to ascent! { … } I can't see how ascent could possibly be composed from multiple files, since the proc-macro only sees the scope that gets passed to it. Other than by use of more macros that is.

That said I was hoping to be able to work around this limitation by the use of Rust macros along the lines of this, which would allow one to share a "library" of common datalog snippets between multiple ascent!{ … } blocks:

macro_rules! node_relation {
    ($node:ty) => {
        relation node($node);
    }
}

macro_rules! edge_relation {
    ($node:ty) => {
        relation edge($node, $node);
    }
}

macro_rules! reachable_relation {
    ($node:ty) => {
        relation reachable($node, $node);

        // The transitive closure can be written using a so-called linear rule:
        reachable(x, y) <-- edge(x, y);
        reachable(x, z) <-- reachable(x, y), edge(y, z);
    }
}

ascent! {
    node_relation!(); // Error: "undefined macro"
    edge_relation!(); // Error: "undefined macro"
    reachable_relation!(); // Error: "undefined macro"

    relation closure_of_a(Node);

    closure_of_a(y) <-- reachable(Node("A"), y);
};

… but unfortunately this throws an "undefined macro" error from the ascent_macro's rule_expand_macro_invocations(…) function.

If ascent allowed the use of Rust macros at the rule-level, then the above should work, which admittedly isn't optimal, but at least it works.

@StarGazerM
Copy link

Rust macro are outside-in. If you call a macro inside ascent!, rust expander will first expand ascent! and then insider, if your macro code is to generate ascent code, since its expansion is after ascent compilation, it really can't work in the way you want.

If rust has some preprocessor, this will be simpler. If not, you will have to implement like writing another procedure macro to generate ascent code, instead of using normal macro.

@regexident
Copy link
Contributor

Rust macro are outside-in. If you call a macro inside ascent!, rust expander will first expand ascent! and then insider, if your macro code is to generate ascent code, since its expansion is after ascent compilation, it really can't work in the way you want.

Shoot, you're right of course. Well, that feels like a nail in the coffin for reusability in ascent then, I suppose, given that ascent generates a bespoke runtime from the provided Datalog syntax and given that proc-macros can only access their own token-trees.

@StarGazerM
Copy link

I think some ad-hoc solution but shouldn't been approved to ascent current implementation will be allow user to manually stratify a datalog program, and then you may be able to compose stratum, (in implementation will be something a macro take rules and add impl for compiled ascent structs) . But it still can't make you get something like define a TC and reuse in other datalog program.

@kmicinski
Copy link
Collaborator

Being able to get reuse from Datalog programs in general is a challenging open problem in language design. The issue is not superficial, and is rather deep. For example, you need to do whole-program complication to be able to calculate optimal indices, along with more deep optimizations that are typical of modern Datalogs.

I think this is good long-term motivation for a future version of Ascent, but it also plagues other Datalogs as well. There are some folks who are looking into module systems for Datalog reuse, for example Erdweg et al. have recently published some work on module systems for Datalog (https://dl.acm.org/doi/abs/10.1145/3689484.3690737). Their system, IncA, even has an Ascent backend.

@s-arash
Copy link
Owner

s-arash commented Nov 1, 2024

Given the limitations of Rust macros mentioned by @StarGazerM, and the fact that Datalog composability is challenging in general as mentioned by @kmicinski , the best thing that I think can be done is to have a include!() "macro" in Ascent, so code "modules" would be separate files containing Ascent relations/rules, that then can be include!()ed into actual Ascent programs.

It would look something like this:

// reachable.ascent(?)

relation reachable(Node, Node);

reachable(x, y) <-- edge(x, y);
reachable(x, z) <-- reachable(x, y), edge(y, z);
// main.rs

// ...
ascent! {
  include!("reachable.ascent");
  relation edge(Node, Node);
 // ...
}

@JoranHonig @regexident would something like this be useful and address your use cases?

@regexident
Copy link
Contributor

regexident commented Nov 1, 2024

@s-arash that would be a much welcome improvement for sure! 🙏🏻

This sort of thing sounds like a perfect use case for syn's fold() visitor.


A couple of observations:

No cyclic includes

Any includes in ascent would have to be acyclic.

Acyclic graph vs. tree of includes?

Or would either have to implemented some sort of include! de-duplication, or simply forbid redundant includes.

Update: We should probably use the DAG approach from the get go, see #44 (comment) for why.

One could start with a tree initially and look into expanding ascent to support a DAG of includes later, if tree-only includes are found to be impractical in real-world scenarios.

At most one struct Program per program

In order to avoid conflicts only one of the snippets (i.e. ascent files or inline macros) that form an ascent program, would be allowed to contain a struct Program …; then, I would assume?

In other words: an ascent snippet (be it from a file, or an inline macro) that contains a struct Program …; must not include ascent files that contain struct Program …; of their own (neither directly, nor transitively).

So when writing ascent files one would have to decide: an ascent file can either describe a self-contained ascent program and be runnable as is (which may require a struct Program …;), or be re-usable, in which case one would have to decide whether or not to include a struct Program …; in it, based on the individual use case.


I would be happy to help make this happen, too.

@regexident
Copy link
Contributor

To give some additional motivational context for such a composition feature:

I have a bunch of graph queries/filters/transformations implemented in ascent that all work on the same graph type and thus have to repeat the same EDB schema (i.e. relation node(…), relation edge(…), …) over and over again, which is both, annoying and a maintenance burden.

Full example code

Imagine a dozen of files like the following, each sharing the same "schema" head and common logic rules,
followed by additional program-specific logic further below, which might look something like this,
but would usually of course be significantly more complex:

ascent::ascent! {
    pub struct Program<Node, Edge>
    where
        Node: Clone + Eq + Hash,
        Edge: Clone + Eq + Hash;

    // Shared facts:

    relation node(
        Node // node
    );

    relation edge(
        Node, // source node
        Edge, // edge
        Node // target node
    );

    // Shared logic:

    relation reachable(
        Node, // source
        Node // target
    );

    reachable(x, y) <-- edge(x, _, y);
    reachable(x, z) <-- reachable(x, y), edge(y, _, z);

    relation neighbor(
        Node, // source
        Node // target
    );
   
    neighbor(x, y) <-- edge(x, _, y);

    // Program-specific logic:

    // ...
};

With an include!() feature in ascent as described by @s-arash above this could be cleaned up significantly:

The "shared" schema could be consolidated into a single file:

src/graph.ascent:

pub struct Program<Node, Edge>
where
    Node: Clone + Eq + Hash,
    Edge: Clone + Eq + Hash;

// Facts:

relation node(
    Node // node
);

relation edge(
    Node, // source node
    Edge, // edge
    Node // target node
);

And each "shared" logic relation would get its own .ascent file (in order to avoid program bloat), like so:

src/reachable.ascent:

relation reachable(
    Node, // source
    Node // target
);

reachable(x, y) <-- edge(x, _, y);
reachable(x, z) <-- reachable(x, y), edge(y, _, z);

src/neighbor.ascent:

relation neighbor(
    Node, // source
    Node // target
);

neighbor(x, y) <-- edge(x, _, y);

… which could then be included from each of the query/filter/transformation programs, like so:

// ...

ascent::ascent! {
    include!("graph.ascent");
    include!("reachable.ascent");
    include!("neighbor.ascent");

    // Program-specific logic:

    // ...
};

// ...

Or like this, if graph.ascent contained no struct Program…; of its own:

// ...

ascent::ascent! {
    pub struct Program<Node, Edge>
    where
        Node: Clone + Eq + Hash,
        Edge: Clone + Eq + Hash;

    include!("graph.ascent");
    include!("reachable.ascent");
    include!("neighbor.ascent");

    // Program-specific logic:

    // ...
};

// ...

@regexident
Copy link
Contributor

Now that I've actually played through such a scenario I think I would prefer to be able to form a DAG from such ascent snippets. 🤔

The motivation for supporting DAG-includes is this:

A file like this would have an implicit dependency on relations defined in graph.schema.ascent:

// reachable.ascent

relation reachable(
    Node, // source
    Node // target
);

reachable(x, y) <-- edge(x, _, y);
reachable(x, z) <-- reachable(x, y), edge(y, _, z);

As such from a maintenance point of view it would be desirable to avoid such implicit dependencies:

// reachable.ascent

include!("graph.ascent"); // 👈🏻

relation reachable(
    Node, // source
    Node // target
);

reachable(x, y) <-- edge(x, _, y);
reachable(x, z) <-- reachable(x, y), edge(y, _, z);

But since multiple files within a program's include graph would include graph.ascent one would need to de-duplicate includes, either based on the included file's path or content.

@StarGazerM
Copy link

Hi guys, one of my concerns is include will mess up rust analyzer and vscode code linter. I purpose for rules, maybe something more bi-direction, include rules work like add a impl function to the place where its included. Maybe I can explore myself and PR, I know it sounds vague.

@s-arash
Copy link
Owner

s-arash commented Nov 6, 2024

I did a bit of research. Unfortunately, there is currently no stable API for getting the file path of input tokens to macros, which is crucial to implementing include!. The API that exists is nightly-only (https://doc.rust-lang.org/stable/proc_macro/struct.SourceFile.html).

Related issues:
rust-lang/rust#54725
rust-lang/rfcs#3200

@StarGazerM
Copy link

StarGazerM commented Feb 7, 2025

okay found a hacky but very useful library called magic_macro build on stable API can do similar thing as include!.

I tried play around it on my funky testing fork. By wrapping shared ascent code in an empty macro(export_ascent) (a macro contains ascent code but expand to nothing) and use export_tokens in this magic_macro library sharing token between macros, we can then use another attributed proc macro to compose 2 ascent query.
Here is some of my experimental implementation:

// in ascent_macro/src/lib.rs:

/// An empty macro just used to share the code without actually expand it
#[proc_macro]
pub fn export_ascent(_input: TokenStream) -> TokenStream {
   quote! {}.into()
}

use macro_magic::import_tokens_attr;
/// Import an external database relation declaration into the ascent program.
#[import_tokens_attr]
#[proc_macro_attribute]
pub fn ascent_use(attr: TokenStream, input: TokenStream) -> TokenStream {
   let ascent_code: AscentCall = syn::parse(input).unwrap();
   let run_code = ascent_code.code;
   let ascent_call = ascent_code.ascent_macro;
   let ext_code : AscentCall = syn::parse(attr).unwrap();
   let used_code = ext_code.code;
   let par = ascent_call.to_string() == "ascent_par";
   let res = ascent_impl(
      run_code.into_iter().chain(used_code.into_iter()).collect(),
      ascent_call.to_string() == "ascent_run", par);
   
   match res {
      Ok(res) => res.into(),
      Err(err) => TokenStream::from(err.to_compile_error()),
   }
}

#[derive(Clone)]
pub(crate) struct AscentCall {
   pub ascent_macro: Ident,
   pub code: TokenStream,
}

impl Parse for AscentCall {
   fn parse(input: ParseStream) -> Result<Self> {
      let ascent_macro = input.parse()?;
      // parse the bang
      let _ = input.parse::<Token![!]>()?;
      let content;
      braced!(content in input);
      let code = content.parse()?;
      Ok(AscentCall { ascent_macro, code })
   }
}

Then we can in some module define TC:

// src/tc.rs
use macro_magic::export_tokens;
use ascent::{export_ascent, ascent_use, ascent};

#[export_tokens(TCDef)]
export_ascent! { // use ascent! if you want to expand TC also in this mod
   relation edge(i32, i32);
   relation path(i32, i32);

   path(x, y) <-- edge(x, y);
   path(x, y) <-- path(x, z), edge(z, y);
}

and reuse in different module

// src/test.rs
#[ascent_use(crate::tc::TCDef)]
ascent! {
    struct SingleReach;

    relation do_reach(i32, i32);
    relation reach(bool);

    reach(true) <-- do_reach(x, y), path(x, y);
}

benefits of this might be we can still use the power of rust analyzer!

@s-arash
Copy link
Owner

s-arash commented Feb 7, 2025

Very intriguing @StarGazerM, thanks! I'll try to play around with the concept a bit when I find some time.

@s-arash
Copy link
Owner

s-arash commented Feb 10, 2025

I've started prototyping an approach similar to what @StarGazerM suggested.

Here is the WIP PR: #63

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants