Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Explore how private dependencies with links = affect esp-hal #2589

Open
MabezDev opened this issue Nov 22, 2024 · 18 comments
Open

Explore how private dependencies with links = affect esp-hal #2589

MabezDev opened this issue Nov 22, 2024 · 18 comments
Labels
1.0-blocker investigation Not needed for 1.0, but don't know if we can support without breaking the driver. RFC Request for comment

Comments

@MabezDev
Copy link
Member

MabezDev commented Nov 22, 2024

I'm slightly concerned that when we release 1.0 some of our deps may in inadvertently break esp-hal builds by accident. I'm not sure what happens when:

Another similar scenario:

I'm bringing this up now as although the rt crates are private, they might have the power to break builds if we're not careful, so we need to be sure.

@bjoernQ
Copy link
Contributor

bjoernQ commented Dec 4, 2024

Not sure I understand the concern correctly.

At least for the rt crates I guess nothing but the HALs should depend on them (just checked for whatever reason for riscv-rt (not esp-riscv-rt) there are some PACs and other random crates depending on them) - at least I have no idea why other crates might want to depend on them.

Another thing is having multiple versions of a PAC (or different PACs) in the dependency tree:

For different versions:

warning: Linking globals named '__EXTERNAL_INTERRUPTS': symbol multiply defined!

error: failed to load bitcode of module "esp32c6-099ea990cfb03038.esp32c6.4e03e79070bad390-cgu.0.rcgu.o":

warning: `examples` (bin "gpio_interrupt") generated 1 warning

For different PACs (well - who would do that?)

warning: Linking globals named 'DEVICE_PERIPHERALS': symbol multiply defined!

error: failed to load bitcode of module "esp32s3-c2aab2c2b43b913e.esp32s3.86eeb19eb3c7df55-cgu.0.rcgu.o":

I also tried to check which of our external dependencies (of esp-hal) includes links and doesn't seem like any of them does (esp-hal-embassy is a different thing)

Our own crates:
xtensa-lx - why? xtensa-lx-rt doesn't .... I'd expect the rt to contain it not the arch-support crate
esp-println - makes sense to not have multiple versions in parallel, usually only the binary crate would include it but esp-backtrace includes a dependency (since it needs to)
esp-hal - makes sense - there should be only one HAL but that also means users can only have one version of it in the dep-tree anyways

esp-riscv-rt currently doesn't include a links key (but risv-rt does)

@tom-borcin tom-borcin added the investigation Not needed for 1.0, but don't know if we can support without breaking the driver. label Dec 18, 2024
@jessebraham
Copy link
Member

I'm also not sure I understand the concern here. The HAL depends on these packages, not user code. We are in control of these dependencies, as long as we follow semver with our releases I don't think there should be any issues?

@jessebraham
Copy link
Member

Okay, so in the interest of trying to get this resolved...

  1. I believe esp-hal should include the links key, which it already does
  2. I do not believe it makes sense for xtensa-lx to include the links key; this should probably be removed
  3. Both of the runtime crates (i.e. esp-riscv-rt and xtensa-lx-rt) should include the links key; these will need to be added
  4. It's currently not clear to me whether or not esp-println requires the links key; will need further input on the reasons for this requirement
  5. Including multiple versions of the PACs will likely be problematic, and as such perhaps there should also contain the links key; these would need to be added

I don't believe points 1-3 should be too controversial, however please let me know if anybody disagrees.

Points 4 and 5 will require some further discussion I guess, so please let me know your opinions on this.

Once the above points are resolved, we can carry out the required changes and consider this complete.

@Dominaezzz
Copy link
Collaborator

5. Including multiple versions of the PACs will likely be problematic

Why is this?
My gut thinking here is, if there's anything in the PAC that can't be included multiple times it should be pulled out into a separate crate.

@bjoernQ
Copy link
Contributor

bjoernQ commented Jan 13, 2025

  1. Including multiple versions of the PACs will likely be problematic

Why is this? My gut thinking here is, if there's anything in the PAC that can't be included multiple times it should be pulled out into a separate crate.

it's because of the __EXTERNAL_INTERRUPTS: https://github.com/esp-rs/esp-pacs/blob/207964207e83c413ee495ea0545d4f89c04eefc5/esp32c6/src/lib.rs#L104

@Dominaezzz
Copy link
Collaborator

Ah I see, could that be moved to the hal? I can easily picture an application having multiple versions of the PAC in their project from 3rd party drivers. Given the PAC isn't getting stabilised anytime soon it'd be really inconvenient for links to get in the way.

@bjoernQ
Copy link
Contributor

bjoernQ commented Jan 13, 2025

Ah I see, could that be moved to the hal? I can easily picture an application having multiple versions of the PAC in their project from 3rd party drivers. Given the PAC isn't getting stabilised anytime soon it'd be really inconvenient for links to get in the way.

I can only imagine to copy+paste all the rt feature related code into esp-hal and not enabling the rt feature but that is .... not great

@MabezDev
Copy link
Member Author

Thanks for taking the time to look at this!

I believe esp-hal should include the links key, which it already does

I think I might have the opposite opinion, consider:

  • Someone depending on esp-smartled, which internally depends on [email protected]
  • the same user depends directly on [email protected] for the rest of their project
  • they now have two semver compatible versions of esp-hal in their tree, but their project won't compile because of links

I do not believe it makes sense for xtensa-lx to include the links key; this should probably be removed

Agree

Both of the runtime crates (i.e. esp-riscv-rt and xtensa-lx-rt) should include the links key; these will need to be added

I think I agree, but I'm also wondering what's the benefit, only one esp-hal version in the dep tree will call main, the others should be library only dependencies. I suppose it does stop a user depending directly on the rt crates and esp-hal at the same time? Given that these rt crates aren't 1.0 ready, and that we might change them between esp-hal versions I'd be inclined to actually not add links there.

It's currently not clear to me whether or not esp-println requires the links key; will need further input on the reasons for this requirement

I think we can ignore this one for now, but good to know it has a links field because I forgot 😆.

Including multiple versions of the PACs will likely be problematic, and as such perhaps there should also contain the links key; these would need to be added

I think we should really try and avoid a links field in the PACs.

Here is my controversial take, we should remove all the interrupt stuff out of the PACs and into the HAL, which may require making some changes (or using a fork) of svd2rust. Without that stuff, we are free to upgrade the PACs over time without worry. To me it's always felt a bit odd for the register access library to hold the static of the all the interrupt handlers, moving to the HAL also gives use the option to store the handlers in FLASH (for riscv) with a config option too, though that is a minor benefit.

With above said, I think we can punt this down the line. I don't think we need a links field in the PACs even before we move the interrupt stuff, because __EXTERNAL_INTERRUPTS is only relevant to the crate that's actually using them, it doesn't matter if 4 versions of the PACs and therefore 4 versions of the __EXTERNAL_INTERRUPTS array are in the tree, because the only one that will be used is the one the esp-hal version that creates the entry point uses.

Thoughts?

@bjoernQ
Copy link
Contributor

bjoernQ commented Jan 14, 2025

I suppose it does stop a user depending directly on the rt crates and esp-hal at the same time?

But why would someone depend on the rt crate in their binary? 🤔 And I guess (need to test that) when there are two versions of an rt-crate, people will run into linker issues (duplicate global symbols) which look a lot scarier than what they get with the links-key (while the error they see in that case is still not great in this situation)

I don't think we need a links field in the PACs even before we move the interrupt stuff, because __EXTERNAL_INTERRUPTS is only relevant to the crate that's actually using them, it doesn't matter if 4 versions of the PACs and therefore 4 versions of the __EXTERNAL_INTERRUPTS array are in the tree, because the only one that will be used is the one the esp-hal version that creates the entry point uses.

The __EXTERNAL_INTERRUPTS will be there multiple times and cause linker issues (b.c. they are #[no_mangle]) - even if the rt feature is only set by one crate it will be active for all versions. I guess if we want to support multiple PAC versions, we need to not enable the rt feature (and copy the relevant code into the HAL) - without the rt feature we can have multiple versions of a PAC

@MabezDev
Copy link
Member Author

Somehow I completely missed the #[no_mangle], thanks for pointing that out. Yeah, it seems not enabling the rt is the better way to do it, and handle interrupts in the HAL

@bugadani
Copy link
Contributor

bugadani commented Jan 14, 2025

There can't be two versions of esp-hal in a project because we have a no-mangle _ESP_HAL_DEVICE_PERIPHERALS symbol. The peripheral singletons would be different types, so users would need to separately initialize two hals anyway, which is just a bad idea anyway.

@jessebraham
Copy link
Member

I think I might have the opposite opinion, consider:

* Someone depending on esp-smartled, which internally depends on [email protected]
* the same user depends directly on [email protected] for the rest of their project
* they now have two semver compatible versions of esp-hal in their tree, but their project won't compile because of `links`

Having multiple versions of the HAL in their project will result in duplicated symbols, will it not? e.g. as Daniel pointed out there will now be two different versions of _ESP_HAL_DEVICE_PERIPHERALS.

I suppose it does stop a user depending directly on the rt crates and esp-hal at the same time? Given that these rt crates aren't 1.0 ready, and that we might change them between esp-hal versions I'd be inclined to actually not add links there.

Why does it matter if the runtime crates change between HAL releases? If they are dependencies of the HAL, and users are not intended to include them in their own projects directly, then they will always be using exactly the version specified by the HAL. If we do not add links and they, for whatever reason, end up including the runtime crates in their own projects directly, there will be duplicated symbols again.

Here is my controversial take, we should remove all the interrupt stuff out of the PACs and into the HAL, which may require making some changes (or using a fork) of svd2rust. Without that stuff, we are free to upgrade the PACs over time without worry. To me it's always felt a bit odd for the register access library to hold the static of the all the interrupt handlers, moving to the HAL also gives use the option to store the handlers in FLASH (for riscv) with a config option too, though that is a minor benefit.

What is the concern with upgrading the PACs as their currently stand? Again, they are a direct dependency of the HAL and in most use cases the user should not need to include them as direct dependencies. In the case they do, why would they not want to use the latest published version of the PAC? Why is making this a requirement an issue? I'm not really understanding this line of reasoning, maybe I'm missing something.

I'm also not sure what we gain by creating this extra work for ourselves, this interrupt-related code is all automatically generated for us, where as if we move it to the HAL any changes in the SVDs will require us to either copy-paste the changes in, or make the changes manually in the HAL, which will be an error-prone process and easy to forget to do.

@jessebraham jessebraham removed their assignment Jan 17, 2025
@bugadani
Copy link
Contributor

What is the concern with upgrading the PACs as their currently stand? Again, they are a direct dependency of the HAL and in most use cases the user should not need to include them as direct dependencies. In the case they do, why would they not want to use the latest published version of the PAC? Why is making this a requirement an issue? I'm not really understanding this line of reasoning, maybe I'm missing something.

Third party library authors can choose to depend on a specific PAC version. As you yourself are aware, a breaking PAC change can be rather annoying to apply. If we can allow a driver author to decouple their implementation from the PAC used by esp-hal (by taking the peripheral singleton, then essentially throwing it away and using the PAC internally), I think we can reduce a lot of friction for them.

Since the hardware doesn't change, I doubt there is much value in forcing (working) driver implementations to be updated just to keep it compiling.

@MabezDev
Copy link
Member Author

MabezDev commented Jan 27, 2025

I did some experimenting with this, you can find my full results, which have a break down of the test and the output in each test folder (inside README.md). I'll summarize what I found here, but I do encourage you to play around with it a bit yourself.

https://github.com/MabezDev/semver-playground


PACs

I don't believe PACs should have a links field, nor do I think that the __EXTERNAL_INTERRUPTS symbol, should be #[no_mangle] (it isn't on Xtensa, that's how I missed it the first time :D). __EXTERNAL_INTERRUPTS is a public static, which the HAL directly depends on, therefore we don't need a global namespace symbol for it. Not only do we not need it, it prevents a user directly depending on another version of our PACs to implement their own out of tree driver. but duplicate __EXTERNAL_INTERRUPTS symbol above shouldn't be an issue because only one PAC version in the dependency tree should have the rt feature enabled.

Actionable items:

  • svd2rust issue to not generate #[no_mangle] on __EXTERNAL_INTERRUPTS for RISCV generate #[no_mangle] on Xtensa

HAL

I don't think we need links for esp-hal. In my experimentation, only one major version of a package can exist in tree, e.g cargo will always choose a single 1.x and/or 2.x version etc. Whilst we can be happy that all 1.x versions will always resolve to a single version (we won't get any linker errors here), I decided to expand the test to see what would happen if we ever release [email protected]. In this case we would get a duplicate linker symbol for PERIPHERALS_TAKEN.

Expanding on above, I decided to see if we can utilize semver trick to make 1.x and 2.x interoperable, to avoid ecosystem upgrade churn if we ever need to release 2.0.0. The multiple-hal-semver-trick test explores this, and it works. However, due to the linker error we can't actually build the program (but the fact it get's to linking means that rustc/cargo is happy they're compatible).

Actionable items:

  • Remove the links key from esp-hal
  • Introduce an rt feature to cfg away esp_hal::init etc for libraries

rt crates

I think these should have a links field. If we implement the above to cfg away rt deps when esp-hal is being used a library then the only time a links error will occur is if:

  • A bad library enables the rt feature
  • An end user is trying to use two different HAL versions in the same crate

I don't think these are use cases we want to support. In general whilst multiple versions of the hal can coexist, only one hal version can be the entry point for the program.

Actionable items:

  • Check we don't have links in the xtensa/riscv crates
  • Add a links field to the xtensa-lx-rt/riscv-rt crates, if they don't have them already

Those are my current thoughts, I've listen some action items but I won't convert those to issues until this has had a bit of time to sit and we're sure this is the correct path forward. Just a reminder that the above comments are a brief summary of the actual results, so please read those before replying here, just in case I've already answered your question.

@bugadani
Copy link
Contributor

Introduce an rt feature to cfg away esp_hal::init etc for libraries

It would be possible to reference esp-hal twice, but what would be the point? The peripheral singletons are different types between the HAL versions, so if users initialize esp-hal 2.0, they won't be able to use esp-hal 1.0 drivers with 2.0 peripherals. I guess the option remains to unsafely grab old peripherals to use the old drivers, but that's pretty difficult to do when someone uses DMA or interrupts.

I can understand libraries working with multiple major HAL versions, but those libraries can have exclusive hal-version features, too.

@MabezDev
Copy link
Member Author

It would be possible to reference esp-hal twice, but what would be the point? The peripheral singletons are different types between the HAL versions, so if users initialize esp-hal 2.0, they won't be able to use esp-hal 1.0 drivers with 2.0 peripherals. I guess the option remains to unsafely grab old peripherals to use the old drivers, but that's pretty difficult to do when someone uses DMA or interrupts.

but they can, with semver trick, provided we don't need to fundamentally change anything to do with the peripheral struct itself or the driver constructors in v2.0.0. We can release another minor version of 1.x which uses 2.x peripherals (it depends on a future version of itself) meaning both 1.x and 2.x peripherals/drivers are interoperable. Take a look at the multiple-hal-semver trick test and in particular the 1.3.0 release of hal.

I can understand libraries working with multiple major HAL versions, but those libraries can have exclusive hal-version features, too.

Yeah if semver trick isn't appropriate for the reasons I mentioned above, then yes this is the route to go down.

@bugadani
Copy link
Contributor

Okay, I've read through your material this time :)

I think the #[no_mangle] on __INTERRUPTS/__EXTERNAL_INTERRUPTS is there to make sure there is only a single PAC crate imported with the "rt" feature. Removing #[no_mangle] is a breaking change as it would enable importing multiple PAC crates. Even if another exclusion mechanism is added, users could import one where __EXTERNAL_INTERRUPTS is #[no_mangle] and one where it is not. So while this smells wrong, changing it is also not correct I think.

As these arrays need to be on a set memory location, any change that allows multiple crates with "rt" enabled to exist is wrong. If anything, we should add #[no_mangle] to the Xtensa codegen.

@MabezDev
Copy link
Member Author

MabezDev commented Jan 28, 2025

Thanks for double checking my work! You're right, I completely forgot about the RT feature of the PACs, I'll update the test cases shortly, but I agree that actually we should have the #[no_mangle] but of course only one version of the PAC should have the rt feature enabled.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
1.0-blocker investigation Not needed for 1.0, but don't know if we can support without breaking the driver. RFC Request for comment
Projects
Status: Todo
Development

No branches or pull requests

6 participants