Skip to content
This repository was archived by the owner on May 28, 2024. It is now read-only.

Should 2a_model.R use med-obs or well-obs sites (but not both)? #148

Closed
lekoenig opened this issue Aug 16, 2022 · 3 comments · Fixed by #154
Closed

Should 2a_model.R use med-obs or well-obs sites (but not both)? #148

lekoenig opened this issue Aug 16, 2022 · 3 comments · Fixed by #154

Comments

@lekoenig
Copy link
Collaborator

Currently, 2a_model.R builds separate targets and input files for the "well-observed" sites and the "medium-observed" sites. This makes the file more difficult to interpret/remember to update each relevant target, and might be over-engineered for what we need. Will we always want both the medium-observed and well-observed outputs, or is it more likely that we'll just choose one and train the model using that dataset?

What if we created a new target called p2a_observed_sites or something like that that just serves as a toggle for which set of sites we're interested in, e.g.:

# p2_well_observed_sites could be replaced with p2_med_observed_sites below
tar_target(
p2a_observed_sites
   p2_well_observed_sites
),

This way, we would update the downstream targets to depend on p2a_observed_sites instead of either p2_well_observed_sites or p2_med_observed_sites, and we would create just one version of the input/output zarr data store depending on which set of sites we're interested in.

@lekoenig
Copy link
Collaborator Author

^ One place where I'm not sure what the impact of this proposed change would look like is p2a_metrics_files, so I'd also appreciate specific thoughts on that target.

@lekoenig
Copy link
Collaborator Author

Assuming we're happy with the medium-observed sites (see #122) and if we don't need to separately define medium- and well-observed sites for the analysis or a paper, we could also just create an object called n_days_well_obs <- 100 in _targets.R and then use that object in place of 300 when defining p2_well_observed_sites and then just get rid of all targets related to "medium observed" data.

I'm in favor of this change because it would simplify 2a_model.R and reduce the chance a coding error gets made, e.g. if we forget to update one reference from "well-observed" to "med-observed". Do you agree, @galengorski, @jsadler2? Any reason to keep both?

@galengorski
Copy link
Collaborator

I think making the complete switch over to medium observed sites makes sense, it looks like it improves the performance and I can't think of a reason that we would want to be able to toggle back to the well observed sites.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants