Should `2a_model.R` use med-obs or well-obs sites (but not both)? #148

lekoenig · 2022-08-16T19:42:11Z

Currently, 2a_model.R builds separate targets and input files for the "well-observed" sites and the "medium-observed" sites. This makes the file more difficult to interpret/remember to update each relevant target, and might be over-engineered for what we need. Will we always want both the medium-observed and well-observed outputs, or is it more likely that we'll just choose one and train the model using that dataset?

What if we created a new target called p2a_observed_sites or something like that that just serves as a toggle for which set of sites we're interested in, e.g.:

# p2_well_observed_sites could be replaced with p2_med_observed_sites below
tar_target(
p2a_observed_sites
   p2_well_observed_sites
),

This way, we would update the downstream targets to depend on p2a_observed_sites instead of either p2_well_observed_sites or p2_med_observed_sites, and we would create just one version of the input/output zarr data store depending on which set of sites we're interested in.

The text was updated successfully, but these errors were encountered:

lekoenig · 2022-08-16T19:47:07Z

^ One place where I'm not sure what the impact of this proposed change would look like is p2a_metrics_files, so I'd also appreciate specific thoughts on that target.

lekoenig · 2022-08-24T20:40:15Z

Assuming we're happy with the medium-observed sites (see #122) and if we don't need to separately define medium- and well-observed sites for the analysis or a paper, we could also just create an object called n_days_well_obs <- 100 in _targets.R and then use that object in place of 300 when defining p2_well_observed_sites and then just get rid of all targets related to "medium observed" data.

I'm in favor of this change because it would simplify 2a_model.R and reduce the chance a coding error gets made, e.g. if we forget to update one reference from "well-observed" to "med-observed". Do you agree, @galengorski, @jsadler2? Any reason to keep both?

galengorski · 2022-08-24T20:46:06Z

I think making the complete switch over to medium observed sites makes sense, it looks like it improves the performance and I can't think of a reason that we would want to be able to toggle back to the well observed sites.

lekoenig mentioned this issue Aug 16, 2022

Add site map to pipeline #149

Merged

lekoenig mentioned this issue Aug 24, 2022

Omit medium-observed targets #154

Merged

lekoenig mentioned this issue Sep 9, 2022

Clean up/finalize baseline LSTM #159

Closed

9 tasks

lekoenig closed this as completed in #154 Sep 30, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Should `2a_model.R` use med-obs or well-obs sites (but not both)? #148

Should `2a_model.R` use med-obs or well-obs sites (but not both)? #148

lekoenig commented Aug 16, 2022

lekoenig commented Aug 16, 2022

lekoenig commented Aug 24, 2022

galengorski commented Aug 24, 2022

Should 2a_model.R use med-obs or well-obs sites (but not both)? #148

Should 2a_model.R use med-obs or well-obs sites (but not both)? #148

Comments

lekoenig commented Aug 16, 2022

lekoenig commented Aug 16, 2022

lekoenig commented Aug 24, 2022

galengorski commented Aug 24, 2022

Should `2a_model.R` use med-obs or well-obs sites (but not both)? #148

Should `2a_model.R` use med-obs or well-obs sites (but not both)? #148