This repository was archived by the owner on May 28, 2024. It is now read-only.

Med obs model #139

Merged
galengorski merged 9 commits into USGS-R:main from med_obs_model on Aug 16, 2022

Conversation

galengorski
Collaborator

@galengorski galengorski commented Jun 17, 2022

Ok, this is ready for review now. I realized that the error I was getting was because the model wasn't retraining, so I have explicitly run the train_model step in the snakemake file with Snakemake's -f flag to force a rerun. I'm sure there is a more elegant and flexible solution to this.
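For reference, the forced rerun described above amounts to passing Snakemake's `-f` flag when requesting the training target. A minimal sketch (the paths and target name below are placeholders, not the repo's actual values):

```python
# Sketch of forcing a Snakemake target to rerun with -f.
# snakefile_path, config_path, and target are placeholder names.
snakefile_path = "Snakefile"
config_path = "config.yml"
target = "train_model"

# -f forces the requested target to rebuild even if its outputs look up to date
cmd = f"snakemake {target} -s {snakefile_path} --configfile {config_path} -j -f"
print(cmd)
```

The downside, as discussed later in this thread, is that `-f` retrains even when the existing weights are still valid.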

@galengorski galengorski force-pushed the med_obs_model branch 2 times, most recently from 70cdc92 to 0ea015e Compare June 29, 2022 18:17
@galengorski galengorski requested a review from jsadler2 July 28, 2022 16:31
@galengorski galengorski marked this pull request as ready for review July 28, 2022 16:31
@galengorski galengorski requested a review from lekoenig July 28, 2022 17:14
Collaborator

@lekoenig lekoenig left a comment


Thanks, @galengorski! I haven't run or commented on the changes to the snakemake files, but this looks good to me!

x_vars: ['seg_ccov', 'seg_rain', 'seg_slope', 'seg_tave_air', 'seginc_swrad', 'hru_slope', 'hru_aspect', 'hru_elev', 'hru_area', 'hru_percent_imperv', 'covden_sum', 'covden_win', 'soil_moist_max']
site_set: "well_obs"

x_vars: ['pr','SLOPE','tmmx','srad','CAT_BASIN_SLOPE','CAT_ELEV_MEAN','CAT_BASIN_AREA','CAT_IMPV11', 'CAT_CNPY11_BUFF100','CAT_TWI']
Collaborator


Ahh! This makes a lot more sense now. We should probably add a comment somewhere in 2a_model.R (maybe above p2a_med_obs_data and p2a_well_obs_data) that reminds a user that x_vars needs to be updated in this config file if any variables are added/omitted.
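Beyond a reminder comment, a small consistency check could make the coupling explicit. This is a hypothetical sketch (not in the repo): the variable names are illustrative, and `available_vars` stands in for whatever columns the io zarr actually holds.

```python
# Hypothetical guard (not in the repo): fail fast if the config's x_vars
# drift out of sync with the columns available in the io dataset.
x_vars = ['pr', 'SLOPE', 'tmmx', 'srad']            # from the model config
available_vars = {'pr', 'SLOPE', 'tmmx', 'srad',    # columns in the io zarr
                  'CAT_BASIN_SLOPE', 'CAT_TWI'}

missing = [v for v in x_vars if v not in available_vars]
if missing:
    raise ValueError(f"x_vars not found in dataset: {missing}")
```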

Collaborator Author


Good idea! I added a comment in 2a_model above p2a_well_obs_data

@@ -100,7 +101,7 @@ rule make_predictions:
input:
"{outdir}/prepped.npz",
"{outdir}/nstates_{nstates}/nep_{epochs}/rep_{rep}/train_weights/",
"../../../out/well_obs_io.zarr",
"../../../out/"+site_set+"_io.zarr",
Collaborator


I like this addition to make the "site set" more flexible.

@lekoenig
Collaborator

One more note, @galengorski - do you mind linking issue #122 to this PR so that it gets automatically closed whenever you merge? You can do that by clicking the 'development' cog in the upper right corner of this PR page and selecting the relevant issue.

@galengorski galengorski linked an issue Jul 28, 2022 that may be closed by this pull request
Collaborator

@jsadler2 jsadler2 left a comment


Nice work, @galengorski.

I have a few comments and I wanted to touch base about your changes to the 2a_metrics_files target before merging.

2a_model.R Outdated
# the 1_ models use the same model and therefore
# the same Snakefile as the 0_baseline_LSTM run
list(model_id = "1_metab_multitask",
# list(model_id = "1_metab_multitask",
Collaborator


Why did you comment these out? Did you try them and they didn't work?

Collaborator Author


Ah, that was just so I could specifically run the baseline models. I will uncomment them.

2a_model.R Outdated
Comment on lines 143 to 159
weights_trained_file <- file.path("../../../out/models",p2a_model_ids$model_id,"nstates_10/nep_100/rep_0/train_weights")
output_feather_file <- file.path("../../../out/models",p2a_model_ids$model_id,"nstates_10/nep_100/rep_0/preds.feather")

# First create the prepped data files if they are not already.
# These are needed to make the predictions.
system(stringr::str_glue("snakemake {prepped_data_file} -s {snakefile_path} --configfile {config_path} -j"))
system(stringr::str_glue("snakemake {prepped_data_file} -s {snakefile_path} --configfile {config_path} -j -f"))

system(stringr::str_glue("snakemake {weights_trained_file} -s {snakefile_path} --configfile {config_path} -j -f"))

system(stringr::str_glue("snakemake {output_feather_file} -s {snakefile_path} --configfile {config_path} -j -f"))

# Then touch all of the existing files. This makes the weights "up-to-date"
# so snakemake doesn't train the models again
system(stringr::str_glue("snakemake -s {snakefile_path} --configfile {config_path} -j --touch"))
#system(stringr::str_glue("snakemake -s {snakefile_path} --configfile {config_path} -j --touch"))

# then run the snakemake pipeline to produce the predictions and metric files
system(stringr::str_glue("snakemake -s {snakefile_path} --configfile {config_path} -j --rerun-incomplete"))
system(stringr::str_glue("snakemake -s {snakefile_path} --configfile {config_path} -j -f"))
Collaborator


If I'm understanding this correctly, one problem with this approach is that it will rerun the full Snakemake pipeline whenever the p2a_metrics_files target needs to be run. For example, if I clone this onto my machine, it won't have the metrics file for exp 0, so targets will trigger the rebuilding of that branch of p2a_metrics_files. Then it will:

  • build the prepped data file, which is good (line 142)
  • retrain the rep_0 model (line 143)
  • make the rep_0 preds (line 144)
  • build the overall metrics files (line 159), which would include making all of the predictions (unless they are already there, like rep_0's from line 144)

Collaborator


I think the reason you added these lines in here was to force a retraining of the model because the previous trained model weights were different than the new, NHD ones. Is that right? I think a cleaner way to do that would be to just delete the previous model weights and run this target as it was written before. That should trigger a retraining of all of the model replicates etc. Did you try that?
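The suggested alternative (delete the stale weights so Snakemake sees them as missing and retrains, with no `-f` needed) might look like the following sketch; the path comes from this thread, and `shutil.rmtree` is just one way to do the deletion:

```python
import shutil
from pathlib import Path

# Remove the stale trained weights so Snakemake treats them as missing
# and retrains. The path is the one discussed in this thread.
weights_dir = Path("out/models/0_baseline_LSTM/nstates_10/nep_100/rep_0/train_weights")
if weights_dir.exists():
    shutil.rmtree(weights_dir)
# Then run the p2a_metrics_files target as originally written (no forced rerun).
```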

Collaborator Author


Oh yeah, good point! I think this makes sense; I am going through and testing it to make sure it works. One odd behavior that I am running into: when I go through and delete the weights out/models/0_baseline_LSTM/nstates_10/nep_100/rep_0/train_weights and then rerun p2a_metrics_files using the original set of snakemake commands (prepped.npz, touch, and rerun-incomplete), the pipeline produces a new train_weights file for all model ids except 0_baseline_LSTM. Even if I delete all the model output (everything within out/models/0_baseline_LSTM/nstates_10), I still can't get the model to retrain.

Collaborator


Strange! So you don't have any outputs at all for 0_baseline_LSTM/nstates_10?

@@ -29,7 +30,7 @@ rule as_run_config:

rule prep_io_data:
input:
"../../../out/well_obs_io.zarr",
"../../../out/"+site_set+"_io.zarr",
Collaborator


Maybe personal preference, but I think this is a little bit cleaner: use Python's string formatting. That way you don't have to open and close the quotes, and I think it's a little less error-prone since it's harder to accidentally add in an extra space.

Suggested change
"../../../out/"+site_set+"_io.zarr",
f"../../../out/{site_set}_io.zarr",
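The two spellings build the same string; the f-string just avoids the quote juggling. A quick check:

```python
site_set = "well_obs"  # value comes from the Snakemake config

# Concatenation (as in the PR) vs. f-string (as suggested)
concatenated = "../../../out/" + site_set + "_io.zarr"
formatted = f"../../../out/{site_set}_io.zarr"

assert concatenated == formatted  # identical result, fewer quote boundaries
```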

@@ -100,7 +101,7 @@ rule make_predictions:
input:
"{outdir}/prepped.npz",
"{outdir}/nstates_{nstates}/nep_{epochs}/rep_{rep}/train_weights/",
"../../../out/well_obs_io.zarr",
"../../../out/"+site_set+"_io.zarr",
Collaborator


Suggested change
"../../../out/"+site_set+"_io.zarr",
f"../../../out/{site_set}_io.zarr",

@@ -174,7 +175,7 @@ def get_grp_arg(wildcards):

rule combine_metrics:
input:
"../../../out/well_obs_io.zarr",
"../../../out/"+site_set+"_io.zarr",
Collaborator


Suggested change
"../../../out/"+site_set+"_io.zarr",
f"../../../out/{site_set}_io.zarr",

rule make_obs_preds_plots:
input:
pred_file="{outdir}/nstates_{nstates}/nep_{epochs}/rep_{rep}/preds.feather",
obs_file="../../../out/well_obs_targets.zarr",
obs_file="../../../out/"+site_set+"_io.zarr",
Collaborator


Suggested change
obs_file="../../../out/"+site_set+"_io.zarr",
obs_file=f"../../../out/{site_set}_io.zarr",

@lekoenig lekoenig linked an issue Aug 11, 2022 that may be closed by this pull request
@galengorski galengorski requested a review from jsadler2 August 16, 2022 17:06
@galengorski galengorski merged commit 13fa05f into USGS-R:main Aug 16, 2022
@galengorski galengorski deleted the med_obs_model branch August 16, 2022 17:14
Development

Successfully merging this pull request may close these issues:

  • Run model with medium observed sites
  • Evaluate the utility of adding more sites