<p>Lake temperature is an important environmental metric for understanding habitat suitability for many freshwater species and is especially useful when temperatures are predicted throughout the water column (known as temperature profiles). In this data release, multiple modeling approaches were used to generate predictions of daily temperature profiles for thousands of lakes in the Midwest. <br/> <br/> Predictions were generated using two modeling frameworks: a machine learning model (specifically an entity-aware long short-term memory or EA-LSTM model; Kratzert et al., 2019) and a process-based model (specifically the General Lake Model or GLM; Hipsey et al., 2019). Both the EA-LSTM and GLM frameworks were used to generate lake temperature predictions in the contemporary period (1979-04-12 to 2022-04-11 for EA-LSTM and 1980-01-01 to 2021-12-31 for GLM; the date ranges differ because of model spin-up/spin-down configurations) using the North American Land Data Assimilation System (NLDAS; Mitchell et al., 2004) as meteorological drivers. In addition, GLM was used to generate lake temperature predictions under future climate scenarios (covering 1981-2000, 2040-2059, and 2080-2099) using six dynamically downscaled Global Climate Models (GCMs; Notaro et al., 2018) as meteorological drivers. Appropriate application of the six GCMs depends on the use case and is left to the user to determine. For an example of a similar analysis in the Midwest and Great Lakes region using 31 GCMs, see Byun and Hamlet (2018). <br/> <br/> The modeling frameworks and driver datasets have slightly different footprints and input data requirements. As a result, some lakes do not meet the criteria for all three modeling approaches (EA-LSTM driven by NLDAS, GLM driven by NLDAS, and GLM driven by GCMs), so the number of lakes differs among the outputs (noted in the file descriptions below). 
The input data requirements for lakes to be included in the EA-LSTM predictions are lake latitude, longitude, elevation, and surface area, plus NLDAS drivers at the lake's location. All 62,966 lakes included in this data release met these requirements. The input data requirements for lakes to be included in the contemporary GLM NLDAS-driven predictions are lake location (within one of the following 11 states: North Dakota, South Dakota, Iowa, Michigan, Indiana, Illinois, Wisconsin, Minnesota, Missouri, Arkansas, and Ohio), latitude, longitude, maximum depth (though more detailed hypsography was used where available), surface area, and a clarity estimate, plus NLDAS drivers at the lake's location. Of the lakes included in this data release, 12,688 met these requirements. The input data requirements for lakes to be included in the future climate scenario GCM-driven predictions are the same as for the contemporary GLM predictions, except that GCM drivers at the lake's location are required in place of NLDAS drivers. Of the lakes included in this data release, 11,715 met these requirements. <br/> <br/> This data release includes the following files:</p> <ol> <li><b>lake_locations.zip</b>: shapefiles with the centroid of each lake (62,966 lakes)</li> <li><b>lake_metadata.csv</b>: metadata for each lake with predictions available (62,966 lakes)</li> <li><b>lake_id_crosswalk.csv</b>: mapping between the lake identifiers used in this data release and those used by state and other organization systems</li> <li><b>lake_hypsography.csv</b>: lake-specific area-depth relationships (13,785 lakes)</li> <li><b>lake_temperature_observations.zip</b>: temperature observational data used in training and/or evaluation (8,760 lakes)</li> <li><b>meteorological_inputs_GCM.zip</b>: meteorological input data for future climate scenarios, zipped NetCDF files. 
One NetCDF file per climate model (see the "lake_metadata.csv" file for how to map the lakes to the cells in these NetCDF files).</li> <li><b>meteorological_inputs_NLDAS_{GROUP}.zip</b>: meteorological input data for the contemporary period organized into grids, groups of zipped CSV files (see the "lake_metadata.csv" file for how to map the lakes to these files).</li> <li><b>lake_temp_preds_EALSTM_NLDAS_AR-MN.zip</b>: daily lake temperature profiles for the contemporary period generated by the EA-LSTM model. The zip folder contains a NetCDF file for each of the following states: AR, IA, IL, IN, KS, KY, LA, MI, and MN. Includes data for 33,646 lakes across these 9 states.</li> <li><b>lake_temp_preds_EALSTM_NLDAS_MO-WY.zip</b>: daily lake temperature profiles for the contemporary period generated by the EA-LSTM model. The zip folder contains a NetCDF file for each of the following states: MO, MS, MT, ND, NE, OH, OK, SD, TN, TX, WI, and WY. Includes data for 29,320 lakes across these 12 states.</li> <li><b>lake_temp_preds_GLM_NLDAS.zip</b>: daily lake temperature profiles for the contemporary period generated by GLM. The zip folder contains a NetCDF file for each of the following states: AR, IA, IL, IN, MI, MN, MO, ND, OH, SD, and WI. Includes data for 12,688 lakes across these 11 states.</li> <li><b>lake_temp_preds_GLM_GCM_{CLIMATE MODEL}.zip</b>: daily lake temperature profiles for future climate scenarios generated by GLM, one zip file per climate model. Each zip file contains a NetCDF file for each of the following states: AR, IA, IL, IN, MI, MN, MO, ND, OH, SD, and WI. 
Includes data for 11,715 lakes across these 11 states.</li> <li><b>lake_temp_metrics_GLM_NLDAS.feather</b>: annual lake temperature metrics for the contemporary period derived from daily predictions generated by GLM (12,688 lakes)</li> <li><b>lake_temp_metrics_GLM_GCM.feather</b>: annual lake temperature metrics for future climate scenarios derived from daily predictions generated by GLM (11,715 lakes)</li> <li><b>lake_temp_model_evaluation_metrics.csv</b>: overall and seasonal evaluation metrics for each combination of model and meteorological driver dataset</li> <li><b>extract_output_from_netCDFs.R</b>: an R script showing examples of how to pull lake temperature predictions and meteorological data from the NetCDF files</li> <li><b>netCDF_extract_utils.R</b>: an R script containing functions used in "extract_output_from_netCDFs.R"</li> <li><b>lake_locations.png</b>: a figure showing the centroids for all 62,966 lakes included in this data release</li> </ol> <p>This work was completed with funding support from the Midwest Climate Adaptation Science Center (MW CASC) and as part of the USGS project on Predictive Understanding of Multiscale Processes (PUMP), an element of the Integrated Water Prediction Program, supported by the Water Availability and Use Science Program to advance multi-scale, integrated modeling capabilities to address water resource issues. Access to computing facilities was provided by USGS Advanced Research Computing, USGS Tallgrass Supercomputer (<a href="https://doi.org/10.5066/F7D798MJ">doi.org/10.5066/F7D798MJ</a>).</p>