Black Rail Species Habitat Model (Continuous) - CWHR [ds3250]

Metadata Updated: December 23, 2025

The Range and Distribution Mapping and Analysis Project (RADMAP) in the California Department of Fish and Wildlife’s (CDFW) Biogeographic Data Branch (BDB) develops and maintains spatial models for use in conservation decision making, including species range maps and species habitat models. RADMAP is building a library of vetted species range maps and habitat models within California for use by CDFW staff and partners.
This species habitat model (SHM) is a continuous layer depicting the species’ predicted habitat associations within each cell, represented as a value between 0 and 1. Values closer to 1 indicate a higher relative probability that the habitat conditions present in that cell support the species. The focal taxon may or may not actually occur in areas with high predicted values; habitat may be suitable but unoccupied, particularly for taxa with small or declining populations or limited mobility. Where presence data are scarce, a model may not reflect all habitats used by a taxon, which limits the environmental space the SHM represents. Users should refer to the validation metrics and consider the level of uncertainty associated with the model when interpreting model outputs. Areas with high relative predicted values are where the habitat is most likely to support the species, and they may be prioritized when locating survey or monitoring sites for scientific studies aiming to conserve and protect focal taxa.
Occurrence records were obtained from various sources, as indicated below. To reduce sampling bias and avoid model overfitting, which can reduce model applicability to unsampled areas, we excluded spatially autocorrelated presence records. Owing to the breadth of our study region, we quantified topographical heterogeneity using a digital elevation model and filtered occurrence records based on the resulting raster at three distinct distances.
Occurrences in areas of high topographical heterogeneity were filtered at shorter Euclidean distances.
For candidate environmental covariates, we computed a Pearson correlation coefficient matrix to assess the strength of association among variables. Among variables that were highly correlated (coefficient greater than or equal to 0.7), the most ecologically relevant were kept and the others were removed from further analyses. All covariates were continuous and formatted at a resolution of 30 m.
Potential habitat use was estimated using a maximum entropy approach (Maxent), implemented in R, that relied on presence data and comparisons between environmental covariate values at presence localities and those at randomly selected background sites (Phillips et al. 2006). To demarcate the specific geographical area used for model calibration, background locations were selected via local adaptive convex-hull polygons based on the species’ known dispersal and movement limitations. This allowed RADMAP to exclude suitable habitat that remains uncolonized, potentially because of dispersal barriers or inhibitory biotic interactions; it also precluded overfitting models to environmental conditions immediately adjacent to occurrence records, thus improving predictive performance. For each model, we randomly sampled more than 10,000 background locations.
To improve model fit, we tuned five feature-class combinations and five regularization multipliers (integers from 1 to 5) rather than using default Maxent settings. We divided data into training and test groups using geographically structured k-fold cross validation (k = 4) to reduce overfitting to environmental conditions among spatial partitions. A total of 25 models (5 feature-class combinations × 5 regularization multipliers) were run during this phase of model development. Additionally, 25 full models were run using all available presence data.
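The covariate correlation screen described above can be illustrated with a short sketch. This is not RADMAP's code (the description states the workflow was implemented in R); it is a minimal Python example under stated assumptions: the covariate names, sample values, and ecological-relevance ranking are hypothetical, and comparing the absolute value of the coefficient against 0.7 is an assumption, since the description gives only the 0.7 cutoff.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def drop_collinear(covariates, priority, threshold=0.7):
    """Walk covariates in priority order and keep one only if its |r| with
    every covariate already kept is below the threshold."""
    kept = []
    for name in priority:
        if all(abs(pearson(covariates[name], covariates[k])) < threshold
               for k in kept):
            kept.append(name)
    return kept


# Hypothetical covariate values sampled at the same five cells.
covariates = {
    "elevation": [1.0, 2.0, 3.0, 4.0, 5.0],
    "mean_temp": [9.8, 8.1, 6.2, 3.9, 2.0],      # strongly tied to elevation
    "wetland_cover": [0.4, 0.1, 0.5, 0.2, 0.3],
}
# Assumed ranking of ecological relevance, most relevant first.
priority = ["wetland_cover", "elevation", "mean_temp"]

selected = drop_collinear(covariates, priority)
print(selected)  # ['wetland_cover', 'elevation']
```

Ranking the covariates first makes the outcome deterministic: when two variables are collinear, the one judged more ecologically relevant is always the one retained.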
Models were evaluated with multiple statistics. First, we calculated the test omission error rate (OER) and true skill statistic (TSS), which are threshold-dependent metrics. Second, we generated receiver-operating-characteristic curves and assessed model performance using area-under-the-curve (AUC) analyses for test data, a threshold-independent metric. AUC was averaged across the k folds used in the analysis, and the difference between training and test AUC (AUCdiff) was used to quantify overfitting (i.e., lower values indicate better fits). The training model with the lowest OER and highest AUC and TSS values for both test and training models was considered the top model. We calculated the percent contribution of each predictor variable to the top model to identify the explicit role of each in influencing the distribution of a species. Models were extrapolated to the taxon’s range to predict across all potentially occupied areas within California.
Results of the Maxent model are included in the PDF attachment, including top model review score, validation metrics, model output details, covariate response curves, percent contribution of covariates to the top model, and a full list of covariates included in the model. The top model results are linked here: https://nrm.dfg.ca.gov/FileHandler.ashx?DocumentID=239507.
Each focal taxon’s location data were extracted (when applicable) and collated from the following data sources. BIOS datasets are bracketed with their “ds” numbers and can be located on CDFW’s BIOS viewer: https://wildlife.ca.gov/Data/BIOS. Data sources: California Natural Diversity Database, Terrestrial Species Monitoring [ds2826], North American Bat Monitoring Data Portal, VertNet, Breeding Bird Survey, Wildlife Insights, eBird, iNaturalist, and other available CDFW or partner data.
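As a rough illustration of the evaluation metrics named above (OER, TSS, AUC, and AUCdiff), the following Python sketch computes each from predicted values. It is not RADMAP's code (the description states the workflow was implemented in R), and it rests on assumptions: the scores are invented, the 0.5 threshold is hypothetical because the description does not state how thresholds were chosen, and background sites are treated as pseudo-absences when computing specificity.

```python
def omission_error_rate(presence_scores, threshold):
    """Share of test presences predicted below the threshold (false negatives)."""
    missed = sum(s < threshold for s in presence_scores)
    return missed / len(presence_scores)


def true_skill_statistic(presence_scores, background_scores, threshold):
    """TSS = sensitivity + specificity - 1, with background as pseudo-absence."""
    sensitivity = sum(s >= threshold for s in presence_scores) / len(presence_scores)
    specificity = sum(s < threshold for s in background_scores) / len(background_scores)
    return sensitivity + specificity - 1


def auc(presence_scores, background_scores):
    """Probability that a random presence outscores a random background site
    (ties count as half), which equals the area under the ROC curve."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


# Invented predicted values for one cross-validation fold.
train_presence = [0.95, 0.85, 0.75, 0.6]
test_presence = [0.9, 0.8, 0.6, 0.3]
background = [0.7, 0.4, 0.2, 0.1]
threshold = 0.5  # hypothetical threshold

oer = omission_error_rate(test_presence, threshold)
tss = true_skill_statistic(test_presence, background, threshold)
auc_test = auc(test_presence, background)
auc_diff = auc(train_presence, background) - auc_test

print(oer)       # 0.25
print(tss)       # 0.5
print(auc_test)  # 0.8125
print(auc_diff)  # 0.125
```

In a k-fold setting these values would be computed per fold and then averaged; a large AUCdiff signals that the model fits its training partition much better than the withheld partition.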
Please refer to the Range Map and Species Habitat Model Use Case Guidance document for guidance on how best to interpret RADMAP outputs, including range maps, continuous SHMs, and categorical SHMs. Specifically, users should follow these guidelines to determine which products to use for conservation, management, and policy decision-making use cases: https://nrm.dfg.ca.gov/FileHandler.ashx?DocumentID=222269.

Access & Use Information

Public: This dataset is intended for public access and use. Non-Federal: This dataset is covered by different Terms of Use than Data.gov.

Dates

Metadata Created Date	December 23, 2025
Metadata Updated Date	December 23, 2025

Additional Metadata

Resource Type	Dataset
Publisher	California Department of Fish and Wildlife
Maintainer	BIOS_Admin
Identifier	226a0eb7-321a-4584-8a25-a1b4290a55e6
Data First Published	2025-12-10T17:46:52.000Z
Data Last Modified	2025-12-10T17:54:02.000Z
Category	Natural Resources
Public Access Level	public
Metadata Context	https://project-open-data.cio.gov/v1.1/schema/catalog.jsonld
Schema Version	https://project-open-data.cio.gov/v1.1/schema
Catalog Describedby	https://project-open-data.cio.gov/v1.1/schema/catalog.json
Harvest Object Id	7c24c31d-4726-492a-bb59-7cb89ce41017
Harvest Source Id	3ba8a0c1-5dc2-4897-940f-81922d3cf8bc
Harvest Source Title	State of California
License	http://www.opendefinition.org/licenses/cc-by
Source Datajson Identifier	True
Source Hash	963dc4165a75ac55ccff94f95ca5f53846046266d15dd84cd6434f2919b151b5
Source Schema Version	1.1
