This dataset contains all data and code necessary to reproduce the analysis described under the heading "Experiment 3" in the manuscript:
Taliercio, E., Eickholt, D., Read, Q. D., Carter, T., Waldeck, N., & Fallen, B. (2023). Parental choice and seed size impact the uprightness of progeny from interspecific Glycine hybridizations. Crop Science. https://doi.org/10.1002/csc2.21015
The attached files are:
- G_max_G_soja_seedweight_seedcolor_analysis.Rmd: RMarkdown notebook containing all analysis code. The CSV data files should be placed in a subdirectory called "data" within the working directory from which the notebook is rendered.
- G_max_G_soja_seedweight_seedcolor_analysis.html: Rendered HTML output from the RMarkdown notebook, including figures, tables, and explanatory text.
- counts_seedwt.csv: CSV file containing the number of progeny selected and average 100-seed weight data for each combination of cross, size class, and replicate. Columns are:
- F3_location: text identifier of F3 nursery location, either "CLA" or "FF"
- plot: numeric ID of plot
- pop: numeric ID of population
- max: name of G. max parent
- soja: name of G. soja parent
- F2_location: text identifier of F2 nursery location, either "Caswell" or "Hugo"
- n_planted: number of seeds planted (raw)
- n_selected: number of progeny selected
- size_ordered: seed size class, to be converted to an ordered factor
- size_combined: seed size class aggregated to fewer unique levels
- ave_100sw: average 100-seed weight for the given size class
- n_planted_trials: number of seeds planted rounded to nearest integer
- seedcolor.csv: CSV file with additional data on number of seeds of each color by population. Columns are:
- cross: text identifier of cross
- line: text identifier of line
- light: number of light seeds
- mid: number of mid-green seeds
- brown: number of brown seeds
- dark: number of dark or black seeds
- population: identifier of population type (F2 derived or selected)
- max: name of G. max parent
- n_total: sum of the light, mid, brown, and dark columns
- soja: name of G. soja parent
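The column descriptions above imply a few checks that can be run before rendering the notebook. The following is a minimal Python sketch, not part of the authors' R notebook. It assumes the CSV headers match the column names listed here exactly and that counts are stored as plain numbers. The sample rows are invented purely for illustration and are not real data.

```python
import csv
import io

# Column names as listed in this description (assumed to match the CSV headers).
COUNTS_COLUMNS = [
    "F3_location", "plot", "pop", "max", "soja", "F2_location",
    "n_planted", "n_selected", "size_ordered", "size_combined",
    "ave_100sw", "n_planted_trials",
]
SEEDCOLOR_COLUMNS = [
    "cross", "line", "light", "mid", "brown", "dark",
    "population", "max", "n_total", "soja",
]


def missing_columns(header, expected):
    """Return the expected column names that are absent from the header."""
    return [name for name in expected if name not in header]


def n_total_mismatches(rows):
    """Return indices of rows where n_total differs from light + mid + brown + dark."""
    bad = []
    for i, row in enumerate(rows):
        total = sum(float(row[c]) for c in ("light", "mid", "brown", "dark"))
        if total != float(row["n_total"]):
            bad.append(i)
    return bad


# Hypothetical rows, invented for illustration only (NOT real data).
# The second row is deliberately inconsistent: 5 + 0 + 0 + 1 != 7.
SAMPLE = io.StringIO(
    "cross,line,light,mid,brown,dark,population,max,n_total,soja\n"
    "X1,L1,3,2,1,4,F2 derived,maxA,10,sojaA\n"
    "X1,L2,5,0,0,1,selected,maxA,7,sojaA\n"
)

reader = csv.DictReader(SAMPLE)
sample_rows = list(reader)
header_gaps = missing_columns(reader.fieldnames, SEEDCOLOR_COLUMNS)
mismatches = n_total_mismatches(sample_rows)
print(header_gaps, mismatches)
```

To run the same checks on the real files, replace the in-memory sample with open("data/seedcolor.csv", newline="") and use COUNTS_COLUMNS in the same way for data/counts_seedwt.csv.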
The data processing and analysis pipeline in the RMarkdown notebook includes:
- Importing the data (a slightly cleaned version is provided)
- Creating boxplots of proportion selected by cross, nursery location, and size class
- Fitting a logistic generalized linear mixed model (GLMM) to estimate the probability of selection as a function of parent, 100-seed weight, and their interactions
- Extracting and plotting random effect estimates from the model
- Calculating and plotting estimated marginal means from the model
- Taking contrasts between pairs of estimated marginal means and trends
- Calculating Bayes factors associated with the contrasts
- Generating figures and tables for all above results
- Additional seed color analysis: importing the data (a slightly cleaned version is provided)
- Additional seed color analysis: drawing exploratory bar plot
- Additional seed color analysis: fitting a multinomial GLM that models the proportion of seeds with each color as a function of population
- Additional seed color analysis: generating expected value predictions from the GLM and taking contrasts
- Additional seed color analysis: creating figures and tables for model results
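To make the first modeling steps above concrete, here is a minimal Python sketch of the quantities a logistic model works with. It is not the authors' R code. It assumes the proportion selected is n_selected divided by n_planted_trials, and that boxplot groups are defined by the max, soja, F3_location, and size_combined columns; the notebook may use different choices. The records and the size class labels "small" and "large" are hypothetical and not real data.

```python
import math
from collections import defaultdict

# Hypothetical records, invented for illustration only (NOT real data).
# Keys mirror column names from counts_seedwt.csv; size labels are made up.
records = [
    {"max": "maxA", "soja": "sojaA", "F3_location": "CLA",
     "size_combined": "small", "n_selected": 2, "n_planted_trials": 20},
    {"max": "maxA", "soja": "sojaA", "F3_location": "CLA",
     "size_combined": "small", "n_selected": 4, "n_planted_trials": 20},
    {"max": "maxA", "soja": "sojaA", "F3_location": "FF",
     "size_combined": "large", "n_selected": 5, "n_planted_trials": 10},
]


def logit(p):
    """Map a probability in (0, 1) to the log-odds scale used by logistic models."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Map a log-odds value back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def proportions_by_group(rows):
    """Collect the proportion selected for each cross, location, and size class."""
    groups = defaultdict(list)
    for r in rows:
        key = (r["max"], r["soja"], r["F3_location"], r["size_combined"])
        # Assumed denominator: the integer number of trials.
        groups[key].append(r["n_selected"] / r["n_planted_trials"])
    return dict(groups)


grouped = proportions_by_group(records)
for key, props in grouped.items():
    print(key, props, [round(logit(p), 3) for p in props])
```

Each list in grouped holds the values that one box in a boxplot would summarize. A logistic model estimates effects on the logit scale, and inv_logit converts fitted values back to probabilities of selection.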
This research was funded by CRIS 6070-21220-069-00D, United Soybean Board Project # 2333-203-0101, and falls under National Program NP301.
Resources in this dataset:
Resource Title: RMarkdown document with all analysis code.
File Name: G_max_G_soja_seedweight_seedcolor_analysis.Rmd
Resource Title: Rendered HTML version of notebook.
File Name: G_max_G_soja_seedweight_seedcolor_analysis.html
Resource Title: Progeny counts and seed weight data.
File Name: counts_seedwt.csv
Resource Title: Seed color counts data.
File Name: seedcolor.csv