Frequent itemset mining (FIM), a technique used for finding patterns in consumer purchasing behavior, can be applied to data from large-scale biomonitoring studies to identify combinations of chemicals that frequently co-occur in people. As a proof of concept, we applied FIM to biomonitoring data from the National Health and Nutrition Examination Survey. In this way, we identified 90 chemical combinations consisting of relatively few chemicals that occur in at least 30% of the US population, as well as 3 super-combinations consisting of relatively many chemicals that occur in a small but non-negligible proportion of the US population. Thus, we have demonstrated a technique for narrowing a large number of possible chemical combinations down to a much smaller collection of prevalent chemical combinations.
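To illustrate the general idea of frequent itemset mining, the following is a minimal Apriori-style sketch in Python. It is not the implementation used in the associated publication. The function name `frequent_itemsets`, the chemical labels, and the toy records are all hypothetical. The sketch also assumes each person is already reduced to the set of chemicals counted as present, and it treats every person equally, ignoring any survey weighting.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for all itemsets with support >= min_support.

    transactions: iterable of sets; here, the chemicals present in one person.
    min_support:  minimum fraction of transactions that must contain the itemset.
    """
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    # Level 1 candidates: every individual item that appears anywhere.
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in items]

    result = {}
    k = 1
    while candidates:
        # Keep only the candidates that occur in enough transactions.
        level = {}
        for cand in candidates:
            support = sum(1 for t in transactions if cand <= t) / n
            if support >= min_support:
                level[cand] = support
        result.update(level)

        # Build size-(k+1) candidates by joining frequent size-k itemsets.
        k += 1
        joined = {a | b for a, b in combinations(level, 2) if len(a | b) == k}

        # Apriori pruning: every size-(k-1) subset must itself be frequent.
        candidates = [
            cand for cand in joined
            if all(frozenset(sub) in level for sub in combinations(cand, k - 1))
        ]
    return result


# Hypothetical toy data: five people, four made-up chemical labels.
people = [
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "C"},
    {"B", "C", "D"},
    {"A", "B", "C"},
]

result = frequent_itemsets(people, min_support=0.5)
for itemset, support in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), support)
```

In this toy example, the three single chemicals A, B, and C and each of their pairs meet the 50% threshold, while the triple A, B, C (present in 2 of 5 people) and anything containing D do not. The pruning step is what keeps the search tractable: a combination is only counted if all of its smaller sub-combinations were already found to be frequent.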
This dataset is associated with the following publication:
Kapraun, D.F., J.F. Wambaugh, R. Tornero-Velez, and R.W. Setzer. Identifying Prevalent Chemical Mixtures in the US Population. ENVIRONMENTAL HEALTH PERSPECTIVES. National Institute of Environmental Health Sciences (NIEHS), Research Triangle Park, NC, USA, 125(8): 1-16, (2017).