The environment and the chemicals to which we are exposed are incredibly complex, with over 100 million chemicals in the largest chemical registry and over 70,000 in household use alone. We are not even able to enumerate all chemically possible small molecules. Instead, all detectable molecules in complex samples can now be captured using high resolution mass spectrometry (HRMS). Non-target HRMS provides a “snapshot” of all chemicals present in a sample and allows for retrospective data analysis through digital archiving. However, while typical HRMS measurements yield tens of thousands of features, scientists are still unable to identify the vast majority of them, leading to critical bottlenecks in identification and data interpretation. Identifying the chemical unknowns in living organisms and our environment is essential for unravelling the causes of disease and toxicity, improving our understanding of biological processes and developing new strategies to counteract disease. For instance, the causes of Parkinson’s disease (PD), which affects more than 6.2 million people worldwide, remain largely unclear and are hypothesized to involve a complex combination of environmental and genetic factors. Recent studies indicate a strong connection between the gut microbiome and PD, yet over 60 % of significant metabolites in microbiome experiments are unknown. While non-target HRMS methods now provide a basis for identifying unknowns, true unknown elucidation based on HRMS remains extremely time consuming and, in many cases, a matter of luck. The key is to prioritize efforts towards the significant metabolites or potentially toxic substances responsible for observed effects, which involves reconciling highly complex samples with expert knowledge and careful validation.
We need to initiate a fundamental shift away from single-substance assessments and develop generic approaches that connect chemical and biological knowledge with signals observed in real samples, scalable to tens of thousands of chemicals, features and samples. This project, ECHIDNA (Environmental CHeminformatics to IDentify uNknown chemicals And their effects), will build computational methods suitable for investigating and elucidating unknowns and causes of effects using HRMS of small molecules. Computational and experimental developments will improve structure elucidation, including cheminformatics approaches as well as stable and dynamic isotope labelling of samples. New cheminformatics methods will improve our understanding of the fundamentals of HRMS and work towards the “holy grail” of a full Computer-Assisted Structure Elucidation (CASE) system for HRMS. A microbiome-PD cohort study will yield complex samples with patient/early stage/control information, allowing a discovery-based prioritization of potential neurotoxins with target and non-target HRMS. A combination of experimental bioassays and computational toxicity methods will help prioritize potential neurotoxins amongst these samples. The simple, well-understood biological systems E. coli and yeast will yield information on known and unknown metabolites, with sufficient material to allow detailed elucidation of unknowns using nuclear magnetic resonance spectroscopy. This project involves several internal partners within the Luxembourg Centre for Systems Biomedicine (LCSB): the Enzymology & Metabolism Group, the Eco-Systems Biology Group and the Bioinformatics Core. The three non-contracting partners are Friedrich-Schiller University Jena (Cheminformatics and Computational Metabolism), ETH Zurich (Institute for Molecular Systems Biology) and the Helmholtz Centre for Environmental Research (Effect-Directed Analysis).
Together, these partners form an exciting and talented consortium to tackle the formidable challenge of identifying unknown small molecules and their toxicological effects.