Maximum substructure searches of the ZINC20 database were used to extract common substructures from molecules with high strain energy and high heat of combustion, and these substructures provided domain knowledge on how to design high-energy-density hydrocarbon (HEDH) fuels. Notably, quadricyclane and syntin could be topologically assembled from these substructures, and the corresponding assembly schemes guided the design of 20 fuel molecules (ZD-1 to ZD-20). The fuel properties of these molecules were evaluated using group-contribution methods and density functional theory (DFT) calculations. ZD-6 stood out for its high volumetric net heat of combustion, high specific impulse, low melting point, and acceptable flash point. The synthetic complexity score (SCScore) of ZD-6, estimated with a neural network model, was close to that of syntin, indicating that the two molecules are of comparable synthetic complexity. This work not only identifies ZD-6 as a potential HEDH fuel but also illustrates the advantage of learning design strategies from data, both for understanding structure-performance relationships and for accelerating the development of novel HEDH fuels.
Keywords: density functional theory; high-energy density; high-throughput screening; hydrocarbon fuels; materials design.