ISSN print edition: 0366-6352
ISSN electronic edition: 1336-9075
Registr. No.: MK SR 9/7
Published monthly
Machine learning prediction of organic moieties from the IR spectra, enhanced by additionally using the derivative IR data
Maurycy Krzyżanowski and Grzegorz Matyszczak
Faculty of Chemistry, Warsaw University of Technology, Warsaw, Poland
E-mail: bk.maurycy@gmail.com
Received: 14 September 2023 Accepted: 2 January 2024
Abstract: Infrared spectroscopy is a crucial analytical tool in organic chemistry, but interpreting IR data can be challenging. This study provides a comprehensive analysis of five machine learning models, namely logistic regression, KNN (k-nearest neighbors), SVM (support vector machine), random forest, and MLP (multilayer perceptron), and of their effectiveness in interpreting IR spectra. The simple KNN model outperformed the more complex SVM model in both execution time and F1 score, demonstrating the potential of simpler models for interpreting IR data. Combining the original spectra with their corresponding derivatives improved the performance of all models with a minimal increase in execution time. Denoising of the IR data was investigated but did not significantly improve performance. Although the MLP model performed better than the KNN model, its execution time was substantially longer. Ultimately, KNN is recommended for rapid results with minimal performance compromise, while MLP is suggested for projects that prioritize accuracy despite the longer execution time.
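The idea of combining each spectrum with its derivative before classification can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the derivative order, the differentiation method, or the model settings, so the finite-difference scheme, the function names (`first_derivative`, `augment`, `knn_predict`), the value of `k`, and the toy five-point "spectra" below are all assumptions made purely for illustration.

```python
from collections import Counter
from math import dist


def first_derivative(spectrum, step=1.0):
    """Finite-difference derivative (assumed scheme, not from the paper).

    Central differences are used in the interior and one-sided
    differences at the two ends; `step` is the assumed uniform
    spacing between neighbouring points.
    """
    n = len(spectrum)
    if n < 2:
        raise ValueError("need at least two points to differentiate")
    deriv = [0.0] * n
    deriv[0] = (spectrum[1] - spectrum[0]) / step
    deriv[-1] = (spectrum[-1] - spectrum[-2]) / step
    for i in range(1, n - 1):
        deriv[i] = (spectrum[i + 1] - spectrum[i - 1]) / (2 * step)
    return deriv


def augment(spectrum, step=1.0):
    """Concatenate the original spectrum with its first derivative."""
    return list(spectrum) + first_derivative(spectrum, step)


def knn_predict(train_X, train_y, query, k=3):
    """Majority vote among the k nearest training vectors (Euclidean)."""
    ranked = sorted(zip(train_X, train_y), key=lambda pair: dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): label 1 = band near index 1, label 0 = band near index 3.
raw_spectra = [
    [0.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.9, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.1, 0.9, 0.0],
]
labels = [1, 1, 0, 0]

train_X = [augment(s) for s in raw_spectra]
query = augment([0.0, 0.8, 0.0, 0.0, 0.0])
prediction = knn_predict(train_X, labels, query, k=3)
```

Concatenation doubles the feature-vector length, which is one plausible reason the reported increase in execution time is small: the model itself is unchanged and only its input grows. In practice a library classifier and a smoothing derivative filter would likely replace these hand-written helpers.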
Keywords: Machine learning; Infrared spectroscopy; Data preparation; Spectrum derivative
Full paper is available at www.springerlink.com.
DOI: 10.1007/s11696-024-03301-z
Chemical Papers 78 (5) 3149–3173 (2024)