CCSP 2.0: An Open-Source Jupyter Tool for the Prediction of Ion Mobility Collision Cross Sections in Metabolomics
Watson, Chandler Avery
Tandem mass spectrometric methods revolutionized the chemical identification landscape, allowing serums and molecules to be separated in two or more dimensions. Ion mobility mass spectrometry workflows combined with liquid or gas chromatographic separation have continued to advance chemical identification, increasing both the number of identifications and the confidence in them. Such advancements have also given rise to a new molecular descriptor, the Collision Cross Section (CCS), sparking heavy interest in the analytical and computational chemistry communities in compiling these values for known molecules. The main shortcoming has been predicting the CCS value for new molecules such as Poly-Fluorinated Alkyl Substances. Preliminary prediction software has shown that predicting CCS values for this molecular class is possible, but doing so can be expensive in time, computation, and money because of software licenses and genetic algorithms. This work combines open-source Python modules (NumPy, Mordred, Pandas, etc.) to construct an alternative workflow that is completely free and capable of running on a mid-specification laptop within half an hour. Using the M-H dataset and the combined M+H and M-H dataset taken from the McLean CCS Compendium, median prediction errors of 2.07% and 1.84%, respectively, were obtained with Support Vector Regression within 5 minutes on a mid-specification laptop, satisfying the 2.50% benchmark. This result illustrates the power and versatility of the workflow, which produces low errors with datasets as large as 1300+ molecules and as small as 37. The script can be distributed on file-sharing sites such as GitHub, where other users may customize the free source code to fit their experimental needs.
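The median prediction error quoted above can be illustrated with a minimal sketch. It assumes the error is the median of the absolute relative differences between predicted and measured CCS values, expressed as a percentage; the abstract does not state the exact definition. The function names and the CCS values are hypothetical placeholders and are not taken from the thesis or from the compendium.

```python
from statistics import median


def median_percent_error(predicted, measured):
    """Median of |predicted - measured| / measured, as a percentage.

    Assumed definition; the thesis may compute its error differently.
    """
    if len(predicted) != len(measured) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    return median(abs(p - m) / m * 100.0 for p, m in zip(predicted, measured))


def meets_benchmark(error_percent, threshold=2.50):
    """True when the error is at or below the 2.50% benchmark from the abstract."""
    return error_percent <= threshold


# Placeholder CCS values (illustrative only, not real measurements)
measured = [150.0, 200.0, 250.0]
predicted = [153.0, 198.0, 255.0]

err = median_percent_error(predicted, measured)
print(f"median error: {err:.2f}%, within benchmark: {meets_benchmark(err)}")
```

In a full workflow the `predicted` list would come from a trained regression model, such as the Support Vector Regression mentioned above, rather than from hard-coded numbers.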