CIBION   24492
CENTRO DE INVESTIGACIONES EN BIONANOCIENCIAS "ELIZABETH JARES ERIJMAN"
Executing Unit - UE
Conferences and scientific meetings
Title:
Improving identification of compounds in metabolomic studies through correlation and statistics
Author(s):
PELCZER, ISTVÁN; HOIJEMBERG, PABLO ARIEL
Location:
Chicago, IL
Meeting:
Conference; 2017 Pittsburgh Conference on Analytical Chemistry & Applied Spectroscopy (Pittcon); 2017
Organizing institution:
Pittcon
Abstract:
In an untargeted metabolomic study, the search for biomarker molecules serves to answer many questions, which requires learning the identity of these metabolites. NMR analysis of biofluids normally detects several dozen metabolites in measurable quantities, and the spectra can show a few hundred peaks. Compounds are normally identified with the aid of commercial software packages containing their own databases, by literature search, and/or by querying public databases with lists of chemical shifts. The input for the database query can be improved using STOCSY (statistical total correlation spectroscopy), which is attainable given the amount of data already collected for the multivariate data analysis. Despite its usefulness, STOCSY analysis is tedious and cumbersome: a trace is normally obtained by selecting a driver peak and finding the peaks highly correlated to it, and this is repeated trace by trace over all peaks of interest. Here we present a methodology developed to reduce the analysis time by increasing information recovery through further statistical analysis of the "information-redundant" STOCSY correlation matrix (all peaks included), yielding lists of peaks for database queries that produce more reliable hits during identification. The methodology adds an automated step after the correlation matrix calculation to group traces from different driver peaks based on their similarity. Because STOCSY suffers from overlapped peaks, good alternatives to 1D 1H spectra are 1D 13C spectra, 1D projections from 2D 1H J-resolved spectra, and small-size data matrices such as those from a "spectrum-to-spreadsheet" procedure. Examples of the application of this methodology to biological samples and synthetic mixtures will be addressed.
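The idea of grouping STOCSY traces by similarity can be illustrated with a minimal sketch. The abstract does not state which similarity measure or grouping algorithm is used, so the choices here (Pearson correlation between traces, average-linkage hierarchical clustering, a distance cutoff of 0.2) are assumptions made only for illustration, as are the function names `stocsy_matrix` and `group_traces` and the synthetic two-compound data.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist


def stocsy_matrix(X):
    """Pearson correlation between every pair of peak columns.

    X has shape (n_samples, n_peaks); row i of the result is the
    STOCSY trace obtained with peak i as the driver peak.
    """
    return np.corrcoef(X, rowvar=False)


def group_traces(C, threshold=0.2):
    """Group traces (rows of C) by similarity.

    Assumption: similarity is measured as correlation distance between
    traces, and groups come from average-linkage hierarchical clustering
    cut at `threshold`. The abstract does not specify these choices.
    """
    distances = pdist(C, metric="correlation")
    tree = linkage(distances, method="average")
    return fcluster(tree, t=threshold, criterion="distance")


# Synthetic mixture: peaks 0-2 scale with compound A, peaks 3-4 with compound B.
rng = np.random.default_rng(0)
n_samples = 60
conc_a = rng.uniform(1, 10, n_samples)
conc_b = rng.uniform(1, 10, n_samples)
X = np.column_stack(
    [1.0 * conc_a, 0.5 * conc_a, 2.0 * conc_a, 1.5 * conc_b, 0.8 * conc_b]
) + rng.normal(0, 0.05, (n_samples, 5))

C = stocsy_matrix(X)
labels = group_traces(C)

# Each group of peak indices is a candidate list for one database query.
peak_lists = {g: np.flatnonzero(labels == g).tolist() for g in np.unique(labels)}
print(peak_lists)
```

Each resulting group collects the driver peaks whose traces look alike, so the peaks in a group can be submitted together as one chemical-shift list instead of inspecting traces one by one. Real spectra with overlapped peaks would likely need a more careful choice of distance and cutoff than this toy example.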