RESEARCHERS
HOIJEMBERG Pablo Ariel
Conferences and scientific meetings
Title:
Making statistical sense of the 13C-jungle of changing biological mixtures; careful peak deconvolution, STOCSY coefficient-based analysis, component ID, and spectra-to-spreadsheet data reduction
Author(s):
HOIJEMBERG, PABLO ARIEL; PELCZER, ISTVÁN
Location:
Lucca (Barga)
Meeting:
Conference; Gordon Research Conference on Computational Aspects of Biomolecular NMR; 2015
Organizing institution:
Gordon Research Conferences
Abstract:
13C-NMR is becoming a competitive alternative to conventional 1H-NMR based mixture analysis, assisted by a variety of statistical treatments. The emergence of 13C-detection optimized cryoprobes offers significantly elevated sensitivity, a critical issue for real-life mixtures.
13C chemical shifts are closely correlated to the molecular structure, and they are not nearly as sensitive to environmental conditions as those for 1H. In 13C-NMR overlap is rare, due to the much larger dispersion, and the peaks are generally singlets. It also matters that no solvent suppression is necessary, which is a relief when biological samples are concerned, and data processing is usually easy and straightforward. All this promises that 13C-NMR can be a good tool for the analysis of very complex mixtures of various origins -- biological or other -- including statistical analysis of all kinds.
However, the narrow singlet resonances, the relative noisiness, and the large digital size of the spectra come with some practical difficulties for analysis. Any small deviation of related peaks across the sample set, as little as the linewidth, leads to irregular variations in the statistics, which makes careful peak alignment essential. The inherently high dynamic range of NMR is also somewhat hampered by the noisiness of 13C spectra, especially relative to those of 1H. In addition, the empty noise segments are usually quite extensive relative to the footprint occupied by the peaks.
We have devised tools and built a general protocol that fit the nature and the conditions of mixture analysis by 13C-NMR, ultimately arriving at a protocol that converts most of the analysis to spreadsheet manipulations. The first step is to collect quantitative or semi-quantitative 13C spectra for the set of samples. The spectra are then automatically peak-picked, which may include inherent curve fitting. The spectra may then be subjected to a detailed visual inspection, for example to weed out potential problems of partial overlap and to add missing small peaks, followed by careful peak alignment. The peaks (including their intensity/integral) can then be passed to regular multivariate analysis, thus ignoring peak-free noisy segments.
13C-NMR is quite suitable for identifying selected components based on database and spectrum prediction tools. Our protocol for component ID relies on STOCSY analysis of the peaks, with the software package also containing efficient peak alignment tools (courtesy of K. Veselkov and J.K. Nicholson, Imperial College, London, UK). The correlation coefficient plot is reduced to the numbered peaks only; the rest of the spectral regions are discarded. In the next step we organize the peaks/coefficients according to the hierarchical clustering analysis (HCA) order along the orthogonal (y) dimension and then take horizontal traces. The peaks with the highest correlation coefficients in these traces identify a compound, which is then cross-referenced with database and/or spectrum prediction information. A case study will be presented for analyzing 13C-NMR data of honey samples.
Both protocols rely heavily on high-quality peak identification (curve fitting) for further analysis, while ignoring the rest of the spectra. We have concluded that the most efficient approach would be to do most of the data manipulation purely in the spreadsheet, where the peak position and its integral are practically the only parameters to consider. Alignment, clustering, and identification of components are very easy and straightforward in this environment, although none of the current software packages offers a convenient toolkit tailored for such a task. We will argue for the spectra-to-spreadsheet strategy, and we anticipate that it will become a top choice, especially for 13C-NMR-based analysis of mixtures.
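As an illustration only, the spectra-to-spreadsheet idea can be sketched in a few lines of Python: picked peaks (position, integral) are aligned into shared columns, and a STOCSY-style trace of correlation coefficients is computed for one driver peak across the sample set. This is not the authors' software or the Imperial College package mentioned above. The function names (`align_peaks`, `pearson`, `stocsy_trace`), the 0.02 ppm tolerance, and the peak values are all hypothetical, and the simple gap-based grouping stands in for a proper alignment routine.

```python
from math import sqrt

def align_peaks(samples, tol=0.02):
    """Group picked peaks across spectra into shared columns.

    samples: one list of (ppm, integral) pairs per spectrum.
    tol: assumed maximum gap (ppm) between neighbouring peaks in a column.
    Returns (centers, table); table[i][j] is the integral of column j in
    spectrum i, or 0.0 where no peak was picked.
    """
    # Pool every peak, sorted by position, remembering its spectrum index.
    pooled = sorted((ppm, area, i)
                    for i, peaks in enumerate(samples)
                    for ppm, area in peaks)
    groups = []
    for ppm, area, i in pooled:
        # Extend the current column while the gap to the previous peak is small.
        if groups and ppm - groups[-1][-1][0] <= tol:
            groups[-1].append((ppm, area, i))
        else:
            groups.append([(ppm, area, i)])
    centers = [sum(p for p, _, _ in g) / len(g) for g in groups]
    table = [[0.0] * len(groups) for _ in samples]
    for j, g in enumerate(groups):
        for _, area, i in g:
            table[i][j] += area
    return centers, table

def pearson(x, y):
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def stocsy_trace(table, driver):
    """Correlate the driver column with every column of the peak table."""
    cols = list(zip(*table))
    return [pearson(cols[driver], c) for c in cols]

# Synthetic peak lists (made-up values): the peaks near 60.0 and 72.5 ppm
# vary together across samples, the peak near 100.1 ppm does not.
samples = [
    [(60.00, 1.0), (72.50, 2.0), (100.10, 5.0)],
    [(60.01, 2.0), (72.51, 4.0), (100.09, 1.0)],
    [(59.99, 3.0), (72.49, 6.0), (100.11, 4.0)],
    [(60.00, 4.0), (72.50, 8.0), (100.10, 2.0)],
]
centers, table = align_peaks(samples)
trace = stocsy_trace(table, driver=0)
for ppm, r in zip(centers, trace):
    print(f"{ppm:8.2f} ppm  r = {r:+.3f}")
```

In this toy table the two co-varying peaks correlate strongly with each other, which is the kind of grouping that would then be cross-referenced against a database or spectrum prediction. Real data would also need the visual inspection and careful alignment described in the abstract.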