Chemo Metrics

March 29, 2018 | Author: sukh_singh_193745 | Category: Chemometrics, Statistics, Scientific Method, Applied Mathematics, Analysis


Comments



Description

ChemometricsChemometrics A data collection task, whether in science, business or engineering, typically involves many measurements made on many samples. Such multivariate data has traditionally been analyzed using one or two variables at a time. However, this approach misses the point; to discover the relationships among all samples and variables efficiently, we must process all of the data simultaneously. Enter chemometrics. Chemometrics is the field of extracting information from multivariate chemical data using tools of statistics and mathematics. Chemometrics is typically used for one or more of three primary purposes: y y y To explore patterns of association in data; To track properties of materials on a continuous basis; and To prepare and use multivariate classification models. The algorithms in primary use in the field have demonstrated a significant capacity for analyzing and modeling a wide assortment of data types for an even more diverse set of applications. Exploratory Data Analysis Patterns of association exist in many data sets, but the relationships between samples can be difficult to discover when the data matrix exceeds three or more features. Exploratory data analysis can reveal hidden patterns in complex data by reducing the information to a more comprehensible form. Such a chemometric analysis can expose possible outliers and indicate whether there are patterns or trends in the data. Exploratory algorithms such as principal component analysis (PCA) and hierarchical cluster analysis (HCA) are designed to reduce large complex data sets into a series of optimized and interpretable views. These views emphasize the natural groupings in the data and show which variables most strongly influence those patterns. Continuous Property Regression In many applications, it is expensive, time consuming or difficult to measure a property of interest directly. Such cases require the analyst to predict something of interest based on related properties that are easier to measure. The goal of chemometric regression analysis is to develop a calibration model which correlates the information in the set of known measurements to the desired property. Chemometric algorithms for performing regression include partial least squares (PLS) and principal component regression (PCR) and are designed to avoid problems associated with noise and correlations in the data. Because the regression algorithms used are based in factor analysis, the entire group of known measurements is considered simultaneously, and information about correlations among the variables is automatically built into the calibration model. Chemometric regression lends itself handily to the on-line monitoring and process control industry, where fast and inexpensive systems are needed to test, predict and make decisions about product quality. applied mathematics. When these techniques are used to create a classification model. Contents [hide] y y y 1 Introduction 2 Origins 3 Techniques o 3. biochemistry and chemical engineering.1 Multivariate calibration .Classification Modeling Many applications require that samples be assigned to predefined categories. An very nice introduction to the use of chemometrics with spectroscopic data was written by Steve Brown and is presented on the SpectroscopyNow web site. It is a highly interfacial discipline. Chemometrics research is occurring at many locations around the globe. in which categories are already known. Visit some of these chemometrics sites to get a feel for the type of work that can be accomplished through a multivariate approach. or predicting an unknown sample as belonging to one of several distinct groups. a chemometric system can be built that is objective and thereby standardize the data evaluation process. or "classes". but to investigate and address problems in chemistry. k-nearest neighbor (KNN) and soft independent modeling of class analogy (SIMCA) are primary chemometric workhorses. search Chemometrics is the science of extracting information from chemical systems by data-driven means. you may find ideas that will benefit your own work. the free encyclopedia Jump to: navigation. the answers provided are more reliable and include the ability to reveal unusual samples in the data. In this manner. using methods frequently employed in core dataanalytic disciplines such as multivariate statistics. A classification model is used to predict a sample's class by comparing the sample to a previously analyzed experience set. it mirrors several other interfacial µ-metrics¶ such as psychometrics and econometrics. and computer-science. This may involve determining whether a sample is good or bad. In this way. >>>>>>>>>>>>> Chemometrics From Wikipedia. properties of chemical systems are modeled with the intent of learning the underlying relationships and structure of the system (i.4 Other Techniques 4 Software 5 Further Reading 6 References 7 External links o o o [edit] Introduction Chemometrics is applied to solve both descriptive and predictive problems in chemistry.y y y y 3.. The data resulting from infrared and UV/visible spectroscopy are often easily numbering in the thousands of measurements per sample. Wold was a professor of organic chemistry at Umea University in Sweden. In descriptive applications. and thus while the standard chemometric methodologies are very widely used industrially.e. Many early applications involved multivariate classification.3 Multivariate Curve resolution 3. Mass spectroscopy. [edit] Origins Although one could argue that even the earliest analytical experiments in chemistry involved a form of chemometrics. and the development of improved chemometric methods of analysis also continues to advance the state of the art in analytical instrumentation and methodology. the datasets are often very large and highly complex. and by the late 1970¶s and early 1980¶s a wide variety of data. The term µchemometrics¶ was coined by Svante Wold in 1974 [1]. and the International Chemometrics Society was formed shortly thereafter by Wold and Bruce Kowalski. and Kowalski was a professor of analytical chemistry at University of Washington in Seattle. and hundreds to millions of cases or observations.and computer-driven chemical analyses were occurring. academic groups dedicated to the continued development of chemometric theory and methods are relatively rare. clustering 3. In predictive applications. atomic emission/absorption and chromatography experiments are also all by nature highly multivariate. pattern recognition. The structure of these data was found to be conducive to using techniques such as principal . It is an application driven discipline. nuclear magnetric resonance. properties of chemical systems are modeled with the intent of predicting new properties or behavior of interest. Chemometric techniques are particularly heavily used in analytical chemistry. the field is generally recognized to have emerged in the 1970¶s as computers became increasingly exploited for scientific investigation.2 Classification. In both cases. Multivariate analysis was a critical facet even in the earliest applications of chemometrics. model identification). involving hundreds to tens of thousands of variables. two pioneers in the field. numerous quantitative predictive applications followed. NMR spectra and mass spectra. Through the 1980¶s three dedicated journals appeared in the field: Journal of Chemometrics (from John Wiley & Sons). which includes reference values for the properties of interest for prediction. Several important books/monographs on chemometrics were also first published in the 1980¶s. Applied Spectroscopy. Multivariate calibration techniques such as partial-least squares regression. These journals continue to cover both fundamental and methodological research in chemometrics. [edit] Techniques [edit] Multivariate calibration Many chemical problems and applications of chemometrics involve calibration. Anal. Chim. clustering. proteomics and metabonomics. Analytical Chemistry. The process requires a calibration or training data set. Talanta). . An account of the early history of chemometrics was published as a series of interviews by Geladi and Esbensen [6] [7]. 3) multivariate process conditions/states to final product attributes. and pattern recognition. Partial least squares in particular was heavily used in chemometric applications for many years before it began to find regular use in other fields. als ³Chemometrics: a textbook´ [4]. At present. while the datasets may be highly multivariate there is strong and often linear low-rank structure present. and ³Multivariate Calibration´ by Martens and Naes [5]. Acta. The objective is develop models which can be used to predict properties of interest based on measured properties of the chemical system.g. including concentrations for an analyte of interest for each sample (the reference) and the corresponding infrared spectrum of that sample. exploiting the interrelationships or µlatent variables¶ in the data. PCA and PLS have been shown over time very effective at empirically modeling the more chemically interesting low-rank structure. process modeling and process analytical technology. Examples include the development of multivariate models relating 1) multiwavelength spectral response to analyte concentration. Some large chemometric application areas have gone on to represent new domains. one can assemble data from a number of samples. such as pressure. Sharaf. and providing alternative compact coordinate systems for further numerical analysis such as regression. or principal component regression (and near countless other methods) are then used to construct a mathematical model that relates the multivariate response (spectrum) to the concentration of the analyte of interest. for example. Chemometrics and Intelligent Laboratory Systems (from Elsevier [1]). and the measured attributes believed to correspond to these properties. most routine applications of existing chemometric methods are commonly published in application-oriented journals (e. This is primarily because. 2) molecular descriptors to biological activity. For case 1). Illman and Kowalski¶s ³Chemometrics´ [3].. including the first edition of Malinowski¶s ³Factor Analysis in Chemistry´ [2].. temperature. the µ-omics¶ fields of genomics. and such a model can be used to efficiently predict the concentrations of new samples. Raman. infrared. such as molecular modeling and QSAR. and partial least-squares (PLS). flow. and Journal of Chemical Information and Modeling (from the American Chemical Society). cheminformatics.components analysis (PCA). Massart et. [edit] Multivariate Curve resolution In chemometric parlance. multivariate curve resolution methods can be used to extract the fluorescence spectra of the individual fluorophores. regression/classification trees. and hence inverse approaches tend to be more frequently applied in contemporary multivariate calibration. which are extremely broad and nonselective compared to other analytical techniques (such as infrared or Raman spectra). The problem is usually ill-determined due to . The main advantages of the use of multivariate calibration techniques is that fast.. [edit] Classification. concentrations. Equally important is that multivariate calibration allows for accurate quantitative analysis in the presence of heavy interference by other analytes. pattern recognition.g. and again many of the core techniques used in chemometrics are common to other fields such as machine learning and statistical learning. Some of the earliest work on these techniques was done by Lawton and Sylvestre in the early 1970's. expensive or destructive testing (such as HPLC). clustering Supervised multivariate classification techniques are closely related to multivariate calibration techniques in that a calibration or training set is used to develop a mathematical model capable of classifying future samples. The techniques employed in chemometrics are similar to those used in other fields ± multivariate discriminant analysis. blind source/signal separation. neural networks. and at least in theory provide superior predictions in the mean-squared error sense [10][11][12] . as the analytical measurement modalities.g. The selectivity of the analytical method is provided as much by the mathematical calibration.[13][14] These approaches are also called self-modeling mixture analysis. essentially unmixing the total fluorescence spectrum into the contributions from the individual components. along with their relative concentrations in each of the samples. can often be used successfully in conjunction with carefully developed multivariate calibration methods to predict concentrations of analytes in very complex matrices. spectra) and can therefore be considered optimal descriptors. For example near-infrared spectra.[9] Inverse methods usually require less physical knowledge of the chemical system. optimal predictors). cheap. logistic regression. Unsupervised classification (also termed cluster analysis) is also commonly used to discover patterns in complex data sets. The use of rank reduction techniques in conjunction with these conventional classification methods is routine in chemometrics. [5][8] The principal difference between these approaches is that in classical calibration the models are solved such that they are optimal in describing the measured analytical responses (e. multivariate curve resolution seeks to deconstruct data sets with limited or absent reference information and system knowledge. For example. and spectral unmixing..Techniques in multivariate calibration are often broadly categorized as classical or inverse methods. from a data set comprising fluorescence spectra from a series of samples each containing multiple fluorophores. or nondestructive analytical measurements (such as optical spectroscopy) can be used to estimate sample properties which would otherwise require time-consuming. whereas in inverse methods the models are solved to be optimal in predicting the properties of interest (e. for example discriminant analysis on principal components or partial least squares scores. Refer to Tauler and de Juan for recent comprehensive reviews.[21][22][23] Spectroscopy has been used successfully for online monitoring of manufacturing processes for 30-40 years. multiway methods are applied to data sets that involve 3rd. unmodality. The techniques employed commonly in chemometrics are often closely related to those used in related fields. sensitivity. and this process data is highly amenable to chemometric modeling. verification & validation. kinetic or mass-balance constraints). chemometrics is quantitatively oriented.[24] or the newer term process analytical technology continues to draw heavily on chemometric methods and MSPC. including multivariate definitions of selectivity.[20] Chemometric model selection usually involves the use of tools such as resampling (including bootstrap. for . Multivariate curve resolution is commonly applied in the study of chemical reactions and processes. and increasingly in chemical hyperspectral imaging. 4th. model selection. Multiway methods are heavily used in chemometric applications. such as non-negatively. and the performance of classifiers as a true-positve rate/false-positive rate pairs (or a full ROC curve). SNR and prediction interval estimation. or second-order arry) of data is routine in several fields. multiway modeling of batch and continuous processes is increasingly common in industry and remains an active area of research in chemometrics and chemical engineering. Multivariate statistical process control (MSPC). cross-validation). Data of this type is very common in chemistry. or known interrelationships between the individual components (e. and figures of merit Like most arenas in the physical sciences. and figures of merit. so considerable emphasis is placed on performance characterization..[25][26] These are higher-order extensions of more widely used methods.[19] Performance characterization. For example. particularly the use of signal pretreatments to condition data prior to calibration or classification. or higher-orders. modeling and optimization accounts for a substantial amount of historical chemometric development.[15][16] [edit] Other Techniques Experimental design remains a core area of study in chemometrics and several monographs are specifically devoted to experimental design in chemical applications. Process analytical chemistry as it was originally termed. A recent report by Olivieri et al.g. The performance of quantitative models is usually specified by root mean squared error in predicting the attribute of interest. permutation. although many complex experiments are purely observational. so the application of additional constraints is common. while the analysis of a table (matrix. Signal processing is also a critical component of almost all chemometric applications. provides a comprehensive overview of figures of merit and uncertainty estimation in multivariate calibration. [17][18] Sound principles of experimental design have been widely adopted within the chemometrics community. Specifically in terms of MSPC.rotational ambiguity (many possible solutions can equivalently represent the measured data). and there can be little control over the properties and interrelationships of the samples and sample properties. . process variables vs. Support Vector Machines is another very powerful tool for dealing with classification problems in nonlinear datasets. Batch process modeling involves data sets that have time vs. trilinear decomposition. The multiway mathematical methods applied to these sorts of problems include PARAFAC.example a liquid-chromatography / mass spectroscopy (LC-MS) system generates a large matrix of data (elution time versus m/z) for each sample analyzed. The data across multiple samples thus comprises a data cube. batch number. and multiway PLS and PCA.
Copyright © 2024 DOKUMEN.SITE Inc.