2013 | Kathleen R. Murphy, Colin A. Stedmon, Daniel Graeber and Rasmus Bro
This article is a tutorial on the practical application of PARAFAC (Parallel Factor Analysis) to fluorescence excitation-emission matrices (EEMs). PARAFAC is a multi-way method that decomposes trilinear data arrays, which allows independent chemical components to be identified and quantified. The tutorial covers the whole workflow, from data import through model validation and interpretation, and focuses on preparing and modeling fluorescence datasets, particularly those from environmental samples in which the number, identity, and behavior of the fluorophores are not known.
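For reference, the trilinear model that PARAFAC fits can be written as follows. The symbols are a commonly used notation and are introduced here as an assumption, not quoted from the article: $x_{ijk}$ is the fluorescence intensity of sample $i$ at emission wavelength $j$ and excitation wavelength $k$, $F$ is the number of components, and $e_{ijk}$ is the residual.

$$
x_{ijk} = \sum_{f=1}^{F} a_{if}\, b_{jf}\, c_{kf} + e_{ijk},
\qquad i = 1,\dots,I;\quad j = 1,\dots,J;\quad k = 1,\dots,K
$$

Under this reading, $a_{if}$ scales with the concentration of component $f$ in sample $i$, while $b_{jf}$ and $c_{kf}$ are estimates of its emission and excitation spectra.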
Key steps include:
1. **Data Import**: Transferring data from instruments to software like MATLAB, using toolboxes such as N-way and Tensorlab.
2. **Preprocessing**: Correcting systematic biases, removing non-fluorescence signals, and normalizing datasets so that the model's assumptions about variability and trilinearity are better met.
3. **Exploratory Data Analysis**: Identifying and removing unrepresentative data, and determining the appropriate number of components.
4. **Model Validation**: Using methods like split-half analysis to confirm the robustness of the model.
5. **Model Refinement**: Applying constraints to improve model stability and interpretability.
6. **Interpreting Results**: Relating PARAFAC components to independent fluorophores or groups thereof, and tracking changes in fluorescence intensity.
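The modeling and validation steps above can be illustrated with a minimal sketch in Python with NumPy. It is not the article's MATLAB workflow or any toolbox's API. The data are synthetic, the function names (`khatri_rao`, `parafac_als`, `tucker_congruence`) are hypothetical, and the fit is a plain unconstrained alternating least squares, whereas a real analysis would typically add constraints such as non-negativity. The sketch fits a two-component model to each half of a dataset and compares the recovered spectra with Tucker's congruence coefficient, which is one common way to run a split-half check.

```python
import numpy as np

def khatri_rao(U, V):
    """Column-wise Kronecker product; row i*V.shape[0]+j holds U[i, :] * V[j, :]."""
    return (U[:, None, :] * V[None, :, :]).reshape(-1, U.shape[1])

def parafac_als(X, n_components, n_iter=500, tol=1e-10, seed=0):
    """Fit X[i, j, k] ~ sum_f A[i, f] * B[j, f] * C[k, f] by unconstrained ALS."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    B = rng.random((J, n_components))
    C = rng.random((K, n_components))
    # Unfold the array once per mode (rows: samples, emission, excitation).
    X0 = X.reshape(I, J * K)
    X1 = X.transpose(1, 0, 2).reshape(J, I * K)
    X2 = X.transpose(2, 0, 1).reshape(K, I * J)
    prev_sse = np.inf
    for _ in range(n_iter):
        # Each update is a linear least-squares solve with the other two modes fixed.
        A = X0 @ np.linalg.pinv(khatri_rao(B, C)).T
        B = X1 @ np.linalg.pinv(khatri_rao(A, C)).T
        C = X2 @ np.linalg.pinv(khatri_rao(A, B)).T
        sse = np.sum((X0 - A @ khatri_rao(B, C).T) ** 2)
        if abs(prev_sse - sse) <= tol * max(sse, 1e-300):
            break
        prev_sse = sse
    return A, B, C

def tucker_congruence(U, V):
    """Cosine of the angle between every column of U and every column of V."""
    Un = U / np.linalg.norm(U, axis=0)
    Vn = V / np.linalg.norm(V, axis=0)
    return Un.T @ Vn

def gauss(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2)

# Synthetic EEM-like data (assumed values): 40 samples, two Gaussian "fluorophores".
rng = np.random.default_rng(1)
em = np.linspace(300, 600, 61)   # emission wavelengths, nm
ex = np.linspace(240, 450, 43)   # excitation wavelengths, nm
B_true = np.column_stack([gauss(em, 350, 25), gauss(em, 480, 40)])
C_true = np.column_stack([gauss(ex, 275, 15), gauss(ex, 360, 30)])
A_true = rng.random((40, 2)) + 0.1
X = np.einsum("if,jf,kf->ijk", A_true, B_true, C_true)
X += 0.005 * rng.standard_normal(X.shape)   # small measurement noise

# Split-half check: model each half independently, then compare the spectra.
_, B1, C1 = parafac_als(X[0::2], 2, seed=0)
_, B2, C2 = parafac_als(X[1::2], 2, seed=1)
# Components may come out in any order and sign, so compare all pairs and
# use the absolute product of the emission and excitation congruences.
match = np.abs(tucker_congruence(B1, B2) * tucker_congruence(C1, C2))
best = match.max(axis=1)
print("best split-half congruence per component:", np.round(best, 4))
```

Values of `best` close to 1 indicate that both halves yield essentially the same spectra. The acceptance threshold is a judgment call that this sketch does not settle.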
The article also introduces a new MATLAB toolbox, drEEM, designed to support improved visualization and sensitivity analyses of PARAFAC models in fluorescence spectroscopy. The tutorial is supported by a real-world dataset from San Francisco Bay, demonstrating the practical application of PARAFAC in environmental research.