Comparison of methods to determine chromophore concentrations from fluorescence spectra of turbid samples.

BACKGROUND AND OBJECTIVE
This paper compares the ability of three analytic techniques to predict chromophore concentration from fluorescence emission spectra of homogeneous, turbid samples with optical properties similar to human tissue.


STUDY DESIGN, MATERIALS AND METHODS
Two models of light propagation were implemented (exponential attenuation, two flux Kubelka-Munk theory); a priori information about sample optical properties was used to analyze data with the two flux Kubelka-Munk model. The third data analysis technique utilizes the method of partial least squares (PLS) to develop an empirical, linear model of sample fluorescence from a training set with optical properties and known concentrations representative of those to be predicted. This model can be applied to predict chromophore concentrations in the unknown samples.


RESULTS
Of the three methods, PLS achieved the most accurate results and was able to predict fluorophore concentration to within +/- 6% of true values.


CONCLUSION
We investigated conditions under which PLS predictions were most accurate and find that best results are achieved when predictions are based on fluorescence emission spectra at more than one excitation wavelength with inclusion of the tail of Rayleigh scattering at the excitation wavelength.


INTRODUCTION
words: fluorescence spectroscopy, partial least squares, scattering, tissue, two flux Kubelka-Munk theory tra. These fluorophores are contained within a scattering and abiorbing matrix that affects the many biologically important mol-Only limited attempts have been amount of excitation light that can reach the fluorophores and the amount of fluorescence that ecules can escape the tissue surface and reach the detector. Although we presented preliminary data comparing a number of methods for predicting fluorophore concentration [7], to our knowledge, a comprehensive study including samples that exhibit a wide range of optical properties typical of a variety of tissues has not yet been presented.
made to interpret tissue fluorescence spectra in terms of tissue chemical composition [1,21. Instead, empirical algorithms have been employed to classify tissue type based on fluorescence spectra with impressive results [3-51. Despite the SUCcess of empirical methods, a technique to relate tissue spectra to tissue chemistry is required to exploit the full potential of fluorescence diagnostics. Although extensive methodologies havebeen developed to analyze fluorescence spectra of dilute mixtures [6], achieving such interpretation of tissue fluorescence is limited by several factors. In this report, we compare the effectiveness of three methods to predict the concentration of chromophores in turbid samples. In the first method, a physical model of light propagation is used to describe the fluorescence of a turbid sample in terms of individual chromophore concentrations and fluorescence and attenuation lineshapes. Turbid spectra are analyzed by leastsquares fit to the model, varying chromophore concentration until the best agreement between turbid data and model is achieved. This technique, assuming exponential light attenuation, has frequently been used to analyze tissue fluorescence [1,8]. The method requires initial measurement of individual chromophore fluorescence and attenuation lineshapes, but not a priori information about sample optical properties.
In the second method, the effects of scattering and absorption are deconvolved using the optical properties of the sample and a model of light propagation. The resulting spectrum, corrected for scattering and absorption, is then fit to a linear combination of single component dilute spectra having unit concentrations, yielding fluorophore concentrations in the turbid sample. The model of light propagation used in this investigation is the two flux . This method requires sample optical properties at the excitation and emission wavelengths, which can be calculated from measurements of transmission and diffuse reflection.
Finally, the third technique employed here uses the method of partial least squares (PLS) to develop an empirical linear model of turbid sample fluorescence from a training set with known concentrations and optical properties similar to the samples to be predicted [lo]. Recently, efforts have been made to quantify chromophores in biological samples using PLS, where work has focused on the determination of blood glucose levels from IR absorption spectra [ll]. These studies, carried out using FT-IR techniques, have shown promise in the extraction of quantitative information from transmission spectra both in vitro [ l l l and in vivo. [121.
Although studies have indicated that PLS is appropriate to the task of predicting fluorophore concentration in complex, nonscattering mixtures [131, to our knowledge, there has not yet been a study that explicitly examines the ability of PLS to predict chromophore concentrations from fluorescence spectra of turbid media such as human tissue. For PLS to accurately predict concentrations of chromophores in vivo, it must be shown that the technique is an accurate predictor in samples that absorb and scatter light. In the case of biological tissues, scattering coefficients can range between 50-400 cm-' in the UVNisible portion of the spectrum, and absorption coefficients range from <1 to >25 cm-' depending on tissue type [14]. PLS does not explicitly require a priori knowledge of the sample optical properties but does require fluorescence spectra from a training set of samples with known concentrations, similar in chemical complexity to the unknown sample of interest [13].
We constructed a series of 22 turbid phantoms, described in Table 1, with known composition and optical properties similar to human tissue. Fluorescence emission spectra of the same set of phantoms were analyzed using each of the three methods described above to calculate concentration. Calculated values were compared to true values to assess the accuracy of prediction of each method.

MATERIALS AND METHODS
Phantoms were constructed using the methods of [15]. Phantom components include fluorophores (Rhodamine B and Flavin Adenine Dinucleotide) , absorber (oxy-hemoglobin), and scatterer (polystyrene microspheresj suspended in isotonic phosphate-buffered saline (PBS), pH 7.4. Rhodamine B was obtained from Exciton Laser Dyes and reagent grade Flavin Adenine Dinucleotide (FAD) was purchased from Eastman-Kodak. These were used without further purification. Samples were contained in 2 mm pathlength quartz cuvettes at room temperature. Using published values of molar extinction coefficients for these chromophores [16,17], it was established that a 2 mm pathlength solution was dilute over the wavelength range from 350-700 nm for a mixture containing only 0.43 pM Rhodamine B and 10 FM FAD in PBS. (The sum of the optical densities at the excitation and emission wavelengths studied here is <0.1 for a solution consisting of 0.43 KM Rhodamine B and 10 p M FAD in a PBS solution contained in a 2 mm pathlength cuvette.) The effects of fluorophore reabsorption for each sample are therefore negligible. In this study, concentrations of Rhodamine B and FAD were at or below these levels.
Human blood was used as a source of oxyhemoglobin. Blood was drawn from the author (A.J.D.); red blood cells were separated from whole blood via centrifugation and were lysed us-Determining Chromophore Concentrations ing chilled, deionized water. Red blood cell ghosts were removed by centrifugation. Concentrations of absorber were investigated that provided total transmission >1% for samples contained in 2 mm pathlength cuvettes at all wavelengths between 450 nm and 700 nm. The final concentration of hemoglobin used in each turbid sample was between 3.8 pM and 94.1 pM. Previous work indicates that scattering coefficients consistent with those typically seen in tissue can be generated using 1 pm diameter polystyrene microspheres [El. The concentration of 1 pm diameter monodispersed polystyrene spheres (Bangs Laboratories) was varied between 0.16% and 0.8% by volume over the sample set. In addition to exhibiting minimal fluorescence, monodispersed polystyrene spheres enable one to use Mie theory to calculate the anisotropy coefficient and the scattering coefficient [181. At these concentrations, the polystyrene spheres remain in suspension for a period of several hours. The composition of the two dilute and 22 turbid samples studied here were chosen to span the range typical for human tissue with minimal intersample correlations between chromophore concentrations (maximal correlation coefficient was 0.15). This minimized the likelihood that PLS predictions of the concentration of one chromophore would be based solely on spectral signatures of a correlated chromophore.

77
Fluorescence. Samples were mounted in 2 mm pathlength, 1 cm width, quartz cuvettes. Spectra were collected using a scanning spectrofluorimeter equipped with double excitation and emission monochromators (SPEX Fluorolog 11, Edison, NJ). Fluorescence emission spectra were acquired for 22 turbid phantoms and two dilute samples at both 458 nm and 488 nm excitation. Because variations in temperature can significantly affect the emission of fluorescence 161, refrigerated samples were allowed to equilibrate to room temperature prior to data collection. At each excitation wavelength, fluorescence emission was collected in 5 nm increments over a range starting at wavelengths 10 nm longer than the excitation wavelength. Emission scans were terminated at a wavelength 10 nm less than twice the excitation wavelength or 700 nm, whichever was lower. The spectral resolution of the excitation and emission monochromators was 5.6 nm, full width at half maximum (FWHM). The excitation beam was 2 mm x 2 mm square and was focused and incident normally on the front face of the sample. Emission was collected at a 20" angle with respect to the excitation beam from a region larger than that illuminated. Emission signal was integrated for 1 s at each emission wavelength increment.
Data presented here have been corrected for the nonuniform spectral response of the emission monochromator and photomultiplier tube using correction factors supplied by the manufacturer. Variations in excitation intensity were corrected for using a Rhodamine B quantum counter 1191. The excitation intensity was always <40 p W / mm2.
Optical properties. Total optical properties were calculated for each sample from total transmission and reflectance spectra using two flux . Total reflection and transmission measurements were made for the scattering samples using an absorption spectrophotometer (Hitachi U-3300) with a 90 mm diameter Spectralon integrating sphere accessory. The interrogation beam was normal to the sample surface and at incidence was 1 x 4 mm. Ports were circular and 25 mm in diameter. Samples were placed in 1 cm wide, 2 mm pathlength quartz cuvettes to minimize the effects of light losses at the side of the cuvette. Fresnel reflection was not collected but was instead allowed to exit the integrating sphere via the original entrance port. The resolution of the system was 2 nm (FWHM) and the scan time was 300 n d m i n .

Physically Based Models
Spectra were analyzed using models of tissue fluorescence derived from two physical models of light propagation in tissue. We compared the accuracy of results using two models, one based on exponential attenuation of light, the other based on two flux Kubelka-Munk theory.
Exponential attenuation. A model based on exponential attenuation of light has been used frequently to analyze tissue spectra, yielding chromophore contributions, which are assumed proportional to chromophore concentrations. For the case in which the sample is homogeneous, the excitation and collection areas are infinite, constituent fluorophores are FAD and Rhodamine B, and the primary absorber is oxy-hemoglobin, a model assuming exponential attenuation [ll is given by Eq. (1) where Io(A,) is the intensity of the excitation beam at the sample surface, S(A,,) is the fluorescence intensity as a function of emission wavelength, A, , , FAD@,,), and Rho@,,) represent the normalized fluorescence spectra of FAD and Rhodamine B, respectively; C1 and C2 indicate the absolute contribution of each fluorophore to the turbid spectrum. This contribution is proportional to the concentration and peak quantum yield of each fluorophore. The denominator includes the normalized sum of oxy-hemoglobin attenuation at the excitation and emission wavelengths, [Abs(AeX) + Abs(A,,)], with contribution C,. Because the hemoglobin is lysed from the cell, we assume that the contribution of scattering to the attenuation is negligible. A wavelength independent attenuation (describing scattering from the polystyrene beads) is represented by C,.
Turbid emission spectra were fit to Eq. (1) in the least-squares sense with C,-C, as free parameters of the fit, subject to the constraints C1-C3 > 0. To establish a scale for the coefficients C1-C3, C, was set to 1 in the fitting process. C1-C3 are reported relative to C,. Alternatively, a priori information about attenuation at the excitation and emission wavelength can be used to calculate C, for each sample. This technique has been employed to predict the concentration of fluorescent components in biological tissues using both fixed C, = 1 (cervix) [8] and variable C, (ar-Two flux Kubelka-Munk-based method. A second model of light propagation in turbid media, two flux Kubelka-Munk theory, was used to derive an analytic expression mathematically to remove the effects of scattering and absorption from the spectrum of the turbid medium. As shown in detail in ref. [20], the spectrum of a turbid sample, Smr, of finite thickness, hr., can be related to the spectrum of a dilute solution containing the same concentration of fluorophore, SDil, via a transfer function (TF) which depends on the sample thickness and the optical properties of the sample at the excitation and emission wavelengths (Eq. (2)).

tery) [I].
The form of the transfer function for infinite excitation and collection beams and index matching at the sample front surface is calculated using two flux Kubelka-Munk theory in C201. This TF can be calculated from the sample optical properties, which can be measured from diffuse reflection and transmission. S,, was calculated for all 22 samples; corrected spectra were then fit to a linear combination of single component dilute spectra to yield fluorophore concentrations. These were compared to true values.

Predicting Chromophore Concentration Using PLS
The method of PLS has been recently reviewed [21,22]. The PLS method involves the regression between two matrices, X and Y . For spectroscopic analysis of n mixtures with p unknowns, X and Y represent spectral and concentration matrices respectively. PLS seeks a calibration matrix B such that where B is the set of calibration constants for the system [23]. The n rows of X and Y contain information about n sample mixtures. The rows of X The predictive performance of PLS depends on the composition of the training and validation sets as well as the spectral information included in the data, One technique used to evaluate the accuracy of prediction and to select the optimum number of factors to retain in the model is the method of cross validation. Cross validation is done by evaluating the ability of a PLS calibration to predict the concentrations of unknown spectra as a function of the number of factors (rank) used in creating the calibration. For each rank tested, a calibration is generated using the training set, but omitting one spectrum from the training set (n-1 samples) to serve as a validation spectrum. The concentration of the sample associated with the validation spectrum is then predicted from the calibration. The squares of the errors in prediction for all components in the validation spectrum are summed for each rank. These operations are repeated n times to include each sample in the validation set. The final calibration is produced using the number of factors for which the sum of the squares of the errors in predictions for all components is minimized.
We performed several PLS analyses in an effort to determine the optimal conditions to predict chromophore concentrations in turbid samples with optical properties similar to tissue. Specifically, the analysis addressed the accuracy of prediction as a function of: (1) optical properties of calibration samples, (2) number of calibration samples, and (3) spectral data composition as described in detail below. Calculations were performed using MatLab version 4.2a and the Chemometrics Toolbox (Mathworks, Natick, MA) on a Macintosh Quadra 700. Literature indicates that the spectra used in the training and calibration sets should have similar complexity [13]. Our sample set included data from both turbid and dilute samples. We investigated whether the inclusion of the dilute samples in the training set affected the accuracy of prediction for turbid samples. Two calibration sets were formed, one including only the turbid samples, the other included the same turbid samples in ad-dition to dilute samples containing only pure fluorophores and buffer. Using the model derived from each training set, chromophore concentrations predicted for the turbid samples using cross validation were compared to actual values.
We next investigated the effect of varying the number of training samples on the accuracy of prediction. In the first case, results were obtained by dividing the turbid samples randomly into a training set (1 1 samples) and a validation set (1 1 samples). Predicted concentrations of the validation set samples were compared to those obtained when the method of cross validation was used to predict concentrations of the validation set samples.
Finally, the effects of spectral data composition on the accuracy of PLS prediction using cross validation were compared for all turbid samples. Specifically, prediction accuracy from emission spectra at a single excitation wavelength (458 or 488 nm) was compared to that achieved when PLS was performed on data vectors containing the 458 nm excited spectra followed consecutively by the 488 nm excited spectrum. In addition, the effects of inclusion of the Rayleigh scattering tail on PLS prediction with cross validation were assessed for all turbid samples. The Rayleigh scattering tail is a spectral feature that is due to scattered excitation light that is not completely rejected by the emission monochromator of the fluorimeter and is a measure of the diffuse reflectance of the sample at the excitation wavelength. Accuracy of prediction was compared for spectral data at both excitation wavelengths for two cases: (1) with the full emission spectra, and (2) with the initial 20 nm (containing the Rayleigh scattering tail) of each spectrum truncated.
The results of application of each of these analytic methods are summarized and discussed in the following section. In each case, an average figure of merit is used to describe predictive ability. The average relative error of prediction (AREP) was calculated via Eq. (4), as the absolute difference between predicted concentration (Pi) and actual concentration (Ti) averaged over all n samples in the set. This figure was normalized to the maximum actual concentration of each constituent material (Ti,=) making up the phantom (FAD: 10 p M , Rhodamine B: 0.43 pM, oxyhemoglobin: 94.1 pM and polystyrene microspheres: 0.79%). For the purposes of comparison, the true scattering coefficient was assumed to be that calculated from Mie theory given the concentration of polystyrene spheres.

Fluorescence Spectra Predictive Ability
Emission spectra from optically dilute sam-   (2) to calculate the equivalent dilute spectrum for the turbid sample data shown in Figure 2a. This calculated dilute spectrum is fit to a linear combination of dilute FAD and Rhodamine fluorescence. The Kubelka-Munk-based model clearly removes the valleys caused by re-absorption of fluorescent light by oxy-hemoglobin, and the resulting spectrum can be described accurately by a linear combination of dilute solution spectra. Figure 3b compares the predicted FAD concentration to the true concentration for turbid data analyzed with this method. Predictions based on Kubelka-Munk theory consistently overestimate the true concentration of FAD by a scale factor of 5.5 (the ratio of KM prediction to true concentration averaged over the sample set for both FAD and Rhodamine B). Table 2a presents the AREP of the Kubelka-Munk method for FAD and Rhodamine concentration with and without the scale factor. Figure 4a illustrates the ability of the PLS method to describe spectral data of turbid samples; the agreement between the data and model is excellent for the typical fit shown at 458 nm excitation. The effect of composition of the training set on the predictive ability of PLS for FAD concentration is illustrated in Figure 4b.  Error estimates of PLS prediction using separate training and validation sets, each of size 4 2 are compared to those obtained using the method of cross validation on a sample set of size n/2 containing the same samples in the validation set above. Figure 5 compares the predicted concentration of FAD obtained using the two techniques to the true concentration. Table 2c summarizes the AREP for FAD and Rhodamine concentration for the two techniques applied to 458 nm excitation data (four PLS factors retained). When the prediction set contains 11 samples, the AREP for FAD and Rhodamine concentration are similar whether concentrations were predicted using a calibration matrix generated from a separate calibration set or calibration matrices generated from the prediction set itself under cross-validation. This verifies for this data that error estimates obtained with cross validation are within k 4% of those expected with separate training and validation. Figures 6 and 7 illustrate the effects of spectral data composition on the accuracy of PLS prediction of FAD concentration in turbid samples via cross validation. In Figure 6, the predicted value of FAD concentration is compared to true values for three different data types: (1) emission spectra at 458 nm excitation (four PLS factors retained), (2) emission spectra at 488 nm excitation (six PLS factors retained), and (3) concatenation of emission spectra at 458 and 488 nm to a single vector for each sample (ten PLS factors retained). Table 2d compares the AREP for Rhodamine and FAD concentrations for the three cases and indicates that AREP is significantly reduced when data at both excitation wavelengths is incorporated in the prediction.

PLS
Finally, Figure 7 illustrates the effect of inclusion of the Rayleigh scattering tail at the short wavelength end of each spectrum on PLS prediction with cross validation for the case in which spectra acquired at 458 and 488 nm are combined.
FAD concentration (Fig. 7a) and scattering coefficient (polystyrene microsphere concentration, Fig. 7b) for turbid samples are plotted both with (ten PLS factors retained) and without (six PLS factors retained) the Rayleigh scattering tail. Table 2e compares the AREP for FAD and polystyrene concentration using the two data formats and indicates a substantial decrease in average relative error of prediction (AREP) when the Rayleigh scattering tails are included; this decrease is greatest for polystyrene concentration.
In summary, the predictive ability of the PLS method is optimal for the case in which the training set is composed only of turbid samples and spectral data for each sample includes combined emission spectra at 458 nm and 488 nm excitation with the Rayleigh scattering tail retained in both spectra to include information about diffuse reflectance at the excitation wavelength. Figure 8a-d summarizes the results of PLS under these optimal conditions and shows the predicted concentration vs. actual concentration for FAD, Rhodamine B, oxy-hemoglobin, and l-Fm diameter polystyrene microspheres, respectively. Table 2a gives the AREP for each of these constituents. AREP for FAD, Rhodamine, and polystyrene are -5%, that of oxy-hemoglobin is slightly higher at about 14%.

DISCUSSION
Whereas all three methods of data analysis presented here provide a good description of the shape and intensity of the fluorescence emission spectrum, the most accurate predictions of chromophore concentration are based on PLS. The model based on exponential attenuation can be used accurately to fit turbid data given the fluo-rescence and attenuation lineshapes of the constituent chromophores, but significantly overestimates the quantities of each component in an unpredictable fashion. We have shown that two flux Kubelka-Munk theory can be used to deconvolve absorption and scattering effects from fluorescence emission spectra given the optical properties for each sample. Again, prediction of fluorophore concentration based on this method consistently overestimates true values; with this method overestimation is by a constant scaling factor. With the incorporation of the scaling factor, AREP was within 20% for fluorophore concentration. Finally, using the method of PLS with a training set of samples with optical properties similar to those to be analyzed, we were able to obtain the most accurate predictions of turbid sample components.
Although PLS does accurately predict concentrations in this set of experiments, the two Determining Chromophore Concentrations Actual FAD Crmcentration (uM) Fig. 6. Comparison of PLS prediction of FAD concentration for spectral data which includes emission spectrum at only 458 nm (unfilled squares) or 488 nm (gray squares) and to the columnwise combination of the spectra at both of these wavelengths (dark squares).
"model based" algorithms perform poorly. Presumably if the physics is right, the model should work. Recent work by Gardner [24] indicates that the effects of air/quartz/water boundary conditions can significantly reduce the collection efficiency of fluorescence emitted from a dilute fluorescent sample. This suggests that the dilute solution spectra that are used as references for assessing the accuracy of the KM results should be corrected for boundary condition and pathlength effects. Since no dilute spectra are acquired explicitly using the PLS method, this problem is avoided.
Our results show, to maximize predictive ability, the training set must reflect the chemical and optical complexity of the samples to be inves-tigated. We also have shown that by combining spectra from more than one excitation wavelength, and including the Rayleigh scattering tails as a measure of diffuse reflectance at the excitation wavelength, the accuracy of concentration prediction can be increased. Under these ideal conditions, the maximum AREP was <14% for a set of samples mimicking the optical properties of human tissue. Best results were obtained for fluorophore and scatterer concentrations (AREP < 5%); prediction errors were slightly higher for absorber concentration. We attribute this result to the nature of the measurement technique itself. Fluorescence emission spectra are not direct indicators of absorber concentration. Clearly the measured intensity of the fluores- cence emission spectrum is directly related to the concentration of the fluorophore. Because the intensity of the Rayleigh scattering tail is proportional to the amount of scatterer, the fluorescence emission spectrum is also a direct measure of scatterer concentration. Spectral absorption features, however, depend on both the fluorophore and scatterer concentration. The calculated absorber concentration is therefore more subject to error than either the fluorophore or the scatterer concentration.
There are a number of benefits to using PLS to obtain concentration information from turbid tion and validation data are acquired in the same geometry. Furthermore, by acquiring calibration and validation data in different geometries, the matrix B can be used to gauge the sensitivity of prediction errors on measurement geometry, thus yielding information about light propagation under these conditions. A few important obstacles remain before PLS can be practically applied to analyze fluorescence of turbid tissues. Because most tissues are data' in tur- Fig. 8. Summary of PLS prediction results. Cross validation bid systems is complex because fluorescence is is used to predict the concentrations of all sample constituhighly geometry dependent. A model developed ents. The 458 nm and 488 nm spectra are combined and the for one excitation and collection geometry will Rayleigh scattering tail is included. can be to data ' 01-Oxyhemoglobin (triangles), predicted concentration vs. aclected with a different geometry. Using PLS, getual concentration. (d) Polystyrene (diamonds) predicted scatometry effects are incorporated when calibratering coefficient vs. actual scattering coefficient. inhomogeneous and layered, we must study the effects of these on PLS prediction accuracy. This can be done by performing similar studies using the layered phantoms described in [151. Prediction accuracies can be compared for data acquired with both conventional and confocal geometries [25], where only fluorescence generated in a single layer is collected. The most significant obstacle is the question of how to reliably construct a tissue training set where concentrations of fluorophores and absorbers of biological interest such as NADH, FAD, and oxy-hemoglobin, can be quantified. A technique for verification of fluorophore concentration in corneal epithelium is described in [26]. This work documents an assay method that can be used to simultaneously quantify and extract both NADH and NAD from tissue. Another avenue for verification of PLS quantification techniques on tissue involves adding exogenous fluorescent dyes to biological tissues, applying the method of PLS to calculate dye concentration and comparing this result to concentration as determined by separate biochemical assays. These assay methods have been frequently used to determine photosensitizer concentration for photodynamic therapy [27] and suggest that the clinical application of PLS analysis of fluorescence emission spectra may be used to monitor local photosensitizer concentration and allow greater precision in dosage and control of therapy.