Determination of optical properties in highly attenuating media with an endoscope-compatible reflectance approach

Accurate in vivo optical property data in the ultraviolet to visible range are scarce for many endoscope-accessible organs, yet such information is essential for understanding light propagation and identifying dosimetry standards for biomedical optical spectroscopy. We have performed a preliminary study towards the development of a reflectance-based system for endoscopic measurement of tissue optical properties relevant to fluorescence spectroscopy. To address the constraint of instrument channel diameter and strong attenuation of light in the spectral region of interest, maximum fiber separation distance was limited to 2.5 mm. Measurements were performed on tissue phantoms for a range of optical properties relevant to gastrointestinal mucosal tissues in the ultraviolet to visible range: absorption coefficients from 1 to 25 cm-1 and reduced scattering coefficients from 5 to 25 cm-1. Neural network and partial least squares algorithms were trained on radial reflectance profiles generated by a Monte Carlo model as well as by experimental data. These routines were then used to estimate absorption and reduced scattering coefficients from reflectance data. Results are discussed in terms of the optimization of models for optical property determination.


INTRODUCTION
Fiberoptic-based fluorescence spectroscopy systems have shown great potential for endoscopic detection of gastrointestinal neoplasia and monitoring of photodynamic therapy. However, in order to perform theoretical analyses for optimization of fluorescence and other spectroscopic techniques, in situ tissue optical property data in the ultraviolet A (UVA, 320-400 nm) and visible (VIS, 400-700 nm) regions are needed. Optical property data in this range are also necessary to determine accurate dosimetry standards for endoscopic optical diagnostic devices.
The use of reflectance measurements to quantify tissue optical properties is well established. The most common approach involves measurements at a series of radial distances from the source location, typically with a linear array of optical fibers [1][2][3][4][5] . This technique involves three primary steps: (1) generation of a calibration set of radial reflectance profiles corresponding to known sets of optical properties (either through direct measurements of tissue phantoms or light transport simulations); (2) use of the calibration data in training an algorithm to identify the relationship between reflectance profile and optical properties; and (3) measurement of a reflectance profile in an unknown sample. Variations on this technique have included the use of a CCD camera for radial reflectance measurements 6 , the measurement of reflectance at individual locations 7 and the determination of the slope of the radial decay curve at two locations (with each slope measurement requiring two measurements) 8,9 . Typically, measurements performed in prior studies have involved wavelengths in the upper visible and infrared regions -regimes in which absorption and scattering are much less than in the UVA-VIS regions. The source-collection separation distances used in these prior studies typically range from several millimeters to centimeters.
It should be noted that in two previous studies, endoscopic measurement of esophageal optical properties was performed at 514 and 630 nm using a probe with radial fiber orientation, thus enabling source-collection separation distances of up to 14 mm 8,9 . Two primary factors distinguish this work from the vast majority of prior studies: the range of optical properties and the illumination/collection delivery system. Our intent was to develop an approach that is useful for UVA and visible regions of the spectrum. Limited data in the literature indicates that the absorption and scattering coefficients for gastrointestinal mucosal tissues in this spectral region may reach values as high as 30 and 25 cm -1 , respectively. [9][10][11][12] Therefore, the large source-detector separation distances used in previous studies may not be appropriate. The second major issue is the size of the instrument channel through which the probe is delivered. In a typical gastroscope, instrument channel diameter is approximately 2.0-2.8 mm (Olympus, Karl Storz), which significantly complicates the implementation of probe designs suggested previously.
The goals of this study were to determine the feasibility of a relatively simple reflectance-based approach to endoscopecompatible optical property determination as well as to evaluate neural network and partial least squares techniques and the use of different types of calibration data. This process involved performing Monte Carlo simulations and experimental measurements (on tissue phantoms) for a wide range of tissue optical properties as well as the development of data processing algorithms based on neural networks and partial least squares. Results are discussed in relation to optimization of techniques for in vivo determination of tissue optical properties.

Overview of Data Sets
The first task in this study was to generate radial reflectance distributions for a large number of µ a -µ s ' pairs. Two lists, each containing 30 optical property pairs, were developed: a "uniform" list including µ a values of 1, 5, 10, 15, 20, and 25 cm -1 and µ s ' values of 5, 10, 15, 20, and 25 cm -1 ( Figure 1a); as well as a "random" list generated using a uniform random distribution over a µ a range of 0-25 cm -1 and a µ s ' range of 5-25 cm -1 (Figure 1b). One simulated and one experimentally-measured radial distribution was generated for each of these 60 optical property pairs. The resulting 120 distributions were divided into four data sets labeled by how the optical property pair and the actual distribution were developed (simulated-uniform, simulated-random, measured-uniform or measured-random).

Computational Methods
A weighted photon Monte Carlo model of light transport was developed to calculate radial reflectance distributions for given optical property pairs (µ a , µ s '). A brief explanation of the Monte Carlo method is provided here, while more thorough versions are available elsewhere. 13,14 . This technique involves the repeated generation of random numbers and calculation of stochastic relations to simulate the random walk of a large number of individual photons through tissue. These relations are used to calculate parameters such as the angle and location of photon launch, the step size between scattering events, the angle of scatter, refraction/reflection at surfaces and photon termination. Photons exiting the tissue at the surface were subjected to angular restriction according to NA = n i sinθ, where n i = 1.37. The location of "detected" photons were recorded in radial bins 0.025 mm in width.

Experimental Methods
A diagram of the experimental apparatus is provided in Figure 2. All reflectance measurements were performed using a temperature-controlled 5 mW laser diode (670 nm source, Edmund Industrial Optics, Barrington, NJ), which was found to have an emission peak at 675 nm. This source was used due to its low cost, stability and ease of comparison with prior studies. While future research in our lab will involve shorter wavelengths, for the present study the actual wavelength is much less important than the optical property range, which should correspond to biological tissues in the UVA-VIS range. Neutral density filters were used to adjust the input power. A custom-designed fiberoptic probe (FiberTech Optica, Ontario, Canada) was used to deliver light from the source to the sample and guide diffuse reflectance from the sample to the detector. The source leg of the probe consisted of a single fiber, while the detection leg consisted of six fibers spaced at center-to-center distances of 0.23, 0.67, 1.12, 1.57, 2.01 and 2.46 mm from the source fiber. All fibers had a diameter of 0.2 mm. The probe was submersed in a small cuvette which was large enough to provide a semiinfinite medium -no change in reflectance was produced when larger cuvette sizes were used. The detection leg terminated at an inverted microscope (Diavert, Leitz, Germany) with the output imaged onto a CCD camera (Model CH250, Photometrics, Tuscon, AZ) Images were acquired and stored on a personal computer. The linearity of the CCD images was verified using a power meter (Labmaster, Coherent Inc., Santa Clara, CA) Tissue phantoms were generated for each of the sixty optical property pairs on the uniform and random lists. These phantoms were created by combining varying concentrations of a scatterer (Intralipid®, Baxter Healthcare Corporation, Deerfield, IL), an absorber (nigrosin) and distilled water. Scattering coefficients were determined through linear scaling given a µ s ' value of 18 cm -1 at 675 nm for a 15% concentration of the stock 10% Intralipid®, as calculated from spectrophotometer measurements and the inverse adding doubling technique, assuming that g = 0.72. Absorption coefficients were determined based on a µ a value of 34 cm -1 for the stock nigrosin solution at 675 nm, as determined from transmission measurements.

Data Processing Methods
Raw reflectance data from computational or experimental results was preprocessed using the following equation: where R is the intensity at a specific radial location and f is the collection fiber position (from 1 to 6). This formulation was used to convert absolute data to normalized profiles, reduce the variation from several orders of magnitude to less than one order of magnitude and ensure that all values were positive. These data were used to both develop and evaluate models for predicting optical parameters from radial reflectance data. Development of predictive models was performed with partial least squares (PLS) and neural networks (NN) techniques. These methods were implemented due to their established use in biomedical optics and spectroscopy literature. Calculations were performed using the Matlab software package (The Mathworks, Inc.) with Neural Networks and Chemometrics toolbox routines.
The method of PLS has been reviewed previously 15,16 . This approach involves the determination of a calibration matrix B, by regression between the two matrices, X and Y, which representing the reflectance and corresponding optical property matrices, respectively. The matix X contains m=5 columns of reflectance data (representing five fiber locations). The matrix Y contains p=2 columns -the two optical properties which characterize each sample. The PLS algorithm employs a singular value decomposition function to iteratively decompose the optical property and reflectance data and form a model matrix B: such that the n rows of X and Y contain information about n samples. The elements of B describe the linear PLS model relating the distribution of S to absorption and scattering coefficients. The steps used to calculate B are detailed in 17 . Cross validation of the algorithm was then performed to evaluate its performance and select the appropriate number of factors for the model.
The NN procedure involved a feed-forward backpropagation network based on a Levenburg-Marquardt training function. The input layer had five nodes, corresponding to the five nonzero S values and the output layer contained two nodes, one for each of the two optical properties predicted. Training of the network involved dividing the calibration data into three subsets. Approximately half of the calibration data was used for a training set, a quarter for a testing set and the final quarter for a validation set. Algorithm training was terminated when the performance gradient decreased beyond a preset level.

General Trends in Computational Results
Simulations were performed for uniform and random optical property data sets. Selected results from the uniform data set are presented in Figure 3 to illustrate the effect of absorption and scattering coefficients on radial reflectance distributions. In these graphs, the effect of each optical property is isolated by holding the other optical property constant at a moderate level. Reflectance data are presented in processed form according to equation 1.
Characteristic changes are evident for both µ a and µ s '. As µ a increased, there was a significant increase in S at all fiber positions and the slope of the curve increased dramatically. The magnitude of the changes in S due to µ a were relatively constant as µ a increased. Increasing µ s ' produced a more subtle effect on S (Figure 3b). At the second collection fiber (r = 0.67), µ s ' had minimal influence on S, whereas the other collection fibers showed variations in S which increased with distance from the source fiber. The magnitude of these changes was greatest at higher µ s ' levels. While µ s ' caused a change in slope, this change was smaller in magnitude than that produced for µ a .
These data may provide some initial insight into the optical property estimation process. Since changes in µ a appear to affect S to a greater extent than µ s ' does, it is likely that it will be possible to predict µ a with greater accuracy than µ s '. These data also indicate that determination of reflectance at point 2 (r = 0.67 mm) and the slope at larger radial distances may be important for determinating the inverse solution -the former is almost independent of µ s ' whereas the latter is dependent on both µ a and µ s '.

Evaluation of Optical Property Estimation Models
In this section we analyze the role of algorithm type and calibration/validation set type on prediction accuracy. Each of the four types of data sets were implemented in NN and PLS routines to calibrate optical property determination models which were, in turn, evaluated using each of the four data sets. The ability of each model to predict optical properties for a given validation data set was quantified using the root mean square error (RMSE): where pred refers to the model's prediction and true to the actual optical property value and n indicates the total number of data points. For each calibration data set, a single self-validation (evaluation based on the same data set used for calibration) and the three other validations were performed (Table 1).
Two cases which are particularly relevant to future in vivo studies are 17 and 20 (Table 1), since these represent the selfvalidation and independent evaluation against experimental data for one of the most accurate models tested. Detailed results for these cases are presented in Figure 4. The self-validation graphs in Fig. 4(a,b) show a relatively low level of error throughout the µ a range, but greater error for µ s ', especially at higher values. A similar result is seen in Figure 4(c,d), with the greatest inaccuracies produced for µ s ', particularly at µ s ' = 25 cm -1 .
The graphs in Figure 5 illustrate the results for µ s ' in cases 31 and 32. Both of these graphs contain points which represent poor estimations by the model. The true optical properties of the outlier in Fig. 5a were µ a = 22.8 cm -1 , µ s ' = 23.6 cm -1 and for the two apparent outliers in Fig. 5b, µ a = 25 cm -1 , µ s ' = 5 cm -1 , and µ a = 1 cm -1 , µ s ' = 25 cm -1 . Each of these three optical property pairs are near the edge of the calibrated range (see Figure 1.). The level of error at these locations indicates that for optimal estimates, the range of calibration values should be significantly greater than the range of expected optical property values.    Table 1). Evaluation was performed using a data set comprised of experimental measurements of tissue phantoms with random optical properties. Absorption coefficient data is presented in graphs (a) and (c), whereas reduced scattering results are presented in graphs (b) and (d). Corresponding root mean square errors (from Table 1 Table 1). One outlier in (a) and two poorly-fitting data points in (b) correspond to optical property pairs which were near the edge of the optical property range (see Figure 1) over which the model was calibrated.

General Estimation Accuracy Issues
Prior theoretical investigations have indicated that the standard assumption that differences in phase function from tissue to tissue have minimal effect on light propagation may not be appropriate when small fiber separation distances are used 18,19 , necessitating an alternate approach that involves determination of a third optical property parameter which describes scattering anisotropy 20 . This is due to the fact that very few scattering events take place, thus increasing the relative detection of photons which have undergone high-angle scattering events. If errors were produced due to our use of a Henyey-Greenstein phase function with g = 0.9, they would be most prominent for the lowest µ s ' values investigated, since making light more diffuse decreases the dependence of reflectance on the phase function. However, the agreement between simulated and measured reflectance data was found to be more accurate for low µ s ' values than high ones. This indicates that errors due to phase function were relatively small compared to measurement error.
In general, calibrated models were less accurate in predicting µ s ' than predicting µ a . The average RMSE for all models listed in Table 1 was 1.46 for µ a and 3.10 for µ s ' . As mentioned in Section 3.1, this discrepancy may be due to the fact that in comparison to µ a , changes in µ s ' cause relatively little variation in S. Therefore, any error in S is magnified in the estimation of µ s ' .

Validation Issues
It is generally expected that RMSE would be lower for self-validation than for validation involving other data sets. In most cases, the self-validation RMSE was relatively low, however in several cases it was not lower than when other validation data sets were used. Overall, the highest RMSE levels produced during self-validation were for cases in which measured data was used for calibration. This may be an indication that the experimental data contained a greater degree of error than simulation data.
Since the simulation results incorporate minimal levels of statistical error, the cases in which randomly generated simulation data were used for calibration and uniform data for validation, or vice versa, provide a direct evaluation of the processing procedure (eg. cases 2, 5, 18, 21). As expected, these cases produced the most accurate optical property estimates, providing support for the efficacy of neural network and partial least squares approaches in developing models for optical property estimation.

Neural Networks vs. Partial Least Squares
In almost every case, the use of the NN algorithm gave rise to lower RMSE values than PLS, indicating that a NN approach may be optimal for determination of optical properties in situ. While it is difficult to identify with certainty the reason for the better performance, it may be due to the fact that the NN routine used here incorporated internal validation checks using training, testing and validation sets to avoid overfitting, whereas PLS did not contain such rigorous checks.

Simulation vs. Experimental Data Sets
Calibration with simulation data almost always produced higher RMSE values when validated with experimental data than for validation with simulation data. However, calibration with experimental data did not tend to result in significantly higher RMSE values when simulation data was used in validation. It is possible that these results are due to the fact that the experimental data contained a greater level of error, thus making validation difficult regardless of which data set was used for calibration. These errors may have been due in part to inaccuracies in CCD camera intensity measurement. In the future, we plan to address the issue of experimental error by moving to a system in which in-line neutral density filters can be used to homogenize the light levels read by the CCD.

Uniform vs. Random Data Sets
In general, the random validation sets enabled slightly more accurate validation (lower RMSE values) than did uniform sets. This may be due to the fact that the range of values covered in the random data is slightly smaller than that in the uniform set (Figure 1), which may have led to increased errors when the random data set was used to calibrate the model and then the model had to evaluate data points which were slightly outside the range it had been trained on. This problem may be corrected in the future by always using validation sets which are at least slightly smaller in range than the corresponding calibration set. The literature indicates that calibration of PLS with random spacing of the inputs to a PLS algorithm are necessary to create robust models without artifactual correlations. However, our results for PLS model calibration with uniform data sets do not indicate a significantly reduction in model quality.

Optimal Estimation for Experimental Data
Overall, the best prediction (lowest RMSE values) of the measured-uniform data set was produced when a measuredrandom data set was used for calibration (# 31). The prediction results for the measured-random data set was nearly equivalent for cases #20, 24, and 28, representing calibration with simulated-uniform, simulated-random and measureduniform data. Originally, we hypothesized that the best prediction of an experimental data set would come as a result of calibration with an experimental data set. However, these results indicate that the performance of simulation-calibrated models can closely approximate those calibrated with experimental data. In all of these cases, the model type was a neural network.

CONCLUSIONS
This preliminary study represents a significant step towards development of a combined experimental-numerical approach to endoscopic determination of tissue optical properties in the ultraviolet A and visible ranges. Results indicate the feasibility of a small fiber separation approach to optical property determination. In general, models predicted µ a with a greater degree of accuracy than µ s ' . Both partial least squares and neural network approaches were shown to be valid, although the latter produced more accurate results than the former. While the best estimation of optical properties from experimental reflectance data was produced through neural network model calibration with experimental data, the use of computational results for calibration provided highly accurate predictions as well.

ACKNOWLEDGMENTS
TJP was supported by an appointment to the Research Fellowship Program at the Center for Devices and Radiological Health administered by Oak Ridge Associated Universities through a contract with the U.S. Food and Drug Administration. AJD acknowledges Michael Berns and the Beckman Foundation for fellowship support. The authors would also like to thank Dr. Frederic Bevilacqua for his insightful comments on our manuscript.