Photon noise and constant-volume operators

In an earlier paper [J. Opt. Soc. Am. A 2, 1769 (1985)] a class of nonlinear image processing operators was introduced in which each photoreceptor creates a nonnegative point-spread function whose center height is proportional to its quantum catch and whose volume is constant, so that the local spatial-summation area varies inversely with the local quantum catch. These constant-volume (CV) operators are designed to maximize spatial resolution in the presence of photon noise. In the previous paper it was shown that when CV operators are applied to deterministic images, they produce a surprising range of effects that are reminiscent of human vision, including Mach bands and Weber's-law behavior. In this paper the consequences of applying CV operators to images containing Poisson noise are analyzed. It is shown that a fixed-parameter CV operator can duplicate the global qualitative properties of spatial vision for retinal illuminances ranging from absolute threshold to 1000 Td. Although there are fundamental obstacles to modeling the exact quantitative properties of human spatial vision by CV operators, these operators seem likely to be useful in machine vision.


INTRODUCTION
This paper is a sequel to a recent paper by Cornsweet and Yellott (Ref. 1), which introduced a class of nonlinear image-processing operators based on the following idea: each point in the input image gives rise to a nonnegative point-spread function whose center height is proportional to the light intensity at that point and whose volume is constant, so that the area covered by the point spread, the local spatial-summation area, varies inversely with the input intensity at each point. The output image is the sum of the point-spread functions. In the spatially continuous case this sum takes the form

$$O(x, y) = \iint I(u, v)\,S\{I(u, v)[(x - u)^2 + (y - v)^2]\}\,du\,dv, \tag{1.1}$$

where I is the input image, O is the output image, and S is a nonnegative spread function. In the earlier paper these were called IDS operators. Their paper began with a motivation of IDS operators in terms of their ability to deal efficiently with photon noise, in particular, to adjust automatically the size of the spatial-summation area according to the prevailing quantum catch, so as to maximize spatial resolution at every light level while maintaining a fixed reliability for spatial-contrast detection. Despite this motivation, however, the mathematical analysis of IDS operators in that paper deliberately ignored both the noisiness and the granularity of light: it treated the input image I as a deterministic function whose values range continuously over the nonnegative real numbers. In other words, the analysis derived the consequences of applying IDS operators to functions that represent the expected quantum catch at each point in an input image rather than the catch itself. This restriction was motivated by mathematical expediency: it seemed sensible to begin the analysis of IDS operators with the simplest case and to defer an exact treatment of noisy images to a later paper.
In this paper the theory of IDS operators is extended to photon-noisy images, i.e., to the case in which the input values I(u, v) are Poisson random variables, like the quantum catches of the photoreceptors. To highlight the similarities and the differences between operators of the form of Eq. (1.1) and linear operators, what we called IDS operators in the earlier paper are here renamed constant-volume (CV) operators. (From the standpoint of this new terminology, linear operators can be thought of as constant-area operators. This distinction is explained in Section 3.) Thus Eq. (1.1) is the general form of a spatially continuous CV operator, and Fig. 1 illustrates a spatially discrete CV operator whose spread function is Gaussian.
Cornsweet and Yellott's deterministic analysis focused on the consequences of applying CV operators to test images widely studied in psychophysics: edges, spots, and sinusoidal gratings. It showed that besides the expected effect of causing spatial summation, and thus spatial resolution, to vary with input intensity, CV operators also automatically create a surprising range of additional effects that resemble important properties of human spatial vision, properties not usually associated with spatial-summation mechanisms. In particular, when CV operators are applied to deterministic images, they act as bandpass filters, so that Mach bands appear at edges; the peak amplitude of the Mach band at an edge separating regions of intensity L and L + D depends only on the contrast ratio D/L, so that increment thresholds for spots could be expected to obey Weber's law. Moreover, the background intensity at which the detectability of any given spot begins to obey Weber's law depends on its size: the larger the spot, the sooner its increment threshold should begin to follow Weber's law. This is a property of human vision (Ref. 2) that is hard to explain in terms of models for early visual processing in which the retinal image is first [...]
Overall, the main conclusion of the earlier paper was that while CV operators bear little resemblance to mechanisms commonly used to model retinal image processing (in particular, there is no lateral inhibition: the point-spread function is never negative; Ref. 3), they are remarkably adept at duplicating important features of human spatial vision. Consequently, it seemed worthwhile to extend their analysis to photon-noisy images, to determine how well a more realistic model of retinal image processing based on CV operators could match the quantitative properties of spatial vision.
In this paper we derive the statistical properties (pointwise means and variances) of the output images created by CV operators applied to photon-noisy versions of the input images dealt with in the earlier paper: edges, spots, and gratings. These results permit us to calculate the input image parameters (grating contrast, etc.) that would permit any test image to be discriminated from a uniform field with a given level of reliability and consequently to compare the predictions of a CV operator model with psychophysical data. These results also allow us to assess the potential usefulness of CV operators in artificial-image-processing applications, although that is not the primary focus of this paper.

Overview of Results
An important (and unexpected) result of this analysis is the discovery that when CV operators are applied to photon-noisy images, their consequences are qualitatively different at high and low light levels, a property that was not at all evident from the deterministic analysis of the earlier paper (Ref. 1). At high light levels (10 or more mean quanta/receptor) the expected output image for any noisy input image is essentially the same as the output image that would be produced by applying the same CV operator to the corresponding deterministic input image, i.e., the image whose intensity at each point is the mean quantum catch of the noisy image. Moreover, the variance at each output point is a constant that is independent of the input image (a constant that can be made arbitrarily small by a one-time setting of the scale parameter of the operator). Consequently, at high light levels the effects of CV operators on noisy images are essentially the same as their effects in the deterministic case: they still act as self-adjusting bandpass filters whose pass-band varies with the retinal illuminance, and Weber's law still holds for any test spot once the background intensity reaches a critical level.
However, at very low light levels (when the mean quantum catch falls below 0.1 quantum/receptor), the intrinsic nonlinearity of the CV operation has no opportunity to express itself, and every CV operator, no matter what its spread function, becomes effectively equivalent to some low-pass linear filter. Thus, at low light levels, CV operators applied to photon-noisy images have entirely different consequences from those that we would expect from the deterministic case: they no longer create Mach bands at edges or imply Weber's law. Instead, spot detectability obeys the deVries-Rose law.
From a psychophysical standpoint, this change from bandpass to low-pass filtering means that when photon noise is explicitly acknowledged, CV operators give a natural account of the fact that the shape of the human spatial CSF changes from bandpass to low pass as retinal illuminance falls to low levels (Ref. 4). When only deterministic input images were considered, that effect could not be duplicated by any CV operator, so it is interesting to find that under more realistic assumptions it proves to be an inevitable consequence of photon statistics.
Altogether, the present analysis shows that a simple one-parameter CV model can give at least a qualitative account of the major changes that occur in human spatial vision as the retinal illuminance varies between the absolute threshold [~$10^{-4}$ troland (Td)] and 1000 Td. In the model considered here, photon-noisy retinal images (the quantum catches of the photoreceptors over a 250-msec time interval) are transformed by a fixed-parameter CV operator (the Gaussian operator G defined below in Section 3, with its scale parameter $\sigma$ held constant across all illuminance levels), and it is assumed that any test image becomes discriminable from a uniform field when the peak value of d' across its output image reaches some fixed threshold level. This model correctly predicts that as the retinal illuminance rises from $10^{-4}$ to $10^{3}$ Td, (1) Ricco's area shrinks (Ref. 5); (2) visual acuity rises (for gratings, by an overall factor on the order of 100, which matches the increase in human acuity; Ref. 6); (3) peak spatial-contrast sensitivity rises: the sensitivity at the best spatial frequency grows as the square root of the mean retinal illuminance up to 0.1 Td (Ref. 4) and reaches an asymptote of the order of 100 (threshold contrast ~1%) at 10 Td; (5) increment thresholds obey the deVries-Rose law at low background illuminances and Weber's law at high ones; (6) the background illuminance at which Weber's law begins to hold is higher for small test spots than for large ones (Ref. 2). The model also correctly predicts that two sinusoidal gratings whose frequencies are both above the resolution limit at a given mean luminance level (and thus are invisible when viewed individually) can give rise to visible beats when superimposed (Ref. 7). It seems rather remarkable that such a broad range of phenomena can all be created by a single self-adjusting mechanism driven only by photon noise.
One gets a sense that, despite their lack of resemblance to familiar modeling devices, CV operators must somehow capture a fundamental property of retinal image processing. That would not be [...]
John I. Yellott, Jr.
However, a theory that only predicts the global qualitative properties of spatial vision is clearly inadequate: the real question is whether a model based on CV operators can make uniformly accurate quantitative predictions. The present analysis shows that this cannot be achieved by the bare-bones CV model considered here, a model in which the photoreceptor quantum catch is simply accumulated for a fixed period of time and then filtered by a CV operator. That model has two major flaws. First, at low retinal illuminances (<0.1 Td, the range where CV operators become effectively linear), the illuminance-related changes produced by the CV operation are too sluggish to match human performance. For example, at low light levels human visual acuity grows rapidly with retinal illuminance, nearly proportionally to its square root (Ref. 6), whereas the Gaussian CV operator predicts a much slower rate of growth. This is a byproduct of its effective linearity in this range, which causes it (and all CV operators) to predict incorrectly that at low light levels the spatial CSF should translate rigidly upward (in a log-log plot) as the mean retinal illuminance increases.
[Experimentally, we find instead that between absolute threshold and 0.1 Td, the CSF shifts both upward and sideways (i.e., parallel to the log frequency axis) as illuminance rises (Ref. 4).] The other major flaw appears at moderate-to-high light levels, where the nonlinear properties of the CV operation fully express themselves.
Here, the problem is that the illuminance-related changes in contrast sensitivity produced by CV operators are too drastic to match human performance. For example, the Gaussian CV operator implies that above 10 Td the entire spatial CSF should shift rigidly along the (log) frequency axis as the mean retinal illuminance varies, so that both the peak of the CSF and its high-frequency cutoff (i.e., visual acuity) should grow proportionally with the square root of the retinal illuminance.
(It seems virtually certain that this is true for any CV operator, regardless of its spread function, although mathematical scruples preclude a blanket assertion.) Neither of these effects occurs in human vision: above 10 Td the peak and the cutoff of the CSF for humans both change hardly at all (Ref. 4). In addition, the CV model predicts that as retinal illuminance increases, contrast sensitivity at low spatial frequencies should decrease. That effect is never observed in human vision. Of course the model analyzed here represents a rather primitive implementation of the basic CV idea: its linear analog would be a model with a single working part, a single linear filter, such as a difference of Gaussians, whose parameters are required to remain constant across all retinal illuminance levels. In comparison with that model, the achievements of the CV model analyzed here are quite impressive and might encourage us to construct more elaborate theories in which the basic CV operation is supplemented by other mechanisms. For scotopic and mesopic vision, this could be a useful exercise, because in that range the basic change that needs to be made seems fairly obvious, and it would be interesting to see whether it actually works. For photopic vision, on the other hand, I do not see any obvious way to design a viable model based on CV operators; the operation itself is not well adapted to perform in a high light environment, where photon noise is no longer a significant factor. (These points are discussed in Section 11.) From the standpoint of psychophysical theory, then, the usefulness of CV operators seems at best problematic. Certainly it would be premature to say that such a basic mathematical tool will find no place in psychophysical theory.
That would be like saying that linear operators are useless because they cannot model all the properties of spatial vision. The results presented here do not rule out the possibility that CV operators could be used to construct precise models of spatial vision for a restricted range of retinal illuminances. However, the results do show that there are fundamental obstacles that must be overcome for such a program to be successful. Even if such models could be devised, physiology suggests that the results would probably be more a formal exercise than a description of retinal reality. While it is surprisingly hard to find physiological evidence that clearly rules out the possibility of CV-like operations in the retina, what evidence does exist is negative, and recent studies suggest that the vertebrate retina achieves the same general goal (illuminance-dependent spatial filtering) by entirely different means. (These points are discussed in Section 11.) For visual science, then, CV operators may well remain only theoretical curiosities: devices for modeling a retina that might have been but not the one that we have. For image engineering, however, their future seems more promising. The results presented here show that they provide a simple and effective algorithm for transforming photon-noisy input images whose mean intensities (and, consequently, signal-to-noise ratios) span a large range into output images that have a constant dynamic range, a noise level uniformly smaller than any desired upper bound, and a spatial resolution that automatically adjusts to match the prevailing light level. Although a discussion of specific applications is beyond the scope of this paper, it seems likely that such an algorithm could be useful in the design of surveillance systems and visual robots.

Organization of the Paper
In Sections 2-4 the stage is set for the analysis, and Sections 5-9 give the results. In Section 2 we review the constraints imposed on contrast detection by photon noise and show how they motivate an intensity-dependent spatial-summation operation located at the level of the photoreceptors, an operation implemented by intensity-dependent point-spread functions. In Section 3 we show how CV operators can be arrived at deductively, starting from a general family of variable-point-spread operators, by requiring two properties: (1) an area-intensity trade-off adapted to Poisson statistics and (2) dc suppression. As in the preceding paper, we focus here on a special case: the Gaussian CV operator. In Section 3 we define that operator, explain its unique mathematical convenience, and show that its output image values are uniformly bounded. Section 3 also includes an explanation of the relationship between physically realizable (i.e., discrete) CV operators and their continuous approximations, which are analytically indispensable but require tactful handling.
In Section 4 we review the mathematical properties of CV operators applied to deterministic input images, properties that were derived in the preceding paper (Ref. 1). These are still useful in the noisy-input case, because when the illuminance is not too low (10 or more mean quanta/receptor), the expected output image produced by a Gaussian CV operator for any photon-noisy input image is essentially the same as its output to the corresponding expected input image, i.e., the deterministic image obtained by replacing the input random variables with their expected values. In Section 5 we prove that convenient fact and also show that the output variance of a Gaussian CV operator has an upper bound (for all possible input images) that varies inversely with the square of its scale parameter. When the scale parameter is adjusted to fit psychophysical data, this upper bound proves to be small compared with the output values themselves: the maximum possible standard deviation is of the order of 2% of the mean output value, and for most images the actual standard deviation proves to be about 0.2% of the mean. In other words, the output images produced by this operator can be made virtually noise free. In Sections 6-9 we deal with specific types of input images, the photon-noisy versions of the inputs considered in Ref. 1. In Section 6 we derive the mean and the variance of the response to uniform fields. Surprisingly, the mean response to a noisy uniform field with intensity (mean quanta/receptor) I is not a constant, as it is for deterministic images, but it grows as 1 - exp(-I). This is the first indication of what proves to be a general rule: CV operators produce qualitatively different effects, depending on whether the illuminance is more or less than 10 quanta/receptor. In Section 7 we deal with the response to edges.
A key result of the previous paper was that in the deterministic case, edges in the input image create Mach bands in the output image, and the peak amplitudes of these Mach bands obey Weber's law. Here, it is shown that the same is true of the expected response to Poisson-noisy edges once the input illuminance reaches 10 mean quanta/receptor. Moreover, the peak amplitude value is independent of the scale parameter of the CV operator, which means that the edge-response signal-to-noise ratio can be made arbitrarily large by adjusting that parameter (though at a cost in spatial resolution). [...] quanta/receptor, its shape changes from bandpass to low pass. The same changes also appear in the CSF, and in addition (because of changes in the output noise level) it shows an overall loss of contrast sensitivity at mean illuminances of <1 quantum/receptor, a loss quite similar to the one observed in human CSFs. Section 8 concludes with a demonstration that CV operators also predict the fact that gratings that are individually invisible, because their frequencies are higher than the resolution limit at a given retinal illuminance level, give rise to visible beats when superimposed.
In Section 9 we discuss responses to spots (targets of the sort used in psychophysical increment threshold measurements) and the Gaussian CV operator's threshold-versus-background-intensity (TVI) curves, which obey the deVries-Rose law (quantum-limited detection) at low background intensities and Weber's law at high ones. If a small amount of dark light is assumed, to limit detection on zero backgrounds (i.e., absolute threshold), the CV operator's TVI curve provides a fairly good (but intrinsically never perfect) fit to human data. In Section 9 we also show how CV operators give rise to Ricco's law and cause the size of Ricco's area to shrink as background illuminance rises. Section 9 concludes with a calculation of the quantum efficiency of CV operators (i.e., of an observer whose input is a photon-noisy image filtered by a Gaussian CV operator).
In Section 10 some unsolved mathematical problems posed by CV operators are listed. In contrast to the deterministic case dealt with in the preceding paper (Ref. 1), the noisy-input case has not proved to be mathematically docile, and most of its properties are known analytically only for special cases (in particular, the Gaussian case) or, even worse, are known only empirically, from the results of simulations. Thus, while the main outlines of the consequences of image processing by CV operators now seem clear from a mixture of analysis and computation, many open problems still remain. Those described in Section 10 are only the ones that seem most immediate.
Finally, in Section 11 we summarize the successes and failures of the Gaussian CV operator as a model of image processing in the retina and discuss the problems involved in remedying its defects.

PHOTON NOISE AND SPATIAL SUMMATION
As Rose (Ref. 8) and deVries (Ref. 9) pointed out more than 40 years ago, contrast detection by any visual system is ultimately limited by the ability of the system to determine whether the mean of one Poisson random variable differs from that of another. For example, suppose that a patch of retina contains R photoreceptors and that its illuminance, the mean quantum catch per receptor, is I in one time interval and I + CI (C > 0) in the next. Any sensible mechanism for detecting this increase will at least require the actual total catch in the second interval to be greater than that in the first. Using that criterion alone (and the normal approximation to the Poisson distribution), the probability of detection is $N\{C[RI/(2 + C)]^{1/2}\}$, where N is the normal distribution function:

$$N(x) = (2\pi)^{-1/2}\int_{-\infty}^{x}\exp(-t^2/2)\,dt. \tag{2.1}$$

For this probability to be at least 0.99, it is necessary that $C[RI/(2 + C)]^{1/2} > 2.3$, or, as an order-of-magnitude requirement,

$$C^2 RI \geq 10. \tag{2.2}$$

It is instructive to compare relation (2.2) with the parameters of human vision. The working range of the visual system is roughly -4 to +6 log Td, and the mean quantum catch of the photoreceptors is about 4 per sec per receptor per troland (Ref. 10). If 0.25 sec is taken as a conservative upper bound for the temporal integration period of photoreceptors, relation (2.2) implies that the minimum contrast that can reliably be detected from the quantum catch of a single receptor ranges from C = 316 at $10^{-4}$ Td through C = 3.16 at 1 Td to C = 0.003 at $10^{6}$ Td. Human contrast thresholds are generally much lower than this (of course, for targets larger than a receptor): observers can detect contrasts of the order of C = 1 when retinal illuminance is $10^{-4}$ Td and of the order of C = 0.01 above 1 Td. 4 ' 1 Clearly this would be impossible without some mechanism that, in effect, sums the quantum catches of many receptors, creating a signal that can satisfy the kind of constraint represented by relation (2.2).
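The single-receptor figures quoted above can be checked with a few lines of Python. This is a minimal sketch, not part of the original analysis: it assumes that relation (2.2) has the order-of-magnitude form C²RI ≥ 10 (inferred from the quoted contrasts), and the function names `detection_probability`, `min_contrast`, and `required_receptors` are hypothetical.

```python
import math

QUANTA_PER_SEC_PER_TD = 4.0   # mean catch per receptor per second per troland (from the text)
INTEGRATION_TIME_S = 0.25     # upper bound on the temporal integration period (from the text)

def detection_probability(C, R, I):
    """Normal approximation to P(total catch in interval 2 > total catch in interval 1)."""
    z = C * math.sqrt(R * I / (2.0 + C))
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def min_contrast(R, I, k=10.0):
    """Smallest contrast satisfying the assumed relation C^2 * R * I >= k."""
    return math.sqrt(k / (R * I))

def required_receptors(C, I, k=10.0):
    """Smallest whole number of receptors R satisfying C^2 * R * I >= k."""
    return max(1, math.ceil(k / (C * C * I)))

for log_td in (-4, 0, 6):
    # Mean quantum catch per receptor over one integration period.
    I = QUANTA_PER_SEC_PER_TD * INTEGRATION_TIME_S * 10.0 ** log_td
    print(f"10^{log_td} Td: single-receptor minimum contrast = {min_contrast(1, I):.4g}")
```

Under the same assumption, `required_receptors` gives the number of receptors whose catches would have to be pooled to reach a given contrast threshold at a given illuminance.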
On the other hand, spatial summation necessarily limits spatial resolution: a signal based on the total quantum catch of many receptors cannot carry information about the catch of any single one. Thus there is always a conflict between the ability to detect small contrasts and the ability to detect any contrast at all in small areas. Requirement (2.2) suggests resolving this conflict by causing the summation area R to vary inversely with the illuminance level I, permitting a fixed minimum contrast to be detected with a fixed reliability across all illuminance levels and simultaneously maximizing spatial resolution at any given level.
Assuming that intensity-dependent spatial summation is desirable, how should it be accomplished? The first design question is, Where should the size of the summation area be set: at the level of the summation units that collect signals from photoreceptors or at the level of the receptors themselves? Any attempt to implement the first solution must contend with an awkward "catch-22" dilemma: in order to adjust the size of its summation area according to the light level, a summation unit must estimate that level from signals it gets from receptors in its current summation area, but it cannot decide whether any given receptor should be within that area until it knows the light level. A more attractive option is to let each photoreceptor vote on the appropriate size of the summation area, guided by its own quantum catch, i.e., spread its signal over an area that varies according to that catch. In that case there is no catch-22 problem: no receptor needs to know anything more than its own input, and locally optimal spatial-summation areas can be created by parallel computations occurring simultaneously throughout the retina.

CONSTANT-VOLUME OPERATORS
In Section 2 we showed that photon noise motivates an intensity-dependent spatial-summation operation, and design considerations suggest implementing that operation at the level of the photoreceptor point-spread functions. At that early level it seems pointless to build in any orientation bias, so the operation should be rotation invariant, and parsimony dictates that it should also be translation invariant.
Combining all four requirements, we are led to a class of image-processing operators of the general form

$$O(x, y) = \iint F[I(u, v)]\,S\{G[I(u, v)][(x - u)^2 + (y - v)^2]\}\,du\,dv, \tag{3.1}$$

where I is the input image; O is the output image; S is an arbitrary real-valued spread function, which is assumed to be nonnegative to capture the idea of summation; and F and G are real functions that remain to be determined. This class includes both linear operators (the case in which F is the identity function and G is a constant) and the constant-volume operators defined in Section 1 (the case in which both F and G are the identity function).
We shall now specify F and G. Operator (3.1) describes a two-stage process in which each input point (u, v) first gives rise to a point-spread function whose value at the output point (x, y) is

$$F[I(u, v)]\,S\{G[I(u, v)][(x - u)^2 + (y - v)^2]\},$$

and the output image is then the sum of these functions. The volume of the point-spread function around input point (u, v) is

$$\{F[I(u, v)]/G[I(u, v)]\}V_S,$$

where $V_S$ is the constant given by

$$V_S = \iint S(x^2 + y^2)\,dx\,dy.$$

If it is assumed that S(0) > 0, the equivalent area covered by the point-spread function around input point (u, v) is its volume divided by its center height, i.e.,

$$V_S/\{S(0)\,G[I(u, v)]\}. \tag{3.4}$$

The signal-detection argument in Section 2 implies that this area should vary inversely with the input intensity I(u, v), and so we are led to the specialization G(I) = I.
This leaves F to be identified. Nothing about photon noise per se seems to force that choice, so its rationale must be sought elsewhere. One consideration here is to ensure that the output signals created by the operator should always fall within a fixed dynamic range, regardless of the actual retinal illuminance level (since that level spans a range of 10 log units, while the optic nerve has an effective working range of about 2). Also, at any fixed mean illuminance level, it seems sensible that the responses to input contrasts around that level should be able to exploit the entire response range.
This suggests that our operator should be designed to create dc suppression, i.e., to cause all uniform fields, whatever their intensity, to produce the same baseline response. If the input image I in Eq. (3.1) is a uniform field, i.e., I(u, v) equals any positive constant L, then, after a change of variables, the output image is also a uniform field whose value at every point is

$$[F(L)/G(L)]V_S. \tag{3.6}$$

Consequently, to guarantee dc suppression, we require F(L)/G(L) to be a constant, which we can arbitrarily take to be 1. [...] With G(I) = I this gives F(I) = I, and operator (3.1) becomes

$$O(x, y) = \iint I(u, v)\,S\{I(u, v)[(x - u)^2 + (y - v)^2]\}\,du\,dv, \tag{3.7}$$

i.e., the CV operators described in Section 1.
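The uniform-field step in the dc-suppression argument can be written out explicitly. The following is a sketch under the assumption that the general operator has the form $O(x, y) = \iint F[I]\,S\{G[I][(x-u)^2 + (y-v)^2]\}\,du\,dv$, with $V_S$ denoting the volume of S; the substitution variables p and q are introduced here only for illustration.

```latex
\begin{align*}
O(x, y) &= \iint F(L)\, S\{G(L)[(x - u)^2 + (y - v)^2]\}\, du\, dv \\
        &= \frac{F(L)}{G(L)} \iint S(p^2 + q^2)\, dp\, dq
           \qquad \bigl(p = \sqrt{G(L)}\,(u - x),\; q = \sqrt{G(L)}\,(v - y)\bigr) \\
        &= \frac{F(L)}{G(L)}\, V_S .
\end{align*}
```

The factor 1/G(L) is the Jacobian of the substitution. The response is therefore independent of L exactly when F(L)/G(L) is constant, and with G(L) = L that forces F(L) to be proportional to L.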

Gaussian Constant-Volume Operators
In this paper, as in the preceding one, we focus mainly on a special case: the Gaussian CV operator, denoted by G and defined as follows:

$$G(I)(x, y) = \iint \frac{I(u, v)}{2\pi\sigma^2}\exp\{-I(u, v)[(x - u)^2 + (y - v)^2]/2\sigma^2\}\,du\,dv. \tag{3.8}$$

Here, the point-spread function around an input point (u, v) with an intensity I(u, v) is a two-dimensional Gaussian probability-density function whose equivalent area is $2\pi\sigma^2/I(u, v)$. The distance unit in both the input and output image planes is assumed to be the diameter of one photoreceptor, and so $2\pi\sigma^2/I(u, v)$ describes the size of the point spread in receptor areas. The scale parameter $\sigma$ entirely determines the numerical properties of G; sometimes we use the notation $G_\sigma$ to make that dependence explicit. From a mathematical standpoint, the Gaussian is a uniquely convenient form of the point-spread function, because it is the only separable case, i.e., S(

Discrete Constant-Volume Operators
Continuous CV operators such as that described by Eq. (3.8) are analytically useful because they admit the power of the calculus, but from a physical standpoint they can be regarded only as approximations to discrete CV operators, operators that could actually be constructed. To appreciate the limits of these continuous approximations, we need a precise model of the discrete CV operators that they are supposed to emulate. Such operators can be described as follows.
Imagine that an image falls upon a checkerboardlike array of densely packed square photoreceptors, each measuring 1 × 1 (so that the distance unit in the input image plane is one receptor diameter). The center of one receptor is taken to be the origin of the input plane, and each receptor is identified by its centerpoint coordinates (u, v), where u = 0, ±1, ±2, ..., and v = 0, ±1, ±2, ..., relative to that origin. An input image I corresponds to some function mapping the set of receptor coordinates {(u, v)} into nonnegative integers {I(u, v)}: the quantum catches of the receptors. Next, imagine a second plane, the output image plane, containing another densely packed checkerboard array of square summation units, each again 1 × 1 (so that the distance unit in the output image plane is one photoreceptor diameter). Each summation unit is identified by the coordinates of its centerpoint, say, (x, y), and the summation unit (x, y) is thought of as lying directly below receptor (x, y). When receptor (u, v) catches I(u, v) quanta, it creates a point-spread function over the output image plane whose value at any point (p, q) is

$$I(u, v)\,S\{I(u, v)[(p - u)^2 + (q - v)^2]\}. \tag{3.9}$$

The summation unit centered at (x, y) integrates the point-spread function (3.9) over its 1 × 1 surface area, so that the point-spread contribution from receptor (u, v) to summation unit (x, y) is

$$\int_{x-1/2}^{x+1/2}\int_{y-1/2}^{y+1/2} I(u, v)\,S\{I(u, v)[(p - u)^2 + (q - v)^2]\}\,dq\,dp. \tag{3.10}$$

The total point-spread contribution from all the receptors to the summation unit (x, y) is the sum of expression (3.10) over all u, v, and that number is the output image value at coordinates (x, y); i.e., the discrete output image is exactly

$$O(x, y) = \sum_{u}\sum_{v}\int_{x-1/2}^{x+1/2}\int_{y-1/2}^{y+1/2} I(u, v)\,S\{I(u, v)[(p - u)^2 + (q - v)^2]\}\,dq\,dp. \tag{3.11}$$

Now we make approximations. If the point-spread function (3.9) is essentially constant over the 1 × 1 area of summation unit (x, y), the integral expression (3.10) can be replaced by its value at the center of that unit, so that

$$O(x, y) \approx \sum_{u}\sum_{v} I(u, v)\,S\{I(u, v)[(x - u)^2 + (y - v)^2]\}. \tag{3.12}$$

If it is assumed that [...] approximately constant over areas the size of a single receptor, the sum in Eq. (3.12) can be replaced by the integral Eq. (3.7).
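The discrete construction just described can be sketched directly for the Gaussian case. The code below is an illustration only: it assumes the Gaussian point spread around a receptor with catch I is a two-dimensional normal density with variance σ²/I (which gives the equivalent area 2πσ²/I stated earlier), so the integral over each 1 × 1 summation unit factors into two one-dimensional error-function differences. The names `unit_cell_mass` and `discrete_gaussian_cv` are hypothetical, and the finite array simply truncates the point spreads at its border.

```python
import math

def unit_cell_mass(d, s):
    """Mass of a zero-mean 1-D Gaussian (std s) on the interval [d - 0.5, d + 0.5]."""
    a = (d - 0.5) / (s * math.sqrt(2.0))
    b = (d + 0.5) / (s * math.sqrt(2.0))
    return 0.5 * (math.erf(b) - math.erf(a))

def discrete_gaussian_cv(image, sigma):
    """Apply a discrete Gaussian CV operator to a 2-D list of integer quantum catches."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            catch = image[u][v]
            if catch <= 0:
                continue  # a receptor that caught no quanta spreads nothing
            s = sigma / math.sqrt(catch)  # spread narrows as the catch grows
            row_mass = [unit_cell_mass(x - u, s) for x in range(rows)]
            col_mass = [unit_cell_mass(y - v, s) for y in range(cols)]
            for x in range(rows):
                for y in range(cols):
                    # Integral of the point spread over summation unit (x, y).
                    out[x][y] += row_mass[x] * col_mass[y]
    return out

# A single receptor catching 4 quanta: its total output volume should be close to 1.
single = [[0] * 21 for _ in range(21)]
single[10][10] = 4
spread = discrete_gaussian_cv(single, sigma=1.5)
print(sum(map(sum, spread)))
```

Applied to uniform fields of different intensities, the same function returns nearly identical values far from the array border, which is the dc suppression that motivated the constant-volume constraint.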

Saturation
The error created by approximating Eq. (3.11) with Eq. (3.7) depends on the size of the input values I(u, v) relative to the spatial extent of the basic spread function S. In general, as I becomes large, the area covered by a point-spread function $I\,S(Ir^2)$ shrinks, and eventually that function can no longer be treated as constant across a single summation unit. The most important practical consequence of this is that the continuous approximation [Eq. (3.7)] fails to reveal a saturation effect that limits the performance of any real CV operator at high light levels. Saturation occurs when the quantum catch of a receptor creates a point-spread function whose entire volume is confined to an area smaller than the receptor itself, smaller, that is, than the 1 × 1 area of the summation unit below that receptor. In that case, the true point-spread contribution from the receptor to the summation unit directly below it [i.e., the integral, Eq. (3.10), for x = u, y = v] is the volume constant $V_S$, but the approximation treats it as I(u, v)S(0), which is potentially unbounded. When the quantum catch at every receptor exceeds the saturation limit of a discrete CV operator, its true output image is a uniform field with the value $V_S$ everywhere, so all contrast is lost. However, that loss will not necessarily be apparent from its continuous approximation, since the approximation envisages infinitely small receptors and summation units, which would make saturation impossible. For example, when the input is a grating of the form $L(1 + m\cos 2\pi f u)$, with $m^2 < 1$, and L grows without bound, any discrete CV operator will eventually saturate and produce a uniform output field for every frequency f. However, its continuous approximation will disguise that fact and imply instead that the operator's high-frequency cutoff simply increases without bound.
To avoid such pitfalls, we must know the input level at which a discrete CV operator will begin to show significant saturation effects. Here, we are concerned chiefly with the Gaussian operator G approximated by Eq. (3.8), and in all numerical examples its parameter $\sigma$ is taken to be 100.
(That choice is based on psychophysical modeling considerations discussed below.) Analysis and computational experience show that this operator begins to saturate when the input level reaches about $10^4$ quanta/receptor. (At that level the equivalent area of the point-spread function is roughly 6 receptors, and the point spread from a receptor to its own summation unit is about 15% of the total spread.) Consequently, its continuous approximation will be used only for images of considerably less than $10^4$ quanta/receptor.
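The two figures quoted in the parenthesis above can be checked with a short calculation. The sketch below assumes that the Gaussian point spread produced by a catch of I quanta is $(I/2\pi\sigma^2)\exp(-Ir^2/2\sigma^2)$, i.e., a unit-volume Gaussian with standard deviation $\sigma/I^{1/2}$, and it takes "equivalent area" to mean volume divided by center height; both the function names and that definition are assumptions made for illustration.

```python
import math


def normal_cdf(z):
    """Standard normal integral."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def self_contribution(catch, sigma):
    """Fraction of a receptor's unit-volume Gaussian point spread that falls
    on the 1 x 1 summation unit directly below it."""
    s = sigma / math.sqrt(catch)               # standard deviation of the point spread
    side = normal_cdf(0.5 / s) - normal_cdf(-0.5 / s)
    return side * side                         # a 2-D Gaussian is separable in x and y


def equivalent_area(catch, sigma):
    """Volume divided by center height, in receptor areas (assumed definition)."""
    return 2.0 * math.pi * sigma ** 2 / catch


if __name__ == "__main__":
    for catch in (1e2, 1e3, 1e4, 1e5):
        print(catch,
              round(equivalent_area(catch, 100.0), 2),
              round(self_contribution(catch, 100.0), 3))
```

For $\sigma$ = 100 and a catch of $10^4$ quanta, this gives an equivalent area of about 6.3 receptor areas and a self-contribution of about 0.15, in line with the values quoted in the text.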

Integer-Valued Inputs and Bounded Output Images
Whether we deal with discrete CV operators or their continuous approximations, it is important to bear in mind that the input image values I(u, v) are receptor quantum catches and consequently must be integers, never fractions. The importance of this constraint can be seen if we attempt to determine what the maximum output value can be for a given CV operator, e.g., the Gaussian case [Eq. (3.8)]. It was previously asserted, without proof, that the Gaussian CV operator compresses all possible input images into a common finite range of output values. But when we try to prove that claim, an apparent problem arises. Suppose that we seek to construct the input image I that will create the maximum possible output value [Eqs. (3.13) and (3.14)]. When it is converted to polar coordinates, Eq. (3.14) yields a value that is infinite, a disturbing result. Of course, the solution of Eq. (3.14) is unrealistic because it fails to represent the fact that the unbounded input image $2\sigma^2/r^2$ would saturate the receptor at the origin, making its output 1 rather than $\infty$.
To correct for this, we can break Eq. (3.14) into two parts, one giving a realistic account of the point-spread contribution of the receptor at the origin and the other giving the total contribution of all other receptors. However, this concession to realism does not solve the problem; the maximum output is still infinite. In fact, any input image of the form I(r) = K, for r ≤ R, and I(r) = $2\sigma^2/r^2$, for r > R, will make Eq. (3.14) infinite. Thus it is not the large values of the maximal input image $2\sigma^2/r^2$ near the origin that cause the output to be infinite; rather, it is the small values of $2\sigma^2/r^2$ far from the origin. And therein lies the fallacy of the solution. Since the values of any real input image must be integers, the maximizing input function I(r) = $2\sigma^2/r^2$ describes a possible input only when $r < \sigma\sqrt{2}$. For larger values of r, the integer-valued I(r) that maximizes the integrand in Eq. (3.13) is I(r) = 1. Consequently, the maximum output value for any physically possible input image is in fact bounded.
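The argument above can be turned into a rough numerical bound. The following is only a sketch under stated assumptions: the Gaussian point spread is taken to be $(I/2\pi\sigma^2)\exp(-Ir^2/2\sigma^2)$, the saturated receptor at the origin contributes its full unit volume, the maximizing input $2\sigma^2/r^2$ is used (without rounding to integers) from a hypothetical inner radius `r0` out to $r = \sigma\sqrt{2}$, and I(r) = 1 is used beyond that. The inner radius `r0 = 0.5` (half a receptor width) is an illustrative choice, not a value taken from the text.

```python
import math


def max_output_bound(sigma, r0=0.5):
    """Rough upper bound on the output of the Gaussian CV operator at one point.

    origin: the saturated receptor at the origin contributes at most 1.
    ring:   with I(r) = 2*sigma**2 / r**2 the integrand is 1 / (pi * e * r**2);
            integrating it over the annulus r0 < r < sigma*sqrt(2) gives
            (2 / e) * ln(sigma * sqrt(2) / r0).
    tail:   with I(r) = 1 for r > sigma*sqrt(2), the remaining Gaussian
            volume is exp(-1).
    """
    origin = 1.0
    ring = (2.0 / math.e) * math.log(sigma * math.sqrt(2.0) / r0)
    tail = math.exp(-1.0)
    return origin + ring + tail


if __name__ == "__main__":
    print(round(max_output_bound(100.0), 2))
```

With $\sigma$ = 100 this sketch gives a bound of about 5.5. The bound grows only logarithmically with $\sigma$, which is why the integer constraint on I(r) is enough to make the output range finite.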

RESPONSES TO DETERMINISTIC INPUT IMAGES
To understand the effects of CV operators on photon-noisy images, integer-valued stochastic processes, it is analytically convenient to begin with their responses to deterministic input images whose values need not be integers, i.e., arbitrary nonnegative real functions I(u, v). These can be thought of as the expectations of actual noisy images.
Cornsweet and Yellott¹ derived the main results of applying CV operators to such images. Their analysis focused on the Gaussian operator G [i.e., Eq. (3.8)] but emphasized that this case is unique only in its mathematical convenience: all CV operators share the same general properties. To demonstrate this, Cornsweet and Yellott proved a number of theorems that apply to all operators of the form of Eq. (1.1), independent of the exact form of the spread function S. In this section we review some of these results for later use.
(The first two are obvious consequences of the construction of CV operators. All proofs can be found in Ref. 1.)

Result 1: Invariance under Translation and Rotation
For all CV operators, translating or rotating the input image by any amount leaves the output image unchanged except for translation or rotation by the same amount.

Result 2: Uniform Field Response
For all CV operators, the output image produced by any nonzero uniform-field input I(u, v) ≡ L is a uniform field whose value at every point is the volume constant $V_S$.

Result 3 concerns intensity scaling: for all CV operators, multiplying the input image by a constant k is equivalent to rescaling the original image by the factor $k^{1/2}$ on both dimensions, applying the operator to that image, and then rescaling the output image by the factor $1/k^{1/2}$, i.e., back to the original size. Weber's-law behavior and many other properties follow from this.

Result 4: Edge Responses, Mach Bands, and Weber's Law
When the input image is an edge of the form I(u, v) = L, for u < 0, and I(u, v) = L + D, for u > 0 (i.e., a step), the output of the Gaussian CV operator is

$$G[I](x, y) = N[-(x/\sigma)L^{1/2}] + N[(x/\sigma)(L + D)^{1/2}], \qquad (4.2)$$

where N is the normal integral [Eq. (2.1)]. Figure 2 illustrates the response of G (with the parameter $\sigma$ = 100) to three different input edges. Here L ranges from 1 to 100 quanta/receptor, but the contrast D/L is fixed at 0.5. It can be seen that all the responses exhibit Mach bands, and the peak and trough amplitudes of these Mach bands are the same in all cases. Analysis of Eq. (4.2) shows that these amplitudes depend only on the ratio D/L. In other words, the peak and trough values of the Mach bands in any edge response always obey Weber's law. This is a general property of all CV operators. Another general property is that the distances of the peak and the trough of the Mach bands from the edge itself vary as $1/L^{1/2}$. This suggests that spatial resolution will vary as $L^{1/2}$, which proves to be true.
(In view of result 1, all the statements in result 4 apply to any edge, whatever its location and orientation.)
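The Weber's-law property of the Mach bands in result 4 can be illustrated numerically. The sketch below assumes the Gaussian point spread $(I/2\pi\sigma^2)\exp(-Ir^2/2\sigma^2)$; integrating it over a step edge gives the sum of two normal integrals used in `edge_response`, and setting the derivative to zero gives the peak location $x_{max} = \sigma[(1/D)\ln(1 + D/L)]^{1/2}$. The function names are illustrative.

```python
import math


def normal_integral(z):
    """Standard normal integral N(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def edge_response(x, L, D, sigma=100.0):
    """Response at distance x from a step edge (L for u < 0, L + D for u > 0)."""
    return (normal_integral(-x * math.sqrt(L) / sigma)
            + normal_integral(x * math.sqrt(L + D) / sigma))


def peak_location(L, D, sigma=100.0):
    """Distance of the Mach-band peak from the edge (the trough is at -x)."""
    return sigma * math.sqrt(math.log(1.0 + D / L) / D)


if __name__ == "__main__":
    for L in (1.0, 10.0, 100.0):
        D = 0.5 * L                      # fixed contrast D/L = 0.5
        x = peak_location(L, D)
        print(L, round(x, 2),
              round(edge_response(x, L, D), 4),     # peak of the bright band
              round(edge_response(-x, L, D), 4))    # trough of the dark band
```

For a contrast of 0.5 the peak value is about 1.05 at every L, while the peak distance shrinks by a factor of 10 when L increases by a factor of 100, as the $1/L^{1/2}$ rule requires.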

Result 5: Responses to Sinusoids
Since CV operators are nonlinear, they must exhibit some harmonic distortion, and Fig. 3 shows the form that distortion takes in the Gaussian case when the input is a high-contrast grating.

[Fig. 3 panels: input contrast = 0.9 and input contrast = 0.1, each with frequency = 0.01 cycle/receptor and mean intensity = 100 quanta/receptor.]

For a low-contrast input grating $L(1 + m\cos 2\pi f u)$, the output is itself approximately sinusoidal [Eq. (4.4)], with an error on the order of $m^2$. Consequently, for low-contrast inputs it is sensible to speak of the MTF of the Gaussian CV operator: the ratio of output contrast to input contrast as a function of the input frequency f. Equation (4.4) shows that this MTF is

$$(2\pi^2\sigma^2 f^2/L)\exp(-2\pi^2\sigma^2 f^2/L). \qquad (4.5)$$

By a remarkable coincidence, at any fixed mean illuminance L this MTF is the same as that of Marr and Hildreth's¹³ $\nabla^2 G$ operator, i.e., the linear operator whose impulse response is the negative Laplacian of a Gaussian. However, in contrast to the MTF of a linear operator, expression (4.5) changes with the mean illuminance: plotted in log-log coordinates, as shown in Fig. 4, it shifts bodily along the frequency axis as L changes, so that the best frequency is $(1/\sigma\pi 2^{1/2})L^{1/2}$, and visual acuity (defined as the highest frequency at which the MTF exceeds some fixed threshold value) also grows as $L^{1/2}$.

Psychophysics shows that human visual acuity for gratings does grow roughly as the square root of retinal illuminance⁶ (although only up to about 10 Td). Thus CV operators give a fairly accurate account of that aspect of visual performance. However, the human spatial CSF does not maintain a fixed shape across all light levels and simply shift left or right along the (log) frequency axis as the mean illuminance changes. Instead, as the illuminance increases from zero, the CSF initially shifts but also rises (sensitivity increases at all frequencies) and then, at around 1 Td, changes shape from low pass to bandpass.⁴ Beyond 10 Td, the peak frequency of the CSF remains quite constant at roughly 5 cycles/deg.
Applied to deterministic images, no CV operator can create an MTF that rises or changes shape with the mean illuminance, and so when only such images were considered it seemed impossible for CV operators to model those two properties of human spatial vision. As we shall see, both prove to be natural consequences of CV operators applied to photon-noisy images.
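The shift property of the low-contrast MTF described in result 5 can be made concrete with a few lines of code. This is a sketch that assumes the MTF has the form $(2\pi^2\sigma^2 f^2/L)\exp(-2\pi^2\sigma^2 f^2/L)$, which is what linearizing the Gaussian point spread $(I/2\pi\sigma^2)\exp(-Ir^2/2\sigma^2)$ about the mean level L gives; the function names are illustrative.

```python
import math


def mtf(f, L, sigma=100.0):
    """Low-contrast MTF of the Gaussian CV operator for deterministic inputs."""
    x = 2.0 * math.pi ** 2 * sigma ** 2 * f ** 2 / L
    return x * math.exp(-x)


def best_frequency(L, sigma=100.0):
    """Frequency (cycles/receptor) at which the MTF peaks."""
    return math.sqrt(L) / (sigma * math.pi * math.sqrt(2.0))


if __name__ == "__main__":
    for L in (1.0, 10.0, 100.0, 1000.0):
        f_best = best_frequency(L)
        # The peak gain is 1/e at every L; only its position moves.
        print(L, round(f_best, 4), round(mtf(f_best, L), 4))
```

Because the MTF depends on f and L only through $f^2/L$, multiplying L by k moves every point of the curve to a frequency $k^{1/2}$ times higher without changing its height, which is the bodily shift along the log frequency axis described in the text.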

Result 6: Spot Responses and Increment Thresholds
When the input image I is a square spot of width W and intensity L + D, centered at the origin, and surrounded by a uniform background field of intensity L, the output image of the Gaussian CV operator is given by Eq. (4.6). Profiles (across the x axis) of the response to spots of a fixed size were computed for three different background intensities. The increment D was adjusted to keep the peak response value constant, simulating an increment threshold measurement. It can be seen that as the background intensity rises, the Mach bands at the edges of the spot become narrower and eventually cease to overlap. Once that level is reached, the peak response value always occurs at the peaks of the Mach bands, and the size of that peak obeys Weber's law. Consequently, a plot of the threshold value of D against the background intensity L (a TVI curve) will indicate that increment thresholds obey Weber's law above some critical background intensity level, a level that is higher for smaller spot sizes, as illustrated in Fig. 6. Human TVI curves show a qualitatively similar behavior.² Equation (4.6) can also be used to show that the increment threshold obeys Ricco's law for spot sizes smaller than a critical area, an area that shrinks as the background intensity increases. (Figure 10 ...)

Altogether, then, when CV operators are applied to deterministic images, they create a wide range of effects resembling well-known properties of human spatial vision: Mach bands, Weber's-law behavior, Ricco's-law behavior, and J-shaped TVI curves. However, because these images are noise free, there is nothing in the model at this stage that creates a natural threshold; i.e., there are no intrinsic limits to contrast detection. Consequently the choice of a threshold response value is arbitrary, and meaningful comparisons with psychophysical data are impossible. To make such comparisons, we need to introduce photon noise and then determine the input parameters required to produce threshold-level signal-to-noise ratios.
In the next section we begin that process.

NOISY IMAGES: NOTATION AND PRELIMINARY RESULTS
We now apply CV operators to photon-noisy input images, i.e., images in which the input values I(u, v) are Poisson random variables. To distinguish between these random inputs and deterministic ones, Q(u, v) is used to denote the random variable corresponding to the quantum catch at the receptor (u, v), and q(u, v) denotes its expected value. The random variables Q(u, v) are assumed to be mutually independent. The output image corresponding to input Q is the stochastic process given by Eq. (5.2), of which the output of the Gaussian CV operator G is a special case.

The first result below (theorem 1) shows that when q(u, v) is sufficiently large, the expected output image E{G[Q]} is closely approximated by G[q], the response of G to the deterministic image q(u, v). This permits us to apply results already obtained for deterministic input images to a significant number of noisy input images. That is a great convenience, because the former are typically simple closed-form expressions whose implications are relatively easy to grasp, whereas the exact expressions for expected output images usually involve infinite series whose implications are not apparent from inspection.

Theorem 1
If q(u, v) > 10 for all (u, v), then

$$G[q](x, y) \approx E\{G[Q](x, y)\}, \qquad (5.4)$$

with an error of at most 0.045. To put this value in context, recall that the uniform-field response of G is 1.0, and for $\sigma$ = 100, all response values fall between 0 and 5.5, with most between 0 and 2. Computational experience shows that the error created by approximation (5.4) is generally much less than 0.045: that maximum value corresponds to the worst possible case, as will appear in the proof. As q increases above 10, the maximum error decreases quickly.

Proof
We proceed by constructing the input image that maximizes the difference between the two sides of relation (5.4), and we show that the difference is at most 0.045. For this purpose it is sufficient to consider an arbitrary output point (x, y), and it is convenient to choose (x, y) = (0, 0). If we write q for q(u, v), Q for Q(u, v), and $r^2$ for $u^2 + v^2$ to reduce the notational burden, the right-hand side of relation (5.4) becomes

$$\iint \Big\{\sum_{n=0}^{\infty} (n/2\pi\sigma^2)\exp(-nr^2/2\sigma^2)\exp(-q)(q^n/n!)\Big\}\,du\,dv = \iint (q/2\pi\sigma^2)\exp(-r^2/2\sigma^2)\exp\{-q[1 - \exp(-r^2/2\sigma^2)]\}\,du\,dv.$$

The error is the sum of two integrals, A and B [expression (5.7)], whose maxima are found separately. Finally, we need to maximize integral B. A computer search shows that for $r^2/2\sigma^2 > 0.1$ and q > 10, the integrand in expression (5.7) is nonnegative for $r^2/2\sigma^2 < 0.16$; in that range its maximum value is produced by q = 10, and the minimum value (0) is produced when q is any constant of the order of 30 or more. For $r^2/2\sigma^2 > 0.16$ the integrand is never positive, and its maximum value is 0, obtained when q is any constant on the order of 30 or more. Numerical integration shows that when the minimum possible integrand is used for all r in this range, the total value of the integral is not less than -0.034. Consequently, this negative contribution cannot cancel the positive contribution from A. It follows that the total error |A + B| is maximized by setting q = 10 for $0.1 < r^2/2\sigma^2 < 0.16$ and q = 30 for $0.16 < r^2/2\sigma^2$. For that image, $B_{max}$ = 0.008, and when this is added to $A_{max}$, the maximum total error is 0.045. ∎

When q(u, v) falls below 10, approximation (5.4) rapidly becomes a poor one: the expected response of G can no longer be predicted from G[q].
From a practical standpoint, the expected output image E{G[Q]} is interesting only to the extent that it is a good predictor of the actual output images G[Q] that will be generated by any illuminance function q. Theorem 2 shows that the scale parameter $\sigma$ can be adjusted to make the variance of G[Q] arbitrarily small across all input images, and so the accuracy of that prediction can be made arbitrarily good. In other words, the output image noise level can be preset below any required bound by a one-time adjustment of the parameter $\sigma$. It will be shown below that to match certain human psychophysical data, $\sigma$ must be around 100. For that $\sigma$, the upper bound on the right-hand side of relation (5.8) implies a maximum standard deviation of 0.02, and so most output values will be close to their expectations: the output noise will be on the order of 2% of the mean. [In fact, both analysis and computer simulation suggest that the upper bound given by relation (5.8) is too large by a factor of 10: for most input images the output standard deviations are of the order of 0.002. A sharper theorem seems in order here, although it is not easy to see how to prove one.] It is interesting to compare this 2% value with the noise of the input image. The operator $G_{100}$ has a working range (i.e., is below saturation) from 0 to about $10^4$ mean quanta/receptor, and throughout this range its output values have a standard deviation less than 0.02. The standard deviation of the receptor quantum catch is $q^{1/2}$ when the mean is q, so its noise-to-signal ratio (i.e., $1/q^{1/2}$) reaches 0.02 only when q = 2500 quanta/receptor, that is, when the retinal illuminance is of the order of 1000 Td. To reach the noise-to-signal ratio of 0.002, which characterizes $G_{100}$ for most input images, requires $2.5 \times 10^5$ quanta/receptor, a retinal illuminance of more than $10^5$ Td.
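The comparison between input and output noise rests on simple Poisson arithmetic, sketched below. The only fact used is that a Poisson variable with mean q has standard deviation $q^{1/2}$; the function names are illustrative.

```python
import math


def input_noise_to_signal(q):
    """Noise-to-signal ratio of a Poisson quantum catch with mean q."""
    return 1.0 / math.sqrt(q)


def catch_needed(ratio):
    """Mean catch at which the input noise-to-signal ratio falls to `ratio`."""
    return 1.0 / ratio ** 2


if __name__ == "__main__":
    for ratio in (0.02, 0.002):
        print(ratio, round(catch_needed(ratio)))
```

The two ratios discussed in the text, 0.02 and 0.002, correspond to mean catches of 2500 and 250 000 quanta/receptor, respectively, whereas the operator output is stated to reach those noise levels throughout its working range.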
In comparison with its input images, it seems fair to describe the output images created by $G_{100}$ as virtually noise free. (See Figs. 9 and 10.)

To show this, we begin by evaluating the left-hand side, which reduces to expression (5.10) since the random variables $Q(u, v)\exp[-Q(u, v)(u^2 + v^2)/2\sigma^2]$ are mutually independent.
To save notation, we now write Q for Q(u, v), q for q(u, v), and $r^2$ for $u^2 + v^2$ and omit the limits of integration, which are always the entire (u, v) plane. Using the fact that Var{X} = E{X²} - E²{X}, we can rewrite Eq. (5.10) as Eq. (5.11). This quantity will depend on the expected input image q(u, v) and the scale parameter $\sigma$. Let $A(\sigma, q)$ denote the first integral in Eq. (5.11), and let $B(\sigma, q)$ denote the second [so that the variance is $A(\sigma, q) - B(\sigma, q)$]. Then

$$A(\sigma, q) = \iint (1/2\pi\sigma^2)^2 \Big[\sum_{n} n^2 \exp(-nr^2/\sigma^2)\,q^n \exp(-q)/n!\Big]\,du\,dv = \iint (1/2\pi\sigma^2)^2 (p + p^2)\exp(p - q)\,du\,dv, \qquad (5.12)$$

where $p = q\exp(-r^2/\sigma^2)$. After the change of variables $s = u/\sigma$ and $t = v/\sigma$, Eq. (5.12) becomes

$$A(\sigma, q) = (1/4\pi^2\sigma^2)\iint (p' + p'^2)\exp(p' - q')\,ds\,dt,$$

where $q'(s, t) = q(\sigma s, \sigma t)$, $p' = q'\exp(-\nu^2)$, and $\nu^2 = s^2 + t^2$.
Next we apply the same process to $B(\sigma, q)$:

UNIFORM-FIELD RESPONSE
Recall from Section 4 that for any CV operator, all uniform-field input images produce the same output image: a uniform field whose intensity is the volume constant $V_S$, given by Eq. (3.4). That property was built into CV operators during their construction in Section 3. Consequently, it is a bit surprising to find that, for noisy uniform fields, the expected output image is not constant but increases (to an asymptote of $V_S$) as retinal illuminance rises.

Theorem 3
If the expected input image q(u, v) equals a constant L for all (u, v), the expected output image of any CV operator is

$$E\{O[Q](x, y)\} = V_S[1 - \exp(-L)]. \qquad (6.1)$$

Proof
Since the expected input image is uniform, the expected output image values will be the same for all (x, y), and so for convenience we pick (x, y) = (0, 0). The left-hand side of Eq. (6.1) is then

$$\iint \sum_{n=1}^{\infty} P(Q = n)\,n\,S(nr^2)\,du\,dv = \sum_{n=1}^{\infty} P(Q = n)\,V_S = V_S[1 - P(Q = 0)],$$

which is the right-hand side of Eq. (6.1). ∎

Since 1 - exp(-L) = 1 - P(Q = 0), an alternative statement of theorem 3 is that the expected response value for a uniform field with q(u, v) ≡ L is the volume constant $V_S$ times 1 - P(Q = 0), i.e., times the probability that the quantum catch is not zero. It is interesting to note (from the form of the proof) that when theorem 3 is expressed that way, its validity does not depend on the assumption that Q is a Poisson random variable.

Why is the expected response to noisy uniform fields not constant, like the response to deterministic uniform fields? An intuitive answer runs as follows. If the input function I(u, v) ≡ L is permitted to assume values between 0 and 1, the area of the point-spread function $L\,S(Lr^2)$ can become infinitely large as L becomes small. As L decreases, and with it the size of the point-spread contribution from any given receptor to the output value at a given point, that decrease is exactly compensated for by the increased number of receptors whose point spreads can reach that point. However, for photon-noisy images the quantum catch can only be 0, 1, 2, ..., and so the maximum area point spread is produced by Q = 1. Thus, as E{Q} decreases to low levels, the number of receptors whose point spreads can reach any given output point does not increase indefinitely but instead reaches an upper limit determined by the area of $S(r^2)$. At the same time, the expected number of receptors that actually contribute any point spread at all decreases [since P(Q = 0) is rising], and so the expected size of the total point spread arriving at any point decreases.
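Theorem 3 can be checked numerically for the Gaussian case. The sketch below assumes the Gaussian point spread $(n/2\pi\sigma^2)\exp(-nr^2/2\sigma^2)$ for a catch of n quanta (so that the volume constant is 1), replaces the integral over the receptor plane by a sum over integer receptor offsets, and uses a small illustrative value $\sigma$ = 20 to keep the grid short. The names `expected_uniform_response`, `n_max`, and the 6$\sigma$ grid half-width are choices made for this example.

```python
import math


def poisson_pmf(n, mean):
    """P(Q = n) for a Poisson quantum catch with the given mean."""
    return math.exp(-mean) * mean ** n / math.factorial(n)


def spread_volume_on_grid(n, sigma, half_width):
    """Sum, over integer receptor offsets, of the Gaussian point spread
    produced by a catch of n quanta (0 when n = 0)."""
    if n == 0:
        return 0.0
    row = sum(math.exp(-n * u * u / (2.0 * sigma ** 2))
              for u in range(-half_width, half_width + 1))
    # The 2-D sum factors into the square of the 1-D sum.
    return n / (2.0 * math.pi * sigma ** 2) * row * row


def expected_uniform_response(mean, sigma=20.0, n_max=40):
    """Expected output at one point for a noisy uniform field."""
    half_width = int(6 * sigma)
    return sum(poisson_pmf(n, mean) * spread_volume_on_grid(n, sigma, half_width)
               for n in range(n_max + 1))


if __name__ == "__main__":
    for mean in (0.1, 0.5, 1.0, 3.0):
        print(mean,
              round(expected_uniform_response(mean), 6),
              round(1.0 - math.exp(-mean), 6))
```

Each nonzero catch contributes a full unit volume regardless of n, so the expectation collapses to the probability of a nonzero catch, which is the content of the theorem.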
Theorem 3 holds for any spread function S. I have not been able to obtain a comparably general result for the variance of the uniform-field response. However, for Gaussian CV operators, the variance turns out to have a simple expression.

Theorem 4
For the Gaussian CV operator $G_\sigma$, the expected output value for a noisy, uniform-field input image Q with q(u, v) ≡ L is 1 - exp(-L), and its variance is given by Eq. (6.3).

Since the standard deviation of the output varies inversely with $\sigma$, this means that, for any given contrast D/L, the signal-to-noise ratio of the peak edge response can be made arbitrarily large by increasing $\sigma$. In other words, any edge (and consequently any target bounded by sharp edges) can be made as detectable as we like. However, there is a cost for this increase in edge detectability, because, although the size of the peak response is independent of $\sigma$, its location is not: the peak occurs at $x = x_{max} = \sigma[(1/D)\ln(1 + D/L)]^{1/2}$, and the width of the entire Mach band grows as $\sigma$. This is indicative of the fact (discussed in the next section) that the spatial resolution of $G_\sigma$ varies inversely with $\sigma$, with the result that, in the range L ≥ 10, the product of the visual acuity times the edge-response signal-to-noise ratio is a constant.

Figure 8 shows that when L falls below 10 quanta/receptor, the expected edge response departs from the deterministic response: first it becomes less than the deterministic response, and its Mach bands become attenuated; then at L values below 1 the Mach bands entirely disappear. This behavior can be understood analytically: for small L the expected edge response reduces to Eq. (7.2), which increases monotonically with x. If G were a linear operator, the Mach bands that it creates above 10 quanta/receptor would imply that it is a bandpass (or high-pass) filter, and the loss of those Mach bands below 10 quanta/receptor would imply that it is a low-pass filter.

Theorem 2 shows that for any fixed L and D, the variance of the edge response varies inversely with $\sigma^2$, but the upper bound given by relation (5.8) in that theorem is too large to be practically useful. Consequently, to gain an idea of the actual variance, we resort to simulation. Figures 9 and 10 show the results of applying the discrete version of G, with $\sigma$ = 100, to simulated photon-noisy edge images. The output variance is close to that of the response to a uniform field of illuminance L, i.e., the variance given by Eq. (6.3). For L ≥ 10 and $\sigma$ = 100, Eq. (6.3) implies a standard deviation of 0.002, and the peak height of the Mach band in the expected edge response for 50% contrast is 1.048, and so in that range the peak value of the response signal-to-noise ratio (i.e., the peak expected edge response minus the expected uniform-field response, all divided by the uniform-field standard deviation) is 24. When L is small (<0.1), so that Eq. (7.2) describes the expected edge response and there are no Mach bands, the response simply increases monotonically from a low-side asymptote of L to a high-side asymptote of L + D. In the latter region the signal-to-noise ratio at each output point is $2\sigma D(\pi/L)^{1/2}$ or, if the contrast D/L is denoted by C, $2\sigma C(\pi L)^{1/2}$. Thus, in the low-light range, the operator's output signal-to-noise ratio is not independent of the illuminance level, as it is for L ≥ 10, but instead varies as $L^{1/2}$, as does its input. However, the output signal-to-noise ratio is much larger than the input ratio: at L = 0.1, where the input signal-to-noise ratio for a 50% contrast edge is 0.16, the signal-to-noise ratio for $G_{100}$ is 56. The effect of this improvement can be appreciated by comparing the input and output image profiles for the case L = 0.1 in Fig. 9.
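The signal-to-noise figures quoted above follow from short calculations, sketched here. The high-light values (peak response 1.048, uniform-field response 1.0, standard deviation 0.002) are taken directly from the text; the low-light output ratio uses the expression $2\sigma C(\pi L)^{1/2}$; and the input ratio is taken, as an assumption of this sketch, to be the increment D = CL divided by the Poisson standard deviation $L^{1/2}$ of a single receptor.

```python
import math


def high_light_snr(peak=1.048, uniform=1.0, sd=0.002):
    """Peak edge response minus uniform-field response, over the output sd."""
    return (peak - uniform) / sd


def low_light_output_snr(contrast, L, sigma=100.0):
    """Output signal-to-noise ratio in the linear range (L below about 0.1)."""
    return 2.0 * sigma * contrast * math.sqrt(math.pi * L)


def low_light_input_snr(contrast, L):
    """Per-receptor input ratio: increment C*L over Poisson sd sqrt(L) (assumed form)."""
    return contrast * L / math.sqrt(L)


if __name__ == "__main__":
    print(round(high_light_snr(), 1))
    print(round(low_light_output_snr(0.5, 0.1), 1))
    print(round(low_light_input_snr(0.5, 0.1), 2))
```

These reproduce the values 24, about 56, and about 0.16 given in the text, so that at L = 0.1 the operator improves the signal-to-noise ratio of a 50% contrast edge by a factor of roughly 350.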
The bottom right-hand panel in Fig. 10 shows how saturation begins to occur when the illuminance level reaches $10^4$ quanta/receptor. For L = 10 through L = 1000 the peak value of the edge response (i.e., the peak of the Mach band) remains essentially constant at 1.05, which is the value implied by the expected edge response.

If spatial resolution is defined operationally by the highest spatial frequency f at which the MTF exceeds some fixed threshold value, then Eq. (8.4) shows that, in the range L ≥ 10, resolution will be directly proportional to $L^{1/2}$ and inversely proportional to $\sigma$. Combined with the fact that the edge-response signal-to-noise ratio varies directly with $\sigma$, this last relationship justifies the remark (in Section 7 above) that G creates an exact trade-off between spatial resolution and contrast detectability. Consequently, the MTF can be computed for all values of L. Figure 11 shows the MTF's of $G_\sigma$, with $\sigma$ = 100, for L values ranging from 0.01 to 1000 mean quanta/receptor.
The frequency scale here is in cycles per receptor, and for L = 100 the peak frequency of the MTF is f = 0.023 cycle/receptor. The diameter of human photoreceptors is about 0.5 min of visual angle, and so this peak frequency would be 2.8 cycles/deg. Human spatial CSF's for mean illuminances of the order of 100 Td (i.e., a mean quantum catch around 100 per receptor per 0.25 sec) peak at about 3 cycles/deg,⁴ and so the round-number value $\sigma$ = 100 provides a reasonable fit to that aspect of visual performance, which is the reason that this $\sigma$ was selected for special attention. The use of the MTF peak frequency to estimate $\sigma$ is motivated by the fact (shown below) that this is also the peak of the CSF at high light levels, and that value does not depend on the d' that is assumed to correspond to threshold. The choice $\sigma$ = 100 implies a peak frequency of 6 cycles/deg at L = 500, which provides a good match to the peak frequency measured by Campbell and Green¹⁴ at a mean illuminance of 500 Td.
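The conversion from cycles per receptor to cycles per degree used in this paragraph can be sketched as follows. It assumes the high-light peak frequency $L^{1/2}/(\sigma\pi 2^{1/2})$ and the receptor diameter of 0.5 min of arc stated in the text (120 receptors per degree); the constant and function names are illustrative.

```python
import math

RECEPTOR_ARCMIN = 0.5   # receptor diameter in minutes of visual angle (from the text)


def best_frequency(L, sigma=100.0):
    """Peak frequency of the high-light MTF, in cycles/receptor."""
    return math.sqrt(L) / (sigma * math.pi * math.sqrt(2.0))


def cycles_per_degree(f_cycles_per_receptor):
    """Convert cycles/receptor to cycles/deg."""
    return f_cycles_per_receptor * 60.0 / RECEPTOR_ARCMIN


if __name__ == "__main__":
    for L in (100.0, 500.0):
        f = best_frequency(L)
        print(L, round(f, 4), round(cycles_per_degree(f), 2))
```

Without intermediate rounding this gives about 2.7 cycles/deg at L = 100 (the text's 2.8 comes from first rounding the peak to 0.023 cycle/receptor) and about 6.0 cycles/deg at L = 500.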
Campbell and Green found that the peak frequency remains at around 6 cycles/deg when the optics of the eye are bypassed by interferometry, and so that value presumably reflects the purely neural mechanisms that we seek to model. As Fig. 11 shows, the MTF $g(\sigma, f, L)$ undergoes a metamorphosis as L falls below 10 quanta/receptor: its shape changes from the bandpass form given by Eq.
For L ≤ 0.1, $Le^{-L} \approx L - L^2$. All terms in the above equation that contain $L^2$ are ≤ $L^2$ and therefore ≤ 0.01, and when they are all set equal to zero, the equation becomes simply Eq. (8.5), which holds for all m. This shows that for L ≤ 0.1, the CV operator $G_\sigma$ becomes equivalent to the low-pass linear filter whose impulse response is the Gaussian function $(1/2\pi\sigma^2)\exp(-r^2/2\sigma^2)$. Intuitively, this change occurs because an expected quantum catch of 0.1 or less means that almost all receptors catch at most one quantum. Consequently the only point-spread function that has a chance to express itself is the one corresponding to Q = 1; all the rest occur with negligible probability. The effect is thus equivalent to applying the linear operator whose impulse response is the CV point-spread function for Q = 1. This is obviously a general property of CV operators: at light levels for which the mean quantum catch per receptor is uniformly ≤ 0.1, the CV operator with the spread function S becomes equivalent to the linear operator whose impulse response is $S(x^2 + y^2)$.

The MTF's in Fig. 11 indicate how much the expected contrast of any input grating $L(1 + m\cos 2\pi f u)$ will be attenuated by $G_\sigma$ (for m < 0.1), and theorem 4 gives the mean and the standard deviation of the uniform-field response at any point. Consequently, for any point in the output image, we can calculate d' as the difference between the expected value of the output at that point for the grating versus a uniform field of the same illuminance, divided by the standard deviation of the uniform-field response. In particular, d' can be evaluated at any peak (say, at x = 0). If it is assumed that the grating is discriminable from a uniform field when d' reaches some threshold value, Eq. (8.4) can be used to determine the input contrast m needed to achieve that value for any $\sigma$, f, and L. A curve plotting the inverse of the contrast threshold against f, for fixed $\sigma$ and L, is a CSF.
Figure 12 shows the CSF's of $G_\sigma$, for $\sigma$ = 100, based on the assumption that the threshold is achieved when $d'_{max} = 2^{1/2}$. (Ignoring probability summation, i.e., assuming that the observer uses only the response at a single output point, that d' corresponds to a hit probability of 0.75 for a false-alarm probability of 0.25 in a yes-no signal-detection experiment or to a correct-response probability of 0.85 in an unbiased two-alternative forced-choice experiment.) Figure 12 shows that for mean illuminance L ≥ 10 quanta/receptor, the contrast sensitivity at the best frequency is constant, but the best frequency itself (and the location of the entire CSF) varies proportionally as $L^{1/2}$. As L falls below 10, the shape of the CSF changes from bandpass to low pass, and its level decreases; the overall contrast sensitivity decreases. Below L = 0.1, that decrease is proportional to $L^{1/2}$, as would be expected, since in that range G acts as a linear filter with a fixed impulse response, and consequently its contrast sensitivity is directly governed by Poisson statistics.
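The correspondence between $d' = 2^{1/2}$ and the response probabilities mentioned in the parenthesis can be illustrated with the standard equal-variance Gaussian signal-detection model. That model, and the placement of the yes-no criterion midway between the two distributions, are assumptions of this sketch rather than details given in the text.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf   # standard normal integral


def yes_no_rates(d_prime):
    """Hit and false-alarm probabilities for a criterion placed midway
    between the noise and signal distributions (assumed placement)."""
    hit = PHI(d_prime / 2.0)
    false_alarm = PHI(-d_prime / 2.0)
    return hit, false_alarm


def two_afc_correct(d_prime):
    """Probability correct in an unbiased two-alternative forced choice."""
    return PHI(d_prime / math.sqrt(2.0))


if __name__ == "__main__":
    d = math.sqrt(2.0)
    print([round(p, 3) for p in yes_no_rates(d)], round(two_afc_correct(d), 3))
```

Under these assumptions, $d' = 2^{1/2}$ gives a hit rate near 0.76 with a false-alarm rate near 0.24, and a forced-choice score near 0.84, close to the rounded values quoted in the text.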
The CSF's in Fig. 12 do not extend down to a sensitivity of 1 (threshold contrast 100%) because the nonlinearity of G can become appreciable for inputs whose contrasts are more than about 30%. Consequently the acuity of $G_{100}$ cannot be read directly from its CSF's (except for illuminances below 0.1 mean quantum/receptor, where G becomes effectively linear). To find the highest detectable frequency, it is necessary to evaluate Eq. (8.1) to find the peak response value and then determine the highest frequency f that yields a threshold value of d' at m = 1. However, in practice it is much more convenient to base the acuity calculation on an input contrast of m = 0.9, because in this case the expected input image $q(u) = L(1 + m\cos 2\pi f u)$ will be uniformly ≥ 10 for all L ≥ 100, and so in that range the deterministic approximation of theorem 1 applies.

Figure 13 shows visual acuity as a function of L for $G_{100}$, based on the assumption that the threshold corresponds to a d' of $2^{1/2}$ at the peak of the response to a 90% contrast sinusoidal grating. Acuity rises slowly at L values below 0.1 quantum/receptor. In that range, where G is linear, the highest frequency f at which a grating with a contrast m produces a given d' is determined by the relationship

$$\exp[2(\pi\sigma f)^2] = (m/d')(2\sigma\pi^{1/2}L^{1/2}). \qquad (8.7)$$

For m = 0.9, d' = $2^{1/2}$, and $\sigma$ = 100, Eq. (8.7) implies an acuity of 0.002 cycle/receptor, or 0.24 cycle/deg, at L = $10^{-4}$ quantum/receptor. If it is assumed that 1 Td corresponds to a quantum catch of 4 quanta/sec per receptor and that the integration time of the visual system is 0.25 sec, then L can be identified with $10^{-4}$ Td. Human acuity at that retinal illuminance appears to be about 0.6 cycle/deg.⁶ At L = 1000, the acuity of $G_{100}$ has risen to 26 cycles/deg (0.5 cycle/receptor), so the overall increase from -4 to +3 log Td is 2 log units. Human acuity at 1000 Td is 60 cycles/deg, and so it too increases by 2 log units from -4 to +3 log Td. (To reach 60 cycles/deg, $G_{100}$ needs a mean illuminance of 5000 quanta/receptor.)
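The low-light acuity figure quoted above can be reproduced by solving the relationship $\exp[2(\pi\sigma f)^2] = (m/d')(2\sigma\pi^{1/2}L^{1/2})$ for f. The sketch below assumes that form, which follows from treating G as a Gaussian linear filter with output standard deviation $L^{1/2}/(2\sigma\pi^{1/2})$ in the range L ≤ 0.1, and uses the text's receptor diameter of 0.5 min of arc; the function names are illustrative.

```python
import math


def low_light_acuity(L, m=0.9, d_prime=math.sqrt(2.0), sigma=100.0):
    """Highest detectable frequency (cycles/receptor) in the linear range.

    Solves exp(2 * (pi * sigma * f)**2) = (m / d') * 2 * sigma * sqrt(pi * L).
    Returns 0.0 when even a uniform increment of contrast m is below threshold.
    """
    rhs = (m / d_prime) * 2.0 * sigma * math.sqrt(math.pi * L)
    if rhs <= 1.0:
        return 0.0
    return math.sqrt(math.log(rhs) / 2.0) / (math.pi * sigma)


def cycles_per_degree(f_cycles_per_receptor, receptor_arcmin=0.5):
    """Convert cycles/receptor to cycles/deg."""
    return f_cycles_per_receptor * 60.0 / receptor_arcmin


if __name__ == "__main__":
    for L in (1e-4, 1e-3, 1e-2, 1e-1):
        f = low_light_acuity(L)
        print(L, round(f, 5), round(cycles_per_degree(f), 2))
```

At L = $10^{-4}$ this gives about 0.0020 cycle/receptor, or 0.24 cycle/deg. Because f enters through a logarithm of $L^{1/2}$, acuity in this range grows only slowly with illuminance, as stated in the text.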
Thus the overall growth of acuity with illuminance is predicted fairly well by a CV operator. However, the form of the curve for log(acuity) versus log(L) in Fig. 13 does not provide a good match to psychophysical results: it is concave upward (acuity grows faster at higher illuminances), whereas plots of comparable human data are always concave downward. In other words, human acuities are always significantly higher than those predicted by $G_{100}$ throughout the range -4 to +3.7 log Td. This discrepancy can be eliminated for any given value of L by adjusting the scale parameter $\sigma$, but the overall shape of the curve will remain concave upward. Evidently, no Gaussian CV operator will provide a good fit to psychophysical plots of acuity versus mean retinal illuminance. This defect has two causes. One is the fact that at low light levels, CV operators become equivalent to low-pass linear filters, and a linear filter allows grating acuity to grow proportionally with the square root of mean retinal illuminance only if its MTF is proportional to 1/f. [More precisely, if acuity grows as $L^{1/2}$ from $f_1$ to $f_2$ as L increases from $L_1$ to $L_2$, then the MTF of the filter must be proportional to 1/f from $f_1$ to $f_2$. The MTF of a Gaussian filter is $\exp(-2\pi^2\sigma^2 f^2)$, which decreases much faster than 1/f.] At high light levels (about 10 quanta/receptor, or 10 Td, assuming a 250-msec time frame) the problem is the opposite; here, the Gaussian CV operator does cause acuity to grow proportionally with the square root of mean retinal illuminance, but this rapid growth occurs too late to match human performance; above 10 Td, human acuity grows slowly, if at all.
The failures of $G_\sigma$ at high light levels are not confined to its acuity predictions; the latter are only symptomatic of a more general problem. This is the fact that, for illuminances above 10 quanta/receptor, the CSF's in Fig. 12 shift bodily along the log frequency axis as L increases, so that the best frequency varies as $L^{1/2}$, while the sensitivity at that frequency remains constant. Human CSF's do not shift bodily as retinal illuminance rises from 10 to 1000 Td; instead, they maintain a roughly constant best frequency over that range, and as illuminance rises, the sensitivity increases at that frequency and all higher ones.⁴ In addition, human CSF's show no absolute loss of low-frequency sensitivity as retinal illuminance rises, but because of the shift property, all Gaussian CV operators create such a loss above L = 10 quanta/receptor.
Broadly speaking, one can say that in comparison with the visual system, Gaussian CV operators respond too drastically to changes in the mean light level above 10 quanta/receptor. The visual system evidently prefers to use extra quanta in that range to improve its high-frequency performance while maintaining a constant best frequency and a fixed passband, whereas G uses them to shift the entire passband toward higher frequencies, as though obsessed with improving visual acuity regardless of the cost to low-frequency performance.

Neither the MTF's for $G_\sigma$ shown in Fig. 11 nor the CSF's shown in Fig. 12 take into account contrast reductions imposed on the retinal image by the optics of the eye. That factor can be incorporated into the model by multiplying the MTF of $G_\sigma$ by that of the eye. For illuminances in the range $10^{-4}$ to $10^3$ quanta/receptor, this makes essentially no difference. Figure 14 shows the MTF's of $G_\sigma$ multiplied by an optical contrast reduction factor of the form $\exp(-2\pi^2\rho^2 f^2)$, as though the eye has a Gaussian point-spread function of the form $(1/2\pi\rho^2)\exp(-0.5r^2/\rho^2)$. A $\rho$ value of 1. ... provides a reasonably good fit to Gubisch's⁵ optical MTF for a 2.4-mm pupil; that value has been used to create the curves shown in Fig. 14. It can be seen that for mean illuminances below 1000, incorporation of the optical MTF has a negligible effect on the overall MTF of $G_\sigma$ and consequently cannot significantly affect its CSF predictions. In other words, those predictions still remain well off the mark after optical factors are taken into account. It seems almost certain that these high-light defects are ...

... and whose apparent contrast is the reverse of that of the input gratings (i.e., it has a trough instead of a peak at x = 0).⁷ Figure 15 shows that the operator $G_\sigma$ can produce this effect. For $\sigma$ = 100, the highest spatial frequencies passed by this operator are 0.07 cycle/receptor at L = 100 and 0.10 ...
The response is essentially a reverse-contrast cosine whose frequency is the difference, 0.01, and whose contrast is 7%. Since the output standard deviation here is 0.002, this contrast represents a signal-to-noise ratio of 35, and so the beat frequency should be quite visible.
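The Gaussian optical factor used for Fig. 14 can be checked numerically. The following is a minimal sketch, assuming frequencies in cycles/receptor and a hypothetical value of rho (the function names and the value rho = 1.5 are illustrative and are not taken from the paper). It compares the closed form exp(-2 pi^2 rho^2 f^2) with a direct cosine transform of the line-spread function implied by a Gaussian point spread of standard deviation rho.

```python
import numpy as np

def gaussian_mtf(f, rho):
    """Closed-form contrast-reduction factor exp(-2 pi^2 rho^2 f^2)."""
    return np.exp(-2.0 * np.pi**2 * rho**2 * f**2)

def numeric_mtf(f, rho, half_width=12.0, n=20001):
    """Cosine transform of the 1D marginal (line spread) of the 2D Gaussian point spread."""
    x = np.linspace(-half_width * rho, half_width * rho, n)
    dx = x[1] - x[0]
    line_spread = np.exp(-0.5 * x**2 / rho**2) / (np.sqrt(2.0 * np.pi) * rho)
    # Simple Riemann sum; adequate for a smooth, rapidly decaying integrand.
    return float(np.sum(line_spread * np.cos(2.0 * np.pi * f * x)) * dx)

rho = 1.5  # hypothetical value in receptor diameters, for illustration only
for f in (0.0, 0.05, 0.1, 0.2):
    print(f, gaussian_mtf(f, rho), numeric_mtf(f, rho))
```

The two columns should agree closely, which is the sense in which multiplying the operator's MTF by this factor is equivalent to assuming a Gaussian optical point spread.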

RESPONSE TO SPOTS
A spot here means a region of some constant retinal illuminance L + D surrounded by a uniform field of illuminance L, the sort of target used in increment threshold measurements. The simplest case analytically is a square spot, dealt with below in theorem 8. That shape is sufficient to illustrate the essential properties of the response of $G_\sigma$ to all spots, because these response properties depend not on the shape of the test spot but on its size. For any spot there is some illuminance level beyond which the maximum response value occurs at the peak of the Mach bands generated by its edges. Since the amplitude of that peak depends only on the Weber fraction D/L, the detectability of any spot will obey Weber's law once L reaches the critical level. That level depends not on the exact shape of the spot but on the distance between its edges: the smaller that distance is, the higher L must be before the Mach bands from opposite sides shrink enough so that they no longer overlap, and Weber's law can begin to hold.

Theorem 8
Suppose that the expected input image q(u, v) is a square spot described by q(u, v) = L + D for $|u| \le W/2$ and $|v| \le W/2$, and q(u, v) = L elsewhere (so that the spot width is W). The expected output image for the Gaussian operator $G_\sigma$ is then [...]
Figure 17 shows two TVI curves for $G_{100}$, one for a square spot whose width is 1000 receptors [...] In addition to increment thresholds, Fig. 17 also shows predicted absolute thresholds for $G_{100}$, i.e., the spot intensity D that is needed to produce a $d'_{\max}$ of 6 when L = 0. Nothing in the model itself limits detection when the background quantum catch is zero, but we can calculate how much dark light would be required to produce the absolute threshold that Aguilar and Stiles measured for their test spot. That amount proves to be $10^{-4.4}$ quantum/receptor, and the predicted absolute threshold points in Fig. 17 are based on that value. Of course, the prediction for W = 1000 is constrained to match Aguilar and Stiles's absolute threshold, and so there is no significance to the goodness of fit of this point. However, it is significant that the amount of dark light needed to match the psychophysical absolute threshold is small enough to have no effect on increment thresholds for background intensities greater than L = $10^{-4}$. (At L = $10^{-4}$ the dark light raises the threshold by 0.06 log unit.) All the predicted points in Fig. 17 were obtained with the assumption that the nominal L value is augmented by a dark light equivalent to $10^{-4.4}$ mean quantum/receptor. Except at L = $10^{-4}$, these thresholds are graphically indistinguishable from those predicted without dark light.
If the absolute threshold points are ignored, Fig. 17 shows that the TVI curves for $G_{100}$ consist essentially of three branches: a low-light branch (L < 0.1) in which the slope is 0.5 (detection obeys the deVries-Rose law), a high-light branch in which the slope is 1.0 (detection obeys Weber's law), and a transitional branch connecting the two extremes. All three branches are to be expected from any Gaussian CV operator with any scale parameter. In the low-light region, when L is less than 0.1, all CV operators become equivalent to linear low-pass filters, and the deVries-Rose law follows directly from linearity combined with Poisson statistics. At high light levels, Weber's law follows from the fact that the peak amplitudes of the Mach bands depend only on the Weber fraction and that the mean and the variance of the uniform-field response are constant; thus $d'_{\max}$ depends only on the Weber fraction. Figure 17 also shows that the background illuminance at which Weber's law begins to hold depends on the size of the test spot: for W = 1000 the critical illuminance is 10 quanta/receptor; for W = 100 it is 100 quanta/receptor. However, once its critical background level is reached, the TVI curve for the smaller spot superimposes upon that of the larger one. This will be true for any size spot once the background illuminance reaches its Weber's-law range. (Of course, a small spot may never actually reach that range, because saturation can occur before it is achieved. For $G_{100}$ the TVI curve for W = 10 becomes asymptotic only at L = 10,000 quanta, which is the level at which saturation effects begin to become significant, and so smaller spots will never obey Weber's law.)
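The low-light branch can be illustrated with a short calculation. The sketch below assumes that in this range the operator acts as a linear Gaussian filter, that detection is based on the output at the spot center, and that the noise is set by the background alone (the contribution of D to the variance is ignored). The function name, the grid size, and the values W = 10 and sigma = 10 are hypothetical choices made to keep the example small; the criterion d' = 6 is the one quoted for Fig. 17.

```python
import numpy as np

def linear_threshold(L, W, sigma, d_target=6.0, half=60):
    """Increment D that yields d' = d_target at the spot center for a linear Gaussian filter."""
    coords = np.arange(-half, half + 1)
    x, y = np.meshgrid(coords, coords)
    # Unit-volume Gaussian impulse response sampled on the receptor lattice.
    h = np.exp(-(x**2 + y**2) / (2.0 * sigma**2)) / (2.0 * np.pi * sigma**2)
    in_spot = (np.abs(x) <= W / 2) & (np.abs(y) <= W / 2)
    signal_per_unit_D = h[in_spot].sum()   # shift of the mean output per unit D
    noise_sd = np.sqrt(L * np.sum(h**2))   # Poisson input: variance = L * sum(h^2)
    return d_target * noise_sd / signal_per_unit_D

Ls = np.array([1e-4, 1e-3, 1e-2, 1e-1])
thresholds = np.array([linear_threshold(L, W=10, sigma=10.0) for L in Ls])
slope = np.polyfit(np.log10(Ls), np.log10(thresholds), 1)[0]
print(slope)
```

Under these assumptions the fitted log-log slope is 0.5 for any choice of W and sigma, because the signal term does not depend on L while the noise standard deviation grows as the square root of L.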
In the intermediate range, between the deVries-Rose region and its Weberian asymptote, the behavior of the $G_\sigma$ TVI curve reflects a complicated interaction between two effects: the development of Mach bands (owing to the growing influence of the operator's intrinsic nonlinearity) and the narrowing of those Mach bands (owing to the shifting of the operator's illuminance-dependent MTF). One curious consequence of this interaction is that at L = 100, the smaller spot has a lower increment threshold than the larger one.
Comparing Aguilar and Stiles's TVI curve with the $G_{100}$ curve for W = 1000, we see that while the overall fit is fairly good, there are two serious discrepancies. The human curve begins to follow Weber's law at background illuminances of around L = $10^{-2}$, whereas $G_{100}$ requires L = 10 and obeys the deVries-Rose law quite closely up to L = 1. Also, at the high end of the background range, Aguilar and Stiles's curve begins to exhibit saturation at L = 100, and saturation is complete by L = 1000. $G_{100}$ also saturates (although the figure does not show it), but not until L exceeds 10,000.
Neither of these discrepancies can be remedied by altering the scale parameter $\sigma$: no $\sigma$ will cause Weber's law to hold below L = 0.1, because $G_\sigma$ will always be effectively a linear operator for illuminances in that range; and to produce saturation at L = 1000 requires a $\sigma$ value on the order of 7, which would cause $d'_{\max}$ in the Weber's-law range to be only 0.4 (i.e., for D/L = 0.1). In addition, it should be noted that the [...]
[...] (9.3) and this will be the largest mean response in the output image. Consequently, the maximum value of d' across the output image will be Eq. [...] and again, Ricco's law will hold to an accuracy of two decimal places when the arguments of the normal integral N fall in its linear range, i.e., when $W < \sigma/(L + D)^{1/2}$. In that range, Eq. (9.6) simplifies to $d' = 0.8\,DW^2/\sigma$. (9.7) If it is assumed that D/L = 0.1, then the width of Ricco's area at high background levels equals $0.95\,\sigma/L^{1/2}$ receptor diameters. At L = 1000, that width would be 3 receptors, or 0.025 deg, which is about 7 times smaller than the smallest diameter of Ricco's area obtained by Barlow at his highest background levels.² In other words, the CV operator overpredicts the extent to which Ricco's area shrinks as the background illuminance rises from 0 to 1000 Td: it implies that the diameter will decrease by a factor of about 30, whereas the observed decrease is around a factor of 6. On the other hand, from 0 to 100 Td the predicted linear shrinkage factor is only 10.5, which is not a bad prediction.
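The high-background width of Ricco's area quoted above follows from the linear-range condition W < sigma/(L + D)^(1/2) with D/L = 0.1. A minimal sketch of that arithmetic, with a hypothetical function name and sigma = 100 as used for Fig. 17:

```python
import math

def ricco_width(sigma, L, weber_fraction=0.1):
    """Upper limit on W from W < sigma / (L + D)^(1/2), with D = weber_fraction * L."""
    return sigma / math.sqrt(L * (1.0 + weber_fraction))

for L in (100, 1000):
    print(L, round(ricco_width(100, L), 2))
```

The factor 1/(1.1)^(1/2) is about 0.95, which reproduces the coefficient in the text, and at L = 1000 the width comes out at about 3 receptors.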

Quantum Efficiency
For the task of detecting a square spot of width W and illuminance L + D mean quanta/receptor surrounded by a background of illuminance L, an ideal observer using the entire quantum catch within the spot has a d' of $DW/L^{1/2}$.
Consequently, to achieve a fixed value of d' for any background level and spot size, an ideal quantum-limited detector requires a mean total increment $DW^2 = d'WL^{1/2}$. The quantum efficiency (QE) of any other detector is the ratio between that minimum and the mean total increment required by that detector to achieve the same d'. Here, we consider the QE of a spot detector based on a Gaussian CV operator: first the photon-noisy input image is filtered by $G_\sigma$, and then the detection decision is based on the value of the output image at its center point, i.e., (x, y) = (0, 0). (For small spots that point always has the highest d' in the output image.) This is the same detection mechanism that was assumed above to derive the TVI curves in Fig. 17. It is undoubtedly not the best possible CV observer: the ideal observer would make use of more than one point in the output image. However, it seems practically impossible to calculate the joint statistical properties of the entire ensemble of random variables that compose the full output image, and so the ideal CV observer cannot readily be determined. The less-than-ideal CV observer considered here has the advantage of being mathematically tractable and permits us to calculate at least a sensible lower bound for the QE of a detection mechanism whose input is a photon-noisy image filtered by a CV operator.
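The ideal-observer benchmark can be checked by simulation. The sketch below is illustrative only: the function names, the seed, and the parameter values are assumptions, and the simulated observer simply compares the total Poisson catch inside the spot with and without the increment, taking the background standard deviation as the noise term.

```python
import numpy as np

def ideal_dprime(D, W, L):
    """d' of an observer that uses the whole quantum catch within the spot."""
    return D * W / np.sqrt(L)

def ideal_total_increment(dprime, W, L):
    """Mean total increment D * W^2 that the ideal observer needs for a given d'."""
    return dprime * W * np.sqrt(L)

def quantum_efficiency(dprime, W, L, total_increment_used):
    """Ratio of the ideal increment to the increment another detector needs for the same d'."""
    return ideal_total_increment(dprime, W, L) / total_increment_used

def simulated_ideal_dprime(D, W, L, trials=200_000, seed=0):
    rng = np.random.default_rng(seed)
    # The sum of W^2 independent Poisson catches is itself Poisson with the summed mean.
    background = rng.poisson(L * W**2, size=trials)
    spot = rng.poisson((L + D) * W**2, size=trials)
    return (spot.mean() - background.mean()) / background.std()

print(ideal_dprime(1.0, 10, 100.0), simulated_ideal_dprime(1.0, 10, 100.0))
print(quantum_efficiency(1.0, 10, 100.0, total_increment_used=200.0))
```

In this example a detector that needs a total increment of 200 quanta to reach the d' that the ideal observer reaches with 100 has a QE of 0.5.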
We consider here only spots smaller than Ricco's area, as determined in the preceding section. For these spots the peak d' value is given by Eq. [...]
The following three problems seem to be especially important:
(1) Can theorem 1 be generalized to arbitrary spread functions? Theorem 1 shows that for the Gaussian operator $G_\sigma$ defined by Eq. [...] E(Q(u, v)) is uniformly >10 quanta/receptor. The problem is to determine whether this is true for all CV operators, i.e., all operators of the general form of Eq. (1.1). If it is, the entire deterministic theory could be applied generally in the photon-noisy case at high light levels, just as it has been here for the Gaussian operator.
(2) [...] showed that for the Gaussian operator, the variance of the uniform-field response is proportional to 1 - exp(-2L), where L is the mean quantum catch per receptor. In other words, the variance becomes independent of the field intensity at high light levels. Is this true in general? [Equation (6.1) of theorem 3 shows that the mean of the uniform-field response is always proportional to 1 - exp(-L), regardless of the spread function, and it seems likely that a similar result holds for the variance.] The practical significance of solving problems (1) and (2) was noted in Section 8: a yes answer to both questions would prove the conjecture that for all CV operators the CSF shifts bodily along the log frequency axis as retinal illuminance varies above 10 quanta/receptor, so that the peak spatial frequency and the visual acuity are both proportional to the square root of the mean illuminance level.
(3) Can it be shown that at high light levels, say, where E(Q(u, v)) is uniformly >10, the variance is constant across all points in the output image, regardless of the exact form of the expected input image? (The natural value of this constant would be the asymptotic uniform-field variance.) The simulation results reported in Section 7 suggest that this is true for Gaussian operators, but we have no analytic proof for any CV operator, including the Gaussian one.
[...] visual acuity to rise; peak contrast sensitivity to grow to an asymptote on the order of 100 (threshold contrast ~1%); the shape of the CSF to change from low pass to bandpass; and increment thresholds to shift from deVries-Rose-law behavior to Weber's-law behavior, a change that occurs sooner for large targets than for small ones. The model also correctly predicts that two gratings whose frequencies are both higher than the resolution limit at a given mean luminance level (so that both resemble uniform fields when presented separately) can give rise to visible contrast at their difference frequency when viewed simultaneously. However, this model generally fails to duplicate the exact quantitative parameters of spatial vision. In particular, at low light levels it underpredicts the growth of visual acuity with retinal illuminance, and at high levels it overpredicts it. In addition, it incorrectly implies that at illuminance levels above 10 Td the peak frequency of the spatial CSF should vary proportionally with the square root of the mean retinal illuminance and that low-frequency contrast sensitivity should decrease as retinal illuminance increases in that range.
Thus the simplest CV model succeeds in duplicating the overall qualitative properties of spatial vision up to at least 1000 Td but fails to make accurate quantitative predictions. Could these defects be cured by tinkering? The prospects for this seem rather different at high and low light levels. At low levels the basic problem is that the intrinsic nonlinearity of the CV operation has no opportunity to express itself. When the probability of a photoreceptor's catching more than one photon per time frame becomes negligible (<0.1 Td), the only point spread that has a chance to occur is the one corresponding to a 1-quantum catch, and so a CV operator with a spread function S becomes effectively equivalent to the linear operator whose impulse response is $S(r^2)$. Consequently, in this range CV operators share the principal defect of linear operators: no matter what the spread function is, increasing the mean illuminance level can only cause the CSF to translate rigidly upward (in a log-log plot), so that it can never match the lateral shifts seen in human CSF's.
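The low-light equivalence can be made concrete with a small sketch of a Gaussian CV operator on a receptor lattice. The exact normalization of the operator is not restated in this section, so the form used here is an assumption: a receptor with catch n contributes a Gaussian of height n/(2 pi sigma^2) and variance sigma^2/n, which keeps the volume constant. The function name and all numerical values are hypothetical.

```python
import numpy as np

def gaussian_cv(q, sigma, linear=False):
    """Sum of per-receptor Gaussian spreads for a 2D array q of quantum catches.

    linear=False: spread narrows as the catch n grows (constant volume, assumed form).
    linear=True: spread keeps its 1-quantum shape and is only scaled by n.
    """
    q = np.asarray(q)
    ny, nx = q.shape
    y, x = np.mgrid[0:ny, 0:nx]
    out = np.zeros((ny, nx))
    for v, u in zip(*np.nonzero(q)):       # receptors with zero catch contribute nothing
        n = float(q[v, u])
        r2 = (x - u) ** 2 + (y - v) ** 2
        shrink = 1.0 if linear else n
        out += n * np.exp(-shrink * r2 / (2.0 * sigma**2)) / (2.0 * np.pi * sigma**2)
    return out

rng = np.random.default_rng(1)
sparse = (rng.random((32, 32)) < 0.05).astype(int)   # catches of 0 or 1 only
same = np.allclose(gaussian_cv(sparse, 3.0), gaussian_cv(sparse, 3.0, linear=True))
print(same)

single = np.zeros((65, 65), dtype=int)
single[32, 32] = 9                                    # one receptor catching 9 quanta
response = gaussian_cv(single, 6.0)
print(response.sum(), response.max() * 2.0 * np.pi * 6.0**2)
```

When every catch is 0 or 1 the two versions coincide, which is the sense in which the operator is effectively linear at low light levels. The second example shows the defining property under the assumed form: the summed response stays close to 1 while the center height scales with the catch.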
The most obvious remedy for this low-light problem is to alter the assumption that the CV operation occurs at the level of individual photoreceptors. If we assumed instead that the quantum catches of some number of receptors are first pooled and then subjected to a CV operation, the operator's nonlinear effects could be made to reveal themselves at low light levels; acuity could be made to grow as the square root of retinal illuminance, and Weber's law could be made to hold. For example, if the CV operation were applied at the level of a second-stage unit that sums the quantum catch of 100 receptors, the effective nonlinear range of a CV model could be extended down to 0.01 Td. Moreover, the same change would cause the CV operator to begin to saturate at a level 100 times lower than before; e.g., $G_{100}$ would begin to saturate at around 100 Td instead of at $10^4$ Td, and so it could duplicate the saturation properties of the rod system.¹⁶ Altogether, it seems at least conceivable that a viable model of scotopic spatial vision could be constructed along these lines.
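The effect of pooling can be illustrated with Poisson arithmetic. The sketch below treats L as the mean quantum catch per receptor per time frame and asks how often a unit catches two or more quanta, which is the condition under which spreads other than the 1-quantum spread can occur. The function name is hypothetical.

```python
import math

def prob_multiple_quanta(mean_catch):
    """P(N >= 2) for a Poisson-distributed catch with the given mean."""
    return 1.0 - math.exp(-mean_catch) * (1.0 + mean_catch)

L = 0.01                                      # mean catch per receptor
single = prob_multiple_quanta(L)              # one receptor on its own
pooled = prob_multiple_quanta(100 * L)        # second-stage unit summing 100 receptors
print(single, pooled)
```

At L = 0.01 a single receptor almost never catches more than one quantum, whereas a unit pooling 100 receptors does so on roughly a quarter of its time frames, so the nonlinearity has an opportunity to act.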
At high light levels the defects of CV models arise from the intrinsic properties of the operation itself: the fact that it is designed to cause spatial resolution to grow at the maximum possible rate (i.e., as the square root of the mean quantum catch). From a modeling standpoint, this single-minded design philosophy has two unfortunate consequences: it forces both the peak frequency and the high-frequency cutoff of the CSF to grow faster than they should to match human performance, and it has the perverse effect of causing the low-frequency contrast sensitivity to decrease as the retinal illuminance increases. As the mean quantum catch increases to very high levels (above $10^3$) this last problem becomes especially critical, because the MTF of the CV operator begins to eliminate all the spatial frequencies that can actually get through the optics of the eye (Fig. 14), creating a kind of self-imposed blindness.
It is not clear how these high-light defects could be remedied. One natural starting point (suggested in Ref. 1) would be to assume a compressive transformation of the receptor signal, e.g., to take the input to the CV operator to be some power function of the quantum catch rather than the catch itself. In this case the Gaussian CV operator would become [...] In the deterministic case it is easy to show that a generalized CV operator of this sort satisfies the scaling theorem and consequently still obeys Weber's law.¹ However, acuity no longer grows as the square root of the mean illuminance L but grows instead as $L^{p/2}$, and this is true of the entire MTF: plotted against log frequency, it shifts along the axis so that its peak frequency is proportional to $L^{p/2}$. If this remains true in the noisy-input case (which seems likely), the relationship between the mean illuminance and the CSF could [...] However, it would still be the case that as illuminance rises, the CSF would shift bodily toward higher frequencies, and so sensitivity would still fall at low frequencies.
This last problem is frustrating. It arises from an excessive quantum catch (the visual analog of being too rich) and might seem to be solved easily by simply throwing away part of that catch, for example, by performing the CV operation on a quantum catch accumulated over a shorter time interval. However, the original (i.e., larger) catch must still be available to support high-frequency performance, and so this approach would require multiple memories: one for the full quantum catch at each receptor over the last 200 msec (for example), another for the catch over the last 100 msec, etc. This approach seems much too cumbersome to be worth pursuing, and I have not found any attractive alternative.
In the face of these difficulties, is it worthwhile to invest further effort in developing CV-based models for biological visual systems? The answer obviously depends on whether they correspond to any physiological reality, but that point is not so easily decided as one might suppose. Current physiologically informed descriptions of the retina¹⁸ certainly contain nothing resembling CV mechanisms, and it might seem that the issue is long since settled, since CV operators involve no lateral inhibition: their point-spread functions are never negative. But lateral inhibition itself is only a hypothesis designed to explain physiological results, e.g., the bandpass nature of the CSF's of retinal ganglion cells,¹⁹ and for the most part, the same results that are usually taken to demonstrate lateral inhibition would also be created by CV operators. Also, we have seen that CV operators provide a natural account of the apparent failure of lateral inhibition at low light levels, the apparent loss of the inhibitory surround portion of the ganglion-cell receptive field.²⁰ Thus the fact that CV operators do not appear in current physiological models does not prove that they do not exist in the retina. It only shows that physiologists have not been aware of them as theoretical tools.
What sort of experiment could definitely rule out the possibility of CV operations in the retina? The most direct approach would be to measure the CSF's of individual retinal ganglion cells at retinal illuminance levels spanning a broad range. Such measurements have been made for cat ganglion cells by Derrington and Lennie,²¹ who found that the CSF (for X cells) could always be fitted by adjusting the center and surround sensitivity parameters of a linear difference-of-Gaussians receptive-field model, with no need to vary the spatial scale parameter of either the center or the surround mechanism. Enroth-Cugell and Robson¹⁹ also reported essentially the same result in their classic paper on the X-Y distinction. (In their case the spatial scale parameters had to be altered to fit the CSF at low light levels, but only by a negligible amount.) This seems to show that the cat retina does not operate in a CV fashion: otherwise there should be much larger changes in the apparent size of receptive fields.
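The kind of fit described here can be sketched with a difference-of-Gaussians frequency response. The parameterization below is an assumption (a common form in which each mechanism has a Gaussian profile k exp(-(r/r0)^2), whose two-dimensional transform is k pi r0^2 exp(-(pi r0 f)^2)), and every numerical value is hypothetical. The point of the example is only that changing the two sensitivities, with both radii held fixed, is enough to move the curve between bandpass and low-pass shapes.

```python
import numpy as np

def dog_csf(f, k_c, r_c, k_s, r_s):
    """Frequency response of a linear difference-of-Gaussians receptive field."""
    center = k_c * np.pi * r_c**2 * np.exp(-(np.pi * r_c * f) ** 2)
    surround = k_s * np.pi * r_s**2 * np.exp(-(np.pi * r_s * f) ** 2)
    return center - surround

f = np.linspace(0.0, 2.0, 201)                 # spatial frequency, cycles/deg
# Same radii in both cases; only the sensitivities differ.
bandpass = dog_csf(f, k_c=1.0, r_c=0.5, k_s=0.05, r_s=2.0)
lowpass = dog_csf(f, k_c=0.2, r_c=0.5, k_s=0.0, r_s=2.0)
print(f[np.argmax(bandpass)], f[np.argmax(lowpass)])
```

A CV mechanism, by contrast, would require the effective radii themselves to shrink as illuminance rises, which is the change that these measurements did not find.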
However, this does not necessarily mean that things work the same way in the primate retina: cat visual acuity increases hardly at all as the mean luminance rises from $10^{-5}$ to 100 cd/m², while human acuity rises by a factor of the order of 50.²² Apparently there are rather substantial differences in the retinal hardware of cats and monkeys.²³ I am not aware of any experiment along the lines of that done by Derrington and Lennie with monkeys, but data of that sort could provide a clear-cut answer to the question of whether anything resembling a CV operation is carried out in the primate retina.
I suspect that the answer is no, first, because of the intrinsic defects of that operation as a psychophysical model, and second, because it seems so improbable that cat and primate retinas are designed according to fundamentally different principles. In addition, current work indicates that the vertebrate retina has found other ways of altering its spatial filtering properties as a function of the prevailing light level: in the fish retina, recent evidence suggests that the effectiveness of lateral inhibition mediated by horizontal cells is modulated by feedback signals carried from the inner to the outer plexiform layer by interplexiform cells.²⁴ In cat retina it is thought that a similar result is achieved by an illuminance-dependent modulation of the effectiveness of rod-cone gap junctions.¹⁸
Altogether, then, it seems likely that the remarkable similarities between the effects of CV operators and the global properties of spatial vision stem not from an identity of mechanism but from a common preoccupation with fundamental physical problems faced by all visual systems: problems created by photon noise and by the extremely large dynamic range of retinal image intensities in natural environments. The Poisson statistics of photon noise dictate an intensity-dependent spatial-summation mechanism to maximize resolution across different light levels, and the mismatch between the dynamic ranges of retinal inputs and outputs dictates a dc suppression mechanism. In Sections 2 and 3 it was shown that if we try to achieve these two goals by a single operation located at the photoreceptors, we are led inevitably to the class of CV operators, which in turn automatically give rise to bandpass filtering and Weber's-law behavior.
From that perspective, then, all the essential features of spatial vision (that is, those that we think of as being due to retinal processes) can be seen as arising from a single algorithm designed to solve two problems that must be solved by any visual system. If our retina solves these same problems in a non-CV fashion and creates the same essential properties, this suggests that the formal linkage between them is preserved even if we drop the assumption that both problems must be solved simultaneously at the receptor level. The interesting theoretical question is whether one can find an axiomatic framework that makes such a linkage apparent.