Aligned Image Sets Under Channel Uncertainty: Settling Conjectures on the Collapse of Degrees of Freedom Under Finite Precision CSIT

A conjecture made by Lapidoth et al. at Allerton 2005 (also an open problem presented at ITA 2006) states that the degrees of freedom (DoF) of a two user broadcast channel, where the transmitter is equipped with two antennas and each user is equipped with one antenna, must collapse under finite precision channel state information at the transmitter (CSIT). That this conjecture, which predates interference alignment, has remained unresolved is emblematic of a pervasive lack of understanding of the DoF of wireless networks (including interference and X networks) under channel uncertainty at the transmitter(s). In this paper, we prove that the conjecture is true in all non-degenerate settings (e.g., where the probability density function of unknown channel coefficients exists and is bounded). The DoF collapse even when perfect channel knowledge for one user is available to the transmitter. This also settles a related recent conjecture by Tandon et al. The key to our proof is a bound on the number of codewords that can cast the same image (within noise distortion) at the undesired receiver whose channel is subject to finite precision CSIT, while remaining resolvable at the desired receiver whose channel is precisely known by the transmitter. We are also able to generalize the result along two directions. First, if the peak of the probability density function is allowed to scale as O((√P)^α), representing the concentration of probability density (improving CSIT) due to, e.g., quantized feedback at rate (α/2) log(P), then the DoF are bounded above by 1 + α, which is also achievable under quantized feedback. Second, we generalize the result to an arbitrary number of antennas at the transmitter, an arbitrary number of single-antenna users, and complex channels. The generalization directly implies a collapse of DoF to unity under non-degenerate channel uncertainty for the general K user interference and M × N user X networks as well.

made under the assumption of perfect channel knowledge, the degrees of freedom under channel uncertainty at the transmitters have remained mostly a mystery. A prime example is the, heretofore unresolved, conjecture by Lapidoth, Shamai and Wigger from the Allerton conference in 2005 [2], also featured at the "Open Problems Session" at the Inaugural Information Theory and its Applications (ITA) workshop in 2006 [3], which claims that the DoF collapse under finite precision channel state information at the transmitter (CSIT). Specifically, Lapidoth et al. conjecture that the DoF of a 2 user multiple input single output (MISO) broadcast channel (BC) with 2 antennas at the transmitter and 1 antenna at each of the receivers, must collapse to unity (same as single user) if the probability distribution of the channel realizations, from the transmitter's perspective, is sufficiently well behaved that the differential entropy rate is bounded away from −∞. The condition excludes not only settings where some or all channel coefficients are perfectly known, but also scenarios where some channel coefficients are functions of others, even if their values remain unknown. The best DoF outer bound under such channel uncertainty, also obtained by Lapidoth et al., is 4/3. Deepening the mystery is the body of evidence on both sides of the conjecture. On the one hand, supporting evidence in favor of the collapse of DoF is available if the channel is essentially degraded, i.e., the users' channel vector directions are statistically indistinguishable from the transmitter's perspective [4], [5]. On the other hand, the idea of blind interference alignment introduced by Jafar in [6] shows that the 2 user MISO BC achieves 4/3 DoF (which is also an outer bound, thus optimal), even without knowledge of channel realizations at the transmitter, provided that one user experiences time-selective fading and the other user experiences frequency-selective fading. Since the time-selective channel is assumed
constant across frequency and the frequency-selective channel is assumed constant across time, it makes some channel coefficients functions of others (they are equal if they belong to the same coherence time/bandwidth interval), so that the model does not contradict the conjecture of Lapidoth et al. Thus, quite remarkably, this conjecture of Lapidoth, Shamai and Wigger, which predates interference alignment in wireless networks, has remained unresolved for nearly a decade.
Following in the footsteps of Lapidoth et al., subsequent works have made similar, sometimes even stronger conjectures, as well as partial attempts at proofs. For instance, the collapse of DoF of the MISO BC was also conjectured by
Weingarten, Shamai and Kramer in [7] under the finite state compound setting. However, this conjecture turned out to be too strong and was shown to be false by Gou, Jafar and Wang in [8], and by Maddah-Ali in [9], who showed that, once again, 4/3 DoF are achievable (and optimal) for almost all realizations of the finite state compound MISO BC, regardless of how large (but finite) the number of states might be. Since the differential entropy of the channel process is not defined (approaches −∞) for the finite state compound setting, this result also does not contradict the conjecture of Lapidoth et al. A related refinement of the conjecture, informally noted on several occasions (including by Shlomo Shamai at the ITA 2006 presentation) and mentioned most recently (although in the context of i.i.d. fading channels) by Tandon, Jafar, Shamai and Poor in [10], is that the DoF should collapse even in the "PN" setting, where perfect (P) CSIT is available for one of the two users, while no (N) CSIT is available for the other user. A valiant attempt at proving this conjecture is made in [11], but it turns out to be unsuccessful because it relies critically on an incorrect use of the extremal inequality of [12] under channel uncertainty. Thus the "PN" conjecture has also thus far remained unresolved.
That these conjectures remain unresolved is emblematic of a broader lack of understanding of the DoF of wireless networks under non-degenerate forms of channel uncertainty. For instance, by extension, under non-degenerate channel uncertainty we also do not know the DoF of the vector broadcast channel with more than 2 users, or the DoF of interference networks, X networks, cellular, multi-hop, or two-way relay networks, with or without multiple antennas, or any of a variety of settings with partial uncertainty, such as mixed [15], [16] or alternating [10] channel uncertainty. Thus, the resolution of these conjectures is likely to have a broad impact on our understanding of the "robustness" of the DoF of wireless networks. This is the motivation for our work in this paper.

A. Overview of Contribution
The main contribution of this work is to prove the conjecture of Lapidoth, Shamai and Wigger, thereby closing the ITA 2006 open problem, as well as the "PN" conjecture of Tandon et al., for all non-degenerate forms of finite precision CSIT, which includes all settings where density functions of the unknown channel realizations exist and are bounded. For all such settings, we show that the DoF collapse to unity as conjectured. Remarkably, this is the first result to show the total collapse of DoF under channel uncertainty without making assumptions of degradedness, or the (essentially) statistical equivalence of users.
Our approach, which is reminiscent of Korner and Marton's work on the images of a set in [17], is based on estimating the size of the images of the set of codewords as seen by the two users. Specifically, we bound the expected number of codewords that are resolvable at their desired receiver whose images align (within bounded noise distortion) at the undesired receiver under finite precision CSIT. We show that this quantity is ≈ O((log(P))^n), where n is the length of the codewords, and P is the power constraint which defines the DoF limit as P → ∞. This is negligible relative to the total number of resolvable codewords, which is ≈ O(P^{nd/2}) when the desired information is sent at rate (d/2) log(P), i.e., with DoF d > 0 (normalization by (1/2) log(P) is because we initially deal with real channels). The difference between the entropy contributed by any set of codewords at their desired receiver (desired DoF) and the entropy contributed by the same set of codewords at the undesired receiver (DoF consumed by interference) tends to zero in the DoF sense. Under non-degenerate channel uncertainty, it is not possible to utilize the DoF at the desired receiver without sacrificing the same number of DoF at the undesired receiver due to interference. Therefore, the DoF are bounded above by unity, the same as with a single user.
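To get a feel for the scales involved, the following numeric sketch (with hypothetical values of n, d, and P) contrasts the growth of the two counts discussed above.

```python
import math

# Numeric sketch with hypothetical parameters n (codeword length) and d (DoF):
# the number of resolvable codewords grows as ~P^(n*d/2), while the number of
# codewords whose images can align at the undesired receiver grows only as
# ~(log P)^n, so their ratio vanishes as P -> infinity.
n, d = 4, 1.0
for P in [1e2, 1e4, 1e8]:
    resolvable = P ** (n * d / 2)
    aligned = math.log(P) ** n
    print(f"P={P:.0e}: resolvable ~ {resolvable:.2e}, aligned ~ {aligned:.2e}, "
          f"ratio ~ {aligned / resolvable:.2e}")
```

The vanishing ratio is the quantitative content of the claim that aligned codewords are negligible in the DoF sense.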
We also generalize this result in several directions. First, we extend it to include CSIT that improves as P → ∞, e.g., through quantized feedback at rate (α/2) log(P), so that the probability density function of unknown channel coefficients concentrates around the correct realizations. This refinement of CSIT is captured by the growth in the peak value of the probability density function. We show that if the peak of the probability density function of unknown channel coefficients grows no faster than O(P^{α/2}), representing, e.g., improving channel quantization from feedback at rate (α/2) log(P), then the total DoF are bounded above by 1 + α. Furthermore, with quantized feedback of rate (α/2) log(P) this DoF bound is achievable.
Next, we go beyond 2 users and generalize the result to the K user MISO broadcast channel where the transmitter has K antennas and there are K users with a single antenna each. We also go beyond the restriction to real channels and generalize the results to complex channels. In all cases we prove that the DoF collapse to unity under non-degenerate channel uncertainty. Since the outer bound for this MISO BC is also an outer bound for the MISO BC with fewer than K antennas at the transmitter or fewer than K users, our result also establishes the collapse of DoF to unity for K user interference networks, and for M × N X channels, under non-degenerate channel uncertainty. Remarkably, the best known outer bounds for K user interference and M × N user X networks under non-degenerate channel uncertainty (except for essentially degraded settings) prior to this work were K/2 and MN/(M + N − 1) (same as with perfect CSIT). Thus, this work on finite precision CSIT and the work of Cadambe and Jafar in [18], where perfect CSIT was assumed, reveal a surprising contrast between the two sides of the same coin. In both cases the best previously known DoF outer bound was K/2 and the best previously known DoF inner bound was 1. Both works close this large gap. However, whereas under perfect CSIT, Cadambe and Jafar close the gap in the optimistic direction, showing that K/2 is optimal, in this work under finite precision CSIT, we close the gap in the pessimistic direction, showing that 1 DoF is optimal.

B. Notation
We use the Landau O(·), o(·) and Θ(·) notations as follows. For functions f(x) and g(x), f(x) = Θ(g(x)) denotes that there exists a positive finite constant, M, such that (1/M) g(x) ≤ f(x) ≤ M g(x), ∀x. We use P(·) to denote the probability function Prob(·). We define ⌊x⌋ as the largest integer that is smaller than or equal to x when x > 0, the smallest integer that is larger than or equal to x when x < 0, and x itself when x is an integer. The index set {1, 2, · · · , n} is represented compactly as [1 : n] or simply [n] when it would cause no confusion. X^[s] represents the random vector (X(1), X(2), · · · , X(s)) and {X^[s]} represents the set {X(t) : t ∈ [s]}. The cardinality of a set A is denoted as |A|. The support of a random variable X is denoted as supp(X).
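As a quick sanity check, the bracket operation ⌊x⌋ defined above coincides with truncation toward zero; a minimal sketch (the helper name paper_floor is ours):

```python
import math

# The bracket operation of the paper: floor for x > 0, ceiling for x < 0,
# identity on integers. This is exactly truncation toward zero (math.trunc).
def paper_floor(x):
    return math.ceil(x) if x < 0 else math.floor(x)

samples = [2.7, -2.7, 3.0, -3.0, 0.0]
assert all(paper_floor(x) == math.trunc(x) for x in samples)
print([paper_floor(x) for x in samples])  # [2, -2, 3, -3, 0]
```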

II. THE 2 USER MISO BC WITH PERFECT CSIT FOR ONE USER
To prove the collapse of DoF in the strongest sense possible, let us first enhance the 2 user MISO BC by allowing perfect CSIT for user 1. Consider the vector broadcast channel with 2 users where the transmitter is equipped with 2 antennas, each user is equipped with 1 receive antenna, and there are 2 independent messages W1, W2 that originate at the transmitter and are desired by users 1 and 2, respectively. The transmission takes place over n channel uses. The channel state information at the transmitter (CSIT) is denoted as T, and includes perfect channel state information for the channel vector of user 1 but not for the channel vector of user 2. In the terminology of Tandon et al. [10], this is the PN setting, although not restricted to any statistical equivalence assumptions.
The best outer bound for the DoF of the PN setting based on known results so far is 3/2, which is obtained from the finite state compound model by Weingarten, Shamai and Kramer in [7] and is applicable to finite precision CSIT as well. While Weingarten et al. conjectured that their outer bound was loose even in the finite state compound setting, predicting a collapse of DoF, this conjecture was shown to be false by Gou, Jafar and Wang in [8], who showed that 3/2 DoF are achievable under the finite state compound model, through the DoF tuple (d1, d2) = (1, 0.5). The key to achievability is to split user 1's 1 DoF into two parts that carry 0.5 DoF each. These parts align at user 2, consuming half the available signal space of user 2, while remaining resolvable at user 1. User 2's signal, carrying 0.5 DoF, is then sent in the null space of user 1's channel, and is resolvable from the 0.5 dimensional interference-free space at user 2. Note that zero forcing at user 1 is possible because perfect CSIT for user 1 is assumed to be available.
The 3/2 DoF outer bound is also applicable in the blind interference alignment (BIA) setting introduced by Jafar in [6], where user 1 experiences time or frequency selective fading but user 2 experiences a relatively flat fading channel. Here also the outer bound is shown to be achievable through the pair (d1, d2) = (1, 0.5). The key is to send two symbols for user 1, one from each antenna, repeated over two channel realizations where the channel of user 1 changes but the channel of user 2 remains the same. Thus, user 1 sees two linear combinations of the two symbols, from which both symbols can be resolved, whereas user 2 only sees the same linear combination over both channel uses. Thus the interference occupies only 0.5 DoF at user 2. The remaining 0.5 DoF at user 2 is utilized by sending his desired signal, carrying 0.5 DoF, into the null space of user 1.
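The resolvability argument behind this repetition scheme can be checked with a small linear algebra sketch (channels drawn at random, purely illustrative):

```python
import numpy as np

# Two symbols for user 1 are repeated over two slots. User 1's channel changes
# across the slots (two distinct linear combinations, rank 2), while user 2's
# channel stays fixed (the same combination twice, rank 1).
rng = np.random.default_rng(0)
h1_slot1 = rng.standard_normal(2)  # user 1's channel in slot 1
h1_slot2 = rng.standard_normal(2)  # user 1's channel in slot 2 (different)
h2 = rng.standard_normal(2)        # user 2's channel (same in both slots)

A1 = np.vstack([h1_slot1, h1_slot2])  # equations seen by user 1
A2 = np.vstack([h2, h2])              # equations seen by user 2

print(np.linalg.matrix_rank(A1))  # 2: both symbols resolvable at user 1
print(np.linalg.matrix_rank(A2))  # 1: interference fills only half the space
```

The rank-2 system at user 1 and rank-1 system at user 2 are exactly the (1, 0.5) DoF split described above.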
The finite state compound setting and the blind interference alignment setting reveal some of the challenges of proving the collapse of DoF for the PN setting. Any attempt at proving a collapse of DoF must carefully exclude such scenarios from the channel model. With this cautionary note, we are now ready to introduce the channel model for our problem.

A. General Channel Model
Following [2], we will start with the real channel model, where all symbols take only real values.The extension to complex channels is cumbersome but conceptually straightforward, and will be presented in Section VII for the sake of completeness.
The channel is described as follows:

Ỹk(t) = G̃k1(t) X̃1(t) + G̃k2(t) X̃2(t) + Z̃k(t), k ∈ {1, 2},

where all symbols are real. At time t ∈ N, Ỹk(t) is the symbol received by user k, Z̃k(t) ∼ N(0, 1) is the real additive white Gaussian noise (AWGN), X̃j(t) is the symbol sent from transmit antenna j, and G̃kj(t) is the channel fading coefficient between the jth transmit antenna and user k. The channel coefficients are not restricted to i.i.d. realizations, but are assumed to be drawn from a continuous distribution such that the joint density of G̃(t) exists. The transmitter is subject to the power constraint

E[X̃1²(t) + X̃2²(t)] ≤ P̃.

To avoid degenerate situations we will assume that the range of values of each of the elements G̃ij(t) is bounded away from zero and infinity, as is the determinant of the overall channel matrix, i.e., |G̃ij(t)| and |det(G̃(t))| are all Θ(1). Stated explicitly, there exists a positive finite constant M such that

1/M ≤ |G̃ij(t)| ≤ M, 1/M ≤ |det(G̃(t))| ≤ M.

Note that this is not a major restriction, because by choosing the bounding constants large enough, the omitted neighborhoods can be reduced to a probability measure less than ε for arbitrarily small ε > 0, and thus have only a vanishing impact on the DoF.

B. Canonical Form
For the purpose of deriving a DoF outer bound it suffices to work with a simplified channel with fewer parameters (see Appendix B for justification). The simplified form of the channel model, shown in Fig. 1, has outputs Y1(t), Y2(t) ∈ R and inputs X1(t), X2(t) ∈ R, so that:

Y1(t) = X1(t) + Z1(t)
Y2(t) = G(t) X1(t) + X2(t) + Z2(t)

The channel coefficient G(t) is also bounded away from zero and infinity, i.e., there exists a finite positive M such that 1/M ≤ |G(t)| ≤ M. The new power constraint is expressed as X1²(t) + X2²(t) ≤ P, where P = Θ(P̃). Further, for notational convenience, let us define the set of admissible inputs as those satisfying this power constraint.

C. Messages, Rates, Capacity, DoF
The messages W1, W2 are jointly encoded at the transmitter for transmission over n channel uses at rates R1, R2, respectively, into a 2^{n(R1+R2)} × n codebook matrix over the input alphabet. The codebook is denoted by C(n, (R1, R2), P). For a given power constraint parameter P, the rate vector (R1, R2) is said to be achievable if there exists a sequence of codebooks C(n, (R1, R2), P), indexed by n, such that the probability that all messages are correctly decoded by their desired receivers approaches 1 as n approaches infinity. The closure of achievable rate vectors is the capacity region C(P).
The closure of all achievable DoF tuples (d1, d2) is called the DoF region, D. The sum-DoF value is defined as D_Σ = max_{(d1, d2) ∈ D} (d1 + d2).

D. Non-Degenerate Channel Uncertainty
Let us denote by T all available channel state information at the transmitter. Then, non-degenerate channel uncertainty corresponds to the assumption that the conditional probability density functions of channel coefficients exist and are bounded, as explained next.
1) Peak of Density Function Is Bounded for Fixed P: For the Gij, the transmitter is only aware of the joint probability density function (PDF). ∀k ∈ [2 : K], define Gk as the set of channel coefficient variables associated with user k. Finite precision CSIT corresponds to the existence of bounded density functions. Precisely, the finite precision CSIT model assumes that there exists a finite positive constant f_max, 0 < f_max < ∞, such that ∀n ∈ N, and for all finite cardinality disjoint subsets G1, G2 of channel coefficient variables, the conditional joint density satisfies f_{G1|G2,T} ≤ f_max^{|G1|}. The condition implies that a zero measure set cannot carry a non-zero probability. So it precludes scenarios where, e.g., the channel is perfectly known or where one channel coefficient is a function of the rest. In all such cases, a zero measure set carries a non-zero probability, thus precluding the existence of a bounded constant f_max as defined above. This restriction essentially accomplishes the same goal as the restriction by Lapidoth et al. [2] that the differential entropy should be greater than −∞.
For example, if conditioned on the available CSIT T, the channel realizations are independent, then we can simply choose f_max as the peak value of the marginal density functions.
2) Peak of Density Function Is Allowed to Scale With P: To model CSIT that improves as a function of P, we allow f_max(P) to scale as O((√P)^α) for some α ≥ 0. The case studied by Lapidoth et al. in [2], where the density does not depend on P, is represented here by setting α = 0. The positive values of α allow us to address settings where the CSIT improves with P, e.g., due to quantized channel feedback of rate (α/2) log(P), so that the weight of the distribution is increasingly concentrated around the true channel realizations. Note that the maximum value of α is unity, because a feedback rate of (1/2) log(P), implying 1 real DoF worth of feedback, is sufficient to approach perfect CSIT performance over channels that take only real values [19], [20].
Since the receivers have full channel state information, T is globally known. For compact notation, we will suppress the conditioning on T, writing f_{G^[n]}(g^[n]) directly instead of f_{G^[n]|T}(g^[n]|T).

E. K User Extension
Extending beyond the 2 user case, the simplified channel model in the K user setting is described as follows:

Y1(t) = X1(t) + Z1(t)
Yk(t) = Gk1(t) X1(t) + Gk2(t) X2(t) + · · · + GkK(t) XK(t) + Zk(t), k ∈ [2 : K],

where the inputs, Xk(t) ∈ R, are subject to the power constraint X1²(t) + · · · + XK²(t) ≤ P. The Gij(t) terms are known to the transmitter only up to finite precision and are assumed to be bounded away from 0 and infinity. Further, the density of the kth user's unknown channel coefficients, k > 1, is bounded by f_max,k(P) = O(P^{αk/2}).

III. RESULTS
We state the main result in its most general form, for K users. The 2 user case corresponds to setting α_2 = α.
Theorem 1: For the K user MISO BC with non-degenerate channel uncertainty, the sum-GDoF are bounded above as D_Σ ≤ 1 + α_2 + · · · + α_K.

A. Settling the Conjecture by Lapidoth et al. in [2]

The 2 user setting studied by Lapidoth et al., where the joint pdf is fixed, i.e., it does not depend on P, is captured here when α = 0 (equivalently, α_2 = 0). When α = 0, the sum-GDoF are bounded above by unity, thus settling the conjecture of Lapidoth et al. for non-degenerate channel uncertainty models.

B. Settling the "PN" Conjecture
Since we allow perfect CSIT for one user, and one may assume (as a special case of our result) that the channels are i.i.d., the collapse of DoF for α = 0 also proves the conjecture of Tandon et al. for the 2 user setting.

C. Interference and X Networks
Consider any one-hop wireless network where all receivers are equipped with a single antenna each. This includes all interference and X networks. Allowing the transmitters to cooperate produces a MISO BC setting. Since cooperation cannot hurt, the outer bound for the MISO BC under non-degenerate channel uncertainty applies to interference and X networks as well. In all cases, the DoF collapse to unity.

D. Limited Rate Feedback (α > 0)
Consider the 2 user setting, with α = α_2 > 0. This case is interesting because it has direct implications for the achievable DoF under limited rate quantized channel state feedback for the channel vector of user 2. If the feedback link has α DoF, i.e., the feedback rate scales as (α/2) log(P) bits per channel use, then this corresponds to ∼ P^{α/2} channel quantization levels, so that the size of a quantization interval scales as 1/(√P)^α and the channel density restricted to a quantization interval, i.e., f_{G^[n]}(g^[n]|T), scales as P^{α/2}. Theorem 1 tells us that in this case the GDoF are bounded above as D_Σ ≤ 1 + α. It is also easy to see that under such quantized feedback, the DoF tuple (d1, d2) = (1, α) is achievable, simply by best-effort zero-forcing at the transmitter and treating residual interference as noise at receiver 2. Thus, D_Σ = 1 + α is the optimal sum-DoF value if the quantized channel state feedback is limited to rate (α/2) log(P). This generalizes the results from [19] and [20] where it was shown that in order to achieve the same DoF as with perfect CSIT, i.e., D_Σ = 2, the quantized feedback rate should scale as (1/2) log(P), i.e., carry one full degree of freedom (α = 1). The bound for the K user extension is similarly tight as well.
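The achievability side can be checked with a back-of-envelope calculation; the sketch below uses the scalings assumed in the discussion above (quantization error ∼ P^{−α}, hence residual interference power ∼ P^{1−α}):

```python
import math

# With quantized feedback of rate (alpha/2) log P, the channel quantization
# error scales as ~P^(-alpha), so residual interference after best-effort
# zero forcing has power ~P^(1-alpha). Treating it as noise, user 2's SINR
# scales as ~P^alpha, so its DoF d2 = log(1+SINR)/log(P) approaches alpha.
def user2_dof_estimate(P, alpha):
    residual_interference = P ** (1 - alpha)
    sinr = P / (1 + residual_interference)
    return math.log2(1 + sinr) / math.log2(P)

for alpha in [0.0, 0.5, 1.0]:
    print(f"alpha={alpha}: d2 ~ {user2_dof_estimate(1e12, alpha):.3f}")
```

At large but finite P the estimate already sits close to α, consistent with the achievable tuple (d1, d2) = (1, α).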

IV. ALIGNED IMAGE SETS UNDER CHANNEL UNCERTAINTY
The main idea we want to illustrate intuitively in this section is a geometric notion of aligned images of codewords, loosely related to Korner and Marton's work on the images of a set in [17] but under a much more specialized setting, which is the key to our proof. As the proof in Section V will show, the problem boils down to the difference of two terms when only information to user 1 is being transmitted,

h(Y1^[n] | G^[n]) − h(Y2^[n] | G^[n]).

The first term, h(Y1^[n] | G^[n]), we wish to maximize, because it represents the rate of desired information being sent to user 1. The second term, h(Y2^[n] | G^[n]), we wish to minimize, because it represents the interference seen by user 2, due to the information being sent to user 1. If G^[n] were perfectly available to the transmitter, then X2^[n] could be chosen to cancel G^[n] X1^[n], thus eliminating interference entirely at user 2. With only statistical knowledge of G^[n], zero forcing is not possible. Indeed, the purpose of X2^[n] is mainly to align interference into as small a space as possible. However, instead of consolidating interference in the sense of vector space dimensions, as is typically the case in DoF studies involving interference alignment, here the goal is for X2^[n] to minimize the size of the image, as seen by user 2, of the codewords that carry information for user 1. This is the new perspective that is the key to the proof.

A. Toy Setting to Introduce Aligned Image Sets
For illustrative purposes, let us start with a rather extreme over-simplification, by considering the case with n = 1 channel use, ignoring noise, and using the log of the cardinality of the codewords as a surrogate for the entropy. With this simplification, the quantity that we are interested in is the difference

log |supp(X1)| − log |supp(G X1 + X2)|,

averaged over G. Recall that by |supp(A)| is meant the cardinality of the set of values taken by the random variable A. The codebook is the set of (X1, X2) values. Note that |supp(X1)|, the number of distinct values of X1, is the number of distinct "codewords" that can be seen by user 1, who (once noise is ignored) only sees Y1 = X1, so that his "rate" is log |supp(X1)|. Given the set of X1 values, we would like to associate each X1 value with a corresponding X2 value, such that the number of distinct values of Y2 = G X1 + X2 is minimized. In other words, we wish to minimize the image of the set of codewords as seen by user 2, by choosing X2 to be a suitable function of X1.
Consider two codewords (X1, X2) = (x1, x2) and (X1, X2) = (x1′, x2′). If x1 ≠ x1′ then these codewords are distinct from user 1's perspective, and thus capable of carrying information to user 1 via the transmitter's choice to transmit one or the other. Suppose the channel is G. Then for these two codewords to "align" where they cause interference, they must have the same image as seen by user 2. This gives us the condition for aligned images that is central to this work:

G x1 + x2 = G x1′ + x2′, i.e., G = −(x2′ − x2)/(x1′ − x1).
In other words, G must be the negative of the slope of the line connecting the codeword (x1, x2) to the codeword (x1′, x2′) in the (X1, X2) plane. For a given channel realization G, all codewords that align with (x1, x2) (i.e., whose images align with the image of (x1, x2)) as seen by user 2, must lie on the same line that passes through (x1, x2) and has slope −G.
Conversely, all codewords that lie on this line have images that align with the image of (x1, x2) at user 2. For any codeword that does not lie on this line, there is a parallel line with the same slope, −G, that represents the set of codewords whose images align with the image of that codeword. Thus, these lines of the same slope, −G, partition the set of codewords into equivalence classes, such that codewords that lie on the same line have the same image at user 2. Also note that a different channel realization, G′, gives rise to a different equivalence class partition, corresponding to lines with slope −G′. This is illustrated in Fig. 3. Since the X2 values are functions of X1 values, in the figure we label the codewords only by their X1 values.
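The partition into equivalence classes can be made concrete with a tiny numeric sketch (the codebook below is hypothetical):

```python
from collections import defaultdict

# For a channel realization G (noise ignored, n = 1), codewords (x1, x2) fall
# into the same aligned image set iff they produce the same image G*x1 + x2 at
# user 2, i.e., iff they lie on a common line of slope -G in the (X1, X2) plane.
def aligned_image_classes(codebook, G):
    classes = defaultdict(list)
    for x1, x2 in codebook:
        classes[round(G * x1 + x2, 9)].append((x1, x2))
    return list(classes.values())

codebook = [(0, 0), (1, -2), (2, -4), (3, 0)]  # first three lie on slope -2
print(len(aligned_image_classes(codebook, G=2)))  # 2 classes when G = 2
print(len(aligned_image_classes(codebook, G=1)))  # 4 classes when G = 1
```

For G = 2 the three collinear codewords cast a single image at user 2, while for G = 1 all four codewords remain distinguishable, illustrating how the partition changes with the channel realization.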

B. Sketch of Proof
Staying with the intuitive character of this section, let us conclude with an outline which will be useful to navigate the structure of the proof that appears in the subsequent section.
From the perspective of DoF studies, the presence of noise essentially imposes a resolution threshold, e.g., δ, such that codewords with images that differ by less than δ are unresolvable. As the first step of the proof, this effect is captured by discretizing the input and output alphabets and eliminating noise, as is done in a variety of deterministic channel models that have been used for DoF studies [21]-[23], so that instead of differential entropies we now need to deal only with the entropies H(Ȳ1^[n] | G^[n]) and H(Ȳ2^[n] | G^[n]). Here X̄1, X̄2 represent the discretized inputs, Ȳ1, Ȳ2 the discretized outputs, and Ȳ1 = X̄1. The next step is to note that we are only interested in the maximum value of the difference H(Ȳ1^[n] | G^[n]) − H(Ȳ2^[n] | G^[n]). It then follows that, without loss of generality, X̄2^[n] can be made a function of X̄1^[n], and therefore Ȳ2^[n] is a function of (X̄1^[n], G^[n]). Thus, the difference of entropies is equal to H(X̄1^[n] | Ȳ2^[n], G^[n]).

Now, conditioned on Ȳ2^[n] and G^[n], the set of feasible values of X̄1^[n] is an aligned image set S(G^[n]), i.e., all these X̄1^[n] produce the same value of Ȳ2^[n] for the given channel realization G^[n].
Since entropy is maximized by a uniform distribution,

H(X̄1^[n] | Ȳ2^[n], G^[n]) ≤ E[log |S(G^[n])|] ≤ log E[|S(G^[n])|],

where the last step uses Jensen's inequality. Thus, the difference of entropies is bounded by the log of the expected cardinality of the aligned image sets.
The most critical step of the proof then is to bound the expected cardinality of aligned image sets. This is done by bounding the probability that two given X̄1^[n] are in the same aligned image set, i.e., the probability of the set of channels for which the two produce the same image. Recall that for two codewords to belong to the same aligned set in the absence of noise, the channel realization over each channel use must be the slope of the vector connecting the corresponding codeword vectors. The blurring of δ around the two codewords also blurs the slope of the line connecting them, but by no more than ±δ/Δ, where Δ is the distance (difference in magnitudes) between the two codeword symbols over that channel use. Thus, the probability that two given codewords that are resolvable at user 1 cast the same image at user 2 is bounded above by ≈ 2 f_max δ/Δ. As a result, log E|S(G^[n])| is bounded above by ≈ n log(f_max δ) + n log(log(P)). Since f_max = O(P^{α/2}), normalizing by (n/2) log(P) and sending first n and then P to infinity sends this term to α. Thus, combining with (12) produces the sum-DoF outer bound value 1 + α, giving us the result of Theorem 1. Note that in the DoF limit, δ = Θ(1), and it will be useful to think of it as 1 for simplicity, so that the inputs and outputs are restricted to integer values. With this sketch as the preamble, we now proceed to the actual proof.
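The key probability estimate can be checked empirically; the Monte Carlo sketch below assumes a uniform channel density on [1, 2] (so f_max = 1) and hypothetical values of δ and Δ:

```python
import random

# Two codeword symbols at distance Delta align (within noise resolution delta)
# only if the channel G falls within ~delta/Delta of the slope of the line
# joining them: an event of probability ~ f_max * 2 * delta / Delta.
random.seed(1)
delta, Delta, slope = 1.0, 50.0, 1.5  # slope: line joining the two codewords
trials = 200_000
hits = sum(abs(random.uniform(1, 2) - slope) <= delta / Delta
           for _ in range(trials))
empirical = hits / trials
predicted = 2 * delta / Delta  # f_max = 1 for Uniform[1, 2]
print(f"empirical ~ {empirical:.4f}, predicted ~ {predicted:.4f}")
```

The empirical frequency matches the f_max · 2δ/Δ estimate, and shrinking δ or growing Δ shrinks the alignment probability accordingly.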
V. PROOF OF THEOREM 1 FOR K = 2 USERS

For ease of exposition, the proof is divided into several key steps. The first step is the discretization of the channel to capture the effect of noise. This leads to a deterministic channel model. The DoF of the deterministic channel model will be shown to be an outer bound to the DoF of the canonical channel model, which in turn is an outer bound on the DoF of the general channel model.

1) Deterministic Channel Model
The deterministic channel model has inputs X̄1(t), X̄2(t) ∈ Z and outputs Ȳ1(t), Ȳ2(t) ∈ Z, defined as

Ȳ1(t) = X̄1(t)
Ȳ2(t) = ⌊G(t) X̄1(t)⌋ + X̄2(t)

with the discretized inputs subject to the power constraint X̄1(t), X̄2(t) ∈ {0, 1, · · · , ⌊√P⌋}, and the set of inputs that satisfy the power constraint defined as X̄^[n]. The bounded density assumptions on the unknown channel coefficient sequence G^[n] are the same as before.
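A minimal sketch of the deterministic model for one channel use (the functional form Ȳ1 = X̄1, Ȳ2 = ⌊G X̄1⌋ + X̄2 is an assumption consistent with the proof sketch in Section IV):

```python
import math

def truncate(x):
    """The paper's bracket operation: truncation toward zero."""
    return math.trunc(x)

# Deterministic outputs for one channel use: user 1 sees its integer input
# directly, user 2 sees the truncated cross term plus its own integer input.
def deterministic_outputs(G, x1_bar, x2_bar):
    y1_bar = x1_bar
    y2_bar = truncate(G * x1_bar) + x2_bar
    return y1_bar, y2_bar

print(deterministic_outputs(1.3, 5, 2))  # (5, 8): trunc(6.5) + 2 = 8
```

Note that noise is absent; its effect is captured entirely by the integer resolution of the inputs and outputs.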
Lemma 1: The DoF of the canonical channel model are bounded above by the DoF of the deterministic channel model. The proof of Lemma 1 appears in the Appendix and follows along the lines of similar proofs by Bresler and Tse in [22].

2) Difference of Entropies Representing Desired Signal and Interference Dimensions
Starting from Fano's inequality, we proceed as follows.
so that what remains is to bound the difference of entropy terms H(Ȳ1^[n] | G^[n]) − H(Ȳ2^[n] | G^[n]). Note that in (24) we bounded the entropy terms, where (29) follows from the observation that, for a given channel realization, the variables involved take only finitely many values (all integers), and the entropy of a variable that can take finitely many values is at most the log of the number of values.
3) Functional Dependence of X̄2^[n] on X̄1^[n]

In general, because the mapping may be random, L is a random variable. Because conditioning cannot increase entropy, the negative entropy term is only reduced by conditioning on L. Let Lo ∈ L be the mapping that minimizes the entropy term. Then, choosing L = Lo, we have the desired bound, because the choice of the mapping function does not affect the positive entropy term, and it minimizes the negative entropy term. Henceforth, because X̄2^[n] is a function of X̄1^[n], we will refer to codewords only through their X̄1^[n] values.

4) Definition of Aligned Image Sets
The aligned image set containing the codeword ν^[n], for channel realization G^[n], is defined as the set of all codewords that cast the same image as ν^[n] at user 2.
Since we will need the average (over G^[n]) of the cardinality of an aligned image set, E|S_ν^[n](G^[n])|, it is worthwhile to point out that the cardinality |S_ν^[n](G^[n])|, as a function of G^[n], is a bounded simple function, and therefore measurable.² It is bounded because its values are restricted to natural numbers not greater than (1 + √P)^{2n}. To verify that it is a simple function, it suffices to show that the sets where two specific codewords, say ν1^[n] and ν2^[n], align are measurable sets. The set S12 of channel realizations G^[n] for which ν1^[n] and ν2^[n] cast the same image is a countable union of intersections of open intervals and closed intervals in R^n, which makes it a measurable set.³ Thus, |S_ν^[n](G^[n])| is a simple function.
Rearranging terms, we note that D ≤ lim sup of the normalized expected log-cardinality of the aligned image sets.

² A simple function is a finite sum of indicator functions of measurable sets [24].
³ Note that ⌊Gν⌋ = m corresponds to G ∈ [m/ν, (m + 1)/ν), which is the intersection of an open interval and a closed interval. Also recall that a closed or open subset of R^n is Lebesgue measurable, and countable unions and intersections of Lebesgue measurable sets are Lebesgue measurable.
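The interval claim in the footnote above can be verified exactly with rational arithmetic (the values ν = 7, m = 3 are arbitrary illustrations):

```python
from fractions import Fraction
import math

# Check: for a positive integer nu, floor(G*nu) = m exactly when
# G lies in the half-open interval [m/nu, (m+1)/nu).
nu, m = 7, 3
lo, hi = Fraction(m, nu), Fraction(m + 1, nu)
eps = Fraction(1, 10**9)
samples = [lo, lo + eps, (lo + hi) / 2, hi - eps, hi]
flags = [math.floor(g * nu) == m for g in samples]
print(flags)  # [True, True, True, True, False]
```

Using `Fraction` avoids floating-point rounding at the interval endpoints, so the half-open behavior at hi is exact.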
7) Bounding the Average Size of Aligned Image Sets

Finally, combining (49) with (26) and (34), we have the desired outer bound.

VI. PROOF FOR K USERS

The generalization of the proof to the K user setting is, for the most part, straightforward based on the 2 user case studied earlier. To avoid repetition, our presentation will only briefly summarize the aspects that follow directly, and use detailed exposition only for those aspects that require special attention. We divide the proof into a similar set of steps for ease of reference with the 2 user case.

1) Deterministic Channel Model
As in the 2 user case, the deterministic channel model is described by integer inputs and quantized outputs, where the integer inputs satisfy a per-symbol power constraint. As before, let us define X̄^[n] as the set of codewords that satisfy the power constraint. We have the following bound.
Lemma 2: The DoF of the canonical model are bounded above by the DoF of the deterministic model. We omit the proof of Lemma 2 since it is a straightforward extension of the 2 user proof, which was already presented in much detail.

2) Difference of Entropy Terms
For the k-th user we bound the rate, where G^[n] includes all channel realizations. Adding the rate bounds, we obtain the sum-rate bound; in DoF terms, we need to bound each of the following difference of entropy terms, ∀k ∈ [2 : K]. We will bound these terms one at a time. The remainder of the proof will show that D_k ≤ α_k.

3) Functional Dependence

For a given channel realization for user k − 1, G_{k−1}^[n], there are multiple input vectors that produce the same output Ȳ_{k−1}^[n]. Thus, given the channel for user k − 1, the mapping from Ȳ_{k−1}^[n] to one of these vectors may be random. Now note that conditioning cannot increase entropy. Let a minimizing mapping be L_o, and fix this as the deterministic mapping. This implicitly allows the transmitter to have full knowledge of the channel vector of user k − 1. We note that the choice of mapping does not affect the positive entropy term H(Ȳ_k^[n] | G^[n]), so that we can bound D_k as follows.
Henceforth, we will use this functional dependence throughout.

4) Define Aligned Image Sets
For channel realization G [n] , define the aligned image set

5) Bounding Difference of Entropies in Terms of Size of Aligned Image Sets
where we used Jensen's inequality in (68). Rearranging terms, we note that
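The use of Jensen's inequality in (68) is in the form E[log X] ≤ log E[X], since log is concave. The integer samples below are arbitrary stand-ins for aligned-image-set sizes:

```python
import math
import random

rng = random.Random(3)
xs = [rng.randint(1, 100) for _ in range(10_000)]   # stand-ins for |S| values
mean_log = sum(math.log2(x) for x in xs) / len(xs)  # E[log X]
log_mean = math.log2(sum(xs) / len(xs))             # log E[X]
print(mean_log, log_mean)  # Jensen: E[log X] <= log E[X]
```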

6) Bounding the Probability of Image Alignment
Given the channel G^[n], and two realizations of Ȳ_{k−1}^[n], say ȳ^[n] and ȳ′^[n], which map to x̄^[n] and x̄′^[n] respectively, we have the alignment conditions, where the residual term lies in (−1, 1), and we define j*(t) as the index of the largest difference |x_j(t) − x′_j(t)|. Thus, for all t ∈ [1 : n] such that x_{j*(t)}(t) ≠ x′_{j*(t)}(t), the value of G_{kj*(t)}(t) must lie within an interval of length no more than 2/|x_{j*(t)}(t) − x′_{j*(t)}(t)|. Therefore, the probability that the images due to ȳ^[n] and ȳ′^[n] align at user k is bounded as follows.
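The interval-length claim can be checked by a direct scan: for integers x ≠ x′ and any fixed integer offset m, the set of channel values g with ⌊gx⌋ − ⌊gx′⌋ = m is contained in an interval of length at most 2/|x − x′|. The values of x, x′, m below are arbitrary illustrations.

```python
import math

# floor(g*x) - floor(g*x') = m forces g*(x - x') into (m-1, m+1),
# i.e., g into an interval of length at most 2/|x - x'|.  Scan g on a
# fine grid over [0, 2) and measure the set where the condition holds.
x, xp, m = 9, 4, 2
step = 1e-5
hits = [i * step for i in range(int(2 / step))
        if math.floor(i * step * x) - math.floor(i * step * xp) == m]
measure = len(hits) * step           # approximate measure of the set
print(measure, 2 / abs(x - xp))      # measured set size vs. interval bound
```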
It will be useful to express the bound in terms of ȳ(t) and ȳ′(t). To this end, let us proceed as follows.
7) Bounding the Average Size of Aligned Image Sets

where the expectation is bounded using the alignment probability obtained in the previous step.

8) Combining the Bounds to Complete the Proof
Combining (74) and (69), and finally combining (56), (62) and (75), we have the result.

VII. THE COMPLEX SETTING

The channel model for the complex setting is identical to the real setting described in Section II-E, except that all symbols are complex and, instead of (6), the DoF are defined accordingly. The deterministic channel model is described similarly to the real setting, where the real and imaginary parts of the inputs, i.e., X̄_{kR}(t) and X̄_{kI}(t), are integers and satisfy the per-symbol power constraint. Similar to the real setting, ∀k ∈ [2 : K], define the set of channel coefficient variables, where G_{ki,R}(t), G_{ki,I}(t) are the real and imaginary parts of G_{ki}(t); this holds ∀n ∈ N. Furthermore, as in the K user real setting, where we allow f_{max,k}(P) to scale as O((√P)^{α_k}) for some α_k ∈ [0, 1], here also we allow the density functions of each user to scale at different rates, representing different amounts of CSIT feedback, so that for the unknown channel coefficients associated with user k we have the peak density constraint f_{max,k}(P) = O((√P)^{α_k}). The generalization of the proof to the complex setting is, for the most part, straightforward based on the K user case studied earlier. To avoid repetition, here we focus only on the differences.
In DoF terms, we need to bound each of the following difference of entropy terms, ∀k ∈ [2 : K]. We will bound these terms one at a time. The remainder of the proof will show that D_k ≤ α_k. Note that the functional dependence argument, which claims that Ȳ_k^[n] is a function of (Ȳ_{k−1}^[n], G^[n]), the definition of aligned image sets, and the bounding of the difference of entropies in terms of the size of aligned image sets, are all the same as in the real setting.
Now we bound the probability of alignment of images. For notational compactness, let us define the differences of the real and imaginary parts of the inputs. Without loss of generality, assume the alignment conditions for the real and imaginary parts hold, where the residual terms lie in (−1, 1). Thus, for all t ∈ [1 : n] such that x_{j*(t),R}(t) ≠ x′_{j*(t),R}(t), the values of G_{kj*(t),R}(t) and G_{kj*(t),I}(t) must each lie within an interval of length no more than 4/|x_{j*(t),R}(t) − x′_{j*(t),R}(t)|. Therefore, the probability that the images due to ȳ^[n] and ȳ′^[n] align at user k is bounded as follows.
It will be useful to express the bound in terms of ȳ_R(t), ȳ′_R(t), ȳ_I(t) and ȳ′_I(t). To this end, let us proceed as follows.
Finally, combining (81), (83) and (96), we have the result.

VIII. CONCLUSION

Since CSIT is almost never available with infinite precision, the collapse of DoF under finite precision channel uncertainty is a sobering result that stands in stark contrast to the tremendous DoF gains shown to be possible with perfect channel knowledge [18], [25]. However, as evident from the conjecture of Lapidoth, Shamai and Wigger, the pessimistic outcome is not unexpected. In terms of practical implications, just like the extremely positive DoF results, the extremely negative DoF results should be taken with a grain of salt. The collapse of DoF under finite precision CSIT is very much due to the asymptotic nature of the DoF metric, and may not be directly representative of the finite SNR scenarios that are of primary concern in practice. From a technical perspective, the new outer bound technique offers hope for new insights through the study of more general forms of CSIT, such as finite precision versions of delayed [26], mixed [15], [16], topological [27], blind [6] and alternating [10] CSIT.

APPENDIX
The proof is in two parts. First, we prove that we can limit the inputs and outputs to integer values without reducing the DoF. Then, we will show that the long-term (per-codeword) power constraints can be replaced with short-term (per-symbol) power constraints without reducing the DoF. The proofs follow along the lines of similar proofs by Bresler and Tse in [22], are specialized to the broadcast setting, and fill in several details that are omitted in [22].

A. Integer Inputs and Outputs
Given codebooks with real codewords (X1^[n], X2^[n]) ∈ R^n × R^n for the canonical channel model, we show that the deterministic channel model with integer inputs X̄1^[n], X̄2^[n] and quantized outputs achieves the same DoF. Thus, removing noise and limiting the inputs to integer values, as done in the deterministic model, does not reduce DoF relative to the original canonical channel model. Define the integer inputs by discretizing X1^[n]. Taking a similar approach to [22, Lemma 5], we have that the difference between I(W1; Y1^[n] | G^[n]) and I(W1; Ȳ1^[n] | G^[n]) approaches 0 when normalized by (n/2) log(P), as first n and then P is sent to infinity.
Similarly, by defining X̄2^[n], we have that the difference between I(W2; Y2^[n] | G^[n]) and I(W2; Ȳ2^[n] | G^[n]) approaches 0 when normalized by (n/2) log(P), as first n and then P is sent to infinity. Thus, the deterministic channel with inputs (X̄1^[n], X̄2^[n]) and per-codeword power constraints achieves at least the same DoF as the original canonical channel model.

B. Per-Symbol Power Constraints
Given codewords (X̄1^[n], X̄2^[n]) for the deterministic channel with outputs (Ȳ1^[n], Ȳ2^[n]), such that the codewords satisfy per-codeword power constraints, define, ∀t ∈ [1 : n], new inputs that satisfy per-symbol power constraints. Now let us compare the rates achieved on the original channel to the rates achieved on the new channel, where the per-symbol correction terms take values in {−1, 0, 1}. To complete the proof we only need to show that Σ_{t=1}^{n} H(X̃_i(t)) ≤ n o(log(P)), i.e., these terms can contribute no more than 0 in the DoF sense. We will use the fact that any integer X can be written as Q⌊X/Q⌋ − Q·1(X < 0) + (X mod Q), where the integer Q is of the order of √P. Define ρ_m accordingly, where the expectation is over the messages, i.e., the choice of codewords, so that (140) holds. In the following derivation we will make use of the fact that ρ_m log(1/ρ_m) is an increasing function of ρ_m when ρ_m < 1/e, so for these ρ_m values one can replace ρ_m with ρ*_m to obtain an outer bound. Further, we will use the fact that the maximum value of ρ_m log(1/ρ_m) is 1/(e ln(2)). We bound the entropy term in (133) as follows.

C. The Canonical Channel Transformation

In addition to the channel vector for user 1, let us allow the CSIT to include the determinant of the channel matrix. This cannot reduce capacity, so it can only make the outer bound stronger.
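The integer decomposition used in the per-symbol power constraint argument above can be checked directly. Note that Python's floor-division convention makes X = Q⌊X/Q⌋ + (X mod Q) exact for negative X as well; the −Q·1(X < 0) correction in the text belongs to a truncating-division convention.

```python
def decompose(X, Q):
    """Write integer X as Q*floor(X/Q) + (X mod Q), with 0 <= (X mod Q) < Q.
    Python's // and % use the floor convention, so the identity holds
    exactly for negative X too."""
    q, r = X // Q, X % Q
    assert 0 <= r < Q and X == Q * q + r
    return q, r

Q = 31  # playing the role of the integer on the order of sqrt(P)
for X in range(-100, 101):
    decompose(X, Q)       # identity holds across positive and negative X
print(decompose(-5, 3))   # (-2, 1)
```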

G11 (t), G12 (t), det(G(t)) ∈ T , ∀t ∈ N (156)
Note that for continuous distributions, G(t) is not a function of T. With the available CSIT, suppose the transmitter sets its inputs as in (157), using det(G(t)). Substituting into (1), we obtain the canonical channel model (4). Noting that the transformation from X̄1(t), X̄2(t) to X1(t), X2(t) is invertible, and that the new power constraint (5) allows all feasible X̄1(t), X̄2(t), it is evident that the capacity of the channel in its canonical form cannot be smaller than that of the original channel. Thus, the canonical channel transformation is valid for our DoF outer bound.

Remark: It is not necessary to provide side information of the determinant of the channel matrix to the transmitter. One could also normalize the desired channel coefficient values to unity by scaling the received signals at the receivers, which would only scale the noise variance by a bounded amount that is inconsequential for DoF. We choose to provide the determinant as side information to the transmitter because, for a pessimistic outer bound that shows the collapse of DoF, including more CSIT only makes the result stronger: it shows that even this additional CSIT cannot prevent the collapse of DoF.
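The normalization alternative described in the remark can be sketched numerically; the channel values below are our own illustrations, chosen bounded away from zero.

```python
import random

# Sketch of the remark: instead of giving det(G) to the transmitter, user 1
# can scale its received signal by 1/g11 so that its own coefficient is 1.
# The noise is scaled by 1/g11, i.e., its variance by the bounded factor
# 1/|g11|^2, which cannot change the DoF.
rng = random.Random(4)
g11, g12 = 1.5, 0.7                      # illustrative realization
x1, x2 = 2.0, 3.0
z = rng.gauss(0, 1)
y1 = g11 * x1 + g12 * x2 + z             # original received signal
y1_scaled = y1 / g11                     # normalized form
# coefficient of x1 is now exactly 1; effective noise is z/g11
err = abs(y1_scaled - (x1 + (g12 / g11) * x2 + z / g11))
print(err)  # zero up to floating-point error
```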

Fig. 1.Canonical form of the 2 user MISO BC with perfect CSIT for user 1.

Fig. 2.Canonical form of the 3 user MISO BC with Graded Channel Uncertainty.

Fig. 3. Two codewords, corresponding to X1 = ν and X1 = γ, and their equivalence classes, S_ν and S_γ, containing all codewords (X1, X2) that have the same image at user 2 as the codewords corresponding to X1 = ν and X1 = γ, respectively. The partitioning into equivalence classes depends on the channel realization. The figure shows the distinct equivalence classes for two channel realizations, G and G′. Each codeword is labeled by its value on the X1 axis. So, for example, the codeword (X1, X2) = (ν, X2(ν)) is simply referred to as the codeword ν. This codeword belongs to the equivalence class S_ν(G) under the channel realization G, and to the equivalence class S_ν(G′) under the channel realization G′. Also note that two codewords that belong to the same equivalence class under one channel realization cannot belong to the same equivalence class under any other channel realization. For instance, codewords λ and ν belong to the same equivalence class S_ν(G) under channel realization G, but they belong to different equivalence classes, S_ν(G′) and S_γ(G′), under a different channel realization G′.