Channel Acquisition for Massive MIMO-OFDM with Adjustable Phase Shift Pilots

We propose adjustable phase shift pilots (APSPs) for channel acquisition in wideband massive multiple-input multiple-output (MIMO) systems employing orthogonal frequency division multiplexing (OFDM) to reduce the pilot overhead. Based on a physically motivated channel model, we first establish a relationship between channel space-frequency correlations and the channel power angle-delay spectrum in the massive antenna array regime, which reveals the channel sparsity in massive MIMO-OFDM. With this channel model, we then investigate channel acquisition, including channel estimation and channel prediction, for massive MIMO-OFDM with APSPs. We show that channel acquisition performance in terms of sum mean square error can be minimized if the user terminals' channel power distributions in the angle-delay domain can be made non-overlapping with proper phase shift scheduling. A simplified pilot phase shift scheduling algorithm is developed based on this optimal channel acquisition condition. The performance of APSPs is investigated for both one symbol and multiple symbol data models. Simulations demonstrate that the proposed APSP approach can provide substantial performance gains in terms of achievable spectral efficiency over the conventional phase shift orthogonal pilot approach in typical mobility scenarios.


I. INTRODUCTION
FORTHCOMING 5G cellular wireless systems are expected to support 1000 times faster data rates than the currently deployed 4G long-term evolution (LTE) system. To achieve the high data rates required by 5G, many technologies have been proposed [1]-[3]. Among them, massive multiple-input multiple-output (MIMO) systems, which deploy unprecedented numbers of antennas at the base stations (BSs) to simultaneously serve a relatively large number of user terminals (UTs), are believed to be one of the key candidate technologies for 5G [4]-[6].
Orthogonal frequency division multiplexing (OFDM) is a multi-carrier modulation technology suited for high data rate wideband wireless transmission [7], [8]. Due to its robustness to channel frequency selectivity and relatively efficient implementation, OFDM combined with massive MIMO is a promising technique for wideband massive MIMO transmission [4]. As in conventional MIMO-OFDM, the performance of massive MIMO-OFDM is highly dependent on the quality of the channel acquisition. Pilot design and channel acquisition for massive MIMO-OFDM are therefore of great practical importance.
Optimal pilot design and channel acquisition for conventional MIMO-OFDM have been extensively investigated in the literature. The most common approach is to estimate the channel response in the delay domain, and optimal pilots sent from different transmit antennas are typically assumed to satisfy the phase shift orthogonality condition in both the single-user case [9]-[11] and the multi-user case [12]. Note that such phase shift orthogonal pilots (PSOPs) have been adopted in LTE [13]. When channel spatial correlations are taken into account, optimal pilot design has been investigated for both the single-user case [14] and the multi-user case [15]. Although these orthogonal pilot approaches can eliminate pilot interference in the same cell, they do not take into account the pilot overhead, which is thought to be one of the limiting factors for throughput in massive MIMO-OFDM [4]. When such approaches are directly adopted in time-division duplex (TDD) massive MIMO-OFDM, the corresponding pilot overhead is proportional to the total number of UT antennas, and becomes prohibitively large as the number of UTs grows. This becomes the system bottleneck, especially in high mobility scenarios where pilots must be transmitted more frequently. Therefore, a pilot approach that takes the pilot overhead into account is of importance for massive MIMO-OFDM systems.
In this paper, we propose adjustable phase shift pilots (APSPs) for massive MIMO-OFDM to reduce the pilot overhead. For APSPs, one sequence along with different adjustable phase shifted versions of itself in the frequency domain are adopted as pilots for different UTs. The proposed APSPs are different from conventional PSOPs [9], [10], [12], in which phase shifts for different pilots are fixed, and phase shift differences between different pilots are no less than the maximum channel delay (divided by the system sampling duration) of all the UTs. Since in our approach the phase shifts for different pilots are adjustable, more pilots are available compared with conventional PSOPs, which leads to significantly reduced pilot overhead.
The proposed APSPs exploit the following two channel properties. First, wireless channels are sparse in many typical propagation scenarios; most channel power is concentrated in a finite region of delays and/or angles due to limited scattering [16]-[19]. Such channel sparsity can be resolved in the angle domain in massive MIMO due to the relatively large antenna array apertures, which has been observed in recent massive MIMO channel measurement results [20], [21]. Second, channel sparsity patterns, i.e., channel power distributions in the angle-delay domain, for different UTs are usually different.$^{1}$ For APSPs, when the phase shifts for pilots employed by different UTs are properly scheduled according to the above channel properties, channel acquisition can be achieved simultaneously in an almost interference-free manner, as with conventional PSOPs.
There has recently been increased research interest in utilizing channel sparsity for channel acquisition in massive MIMO. For instance, a time-frequency training scheme [25] and a distributed Bayesian channel estimation scheme [24] were proposed for massive MIMO-OFDM by exploiting the channel sparsity. As the approaches in [24] and [25] focus on channel acquisition for a single UT, the corresponding pilot overhead would still grow linearly with the number of UTs. Channel sparsity has also been exploited to mitigate pilot contamination in multi-cell massive MIMO [26], [27]. Note that compressive sensing has been applied to sparse channel acquisition in some recent works (see, e.g., [19], [22], [23], [28] and references therein), in which the corresponding pilot signals are usually assumed to be randomly generated. However, it is usually quite difficult to implement random pilot signals in practical systems [29]. For example, adopting large dimensional random pilot signals in the massive MIMO-OFDM systems considered here requires huge storage space and high complexity channel acquisition algorithms.
In addition, a low peak-to-average power ratio (PAPR) for randomly generated pilot signals usually cannot be guaranteed. These drawbacks can be mitigated via proper design of the deterministic sensing matrices (see, e.g., [30], [31] and references therein).
The main contributions of this paper are summarized as follows:
• Based on a physically motivated channel model, we establish a relationship between the space-frequency domain channel covariance matrix (SFCCM) and the channel power angle-delay spectrum for massive MIMO-OFDM. We show that when the number of BS antennas is sufficiently large, the eigenvectors of the SFCCMs for different UTs tend to be equal, while the eigenvalues depend on the respective channel power angle-delay spectra, which reveals the channel sparsity in the angle-delay domain. Then we propose the angle-delay domain channel response matrix (ADCRM) and the corresponding angle-delay domain channel power matrix (ADCPM), which can model the massive MIMO-OFDM channel sparsity in the angle-delay domain, and are convenient for further analyses.
• With the presented channel model, we propose APSP-based channel acquisition (APSP-CA) for massive MIMO-OFDM in TDD mode. For APSPs, equivalent channels for different UTs will experience corresponding cyclic shifts in the delay domain. Using this property, we show that the sum mean square error (MSE) of channel estimation (MSE-CE) can be minimized if the UTs' channel power distributions in the angle-delay domain can be made non-overlapping with proper pilot phase shift scheduling. Taking the time-varying nature of the channel into account, we further investigate channel prediction during the data segment using the received pilot signals. We show that the sum MSE of channel prediction (MSE-CP) can also be minimized if the UTs' channel power distributions in the angle-delay domain can be made non-overlapping with proper pilot phase shift scheduling, which coincides with the optimal channel estimation condition. A simplified pilot phase shift scheduling algorithm is developed based on this optimal channel acquisition condition. The proposed APSP-CA approach is investigated for cases involving both one symbol and multiple consecutive symbols.
• The proposed APSP-CA is evaluated in several typical propagation scenarios, and significant performance gains in terms of achievable spectral efficiency over the conventional PSOP-based channel acquisition (PSOP-CA) are demonstrated, especially in high mobility scenarios.
Portions of this work previously appeared in the conference paper [32].
$^{1}$There has been recent work that considers channels with a sparse common support [22], [23]. However, for massive MIMO channels, the common support assumption might not hold due to the increased angle resolution [22], [24]. Thus, in this work we assume that the channel sparsity patterns of different UTs are different (but not necessarily totally different), although the proposed APSP approach can also be applied to the common support cases.

A. Notations
We adopt the following notation throughout the paper. We use $\bar{\jmath} = \sqrt{-1}$ to denote the imaginary unit. $\lfloor x \rfloor$ ($\lceil x \rceil$) denotes the largest (smallest) integer not greater (smaller) than $x$. $\langle \cdot \rangle_N$ denotes the modulo-$N$ operation. $\delta(\cdot)$ denotes the delta function. Upper (lower) case boldface letters denote matrices (column vectors). The notation $\triangleq$ is used for definitions. Notations $\sim$ and $\propto$ represent "distributed as" and "proportional to", respectively. We adopt $\mathbf{I}_N$ to denote the $N \times N$ dimensional identity matrix, and $\mathbf{I}_{N \times G}$ to denote the matrix composed of the first $G$ ($\le N$) columns of $\mathbf{I}_N$. We adopt $\mathbf{0}$ to denote the all-zero vector or matrix. The superscripts $(\cdot)^H$, $(\cdot)^T$, and $(\cdot)^*$ denote the conjugate-transpose, transpose, and conjugate operations, respectively.

B. Outline
The rest of the paper is organized as follows. In Section II, we investigate the sparse nature of the massive MIMO-OFDM channel model. In Section III, we propose APSP-CA over one OFDM symbol in massive MIMO-OFDM, including channel estimation and prediction. We investigate the multiple consecutive pilot symbol case in Section IV. Simulation results are presented in Section V, and conclusions are given in Section VI.

II. MASSIVE MIMO-OFDM CHANNEL MODEL
In this section, we propose a physically motivated massive MIMO-OFDM channel model, and investigate the inherent channel sparsity property. We consider a single-cell TDD wideband massive MIMO wireless system which consists of one BS equipped with $M$ antennas and $K$ single-antenna UTs. We denote the UT set as $\mathcal{K} = \{0, 1, \ldots, K-1\}$, where $k \in \mathcal{K}$ represents the UT index. We assume that the channels of different UTs are statistically independent. We assume that the BS is equipped with a one-dimensional uniform linear array (ULA),$^{2}$ with antennas separated by one-half wavelength. Then the BS array response vector corresponding to the incidence angle $\theta$ with respect to the perpendicular to the array is given by [17]
$$\mathbf{v}_{M,\theta} = \left[1,\ \exp(-\bar{\jmath}\pi\sin\theta),\ \ldots,\ \exp(-\bar{\jmath}\pi(M-1)\sin\theta)\right]^T \in \mathbb{C}^{M \times 1}. \tag{1}$$
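As an illustration of the array response vector for a half-wavelength ULA, the following minimal sketch evaluates it numerically and checks that responses sampled on the grid $\sin\theta_m = 2m/M - 1$ are mutually orthogonal, which is the property that later makes a DFT-like matrix the natural angular basis. The function names `ula_response` and `inner`, the array size $M = 8$, and the 30 degree test angle are illustrative assumptions, not values taken from the paper.

```python
import cmath
import math


def ula_response(num_antennas, theta):
    """Response of a half-wavelength ULA: element m equals exp(-j*pi*m*sin(theta))."""
    return [cmath.exp(-1j * math.pi * m * math.sin(theta)) for m in range(num_antennas)]


def inner(a, b):
    """Hermitian inner product a^H b of two equal-length complex vectors."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


M = 8  # hypothetical (small) number of BS antennas
v30 = ula_response(M, math.radians(30.0))

# Angles on the grid sin(theta_m) = 2m/M - 1 give mutually orthogonal responses.
grid = [math.asin(2.0 * m / M - 1.0) for m in range(M)]
g1 = ula_response(M, grid[1])
g3 = ula_response(M, grid[3])
print(abs(inner(g1, g3)), abs(inner(g1, g1)))
```

Every element has unit modulus, so the vector carries only phase information about the incidence angle; the grid orthogonality holds exactly for any $M$ because the phase difference between two grid points is a multiple of $2\pi/M$ per antenna.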
We assume that the signals seen at the BS are constrained to lie in the angle interval $\mathcal{A} = [-\pi/2, \pi/2]$, which can be achieved through the use of directional antennas at the BS, and thus no signal is received at the BS for incidence angles $\theta \notin \mathcal{A}$ [33]. We consider OFDM modulation with $N_c$ subcarriers, performed via the $N_c$-point inverse DFT operation, appended with a guard interval (a.k.a. cyclic prefix) of length $N_g$ ($\le N_c$) samples. We employ $T_{\mathrm{sym}} = (N_c + N_g) T_s$ and $T_c = N_c T_s$ to denote the OFDM symbol duration with and without the guard interval, respectively, where $T_s$ is the system sampling duration [13]. We assume that the guard interval length $T_g = N_g T_s$ is longer than the maximum channel delay of all the UTs [34], [35].
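We assume that the channels remain constant during one OFDM symbol, and evolve from symbol to symbol. We denote the uplink (UL) channel gain between the antenna of the $k$th UT and the $m$th antenna of the BS over OFDM symbol $\ell$ and subcarrier $n$ as $[\mathbf{g}_{k,\ell,n}]_m$. Using a physical channel modeling approach (see, e.g., [17], [36]-[39]), the channel response vector $\mathbf{g}_{k,\ell,n} \in \mathbb{C}^{M \times 1}$ can be described as
$$\mathbf{g}_{k,\ell,n} = \sum_{q=0}^{N_g-1} \int_{-\infty}^{\infty} \int_{-\pi/2}^{\pi/2} \mathbf{v}_{M,\theta} \exp\left(-\bar{\jmath} 2\pi \frac{q}{N_c} n\right) \exp\left(\bar{\jmath} 2\pi \nu \ell T_{\mathrm{sym}}\right) g_k(\theta, qT_s, \nu)\, d\theta\, d\nu \tag{2}$$
where $\mathbf{v}_{M,\theta}$ is given in (1), and $g_k(\theta, \tau, \nu)$ is the complex-valued joint angle-delay-Doppler channel gain function of UT $k$ corresponding to the incidence angle $\theta$, delay $\tau$, and Doppler frequency $\nu$. Note that the number of significant channel taps in the delay domain is usually limited, and smaller than $N_g$; i.e., $|g_k(\theta, qT_s, \nu)|$ is approximately 0 for most $q$. Since the locations of the significant channel taps in the delay domain are usually different for different UTs, we adopt (2) in this paper to obtain a general channel representation applicable for all the UTs. We write the $k$th UT's channel at OFDM symbol $\ell$ over all subcarriers as
$$\mathbf{G}_{k,\ell} \triangleq \left[\mathbf{g}_{k,\ell,0}\ \ \mathbf{g}_{k,\ell,1}\ \ \cdots\ \ \mathbf{g}_{k,\ell,N_c-1}\right] \in \mathbb{C}^{M \times N_c} \tag{3}$$
which will be referred to as the space-frequency domain channel response matrix (SFCRM). From (2), it is not hard to show that
$$\mathrm{vec}\{\mathbf{G}_{k,\ell}\} = \sum_{q=0}^{N_g-1} \int_{-\infty}^{\infty} \int_{-\pi/2}^{\pi/2} \left[\mathbf{f}_{N_c,q} \otimes \mathbf{v}_{M,\theta}\right] \exp\left(\bar{\jmath} 2\pi \nu \ell T_{\mathrm{sym}}\right) g_k(\theta, qT_s, \nu)\, d\theta\, d\nu \tag{4}$$
where $\mathbf{f}_{N_c,q} \triangleq [1, \exp(-\bar{\jmath}2\pi q/N_c), \ldots, \exp(-\bar{\jmath}2\pi q(N_c-1)/N_c)]^T$, $\otimes$ denotes the Kronecker product, and $\mathrm{vec}\{\cdot\}$ stacks the columns of a matrix. We assume that channels with different incidence angles, delays, and/or Doppler frequencies are uncorrelated [17], [38], [39]. We also assume that the temporal correlations and joint space-frequency domain correlations of the channels can be separated [35], [38], i.e.,
$$\mathsf{E}\{g_k(\theta, \tau, \nu)\, g_k^*(\theta', \tau', \nu')\} = S_k^{\mathrm{ADD}}(\theta, \tau, \nu)\, \delta(\theta - \theta')\, \delta(\tau - \tau')\, \delta(\nu - \nu') = S_k^{\mathrm{AD}}(\theta, \tau)\, S_k^{\mathrm{Dop}}(\nu)\, \delta(\theta - \theta')\, \delta(\tau - \tau')\, \delta(\nu - \nu') \tag{5}$$
where $S_k^{\mathrm{ADD}}(\theta, \tau, \nu)$, $S_k^{\mathrm{AD}}(\theta, \tau)$, and $S_k^{\mathrm{Dop}}(\nu)$ represent the power angle-delay-Doppler spectrum, power angle-delay spectrum, and power Doppler spectrum of UT $k$, respectively [17], [40].
From (4) and (5), we can obtain the following channel statistical property (see Appendix A for the derivations)
$$\mathsf{E}\left\{\mathrm{vec}\{\mathbf{G}_{k,\ell+\Delta_\ell}\}\, \mathrm{vec}^H\{\mathbf{G}_{k,\ell}\}\right\} = \varrho_k(\Delta_\ell)\, \mathbf{R}_k \tag{6}$$
where $\varrho_k(\Delta_\ell)$ is the channel temporal correlation function (TCF) given by
$$\varrho_k(\Delta_\ell) \triangleq \int_{-\infty}^{\infty} \exp\left(\bar{\jmath} 2\pi \nu \Delta_\ell T_{\mathrm{sym}}\right) S_k^{\mathrm{Dop}}(\nu)\, d\nu \tag{7}$$
and $\mathbf{R}_k$ is the space-frequency domain channel covariance matrix (SFCCM) given by
$$\mathbf{R}_k \triangleq \sum_{q=0}^{N_g-1} \int_{-\pi/2}^{\pi/2} \left[\mathbf{f}_{N_c,q} \otimes \mathbf{v}_{M,\theta}\right] \left[\mathbf{f}_{N_c,q} \otimes \mathbf{v}_{M,\theta}\right]^H S_k^{\mathrm{AD}}(\theta, qT_s)\, d\theta. \tag{8}$$
In this work, we consider the widely accepted Clarke-Jakes channel power Doppler spectrum,$^{3}$ with the corresponding channel TCF given by [40], [41]
$$\varrho_k(\Delta_\ell) = J_0\left(2\pi \nu_k T_{\mathrm{sym}} \Delta_\ell\right) \tag{9}$$
where $J_0(\cdot)$ is the zeroth-order Bessel function of the first kind, and $\nu_k$ is the Doppler frequency of UT $k$. Note that the TCF associated with the Clarke-Jakes power Doppler spectrum is an even function, i.e., $\varrho_k(\Delta_\ell) = \varrho_k(-\Delta_\ell)$, and satisfies $\varrho_k(0) = 1$. Also, we assume that according to the law of large numbers, the channel elements exhibit a joint Gaussian distribution, i.e., $\mathrm{vec}\{\mathbf{G}_{k,\ell}\} \sim \mathcal{CN}(\mathbf{0}, \mathbf{R}_k)$. Before proceeding, we investigate in the following proposition a property of the large dimensional SFCCM, and present a relationship between the SFCCM and the power angle-delay spectrum for massive MIMO-OFDM channels.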
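Proposition 1: Define $\mathbf{V}_M \in \mathbb{C}^{M \times M}$ as $[\mathbf{V}_M]_{i,j} \triangleq \frac{1}{\sqrt{M}} \exp\left(-\bar{\jmath} 2\pi \frac{i (j - M/2)}{M}\right)$, and $\boldsymbol{\Omega}_k \in \mathbb{R}^{M \times N_g}$ as
$$[\boldsymbol{\Omega}_k]_{i,j} \triangleq M N_c\, (\theta_{i+1} - \theta_i)\, S_k^{\mathrm{AD}}(\theta_i, \tau_j) \tag{10}$$
where $\theta_m \triangleq \arcsin(2m/M - 1)$, and $\tau_n \triangleq nT_s$. Then when the number of antennas $M \to \infty$, the SFCCM $\mathbf{R}_k$ tends to $(\mathbf{F}_{N_c \times N_g} \otimes \mathbf{V}_M)\, \mathrm{diag}\{\mathrm{vec}\{\boldsymbol{\Omega}_k\}\}\, (\mathbf{F}_{N_c \times N_g} \otimes \mathbf{V}_M)^H$, where $\mathbf{F}_{N_c \times N_g}$ denotes the first $N_g$ columns of the $N_c$-dimensional unitary DFT matrix, in the sense that, for fixed non-negative integers $i$ and $j$,
$$\lim_{M \to \infty} \left[\mathbf{R}_k - (\mathbf{F}_{N_c \times N_g} \otimes \mathbf{V}_M)\, \mathrm{diag}\{\mathrm{vec}\{\boldsymbol{\Omega}_k\}\}\, (\mathbf{F}_{N_c \times N_g} \otimes \mathbf{V}_M)^H\right]_{i,j} = 0. \tag{11}$$
Proof: See Appendix B.
The relationship between the space-frequency domain channel joint correlation property and the channel power distribution in the angle-delay domain for massive MIMO-OFDM is established in Proposition 1. Specifically, for massive MIMO-OFDM channels in the asymptotically large array regime, the eigenvectors of the SFCCMs for different UTs tend to be the same, which shows that massive MIMO-OFDM channels can be asymptotically decorrelated by the fixed space-frequency domain statistical eigendirections, while the eigenvalues depend on the corresponding channel power angle-delay spectra.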
Proposition 1 indicates that, for massive MIMO-OFDM channels, when the number of BS antennas $M$ is sufficiently large, the SFCCM can be well approximated by
$$\mathbf{R}_k \approx (\mathbf{F}_{N_c \times N_g} \otimes \mathbf{V}_M)\, \mathrm{diag}\{\mathrm{vec}\{\boldsymbol{\Omega}_k\}\}\, (\mathbf{F}_{N_c \times N_g} \otimes \mathbf{V}_M)^H. \tag{12}$$
$^{3}$Although the waves impinging on the BS are assumed to be sparsely distributed in the angle domain due to limited scattering around the BS (typically mounted at an elevated position), the waves departing the mobile UTs are usually uniformly distributed in angle of departure. Thus the Clarke-Jakes spectrum is suitable to model the time variation of the channel [40], [41].
It is worth noting that the approximation in (12) is consistent with existing results in the literature. For frequency-selective single-input single-output channels, (12) agrees with the results in [35], [42]. For frequency-flat massive MIMO channels, the approximation given in (12) has been shown to be accurate enough for a practical number of antennas, which usually ranges from 64 to 512 [27], [33], [43], [44], and a detailed numerical example can be found in [27]. Since the SFCCM model given in (12) is a good approximation to the more complex physical channel model in (8) when the number of BS antennas is sufficiently large, we will exclusively use the simplified SFCCM model in (12) in the rest of the paper.
Realistic wireless channels are usually not wide-sense stationary [17], i.e., $\mathbf{R}_k$ varies as time evolves, although with a relatively large time scale.$^{4}$ In practice, acquisition of the large dimensional $\mathbf{R}_k$ is rather difficult and resource-intensive for massive MIMO-OFDM. However, when we shift our focus from the space-frequency domain to the angle-delay domain, the problem can be significantly simplified. Motivated by the eigenvalue decomposition of the SFCCM given in (12), we decompose the SFCRM as follows
$$\mathbf{G}_{k,\ell} = \mathbf{V}_M \mathbf{H}_{k,\ell} \mathbf{F}_{N_c \times N_g}^T \tag{13}$$
where
$$\mathbf{H}_{k,\ell} \triangleq \mathbf{V}_M^H \mathbf{G}_{k,\ell} \mathbf{F}_{N_c \times N_g}^* \in \mathbb{C}^{M \times N_g} \tag{14}$$
is referred to as the angle-delay domain channel response matrix (ADCRM) of UT $k$ at OFDM symbol $\ell$. In the following proposition, we derive a statistical property of the ADCRM.
Proposition 2: For massive MIMO-OFDM channels, when the number of antennas $M \to \infty$, elements of the ADCRM $\mathbf{H}_{k,\ell}$ satisfy, for fixed non-negative integers $i$ and $j$,
$$\lim_{M \to \infty} \left[\mathsf{E}\left\{\mathrm{vec}\{\mathbf{H}_{k,\ell+\Delta_\ell}\}\, \mathrm{vec}^H\{\mathbf{H}_{k,\ell}\}\right\} - \varrho_k(\Delta_\ell)\, \mathrm{diag}\{\mathrm{vec}\{\boldsymbol{\Omega}_k\}\}\right]_{i,j} = 0 \tag{15}$$
where $\boldsymbol{\Omega}_k$ is given in (10).
Proof: See Appendix C. Proposition 2 shows that, for massive MIMO-OFDM channels, different elements of the ADCRM H k,ℓ are approximately mutually statistically uncorrelated, which lends the ADCRM in (14) its physical interpretation. Specifically, different elements of the ADCRM correspond to the channel gains for different incidence angles and delays, which can be resolved in massive MIMO-OFDM with a sufficiently large antenna array aperture. Note that [Ω k ] i,j corresponds to the average power of [H k ] i,j , and can describe the sparsity of the wireless channels in the angle-delay domain. Hereafter we will refer to Ω k as the angle-delay domain channel power matrix (ADCPM) of UT k. The dimension of the ADCPM Ω k is much smaller than that of the SFCCM R k , and most elements in Ω k are approximately zero due to the channel sparsity. In addition, Ω k is composed of the variances of independent angle-delay domain channel elements, and thus can be estimated in an element-wise manner. Therefore, in practice there will be enough resources for one to obtain an estimate of Ω k with guaranteed accuracy. In the rest of the paper, we will assume that the ADCPMs of all the UTs are known by the BS.
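Before we conclude this section, we define the extended ADCRM as follows
$$\bar{\mathbf{H}}_{k,\ell,(N_c)} \triangleq \left[\mathbf{H}_{k,\ell}\ \ \mathbf{0}_{M \times (N_c - N_g)}\right] \in \mathbb{C}^{M \times N_c}. \tag{16}$$
Similarly, the extended ADCPM, which corresponds to the power distribution of the extended ADCRM $\bar{\mathbf{H}}_{k,\ell,(N_c)}$, is defined as
$$\bar{\boldsymbol{\Omega}}_{k,(N_c)} \triangleq \left[\boldsymbol{\Omega}_k\ \ \mathbf{0}_{M \times (N_c - N_g)}\right] \in \mathbb{R}^{M \times N_c}. \tag{17}$$
Such definitions will be employed to simplify the analyses in the following sections.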

III. CHANNEL ACQUISITION WITH APSPS OVER ONE SYMBOL
Based on the sparse massive MIMO-OFDM channel model presented in the previous section, we propose APSP-CA for massive MIMO-OFDM, including channel estimation and prediction. In this section, we first investigate the case where the APSPs are sent over one OFDM symbol, while the multiple symbol case will be investigated in the next section.

A. APSPs over One Symbol
We assume that all the UTs are synchronized. During the UL pilot segment, namely, the $\ell$th OFDM symbol of each frame, all the UTs transmit the scheduled pilots simultaneously, and the space-frequency domain signal received at the BS can be represented as
$$\mathbf{Y}_\ell = \sum_{k=0}^{K-1} \mathbf{G}_{k,\ell} \mathbf{X}_k + \mathbf{Z}_\ell \in \mathbb{C}^{M \times N_c} \tag{18}$$
where $[\mathbf{Y}_\ell]_{i,j}$ denotes the received pilot signal at the $i$th antenna over the $j$th subcarrier, $\mathbf{G}_{k,\ell}$ is the SFCRM defined in (3), $\mathbf{X}_k = \mathrm{diag}\{\mathbf{x}_k\} \in \mathbb{C}^{N_c \times N_c}$ denotes the frequency domain pilot signal sent from the $k$th UT, $\mathbf{Z}_\ell$ is the additive white Gaussian noise (AWGN) matrix during the UL pilot segment with independent and identically distributed (i.i.d.) elements distributed as $\mathcal{CN}(0, \sigma_{\mathrm{ztr}})$, and $\sigma_{\mathrm{ztr}}$ is the noise power.
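The proposed APSP over one OFDM symbol for a given UT $k$ is given by
$$\mathbf{X}_k \triangleq \sqrt{\sigma_{\mathrm{xtr}}}\, \mathbf{D}_{\phi_k} \mathbf{X}, \quad \phi_k \in \{0, 1, \ldots, N_c - 1\} \tag{19}$$
where $\mathbf{D}_{\phi} \triangleq \mathrm{diag}\{[1, \exp(-\bar{\jmath}2\pi\phi/N_c), \ldots, \exp(-\bar{\jmath}2\pi\phi(N_c-1)/N_c)]\}$, $\mathbf{X} = \mathrm{diag}\{\mathbf{x}\} \in \mathbb{C}^{N_c \times N_c}$ satisfying $\mathbf{X}\mathbf{X}^H = \mathbf{I}_{N_c}$ is the basic pilot matrix shared by all UTs in the same cell, and $\sigma_{\mathrm{xtr}}$ is the pilot signal transmit power. The APSP signal given in (19) can be seen as a phase shifted version of $\sqrt{\sigma_{\mathrm{xtr}}}\mathbf{X}$ with phase shift $\phi_k$ in the frequency domain. Note that the proposed APSP has the same PAPR as that of $\mathbf{X}$ in the time domain, thus existing low PAPR sequence designs can be easily incorporated into our approach. In addition, as the basic pilot matrix $\mathbf{X}$ can be predetermined, only $\mathbf{X}$ and the pilot phase shift indices rather than the entire pilot matrices are required to be stored, and the required storage space can be significantly reduced. From (19), it can be readily obtained that, for $\forall k, k' \in \mathcal{K}$,
$$\mathbf{X}_{k'} \mathbf{X}_k^H = \sigma_{\mathrm{xtr}}\, \mathbf{D}_{\phi_{k'}} \mathbf{X} \mathbf{X}^H \mathbf{D}_{\phi_k}^H = \sigma_{\mathrm{xtr}}\, \mathbf{D}_{\phi_{k'} - \phi_k} \tag{20}$$
which indicates that cross correlations of the proposed APSPs for different UTs depend only on the associated phase shift difference. It is worth noting that, for conventional PSOPs, the phase shift differences for different pilots are set to satisfy the orthogonality condition, i.e., they are no less than the maximum channel delay $N_g$ in samples. However, for our APSPs, the phase shifts for different pilots are adjustable, and pilots for different UTs can even share the same phase shift, which leads to more available pilots, and thus pilot overhead can be significantly reduced.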
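The following minimal sketch illustrates two properties of APSPs stated above: the cross correlation of two pilots depends only on their phase shift difference, and a frequency domain phase ramp corresponds to a cyclic shift in the delay domain. The sign convention of the ramp, the chirp used as the unit-modulus basic sequence, the tap values, and the helper names (`dft`, `phase_ramp`, `apsp`) are assumptions made for illustration only.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive DFT (forward unnormalized, inverse scaled by 1/N)."""
    n_pts = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[q] * cmath.exp(sign * 2j * math.pi * n * q / n_pts) for q in range(n_pts))
           for n in range(n_pts)]
    return [v / n_pts for v in out] if inverse else out


def phase_ramp(n_c, phi):
    """Diagonal of the assumed phase shift matrix: exp(-j*2*pi*n*phi/n_c)."""
    return [cmath.exp(-2j * math.pi * n * phi / n_c) for n in range(n_c)]


def apsp(base, phi, power=1.0):
    """Pilot of one UT: sqrt(power) * ramp(phi) * base, element-wise over subcarriers."""
    return [math.sqrt(power) * r * b for r, b in zip(phase_ramp(len(base), phi), base)]


N_C = 16
base = [cmath.exp(1j * math.pi * n * n / N_C) for n in range(N_C)]  # hypothetical chirp
pilot_a, pilot_b = apsp(base, 3), apsp(base, 7)

# Cross correlation per subcarrier: depends only on the difference 7 - 3 = 4.
cross = [b * a.conjugate() for a, b in zip(pilot_a, pilot_b)]

# A phase ramp of index 5 on the frequency response shifts the delay taps by 5.
taps = [1.0, 0.5j, 0.25] + [0.0] * (N_C - 3)
freq = dft(taps)
shifted = dft([r * g for r, g in zip(phase_ramp(N_C, 5), freq)], inverse=True)
```

Because each pilot is the basic sequence multiplied by unit-modulus phases, only the basic sequence and one integer per UT need to be stored.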

B. Channel Estimation with APSPs
In this subsection, we investigate channel estimation during the pilot segment under the minimum MSE (MMSE) criterion using the proposed APSPs. Direct MMSE estimation of the SFCRM $\mathbf{G}_{k,\ell}$ requires information about the large dimensional SFCCM $\mathbf{R}_k$ and a large dimensional matrix inversion, which is difficult to implement in practice. However, with the sparse massive MIMO-OFDM channel model presented above, when we shift our focus from the space-frequency domain to the angle-delay domain, channel estimation can be greatly simplified. The BS can first estimate the ADCRM to obtain $\hat{\mathbf{H}}_{k,\ell}$; the SFCRM estimate is then readily obtained as $\hat{\mathbf{G}}_{k,\ell} = \mathbf{V}_M \hat{\mathbf{H}}_{k,\ell} \mathbf{F}_{N_c \times N_g}^T$ by exploiting the unitary equivalence between the angle-delay domain channels and the space-frequency domain channels given in (13), while the same MSE-CE performance is maintained. In the following, we focus on estimation of the ADCRM $\mathbf{H}_{k,\ell}$ under the MMSE criterion.
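Recalling (13), the received pilot signal at the BS in (18) can be rewritten as
$$\mathbf{Y}_\ell = \sum_{k'=0}^{K-1} \mathbf{V}_M \mathbf{H}_{k',\ell} \mathbf{F}_{N_c \times N_g}^T \mathbf{X}_{k'} + \mathbf{Z}_\ell.$$
After decorrelation and power normalization of $\mathbf{Y}_\ell$, the BS can obtain an observation of the UL channel $\mathbf{H}_{k,\ell}$, given by
$$\mathbf{Y}_{k,\ell} \triangleq \frac{1}{\sigma_{\mathrm{xtr}}} \mathbf{V}_M^H \mathbf{Y}_\ell \mathbf{X}_k^H \mathbf{F}_{N_c \times N_g}^* \overset{(a)}{=} \mathbf{H}_{k,\ell} + \sum_{k' \neq k} \mathbf{H}_{k',\ell} \mathbf{F}_{N_c \times N_g}^T \mathbf{D}_{\phi_{k'} - \phi_k} \mathbf{F}_{N_c \times N_g}^* + \frac{1}{\sigma_{\mathrm{xtr}}} \mathbf{V}_M^H \mathbf{Z}_\ell \mathbf{X}_k^H \mathbf{F}_{N_c \times N_g}^* \tag{22}$$
where (a) follows from (20), and $\mathbf{D}_{\phi}$ denotes the diagonal matrix with $n$th diagonal element $\exp(-\bar{\jmath}2\pi n\phi/N_c)$. Using the unitary transformation property, it can be readily shown that the pilot noise term in (22) exhibits a Gaussian distribution with i.i.d. elements distributed as $\mathcal{CN}(0, \sigma_{\mathrm{ztr}}/\sigma_{\mathrm{xtr}})$, and (22) can be simplified as
$$\mathbf{Y}_{k,\ell} = \mathbf{H}_{k,\ell} + \sum_{k' \neq k} \mathbf{H}_{k',\ell}^{\phi_{k'} - \phi_k} + \frac{1}{\sqrt{\rho_{\mathrm{tr}}}} \mathbf{Z}_{\mathrm{iid}} \tag{23}$$
where $\mathbf{H}_{k',\ell}^{\phi} \triangleq \mathbf{H}_{k',\ell} \mathbf{F}_{N_c \times N_g}^T \mathbf{D}_{\phi} \mathbf{F}_{N_c \times N_g}^*$ is the pilot interference term, $\rho_{\mathrm{tr}} \triangleq \sigma_{\mathrm{xtr}}/\sigma_{\mathrm{ztr}}$ is the signal-to-noise ratio (SNR) during the pilot segment, and $\mathbf{Z}_{\mathrm{iid}} \in \mathbb{C}^{M \times N_g}$ is the normalized AWGN matrix with i.i.d. elements distributed as $\mathcal{CN}(0, 1)$.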
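Note that the pilot interference term $\mathbf{H}_{k',\ell}^{\phi_{k'} - \phi_k}$ in (23) satisfies
$$\mathbf{H}_{k',\ell}^{\phi_{k'} - \phi_k} = \mathbf{H}_{k',\ell} \mathbf{F}_{N_c \times N_g}^T \mathbf{D}_{\phi_{k'} - \phi_k} \mathbf{F}_{N_c \times N_g}^* \overset{(a)}{=} \bar{\mathbf{H}}_{k',\ell,(N_c)} \mathbf{F}_{N_c} \mathbf{D}_{\phi_{k'} - \phi_k} \mathbf{F}_{N_c}^* \mathbf{I}_{N_c \times N_g} \overset{(b)}{=} \bar{\mathbf{H}}_{k',\ell,(N_c)} \boldsymbol{\Pi}_{N_c}^{\phi_{k'} - \phi_k} \mathbf{I}_{N_c \times N_g} \tag{24}$$
where $\mathbf{D}_{\phi}$ denotes the diagonal matrix with $n$th diagonal element $\exp(-\bar{\jmath}2\pi n\phi/N_c)$, $\boldsymbol{\Pi}_{N_c}^{n}$ denotes the permutation matrix that cyclically shifts the $N_c$ columns by $n$ positions, (a) follows from (16), and (b) follows from the permutation matrix definition given in Section I-A. Thus, the pilot interference term $\mathbf{H}_{k',\ell}^{\phi_{k'} - \phi_k}$ in (23) is a column truncated version of the extended ADCRM $\bar{\mathbf{H}}_{k',\ell,(N_c)}$ with a cyclic column shift, where the shift factor depends on the corresponding pilot phase shift difference $\phi_{k'} - \phi_k$.
Recalling Proposition 2, elements of the ADCRM $\mathbf{H}_{k',\ell}$ are statistically uncorrelated. Consequently, elements of the pilot interference term $\mathbf{H}_{k',\ell}^{\phi_{k'} - \phi_k}$, a column truncated copy of the extended $\mathbf{H}_{k',\ell}$ with cyclic column shift, are also statistically uncorrelated. Thus, using the same methodology as in the previous section, the corresponding power matrix of the pilot interference term is
$$\boldsymbol{\Omega}_{k'}^{\phi_{k'} - \phi_k} \triangleq \bar{\boldsymbol{\Omega}}_{k',(N_c)} \boldsymbol{\Pi}_{N_c}^{\phi_{k'} - \phi_k} \mathbf{I}_{N_c \times N_g} \tag{26}$$
which is a column truncated version of the extended ADCPM $\bar{\boldsymbol{\Omega}}_{k',(N_c)}$ defined in (17) with cyclic column shift $\phi_{k'} - \phi_k$. With the channel observation $\mathbf{Y}_{k,\ell}$ in (23), and the fact that the angle-delay domain channel elements are uncorrelated as derived in Proposition 2, the MMSE estimate $\hat{\mathbf{H}}_{k,\ell}$ can be obtained in an element-wise manner as follows [47]
$$[\hat{\mathbf{H}}_{k,\ell}]_{i,j} = \frac{[\boldsymbol{\Omega}_k]_{i,j}}{\sum_{k'=0}^{K-1} [\boldsymbol{\Omega}_{k'}^{\phi_{k'} - \phi_k}]_{i,j} + \frac{1}{\rho_{\mathrm{tr}}}}\, [\mathbf{Y}_{k,\ell}]_{i,j}. \tag{27}$$
Let $\tilde{\mathbf{H}}_{k,\ell} = \mathbf{H}_{k,\ell} - \hat{\mathbf{H}}_{k,\ell}$ be the angle-delay domain channel estimation error of the $k$th UT; then the corresponding MSE-CE can be obtained as
$$\epsilon_k^{\mathrm{CE}} \triangleq \sum_{i=0}^{M-1} \sum_{j=0}^{N_g-1} \mathsf{E}\left\{\left|[\tilde{\mathbf{H}}_{k,\ell}]_{i,j}\right|^2\right\} \overset{(a)}{=} \sum_{i=0}^{M-1} \sum_{j=0}^{N_g-1} \left\{[\boldsymbol{\Omega}_k]_{i,j} - \frac{[\boldsymbol{\Omega}_k]_{i,j}^2}{\sum_{k'=0}^{K-1} [\boldsymbol{\Omega}_{k'}^{\phi_{k'} - \phi_k}]_{i,j} + \frac{1}{\rho_{\mathrm{tr}}}}\right\} \tag{28}$$
where (a) follows from the orthogonality principle of MMSE estimation [47]. Before we proceed, we define the sum MSE-CE of all the UTs as
$$\epsilon^{\mathrm{CE}} \triangleq \sum_{k=0}^{K-1} \epsilon_k^{\mathrm{CE}}. \tag{29}$$
Due to the incurred pilot interference, performance of the APSP-based channel estimation might deteriorate. However, we will show in the following proposition that such effects can be eliminated with proper phase shift scheduling for different pilots.
Proposition 3: The sum MSE-CE $\epsilon^{\mathrm{CE}}$ is lower bounded by
$$\epsilon^{\mathrm{CE}} \ge \sum_{k=0}^{K-1} \sum_{i=0}^{M-1} \sum_{j=0}^{N_g-1} \left\{[\boldsymbol{\Omega}_k]_{i,j} - \frac{[\boldsymbol{\Omega}_k]_{i,j}^2}{[\boldsymbol{\Omega}_k]_{i,j} + \frac{1}{\rho_{\mathrm{tr}}}}\right\} \tag{30}$$
and the lower bound can be achieved under the condition that, for $\forall k, k' \in \mathcal{K}$ and $k \neq k'$,
$$\boldsymbol{\Omega}_k \odot \boldsymbol{\Omega}_{k'}^{\phi_{k'} - \phi_k} = \mathbf{0} \tag{31}$$
where $\odot$ denotes the element-wise (Hadamard) product.
Proof: See Appendix D.
Proposition 3 shows that with the proposed APSPs, the sum MSE-CE can be minimized when phase shifts for different pilots are properly scheduled according to the condition given in (31). The interpretation is intuitive. With frequency domain phase shifted pilots, equivalent channels will exhibit corresponding cyclic shifts in the delay domain, as seen from (24). If the equivalent channel power distributions in the angle-delay domain for different UTs can be made non-overlapping after pilot phase shift scheduling, the pilot interference effect can be eliminated, and the sum MSE-CE can be minimized.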
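The effect described by Proposition 3 can be checked numerically with a small sketch. It evaluates the element-wise MMSE error for a toy pair of ADCPMs, once with both UTs sharing the same phase shift and once with shifts that separate their power in the delay domain. The helper names, the toy power matrices, the cyclic shift direction, and the SNR value are assumptions chosen for illustration and are not taken from the paper.

```python
def extend_and_shift(power, shift, n_c):
    """Zero-pad each row to n_c columns, then cyclically shift columns right by `shift`."""
    out = []
    for row in power:
        padded = list(row) + [0.0] * (n_c - len(row))
        out.append([padded[(j - shift) % n_c] for j in range(n_c)])
    return out


def sum_mse_ce(powers, shifts, n_c, snr):
    """Sum over UTs and elements of p - p^2 / (total power seen at that element + 1/snr)."""
    total = 0.0
    for k, omega_k in enumerate(powers):
        # Power of every UT (including UT k itself) as seen in UT k's observation.
        seen = [extend_and_shift(om, shifts[kp] - shifts[k], n_c)
                for kp, om in enumerate(powers)]
        for i, row in enumerate(omega_k):
            for j, p in enumerate(row):  # only the first N_g columns are kept
                denom = sum(s[i][j] for s in seen) + 1.0 / snr
                total += p - p * p / denom
    return total


def lower_bound(powers, snr):
    """Interference-free sum MSE: each element only competes with the noise."""
    return sum(p - p * p / (p + 1.0 / snr) for om in powers for row in om for p in row)


# Two UTs, M = 2 angles, N_g = 2 delays, N_c = 4 subcarriers, identical support.
powers = [[[1.0, 0.0], [0.0, 0.0]], [[1.0, 0.0], [0.0, 0.0]]]
N_C, SNR = 4, 10.0
overlapping = sum_mse_ce(powers, [0, 0], N_C, SNR)
separated = sum_mse_ce(powers, [0, 1], N_C, SNR)
bound = lower_bound(powers, SNR)
```

In this toy case a shift difference of one delay tap is already enough to reach the bound, whereas a conventional orthogonal assignment would require a difference of at least $N_g$.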
Wireless channels are approximately sparse in the angle-delay domain in many practical propagation scenarios, and typically only a few elements of the ADCPM $\boldsymbol{\Omega}_k$ are dominant in massive MIMO-OFDM. When such channel sparsity is properly taken into account, the equivalent angle-delay domain channels for different UTs are almost non-overlapping with high probability, assuming proper pilot phase shifts. This suggests the feasibility of the proposed APSPs for massive MIMO-OFDM.
Note that the performance of the proposed APSP approach depends on the channel sparsity level. For the case where channels of different UTs have a sparse common support, with $s$ ($\le N_g$) representing the number of columns containing non-zero elements in the ADCPM [22], [23], the maximum number of UTs that can be served without pilot interference is $\lfloor N_c/s \rfloor$. However, for practical wireless channels, most of the channel elements in the angle-delay domain are close to zero rather than exactly zero, so the condition in (31) usually cannot be satisfied exactly, which leads to degradation of the channel acquisition performance. In such cases, the sparser the channels are, the better the performance that can be achieved by the proposed APSP approach.
Before we conclude this subsection, we remark that several existing pilot approaches satisfy the optimal condition given in Proposition 3. For the case where the channel sparsity property is not known, it is reasonable to assume that all the angle-delay domain channel elements are identically distributed, i.e., all the ADCPM elements are equal, in which case the optimal condition in (31) can be achieved when $|\phi_k - \phi_{k'}| \ge N_g$ for $\forall k \neq k'$, i.e., the extended channels in the delay domain for different UTs are totally separated, which coincides with the conventional PSOPs [12]. For frequency-flat massive MIMO channels, i.e., $N_c = 1$, the condition in (31) can be achieved when $\boldsymbol{\Omega}_k \odot \boldsymbol{\Omega}_{k'} = \mathbf{0}$ for $\forall k \neq k'$, i.e., different UTs can share the same pilot when the respective channels have non-overlapping support in the angle domain, which coincides with previous works such as [33], [43]. In our work, the proposed APSPs exploit the joint angle-delay domain channel sparsity in massive MIMO-OFDM, and are more efficient and general from the pilot overhead point of view.

C. Channel Prediction with APSPs
In the previous subsection, we investigated channel estimation during the pilot segment. Directly employing the pilot segment channel estimates in the data segment might not always be appropriate [48], especially in high mobility scenarios, which are the main focus of the APSPs. In this subsection, we investigate channel prediction during the data segment based on the received pilot signals, using the proposed APSPs.
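For frame-based massive MIMO-OFDM transmission, the BS utilizes the received signals during the pilot segment to acquire the channels in the current frame. If the pilot segment channel estimate $\hat{\mathbf{H}}_{k,\ell}$ is directly employed as the estimate of the channel $\mathbf{H}_{k,\ell+\Delta_\ell}$ during the data segment, the corresponding sum MSE-CE for a given delay $\Delta_\ell$ between the pilot symbol and data symbol can be written as
$$\epsilon^{\mathrm{CE}}(\Delta_\ell) = \sum_{k=0}^{K-1} \sum_{i=0}^{M-1} \sum_{j=0}^{N_g-1} \left\{[\boldsymbol{\Omega}_k]_{i,j} + \left(1 - 2\varrho_k(\Delta_\ell)\right) \frac{[\boldsymbol{\Omega}_k]_{i,j}^2}{\sum_{k'=0}^{K-1} [\boldsymbol{\Omega}_{k'}^{\phi_{k'} - \phi_k}]_{i,j} + \frac{1}{\rho_{\mathrm{tr}}}}\right\} \tag{32}$$
where $\boldsymbol{\Omega}_{k'}^{\phi_{k'} - \phi_k}$ is the power matrix of the pilot interference term, i.e., the extended ADCPM of UT $k'$ with cyclic column shift $\phi_{k'} - \phi_k$, truncated to its first $N_g$ columns. In high mobility scenarios, the channel TCF satisfies $\varrho_k(\Delta_\ell) \to 0$ for relatively large delay $|\Delta_\ell|$. When $\varrho_k(\Delta_\ell) < 1/2$, i.e., $1 - 2\varrho_k(\Delta_\ell) > 0$, it can be observed from (32) that the sum MSE-CE $\epsilon^{\mathrm{CE}}(\Delta_\ell)$ is even larger than the sum channel power $\sum_{k}\sum_{i}\sum_{j} [\boldsymbol{\Omega}_k]_{i,j}$, and channel estimation performance cannot be guaranteed, which motivates the need for channel prediction.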
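To give a feel for when the temporal correlation drops below the 1/2 threshold discussed above, the following sketch evaluates the Clarke-Jakes TCF $J_0(2\pi \nu_k T_{\mathrm{sym}} \Delta_\ell)$ with a plain power series for $J_0$ and searches for the first symbol delay at which it falls below one half. The Doppler frequency (roughly 250 km/h at a 2 GHz carrier) and the LTE-like symbol duration are hypothetical values, and the function names are illustrative.

```python
import math


def bessel_j0(x, terms=60):
    """Power series sum_{m>=0} (-1)^m (x/2)^(2m) / (m!)^2; adequate for moderate |x|."""
    q = (x / 2.0) ** 2
    term = total = 1.0
    for m in range(1, terms):
        term *= -q / (m * m)
        total += term
    return total


def jakes_tcf(doppler_hz, t_sym, delta):
    """Temporal correlation between two OFDM symbols that are `delta` symbols apart."""
    return bessel_j0(2.0 * math.pi * doppler_hz * t_sym * delta)


def first_delay_below_half(doppler_hz, t_sym, max_delta=1000):
    """Smallest non-negative symbol delay whose correlation is below 1/2, or None."""
    for delta in range(max_delta + 1):
        if jakes_tcf(doppler_hz, t_sym, delta) < 0.5:
            return delta
    return None


DOPPLER_HZ = 463.0  # hypothetical: about 250 km/h at a 2 GHz carrier
T_SYM = 71.4e-6     # hypothetical LTE-like OFDM symbol duration in seconds
delta_half = first_delay_below_half(DOPPLER_HZ, T_SYM)
print(delta_half, jakes_tcf(DOPPLER_HZ, T_SYM, delta_half))
```

Beyond this delay, reusing the pilot segment estimate unchanged yields an error larger than the channel power itself, which is the regime where prediction matters.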
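For channel prediction, the BS utilizes the received pilot signals as well as the channel TCF to get estimates of the channels during the data segment. Under the MMSE criterion, with the angle-delay domain channel property of massive MIMO-OFDM given in Proposition 2, it is not hard to show that an estimate of the ADCRM $\mathbf{H}_{k,\ell+\Delta_\ell}$ based on $\mathbf{Y}_{k,\ell}$ can be obtained in an element-wise manner as follows
$$[\hat{\mathbf{H}}_{k,\ell+\Delta_\ell}]_{i,j} = \varrho_k(\Delta_\ell)\, \frac{[\boldsymbol{\Omega}_k]_{i,j}}{\sum_{k'=0}^{K-1} [\boldsymbol{\Omega}_{k'}^{\phi_{k'} - \phi_k}]_{i,j} + \frac{1}{\rho_{\mathrm{tr}}}}\, [\mathbf{Y}_{k,\ell}]_{i,j}. \tag{33}$$
Recalling the pilot segment channel estimate in (27), it can be seen that
$$\hat{\mathbf{H}}_{k,\ell+\Delta_\ell} = \varrho_k(\Delta_\ell)\, \hat{\mathbf{H}}_{k,\ell} \tag{34}$$
which indicates that optimal channel estimates during the data segment can be easily obtained via prediction with initial channel estimates obtained during the pilot segment, and the complexity of channel prediction in massive MIMO-OFDM can be further reduced. Similar to (29), the sum MSE-CP for a given delay $\Delta_\ell$ between the data symbol and pilot symbol can be defined as
$$\epsilon^{\mathrm{CP}}(\Delta_\ell) \triangleq \sum_{k=0}^{K-1} \sum_{i=0}^{M-1} \sum_{j=0}^{N_g-1} \left\{[\boldsymbol{\Omega}_k]_{i,j} - \varrho_k^2(\Delta_\ell)\, \frac{[\boldsymbol{\Omega}_k]_{i,j}^2}{\sum_{k'=0}^{K-1} [\boldsymbol{\Omega}_{k'}^{\phi_{k'} - \phi_k}]_{i,j} + \frac{1}{\rho_{\mathrm{tr}}}}\right\}. \tag{35}$$
From (35), it can be seen that pilot interference will affect channel prediction performance similar to the channel estimation case. However, we will show in the following proposition that such effects can still be eliminated with proper pilot phase shift scheduling.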

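Proposition 4: The sum MSE-CP $\epsilon^{\mathrm{CP}}(\Delta_\ell)$, $\forall \Delta_\ell$, is lower bounded by
$$\epsilon^{\mathrm{CP}}(\Delta_\ell) \ge \sum_{k=0}^{K-1} \sum_{i=0}^{M-1} \sum_{j=0}^{N_g-1} \left\{[\boldsymbol{\Omega}_k]_{i,j} - \varrho_k^2(\Delta_\ell)\, \frac{[\boldsymbol{\Omega}_k]_{i,j}^2}{[\boldsymbol{\Omega}_k]_{i,j} + \frac{1}{\rho_{\mathrm{tr}}}}\right\} \tag{36}$$
and the lower bound can be achieved under the condition that, for $\forall k, k' \in \mathcal{K}$ and $k \neq k'$,
$$\boldsymbol{\Omega}_k \odot \boldsymbol{\Omega}_{k'}^{\phi_{k'} - \phi_k} = \mathbf{0}. \tag{37}$$
Proof: The proof is similar to that of Proposition 3, and is omitted for brevity.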
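The gain of prediction over directly reusing the pilot segment estimate can be made concrete for a single angle-delay element. The sketch below, under the element-wise model of this subsection, computes both per-element errors; their difference equals $(1-\varrho)^2$ times the common gain term, so prediction is never worse. The function name and the numerical values are illustrative assumptions.

```python
def per_element_mse(power, interference, snr, rho):
    """Return (outdated, predicted) MSE for one angle-delay element.

    outdated : pilot-segment MMSE estimate reused unchanged at the data symbol
    predicted: the same estimate scaled by the temporal correlation rho
    """
    denom = power + interference + 1.0 / snr
    gain = power * power / denom
    outdated = power + (1.0 - 2.0 * rho) * gain
    predicted = power - rho * rho * gain
    return outdated, predicted


# Hypothetical element: unit power, no pilot interference, 10 dB pilot SNR.
for rho in (1.0, 0.7, 0.3, 0.0):
    outdated, predicted = per_element_mse(1.0, 0.0, 10.0, rho)
    print(rho, round(outdated, 4), round(predicted, 4))
```

For a correlation of one the two errors coincide, and as the correlation approaches zero the predicted error approaches the element power while the outdated error exceeds it.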

D. Frame Structure
There exist two typical frame structures for TDD massive MIMO transmission [49]. One type of frame structure (which will be referred to as type-A) begins with the UL pilot segment, followed by the UL and downlink (DL) data segments, as shown in Fig. 1(a). In the second type (which will be referred to as type-B), the UL pilot segment is placed between the UL data segment and DL data segment, as shown in Fig. 1(b). For the proposed APSP approach, the delay between the tail-end symbols of the data segment and the pilot segment will be longer than that of the PSOP approach due to the reduced pilot segment length. In addition, the APSP approach focuses on high mobility scenarios where channels vary relatively quickly. Thus the type-B frame structure is well-suited for the proposed APSP approach.

E. Pilot Phase Shift Scheduling
In the previous subsections, we investigated channel estimation and prediction for massive MIMO-OFDM with APSPs, and obtained the optimal pilot phase shift scheduling condition applicable to both channel estimation and prediction. Although such an optimal condition cannot always be met in practice, pilot phase shift scheduling can still be beneficial. Several scheduling criteria can be adopted. For example, if we schedule the pilot phase shifts based on the MMSE-CE criterion, the problem can be formulated as
$$\mathop{\arg\min}_{\{\phi_k :\, k \in \mathcal{K}\}}\ \epsilon^{\mathrm{CE}} \tag{38}$$
where $\epsilon^{\mathrm{CE}}$ is defined in (29). Such a scheduling problem is combinatorial, and optimal solutions must be found through an exhaustive search. Note that the optimal phase shift scheduling conditions for channel estimation and prediction are the same, thus the solution of problem (38) can also be expected to perform well under the MMSE-CP criterion.

Algorithm 1 Pilot Phase Shift Scheduling Algorithm
Input: The UT set $\mathcal{K}$ and the corresponding ADCPMs $\{\mathbf{\Omega}_k : k \in \mathcal{K}\}$; the preset threshold $\gamma$
Output: Pilot phase shift pattern $\{\phi_k : k \in \mathcal{K}\}$
1: Initialization: $\phi_0 = 0$, scheduled UT set $\mathcal{K}^{\mathrm{sch}} = \{0\}$, unscheduled UT set $\mathcal{K}^{\mathrm{un}} = \mathcal{K} \setminus \mathcal{K}^{\mathrm{sch}}$
2: for $k \in \mathcal{K}^{\mathrm{un}}$ do
3: Search for a phase shift $\phi$ for which the overlap function between the ADCPM of UT $k$ and the ADCPMs of the scheduled UTs is smaller than $\gamma$
4: If $\phi$ cannot be found in step 3, then set $\phi$ to the phase shift that minimizes this overlap function
5: Update $\phi_k = \phi$, $\mathcal{K}^{\mathrm{sch}} \leftarrow \mathcal{K}^{\mathrm{sch}} \cup \{k\}$, $\mathcal{K}^{\mathrm{un}} \leftarrow \mathcal{K}^{\mathrm{un}} \setminus \{k\}$
6: end for

Motivated by the optimal condition for channel estimation and prediction obtained in the previous subsections, a simplified pilot phase shift scheduling algorithm can be developed. We first define, in (39), a function that measures the degree of overlap between two real matrices $\mathbf{A}, \mathbf{B} \in \mathbb{R}^{M \times N}$. From the Cauchy-Schwarz inequality, it follows that the overlap degree function in (39) is bounded. In our algorithm, we preset a threshold to balance the tradeoff between the algorithm complexity and channel acquisition performance. Specifically, we schedule the pilot phase shifts for different UTs to make the overlap function between the ADCPMs for different UTs smaller than the preset threshold $\gamma$. Intuitively, the smaller the threshold $\gamma$, the better the channel acquisition performance will be, although with a higher algorithm complexity. The description of the proposed algorithm is summarized in Algorithm 1.
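The greedy procedure of Algorithm 1 can be sketched as follows. Since the displayed expressions are not reproduced here, the sketch makes its own assumptions: the overlap function is taken to be the normalized inner product of two real matrices (which is bounded by the Cauchy-Schwarz inequality), a phase shift is modeled as a cyclic shift of the delay axis of the zero-padded ADCPM, and UTs are scheduled in index order. Function names are hypothetical.

```python
import numpy as np

def shifted(omega, shift, n_c):
    """Zero-pad the delay axis to n_c, roll by `shift`, truncate to Ng."""
    m, n_g = omega.shape
    padded = np.zeros((m, n_c))
    padded[:, :n_g] = omega
    return np.roll(padded, shift, axis=1)[:, :n_g]

def overlap(a, b):
    """Normalized inner product of two real matrices (assumed form)."""
    denom = np.sqrt(np.sum(a**2) * np.sum(b**2))
    return float(np.sum(a * b) / denom) if denom > 0 else 0.0

def schedule_phase_shifts(omegas, n_c, gamma):
    """Greedy scheduling: accept the first shift whose overlap with the
    already scheduled UTs is at most gamma, else the minimizing shift."""
    phis = [0]  # step 1: UT 0 is scheduled with zero phase shift
    for k in range(1, len(omegas)):
        best_phi, best_val = 0, np.inf
        for phi in range(n_c):
            # Aggregate power of the scheduled UTs as seen by UT k.
            agg = sum(shifted(omegas[j], phis[j] - phi, n_c) for j in range(k))
            val = overlap(omegas[k], agg)
            if val < best_val:
                best_phi, best_val = phi, val
            if val <= gamma:
                break  # step 3: threshold met, stop searching
        phis.append(best_phi)  # step 4 fallback: smallest overlap seen
    return phis

# Toy example: three UTs, each occupying two delay taps, with N_c = 6.
omegas = [np.ones((2, 2)) for _ in range(3)]
phis = schedule_phase_shifts(omegas, n_c=6, gamma=1e-4)
```

Each UT needs at most $N_c$ overlap evaluations, so the cost grows linearly with the number of UTs rather than exponentially as in an exhaustive search.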

IV. CHANNEL ACQUISITION WITH APSPS OVER MULTIPLE SYMBOLS
In the previous section, we investigated channel acquisition for massive MIMO-OFDM with the proposed APSPs over one OFDM symbol. In some cases, pilots over one symbol might not be sufficient to accommodate a large number of UTs. In this section, we extend the use of APSPs to the case of multiple consecutive OFDM symbols.
We assume that the pilots are sent over $Q$ consecutive OFDM symbols starting with the $\ell$th symbol in each frame. In practice, the pilot segment length $Q$ is usually short, and we adopt the widely accepted assumption that the channels remain constant during the pilot segment [10]-[12]. Then the signals received by the BS during the pilot segment, $\mathbf{Y}_{\ell,(Q)}$, can be written as in (40), where $\mathbf{Y}_{\ell}$ represents the received pilot signal at the BS during the $\ell$th symbol, $\mathbf{X}_{k,(Q)} \triangleq [\mathbf{X}_{k,0}\ \mathbf{X}_{k,1}\ \ldots\ \mathbf{X}_{k,Q-1}]$ represents the pilot signals, $\mathbf{X}_{k,q} = \mathrm{diag}\{\mathbf{x}_{k,q}\} \in \mathbb{C}^{N_c \times N_c}$ represents the signal sent from the $k$th UT during the $q$th symbol of the pilot segment, and $\mathbf{Z}_{\ell,(Q)}$ is AWGN with i.i.d. elements distributed as $\mathcal{CN}(0, \sigma_{\mathrm{ztr}})$, where $\sigma_{\mathrm{ztr}}$ is the noise power.
Recalling (19), the maximum adjustable phase shift for different pilots over one OFDM symbol is $N_c - 1$. For the $Q$ pilot symbol case, the maximum adjustable pilot phase shift can be extended to $QN_c - 1$. By exploiting the modulo operation, we construct the APSPs over multiple OFDM symbols as in (41), where $\mathbf{U}$ is an arbitrary $Q \times Q$ dimensional unitary matrix, and $\mathbf{X}_{\lfloor \phi_k/Q \rfloor}$ is the APSP signal over one symbol defined in (19). Then the correlation property in (42), whose derivation uses (20), can be obtained for all $k, k' \in \mathcal{K}$. This shows that the available phase shifts for the $Q$ symbol case are divided into $Q$ groups for the proposed APSPs in (41), and the group index depends on the residue of the pilot phase shift $\phi$ with respect to the pilot segment length $Q$. Pilot interference can only affect the UTs using APSPs with phase shifts in the same group. For example, if $\langle \phi_{k'} \rangle_Q = \langle \phi_k \rangle_Q$, then phase shifts $\phi_{k'}$ and $\phi_k$ are within the same group, and the corresponding channel acquisition of UTs $k'$ and $k$ might be mutually affected. Given the APSP correlation property over multiple symbols in (42), the channel estimation and prediction operations can be performed similarly to the single-symbol case investigated in the previous section, and we will briefly discuss such issues below.
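The grouping of phase shifts by their residue modulo $Q$ can be checked numerically. The sketch below is one possible construction consistent with the description above, not necessarily the exact form of the paper's construction: the single-symbol pilot is assumed to be a unit-modulus base sequence with a linear phase ramp, the row of the unitary matrix selected by the residue weights that pilot across the $Q$ symbols, and a normalized DFT matrix is used as the unitary matrix. Diagonal pilot matrices are stored as vectors, and all names are hypothetical.

```python
import numpy as np

def phase_shift_pilot(base, shift):
    """Single-symbol pilot: base sequence with a linear phase ramp
    (assumed form of a phase shift pilot)."""
    n = np.arange(base.size)
    return base * np.exp(-2j * np.pi * n * shift / base.size)

def multi_symbol_pilot(base, phi, unitary):
    """Pilot over Q symbols: the residue of phi modulo Q selects a row
    of the unitary matrix, and floor(phi / Q) is the per-symbol shift."""
    q = unitary.shape[0]
    group, inner = phi % q, phi // q
    x = phase_shift_pilot(base, inner)
    return np.concatenate([unitary[group, s] * x for s in range(q)])

def pilot_correlation(xa, xb, q):
    """Diagonal of the sum over symbols of X_a X_b^H."""
    a, b = xa.reshape(q, -1), xb.reshape(q, -1)
    return np.sum(a * np.conj(b), axis=0)

q, n_c = 2, 8
unitary = np.fft.fft(np.eye(q)) / np.sqrt(q)  # normalized DFT matrix
base = np.ones(n_c, dtype=complex)

x4 = multi_symbol_pilot(base, 4, unitary)  # group 0, per-symbol shift 2
x5 = multi_symbol_pilot(base, 5, unitary)  # group 1, per-symbol shift 2
x6 = multi_symbol_pilot(base, 6, unitary)  # group 0, per-symbol shift 3

cross_group = pilot_correlation(x4, x5, q)  # different residues
same_group = pilot_correlation(x4, x6, q)   # same residue
```

Pilots from different groups are orthogonal regardless of their per-symbol shifts, while pilots in the same group correlate exactly as the corresponding single-symbol pilots do.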
After decorrelation and power normalization with $\mathbf{Y}_{\ell,(Q)}$ given in (40), the BS can obtain an observation $\mathbf{Y}_{k,\ell,(Q)}$ of the pilot segment ADCRM $\mathbf{H}_{k,\ell}$ as in (43), where (a) follows from (42), $\rho_{\mathrm{tr}} \triangleq \sigma_{\mathrm{xtr}}/\sigma_{\mathrm{ztr}}$ is the pilot segment SNR, $\mathbf{Z}_{\mathrm{iid}}$ is the normalized AWGN matrix with i.i.d. elements distributed as $\mathcal{CN}(0, 1)$, and (b) follows from (24).
With the channel observation $\mathbf{Y}_{k,\ell,(Q)}$ in (43), the MMSE estimate of the ADCRM $\mathbf{H}_{k,\ell}$ can be readily obtained in an element-wise manner as in (44), and the corresponding sum MSE-CE is given by (45). In addition, prediction of the ADCRM $\mathbf{H}_{k,\ell+\Delta_\ell}$ based on $\mathbf{Y}_{k,\ell,(Q)}$ can be performed as in (46), and the corresponding sum MSE-CP for a given delay $\Delta_\ell$ is given by (47).
Based on the above sum MSE-CE and MSE-CP expressions for the multiple symbol APSP case, we can readily obtain the following proposition.
Proposition 5: The sum MSE-CE $\epsilon_{\mathrm{CE},(Q)}$ and the sum MSE-CP are lower bounded by (48) and (49), respectively. Both lower bounds can be achieved under a common condition, which must hold for all $k, k' \in \mathcal{K}$ with $k \neq k'$.
Proof: The proof is similar to that of Proposition 3, and is omitted for brevity.
Proposition 5 extends the single-symbol APSP case in the previous section to the multiple symbol case. Actually, when Q = 1, Proposition 5 reduces to the results in Proposition 3 and Proposition 4. The interpretation of Proposition 5 is straightforward. For multiple symbol APSPs, different pilot phase shifts are divided into several groups, and pilot interference only affects the UTs using the phase shifts within the same group. If pilot interference can be eliminated through proper phase shift scheduling in all the groups, then optimal channel estimation and prediction performance can be achieved. When the optimal pilot phase shift scheduling condition in Proposition 5 cannot be met, a straightforward extension of the pilot phase shift scheduling algorithm in the previous section can be applied. Specifically, the UT set can be divided into Q groups, and pilot phase shift scheduling can be performed within each UT group using Algorithm 1.
The tradeoff between channel acquisition performance and algorithm complexity can still be balanced with the preset threshold to determine the degree of allowable channel overlap.

V. NUMERICAL RESULTS
In this section, we present numerical simulations to evaluate the performance of the proposed APSP-CA in massive MIMO-OFDM. The major OFDM parameters, which are based on 3GPP LTE [46], are summarized in Table I. The massive MIMO-OFDM system considered is assumed to be equipped with a 128-antenna ULA at the BS with half wavelength antenna spacing. The number of UTs is set to K = 42 as in [4].
We consider channels with 20 taps in the delay domain, which exhibit an exponential power delay profile [18], [51], where $\varsigma_k$ denotes the channel delay spread of UT $k$. We assume that transmissions from all the UTs are synchronized [13], [18]. The $q$th channel tap of UT $k$ is assumed to exhibit a Laplacian power angle spectrum [18], [33], [51], $S^{\mathrm{ang}}_{k,q}(\theta) \propto \exp\left(-\sqrt{2}\,|\theta - \theta_{k,q}|/\varphi_{k,q}\right)$, where $\theta_{k,q}$ and $\varphi_{k,q}$ represent the corresponding mean angle of arrival (AoA) and angle spread for the given channel tap, respectively. We assume that the UTs are uniformly distributed in a 120° sector, and that the mean AoA $\theta_{k,q}$ is uniformly distributed. The propagation scenarios considered are based on [18], [38] and are summarized in Table II. We assume that all UTs exhibit the same Doppler, delay, and angle spread in the simulations.
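The two power profiles used in the simulation setup can be sketched as follows. The Laplacian power angle spectrum follows the expression in the text; the exponential power delay profile is assumed to take the common form $\exp(-\tau/\varsigma)$, since its displayed expression is not reproduced here. The tap spacing, delay spread, and angle values in the example are hypothetical, as are the function names.

```python
import numpy as np

def exponential_pdp(num_taps, tap_spacing, delay_spread):
    """Exponentially decaying tap powers, normalized to unit total power
    (assumed form exp(-tau / delay_spread))."""
    tau = np.arange(num_taps) * tap_spacing
    power = np.exp(-tau / delay_spread)
    return power / power.sum()

def laplacian_pas(theta, mean_aoa, angle_spread):
    """Unnormalized Laplacian power angle spectrum, as in the text."""
    return np.exp(-np.sqrt(2.0) * np.abs(theta - mean_aoa) / angle_spread)

# 20 delay taps; spacing and delay spread are in the same arbitrary unit.
pdp = exponential_pdp(num_taps=20, tap_spacing=1.0, delay_spread=4.0)

# Angular grid over a 120-degree sector, in radians.
theta = np.linspace(-np.pi / 3, np.pi / 3, 121)
pas = laplacian_pas(theta, mean_aoa=0.0, angle_spread=np.deg2rad(2.0))
```

The tap powers decay monotonically and the angular spectrum peaks at the mean AoA, so most of the channel power is concentrated in a small region of the angle-delay plane.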
With the above settings, we compare the performance of the proposed APSP-CA approach with that of the conventional PSOP-CA approach, which serves as the benchmark for comparison of channel acquisition performance. For the conventional PSOP-CA, the required pilot segment length is $Q = \lceil K/(N_c/N_g) \rceil = 3$ OFDM symbols [4]. For the proposed APSP-CA, the pilot segment length can be set to $Q = 1$ or $2$. We adopt Algorithm 1 to schedule the pilot phase shifts in the simulations, and the overlap threshold in the algorithm is set as $\gamma = 10^{-4}$. This algorithm is suboptimal in general compared with exhaustive search.
In Fig. 2, the pilot segment MSE-CE performance obtained with the proposed APSPs ($Q = 1$ and $2$) is compared with that of conventional PSOPs ($Q = 3$) under several typical propagation scenarios. All the simulated MSE results are normalized by the number of subcarriers $N_c$ and the number of UTs $K$. It can be observed that, in all the considered scenarios, the MSE-CE performance with APSPs approaches the performance obtained with PSOPs, while the pilot overhead is reduced by 66.7% ($Q = 1$) and 33.3% ($Q = 2$), respectively.
In Fig. 3, we compare the channel acquisition performance during the data segment in terms of MSE versus the delay $\Delta_\ell$ between the data symbol and pilot segment. Both the APSP-CA ($Q = 1$) and PSOP-CA ($Q = 3$) are evaluated. For APSPs, both the channel estimation and prediction MSE performance are calculated. It can be observed that the MSE-CP performance obtained with APSPs approaches that for PSOPs, with the pilot overhead reduced by 66.7%. In addition, with APSPs, channel prediction outperforms channel estimation in all the evaluated scenarios. Note that both the MSE-CE and the MSE-CP grow almost linearly with delay, and thus the channel acquisition performance can be improved when combined with the type-B frame structure, as shown in the following simulation results.
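The pilot length and overhead figures quoted above can be reproduced with a few lines of arithmetic. The values $N_c = 2048$ and $N_g = 144$ are assumed here as typical LTE parameters, since Table I is not reproduced in this section; the function names are hypothetical.

```python
import math

def psop_pilot_length(num_uts, n_c, n_g):
    """Pilot segment length (in OFDM symbols) needed by phase shift
    orthogonal pilots: ceil(K / (Nc / Ng))."""
    return math.ceil(num_uts / (n_c / n_g))

def overhead_reduction(q_new, q_ref):
    """Fractional pilot overhead reduction when using q_new instead of
    q_ref pilot symbols."""
    return 1.0 - q_new / q_ref

# K = 42 UTs; Nc = 2048 and Ng = 144 are assumed LTE-like values.
q_psop = psop_pilot_length(42, 2048, 144)
reductions = {q: round(100 * overhead_reduction(q, q_psop), 1) for q in (1, 2)}
```

With these assumed parameters the PSOP pilot length is three symbols, and using one or two APSP symbols instead gives the 66.7% and 33.3% reductions stated in the text.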
At the end of this section, we compare the achievable spectral efficiency of the proposed APSP and the conventional PSOP approaches. We assume that the frame length equals 500 µs as in [4], which is equal to the length of 7 OFDM symbols [46], and that UL and DL data transmission each occupy half of the data segment. For the conventional PSOP-CA approach, channel estimation and the type-A frame structure in Fig. 1(a) are adopted. For the proposed APSP-CA approach, APSPs ($Q = 1$) and channel prediction are adopted, and both type-A and type-B frame structures are considered. An MMSE receiver and precoder are employed for the UL and DL data transmissions, and the SNR is assumed to be equal to the pilot SNR. In Fig. 4, the achievable spectral efficiency of the APSP-CA and PSOP-CA approaches is depicted. It can be observed that the proposed APSP-CA approach shows a substantial gain in achievable spectral efficiency over the conventional PSOP-CA approach, especially in the high mobility regime where pilot overhead dominates and in the high SNR regime where pilot interference dominates. Specifically, in the high mobility SU scenario (250 km/h) with an SNR of 10 dB, the proposed APSPs can provide an average spectral efficiency gain of about 69% over the conventional PSOPs. In addition, the type-B frame structure can provide a gain of about 64% over the type-A frame structure when APSPs are adopted.

VI. CONCLUSION
In this paper, we proposed a channel acquisition approach with adjustable phase shift pilots (APSPs) for massive MIMO-OFDM to reduce the pilot overhead. We first investigated the channel sparsity in massive MIMO-OFDM based on a physically motivated channel model. With this channel model, we investigated channel estimation and prediction for massive MIMO-OFDM with APSPs, and provided an optimal pilot phase shift scheduling condition applicable to both channel estimation and prediction. We further developed a simplified pilot phase shift scheduling algorithm based on this optimal channel acquisition condition with APSPs. The proposed APSP-CA was investigated for implementations over both one and multiple symbols. Significant performance gains in terms of achievable spectral efficiency were observed for the proposed APSP-CA approach over the conventional PSOP-CA approach in several typical mobility scenarios.

APPENDIX A DERIVATION OF (6)
The derivation of (6) is detailed in (53), where (a) follows from (5), and (b) follows from the definition of the delta function.

APPENDIX B PROOF OF PROPOSITION 1
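We start by defining some auxiliary variables to simplify the derivations. We define $n_d \triangleq \lfloor d/M \rfloor$ and $m_d \triangleq \langle d \rangle_M$ for an arbitrary non-negative integer $d$, so that $d = n_d M + m_d$. Note that the element indices start from 0 in this paper. We can also obtain that, for matrices $\mathbf{F} \in \mathbb{C}^{N_c \times N_g}$ and $\mathbf{V} \in \mathbb{C}^{M \times M}$, $[\mathbf{F} \otimes \mathbf{V}]_{i,j} = [\mathbf{F}]_{n_i,n_j} [\mathbf{V}]_{m_i,m_j}$ from the definition of the Kronecker product. With the above definitions and related properties, the proof can be obtained as in (54), which involves terms of the form $[\mathbf{f}_{N_c,q} \otimes \mathbf{v}_{M,\theta}] \cdot \exp\left(\bar{\jmath} 2\pi\nu(\ell + \Delta_\ell) T_{\mathrm{sym}}\right) \cdot g_k(\theta, qT_s, \nu)\,\mathrm{d}\theta\,\mathrm{d}\nu$.
Before concluding the proof, we also have to show that both of the limits in the first equation of (54) exist and are finite. For this purpose, as can be seen from (e) of (54), we only need to show that the integral of $\exp\left(-\bar{\jmath} 2\pi (n_i - n_j) q / N_c\right) \cdot \exp\left(-\bar{\jmath} \pi (m_i - m_j) \sin\theta\right) \cdot S^{\mathrm{AD}}_k(\theta, \tau_q)$ with respect to $\theta$ is finite.

APPENDIX C PROOF OF PROPOSITION 2
To show (15), it suffices to establish an equivalent relation, which follows from the definition of $\mathbf{H}_{k,\ell}$ given in (14) through a chain of steps in which (a) follows from the fact that $\mathbf{F}_{N_c \times N_g}$ and $\mathbf{V}_M$ are both deterministic matrices, (b) follows from (6), and (c) follows from Proposition 1. This concludes the proof.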

APPENDIX D PROOF OF PROPOSITION 3
Because the elements of $\mathbf{\Omega}_{k'}^{\phi_{k'} - \phi_k}$ are nonnegative, we can obtain