# Efficient parallel carrier recovery for ultrahigh speed coherent QAM receivers with application to optical channels.

1. Introduction

The recent emergence of the updated IEEE 802.3 standards for 40 and 100 gigabit per second (Gb/s) Ethernet and G.709 for 40 and 100 Gb/s optical transport network (OTN), together with the first commercially available devices implementing these data rates, reveals the rapid growth in bandwidth demand over the last decade [1, 2].

The projected increase in bandwidth demand (e.g., [greater than or equal to] 100 Gb/s) has set the bases for the next generation of Ethernet and OTN and has therefore renewed interest in coherent detection and spectrally efficient modulation techniques such as M-ary phase-shift keying (M-PSK) and M-ary quadrature amplitude modulation (M-QAM). More precisely, the combination of intradyne coherent detection, polarization-division multiplexing (PDM), 16-QAM, and electronic dispersion compensation (EDC) [3, 4] achieves a good tradeoff among complexity, spectral efficiency, and minimization of nonlinear distortions, and makes it possible to compensate the main fiber channel impairments (i.e., polarization mode dispersion (PMD) and chromatic dispersion (CD) [5]) completely and with zero penalty [3]. In particular, intradyne detection is preferred over the alternative heterodyne or homodyne architectures because it replaces complex optical phase-locked loops (PLLs) with more robust and easier to implement digital carrier recovery (CR) techniques. In other words, all of these aspects can be summarized as an improved receiver sensitivity in comparison to intensity modulation direct detection (IM/DD) schemes [6, 7].

In this context, carrier phase recovery (CPR) fulfils a fundamental role in coherent optical receivers [3, 8]. Feedforward phase estimation schemes such as the Viterbi-Viterbi (VV) [9] or blind phase search (BPS) [10] algorithms have been proposed for optical coherent receivers because of their good laser linewidth tolerance and feasibility for parallel implementation. More specifically, significant amounts of CD lead to an enhancement of the phase noise introduced by the local oscillator and a lower tolerance with respect to carrier frequency offsets. In these feedforward CPR schemes, a perfect compensation of carrier frequency offset is assumed. However, this condition may not always be satisfied in practice. In fact, it has been shown that the phase error variance increases with the frequency offset, degrading the performance of the feedforward phase estimation stage [11]. Feedforward techniques to estimate and compensate frequency offset have been investigated in previous works [12-15]. Moreover, parallel architectures of these techniques are feasible for implementation in high-speed receivers. In particular, [15] has been conceived as a data-aided (DA) algorithm that uses training sequences to enhance the capture range up to nearly 1/(2T), with T being the symbol duration, whereas [12-14] are non-data-aided (NDA) algorithms with a capture range close to 1/(8T) for the 16-QAM scheme.

Although accurate frequency offset estimation and compensation can be carried out by well-known techniques, a static frequency offset has been assumed in all these proposals. As recently demonstrated, transmitter or local oscillator laser frequency instability caused by mechanical vibrations significantly degrades the performance of feedforward CPR algorithms [16]. Other effects such as power supply noise may also introduce laser frequency fluctuations, which can be modeled as a frequency modulation with a sinusoid of large amplitude (e.g., ~250 MHz) and low frequency (e.g., [less than or equal to] 35 kHz) [16]. The effectiveness of frequency offset estimation techniques, such as those mentioned earlier, is limited by the large amplitude of the modulation signal (i.e., the large laser frequency change rate). Recent publications have proposed architectures for compensation of laser frequency fluctuations when quadrature phase-shift keying (QPSK) modulation is used [2, 17, 18]. For example, a two-stage carrier recovery parallel architecture based on a low-latency parallel digital PLL (DPLL) and the feedforward VV CPR algorithm has been proposed in [17]. This technique offers an excellent tradeoff between complexity and performance for coherent QPSK receivers in the presence of laser phase noise, sinusoidal frequency jitter, and frequency offset. In this work, we generalize the technique introduced in [17] for application to M-QAM optical receivers.

As mentioned before, feedforward CPR blocks based on the VV or BPS algorithms achieve good laser linewidth tolerance and overcome some of the latency-related limitations [8]. We show here that traditional decision directed DPLLs [19] offer advantages in some aspects of the operation of CPR, for example, the tracking of the large amplitude sinusoidal carrier frequency jitter experienced by typical lasers. A traditional PLL is often modeled as a linear filter, an assumption that is useful to compute the small signal transfer function [19]. However, the PLL is actually a nonlinear filter, which precludes the use of the unfolding techniques discussed by Parhi in [20], since these are applicable only to strictly linear filters. Therefore, a different approach to reduce the latency of the PLL parallel implementation must be found.

In the present work we introduce a new parallel carrier recovery algorithm which combines a novel low-latency parallel DPLL with a traditional feedforward CPR algorithm. The new low-latency parallel DPLL is used to compensate not only frequency offset but also frequency fluctuations. The proposed DPLL approach takes out of the feedback loop as much processing as possible in order to simplify the loop and reduce its latency. Then, the bottleneck of the critical PLL feedback path is broken by using a novel approximation to the DPLL computation, which provides a capture range and bandwidth close to those achieved by serial DPLLs [17, 21]. Computer simulations demonstrate that the degradations caused by frequency offset and laser frequency fluctuations can be eliminated with the proposed parallel carrier recovery technique. Unlike the superscalar parallelization (SP) methods [22-25], the technique proposed here does not require training symbols to avoid the acquisition problem. Moreover, the buffers required by the SP scheme are completely avoided in our approach.

The remainder of the paper is organized as follows. Section 2 presents the system model and analyzes the effects of the carrier frequency fluctuations on the receiver performance. Section 3 describes the two-stage carrier recovery technique. Section 4 introduces the new low-latency parallel DPLL, while numerical results are shown and discussed in Section 5. Finally, conclusions are drawn in Section 6.

2. System Model

Figure 1 shows a simplified block diagram of the coherent receiver with electronic dispersion compensation. Then, the sample at the equalizer output can be expressed as

$r_n = a_n e^{j\alpha_n} + z_n$, (1)

where [a.sub.n] is the nth transmitted symbol and [[alpha].sub.n] is the total phase noise. Component [z.sub.n] represents the amplified spontaneous emission (ASE) noise sample, which is modeled as a white complex Gaussian random variable with power [[sigma].sup.2] [3]. The equalized output signal (1) can be rewritten as

$r_n = |r_n| e^{j\theta_n}$, (2)

where [absolute value of [r.sub.n]] and [[theta].sub.n] are the magnitude and the phase of the complex sample [r.sub.n], respectively. In M-PSK and M-QAM systems, the symbol information is contained totally or partially in the phase of [r.sub.n], respectively. The received phase [[theta].sub.n] can be expressed as

[[theta].sub.n] = [[zeta].sub.n] + [[OMEGA].sub.c]n + [DELTA][[OMEGA].sub.n] + [[phi].sub.n], (3)

where [[zeta].sub.n] is the phase of the transmitted symbol [a.sub.n] and [[OMEGA].sub.c] is the angular carrier frequency offset given by [[OMEGA].sub.c] = 2[pi]T[f.sub.c], with [f.sub.c] and T being the carrier frequency offset and the symbol duration, respectively. Term [DELTA][[OMEGA].sub.n] represents the phase change generated by frequency fluctuations. In this work we assume that the carrier is modulated by a sinusoidal interfering signal; therefore

[DELTA][[OMEGA].sub.n] = ([A.sub.p]/[DELTA][f.sub.c]) sin (2[pi]T[DELTA][f.sub.c]n), (4)

where [A.sub.p] and [DELTA][f.sub.c] are the amplitude and frequency of the modulation tone.

Component [[phi].sub.n] is the total phase noise given by

[[phi].sub.n] = [[phi].sup.(laser).sub.n] + [[phi].sup.(ASE).sub.n], (5)

where [[phi].sup.(laser).sub.n] and [[phi].sup.(ASE).sub.n] are the laser phase noise and the ASE generated phase noise, respectively. Laser phase noise is modeled as a Wiener process as follows:

[[phi].sup.(laser).sub.n] = [n.summation over (k=-[infinity])] [[eta].sub.k], (6)

where [[eta].sub.k] are independent, identically distributed Gaussian random variables with zero mean and variance [[sigma].sup.2.sub.[eta]] = 2[pi]T[DELTA]v, with [DELTA]v being the laser linewidth [8].
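The phase terms of (3)-(6) can be generated numerically as in the following minimal sketch. The function and parameter names are hypothetical, and the sketch assumes that the tone phase has a peak deviation of [A.sub.p]/[DELTA][f.sub.c] radians, which is the usual relation for sinusoidal frequency modulation with peak frequency deviation [A.sub.p]. The ASE-induced phase term and the symbol phase are left out.

```python
import math
import random


def received_phase_terms(n_symbols, symbol_rate, f_offset, tone_amp, tone_freq,
                         linewidth, seed=0):
    """Return (offset, tone, laser) phase sequences in radians, one value per symbol.

    Hypothetical helper: symbol_rate = 1/T, f_offset = f_c, tone_amp = A_p,
    tone_freq = Delta f_c, linewidth = Delta v (all in Hz).
    """
    rng = random.Random(seed)
    T = 1.0 / symbol_rate
    omega_c = 2.0 * math.pi * T * f_offset                 # angular offset per symbol
    sigma_eta = math.sqrt(2.0 * math.pi * T * linewidth)   # std dev of Wiener increments
    offset, tone, laser = [], [], []
    walk = 0.0
    for n in range(n_symbols):
        offset.append(omega_c * n)
        # Assumed FM relation: peak phase deviation is A_p / Delta f_c radians.
        tone.append((tone_amp / tone_freq)
                    * math.sin(2.0 * math.pi * T * tone_freq * n))
        walk += rng.gauss(0.0, sigma_eta)                  # Wiener process, as in (6)
        laser.append(walk)
    return offset, tone, laser


offset, tone, laser = received_phase_terms(
    n_symbols=1000, symbol_rate=32e9, f_offset=1e9,
    tone_amp=250e6, tone_freq=35e3, linewidth=250e3)
```

Adding the three sequences to the symbol phase gives a test vector for the carrier recovery blocks discussed in the following sections.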

2.1. Feedforward CPR. Typical carrier recovery techniques for coherent optical receivers combine a frequency offset compensation stage with a subsequent feedforward phase estimation block based on the well-known VV or BPS algorithms (see Figure 2) [13]. Once the frequency offset is removed, the VV or BPS block estimates and compensates the phase noise.

Figure 3 shows a simplified block diagram of the VV algorithm implementation. The VV block estimates the phase noise based on the Mth power of the received signal as follows:

[[phi].sub.n] = 1/M U (arg {[u.sub.n]}), (7)

where U is the unwrap function and [u.sub.n] is the output of the VV estimator given by

[u.sub.n] = [N-1.summation over (i=0)] [([[??].sub.n-i]).sup.M], (8)

with N being an integer odd number which represents the VV estimator length (see [8] for more details).
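The Mth-power estimate of (7) and (8) can be illustrated with a minimal block-based sketch. It is not the parallel implementation of Figure 3: the unwrap function U is omitted, so the estimate is only valid within (-[pi]/M, [pi]/M], and the function name is hypothetical. The sketch also assumes PSK symbols located at odd multiples of [pi]/M (e.g., QPSK at [pi]/4 + k[pi]/2), for which the Mth power of every symbol equals -1.

```python
import cmath
import math


def vv_phase_estimate(samples, M=4):
    """Block Viterbi-Viterbi estimate: (1/M) * arg of the summed M-th powers."""
    acc = sum(s ** M for s in samples)
    # Assumption: symbols sit at odd multiples of pi/M, so a**M = -|a|**M.
    # Flipping the sign removes the pi offset introduced by the modulation.
    return cmath.phase(-acc) / M


# Usage: the four QPSK symbols, all rotated by a common phase error of 0.1 rad.
qpsk = [cmath.exp(1j * (math.pi / 4 + k * math.pi / 2)) for k in range(4)]
rx = [a * cmath.exp(1j * 0.1) for a in qpsk]
estimate = vv_phase_estimate(rx)
```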

An alternative to the VV estimator is the so-called BPS algorithm shown in Figure 4. The BPS block estimates the phase noise as follows:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], (9)

where [[??].sub.b] is the test phase defined as

[[??].sub.b] = b/B x [pi]/2, b [member of] {0, 1, ..., B-1}, (10)

where B is the number of phases to be tested; term [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] is given by

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], (11)

where Q(*) is the slicer function and N is, again, the estimator length (see [10] for more details).
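The search described above can be sketched as follows for 16-QAM. This is an illustrative block-based version, assuming the common squared-distance metric between each derotated sample and its slicer decision and an unnormalized constellation with levels in {-3, -1, 1, 3}; the function names are hypothetical, and the exact form of (9) and (11) may differ.

```python
import cmath
import math


def slice_16qam(x):
    """Slicer Q(.): nearest 16-QAM point with levels {-3, -1, 1, 3} per axis."""
    def level(v):
        return float(min(3, max(-3, 2 * math.floor(v / 2.0) + 1)))
    return complex(level(x.real), level(x.imag))


def bps_phase_estimate(samples, B=32):
    """Return the test phase (b/B)*(pi/2) with the smallest summed squared distance."""
    best_phase, best_cost = 0.0, float("inf")
    for b in range(B):
        phi_b = (b / B) * (math.pi / 2.0)        # test phase, as in (10)
        rot = cmath.exp(-1j * phi_b)
        cost = 0.0
        for r in samples:
            d = r * rot                          # derotate by the test phase
            cost += abs(d - slice_16qam(d)) ** 2
        if cost < best_cost:
            best_phase, best_cost = phi_b, cost
    return best_phase


# Usage: noiseless 16-QAM rotated by exactly the 10th test phase.
qam16 = [complex(i, q) for i in (-3, -1, 1, 3) for q in (-3, -1, 1, 3)]
true_phase = (10 / 32) * (math.pi / 2.0)
rx = [a * cmath.exp(1j * true_phase) for a in qam16]
estimate = bps_phase_estimate(rx, B=32)
```

The B derotations and slicer decisions per sample account for the greater computational complexity of BPS relative to VV.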

Both VV and BPS techniques efficiently compensate the effects of the laser phase noise. In particular, the VV architecture is preferred for M-PSK modulation schemes because of the uniform angular spacing and constant modulus of their symbols. Although there exist alternatives that enable the VV to operate with M-QAM schemes [26], the BPS algorithm is preferred because it performs better in the presence of laser phase noise in spite of its greater computational complexity.

2.2. Effects of Frequency Fluctuations. Mechanical vibrations cause small deformations of components such as the laser cavity, leading to frequency fluctuations (see [16] and references therein). As expressed in the introduction, these fluctuations can be described as a frequency modulation with a sinusoidal signal of large amplitude (e.g., [A.sub.p] ~ 250 MHz) and low frequency (e.g., [DELTA][f.sub.c] [less than or equal to] 35 kHz). Without loss of generality, we consider in this work differential QPSK and 16-QAM differentially encoded in quadrant. Figures 5 and 6 show the optical signal-to-noise ratio (OSNR) penalty at a bit-error-rate (BER) of [10.sup.-3] versus the tone amplitude [A.sub.p] for [DELTA][f.sub.c] = 35 kHz. We use the feedforward VV and BPS CPR schemes depicted in Figures 3 and 4, respectively, with 1/T = 32 giga-samples per second (Gs/s), laser linewidth [DELTA]v = 250 kHz, and several values of the estimator length, N. Perfect estimation of the frequency offset is assumed. At the selected symbol rate, and within the jitter tone amplitude range of concern, QPSK does not show a significant penalty when the averaging block length is properly chosen. On the other hand, the performance in the 16-QAM case deteriorates significantly as the amplitude of the frequency modulation tone increases, which agrees with the results reported in [16]. Notice also that the value of the estimator length that minimizes the penalty depends on the tone amplitude. This fact suggests the need for an automatic adjustment algorithm for N.
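A rough back-of-the-envelope calculation helps to see why the estimator length interacts with the tone amplitude. Assuming that [A.sub.p] is the peak frequency deviation, the carrier phase changes by at most 2[pi][A.sub.p]T between consecutive symbols. The helper name below is hypothetical, and the window length N = 21 is only an example value.

```python
import math


def peak_phase_slope_per_symbol(tone_amp_hz, symbol_rate_hz):
    """Largest carrier phase change between consecutive symbols, in radians."""
    return 2.0 * math.pi * tone_amp_hz / symbol_rate_hz


slope = peak_phase_slope_per_symbol(250e6, 32e9)  # about 0.049 rad per symbol
drift_over_window = 21 * slope                    # about 1.03 rad across 21 symbols
```

Under these assumptions, the phase drift across a 21-symbol averaging window at the peak of the tone is larger than the angular separation between adjacent 16-QAM symbol phases in a quadrant ([pi]/4 - arctan(1/3), about 0.46 rad), which is consistent with the sensitivity of 16-QAM noted above.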

3. Carrier Recovery with Compensation of Frequency Fluctuations

Based on the results shown in Section 2.2, we conclude that the tracking of frequency fluctuations becomes an essential task in ultrahigh speed intradyne coherent optical receivers. Towards this end, a two-stage carrier recovery algorithm is proposed in this work (see Figure 7). A first CPR stage is based on a low-latency parallel DPLL, which is used to compensate not only frequency offset but also carrier frequency fluctuations. The second CPR stage is based on the renowned VV [9] or BPS [10] algorithm, which operates on the signal demodulated by the DPLL. The second CPR stage is mainly used to compensate the laser phase noise.

Parallel architectures for both stages must be provided for multigigabit applications. Feedforward phase estimation schemes such as VV or BPS are attractive for high-speed coherent receivers owing to their good laser linewidth tolerance and feasibility for parallel implementation. Nevertheless, the low-latency parallel DPLL proposed in [17] has been designed for QPSK format. In the following section, we generalize the scheme introduced in [17] for application to M-QAM.

3.1. Phase Domain Digital PLL. We consider a phase domain DPLL in order to reduce computational complexity. The domain change results in the substitution of complex multipliers by real adders, allowing in this way to increase the processing rate of the system, a fundamental aspect in multi-gigabit communications where high processing rates are required.

In a decision directed carrier recovery loop (see Figure 8), the symbol information is first removed [19]. In QPSK receivers, this operation can be easily carried out in the phase domain as follows:

[[??].sub.n] = [([[theta].sub.n]).sub.[pi]/2], (12)

where [(*).sub.H] denotes modulus H. In the absence of phase noise and frequency deviations (i.e., [[phi].sub.n] = 0 for all n and [f.sub.c] = [DELTA][f.sub.c] = 0), notice that [[??].sub.n] = [([[zeta].sub.n]).sub.[pi]/2] = [pi]/4 for all n. A similar approach can be adopted for M-QAM. For example, for 16-QAM the symbol phase [[zeta].sub.n] reduced to the first quadrant results in [([[zeta].sub.n]).sub.[pi]/2] [member of] {arctan(1/3), [pi]/4, arctan(3)}. Figure 9 depicts the entire QPSK and 16-QAM constellations in the complex plane, where the labels i and q stand for the real and imaginary axes, respectively. Moreover, the shaded areas in Figure 9 highlight the quadrant reduction given by (12).
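The quadrant reduction of (12) can be checked numerically with a short sketch. The helper name is hypothetical and an unnormalized 16-QAM constellation with levels in {-3, -1, 1, 3} is assumed.

```python
import cmath
import math


def reduce_to_first_quadrant(phase):
    """Phase modulo pi/2, i.e., the (.)_{pi/2} operation; result lies in [0, pi/2)."""
    return phase % (math.pi / 2.0)


qam16 = [complex(i, q) for i in (-3, -1, 1, 3) for q in (-3, -1, 1, 3)]
# Rounding merges values that differ only by floating-point error.
reduced = sorted({round(reduce_to_first_quadrant(cmath.phase(a)), 9) for a in qam16})
```

The 16 constellation points collapse onto the three reduced phases arctan(1/3), [pi]/4, and arctan(3).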

The phase at the numerically controlled oscillator (NCO) output of a type II second-order DPLL (see Figure 8) can be expressed as

[[psi].sub.n] = [[psi].sub.n-1] + [K.sup.(p)] [[epsilon].sub.n] + [K.sup.(i)] [[bar.[epsilon]].sub.n-1], (13)

where all addition operations in the following analysis are modulo 2[pi], and the constants [K.sup.(p)] and [K.sup.(i)] are the loop proportional and integral gains, respectively; [[epsilon].sub.n] is the phase error given by

[[epsilon].sub.n] = [([[??].sub.n] - [[psi].sub.n-1]).sub.[pi]/2] - [[rho].sub.n], (14)

where [[rho].sub.n] is the symbol phase of the transmit symbol reduced to the first quadrant; that is, [[rho].sub.n] = [([[zeta].sub.n]).sub.[pi]/2]. Finally, term [[bar.[epsilon]].sub.n-1] in (13) is the accumulated phase error given by

[[bar.[epsilon]].sub.n] = [n-1.summation over (k=-[infinity])] [[epsilon].sub.k]. (15)
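The recursion (13)-(15) can be sketched as a serial loop for the QPSK case, where the reduced symbol phase is always [pi]/4. The function name and the gain values are assumptions chosen only for illustration; they are not the values of Table 1.

```python
import math
import random

HALF_PI = math.pi / 2.0


def serial_dpll_qpsk(theta, kp=2.0 ** -4, ki=2.0 ** -10):
    """Run (13)-(15) over received phases theta; return (nco_phases, errors)."""
    psi = 0.0             # NCO phase psi_{n-1}
    acc = 0.0             # accumulated error: sum of eps_k for k <= n-2
    prev_eps = 0.0        # eps_{n-1}, not yet added to the accumulator
    rho = math.pi / 4.0   # reduced symbol phase for QPSK
    nco, errors = [], []
    for th in theta:
        theta_red = th % HALF_PI                             # quadrant reduction (12)
        eps = ((theta_red - psi) % HALF_PI) - rho            # phase error (14)
        psi = (psi + kp * eps + ki * acc) % (2.0 * math.pi)  # NCO update (13)
        acc += prev_eps                                      # advance (15) by one symbol
        prev_eps = eps
        nco.append(psi)
        errors.append(eps)
    return nco, errors


# Usage: noiseless random QPSK with a constant offset of 0.01 rad per symbol.
rng = random.Random(1)
theta = [math.pi / 4 + rng.randrange(4) * HALF_PI + 0.01 * n for n in range(3000)]
nco, errors = serial_dpll_qpsk(theta)
```

In this noiseless example the phase error decays towards zero once the integral path has absorbed the frequency offset, as expected for a type II loop.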

Since the symbol phase is not known a priori at the receiver, we use a tentative decision of the transmitted symbol to estimate the phase [[rho].sub.n] as follows:

[[rho].sub.n] [approximately equal to] f([absolute value of [r.sub.n]], [[??].sub.n]), (16)

where [[??].sub.n] is the phase of the demodulated received sample, reduced to the first quadrant; that is,

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. (17)

Note that [(a + b).sub.H] = [([(a).sub.H] + [(b).sub.H]).sub.H]; therefore, since [[??].sub.n] = [([[theta].sub.n]).sub.[pi]/2], we can get (17). For example, for QPSK

[[rho].sub.n] = f([absolute value of [r.sub.n]], [[??].sub.n]) = [pi]/4 [for all]n, (18)

while for 16-QAM,

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. (19)

Figure 10 shows the 16-QAM constellation reduced to the first quadrant of the complex plane and the decision boundaries according to (19).
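One plausible form of the tentative decision f(.) for 16-QAM is sketched below. It is a hypothetical illustration rather than a reproduction of (19): it assumes an unnormalized constellation with levels in {-3, -1, 1, 3}, amplitude thresholds placed midway between the three constellation rings, and the diagonal as the phase boundary on the middle ring.

```python
import math


def reduced_symbol_phase_16qam(magnitude, reduced_phase):
    """Tentative decision of rho_n from |r_n| and the first-quadrant phase."""
    inner_edge = (math.sqrt(2.0) + math.sqrt(10.0)) / 2.0   # between rings 1 and 2
    outer_edge = (math.sqrt(10.0) + math.sqrt(18.0)) / 2.0  # between rings 2 and 3
    if magnitude < inner_edge or magnitude >= outer_edge:
        return math.pi / 4.0   # diagonal symbols: the phase input is not needed
    # Middle ring: two candidate phases, separated by the diagonal.
    if reduced_phase < math.pi / 4.0:
        return math.atan(1.0 / 3.0)
    return math.atan(3.0)
```

In this sketch only the middle ring decision depends on the phase input, so a phase error in the demodulated sample can only affect the nondiagonal symbols.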

3.2. Evaluation of DPLL for Tracking Frequency Fluctuations. The effectiveness of the decision directed DPLL to track frequency fluctuations is analyzed in the following section. In our carrier recovery scheme, the serial DPLL is used for compensation of frequency offset and fluctuations, while a feedforward CPR block based on the BPS algorithm is used for phase noise estimation. This carrier recovery architecture will be denoted as S-DPLL + BPS.

Figure 11 shows the OSNR penalty versus the modulation tone amplitude, [A.sub.p], for [DELTA][f.sub.c] = 35 kHz and [DELTA]v = 250 kHz. The BPS filter length is N = 21, while the number of test phases is B = 32. Note that the performance degradation caused by the carrier frequency fluctuation is eliminated with the new combined S-DPLL + BPS carrier recovery technique.

Figure 12 presents the tolerance of BPS and S-DPLL + BPS architectures to the laser phase noise in the presence of a frequency modulation tone with [A.sub.p] = 140 MHz, [DELTA][f.sub.c] = 35 KHz. These models were compared with the BPS algorithm without influence of frequency fluctuations (i.e., [A.sub.p] = 0). The last mentioned scheme is used as a benchmark. It is interesting to highlight the important degradation caused by the frequency fluctuations in the solution solely based on the BPS algorithm. Again notice that the effects of the carrier frequency fluctuations are mitigated by using the proposed S-DPLL + BPS carrier recovery algorithm.

4. New Low Latency Parallel DPLL for M-QAM

The maximum clock frequency of complex digital signal processors in state-of-the-art 28 nm CMOS technology is limited to less than 1 GHz. Thus, parallel processing techniques are mandatory for the implementation of multigigabit per second receivers. Unfortunately, the nonlinear nature of the DPLL precludes the use of unfolding techniques [20]. Since low latency is a key factor in tracking frequency fluctuations, we develop a new approach to reduce the latency of the parallel DPLL implementation.

4.1. Parallel Type II DPLL for M-QAM. From (13) it is possible to show that

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], (20)

where

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (21)

with [[rho].sub.n+1] given by (16) and

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. (22)

For the type II second-order DPLL, the steady-state error is zero (i.e., [lim.sub.n[right arrow][infinity]] [[epsilon].sub.n] [right arrow] 0) [19]. Thus, assuming that the bandwidth of the loop is low to moderate, such that [K.sup.(p)] [much less than] 1, the contribution of the term [K.sup.(p)][[epsilon].sub.n] can be neglected; therefore the phase error (21) results in

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], (23)

where [[??].sub.n+1] is given by

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (24)

with

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (25)

Furthermore, since the accumulated phase error varies slowly with time (i.e., [[bar.[epsilon]].sub.n] [approximately equal to] [[bar.[epsilon]].sub.n-1]), from (20) and (23), we can obtain

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], (26)

where [[??].sub.n+k] is given by

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (27)

with

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. (28)

Let P be the parallelization factor. Following a similar analysis, it is possible to derive that

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. (29)

A type II DPLL can be considered as two separate feedback loops: the proportional and integral loops (see Figure 13). Thus, the NCO output (29) can be rewritten as

[[psi].sub.n+m] = [[psi].sup.(p).sub.n+m] + [[psi].sup.(i).sub.n+m], m = 0, 1, ..., P - 1, (30)

where [[psi].sup.(p).sub.n+m] and [[psi].sup.(i).sub.n+m] are the NCO components due to the proportional and integral paths, respectively.

4.2. Proportional Loop. From (29), it is simple to show that

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. (31)

From (12) and (17), note that

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. (32)

Thus, expression (31) can be rewritten as

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], (33)

where

[[??].sub.n+k] = [([[theta].sub.n+k] - [[psi].sup.(i).sub.n-1] - k[K.sup.(i)] [[bar.[epsilon]].sub.n-1]).sub.[pi]/2]. (34)

From (32) and (34), note that (28) can be rewritten as

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. (35)

For example, from (18) and (33) the NCO output (33) for QPSK reduces to [17]

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. (36)

Unfortunately, for M-QAM, (33) is still too complex to be implemented with digital signal processors in state-of-the-art 28 nm CMOS technology, because both the function [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] and then the last summation in (33) must be computed within one clock cycle. This problem can be mitigated if the terms f([absolute value of [r.sub.n+k]], [[??].sub.n+k]) are precomputed by using the NCO output of the previous clock cycle; that is,

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. (37)

As we shall show later, the performance degradation caused by (37) is negligible in practical situations (e.g., 16-QAM with P [less than or equal to] 80). For 16-QAM, this behavior can be understood from the facts that (i) only the nondiagonal symbols use [[??].sub.n+k] (see (19)) and (ii) laser frequency fluctuations are slow compared to the baud rate.

A low-latency parallel implementation of the proportional loop can be easily derived from (33)-(37). Figure 14 shows the architecture of the low-latency parallel type I DPLL. Block "[F.sub.k]" (k = 0, 1, ..., P - 1) computes the terms [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] given by (37), while block "[W.sub.k]" (k = 0, 1, ..., P - 1) evaluates the summations of (33), using a fast adder (e.g., a Wallace tree and carry save adder [20]) to quickly calculate the NCO output. Furthermore, the gain [K.sup.(p)] is assumed to be a power of 2 (i.e., $K^{(p)} = 2^{-N_p}$ with [N.sub.p] being a positive integer). In this way, multiplications by the proportional gain [K.sup.(p)] are reduced to simple bit shift operations. Again note that all additions in (33) are modulo 2[pi].
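The two hardware-oriented simplifications mentioned above, power-of-2 gains and modulo 2[pi] additions, can be illustrated with integer arithmetic. The sketch assumes a hypothetical fixed-point format in which a 12-bit phase code spans one full turn; the word length and the function names are not taken from the paper.

```python
PHASE_BITS = 12                      # hypothetical word length: 2**12 codes span 2*pi
PHASE_MASK = (1 << PHASE_BITS) - 1


def apply_gain_shift(error_code, n_p):
    """Multiply a signed integer phase error by 2**-n_p with an arithmetic right shift."""
    return error_code >> n_p


def add_mod_2pi(a_code, b_code):
    """Add two phase codes; discarding the overflow bits gives the modulo 2*pi wrap."""
    return (a_code + b_code) & PHASE_MASK


# Usage: scale an error by 2**-4, then accumulate it into an NCO phase code.
nco_code = add_mod_2pi(4000, apply_gain_shift(3200, 4))
```

With this representation the phase wrap comes for free from the finite word length, so no explicit modulo operation is needed in the adder tree.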

4.3. Integral Loop. On the other hand, from (29) and Figure 13, we can also derive the NCO component due to the integral path as follows:

[[psi].sup.(i).sub.n+m] [approximately equal to] [[psi].sup.(i).sub.n-1] + (m + 1)[K.sup.(i)] [[bar.[epsilon]].sub.n]. (38)
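Approximation (38) lets all P integral-path NCO components of a block be computed at once from the values available at the start of the block. A minimal sketch, with a hypothetical function name and example values, is given below.

```python
import math

TWO_PI = 2.0 * math.pi


def integral_path_block(psi_i_prev, acc_error, k_i, P):
    """Return [psi_i(n+m) for m = 0..P-1] using approximation (38), modulo 2*pi."""
    return [(psi_i_prev + (m + 1) * k_i * acc_error) % TWO_PI for m in range(P)]


# Usage: P = 4 lanes, integral gain 2**-10, accumulated error held at 2.0 rad.
block = integral_path_block(psi_i_prev=0.5, acc_error=2.0, k_i=2.0 ** -10, P=4)
```

Each lane depends only on the previous block, not on the other lanes, so no serial chain of additions is needed within a block. The result matches a serial accumulation whenever the accumulated error stays constant over the block, which is the slowly varying assumption behind (38).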

The accumulated phase error can be expressed as

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. (39)

Based on (12), (14), (30), (34), and (38), the accumulated phase error can be evaluated as

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. (40)

4.4. Parallel Architecture of the New DPLL. A parallel implementation of the type II DPLL can be easily achieved as depicted in Figure 15. Term L = lP, with l being a positive integer, represents the latency required to compute all the operations of the integral path (e.g., the phase error computation (PEC) defined in (40)). Since the latency in this path is not as critical as in the proportional loop, its effect on the DPLL performance is negligible, as we will show in the next section. Similarly to [K.sup.(p)], the integral gain [K.sup.(i)] is assumed to be a power of 2 (i.e., $K^{(i)} = 2^{-N_i}$ with [N.sub.i] being a positive integer).

Figure 16(a) shows a possible implementation of the block "[W.sub.P-1]", and Figure 16(b) depicts an example of a tentative implementation of the "[F.sub.k]" block based on look-up tables for 16-QAM.

5. Numerical Results

In this section we evaluate the effectiveness of the proposed two-stage CPR. We use 16-QAM differentially encoded in quadrant on a nondispersive noisy channel with 1/T = 32 Gs/s. The OSNR at a given bit-error-rate (i.e., BER of [10.sup.-3]) is also used as a measure of the efficiency of the proposed CR loop. Two different type II DPLLs were simulated for comparison purposes: the already mentioned serial DPLL (S-DPLL) and the proposed low-latency parallel DPLL (P-DPLL) shown in Figure 15 with different parallelization factors. Moreover, the BPS algorithm with filter length N = 21 and B = 32 test phase values was considered.

The frequency responses of the DPLLs are depicted in Figure 17. The loop filter gains were selected in order to obtain maximum bandwidth with 0.5 dB maximum peaking (see Table 1). For the optical system considered here, these values of bandwidth and peaking provide a good tradeoff between capture range and the residual phase noise power at the input of the slicer (see Figure 1).

Because frequency offset values in intradyne receivers (i.e., [+ or -] 5 GHz; see [28]) exceed the maximum theoretical limit of 1/(8T) [27] that can be reached by decision directed algorithms at the considered symbol rate, typical intradyne coherent optical receivers are provided with a coarse carrier frequency recovery (CCFR) stage [2] that reduces the frequency offset to values within the theoretical range. However, the residual frequency offset after CCFR can surpass the tolerance of CPR algorithms like the VV and the one considered in this work, that is, BPS. The capture range of the proposed P-DPLL is ~[+ or -] 4 GHz, which is close to the maximum theoretical frequency offset value for the given symbol rate (i.e., 1/(8T) = 4 GHz). Gear shifting is applied to the proportional and integral gains during the capture period.

Figure 18(a) shows the BPS CPR tolerance to the joint effect of the laser phase noise and the sinusoidal frequency tone amplitude, [A.sub.p]. At the same time Figure 18(b) depicts the performance of the combined architecture P-DPLL + BPS with P = 64 under the same conditions as the ones already mentioned. It is interesting to note in Figure 18(b) the significant improvement in terms of sinusoidal frequency tolerance of the combined architectures in relation to the single stage CPR solely based on BPS. In other words, this improvement is evidenced in the increase of the contour line slope, getting parallel (i.e., independent) to the [A.sub.p] axis.

Figure 19 complements the current study for several values of the parallelization factor under the conditions detailed earlier. In particular, Figure 19 shows the performance of the two-stage CPR architecture (DPLL in conjunction with the BPS algorithm) using the 16-QAM scheme. From the present study it is possible to derive Figure 20, where the efficiency of the proposed approximation for the parallelization of the DPLL is evidenced. Even though the 16-QAM format seems to be sensitive to the parallelization factor, the performance remains constant over a wide range of the parallelization axis, and the penalty increases only for large values of laser linewidth (i.e., [DELTA]vT).

5.1. Impact of Decision Errors. The impact of the decision errors in terms of the variance of the estimated phase is analyzed for two different PLLs with the same bandwidth against the modified Cramer-Rao bound (MCRB) [29]. The Cramer-Rao lower bound (CRLB) can be considered as a fundamental limit on the performance that a linearized system can reach in the absence of decision errors [30].

In other words, the optimum theoretical bound is achieved under the simplifying assumption that the additive noise does not affect the receiver decisions about the data symbols. Simulation results for (i) the serial DPLL (S-DPLL) and (ii) the parallel DPLL (P-DPLL) with a parallelization factor of P = 80 are shown in Figure 21(a).

At the OSNR regime of interest in the application considered in our work (i.e., PDM-16-QAM, 1/T = 32 Gs/s, BER < [10.sup.-2] [right arrow] OSNR > 18 dB), it can be observed that the phase noise variance in the proposed parallel DPLL is slightly higher than that experienced in a serial DPLL. Nevertheless, notice that the impact of this phase variance increase on the performance in terms of bit-error-rate (BER) is practically negligible (see Figure 21(b)). Finally, it is important to highlight that catastrophic errors caused by cycle slips are avoided in the proposed carrier recovery architecture by using differential 16-QAM [11].

6. Conclusion

A new DPLL-based carrier recovery architecture for high speed optical coherent receivers has been introduced in this paper. The proposed parallel scheme builds upon a novel DPLL computation, which breaks the bottleneck of the feedback path. We have shown here a novel approach that leads to a simple parallel implementation. Furthermore, it has also been demonstrated that the new parallel DPLL can provide a bandwidth and capture range similar to those achieved by the serial DPLL.

The proposed two-stage carrier recovery architecture, based on a low-latency parallel DPLL and a feedforward BPS phase estimator, offers a low complexity, high performance, integral solution to frequency and phase compensation in coherent optical systems. This solution outperforms previously proposed architectures when all optical channel impairments present in real applications, including laser phase noise, sinusoidal frequency jitter, and frequency offset, are accounted for in the modeling.

http://dx.doi.org/10.1155/2013/240814

Acknowledgment

This paper has been supported in part by the ANPCyT (PICT2011-2527), MINCyT, Fundacion Tarpuy, and Fundacion Fulgor.

References

[1] P. Winzer, "Beyond 100G ethernet," IEEE Communications Magazine, vol. 48, no. 7, pp. 26-30, 2010.

[2] D. Crivelli, M. Hueda, H. Carrer et al., "A 40nm CMOS single-chip 50Gb/s DP-QPSK/BPSK transceiver with electronic dispersion compensation for coherent optical channels," in Proceedings of the IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC '12), pp. 328-330, February 2012.

[3] D. E. Crivelli, H. S. Carrer, and M. R. Hueda, "Adaptive digital equalization in the presence of chromatic dispersion, PMD, and phase noise in coherent fiber optic systems," in Proceedings of the IEEE Global Telecommunications Conference (GLOBECOM '04), vol. 4, pp. 2545-2551, December 2004.

[4] M. Kuschnerov, F. N. Hauske, K. Piyawanno et al., "DSP for coherent single-carrier receivers," Journal of Lightwave Technology, vol. 27, no. 16, pp. 3614-3622, 2009.

[5] G. P. Agrawal, Fiber-Optic Communication Systems, Wiley-Interscience, 2nd edition, 1997.

[6] O. E. Agazzi, M. R. Hueda, H. S. Carrer, and D. E. Crivelli, "Maximum-likelihood sequence estimation in dispersive optical channels," Journal of Lightwave Technology, vol. 23, no. 2, pp. 749-763, 2005.

[7] O. E. Agazzi, M. R. Hueda, D. E. Crivelli et al., "A 90 nm CMOS DSP MLSD transceiver with integrated AFE for electronic dispersion compensation of multimode optical fibers at 10 Gb/s," IEEE Journal of Solid-State Circuits, vol. 43, no. 12, pp. 2937-2957, 2008.

[8] M. Taylor, "Phase estimation methods for optical coherent detection using digital signal processing," Journal of Lightwave Technology, vol. 27, no. 7, pp. 901-914, 2009.

[9] A. Viterbi, "Nonlinear estimation of PSK-modulated carrier phase with application to burst digital transmission," IEEE Transactions on Information Theory, vol. 29, no. 4, pp. 543-551, 1983.

[10] T. Pfau, S. Hoffmann, and R. Noe, "Hardware-efficient coherent digital receiver concept with feedforward carrier recovery for M-QAM constellations," Journal of Lightwave Technology, vol. 27, no. 8, pp. 989-999, 2009.

[11] E. Ip and J. Kahn, "Feedforward carrier recovery for coherent optical communications," Journal of Lightwave Technology, vol. 25, no. 9, pp. 2675-2692, 2007.

[12] I. Fatadin and S. Savory, "Compensation of frequency offset for 16-QAM optical coherent systems using QPSK partitioning," IEEE Photonics Technology Letters, vol. 23, no. 17, pp. 1246-1248, 2011.

[13] H. Leng, S. Yu, X. Li et al., "Frequency offset estimation for optical coherent m-QAM detection using chirp z-transform," IEEE Photonics Technology Letters, vol. 24, no. 9, pp. 787-789, 2012.

[14] S. Dris, I. Lazarou, P. Bakopoulos, and H. Avramopoulos, "Frequency offset estimation in m-QAM coherent optical systems using phase entropy," in Proceedings of the Conference on Lasers and Electro-Optics (CLEO '12), pp. 1-2, May 2012.

[15] X. Zhou, X. Chen, and K. Long, "Wide-range frequency offset estimation algorithm for optical coherent systems using training sequence," IEEE Photonics Technology Letters, vol. 24, no. 1, pp. 82-84, 2012.

[16] M. Kuschnerov, K. Piyawanno, M. S. Alfiad, B. Spinnler, A. Napoli, and B. Lankl, "Impact of mechanical vibrations on laser stability and carrier phase estimation in coherent receivers," IEEE Photonics Technology Letters, vol. 22, no. 15, pp. 1114-1116, 2010.

[17] P. Gianni, G. Corral-Briones, C. Rodriguez, H. Carrer, and M. Hueda, "A new parallel carrier recovery architecture for intradyne coherent optical receivers in the presence of laser frequency fluctuations," in Proceedings of the Global Telecommunications Conference (GLOBECOM '11), pp. 1-6, 2011.

[18] N. Stojanovic, Y. Zhao, B. Mao, C. Xie, F. N. Hauske, and M. Chen, "Robust carrier recovery in polarization division multiplexed receivers," in Proceedings of the Optical Fiber Communication Conference, Technical Digest (Optical Society of America), Los Angeles, Calif, USA, March 2012.

[19] E. A. Lee and D. G. Messerschmitt, Digital Communication, KAP, 1st edition, 1992.

[20] K. K. Parhi, VLSI Digital Signal Processing Systems: Design and Implementation, Wiley-Interscience, 1999.

[21] P. Gianni, H. S. Carrer, G. Corral-Briones, and M. R. Hueda, "A novel low-latency parallel architecture for digital PLL with application to ultra-high speed carrier recovery systems," in Proceedings of the 7th Southern Conference on Programmable Logic (SPL '11), pp. 31-36, April 2011.

[22] K. Piyawanno, M. Kuschnerov, B. Spinnler, and B. Lankl, "Low complexity carrier recovery for coherent QAM using superscalar parallelization," in Proceedings of the 36th European Conference and Exhibition on Optical Communication (ECOC '10), pp. 1-3, September 2010.

[23] X. Zhou and Y. Sun, "Low-complexity, blind phase recovery for coherent receivers using QAM modulation," in Proceedings of the Optical Fiber Communication Conference and Exposition and the National Fiber Optic Engineers Conference (OFC/NFOEC '11), pp. 1-3, March 2011.

[24] Q. Zhuge, M. E. Mousa-Pasandi, X. Xu et al., "Linewidth-tolerant low complexity pilot-aided carrier phase recovery for m-QAM using superscalar parallelization," in Proceedings of the Optical Fiber Communication Conference, Technical Digest (Optical Society of America), Los Angeles, Calif, USA, March 2012.

[25] Q. Zhuge, M. Morsy-Osman, X. Xu et al., "Pilot-aided carrier phase recovery for m-QAM using superscalar parallelization based PLL," Optics Express, vol. 20, no. 17, pp. 19599-19609, 2012.

[26] I. Fatadin, D. Ives, and S. Savory, "Laser linewidth tolerance for 16-QAM coherent optical systems using QPSK partitioning," IEEE Photonics Technology Letters, vol. 22, no. 9, pp. 631-633, 2010.

[27] D. G. Messerschmitt, "Frequency detectors for PLL acquisition in timing and carrier recovery," IEEE Transactions on Communications, vol. 27, no. 9, pp. 1288-1295, 1979.

[28] Z. Tao, L. Li, L. Liu et al., "Improvements to digital carrier phase recovery algorithm for High-Performance optical coherent receivers," IEEE Journal of Selected Topics in Quantum Electronics, vol. 16, no. 5, pp. 1201-1209, 2010.

[29] A. N. D'Andrea, U. Mengali, and R. Reggiannini, "The modified Cramer-Rao bound and its application to synchronization problems," IEEE Transactions on Communications, vol. 42, no. 2, pp. 1391-1399, 1994.

[30] H. Meyr, M. Moeneclaey, and S. A. Fechtel, Digital Communication Receivers, Synchronization, Channel Estimation, and Signal Processing, Wiley-Interscience, 2nd edition, 1997.

Pablo Gianni, Laura Ferster, Graciela Corral-Briones, and Mario R. Hueda

Laboratorio de Comunicaciones Digitales, Universidad Nacional de Cordoba (CONICET), Avenida Velez Sarsfield 1611, X5016GCA Cordoba, Argentina

Correspondence should be addressed to Pablo Gianni; giannipablo@gmail.com

Received 10 December 2012; Accepted 28 March 2013

Academic Editor: Ashkan Ashrafi

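TABLE 1: DPLL parameters.

| Parallelism | $K_p$ | $K_i$ | Processing rate |
|---|---|---|---|
| 1 | 0.12 | 0.001 | 32 GHz |
| 32 | $2^{-5}$ | $2^{-10}$ | 1 GHz |
| 64 | $2^{-6}$ | $2^{-10}$ | 500 MHz |
| 80 | $2^{-6}$ | $2^{-10}$ | 400 MHz |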
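To give a feel for the gain values in Table 1, the following is a minimal sketch of a generic serial second-order DPLL, assuming that K(p) and K(i) are the proportional and integral gains of the loop filter and using the parallelism-1 values (0.12 and 0.001). It is not the low-latency parallel architecture of the paper: the function name `dpll_track`, the noiseless unmodulated test carrier, the angle-based phase detector, and the normalized frequency offset of 1e-3 cycles per symbol are all illustrative assumptions.

```python
import cmath
import math


def dpll_track(samples, kp=0.12, ki=0.001):
    """Generic serial second-order DPLL (proportional-plus-integral loop filter).

    Hypothetical helper for illustration only. Returns the per-sample phase
    estimates (radians) and the final integrator state, which approximates
    the frequency offset in radians per symbol.
    """
    phase_est = 0.0  # phase accumulator (NCO) state
    integ = 0.0      # integral branch of the loop filter
    estimates = []
    for r in samples:
        estimates.append(phase_est)
        # Phase detector: angle of the derotated sample. This assumes an
        # unmodulated carrier; a real QAM receiver would use a
        # decision-directed or similar detector instead.
        err = cmath.phase(r * cmath.exp(-1j * phase_est))
        integ += ki * err
        phase_est += kp * err + integ
    return estimates, integ


# Assumed test signal: constant phase offset plus constant frequency offset.
n = 4000
delta = 2 * math.pi * 1e-3  # frequency offset, radians per symbol
theta0 = 0.5                # initial phase offset, radians
rx = [cmath.exp(1j * (theta0 + delta * k)) for k in range(n)]

est, freq_est = dpll_track(rx)
final_err = cmath.phase(rx[-1] * cmath.exp(-1j * est[-1]))
print(f"residual phase error: {final_err:.2e} rad")
print(f"estimated frequency offset: {freq_est:.6f} rad/symbol")
```

Because the loop is of second order, the integral branch absorbs the constant frequency offset and the residual phase error decays to zero in this noiseless setting. With noise and laser phase noise present, the choice of gains trades tracking speed against estimation variance.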

Publication: Journal of Electrical and Computer Engineering

Article Type: Report

Date: Jan 1, 2013
