Timing redundancy in telecommunication systems: a white paper.
Highly reliable operation
Telecommunications systems must provide highly reliable operation under all network conditions. To do this, the most critical components within the system are made redundant. A typical telecommunications product is a 19-inch standard telecom rack populated by up to 18 one-inch vertically inserted cards.
As shown in Figure 1, a typical system is comprised of two control cards and multiple line cards that communicate over a common backplane. The two control cards are identical and run in parallel. Only one control card is active at any given time, and the other takes over if the first fails. Switching from one control card to the other should not cause any interruption or failure in the system.
[FIGURE 1 OMITTED]
The control card includes a system control processor, switching fabric and system timing. It is important to note that in more complex, larger systems, timing is implemented on separate cards to further increase the flexibility of the product. This article covers only the timing aspects of telecommunication systems.
Timing card architecture
Having two timing cards protects against an internal failure where one of the cards fails. To protect from external clock reference failures, the timing cards are designed to be able to synchronize to more than one reference.
A timing card accepts references from multiple sources, selects one, cleans it from phase noise with a digital phase locked loop (DPLL), and distributes it to the line cards via the backplane. The DPLL is the most important part of the timing card. Depending on the targeted application of the product and region of deployment, the DPLL needs to be compliant with the appropriate timing specifications, such as Telcordia GR-1244 CORE, Telcordia GR-253 CORE or ITU G.813. The DPLL needs to provide an array of crucial features, including:
* Hitless reference switching--if the reference the DPLL is locked to fails, the DPLL will lock to another available reference without phase disturbances at its output.
* Holdover mode--the DPLL constantly calculates the average frequency of the locked reference. If the reference fails and none of the other references are available, the DPLL goes into holdover mode, where it generates an output clock based on the calculated average value. Holdover stability depends on the resolution of the DPLL averaging algorithm and on the frequency stability of the oscillator used as the DPLL master clock.
* Reference monitoring--the DPLL needs to constantly monitor the quality of its input references. If the reference the DPLL is locked to deteriorates--disappears or drifts in frequency--the DPLL raises an alarm (interrupt) and switches to another valid reference.
* Narrow loop bandwidth--the DPLL can be viewed as a phase noise filter. The narrower the loop bandwidth, the better the phase noise attenuation. Some specifications, such as G.813, explicitly provide the loop bandwidth. Others, including GR-253 CORE, provide narrow loop bandwidth specifications implicitly through the wander transfer requirement. Ideally, the DPLL should have programmable loop bandwidth so the timing card can be easily used for different applications.
* High jitter and wander tolerance--the DPLL should tolerate large phase noise at its input and still maintain synchronization.
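The interplay of reference monitoring, reference switching and holdover in the list above can be sketched in a few lines. This is a toy model under stated assumptions, not a real loop filter: the class name HoldoverDpll, the averaging window, and the convention that a failed reference is reported as None are all hypothetical, and the phase build-out needed for truly hitless switching is not modeled.

```python
from collections import deque


class HoldoverDpll:
    """Toy model of reference selection and holdover (hypothetical names)."""

    def __init__(self, window=8):
        # Recent fractional frequency offsets (ppb) of the locked reference.
        self.history = deque(maxlen=window)
        self.mode = "free-run"
        self.locked_ref = None

    def update(self, refs):
        """Process one monitoring interval and return the output offset in ppb.

        refs maps a reference name to its measured offset in ppb, or to None
        when monitoring has declared that reference failed. Priority follows
        the dict's insertion order.
        """
        valid = [name for name, offset in refs.items() if offset is not None]
        if valid:
            if self.locked_ref not in valid:
                # Locked reference failed (or none chosen yet): switch.
                self.locked_ref = valid[0]
            self.mode = "locked"
            self.history.append(refs[self.locked_ref])
            return refs[self.locked_ref]
        # No valid reference is left.
        self.locked_ref = None
        if self.history:
            # Holdover: reproduce the stored average frequency.
            self.mode = "holdover"
            return sum(self.history) / len(self.history)
        self.mode = "free-run"
        return 0.0


if __name__ == "__main__":
    dpll = HoldoverDpll(window=4)
    for refs in ({"BITS0": 2.0, "BITS1": 4.0},
                 {"BITS0": None, "BITS1": 4.0},
                 {"BITS0": None, "BITS1": None}):
        print(dpll.update(refs), dpll.mode, dpll.locked_ref)
```

The sketch keeps the current reference as long as it remains valid and only falls back to the stored average once every input has failed, which mirrors the order of events described in the bullets.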
Timing card DPLL references can come externally from a Building Integrated Timing Supply (BITS) or internally from line cards. The BITS is defined as the most accurate clock in an office, and is used as a master clock for all intraoffice equipment. The BITS can be viewed as a standalone timing card, usually with Stratum 2 (0.1 parts per billion) holdover stability. The BITS is timed by two T1 signals and its outputs are distributed to equipment with T1 or Composite Clock (CC) signals. It should be noted that BITS is a North American term, while the rest of the world uses Synchronization Supply Unit (SSU). Where BITS uses T1 for clock reception and distribution, SSU uses E1 links.
All nodes in a public telecommunication network must be synchronized to timing references that are traceable to a Primary Reference Source (PRS). A PRS provides a clock with Stratum 1 accuracy (0.01 parts per billion). A PRS clock can be generated from an on-site cesium clock, or from cesium clock-controlled radio signals such as Global Positioning System (GPS) and Long Range Navigation System, Version C (LORAN-C). Due to the high cost of cesium clocks, a PRS usually uses GPS, with LORAN-C as a backup if GPS fails. Because it is not economically viable to have a PRS at each network node, only a few (usually two) nodes have their BITS synchronized directly to a PRS.
The other nodes in the network use line timing, where their BITS/SSU is synchronized to one of the extracted line clocks. The clock path sequence is shown in Figure 3. In this case, an additional low-cost wideband DPLL is needed to convert the frequency of the line card extracted clock to the frequency needed by T1/E1/CC Line Interface Units (LIU). LIUs are used for the transmission of the timing references between the timing card and BITS and vice versa. For example, if the extracted line clock originates from an OC-3 line card, its frequency is usually 19.44 MHz, so the wideband DPLL is needed to convert from 19.44 MHz to 1.544 MHz (T1), 2.048 MHz (E1), or 64 kHz (CC).
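To make the frequency conversion concrete, the short sketch below computes the exact ratio the wideband DPLL has to synthesize for each LIU clock, using the 19.44 MHz line clock from the text and the standard T1 rate of 1.544 MHz. The function and constant names are illustrative only; how a particular DPLL realizes these ratios internally is not covered here.

```python
from fractions import Fraction

LINE_CLOCK_HZ = 19_440_000  # extracted OC-3 line clock, as in the text

LIU_CLOCKS_HZ = {
    "T1": 1_544_000,
    "E1": 2_048_000,
    "CC": 64_000,
}


def conversion_ratio(out_hz, in_hz=LINE_CLOCK_HZ):
    """Exact output/input frequency ratio, reduced to lowest terms."""
    return Fraction(out_hz, in_hz)


if __name__ == "__main__":
    for name, hz in LIU_CLOCKS_HZ.items():
        ratio = conversion_ratio(hz)
        print(f"{name}: multiply by {ratio.numerator}, "
              f"divide by {ratio.denominator}")
```

None of the three ratios is a simple integer division (the T1 ratio reduces to 193/2430), which is why a frequency-synthesizing DPLL is used rather than a plain clock divider.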
[FIGURE 3 OMITTED]
Optionally, the timing card can be used to source the BITS/SSU clock if an external BITS/SSU source with better holdover accuracy is not available. In this case, the timing card DPLL is synchronized to one of the extracted line clocks. Its output is fed to the backplane and to the LIUs via the wideband DPLL.
Timing card redundancy
Timing card redundancy is implemented in one of two ways--parallel redundancy or serial redundancy. Parallel redundancy is shown in Figure 4, while serial redundancy (commonly referred to as "master/slave" timing redundancy) is illustrated in Figure 5.
[FIGURE 4-5 OMITTED]
As seen in Figures 4 and 5, DPLLs on the active and redundant cards drive the active and redundant clocks to the corresponding traces on the backplane. Each DPLL usually drives common clock frequencies such as 8 kHz (DS0), 1.544 MHz (DS1), 2.048 MHz (E1) and 19.44 MHz (SONET/SDH).
The active and redundant clocks on the backplane should have the same frequency and phase. Ideally, the phase difference should be equal to zero. In practice, a phase difference in the range of a few nanoseconds is achievable.
The active and redundant clocks are distributed via the backplane to the line cards. As seen in Figures 4 and 5, the line cards each have a DPLL followed by an analog PLL (APLL). The DPLL is used for hitless switching between the active and redundant clocks and to provide clock continuity for a short period, such as when the active clock unexpectedly disappears before the system detects active reference failure and switches the line card DPLL to lock to the redundant reference.
The APLL is used only for jitter reduction and frequency multiplication. It is possible to have hitless reference switching with an APLL. However, good clock continuity is difficult to achieve because oscillators used on APLLs (usually LC-based) have very low holdover stability relative to DPLLs that use crystal oscillators. Typically, a DPLL has short-term holdover accuracy of 0.01 ppm (parts per million) or better, whereas an APLL has holdover accuracy above 100 ppm.
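The practical weight of these holdover figures shows up when they are turned into accumulated phase error. The sketch below does that arithmetic for the 0.01 ppm and 100 ppm values quoted above; the 1 ms interval between loss of the active clock and the switch to the redundant clock is an assumed value chosen only for illustration.

```python
def phase_drift_ns(holdover_accuracy_ppm, interval_s):
    """Worst-case phase error (ns) accumulated while in holdover.

    A fractional offset of 1 ppm sustained for 1 s accumulates 1 us
    (1000 ns) of phase error, so the result scales linearly with both
    arguments.
    """
    return holdover_accuracy_ppm * interval_s * 1000.0


if __name__ == "__main__":
    DETECTION_TIME_S = 1e-3  # assumption: 1 ms to detect failure and switch
    print("DPLL:", phase_drift_ns(0.01, DETECTION_TIME_S), "ns")
    print("APLL:", phase_drift_ns(100.0, DETECTION_TIME_S), "ns")
```

Under that assumption the DPLL drifts by roughly a hundredth of a nanosecond while the APLL drifts by about 100 ns, four orders of magnitude more.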
Parallel timing redundancy
In this scheme, as illustrated in Figure 4, DPLLs on both timing cards are locked to either an extracted line clock from one of the line cards or the BITS reference. Both DPLLs should be locked to the same input reference and should have identical loop bandwidth (e.g., 0.1 Hz for Telcordia GR-253 CORE). In this case, if the active card does a reference switch from BITS0 to BITS1, the redundant card should simultaneously do the same. Because the DPLLs on the active and redundant timing cards have the same bandwidth and are fed with the same input reference, the outputs should be closely phase aligned regardless of the jitter/wander on the input reference. However, this is only partially true due to intrinsic wander issues. We will look at this later in the article.
Serial (master/slave) redundancy
A serial redundancy timing scheme is implemented by locking the secondary timing card to the output of the primary timing card, as shown in Figure 5. The loop bandwidth of the DPLL on the active timing card should be set in accordance with requirements (for Telcordia GR-253 CORE it is 0.1 Hz). However, the loop bandwidth of the DPLL on the redundant card should be set as wide as possible--at least 10 times wider than that of the DPLL on the active card. The wider bandwidth allows the DPLL to track clock changes at its input much faster, thus keeping the active and redundant clocks closely aligned at all times.
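A rough feel for why the wider loop tracks faster comes from treating the DPLL as a first-order low-pass filter, whose time constant is 1/(2*pi*f) for loop bandwidth f. This is a simplifying assumption (practical DPLLs are usually higher order), and the function names and the 1 Hz redundant-card bandwidth below are illustrative.

```python
import math


def time_constant_s(loop_bandwidth_hz):
    """Time constant of a first-order loop with the given -3 dB bandwidth."""
    return 1.0 / (2.0 * math.pi * loop_bandwidth_hz)


def residual_phase_error_ns(step_ns, loop_bandwidth_hz, elapsed_s):
    """Phase error still uncorrected some time after an input phase step."""
    tau = time_constant_s(loop_bandwidth_hz)
    return step_ns * math.exp(-elapsed_s / tau)


if __name__ == "__main__":
    for label, bw_hz in (("active, 0.1 Hz", 0.1), ("redundant, 1 Hz", 1.0)):
        print(f"{label}: tau = {time_constant_s(bw_hz):.3f} s, "
              f"residual after 1 s of a 10 ns step = "
              f"{residual_phase_error_ns(10.0, bw_hz, 1.0):.3f} ns")
```

In this model a tenfold increase in bandwidth shortens the time constant tenfold, from about 1.6 s to about 0.16 s.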
If it is detected that the clock generated by the active card has failed, the DPLL on the secondary card will go into holdover mode and signal to the board controller. The controller will now promote the secondary card to act as the primary card by selecting the narrowband loop filter on the DPLL and locking the DPLL to the same reference input (if available) that the active card was locked to before it failed. When the failed timing card is replaced, the new card will assume the role of the redundant timing card.
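The promotion sequence just described can be written down as a small state change on the redundant card. The sketch below is only an illustration of the order of steps: the TimingCard fields, the promote_on_failure function and the 1 Hz wide-loop setting are hypothetical and do not correspond to any particular device's register interface.

```python
from dataclasses import dataclass
from typing import Optional

NARROW_BW_HZ = 0.1  # GR-253 CORE figure quoted in the text
WIDE_BW_HZ = 1.0    # assumption: 10x the narrow bandwidth


@dataclass
class TimingCard:
    name: str
    role: str                     # "active" or "redundant"
    bandwidth_hz: float
    mode: str = "locked"          # "locked" or "holdover"
    locked_to: Optional[str] = None


def promote_on_failure(card, previous_ref, ref_available):
    """Promote the redundant card after the active card's clock fails.

    previous_ref is the reference the failed active card was locked to.
    """
    # 1. The input (the active card's clock) is gone: enter holdover.
    card.mode = "holdover"
    card.locked_to = None
    # 2. The controller promotes the card and selects the narrow loop filter.
    card.role = "active"
    card.bandwidth_hz = NARROW_BW_HZ
    # 3. Relock to the failed card's reference if it is still available;
    #    otherwise the card stays in holdover.
    if ref_available:
        card.locked_to = previous_ref
        card.mode = "locked"
    return card


if __name__ == "__main__":
    standby = TimingCard("card-B", "redundant", WIDE_BW_HZ,
                         locked_to="card-A clock")
    print(promote_on_failure(standby, "BITS0", ref_available=True))
```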
In serial timing redundancy, the phase offset between the active and the redundant clocks can be calculated from:
$$D = d_{\mathrm{PLL}} + d_{\mathrm{RxBuffer}} + d_{\mathrm{Mux}} + d_{\mathrm{TxBuffer}}$$
where
$d_{\mathrm{RxBuffer}}$ is a typical propagation delay of the receive clock buffer on the slave card,
$d_{\mathrm{Mux}}$ is a typical propagation delay of the clock multiplexer,
$d_{\mathrm{TxBuffer}}$ is a typical propagation delay of the clock driver on the slave card, and
$d_{\mathrm{PLL}}$ is a typical phase offset between input and the output reference after reference alignment is performed.
Some advanced DPLLs intended for timing card design have the ability to advance the output clock relative to the input with a resolution below 1 nanosecond. This feature can be used to minimize delay D.
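As a worked example of this phase budget, the sketch below sums the four delay terms and then subtracts a programmed output advance. All numeric values are made up for illustration and are not taken from any datasheet; the output_advance_ns parameter is a hypothetical name for the output-advance feature mentioned above.

```python
def phase_offset_ns(d_pll, d_rx_buffer, d_mux, d_tx_buffer,
                    output_advance_ns=0.0):
    """Active-to-redundant clock offset D in ns, less any output advance."""
    return d_pll + d_rx_buffer + d_mux + d_tx_buffer - output_advance_ns


if __name__ == "__main__":
    # Hypothetical delays in nanoseconds.
    delays = dict(d_pll=0.5, d_rx_buffer=1.5, d_mux=1.0, d_tx_buffer=2.0)
    print("uncompensated D:", phase_offset_ns(**delays), "ns")
    print("with 4.5 ns advance:",
          phase_offset_ns(**delays, output_advance_ns=4.5), "ns")
```

With these assumed values the uncompensated offset is 5 ns, and advancing the output by 4.5 ns leaves a residual of 0.5 ns.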
Comparing redundancy schemes
In practice, designers use serial redundancy more often because it has several important advantages.
If the product is in island mode (not locked to the network reference or to the BITS clock), its timing cards must work in a free-run mode. In this mode, the DPLL output frequency will be based on crystal oscillators used as the DPLL master clock. As a result, the active and redundant clocks in the parallel method will drift relative to each other at a rate proportional to the fractional frequency difference between crystal oscillators on the active and redundant cards. However, in the serial redundancy method the active and redundant clocks will always be aligned because the DPLL on the redundant card locks to the clock generated by the free-running DPLL on the active card.
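The drift in the parallel case can be put in numbers. Writing $y_A$ and $y_B$ for the fractional frequency offsets of the two crystal oscillators (symbols introduced here for illustration), the phase difference between the two free-running clocks grows linearly with time. The 1 ppm mismatch used below is an assumed value, not a figure from any specification.

```latex
\Delta\phi(t) = (y_A - y_B)\, t

% Assumed example: y_A - y_B = 1 \times 10^{-6} (1 ppm)
\Delta\phi(1\ \mathrm{s}) = 10^{-6} \times 1\ \mathrm{s} = 1\ \mu\mathrm{s}

% A 19.44 MHz clock has a period of 1 / 19.44\ \mathrm{MHz} \approx 51.44\ \mathrm{ns},
% so the two clocks slip by one full cycle after
t_{\mathrm{slip}} = \frac{51.44\ \mathrm{ns}}{10^{-6}} \approx 51.4\ \mathrm{ms}
```

Under that assumption the few-nanosecond alignment target is exceeded within milliseconds, which is why parallel redundancy cannot keep free-running cards aligned.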
Since DPLLs on the active and redundant timing cards have the same bandwidth in the parallel redundancy method, and because they are fed with the same input reference, one would expect that the outputs would be closely phase-aligned regardless of the jitter/wander on the input reference. However, the active and redundant clocks may drift back and forth relative to one another due to intrinsic wander generated by the DPLL. This intrinsic wander is dependent on the short-term frequency fluctuations of the crystal oscillator and on the bandwidth of the DPLL. When fed with a clean input reference clock, a DPLL can compensate for those short-term fluctuations and provide clean clocks at its output.
However, the DPLL's ability to do so is dependent on its bandwidth. The wider the bandwidth, the better the compensation. Because the DPLLs on the active and redundant cards in the parallel redundancy method have the same narrow bandwidth, they will both have intrinsic wander. Since each card has its own crystal oscillator, the wander generated by the two DPLLs will be uncorrelated. Thus, the active and redundant clocks may drift back and forth relative to each other. The maximum phase difference between them can be more than 10 nanoseconds when the DPLL is set to 0.1 Hz loop bandwidth, even when very stable oscillators such as oven-controlled crystal oscillators (OCXOs) are used. This problem is not present in the serial redundancy mode because the DPLL on the redundant card, due to its wide loop bandwidth, compensates for all frequency fluctuations caused by the crystal oscillator.
Yet, the parallel redundancy scheme is easier to implement because it does not require reconfiguration of the DPLL on the redundant card when the active clock/card fails.
Timing card redundancy is implemented in telecommunications products to prevent data loss and increase network reliability. This article presented the typical timing card architecture and two common ways of implementing timing card redundancy. Although slightly more complicated to implement, serial redundancy has several advantages over parallel redundancy.
Alain Blanchard, Phase-Locked Loops, Wiley, 1976
Synchronous Optical Network (SONET) Transport Systems: Common Generic Criteria, GR-253-CORE, Issue 3, 2000
Clocks for the Synchronized Network: Common Generic Criteria, GR-1244-CORE, Issue 2, 2000
Digital Network Synchronization Plan, GR-436-CORE, Issue 1, Revision 1, 6/1996
Timing characteristics of SDH equipment slave clocks (SEC), ITU-T Recommendation G.813, 1998
Transport Systems Generic Requirements (TSGR): Common Requirements, GR-499-CORE, Issue 2, 1998
About the author: Slobodan Milijevic is a Senior Applications Engineer with Zarlink Semiconductor. He can be reached at email@example.com.