Leveraging useful skew.

Useful skew is one of the most important techniques for meeting timing, but its scope on timing-critical paths is limited. Our idea increases the scope for introducing useful skew on timing-critical paths by exploiting non-critical paths.

What is Clock Skew?

The difference between the clock arrival times at the launch and capture flops is termed clock skew. As shown in Figure 1, the difference in clock latencies between Flop2 and Flop1 is 7.2 ns - 7.0 ns = 200 ps.
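The skew arithmetic can be written out as a minimal sketch. The function name and the sign convention (capture latency minus launch latency) are assumptions for illustration; the latencies are taken from the text, assuming Flop1 is the launch flop at 7.0 ns and Flop2 is the capture flop at 7.2 ns. Values are kept in integer picoseconds to avoid floating-point rounding.

```python
def clock_skew_ps(launch_latency_ps: int, capture_latency_ps: int) -> int:
    """Return capture clock latency minus launch clock latency, in ps.

    The sign convention is an assumption: a positive result means the
    capture flop receives its clock later than the launch flop.
    """
    return capture_latency_ps - launch_latency_ps


# Latencies quoted for Figure 1: 7.0 ns (Flop1) and 7.2 ns (Flop2).
skew = clock_skew_ps(launch_latency_ps=7000, capture_latency_ps=7200)
print(skew)  # 200
```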

[FIGURE 1 OMITTED]

Question: How Does Clock Skew Help in Meeting Timing? Answer: Useful Skew

Let's assume that the path from Flop1 to Flop2 is violating by 250 ps. The following sequence of steps can be used to meet timing with useful skew.

1. Identify the slack value "to" the violating register (Flop2 in Figure 2).

[FIGURE 2 OMITTED]

2. Identify the slack value "from" the violating register (i.e. slack value from Flop2 to Flop3 as shown in Figure 2).

3. If the slack value in step (2) is greater than the violation found in step (1), the clock of the violating register can be pushed (delayed) so that timing is met.

4. Check for further violating registers and repeat to gain timing. Figure 3 shows the situation after pushing the clock. Both timing paths now meet timing.

[FIGURE 3 OMITTED]

a. Timing slack "to" Flop2 has changed to 0.050 ns from -0.250 ns.

b. Timing slack "from" Flop2 has changed to 0.450 ns from 0.750 ns.
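The steps above can be sketched as a small calculation. This is only an illustration under stated assumptions, not a tool flow: the names `SkewResult` and `apply_useful_skew` are hypothetical, the 250 ps violation is treated as a slack of -250 ps, and the 50 ps margin is an assumption chosen so that the result matches the slacks quoted for Figure 3.

```python
from typing import NamedTuple


class SkewResult(NamedTuple):
    push_ps: int        # delay added to the violating register's clock
    slack_to_ps: int    # slack of the path ending at the register
    slack_from_ps: int  # slack of the path starting at the register


def apply_useful_skew(slack_to_ps: int, slack_from_ps: int,
                      margin_ps: int = 0) -> SkewResult:
    """Push the clock of a violating register, limited by downstream slack.

    Delaying the register's clock by `push` adds `push` to the slack "to"
    the register and subtracts `push` from the slack "from" it.
    """
    if slack_to_ps >= 0:
        # Nothing to fix on the incoming path.
        return SkewResult(0, slack_to_ps, slack_from_ps)
    needed = -slack_to_ps + margin_ps       # violation plus assumed margin
    available = max(slack_from_ps, 0)       # cannot borrow more than exists
    push = min(needed, available)
    return SkewResult(push, slack_to_ps + push, slack_from_ps - push)


# Flop2 example: -250 ps "to", 750 ps "from", 50 ps assumed margin.
result = apply_useful_skew(slack_to_ps=-250, slack_from_ps=750, margin_ps=50)
# result.push_ps == 300, result.slack_to_ps == 50, result.slack_from_ps == 450
```

With these inputs the clock is pushed by 300 ps, which leaves 0.050 ns "to" Flop2 and 0.450 ns "from" Flop2.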

Useful Skew Can be More Useful

In Figure 4a (viewable online), the slack "to" the violating register is shown as -0.950 ns. After pushing the clock arrival for Flop2 by 750 ps, we would still not meet timing as shown in Figure 4 (viewable online). The path "to" the violating Flop2 would still violate by 200 ps (after inserting a RED clock buffer with 0.750 ns delay). We can make useful skew more useful by following our three-pronged strategy:

[FIGURE 4 OMITTED]

1. Increase the uncertainty on the path "from" the violating register so that the slack of 0.750 ns becomes more pessimistic. Let's assume the new slack is -1.5 ns. Also decrease the uncertainty on the path "to" the violating register so that it meets timing.

2. Optimize the design after finding all such occurrences.

3. Remove the extra uncertainty after optimization. The slack "from" the violating register would now be 1.1 ns, up from 0.75 ns before the strategy. The slack "to" the violating register returns to -0.950 ns.

Corollary:

Because of the extra uncertainty applied during optimization, the timing path "from" the violating register ends up with more slack. This gives sufficient scope to push the clock at Flop2 further, so the situation in Figure 4 would change.
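To make the gain concrete, the sketch below compares how far the Flop2 clock can be pushed before and after the strategy, using the slack values quoted in the text (-0.950 ns "to" the register, 0.75 ns and then 1.1 ns "from" it). The function name `push_capture_clock` is hypothetical, and the model assumes that a push moves slack one-for-one from the outgoing path to the incoming path.

```python
def push_capture_clock(slack_to_ps: int, slack_from_ps: int):
    """Return (push, new_slack_to, new_slack_from), all in picoseconds.

    The push is the smaller of the violation on the incoming path and the
    positive slack available on the outgoing path.
    """
    violation = max(-slack_to_ps, 0)
    available = max(slack_from_ps, 0)
    push = min(violation, available)
    return push, slack_to_ps + push, slack_from_ps - push


# Before the strategy: only 750 ps can be borrowed, 200 ps still violates.
before = push_capture_clock(slack_to_ps=-950, slack_from_ps=750)
# before == (750, -200, 0)

# After the strategy: 1100 ps is available, so the full 950 ps is covered.
after = push_capture_clock(slack_to_ps=-950, slack_from_ps=1100)
# after == (950, 0, 150)
```

Under this simple model, the extra 350 ps of slack gained on the path "from" Flop2 is what closes the remaining 200 ps violation.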

The idea could be further extended to more register levels. Essentially we can generate more useful skew margin through cascading registers.
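One possible reading of the cascading idea is sketched below: a violation is handed down a chain of registers, with each register's clock pushed just enough to fix its incoming path. This is an assumption about how the extension might work, not the authors' stated procedure. The function name `cascade_useful_skew` and the 400 ps slack for a third path are hypothetical; the first two slacks come from the Figure 4 discussion.

```python
def cascade_useful_skew(path_slacks_ps):
    """Propagate clock pushes along a chain of register-to-register paths.

    path_slacks_ps[i] is the slack of the path ending at register i + 1.
    Pushing that register's clock by p adds p to path i and removes p from
    path i + 1. The last register is not pushed because its outgoing path
    is not modeled here.
    """
    slacks = list(path_slacks_ps)
    pushes = [0] * (len(slacks) - 1)
    for i in range(len(slacks) - 1):
        if slacks[i] < 0:
            push = -slacks[i]          # exactly cancel the violation
            slacks[i] += push
            slacks[i + 1] -= push      # the next path pays for it
            pushes[i] = push
    return pushes, slacks


# -950 ps and 750 ps from the text, plus a hypothetical 400 ps third path.
pushes, slacks = cascade_useful_skew([-950, 750, 400])
# pushes == [950, 200], slacks == [0, 0, 200]
timing_met = all(s >= 0 for s in slacks)  # True for this example
```

In this example the 200 ps that the second path cannot absorb is recovered from the third path, so the chain as a whole meets timing.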