The challenges of testing SATA and SAS; part 1: the physical level.
Similar to the relationship between ATA and ATAPI, Serial Attached SCSI (SAS) is a way of transferring SCSI commands across a SATA bus. Unlike ATAPI, SAS adds further protocol that allows the bus to be used for high-end enterprise storage functions with routing capabilities, replacing Fibre Channel with a lower-cost solution in some applications.
Like networking protocols, SAS and SATA can be viewed at many different layers. These include the signaling, byte, packet, command, transaction, and application layers. Problems can occur at any layer, so being able to traverse them quickly during development and debug can significantly reduce the lab time required to get a product finished.
Those familiar with PATA who don't have time to learn SATA's new technologies can work with SATA using the familiar Task File registers that PATA uses. Although this is possible for testing devices at the Task File and Application level, it won't provide a way to exercise SATA features, and won't allow the debug of problems at the protocol or physical levels. Similarly, SAS can be used at the SCSI Command Descriptor Block (CDB) level, but the CDBs are now a very small portion of the overall protocol.
The right toolset for traversing all layers of SAS or SATA includes tools for analog debug, such as an oscilloscope, and tools for digital debug, such as a protocol analyzer. It can also be useful to have SAS/SATA traffic generators that allow the user to take full control over the protocol and verify that the host or device can handle various exception conditions. The oscilloscope should preferably be a 4+ GHz real-time scope, but a repetitive sampling scope with 4+ GHz bandwidth can also be used. A good protocol analyzer, such as the Data Transit Bus Doctor used in this article, will provide multiple displays for the different layers of the bus, including the Command (Task File) layer, the Packet layer, the Double-Word layer (DWord for short), and the bit layer. A good traffic generator, such as the Data Transit SAS/SATA PacketMaker, will allow the user to manipulate traffic at all the levels from bits through commands.
[FIGURE 1 OMITTED]
The Physical Level: Due to the GHz frequencies used by SAS/SATA, changes in the transmission path that would have been considered negligible in ATA or SCSI are now bus-breaking catastrophes that can put a product on ship-hold. Understanding and measuring the quality of the signals is critical to resolving these problems. This section will discuss Eye Diagrams, Jitter, 8b/10b Encoding, Spread Spectrum Clocking, Common Mode signaling and Out-of-Band handshaking. It will include display snapshots from scopes and protocol analyzers showing good and bad signaling.
[FIGURE 2 OMITTED]
Figure 1 shows a sampling scope display of an eye diagram. An eye template is used by the scope to show the keep-out area of the eye. As long as the signal stays outside the red zone, a SAS/SATA receiver will be able to clock valid data in at the right time.
[FIGURE 3 OMITTED]
Since SAS/SATA receivers detect differential switching, an eye diagram is used to simultaneously view the difference between the + and - signals in a differential pair. There are a variety of ways of taking eye diagrams (see Figure 2). The simplest method is to use a single differential probe. Two drawbacks to this method are that it doesn't show the DC voltage offset that the diff-pair may have, and that differential probes generally lag behind single-ended probes when it comes to bandwidth. The preferred method is to use two single-ended probes feeding two channels of the scope. This allows the use of the highest bandwidth probes, and will also show the instantaneous relationship between the + and - signals. A quicker method is to use one single-ended probe and configure the scope to overlay repetitive traces. The main drawback to this method is that it only shows one side of the pair at a time (+ or -). In practice, this last method is the one used most often due to its simplicity.
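To illustrate what the two-probe method reveals that a single differential probe does not, the following minimal sketch derives both the differential signal and the DC (common-mode) offset from two single-ended traces. The function names and sample voltages are made up for the example and are not taken from the figures.

```python
def differential(plus, minus):
    """Differential voltage seen by the receiver: (+) minus (-), per sample."""
    return [p - m for p, m in zip(plus, minus)]


def common_mode(plus, minus):
    """Common-mode (DC offset) voltage: the average of (+) and (-), per sample."""
    return [(p + m) / 2 for p, m in zip(plus, minus)]


# Hypothetical samples in volts: each side swings 0.125 V around a 0.25 V offset.
plus_trace = [0.375, 0.125, 0.375, 0.375]
minus_trace = [0.125, 0.375, 0.125, 0.125]

diff = differential(plus_trace, minus_trace)   # what a differential probe shows
offset = common_mode(plus_trace, minus_trace)  # visible only with two probes
```

A differential probe effectively delivers only `diff`; the `offset` trace is the information that is lost with that method.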
Figure 3 is a sampling scope display of typical SATA signals as measured at the destination SATA connector. These signals display good rise and fall times, controlled overshoot and undershoot, and an acceptable amount of jitter, as evidenced by the width of each eye opening.
[FIGURE 4 OMITTED]
This scope trace was taken using the repetitive overlay method with one single-ended probe. Notice that the bit pattern usually alternates between 1 (high) and 0 (low), but there are also instances where there are three 1's in a row or three 0's in a row, which results in a higher amplitude than normal. The amplitude shift is bad because when the signal finally does transition, it starts from a different amplitude than normal, which causes a timing difference due to the signal slew rate. Often a technique called pre-emphasis is used by a transmitter to limit the overshoot during consecutive 1's or 0's.
Jitter is often measured in picoseconds (ps), or as a percentage of a Unit Interval (UI). A UI is the time allocated for each bit, which for 1.5Gb/s SATA is one 1.5-billionth of a second, or about 666ps. In the scope display shown, about 85% of each UI is open, giving this system sufficient margin. Jitter fills the other 15% of each UI, and since each UI is about 666ps, the jitter can be specified as about 100ps.
Figure 4 is a sampling scope display of a SATA bus with excessive capacitance. Notice the slower rise and fall times that cause the reduced eye amplitude, the higher overshoot and undershoot, and the resulting additional jitter that causes reduced eye width.
The eye in Figure 4 is open for only about 75% of each UI, which means there is about 166ps of jitter. This amount of jitter will likely cause transmission errors with some receivers.
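The jitter figures quoted for the two eye diagrams follow from simple arithmetic, sketched below. The function names are illustrative, and the eye-open fractions are the approximate readings given in the text rather than measured values.

```python
def unit_interval_ps(bit_rate_bps):
    """Time allocated to one bit, in picoseconds."""
    return 1e12 / bit_rate_bps


def jitter_ps(bit_rate_bps, eye_open_fraction):
    """Jitter taken as the part of each UI that is not open, in picoseconds."""
    return (1.0 - eye_open_fraction) * unit_interval_ps(bit_rate_bps)


ui = unit_interval_ps(1.5e9)            # about 666.7 ps at 1.5 Gb/s
good_jitter = jitter_ps(1.5e9, 0.85)    # eye about 85% open
bad_jitter = jitter_ps(1.5e9, 0.75)     # eye about 75% open

print(f"UI = {ui:.1f} ps, good eye = {good_jitter:.1f} ps, bad eye = {bad_jitter:.1f} ps")
```

The same functions can be used for 3.0Gb/s links by passing `3.0e9`, where each UI is half as long and the same percentage of closure leaves half as many picoseconds of margin.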
[FIGURE 5 OMITTED]
SAS and SATA are much less forgiving of inconsistencies in the transmission path than SCSI or PATA were. Inconsistencies can come in the form of impedance mismatches, stubs, crosstalk, and capacitive loads. For example, using a scope probe with a capacitance as low as 1 pF will visibly change the measured waveform. Protocol analyzers can themselves cause serious problems when they tap into the bus, so most analyzers for 3Gb SAS/SATA don't tap. Instead, they terminate the incoming signal and retransmit a new, clean version of it.
Typically a protocol analyzer will make it obvious to the user if signal integrity is a problem by highlighting errors in red. Signal integrity problems usually show up at the protocol level as disparity errors, coding errors, or CRC errors. Before jumping into the cause of bit-level errors it is important to understand that 10-bit characters are transmitted to represent each 8-bit byte, and disparity describes the ratio of 1's to 0's in each 10-bit character. The 10-bit translation accomplishes two primary purposes:
1) It allows the clock to be embedded within the data by ensuring that normally there are never more than four 1's or 0's in a row.
2) It allows for higher transfer rates by DC balancing the data stream. DC balance is achieved by ensuring that the number of 1's and 0's transmitted is always equal (neutral disparity). If a 10-bit character is transmitted that has positive disparity (more 1's than 0's), then the transmit logic chooses a negative disparity character to represent the next byte. A disparity error is when a 10-bit character of mostly 1's is followed by another character with a majority of 1's, or when a character of mostly 0's is followed by another of mostly 0's.
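The disparity rule described above can be sketched in a few lines. This is a simplified illustration, not a full 8b/10b decoder: it only counts 1's and 0's and does not check whether a character is a valid code. The function names are hypothetical, and the 10-bit patterns in the sample stream were chosen for illustration.

```python
def disparity(char10):
    """Number of 1's minus number of 0's in a 10-bit character (a bit string)."""
    ones = char10.count("1")
    return ones - (len(char10) - ones)


def find_disparity_errors(chars):
    """Return indexes of characters whose imbalance repeats the previous one."""
    errors = []
    last_sign = 0  # 0 means no unbalanced character has been seen yet
    for index, char10 in enumerate(chars):
        d = disparity(char10)
        if d == 0:
            continue  # neutral characters leave the running disparity unchanged
        sign = 1 if d > 0 else -1
        if sign == last_sign:
            errors.append(index)  # mostly-1's after mostly-1's, or 0's after 0's
        last_sign = sign
    return errors


# Illustrative stream: positive, neutral, negative, then negative again (an error).
stream = ["0011111010", "1010101010", "1100000101", "1100000101"]
bad_indexes = find_disparity_errors(stream)
```

Tracking only the sign of the last unbalanced character is enough here because neutral characters do not change the balance of the stream.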
Coding Errors occur when the 10-bit code isn't valid; for example, codes that have 5 or more consecutive 1's or 0's are invalid (except for the K-character, which is a special 10-bit code used for bit-aligning the receiver). CRC Errors occur when the 32-bit CRC value at the end of each frame is incorrect.
The Protocol Analyzer display in Figure 5 shows a state listing of the bi-directional data, with a coding error (too many 1's in a row) in one of the characters on the host side. Normally, the analyzer displays the data in an 8-bit format, but when an error occurs the 10-bit data is shown allowing the user to see the bit-level cause of the error. Since SAS/SATA data is organized into 4-byte DWords, the analyzer shows a DWord on each line, except when an error occurs and then it uses four lines--one for each character in the DWord.
[FIGURE 6 OMITTED]
To minimize emitted radiation, SATA uses Spread Spectrum Clocking (SSC) to spread the radiation over a broad frequency range. Typically, devices employing SSC will linearly vary the transmission frequency between 1.5Gb/s and about 1.49Gb/s (or 3.0Gb/s and 2.98Gb/s). The change occurs slowly enough (a modulation rate of about 30kHz) that the data recovery PLL can track the frequency change and stay locked on the varying frequency. Although SATA devices aren't required to transmit SSC, they are required to be able to receive it. SAS devices don't use SSC at all (unless in SATA mode), and thus SAS integrators may have to work harder to get agency approvals on emissions.
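The frequency sweep can be pictured with a small model. The sketch below assumes a triangular profile, which is one way to realize the linear variation described above. It uses the approximate 1.5Gb/s to 1.49Gb/s range and 30kHz rate from the text. The function name and profile shape are assumptions for illustration, not values taken from the specification.

```python
def ssc_bit_rate(t_seconds, nominal=1.5e9, lowest=1.49e9, mod_hz=30e3):
    """Bit rate at time t for an assumed triangular down-spread profile."""
    phase = (t_seconds * mod_hz) % 1.0  # position within one modulation cycle
    # Ramp linearly from 0 to 1 over the first half cycle, then back to 0.
    ramp = 2 * phase if phase < 0.5 else 2 * (1 - phase)
    return nominal - ramp * (nominal - lowest)


period = 1 / 30e3  # one modulation cycle, roughly 33 microseconds
samples = [ssc_bit_rate(i * period / 8) for i in range(9)]
for i, rate in enumerate(samples):
    print(f"t = {i}/8 cycle: {rate / 1e9:.4f} Gb/s")
```

One modulation cycle spans tens of thousands of bit times, which is why a data recovery PLL can follow the drift.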
Both SAS and SATA use Common-Mode signaling during initialization and as a wake-up signal when asleep. Common-Mode signals can be detected by hosts or devices whose receivers are in low-power mode, unlike the high-speed differential signals, which require power-hungry termination and biasing. Common-Mode signaling requires the transmitter to drive both the + and - signals to the same 250mV level, thus squelching the differential signal. Receivers use squelch detect circuits to recognize this condition, or the absence of this condition. Both hosts and devices send precisely timed pulses of Common-Mode signals, known as Out-Of-Band (OOB) handshaking, in order to communicate the Reset (COMRESET), Initialize (COMINIT), and Wakeup (COMWAKE) functions. In addition to these OOB functions, SAS equipment uses a special OOB (COMSAS) to indicate that it is SAS and not SATA.
The analyzer display in Figure 6 shows the timing of the Common-Mode pulses interleaved with ALIGN primitives during a COMRESET OOB. Primitives are special combinations of four 10-bit characters, usually beginning with the K character. The timing relationship between the ALIGNs and the Common Mode is visible on the "OOB or Error" signal of the Bus Doctor Timing Waveform display. Notice that the COMRESET OOB pulses are 425ns apart (640 UI), followed by the COMSAS OOB pulses, which are 1065ns apart (1600 UI).
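The UI counts and nanosecond spacings quoted above are related by the 1.5Gb/s unit interval, as the sketch below shows. The classifier and its 20ns tolerance are hypothetical and only illustrate the idea of telling OOB sequences apart by pulse spacing. Real receivers use detection windows defined by the specifications, which are not modelled here.

```python
UI_NS = 1e9 / 1.5e9  # one unit interval at 1.5 Gb/s, in nanoseconds

# Pulse-to-pulse spacings quoted in the text, in UI.
OOB_SPACING_UI = {"COMRESET": 640, "COMSAS": 1600}


def ui_to_ns(ui_count):
    """Convert a number of unit intervals to nanoseconds at 1.5 Gb/s."""
    return ui_count * UI_NS


def classify_oob(spacing_ns, tolerance_ns=20.0):
    """Name the OOB sequence whose spacing matches, or None (tolerance is assumed)."""
    for name, ui_count in OOB_SPACING_UI.items():
        if abs(spacing_ns - ui_to_ns(ui_count)) <= tolerance_ns:
            return name
    return None


for name, ui_count in OOB_SPACING_UI.items():
    print(f"{name}: {ui_count} UI = {ui_to_ns(ui_count):.1f} ns")
```

The computed values, about 426.7ns and 1066.7ns, agree with the roughly 425ns and 1065ns read from the analyzer display.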
Data Scrambling is used by SAS/SATA to reduce radiated emissions. The scrambling algorithm ensures that even when a long string of the same 8-bit data value is sent, such as a long string of hex 00's, the transmitted data will be constantly changing, thus spreading the emissions over a wider frequency range.
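The principle can be sketched with a small linear feedback shift register (LFSR) whose output is XORed with the data. The polynomial (tap mask 0xB400), seed and bit ordering below are illustrative assumptions chosen to keep the example short. They are not the scrambler defined by the SAS or SATA specifications.

```python
def lfsr_keystream(n_bytes, seed=0xFFFF, taps=0xB400):
    """Produce n_bytes of pseudo-random bytes from a 16-bit Galois LFSR."""
    state = seed
    out = bytearray()
    for _ in range(n_bytes):
        byte = 0
        for bit in range(8):
            lsb = state & 1
            byte |= lsb << bit   # collect the output bits, LSB first
            state >>= 1
            if lsb:
                state ^= taps    # apply feedback when a 1 is shifted out
        out.append(byte)
    return bytes(out)


def scramble(data, seed=0xFFFF):
    """XOR data with the keystream; applying it twice restores the data."""
    key = lfsr_keystream(len(data), seed)
    return bytes(d ^ k for d, k in zip(data, key))


zeros = bytes(16)                 # a long string of hex 00's
scrambled = scramble(zeros)       # no longer a constant pattern
restored = scramble(scrambled)    # the receiver applies the same operation

print(scrambled.hex())
```

Because the operation is a plain XOR with a sequence both ends can regenerate, the receiver recovers the original data by running the same function with the same seed.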
Part 2 of this three-part article, which will cover SAS/SATA troubleshooting at the higher protocol layers, will appear in the February issue of CTR.
Dale Smith is CTO and founder of Data Transit Corporation (San Jose, CA)
|Title Annotation:||Connectivity; Serial Advanced Technology Attachment and Serial Attached SCSI|
|Publication:||Computer Technology Review|
|Date:||Jan 1, 2004|