As developers create ever-faster processors,
multiple-processor systems, and more demanding applications,
traditional bus technologies come under significant strain.
The result is an urgent need for a better communications link
between interconnected devices and a faster transport
mechanism between the processor and main memory. Keeping AMD's
high-performance 64-bit Opteron processor, for example, fed
with data and instructions is a task that can quickly saturate
conventional I/O busses based on older PCI technology. Add in
the bandwidth demands of a new generation of
multi-processor-capable applications, and you have a real
bottleneck.
To overcome the problem of bus saturation, AMD designed the
Opteron chip not only to be a fast processor, but also to
support the efficient transport of data between interconnected
processors, supporting chips, and I/O devices. To reach this
goal, AMD, working with a consortium of industry vendors,
created the HyperTransport technology I/O bus, which
transports data at speeds up to 6.4 GB/s. Figure 1 shows some
of the basic statistics about HyperTransport; see
http://www.hypertransport.org/ for details on the supporting
organization.
The HyperTransport bus has benefits that go beyond the
obvious. It uses a "packetized" design, which means that
addresses, data, and commands are sent along the same wires,
allowing for a much narrower link. PCI and its derivatives, by
contrast, are wider, slower busses that require dedicated pins
and traces for data, addresses, and sideband information.
Although the HyperTransport bus may require more cycles to
move a given amount of data, the sheer speed of the
HyperTransport connection ensures a far higher effective data
transfer rate (6.4 GB/s versus 1 GB/s), leaving the poor old
PCI-X bus in the dust.
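The trade-off between link width and clock speed can be checked
with some back-of-the-envelope arithmetic. The sketch below is
illustrative only. The link parameters are assumptions based on
commonly quoted configurations rather than figures stated in
this article: a HyperTransport link 16 bits wide in each
direction, clocked at 800 MHz with two transfers per clock, and
a 64-bit PCI-X bus at 133 MHz. The function names are
hypothetical.

```python
# Back-of-the-envelope peak-bandwidth arithmetic (illustrative only).
# ASSUMPTION: the widths and clock rates used below are commonly quoted
# configurations, not values taken from this article.

def peak_bandwidth_gbs(width_bits, clock_mhz, transfers_per_clock=1, directions=1):
    """Return theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = width_bits / 8
    transfers_per_second = clock_mhz * 1e6 * transfers_per_clock
    return bytes_per_transfer * transfers_per_second * directions / 1e9


def transfers_needed(payload_bytes, width_bits):
    """Return how many bus transfers it takes to move payload_bytes."""
    bytes_per_transfer = width_bits // 8
    return -(-payload_bytes // bytes_per_transfer)  # ceiling division


# Assumed HyperTransport link: 16 bits each way, 800 MHz clock,
# two transfers per clock, counting both directions together.
ht = peak_bandwidth_gbs(16, 800, transfers_per_clock=2, directions=2)

# Assumed PCI-X bus: 64 bits wide, 133 MHz, one shared direction.
pcix = peak_bandwidth_gbs(64, 133)

print(f"HyperTransport peak: {ht:.2f} GB/s")
print(f"PCI-X peak:          {pcix:.2f} GB/s")
print("Transfers to move 64 bytes:",
      transfers_needed(64, 16), "(16-bit link) vs",
      transfers_needed(64, 64), "(64-bit bus)")
```

Under these assumptions the narrow link needs four times as many
transfers to move the same 64 bytes, yet its much higher
transfer rate more than makes up the difference. Note that the
6.4 GB/s figure counts traffic in both directions at once.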
The simplicity of HyperTransport technology enables
hardware designers to build less complex systems, as routing a
narrow bus is far easier than routing a wider bus. Narrower
busses also reduce the need to add layers and additional costs
to system board designs, so lower-cost four-layer circuit
boards can be used. Although all this may be interesting from
a hardware architect's point of view, what this really means
to you, the software developer, is that systems using
HyperTransport technology offer the outstanding performance at
low cost that you've been looking for.
Another problem that the Opteron processor addresses is the
relatively slow connection between the processor and
supporting circuitry. This connection, commonly called the
"front-side bus," is the transport mechanism for all data
traveling between the processor and main memory, graphics
card, and all types of I/O devices. The front-side bus
transfer rate on prior-generation AMD processors is on the
order of 2.1 GB/s: fast, but still capable of being saturated
by the demands of a server configured with multiple
processors, high-speed network cards, and fast storage
devices. So the Opteron processor replaces the front-side bus
with a HyperTransport connection that dramatically extends
communication bandwidth, up to 6.4 GB/s.
Even with the Opteron processor's forward-looking design, the
past has not been forgotten. Just as 64-bit Opteron chips can
handle prior-generation 32-bit x86 applications with
confidence, AMD's implementation of HyperTransport cleanly
supports existing I/O technologies such as PCI-X, AGP-8x, USB
2.0, 10/100 Ethernet, and EIDE/ATA.
With HyperTransport, system architects are also freed from
the design constraints imposed by traditional bus
architectures, specifically the inherent limitations of
the popular Northbridge/Southbridge design. Using
HyperTransport technology as a building block, one can easily
construct a daisy-chained interconnect between system
components. With this approach, illustrated by Figure 2, a
server could support as many high-speed interfaces (such as
Fibre Channel, IEEE-1394 FireWire, Gigabit Ethernet or
InfiniBand) as desired.
Faster Applications
Applications that are optimized for multi-processor
environments are often constructed using a message-passing
architecture. Keeping large numbers of application threads in
sync can result in high levels of bus traffic as messages are
sent back and forth between processors. Prior to the advent of
the Opteron processor's HyperTransport architecture, these
messages needed to compete for attention with other bus
traffic.
And when you add in SMP, everything just gets better: With
a multi-processor Opteron system, message-passing applications
can achieve their true potential as HyperTransport technology
provides a high-speed, chip-to-chip interconnect that
significantly reduces the I/O performance bottleneck, with
ample performance headroom for future growth. Figure 3 shows
one possible way of architecting an SMP system using
HyperTransport.
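For readers unfamiliar with the message-passing style mentioned
above, here is a minimal sketch of the pattern. It is an
illustration under stated assumptions, not code from any
Opteron-specific toolkit: the worker and run names are
hypothetical, the "work" is simply squaring integers, and
Python threads with standard-library queues stand in for
threads running on separate processors. It shows the shape of
the traffic (work messages out, result messages back), not the
performance of any particular interconnect.

```python
import queue
import threading


def worker(worker_id, inbox, outbox):
    """Process messages from inbox until a None sentinel arrives."""
    while True:
        msg = inbox.get()
        if msg is None:  # sentinel: no more work for this worker
            break
        # Reply with a result message; squaring is a stand-in for real work.
        outbox.put((worker_id, msg, msg * msg))


def run(num_workers=4, items=range(8)):
    """Distribute items round-robin to workers and collect their replies."""
    inboxes = [queue.Queue() for _ in range(num_workers)]
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(i, inboxes[i], results))
        for i in range(num_workers)
    ]
    for t in threads:
        t.start()

    # Send each work item as a message to one worker's inbox.
    for n, item in enumerate(items):
        inboxes[n % num_workers].put(item)

    # Tell every worker to shut down, then wait for all of them.
    for inbox in inboxes:
        inbox.put(None)
    for t in threads:
        t.join()

    # All workers have finished, so the results queue is complete.
    squares = {}
    while not results.empty():
        _, item, square = results.get()
        squares[item] = square
    return squares


if __name__ == "__main__":
    print(run())
```

Every put and get here corresponds to a message that, on a real
multi-processor machine, has to cross whatever interconnect
links the processors, which is why the bandwidth and contention
on that link matter so much to this style of application.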
Application performance is further enhanced by the fact that
the Opteron processor has a direct connection to main
memory: no bus needed. The integration of a memory controller
into the processor core significantly reduces memory latency
because it eliminates the need for memory transactions to
traverse the traditional memory access path through the
"Northbridge" chip. The effect of the reduction in memory
latency, coupled with the additional increase in memory
bandwidth available directly to the processor, cannot be
overstated, as it tremendously benefits system performance
across all application segments.
With HyperTransport technology it is now possible to build
servers and technical workstations that are faster, cheaper,
and simpler than ever before. Not only will your typical
application benefit from the Opteron processor's
high-performance I/O bus and direct-to-memory interface,
you'll also find that the HyperTransport technology-based
processor interconnect can yield significant improvements in
the performance of multi-processor-capable applications. Want
to learn more about how it works? See "HyperTransport
Technology I/O Link: A High-Bandwidth I/O Architecture" (PDF).