TMS320C3x Features
The TMS320C3x generation of 32-bit digital signal processors
(DSPs) integrates both system control and math-intensive functions on
a single controller. This system integration allows fast and easy
data movement and high-speed numeric processing performance.
Extensive internal busing and a powerful DSP instruction set provide
the devices with the speed and flexibility to execute at up to 60
MFLOPS (million floating-point operations per second). The devices
also feature a high degree of on-chip parallelism that allows users
to perform up to 11 operations in a single instruction cycle -
including, for example, two data accesses, a multiply, an ALU
operation and a DMA transfer.
'C3x Key Specifications
- Performance up to 60 MFLOPS
- Highly efficient C language engine
- Large address space: 16 Mwords x 32 bits
- Fast memory management with on-chip DMA
- Industry-exclusive 3V versions available on some
devices
'C3x Key Applications
- Digital audio
- 3-D graphics
- Laser printers, copiers, scanners
- Bar-code scanners
- Video conferencing
- Industrial automation and robotics
- Voice/facsimile mail
- Servo and motor control
- Networking
Features By Device
TMS320C3x CPU
The TMS320C3x CPU has an independent multiplier and accumulator
and achieves up to 60 MFLOPS. Results are stored in any one of
eight extended-precision registers. These are 40-bit registers
that store values with a 32-bit mantissa and an 8-bit exponent.
These registers can serve as both the source and destination for
any arithmetic operation. The extended-precision registers are
an extremely valuable resource for programming in assembly or
C. These registers allow you to maintain intermediate results
without storing data in memory. This results in higher-performance
assembly code and a more efficient C compiler.
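As a rough illustration of how a 40-bit extended-precision value with a 32-bit mantissa and an 8-bit exponent might be interpreted, the sketch below decodes one register word. The bit layout (exponent in bits 39-32, sign in bit 31, fraction in bits 30-0, an implied leading bit, and exponent -128 reserved for zero) is an assumption based on the format described in TI's user's guide and should be checked against it. The function name decode_extended is hypothetical.

```python
def decode_extended(word):
    """Decode a 40-bit extended-precision word into a Python float.

    Assumed layout (not stated in this overview):
      bits 39-32  two's-complement exponent
      bit  31     sign
      bits 30-0   fraction, with an implied most significant bit
    """
    word &= (1 << 40) - 1                 # keep only the 40 register bits
    exponent = (word >> 32) & 0xFF
    if exponent >= 0x80:                  # sign-extend the 8-bit exponent
        exponent -= 0x100
    sign = (word >> 31) & 1
    fraction = word & 0x7FFFFFFF

    if exponent == -128:                  # assumed reserved encoding for zero
        return 0.0

    # Assumed implied bit: 01.f for positive values, 10.f (two's complement) for negative.
    mantissa = (-2.0 if sign else 1.0) + fraction / 2**31
    return mantissa * 2.0**exponent
```

Under these assumptions, an all-zero word decodes to 1.0 rather than 0.0, because the implied leading bit is always present.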
To sustain 60 MFLOPS, the CPU has two independent auxiliary register
arithmetic units (ARAUs). The two ARAUs generate 24-bit addresses
from the values held in the eight auxiliary registers. The ARAUs
can perform any of these functions:
- Pre- or post-increment or decrement
- Index offset for increment and decrement values other than
1
- Circular addressing to support circular buffers
- Bit-reversed addressing for FFTs
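The addressing functions listed above can be modeled in software. The following is a minimal sketch, not a cycle-accurate model of the ARAUs: circular addressing is simplified to a modulo on an offset within the buffer, and bit-reversed addressing is shown as the index permutation an FFT needs. All function names are hypothetical.

```python
def post_increment(addr, step=1, addr_bits=24):
    """Return (address used for this access, updated register value)."""
    mask = (1 << addr_bits) - 1           # 24-bit addresses wrap at 16M words
    return addr, (addr + step) & mask


def circular_next(base, offset, step, block_size):
    """Advance an offset inside a circular buffer of block_size words.

    Simplified model: the hardware's alignment rules are not represented.
    """
    new_offset = (offset + step) % block_size
    return base + new_offset, new_offset


def bit_reverse(index, bits):
    """Reverse the low `bits` bits of index."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def bit_reversed_order(n):
    """Access order for an n-point FFT, where n is a power of two."""
    bits = n.bit_length() - 1
    return [bit_reverse(i, bits) for i in range(n)]
```

For an 8-point FFT, bit_reversed_order(8) gives the order 0, 4, 2, 6, 1, 5, 3, 7. Doing this in the address generator means the reordering costs no extra instructions.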
Features of the TMS320C3x CPU include:
- 60-MFLOPS CPU
- Register-based CPU
- 32-/40-bit floating-point/integer multiplier
- 32-/40-bit floating-point/integer ALU
- 32-bit barrel shifter
- Eight 40-bit extended-precision registers
- Two address generators
- Two index registers
- Eight indirect-address registers
TMS320C3x Memory
To realize the full performance of the 'C3x CPU, it is important
to have a bus and memory architecture that can keep pace. The
'C3x fetches up to four words each cycle. These consist of a program
opcode, two CPU data operands, and a DMA data transfer. The internal
buses can transfer all four words in parallel, relying on seven
memory sources for data.
The 'C3x uses seven internal buses to access on-chip resources:
- Program Address/Data: The CPU uses these buses to maintain
instruction fetches every cycle.
- Data Address/Data: In any cycle, the CPU can fetch two data
operands, because it has two data address buses and one data bus
that can be accessed twice in a single cycle.
- DMA Address/Data: The DMA uses these buses to perform DMA
transfers in parallel with CPU operation.
With the internal buses in place to feed the DMA and CPU, the
'C3x devices can use both internal and external data and program
memory. The 'C30 and 'C31 have two 1K x 32-bit blocks of dual-access
RAM, while the 'C32 has two 256 x 32-bit blocks of on-chip RAM. This
memory provides up to four words of program or data in a single
cycle. All 'C3x devices feature an on-chip cache to boost system
performance. The primary bus for each device has 16M words of
address reach. The 'C30 features an expansion bus that has an
8K-word address reach, which is often used to interface to peripherals.
The 'C32 offers the ability to access 8-/16-/32-bit data stored
in 8-/16-/32-bit wide external memory giving the flexibility of
nine memory interface options. This feature can significantly
reduce total system cost. Additionally, the 'C32 memory
interface allows for storage of the 32-bit instruction word in
either 16- or 32-bit-wide external memory.
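The nine 'C32 interface options are the combinations of three data sizes with three memory widths. The sketch below enumerates them and estimates how many external accesses one datum needs. The access count is an assumption (a datum wider than the memory is assumed to take several accesses), not a figure from this overview, and the names are hypothetical.

```python
from itertools import product

DATA_SIZES = (8, 16, 32)      # bits per datum
MEMORY_WIDTHS = (8, 16, 32)   # bits per external memory word


def accesses_per_datum(data_bits, memory_bits):
    """Assumed number of external accesses needed to move one datum."""
    return max(1, -(-data_bits // memory_bits))   # ceiling division


def interface_options():
    """List every (data size, memory width, assumed accesses) combination."""
    return [
        (data, width, accesses_per_datum(data, width))
        for data, width in product(DATA_SIZES, MEMORY_WIDTHS)
    ]
```

Under this assumption, 32-bit data in 8-bit-wide memory costs four accesses per word, which is the trade-off against the lower cost of narrow memory.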
TMS320C3x DMA Controller
The DMA controller transfers data between memory resources. The
serial ports and timers on the 'C3x are memory-mapped, allowing
DMA transfers to and from these peripherals. To perform a transfer,
the DMA reads a memory location pointed to by the source address
register and then writes to the memory location pointed to by
the destination address register. The source and destination addresses
are incremented or decremented after each transfer, depending
on the value of the global control register. The DMA controller
performs continuous transfers over the DMA bus until the value
in the transfer counter register reaches 0, at which point a programmable
interrupt is sent to the CPU.
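The transfer sequence described above can be sketched as a small software model. This is illustrative only: the class and attribute names are hypothetical, the step values stand in for the increment/decrement bits of the global control register, and memory is modeled as a plain list of words.

```python
class DmaChannel:
    """Toy model of one DMA block transfer (not register-accurate)."""

    def __init__(self, memory, source, destination, count,
                 src_step=1, dst_step=1):
        self.memory = memory
        self.source = source              # source address register
        self.destination = destination    # destination address register
        self.counter = count              # transfer counter register
        self.src_step = src_step          # +1 increment, -1 decrement, 0 fixed
        self.dst_step = dst_step
        self.interrupt_pending = False

    def step(self):
        """Move one word; return True if this word completed the block."""
        if self.counter == 0:
            return False
        self.memory[self.destination] = self.memory[self.source]
        self.source += self.src_step
        self.destination += self.dst_step
        self.counter -= 1
        if self.counter == 0:
            self.interrupt_pending = True  # stands in for the CPU interrupt
        return self.counter == 0

    def run(self):
        """Transfer until the counter reaches 0."""
        while self.counter > 0:
            self.step()
```

A step of 0 models a fixed address, which is how a memory-mapped peripheral such as a serial port would be read or written in this model.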
For example, an application might use the DMA to transfer 512
words from slow external memory to the on-chip RAM. At the completion
of the transfer, an interrupt is sent to the CPU to process and
output results while the DMA transfers a new set of 512 words
to on-chip RAM. By off-loading data input, the DMA controller
allows sustained CPU performance for arithmetic calculations.
In this case, the CPU always has zero-wait-state access to data,
even though the external memory requires one or more wait states.
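The 512-word example above is a double-buffering (ping-pong) scheme. The sketch below shows the control flow only, in sequential Python; on the device the DMA fill and the CPU processing would run in parallel. The function name process_stream and its parameters are hypothetical.

```python
BLOCK_WORDS = 512


def process_stream(external, process, block_words=BLOCK_WORDS):
    """Process `external` block by block using two on-chip buffers.

    `process` stands in for the CPU work done on each completed block.
    Any trailing partial block is ignored in this sketch.
    """
    buffers = [[0] * block_words, [0] * block_words]
    results = []
    n_blocks = len(external) // block_words
    if n_blocks == 0:
        return results

    # The DMA preloads the first block before the CPU starts.
    buffers[0][:] = external[0:block_words]

    for block in range(n_blocks):
        ready = block % 2          # buffer the CPU works on
        filling = 1 - ready        # buffer the DMA fills next
        nxt = block + 1
        if nxt < n_blocks:
            # On hardware this copy overlaps with the CPU work below.
            start = nxt * block_words
            buffers[filling][:] = external[start:start + block_words]
        results.append(process(buffers[ready]))
    return results
```

Because the CPU only ever touches the buffer that has already been filled, it never waits on the slow external memory in this model.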
The 'C32 offers the programmer the flexibility of designating
priority on the bus. There are three options:
1) The CPU has priority over the DMA at all times (the only mode
on the 'C30 and 'C31),
2) The DMA controller has priority over the CPU, or
3) The CPU and DMA share a rotating priority, with the CPU having
first access
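The three priority options can be sketched as a simple arbiter that decides who gets the bus when both the CPU and the DMA request it in the same cycle. This is a behavioral sketch with hypothetical names. In particular, rotating priority is assumed to alternate strictly on each contested cycle, which may not match the hardware exactly.

```python
CPU_PRIORITY = "cpu"
DMA_PRIORITY = "dma"
ROTATING = "rotating"


class BusArbiter:
    """Toy arbiter for CPU/DMA bus conflicts."""

    def __init__(self, mode=CPU_PRIORITY):
        if mode not in (CPU_PRIORITY, DMA_PRIORITY, ROTATING):
            raise ValueError("unknown priority mode: %r" % (mode,))
        self.mode = mode
        self.next_winner = "cpu"   # the CPU has first access when rotating

    def grant(self, cpu_request, dma_request):
        """Return "cpu", "dma", or None for one cycle."""
        if not cpu_request and not dma_request:
            return None
        if cpu_request != dma_request:
            # No conflict: the single requester gets the bus.
            return "cpu" if cpu_request else "dma"
        if self.mode == CPU_PRIORITY:
            return "cpu"
        if self.mode == DMA_PRIORITY:
            return "dma"
        # Rotating (assumed): alternate the winner on each contested cycle.
        winner = self.next_winner
        self.next_winner = "dma" if winner == "cpu" else "cpu"
        return winner
```

With mode set to CPU_PRIORITY, the arbiter reproduces the behavior listed as option 1.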
Features of the DMA controller include:
- Increased CPU-sustained performance by virtually eliminating
CPU I/O
- Memory-to-memory transfers
- Two-channel configurable priority ('C32 only)
- Programmable increment or decrement of addresses