




1600 MIPS DSP
New VelociTI architecture key for parallel performance
TI’s powerful new TMS320C6x DSP marks the first time an off-the-shelf DSP has used an advanced Very Long Instruction Word (VLIW) architecture to achieve high performance through increased instruction-level parallelism. The new architecture, called VelociTI, consists of multiple execution units running in parallel to execute multiple instructions in a single clock cycle. This level of parallelism is the key to extremely high performance at extremely low cost, taking the ’C6x DSPs well beyond the performance capabilities of traditional superscalar designs.

VelociTI’s advanced features include instruction packing, 100 percent conditional instructions, and pre-fetched branching, all of which eliminate the problems traditionally associated with earlier VLIW machines. Instruction packing, for example, fetches eight instructions per cycle and executes from one to eight of them per cycle, which reduces code size, program fetches, and power consumption. Also, the architectural streamlining and compiler intelligence that implement instruction scheduling at compile time allow the ’C6201 to be fabricated using only 550,000 logic transistors. In contrast, Intel’s Pentium(TM) requires about five million logic transistors.

As a highly deterministic architecture, VelociTI has very few restrictions on how or when instructions are fetched, executed, or stored. It is this architectural flexibility that enables unprecedented levels of efficiency and ease of use.

"The beauty of this advanced VLIW architecture is really its elegant simplicity," said Ray Simar, TI’s ’C6x chief architect and program manager. "It moves complexity from the hardware to the compiler, allowing for a simpler, easier-to-program, faster processor at a lower cost." The ’C6x compiler strings together large groups of independent operations into very long instruction words in a way that uses all the on-chip function units efficiently during each instruction cycle.
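
The instruction-packing idea described above (eight instructions fetched per cycle, one to eight executed per cycle) can be pictured with a small Python model. This is only a sketch under stated assumptions: the tuple layout, the "runs in parallel with the next instruction" flag, the mnemonics, and the function names are invented for illustration and do not describe the actual ’C6x instruction encoding.

```python
# Illustrative model only. The packet layout, the parallel flag, and the
# mnemonics are assumptions made for this sketch, not the real encoding.
from typing import List, Tuple

FETCH_PACKET_SIZE = 8  # from the article: eight instructions fetched per cycle

# (mnemonic, runs_in_parallel_with_next_instruction)
Instruction = Tuple[str, bool]


def split_into_execute_packets(fetch_packet: List[Instruction]) -> List[List[str]]:
    """Group one fetch packet into execute packets of one to eight instructions."""
    if len(fetch_packet) != FETCH_PACKET_SIZE:
        raise ValueError("a fetch packet holds exactly eight instructions")
    packets: List[List[str]] = []
    current: List[str] = []
    for mnemonic, parallel_with_next in fetch_packet:
        current.append(mnemonic)
        if not parallel_with_next:
            # This instruction closes the current execute packet.
            packets.append(current)
            current = []
    if current:
        # The last slot had its flag set; the packet still ends here.
        packets.append(current)
    return packets


def demo_fetch_packet() -> List[Instruction]:
    """A made-up fetch packet: 2 + 1 + 3 + 2 instructions."""
    return [
        ("LOAD a", True), ("LOAD b", False),
        ("MUL", False),
        ("ADD", True), ("SUB", True), ("SHIFT", False),
        ("STORE x", True), ("STORE y", False),
    ]
```

In this model the demo packet splits into four execute packets of two, one, three, and two instructions. Because partially parallel code does not have to be padded out to eight slots per cycle, the same eight fetched instructions serve several cycles, which is the source of the code-size and fetch savings the article mentions.
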
The compiler resolves data dependencies by reordering (scheduling) instructions so that only data-independent instructions execute simultaneously. The net result of VelociTI and its compiler-centric focus is that programmers using ’C6x DSPs can take a big step up from the hardware and its complexities and focus on application code while still extracting maximum performance. VelociTI-based ’C6x product development becomes a far more software-oriented set of tasks than ever before, resulting in significant development-time reductions.
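
The compile-time scheduling described above, where only data-independent operations share an instruction word, can be sketched as a simple greedy scheduler. This is a toy illustration, not TI's compiler algorithm: it assumes every operation finishes in one cycle and that any operation can use any of the eight slots, and the operation names and the `schedule` function are hypothetical.

```python
# Toy scheduler. Assumptions: single-cycle operations, eight interchangeable
# slots per instruction word. Not a description of the actual 'C6x compiler.
from typing import Dict, List, Set

NUM_SLOTS = 8  # assumed: one slot per instruction in an eight-wide word


def schedule(deps: Dict[str, Set[str]], width: int = NUM_SLOTS) -> List[List[str]]:
    """Pack operations into words so each word holds only independent ops.

    deps maps each operation name to the set of operations whose results
    it needs. An operation is placed only after all of those have been
    placed in an earlier word.
    """
    remaining = dict(deps)
    done: Set[str] = set()
    words: List[List[str]] = []
    while remaining:
        # Operations whose inputs were all produced by earlier words.
        ready = sorted(op for op, needs in remaining.items() if needs <= done)
        if not ready:
            raise ValueError("cyclic or unresolved dependency")
        word = ready[:width]
        words.append(word)
        done.update(word)
        for op in word:
            del remaining[op]
    return words


# Example: y = (a * b) + (c * d)
EXAMPLE_DEPS: Dict[str, Set[str]] = {
    "ld_a": set(), "ld_b": set(), "ld_c": set(), "ld_d": set(),
    "mul_ab": {"ld_a", "ld_b"},
    "mul_cd": {"ld_c", "ld_d"},
    "add": {"mul_ab", "mul_cd"},
    "st_y": {"add"},
}
```

Calling `schedule(EXAMPLE_DEPS)` places the four loads in the first word, the two multiplies in the second, and the add and the store in words of their own. Eight operations complete in four cycles instead of eight, and the hardware never has to check dependencies at run time because the ordering was fixed when the code was compiled.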




