Sunday, Sept. 25, 2005
Chair: Thanh Tran, Texas Instruments
MORNING SESSIONS:
(select one)
Track A
8:00 a.m. – 12:00 noon
Models and Tools for Dynamic Reconfiguration of FPGAs
A. Donlin, Xilinx Research Labs, Jürgen Becker, Michael Hübner, University of Karlsruhe
FPGA dynamic reconfiguration is the tantalizing prospect of modifying one part of an FPGA logic design while the remainder of the design continues to operate seamlessly. However, designing and, importantly, verifying dynamically reconfigurable systems is non-trivial. This tutorial presentation addresses the design and implementation of reconfigurable FPGA systems in two phases. In the first phase, we describe constructs and strategies that can be used to model dynamically reconfigurable FPGA systems with SystemC, the de facto standard simulation environment for system-level design. We begin with a short overview of previous approaches in the RTL simulation world and the challenges they face. We then extend the discussion to modeling strategies that use key principles of object orientation to represent dynamic systems. With examples written in SystemC, we discuss not only how these approaches overcome the representation and simulation-performance problems of RTL modeling, but also how they too are limited in the scope of dynamic systems they can represent. From here we introduce the use of dynamic process constructs, available in SystemC 2.1, to explicitly represent dynamically reconfigurable embedded system architectures. In particular, we describe how certain object-oriented design patterns were used to simplify the creation of new dynamic classes. We also highlight certain limitations in the current support for dynamic processes that require careful construction of models. To conclude, we quantify the simulation performance of the plain object-oriented representation of dynamic reconfiguration against the cost of using dynamic processes to represent the technique.
The second component of the presentation addresses the latest tools and methodologies being used by Xilinx Research Labs to implement dynamically reconfigurable systems. After addressing the implementation flow, we present a novel interface to FPGA reconfiguration. The FPGA virtual file system applies the familiar and intuitive file and directory metaphor to the configuration memory of an FPGA. We will discuss the variety of use models for the FPGA virtual file system and how the technology can be applied to some important problems in debugging modern system designs.
Adam Donlin is a member of the Embedded and Reconfigurable Systems Group of Xilinx Research Labs, San Jose, CA. His research interests include Transaction Level Modeling and System Level Design, with a particular focus on the flows and use models of future EDA environments for FPGAs. Dr. Donlin received the degree of Bachelor of Science in Computer Science from the University of Glasgow in 1996. His PhD was awarded in Computer Science by Edinburgh University in 2001. He currently serves on the technical program committees of the Design Automation Conference (DAC), CODES+ISSS and the Reconfigurable Architecture Workshop (RAW).
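As a rough illustration of the object-oriented modeling idea described for the first phase of this tutorial, the sketch below represents a reconfigurable region as an object whose occupant can be swapped while the rest of the model keeps running. It is written in Python rather than SystemC, and every name in it (`Module`, `Incrementer`, `Doubler`, `ReconfigurableRegion`) is hypothetical; it is not the presenters' code and does not use SystemC dynamic processes.

```python
class Module:
    """Hypothetical base class for logic that can occupy a region."""

    def process(self, value: int) -> int:
        raise NotImplementedError


class Incrementer(Module):
    def process(self, value: int) -> int:
        return value + 1


class Doubler(Module):
    def process(self, value: int) -> int:
        return value * 2


class ReconfigurableRegion:
    """Models one dynamically reconfigurable area of the device.

    The static part of the design only talks to the region, so it is
    unaffected when the module inside the region is replaced.
    """

    def __init__(self, module: Module) -> None:
        self._module = module
        self.reconfigurations = 0  # how many times the region was rewritten

    def reconfigure(self, module: Module) -> None:
        # Swap the occupant; polymorphism stands in for loading a new
        # partial configuration into the region.
        self._module = module
        self.reconfigurations += 1

    def process(self, value: int) -> int:
        return self._module.process(value)


region = ReconfigurableRegion(Incrementer())
before = region.process(3)      # handled by Incrementer
region.reconfigure(Doubler())   # "reconfigure" while the model keeps running
after = region.process(3)       # handled by Doubler
print(before, after, region.reconfigurations)
```

The design choice here is plain polymorphism behind a fixed interface, which is the simplest object-oriented way to express "same socket, different logic"; a real model would also need to account for reconfiguration time and for state lost during the swap.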
Track B
8:00 a.m. – 12:00 noon
Standards-Compliant IP-Based ASIC and SOC Design
A. Hekmatpour, K. Goodnow, H. Shah, IBM Corporation
Standards-compliant silicon IP designs have been gaining momentum as a viable alternative to full-custom in-house IP design. The notion is that it will be more efficient (in schedule, cost, and quality) to port a third-party silicon IP designed to industry standards than a non-standards-based IP.
This tutorial will cover the major issues and disciplines involved in standards-compliant IP-based ASIC/SoC design. It also includes an overview of the IBM Silicon Solutions’ effort to develop an open methodology to satisfy its various IP-based SoC and ASIC design requirements for internal and external customers.
The speakers will present a comprehensive overview and analysis of the standards and protocols that impact IP-based designs, and of the tools and vendors supporting these standards. They will then present an overview of IBM’s IP-based FastTrack ASIC methodology flow. The tutorial will conclude with a demo of BlueIP, a web-based, standards-driven portal for third-party silicon IP qualification, design analysis and design data book exchange with the IBM ASIC flow.
Amir Hekmatpour received his Ph.D. degree in Computer Engineering from the University of California, San Diego in 1993 and his MS in Electrical Engineering from the University of Illinois at Chicago in 1981. He is a senior engineer at IBM Microelectronics in the ASIC Development group. He has published more than 35 technical papers and holds 12 U.S. patents. In the past 24 years with IBM, he has been involved with various aspects of VLSI from design to design automation, functional verification, functional coverage analysis, eDesign platforms, and assertion-based verification. His current projects include research and development of tools and techniques for standards-compliant IP-based ASIC design and verification. He is an active member of the technical review board of the Design Automation Conference (DAC), a member of the DesignCon East Technical Program Committee, Associate Editor of The International Journal of Computers and Their Applications (IJCA), and has recently served as a member of the technical review committees of the International Test Conference (ITC) and Parallel and Distributed Computing Systems (PDCS).
Ken Goodnow received his Ph.D. from Penn State University in 1990. He received his MSEE from Penn State University, and his BSEE from the University of Arizona. He is a senior engineer at IBM in the Server and Technology Group. In the past 13 years at IBM, he has been involved with various design projects and design methodologies. For the last seven years he has been involved with Silicon IP development for ASIC design. Ken is a representative to VSIA for IBM and in that capacity helped to lead the Soft IP Tagging Specification for VSIA. Ken currently holds 27 patents and has published in various areas.
Hemen Shah received his MS degree in Electrical Engineering from Worcester Polytechnic Institute in 1995. He is an advisory engineer at IBM Microelectronics in the ASIC Development group. He holds 2 U.S. patents. In the past 12 years with IBM, he has been involved with various aspects of circuit design in the DRAM memory and Analog/Mixed Signal design groups. His most recent projects involve the design of high-performance mixed-signal IPs, as well as the porting and optimization of third-party digital and mixed-signal IPs.
12:00 noon – 1:00 p.m. – Lunch on your own
AFTERNOON SESSIONS
(select two)
Track A
1:00 p.m. – 3:00 p.m.
Serial Rapid-IO: Benefiting System Interconnects
T. Scheckel, Texas Instruments Inc.
SOC designs have stringent requirements for device interconnect. High bandwidth and low latency are key to supporting real-time applications. Additionally, the protocol behind the interconnect technology must provide a wide range of functionality to support these complex embedded applications. This tutorial will examine how the RapidIO protocol meets these needs by providing a detailed look at this technology and its capabilities. Example applications will be discussed with respect to TI RapidIO-enabled DSPs. We will also look at possible future applications and the direction of the technology for SOC.
With a career spanning more than a dozen years at Texas Instruments (TI), Travis Scheckel has held a variety of engineering roles within FPGA, ASIC and DSP. He is currently a team member in Communications Infrastructure Systems. In this role, Travis has taken the lead in defining TI’s RapidIO peripheral solution for DSP and is currently focused on the next generation of wireless base station device solutions. He also represents Texas Instruments in the Technical Working Group of the RapidIO Trade Association. Additionally, Travis has extensive experience supporting high-speed interconnect applications. He received his B.S.E.E. degree from the Milwaukee School of Engineering in 1992.
3:30 p.m. – 5:30 p.m.
DSPs for Communications, Video Infrastructure and Audio
N. Seshan, G. Martinez, T. Hiers, A. Seely, Z. Nikolic, Texas Instruments Inc.
Texas Instruments (TI) has introduced two new subfamilies of digital signal processors (DSPs) that are based on its existing TMS320C64x and TMS320C67x DSPs. This presentation details the technical attributes of the new TI TMS320C645x and TMS320C672x subfamilies of DSPs as well as the benefits that these SOCs bring to communications, video infrastructure, and audio systems.
TMS320C645x DSPs benefit communications and video infrastructure applications through four main areas. First, these DSPs include multiple high performance input-output (IO) options for board level connectivity. One of these IO options is an inter-device communication port based on Serial Rapid IO which employs a high throughput message passing scheme that achieves nearly 95% utilization of the available data bandwidth (up to 12.5 Gbits/sec for a 4x serial bidirectional link). Other IO ports include a 1 Gbit/sec Ethernet Media Access Controller (EMAC), a 32-bit Double Data Rate (DDR2) 500MHz memory controller, a 66MHz Peripheral Component Interconnect (PCI) bus, and a Universal Test and Operations PHY Interface for ATM (UTOPIA 2) port. Second, TMS320C645x DSPs include a switch fabric interconnect that allows high-throughput, concurrent data transfers between on-chip masters and slaves while maintaining low latency. Third, the core system and memory architecture (collectively referred to as the megamodule) provide a significant improvement in terms of streamlined dataflow and enhanced development and operating system support over its predecessors, TMS320C62x and TMS320C64x DSPs. Reasons for the improvement in data flow include 256-bit wide memory buses and an internal DMA (IDMA) dedicated to the movement of data within the two levels of internal memory and between the core and the peripheral bus. Last, the CPU uses an architecture that is 100% object code compatible with its predecessors, but provides a dramatic (50% to 10x) increase in performance for critical signal-processing operations. The new CPU architecture also reduces code size through the use of an SLOOP buffer used for software pipelining and 16-bit versions of the native 32-bit instructions.
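The Serial RapidIO figures quoted above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the per-lane rate of 3.125 Gbaud is an assumption chosen so that four lanes give the 12.5 Gbit/s stated in the abstract, the 95% utilization is taken at face value, and any line-coding overhead is ignored.

```python
LANES = 4                 # "4x" serial link, from the abstract
LANE_RATE_GBAUD = 3.125   # assumed per-lane rate; 4 x 3.125 = 12.5
UTILIZATION = 0.95        # "nearly 95%", from the abstract


def link_rate_gbps(lanes: int, lane_rate: float) -> float:
    """Aggregate signalling rate of the link in one direction."""
    return lanes * lane_rate


def effective_throughput_gbps(link_rate: float, utilization: float) -> float:
    """Bandwidth left for message payload at the given utilization."""
    return link_rate * utilization


link = link_rate_gbps(LANES, LANE_RATE_GBAUD)
effective = effective_throughput_gbps(link, UTILIZATION)
print(f"link rate: {link} Gbit/s, usable at 95%: {effective:.3f} Gbit/s")
```

Under these assumptions the message-passing scheme would carry a little under 11.9 Gbit/s of the 12.5 Gbit/s link rate.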
The new TMS320C672x DSPs advance audio signal processing through four methods. First, TMS320C672x DSPs have a peripheral set oriented towards audio applications. Three of these interfaces, referred to as Multichannel Audio Serial Ports (McASPs), are optimized for multi-channel audio streams. Each McASP audio stream can consist of multiple channels (typically stereo) of audio data. This provides for multiple zones of multi-channel audio to support multi-room environments. Other peripherals include Inter-Integrated Circuit (I2C) ports and dedicated serial peripheral interfaces (SPIs) for system communications. Second, a flexible DMA architecture provides concurrent data movement for the type of dataflow required in multi-channel audio applications. This includes non-sequential data access for reverb, circular buffers, as well as data sorting. The peripherals are connected to the DMA through a low-latency crossbar. Performance on the crossbar is provided through a burst-oriented pipelined data protocol. Third, these DSPs integrate a significant amount of on-chip memory in an innovative unified one-level memory architecture providing for flexible memory allocation and high performance. This memory architecture increases instruction cache size and decreases cache miss penalty versus its predecessors. Finally, they contain an improved version of the TMS320C67x CPU with additional instructions. These additional instructions and other CPU changes improve the numerical performance as well as the speed with which high-quality algorithms can execute. Code size reduction both improves cache performance and allows integration of more algorithms on-chip.
The presentation will compare and contrast how the two subfamilies optimize for their end-system goals in the areas of (1) compute performance, (2) IO peripheral architecture, and (3) on-chip interconnect.
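To make the mention of circular buffers and reverb concrete, here is a minimal software sketch of a feedback echo built on a circular buffer, the basic building block of many reverb algorithms. It is an illustration only: the function name and parameters are hypothetical, and it says nothing about how the TMS320C672x DMA actually implements circular addressing.

```python
def feedback_echo(samples, delay, gain):
    """Feedback comb filter: y[n] = x[n] + gain * y[n - delay].

    The last `delay` output samples live in a fixed-size circular
    buffer; the write index wraps around instead of shifting data.
    """
    buffer = [0.0] * delay   # holds y[n - delay] .. y[n - 1]
    index = 0                # slot holding the oldest stored output
    output = []
    for x in samples:
        delayed = buffer[index]          # y[n - delay]
        y = x + gain * delayed
        buffer[index] = y                # overwrite oldest with newest
        index = (index + 1) % delay      # wrap around the buffer
        output.append(y)
    return output


# An impulse repeats every `delay` samples, scaled by `gain` each time.
echo = feedback_echo([1.0, 0.0, 0.0, 0.0, 0.0], delay=2, gain=0.5)
print(echo)
```

Because only the index moves, each sample costs one read and one write regardless of the delay length, which is why this access pattern is a natural fit for hardware address generation.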
Nat Seshan recently joined the Advanced Architecture and Chip Technology team as a Distinguished Member Technical Staff. From 2002 to 2004, Nat was Device Architecture Manager of the DSP Catalog & Emerging End Equipment Group. In this role, Nat was principal architect of the C64x+ megamodule. From 1997 to 2002 he was TMS320C6000 applications manager. He was co-architect of the C62x CPU and led the chip architectures of 10 C6000 DSPs. Since joining Texas Instruments in 1987, Nat has been the first application engineer on the C3x, C54x, and C6000 product lines. Nat holds an MSEE and a BSEE from the Massachusetts Institute of Technology and an Executive MBA from the University of Houston. Nat has 23 U.S. patents in the areas of CPU architecture, chip architecture, emulation, and parallel algorithms.
Gustavo Martinez is an applications engineer with the Texas Instruments Catalog DSP Application team focusing on applications verification and validation of new devices. He has provided technical support and documentation for several DSP platforms including C55x, OMAP, and C64x. Currently, he is developing technical documentation for the latest C645x products. He holds a B.S.E.E. from the University of Texas at El Paso and is currently working towards an M.E.E. at Rice University.
Todd Hiers is a senior applications engineer with the Texas Instruments Catalog DSP Application team focusing on High Speed hardware Productization. He joined the TMS320C6000 Hardware Applications team as an intern in 1999 and a full-time employee in 2001. He has created technical content for and supported the first and second generation devices in the C64x family of DSPs. He has recently worked on verification and validation of the new C645x devices. Todd has a BS and an M.Eng. in EECS from MIT.
Anthony Seely is a Member, Group Technical Staff in the Texas Instruments Performance Audio Business. Anthony was chip architect of the TMS320C672x DSP and system architect and lead applications engineer of the TMS320C6713 and ‘DA6xx DSPs.
Zoran Nikolic, PhD, has been an applications engineer with the Texas Instruments Catalog DSP Application team focusing on embedded systems engineering since 1997. He specified functional and performance requirements for a DMA used on the latest floating-point DSP. Previously, Zoran was central to the deployment of the first imaging platform using the C6000 architecture: the imaging developers kit (IDK). He also worked on specifying functional and performance requirements for the Video Port Module of the TMS320DM642 Media Processor intended for streaming media applications. He has been responsible for requirements definition/specification and test specification for the Peripheral Component Interconnect (PCI) module, the Host Port Interface (HPI) port, and the Expansion Bus module. Zoran is currently responsible for introducing the new C672x DSPs to professional audio customers.
Track B
1:00 p.m. – 3:00 p.m.
High-performance on-chip interconnect circuit technologies for sub-65nm CMOS
H. Kaul, Intel Corporation
The continued increase in performance and integration levels of VLSI designs for the last three decades has been fueled by shrinking transistor sizes. Unlike devices, on-chip wires get slower with technology scaling and pose performance and power challenges as VLSI designs scale into the nanometer regime. At the same time signal integrity issues have also become important due to increased cross-talk and inductive effects and pose reliability challenges for on-chip signaling. In this tutorial we will discuss various techniques for improving performance, energy-efficiency and signal integrity of on-chip signaling. The scope of these techniques will include solutions at the architectural, circuit and physical design level.
Himanshu Kaul received the Bachelor of Engineering degree in electrical and electronics engineering from the Birla Institute of Technology and Science, India, in 2000. He completed his Ph.D. in Electrical Engineering at the University of Michigan, Ann Arbor, in December 2004. His PhD research focused on on-chip signaling and interconnects. He is currently with the Circuit Research Lab at Intel Corporation in Hillsboro, Oregon. His research interests include on-chip signaling techniques and low-voltage and high-performance circuit design.
3:30 p.m. – 5:30 p.m.
Challenges in Nanometer SOC SRAM Designs
S. Chung, TSMC
This short course will summarize unique circuit techniques to achieve a high speed, low cost, and highly manufacturable SRAM design for nanometer SoC.
SRAM designs in nanometer SoC are quite different in cell stability, read/write margins, array structures, and cell leakage current reduction. Firstly, device variations in nanometer technologies are very widely spread. This results in high Vt and driving current mismatch among the cell transistors. The cell layout prefers straight gates oriented in the same direction, which leads to a totally different cell layout.
Secondly, the methods to investigate cell stability in nanometer technologies are introduced. Static Noise Margin (SNM) analyses show that the margins are too narrow to perform both read and write operations reliably. This problem is aggravated by low supply voltage, high device mismatch, and high leakage current in nanometer technologies. Thirdly, the sources of high leakage currents in nanometer technologies are explained. These can lead to static leakage consuming a high percentage of the operating current.
Various schemes of multiple supply voltages are introduced to increase cell margins while reducing the leakage current. Back-gate biasing, source biasing, headers, or footers are applied to the cell array during read or write for optimization. Array structures need to be modified accordingly to employ these techniques. Finally, power-down and wake-up operations are also briefly introduced.
Shine Chung received his B.S. in Physics from National Taiwan University in 1974, and his MS and Ph.D. candidacy in Applied Physics from Harvard University in 1979 and 1981, respectively.
Since 1981, he has worked in the semiconductor industry. From 1981 to 1984, he worked for AMD in SRAM technology development and SRAM design. From 1984 to 1988, he worked for VLSI Design Associate in ASIC logic and memory design. Then, he worked for HP Labs on the PA-WW Architecture, the precursor of the Merced Architecture, from 1989 to 1994. After leaving HP, he worked for Digital and AMD in StrongARM and K5 microprocessor design, respectively. In 1997, he was a Director of Engineering at Programming Silicon Solutions working on embedded flash design. In 2001, he joined Turbo Inc. as a VP of Engineering in EEPROM and flash design. Currently, he works for TSMC as a Director in memory-related IP development.
Mr. Chung has more than 20 patents granted and in filing. He published a book, “System-On-Chip design Quizzes.” He was an adjunct professor at San Jose State University and North-Western Polytech University.