Introduction to the knowledge of digital signal processors
DSP processors differ in the arithmetic format they use. Most DSP processors use fixed-point arithmetic, in which numbers are represented either as integers or as fractions between -1.0 and 1.0. Some processors use floating-point arithmetic, in which data is expressed as a mantissa and an exponent: mantissa × 2^exponent.
Floating-point arithmetic is a more complex but more general mechanism than fixed-point arithmetic. Floating-point data offers a large dynamic range (the ratio between the largest and smallest numbers that can be represented). In floating-point DSP applications, design engineers in many cases do not need to worry about dynamic range and accuracy. Floating-point DSPs are easier to program than fixed-point DSPs, but they cost more and consume more power.
Because of cost and power consumption, fixed-point DSPs are generally used in high-volume products. Programmers and algorithm designers determine the required dynamic range and accuracy through analysis or simulation. If ease of development matters most, or if the application needs a wide dynamic range and high accuracy, a floating-point DSP can be considered.
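The two number formats can be illustrated with a short sketch. It assumes the common 16-bit fractional convention usually called Q15 (not named in the text above), and the helper names float_to_q15 and q15_to_float are made up for this example. Python's standard math.frexp is used to show the mantissa × 2^exponent form.

```python
import math

Q15_SCALE = 1 << 15  # 32768: a 16-bit word covers fractions in [-1.0, 1.0)

def float_to_q15(x: float) -> int:
    """Quantize x to a signed 16-bit Q15 integer, saturating at the limits."""
    n = int(round(x * Q15_SCALE))
    return max(-Q15_SCALE, min(Q15_SCALE - 1, n))

def q15_to_float(n: int) -> float:
    """Interpret a signed 16-bit integer as a Q15 fraction."""
    return n / Q15_SCALE

# Fixed point: the fraction 0.5 is stored as the integer 16384.
half = float_to_q15(0.5)

# Floating point: frexp splits a value so that
# value == mantissa * 2**exponent, with 0.5 <= |mantissa| < 1.
mantissa, exponent = math.frexp(10.0)
```

Note that +1.0 itself is not representable in this format and saturates to the largest positive word, which is one reason fixed-point designs need explicit range analysis.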
Floating-point calculations can also be implemented in software on a fixed-point DSP, but such routines consume a lot of processor time and are rarely used. A more efficient alternative is "block floating point", in which a group of values with different mantissas but a single shared exponent is processed as one data block. Block floating point is usually implemented in software.
Common floating-point DSPs use a 32-bit word width, while fixed-point DSPs generally use 16 bits. There are also 24-bit and 20-bit DSPs, such as Motorola's DSP563XX series and Zoran's ZR3800X series. Word width strongly influences the physical size of the DSP, the number of pins, and the size of the required memory, so it directly affects the cost of the device. The wider the word, the larger the chip, the more pins it has, the more memory it needs, and the more it costs. Provided the design requirements are met, a DSP with the smallest adequate word width should be chosen to reduce cost.
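The block floating point idea described above can be sketched as follows. This is a minimal illustration, not a production routine: the function names and the 16-bit mantissa width are assumptions chosen for the example.

```python
import math

def block_float_encode(samples, word_bits=16):
    """Encode values as integer mantissas that share one block exponent.

    Each value is approximated by mantissa * 2**exponent.
    """
    limit = (1 << (word_bits - 1)) - 1      # largest positive mantissa
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return [0] * len(samples), 0
    # Smallest exponent for which the largest value still fits the mantissa.
    exponent = math.ceil(math.log2(peak / limit))
    scale = 2.0 ** exponent
    mantissas = [max(-limit, min(limit, int(round(s / scale))))
                 for s in samples]
    return mantissas, exponent

def block_float_decode(mantissas, exponent):
    """Recover approximate values from mantissas and the shared exponent."""
    return [m * 2.0 ** exponent for m in mantissas]

mantissas, exponent = block_float_encode([1000.0, -250.0, 0.5])
```

Because the exponent is chosen from the largest value in the block, small values in the same block keep fewer significant bits, which is the price paid for storing only one exponent.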
When choosing between fixed-point and floating-point, you can trade word width against development complexity. For example, by combining instructions, a DSP with a 16-bit word can also perform 32-bit double-precision arithmetic (although double-precision arithmetic is much slower than single-precision arithmetic). If single precision meets the vast majority of the calculation requirements and only a small amount of code needs double precision, this approach is feasible. However, if most calculations require high precision, you need to choose a processor with a larger word width.
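As a rough illustration of combining instructions, the sketch below adds two 32-bit values using only 16-bit additions and a carry, the way a 16-bit device would do it in two steps. The function names are invented for this example and unsigned wrap-around arithmetic is assumed.

```python
MASK16 = 0xFFFF

def split32(x):
    """Split a 32-bit value into (high, low) 16-bit halves."""
    return (x >> 16) & MASK16, x & MASK16

def join32(hi, lo):
    """Join two 16-bit halves into one 32-bit value."""
    return (hi << 16) | lo

def add32_with_16bit_ops(a_hi, a_lo, b_hi, b_lo):
    """Add two 32-bit values held as 16-bit halves; result wraps mod 2**32."""
    lo_sum = a_lo + b_lo
    carry = lo_sum >> 16                  # 1 if the low halves overflowed
    lo = lo_sum & MASK16
    hi = (a_hi + b_hi + carry) & MASK16   # second add consumes the carry
    return hi, lo

result = join32(*add32_with_16bit_ops(*split32(0x0001FFFF), *split32(0x00000001)))
```

Even this simple addition needs two adds and carry handling, which is why double-precision code runs much slower than single-precision code on the same device.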
Please note that in most DSP devices the instruction word and the data word have the same width, but there are exceptions. For example, in the ADSP-21XX series from ADI (Analog Devices), the data word is 16 bits and the instruction word is 24 bits.
Whether a processor meets the design requirements depends largely on whether it is fast enough. There are many ways to measure the speed of a processor. The most basic is the instruction cycle time, which is the time the processor takes to execute its fastest instruction. Take the reciprocal of the instruction cycle time, divide it by one million, and multiply by the number of instructions executed per cycle. The result is the peak speed of the processor in millions of instructions per second (MIPS).
However, instruction execution time does not indicate the real performance of a processor. Different processors accomplish different amounts of work in a single instruction, so simply comparing instruction execution times cannot fairly distinguish their performance. Some newer DSPs adopt a Very Long Instruction Word (VLIW) architecture, in which multiple instructions are issued in a single cycle and each instruction does less work than an instruction on a traditional DSP. Therefore, comparing MIPS figures between VLIW and conventional DSP devices can be misleading.
Even comparing MIPS figures between traditional DSPs is somewhat one-sided. For example, some processors can shift by several bits in a single instruction, while others can shift by only one bit per instruction. Some DSPs can move data in parallel with, and independently of, the ALU instruction being executed (loading operands while an instruction executes), while others support only parallel data moves related to the ALU instruction being executed. Some newer DSPs allow two MACs to be specified in a single instruction.
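The peak MIPS calculation described above is simple enough to write down directly. In this sketch the function name and the example cycle times are illustrative assumptions, not figures for any particular device.

```python
def peak_mips(instruction_cycle_ns, instructions_per_cycle=1):
    """Peak speed in millions of instructions per second (MIPS).

    instruction_cycle_ns: time of the fastest instruction, in nanoseconds.
    instructions_per_cycle: instructions the processor can issue per cycle.
    """
    cycles_per_second = 1e9 / instruction_cycle_ns   # reciprocal of cycle time
    return cycles_per_second / 1e6 * instructions_per_cycle

single_issue = peak_mips(25.0)        # hypothetical 25 ns cycle, one instruction
multi_issue = peak_mips(5.0, 8)       # hypothetical 5 ns cycle, eight per cycle
```

The second call shows how a multiple-issue device reaches a much larger MIPS figure without each instruction necessarily doing as much work.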
Therefore, simply making a MIPS comparison cannot accurately determine the performance of the processor.
One way to address this problem is to compare processors on a basic operation rather than on an instruction. The MAC operation is commonly used, but MAC time alone does not provide enough information to compare DSP performance. In most DSPs a MAC completes in a single instruction cycle, so the MAC time simply equals the instruction cycle time, and, as mentioned above, some DSPs accomplish more in a single MAC cycle than others. MAC times also do not reflect the performance of other operations such as looping, which are used in virtually all applications.
The most common method is to define a set of standard routines and compare the execution speed on different DSPs. Such routines may be the "core" functionality of an algorithm, such as FIR or IIR filters, etc., or they may be the entire or part of an application (such as a speech coder). Figure 1 shows the performance of several DSP devices tested using BDTI's tools.
When comparing the speed of DSP processors, be careful with the MOPS (million operations per second) and MFLOPS (million floating-point operations per second) figures that vendors advertise. Different manufacturers understand "operation" differently, so the figures mean different things. For example, some processors can perform a floating-point multiplication and a floating-point addition at the same time, so their vendors advertise an MFLOPS figure that is twice the MIPS figure.
Also be careful when comparing processor clock rates. The DSP's input clock may run at the same rate as its instruction rate, or at two to four times the instruction rate, and this differs from processor to processor. In addition, many DSPs have clock multipliers or phase-locked loops that use a low-frequency external clock to generate the high-frequency clock signal required on-chip.
Speech processing: speech coding, speech synthesis, speech recognition, speech enhancement, voice mail, speech storage, etc.
Image/graphics: two-dimensional and three-dimensional graphics processing, image compression and transmission, image recognition, animation, robot vision, multimedia, electronic maps, image enhancement, etc.
Military: secure communications, radar processing, sonar processing, navigation, global positioning, frequency hopping radio, search and counter-search, etc.
Instrumentation: spectrum analysis, function generation, data acquisition, seismic processing, etc.
Automatic control: control, deep space operations, automatic driving, robot control, disk control, etc.
Medical: hearing aids, ultrasound equipment, diagnostic tools, patient monitoring, electrocardiogram, etc.
Household appliances: digital audio, digital TV, video phone, music synthesis, tone control, toys and games, etc.
Biomedical signal processing examples:
CT: X-ray computed tomography scanner. (Hounsfield of the British company EMI, who invented the head CT scanner, won the Nobel Prize.)
CAT: computerized X-ray spatial reconstruction device, used for whole-body scans, three-dimensional imaging of heart activity, detection of brain tumors and foreign bodies, and image reconstruction of the human torso.
Electrocardiogram analysis.
The performance of a DSP is affected by how well it manages its memory subsystem. As mentioned earlier, the MAC and other signal processing functions are the basic signal processing capabilities of a DSP. Executing a MAC in every cycle requires reading one instruction word and two data words from memory in each instruction cycle. There are several ways to achieve this, including multi-ported memories (which allow multiple memory accesses per instruction cycle), separate instruction and data memories (the "Harvard" architecture and its derivatives), and instruction caches (which allow instructions to be fetched from the cache rather than from memory, freeing the memory for data reads). Figures 2 and 3 show the differences between the Harvard memory architecture and the "von Neumann" architecture used in many microcontrollers.
Also pay attention to the size of the supported memory space. The main target market for many fixed-point DSPs is embedded systems, where memory is generally small, so these DSPs have small to medium on-chip memories (around 4K to 64K words) and a narrow external data bus.
In addition, the address bus of most fixed-point DSPs is less than or equal to 16 bits, so the external memory space is limited.
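To make the address bus limit concrete, the following sketch computes how much external memory a given address width can reach. The function names are invented for the example, and it assumes each address selects one word.

```python
def addressable_words(address_bits):
    """Number of distinct word addresses an address bus can select."""
    return 1 << address_bits

def addressable_bytes(address_bits, word_bits):
    """Addressable memory in bytes, assuming one word per address."""
    return addressable_words(address_bits) * word_bits // 8

words_16 = addressable_words(16)          # 16-bit address bus
bytes_16 = addressable_bytes(16, 16)      # with 16-bit data words
```

A 16-bit address bus therefore reaches 64K words, which is 128 KB when each word is 16 bits wide.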
Some floating-point DSPs have little or no on-chip memory but a wide external data bus. For example, TI's TMS320C30 has only 6K words of on-chip memory, one 24-bit external address bus, and one 13-bit external address bus. ADI's ADSP-21060 has 4 Mb of on-chip memory, which can be divided between program memory and data memory in various ways.
When selecting a DSP, base the choice on the memory size and external bus requirements of the specific application.
DSP processors are very different from general-purpose processors (GPPs) such as the Intel Pentium or the PowerPC. These differences arise because the architecture and instructions of DSPs are designed specifically for signal processing. DSPs have the following characteristics.
·Hardware multiply-accumulate operations (MACs)
To carry out multiply-accumulate operations such as signal filtering efficiently, the processor must be able to multiply efficiently. GPPs were not originally designed for multiplication-heavy workloads. The first major technical improvement that distinguished DSPs from earlier GPPs was the addition of specialized hardware and explicit MAC instructions capable of single-cycle multiplication.
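The reason the MAC matters so much for filtering can be seen in a minimal FIR sketch: every output sample is a chain of multiply-accumulate steps, one per filter tap. The code assumes 16-bit Q15 fractional data and a wide accumulator, and the name fir_q15 is made up for this example.

```python
def fir_q15(samples, coeffs):
    """Compute one FIR output from Q15 samples and Q15 coefficients.

    samples[k] is paired with coeffs[k]; the result is saturated Q15.
    """
    acc = 0                      # wide accumulator, as a DSP MAC unit provides
    for x, h in zip(samples, coeffs):
        acc += x * h             # one multiply-accumulate per tap (Q30 product)
    result = acc >> 15           # rescale the Q30 sum back to Q15
    return max(-32768, min(32767, result))

# Four taps of 0.25 each average four samples of 0.5.
output = fir_q15([16384] * 4, [8192] * 4)
```

On a DSP with a hardware MAC, each pass of this loop body maps to a single instruction, whereas a processor without one needs separate multiply and add steps.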
·Harvard architecture
Traditional GPPs use the von Neumann memory architecture. In this architecture there is a single memory space, connected to the processor core through two buses (an address bus and a data bus). This architecture cannot satisfy the requirement that a MAC make four memory accesses in one instruction cycle. DSPs generally use the Harvard architecture, in which there are two memory spaces: program memory and data memory. The processor core is connected to these memory spaces through two sets of buses, allowing two simultaneous memory accesses. This arrangement doubles the memory bandwidth available to the processor. In Harvard architectures, still greater memory bandwidth is sometimes achieved by adding a second data memory space and bus. Modern high-performance GPPs typically have two on-chip caches, one for data and one for instructions. From a theoretical perspective, this pairing of dual on-chip caches and buses is equivalent to the Harvard architecture. However, GPPs use control logic to determine which data and instruction words reside in the on-chip caches, a process that is generally not visible to the programmer. In DSPs, by contrast, programmers can explicitly control which data and instructions are stored in on-chip memory or caches.
·Zero-overhead loop control
DSP algorithms share a common characteristic: most of the processing time is spent executing a small number of instructions contained in a relatively small loop. Therefore, most DSP processors have specialized hardware for zero-overhead loop control. A zero-overhead loop is a loop in which the processor can execute a set of instructions without spending time testing the value of the loop counter. The hardware performs the loop jump and decrements the loop counter. Some DSPs also implement high-speed single-instruction loops through an instruction cache.
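A simple cycle-count model shows why loop overhead matters when the loop body is tiny. The cycle counts below are hypothetical, chosen only for illustration: a one-cycle MAC body and two cycles of software overhead per pass for the counter update and branch.

```python
def loop_cycles(iterations, body_cycles, overhead_cycles):
    """Total cycles for a loop where every pass costs body plus overhead."""
    return iterations * (body_cycles + overhead_cycles)

TAPS = 64  # hypothetical filter length

# Software loop: assumed 2 extra cycles per pass (decrement and branch).
software_loop = loop_cycles(TAPS, body_cycles=1, overhead_cycles=2)

# Hardware loop: the counter and jump are handled outside the instruction stream.
hardware_loop = loop_cycles(TAPS, body_cycles=1, overhead_cycles=0)
```

Under these assumptions the software loop takes three times as long as the hardware loop, because the overhead is larger than the useful work in each pass.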
·Special addressing modes
DSPs often contain special address generators that produce the addressing patterns required by signal processing algorithms, such as circular addressing and bit-reversed addressing. Circular addressing suits the delay line of an FIR filter, and bit-reversed addressing suits the FFT algorithm.
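Both addressing patterns can be mimicked in software to show what the address generators compute. This is a sketch with invented names (CircularBuffer, bit_reverse); on a DSP the wrap-around and bit reversal happen in hardware at no extra instruction cost.

```python
class CircularBuffer:
    """Fixed-length delay line whose write index wraps around."""

    def __init__(self, length):
        self.data = [0] * length
        self.index = 0                       # position of the newest sample

    def push(self, sample):
        # Advance with wrap-around, as a modulo address generator would.
        self.index = (self.index + 1) % len(self.data)
        self.data[self.index] = sample

    def newest_first(self):
        """Return the stored samples ordered from newest to oldest."""
        n = len(self.data)
        return [self.data[(self.index - k) % n] for k in range(n)]

def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of index (FFT reordering)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result

buf = CircularBuffer(4)
for sample in [1, 2, 3, 4, 5]:
    buf.push(sample)

order_8_point = [bit_reverse(i, 3) for i in range(8)]
```

The buffer never moves its contents; only the index moves, so the oldest sample is overwritten in place. The bit-reversed list is the order in which an 8-point radix-2 FFT reads or writes its data.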
·Execution time predictability
Most DSP applications have hard real-time requirements: in every case, all processing must be completed within a specified time. This real-time constraint requires the programmer to determine exactly how much time each sample requires, or at least how much time it takes in the worst case. The way a DSP executes a program is transparent to the programmer, so it is easy to predict the execution time of each piece of work. For high-performance GPPs, however, predicting execution time is complex and difficult, because their large high-speed data and program caches allocate their contents dynamically.
·Rich peripherals
DSPs include peripherals such as DMA controllers, serial ports, link ports, and timers.
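The real-time constraint comes down to a cycle budget per sample. The sketch below uses invented function names and hypothetical numbers (a 40 MHz instruction rate and an 8 kHz sample rate) to check a worst-case cycle count against that budget.

```python
def cycles_per_sample(instruction_rate_hz, sample_rate_hz):
    """Instruction cycles available between two consecutive samples."""
    return instruction_rate_hz // sample_rate_hz

def meets_deadline(worst_case_cycles, instruction_rate_hz, sample_rate_hz):
    """True if the worst-case processing fits into one sample period."""
    budget = cycles_per_sample(instruction_rate_hz, sample_rate_hz)
    return worst_case_cycles <= budget

budget = cycles_per_sample(40_000_000, 8_000)   # hypothetical rates
```

The check is only as good as the worst-case cycle count fed into it, which is why predictable execution time is so valuable on a DSP.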