Realistic Car Engine Sound Simulation — WT2003Hx Voice Chip Solution


Product Market

WT2003Hx is a high-performance MP3 audio decoding chip with features such as cost-effectiveness, low power consumption, and high reliability. It is applicable to various scenarios, including but not limited to car entertainment systems, toys, educational equipment, and professional audio devices. In the application of simulating car engine sounds, the characteristics of this chip are specially utilized to meet the market demand for an immersive driving experience.

Such applications mainly target the automotive aftermarket, especially car owners who pursue a high-quality driving experience and car modification enthusiasts. In addition, with the popularization of electric vehicles, solutions for simulating engine sounds have also attracted electric vehicle manufacturers. Since electric vehicles run relatively quietly, simulating engine sounds can enhance the driving experience and improve pedestrian safety warnings. The application is not limited to private vehicles; it can also be extended to racing simulators, car showrooms, driving experience projects in theme parks, and even some high-end children’s toy cars to enhance the entertainment and interactivity of products.

With technological progress and the growing consumer demand for personalized and differentiated experiences, the application market for simulating car engine sounds is expected to continue expanding. Especially in the electric vehicle market, as the global promotion of green travel advances, simulating engine sounds has become an innovative solution to address the problem of electric vehicles’ silent operation.

Although other solutions exist on the market, such as software-based mobile applications and dedicated engine sound simulator hardware, the WT2003Hx-based solution is competitive in specific niche markets thanks to its high level of integration, good price-performance ratio, and sound quality.

In summary, the market for engine sound simulation based on the WT2003Hx chip has broad prospects, with particular potential in niche markets that value driving pleasure and sound experience. It is expected to become an important part of future car entertainment systems and personal entertainment devices.

Product Solution Comparison

Traditional Solution: Speed Change and Pitch Shift

Basic Principle: Traditional solutions usually rely on simple speed change and pitch shift technology, that is, simulating changes in engine speed by adjusting the playback rate of audio samples. When the car accelerates, the audio playback speed increases, and vice versa, to simulate the sound changes during the actual acceleration and deceleration of the engine.

Sound Effect: This method can provide a basic sense of acceleration but may lead to an unnatural relationship between pitch and speed, resulting in an unrealistic sound. Especially when the speed changes significantly, the pitch change of the sound will appear abrupt.

Implementation Difficulty: The technical implementation is relatively simple, requiring no complex algorithm support, and is easy to deploy quickly.
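
As a rough illustration of the traditional approach, the sketch below changes the playback rate of a sample buffer with linear interpolation. It is a minimal sketch, not the chip's implementation: the function name `resample` and the use of plain Python lists are assumptions made for this example. It shows why pitch and duration are tied together in this method: doubling the rate halves the number of output samples.

```python
def resample(samples, rate):
    """Read `samples` `rate` times faster, using linear interpolation.

    rate > 1 shortens the clip and raises the pitch; rate < 1 does the opposite.
    (Hypothetical helper for illustration only.)
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    last = len(samples) - 1
    out = []
    for i in range(int(len(samples) / rate)):
        pos = i * rate                 # fractional read position in the source
        j = min(int(pos), last)        # sample just before the read position
        frac = pos - j
        nxt = samples[min(j + 1, last)]
        out.append(samples[j] * (1.0 - frac) + nxt * frac)
    return out


clip = [0, 10, 20, 30]
faster = resample(clip, 2.0)   # half as many samples: shorter and higher pitched
slower = resample(clip, 0.5)   # twice as many samples: longer and lower pitched
```

Because the duration changes together with the pitch, a looped engine sample played this way also loops faster or slower, which is one source of the abrupt character described above.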

Optimized Solution: Frequency Shifting

  • Basic Principle: Frequency shifting technology independently adjusts the pitch of the audio while keeping the duration of the original audio unchanged. This technology can more accurately simulate changes in engine sounds while maintaining natural pitch and timbre, ensuring the authenticity of the sound even with large-range speed changes.
  • Sound Effect: The optimized solution can provide a more delicate and realistic auditory experience. Through careful adjustment of the sound spectrum, it can simulate subtle sound changes during gear shifting, allowing drivers to feel a driving experience closer to that of a real vehicle.
  • Implementation Difficulty: Compared with traditional solutions, the implementation of frequency shifting technology is more complex, requiring advanced audio processing algorithms and computing resources. It may involve complex steps such as signal processing, frequency domain analysis, and transformation to ensure that the sound quality remains undistorted and natural.
  • Application Examples: For example, the electric vehicle engine sound simulation technology developed by companies like Bose not only simulates various sound combinations of traditional fuel vehicles but also creates an immersive “engine” roar through sophisticated algorithms, enhancing the driver’s sense of immersion and emotional connection.

In summary, the frequency shifting solution provides a higher-quality experience in simulating car engine sounds. Through precise audio processing technology, it ensures the authenticity and natural transition of sounds, although its implementation cost and technical complexity are relatively high. The traditional solution (speed change and pitch shift) is a cost-effective and easy-to-implement option but may be deficient in sound authenticity and quality. With technological progress and increasing consumer demands for experience, the optimized solution is gradually becoming a trend in high-end car audio design.

Chip Introduction

3.1 Chip Resources

  • 32-bit MCU with built-in Flash;
  • Two UART controllers (UART0/1);
  • Two SPI (SPI0/1) supporting master and slave modes;
  • Four-channel PWM output;
  • Built-in 0.5W/8Ω PWM power amplifier;
  • 10-bit ADC;
  • Power-down mode (deep sleep mode) as low as 2μA;
  • Strong IO driving capability, providing a maximum driving current of 64mA;
  • Supports remote updates by the user or mass-production updates of functions and voice content.

3.2 Package Introduction

The WT2003H series chips are available in SOP16, TSSOP24, and QFN32 packages, suitable for various applications. Their pin diagrams and pin definitions are as follows:

SOP16 package:

[Figure: SOP16 pin diagram]

TSSOP24 package:

[Figure: TSSOP24 pin diagram]

QFN32 package:

[Figure: QFN32 pin diagram]

Function Introduction

Although the WT2003HX chip is mainly designed as a voice broadcasting chip for playing preset voice information, it can also be applied to scenarios of simulating car engine sounds by virtue of its powerful audio processing and playback capabilities.

4.1 Voice Broadcasting

WT2003HX supports high-quality audio formats such as MP3, enabling it to play clear and realistic car engine audio, providing users with an immersive simulation experience. Its high-performance 32-bit processor ensures smooth audio playback, maintaining the naturalness and details of the sound even in complex audio segments. It has built-in Flash storage with different capacities, allowing storage of audio content ranging from 100 seconds to 1000 seconds, which is sufficient to cover various car engine sound samples from idle speed to full speed, meeting the needs of engine sound simulation in different scenarios.

4.2 Frequency Shifting Function

Audio frequency shifting uses digital signal processing to offset all frequency components of the audio by the same amount, without altering the timing or duration of the audio. In practice, the entire spectrum of the signal is moved up or down, which changes its perceived pitch. Maintaining the sound quality and naturalness of the original signal is the main challenge: excessive frequency shifting may introduce distortion or an unnatural timbre, so high-quality frequency shifting algorithms require careful design.
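
The point about unnatural timbre can be illustrated with a small calculation. The sketch below is only an illustration under stated assumptions: the helper names and the example harmonic series (100 Hz fundamental) are made up for this example. It compares a linear shift, which adds the same offset to every component, with pitch scaling, which multiplies every component by the same ratio.

```python
def linear_shift(freqs_hz, offset_hz):
    """Add the same offset to every component (a linear frequency shift)."""
    return [f + offset_hz for f in freqs_hz]


def pitch_scale(freqs_hz, ratio):
    """Multiply every component by the same ratio (pitch scaling)."""
    return [f * ratio for f in freqs_hz]


def is_harmonic(freqs_hz, tol=1e-9):
    """True if every component is an integer multiple of the first one."""
    f0 = freqs_hz[0]
    return all(abs(f / f0 - round(f / f0)) < tol for f in freqs_hz)


harmonics = [100.0, 200.0, 300.0, 400.0]     # assumed example tone
shifted = linear_shift(harmonics, 50.0)      # 150, 250, 350, 450 Hz
scaled = pitch_scale(harmonics, 1.5)         # 150, 300, 450, 600 Hz
```

After the linear shift the components are no longer integer multiples of the lowest one, while scaling keeps them harmonic. Small shifts change this relationship only slightly, which is consistent with the warning that large shifts are the ones most likely to sound unnatural.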

Instruction Introduction

5.1 Protocol Description

The one-wire serial port mode allows the MCU to send data to the WT2003HX-16S/24SS/32N series voice chips through the DATA1 line for control, enabling functions such as controlling voice playback, stopping, and looping.

5.2 Communication Pins

| Package | DATA1 pin | BUSY pin |
| --- | --- | --- |
| SOP16 | 6 | 15 |
| TSSOP24 | 9 | 20 |
| QFN32 | 2 | 12 |

5.3 One-Wire Voice Address Correspondence

| Data (Hexadecimal) | Function |
| --- | --- |
| 00H | Play the 0th segment of voice |
| 01H | Play the 1st segment of voice |
| 02H | Play the 2nd segment of voice |
| … | … |
| DDH | Play the 221st segment of voice |
| DEH | Play the 222nd segment of voice |
| DFH | Play the 223rd segment of voice |

Note: To play the voice at a certain address, simply send that address, and the voice at that address will be played automatically. The time interval between two address commands must be greater than 4ms.

5.4 One-Wire Voice and Command Code Correspondence Table

| Command Code | Function | Description |
| --- | --- | --- |
| E0H – E7H | E0 has the lowest volume, E7 has the highest volume, with a total of 16 levels of volume adjustment | In voice playback, send this command when playback ends or in standby mode to adjust the volume. |
| F2H + XXH | Loop play XXH voice | Executing this command can loop the current voice. It can be sent when voice is playing or stopped. F2 can be interrupted by F1 command, ordinary address command, and F3 command during execution and becomes invalid. The loop play command should be sent first, then the play command. |
| F3H | Continuous code playback | F3H + voice address A, F3H + voice address B, F3H + voice address C… A, B, C addresses are consecutive. After playing the voice at address A, the address automatically increases by 1 to play the voice at address B, then automatically increases by 1 again to play the voice at address C, and so on until a non-consecutive address or no subsequent voice is encountered. |
| F4H | Stop playing the current voice/Stop current output | Executing this command can stop the currently playing voice or stop the current audio output. |
| F5H 00/01 | Audio output mode switching | F5H 00 switches to DAC, F5H 01 switches to PWM, and it is effective when set in wake-up state. |
| F4H 02 | Deep sleep (within 30μA) | Deep sleep, effective when set in wake-up state. |
| F4H 03 | Normal sleep (within 300μA) | Normal sleep, effective when set in wake-up state. |
| F6H + XXH + XXH | Linear frequency shift of XXH voice | XXH is the frequency shift direction (range: 00H – 01H), 00H is shifting to low frequency, 01H is shifting to high frequency; XXH is the frequency shift parameter (range: 0000H – FFFFH), 0000H is 7.777kHz (maximum frequency), FFFFH is 1kHz. It is recommended to send the loop play command (F2 command) first, then the frequency shift command. It is not recommended to switch the play index during frequency shifting. If changes are needed, send the F4 command to stop playback and then reprocess. After the end of playback, the normal state should be restored (i.e., shifted back to 0 point). |

Note: If a voice address is sent without the F3H command code while a voice is playing, the voice currently playing is interrupted. The continuous code command must always be paired with an address (e.g., F3H + 00H + F3H + 01H). F3H makes it easy to combine different voices: F3H + address A + F3H + address B, up to a maximum of 10 groups. The first group must be F3H + address. Combined playback can also be achieved by monitoring the BUSY level, which changes during voice playback and at the end of playback.
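
The byte sequences described in 5.3 and 5.4 can be assembled with a few helper functions. The following is a minimal sketch, assuming the command codes and limits listed above; the function names are hypothetical and the helpers only build byte lists, they do not drive the DATA1 line.

```python
MAX_ADDRESS = 0xDF        # highest voice address listed in 5.3
CMD_LOOP = 0xF2           # loop play, per the command table
CMD_CHAIN = 0xF3          # continuous code playback, per the command table
CMD_STOP = 0xF4           # stop current voice, per the command table
MAX_CHAIN_GROUPS = 10     # "maximum of 10 groups" from the note above


def play(address):
    """A plain address byte plays the voice stored at that address."""
    if not 0x00 <= address <= MAX_ADDRESS:
        raise ValueError("voice address must be in 00H..DFH")
    return [address]


def loop_play(address):
    """Loop command first, then the address to play."""
    return [CMD_LOOP] + play(address)


def chain(addresses):
    """Continuous code: every address is preceded by its own F3H."""
    if not 1 <= len(addresses) <= MAX_CHAIN_GROUPS:
        raise ValueError("continuous code accepts 1 to 10 groups")
    frame = []
    for address in addresses:
        frame += [CMD_CHAIN] + play(address)
    return frame


idle_loop = loop_play(0x00)                  # F2 00
startup = chain([0x01, 0x02, 0x03, 0x04])    # F3 01 F3 02 F3 03 F3 04
```

Keeping the address check in one place means the loop and continuous code helpers reject command codes such as E0H being passed by mistake as voice addresses.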

5.5 One-Wire Serial Port Timing Diagram

The chip is awakened on the falling edge of the DATA pin; after awakening, it can receive commands effectively only after an interval of 100ms. This command has power-down memory.

First, pull the data line low for 4~20ms (5ms is recommended), then send 8 bits of data, least significant bit first. The ratio of high-level time to low-level time represents the value of each data bit.

Note: The high level must come first, followed by the low level.

The recommended timing is 200μs:600μs or 400μs:1200μs (wider pulses improve communication stability in some circumstances). The allowed range is 40μs:120μs to 400μs:1200μs. Use level ratios of 3:1 and 1:3 to ensure communication stability.

If we want to send 96H, least significant bit first, the corresponding timing diagram is as follows:

[Figure: one-wire timing diagram for sending 96H]
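
The bit order and pulse widths for 96H can also be worked out in code. The sketch below is an illustration under one stated assumption: the text gives the 3:1 and 1:3 ratios but does not say which ratio encodes which bit value, so the mapping used here (logic 1 as high:low = 3:1, logic 0 as high:low = 1:3) is an assumption that should be checked against the datasheet. The function names are hypothetical, and the output is a list of (level, duration in μs) pairs rather than real pin activity.

```python
def bits_lsb_first(value):
    """Return the 8 bits of `value`, least significant bit first."""
    return [(value >> i) & 1 for i in range(8)]


def byte_to_pulses(value, unit_us=200, start_low_us=5000):
    """Describe one byte as (level, duration_us) pairs.

    ASSUMPTION: bit 1 = high for 3 units then low for 1 unit,
                bit 0 = high for 1 unit then low for 3 units.
    """
    if not 0 <= value <= 0xFF:
        raise ValueError("value must fit in one byte")
    if not 40 <= unit_us <= 400:
        raise ValueError("unit must be within 40..400 us")
    if not 4000 <= start_low_us <= 20000:
        raise ValueError("start pulse must be within 4..20 ms")
    pulses = [(0, start_low_us)]              # start: hold the line low
    for bit in bits_lsb_first(value):
        high_us = 3 * unit_us if bit else unit_us
        low_us = unit_us if bit else 3 * unit_us
        pulses.append((1, high_us))           # high level always comes first
        pulses.append((0, low_us))
    return pulses


bits = bits_lsb_first(0x96)                   # 0, 1, 1, 0, 1, 0, 0, 1
pulses = byte_to_pulses(0x96)
total_us = sum(duration for _, duration in pulses)
```

With the recommended 200μs unit and a 5ms start pulse, every bit takes 800μs regardless of its value, so one byte occupies 11.4ms on the line.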

If we want the chip to play the voice content at addresses 01/02/03/04 in sequence, that is, continuous code command playback: F3 + 01 + F3 + 02 + F3 + 03 + F3 + 04, the corresponding timing can be as shown in the following figure:

[Figure: one-wire timing diagram for the continuous code sequence F3 01 F3 02 F3 03 F3 04]

Note: Since the WT2003HX requires a certain initialization time after power-on and cannot respond to commands during initialization, it is recommended that users delay 2ms after sending a group of continuous code addresses before sending the next group when using the continuous code function; however, the interval between F3 and the address should still be 2ms.
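
The spacing described in this note can be written down as a simple event list. This is a minimal sketch, assuming the 2ms gap applies both between F3 and its address and between groups, as the note states; the event format and the function name `chain_schedule` are made up for illustration, and the gap is left as a parameter so it can be widened if a particular design needs it.

```python
CMD_CHAIN = 0xF3   # continuous code command from the table in 5.4


def chain_schedule(addresses, gap_ms=2):
    """Return ("send", byte) and ("wait_ms", n) events for a continuous code sequence."""
    if not 1 <= len(addresses) <= 10:
        raise ValueError("continuous code accepts 1 to 10 groups")
    events = []
    for address in addresses:
        for byte in (CMD_CHAIN, address):
            events.append(("send", byte))
            events.append(("wait_ms", gap_ms))
    return events[:-1]   # no wait is needed after the final byte


schedule = chain_schedule([0x01, 0x02, 0x03, 0x04])
```

A transmit routine would walk this list, sending each byte on DATA1 and sleeping for each wait entry.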

After dormancy, the chip is pulled up by default; the DATA will be pulled high when voice playback ends.

Solution Display

The solution meets the requirements of the national standard GB/T 37153-2018 for electric vehicle low-speed warning sounds, including the speed range, sound level limits, frequency requirements, sound types, and pause switch. The relevant clauses of the standard are excerpted below.

4.2 Sound Level Limits

4.2.1 The external noise of an electric vehicle measured according to the measurement method specified in 5.6.2 shall have at least two 1/3-octave bands not less than the sound levels specified in Table 1 among all the contained 1/3-octave bands, and at the same time meet the requirements for its total sound level in Table 1.

Table 1 Minimum Sound Level Limits

| Frequency (Hz) | External noise (dBA), uniform forward driving, 10 km/h | External noise (dBA), uniform forward driving, 20 km/h | External noise (dBA), uniform reverse driving, 6 km/h |
| --- | --- | --- | --- |
| Weighted sound level (overall sound level) | 52 | 58 | 49 |
| 160 | | | |
| 200 | 47 | 52 | |
| 250 | 45 | 51 | |
| 315 | 46 | 51 | |
| 400 | 47 | 52 | |
| 500 | 47 | 52 | |
| 630 | 48 | 53 | |
| 800 | 48 | 53 | |
| 1 000 | 48 | 53 | |
| 1 250 | 48 | 53 | |
| 1 600 | 46 | 51 | |
| 2 000 | 44 | 49 | |
| 2 500 | 41 | 46 | |
| 3 150 | 38 | 43 | |
| 4 000 | 36 | 41 | |
| 5 000 | 33 | 38 | |

If the sound emitted by an electric vehicle without an installed warning sound system meets all the total sound level requirements specified in Table 1 when tested according to 5.6.2 and all exceed by at least 3 dB(A), then it does not need to meet the 1/3-octave band sound level limits in Table 1 and the requirements in 4.3 Frequency Requirements.

4.2.2 The maximum noise emitted by a vehicle equipped with a warning sound system during driving shall not exceed 75 dB(A).
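
The rule in 4.2.1 and 4.2.2 can be expressed as a short check on measured levels. The sketch below is illustrative only and is not a substitute for the test procedure in the standard. It uses the 10 km/h band limits from Table 1 for 200 Hz to 5 000 Hz; the 160 Hz band is left out because its values are not legible in the table as reproduced here, and the overall minimum is passed in as a parameter because the value of 52 dB(A) used in the example is an assumed reading of the overall row. The function and variable names are hypothetical.

```python
# 1/3-octave band minimums in dB(A) at 10 km/h, taken from Table 1.
BAND_MIN_DB_10KMH = {
    200: 47, 250: 45, 315: 46, 400: 47, 500: 47,
    630: 48, 800: 48, 1000: 48, 1250: 48, 1600: 46,
    2000: 44, 2500: 41, 3150: 38, 4000: 36, 5000: 33,
}
MAX_OVERALL_DB = 75   # upper limit from 4.2.2


def meets_limits(band_levels_db, overall_db, overall_min_db,
                 band_min_db=BAND_MIN_DB_10KMH):
    """True if at least two bands reach their minimum and the overall level is in range."""
    bands_ok = sum(
        1 for hz, limit in band_min_db.items()
        if band_levels_db.get(hz, float("-inf")) >= limit
    )
    return bands_ok >= 2 and overall_min_db <= overall_db <= MAX_OVERALL_DB


measured = {630: 50.0, 1000: 49.0, 2000: 40.0}   # made-up measurement
ok = meets_limits(measured, overall_db=55.0, overall_min_db=52)
```

In this made-up measurement the 630 Hz and 1 000 Hz bands reach their minimums, so the two-band condition holds even though the 2 000 Hz band falls short.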

In engine sound simulation, frequency shifting is used to create a dynamic change in sound that imitates the pitch changes of a car engine at different speeds and loads. Frequency shifting has traditionally been applied in fields such as audio signal processing and hearing aids, but the concept extends to engine sound simulation with the following workflow:

  • Start
  • Preparation Stage: Prepare audio materials (collect engine sound samples at different speeds); Set up the test environment (build a simulation platform, connect WT2003HX with external control units)
  • Parameter Initialization: Set basic frequency-shifting parameters (initial frequency offset, frequency-shifting curve type, etc.), example: F8 01 04 1C shifts to high frequency by 1052Hz; Configure the audio processing module (determine whether to use external DSP or MCU to control frequency shifting)
  • Preliminary Testing and Evaluation: Playback test (play the engine sound after frequency-shifting processing, and initially verify the effect); Collect feedback (record indicators such as sound quality, naturalness, and response time)
  • Parameter Fine-tuning: Analyze feedback (identify existing problems, such as excessive frequency offset, unnatural transition, etc.); Adjust parameters (modify frequency offset, adjust filter settings, optimize algorithms, etc.); Iterative testing (repeat playback and evaluation until satisfied)
  • Performance Optimization: Power consumption optimization (check and reduce unnecessary operations to save electrical energy); Response time optimization (speed up algorithm processing to improve real-time performance); Storage optimization (optimize audio sample storage to reduce memory usage)
  • Final Verification: User experience testing (invite target user groups to audition and collect feedback); Performance stability testing (long-term operation test to ensure no faults)
  • End: Record the final parameter configuration and prepare for deployment or production

4.3.2 Frequency Shifting

When the vehicle is moving forward at a certain speed within the range of 5 km/h to 20 km/h, among the sounds emitted by the warning sound system, at least one 1/3-octave frequency specified in Table 1 shall increase as the vehicle speed increases or decrease as the vehicle speed decreases. The minimum average shift rate of this frequency shall be not less than 0.8%/(km/h).

If there are multiple 1/3-octave frequencies specified in Table 1 that shift simultaneously, only one frequency shift needs to meet the requirement.

| Reference Frequency (Hz) | Uniform Forward Driving Speed (km/h) | Minimum Frequency Shift (Hz) | Example of Frequency-Shifting Code |
| --- | --- | --- | --- |
| 1000 | 1 | 8 | F8 01 00 08 |
| 1000 | 2 | 16 | F8 01 00 10 |
| 1000 | 5 | 40 | F8 01 00 28 |
| 1000 | 10 | 80 | F8 01 00 50 |
| 1000 | 20 | 160 | F8 01 00 A0 |
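
The rows of this table follow from the 0.8%/(km/h) requirement: for a 1000 Hz reference, each km/h of speed calls for at least 8 Hz of shift. The sketch below reproduces that arithmetic and the example codes. It rests on a reading of the examples rather than on a documented format: the four bytes are taken to be an opcode, a direction byte (01 for shifting up), and the shift in Hz as a 16-bit value with the high byte first. The opcode is left as a parameter because the examples here use F8 while the command table in 5.4 lists F6H. The function names are hypothetical.

```python
def min_shift_hz(ref_hz, speed_kmh):
    """Smallest whole-Hz shift meeting 0.8 % per km/h (integer inputs)."""
    numerator = ref_hz * speed_kmh * 8          # 0.8 % = 8 / 1000
    return (numerator + 999) // 1000            # round up to a whole Hz


def shift_frame(shift_hz, up=True, opcode=0xF8):
    """ASSUMED layout: opcode, direction, shift high byte, shift low byte."""
    if not 0 <= shift_hz <= 0xFFFF:
        raise ValueError("shift must fit in 16 bits")
    direction = 0x01 if up else 0x00
    return [opcode, direction, (shift_hz >> 8) & 0xFF, shift_hz & 0xFF]


def to_hex(frame):
    """Format a frame the way the table writes it, e.g. 'F8 01 00 28'."""
    return " ".join(f"{byte:02X}" for byte in frame)


row_5kmh = to_hex(shift_frame(min_shift_hz(1000, 5)))
row_20kmh = to_hex(shift_frame(min_shift_hz(1000, 20)))
shift_1052 = to_hex(shift_frame(1052))
```

Under this reading, the earlier example of shifting up by 1052 Hz produces the same four bytes as the code quoted in the parameter initialization step.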

Audio Sampling and Processing:
First, collect a series of real car engine sound samples, covering the entire range from idle speed to high speed. Use audio editing software to preprocess these samples to ensure the quality and consistency of each sample.

Frequency Shifting Algorithm Design:
The algorithm should dynamically adjust the audio frequency according to the simulated vehicle speed or throttle position to simulate the natural frequency changes of the engine as the speed changes.

Control System Integration:
Integrate the WT2003HX chip with an external control system (such as an MCU). The MCU is responsible for calculating the frequency shifting parameters of the audio to be played according to the real-time status of the simulated car (such as simulated vehicle speed and throttle pedal position). The control system sends commands to the WT2003HX to select or dynamically adjust the playback of engine sound samples after corresponding frequency shifting processing.
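
One way the MCU side of this integration might look is sketched below. It is a hypothetical design, not the vendor's firmware: the class name, the 8 Hz per km/h slope (taken from the 1000 Hz example table), the minimum step, and the four-byte frame layout with opcode F8 are all assumptions for illustration. The controller converts the simulated speed into a shift and only produces a new command when the shift has changed enough, which keeps traffic on the one-wire link low.

```python
class ShiftController:
    """Turns simulated vehicle speed into frequency shift frames (illustrative)."""

    def __init__(self, hz_per_kmh=8, min_step_hz=8, opcode=0xF8):
        self.hz_per_kmh = hz_per_kmh      # assumed slope: 0.8 % of 1000 Hz per km/h
        self.min_step_hz = min_step_hz    # ignore changes smaller than this
        self.opcode = opcode              # assumed opcode, see the example codes
        self.last_shift_hz = 0            # shift most recently sent to the chip

    def update(self, speed_kmh):
        """Return a frame to send, or None if the change is too small."""
        shift = round(speed_kmh * self.hz_per_kmh)
        shift = max(0, min(0xFFFF, shift))
        if abs(shift - self.last_shift_hz) < self.min_step_hz:
            return None
        self.last_shift_hz = shift
        # ASSUMED layout: opcode, direction (01 = up), high byte, low byte
        return [self.opcode, 0x01, (shift >> 8) & 0xFF, shift & 0xFF]


controller = ShiftController()
frames = [controller.update(speed) for speed in (0, 5, 5.5, 10)]
```

In this run only the 5 km/h and 10 km/h readings produce frames; the 0 km/h and 5.5 km/h readings are within the minimum step of the last shift sent and are skipped.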

Real-Time Response and Interaction:
Achieve real-time response to ensure that the engine sound changes quickly and naturally when the user operates (such as accelerating or decelerating); provide adjustable parameters to allow users to adjust the characteristics of the engine sound (such as volume and pitch) according to personal preferences or the needs of the simulation scenario.

Audio Playback Optimization:
Utilize the high-quality audio playback capability of WT2003HX to ensure that the engine sound after frequency shifting processing is undistorted, natural, and smooth. Optimize storage management, and reasonably arrange storage space to store engine sound samples after frequency shifting processing under different states.

User Experience:
Focus on the end-user experience, and ensure through repeated tuning that the simulated engine sound is realistic and coherent, enhancing the immersion of simulated driving, games, or educational tools.

The WT2003HX chip can be cleverly integrated with external processing units and control logic to design a complete solution, effectively applying frequency shifting technology in the scenario of simulating car engine sounds and improving the authenticity and interactivity of the simulation. This requires close collaboration between software and hardware, as well as in-depth understanding of audio processing and control logic.

If you're looking for a suitable voice chip for your company's products, feel free to contact us anytime by email or through our contact form. Our team will get in touch with you as soon as possible to provide free samples and a detailed quote.

We are WayTronic, founded in 1999, with over 25 years of experience specializing in custom voice chip solutions. Backed by a team of 100+ senior engineers, we can precisely meet the diverse needs of different products. We also offer one-on-one selection guidance and technical support to help you find a practical and cost-effective voice chip with minimal effort.

We look forward to working with you for mutual success!
