The results of the 6th Detection and Classification of Acoustic Scenes and Events challenge (DCASE 2020) were recently published. PhD candidate Jisheng Bai and master's student Chen Chen of the Acoustic Perception team at the School of Marine Science and Technology, guided by Prof. Jianfeng Chen, achieved strong results. In Task 5, Urban Sound Tagging with Spatiotemporal Context, they placed second in the world and first in the country. In Task 2, Unsupervised Detection of Anomalous Sounds for Machine Condition Monitoring, they placed second in the country and eighteenth in the world.
DCASE, organized by the Institute of Electrical and Electronics Engineers (IEEE) Audio and Acoustic Signal Processing Technical Committee (AASP), is now in its sixth edition. It has attracted wide attention from top acoustic research institutions at home and abroad, including Google, Cornell University and Carnegie Mellon University, and a growing number of researchers are taking part.
In 2018, Mou Wang, a PhD candidate from the Center of Intelligent Acoustics and Immersive Communications (CIAIC), organized a DCASE competition team made up of students working on audio and signal processing from CIAIC, the Acoustic Perception team and the Audio Speech and Language Processing Research Group (ASLP). The team was guided by three professors: Jianfeng Chen, Xiaolei Zhang and Zhonghua Fu. CIAIC provided GPUs and other computing resources. In the DCASE2018 and DCASE2019 challenges, the team prepared actively and overcame difficulties. They achieved strong results for two consecutive years and gained valuable experience.
DCASE2020 comprised six tasks: Acoustic Scene Classification with Multiple Devices, Unsupervised Detection of Anomalous Sounds for Machine Condition Monitoring, Sound Event Localization and Detection, Sound Event Detection and Separation in Domestic Environments, Urban Sound Tagging with Spatiotemporal Context, and Automated Audio Captioning.
Anomalous sound detection (ASD) is the task of identifying whether the sound emitted by a target machine is normal or anomalous. Task 2 requires detecting anomalous machine sounds when only normal sound samples are provided as training data. Automatic detection of mechanical failures is an important technology for the fourth industrial revolution, which includes factory automation based on artificial intelligence.
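To illustrate what training on normal samples only can look like, the following is a minimal sketch and not the method used by the team or by any DCASE participant. It assumes each clip has already been reduced to a fixed-length feature vector, fits a per-dimension mean and standard deviation on normal clips, and scores new clips by their average squared deviation. The function names, the synthetic data and the 95% threshold quantile are all hypothetical choices made for the example.

```python
import random
from statistics import mean, pstdev


def fit_normal_model(normal_features):
    """Estimate per-dimension mean and standard deviation from normal clips only."""
    dims = list(zip(*normal_features))
    means = [mean(d) for d in dims]
    # Guard against zero variance so the score below never divides by zero.
    stds = [max(pstdev(d), 1e-8) for d in dims]
    return means, stds


def anomaly_score(model, features):
    """Average squared z-score: larger means further from the normal data."""
    means, stds = model
    return mean(((x - m) / s) ** 2 for x, m, s in zip(features, means, stds))


def pick_threshold(model, normal_features, quantile=0.95):
    """Choose a decision threshold from the scores of the normal training clips."""
    scores = sorted(anomaly_score(model, f) for f in normal_features)
    index = min(len(scores) - 1, int(quantile * len(scores)))
    return scores[index]


# Synthetic stand-in for features of normal machine sounds (assumption:
# 8-dimensional vectors; real systems would use spectrogram-based features).
rng = random.Random(0)
normal_train = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(200)]

model = fit_normal_model(normal_train)
threshold = pick_threshold(model, normal_train)

normal_clip = [0.0] * 8      # close to the training distribution
anomalous_clip = [6.0] * 8   # far from anything seen in training

for name, clip in [("normal", normal_clip), ("anomalous", anomalous_clip)]:
    score = anomaly_score(model, clip)
    print(name, "flagged" if score > threshold else "ok")
```

No anomalous example is needed at training time: the threshold is derived entirely from the normal clips, which mirrors the constraint described for Task 2.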
Figure 1: Overview of development and evaluation datasets.
Several leading international organizations also took part in Task 2, including Amazon, Samsung, IBM and Intel, along with well-known universities at home and abroad such as Tsinghua University, University of Illinois at Urbana-Champaign and University of Electronic Science and Technology. In the end, the Acoustic Perception team stood out from more than 40 teams, placing second in the country and eighteenth in the world. First place went to Amazon.
Figure 2: Task 2 team rankings
The Urban Sound Tagging with Spatiotemporal Context task is to detect which noise pollution sources are present in a 10-second recording, given the time and location at which the audio was recorded. The noise pollution sources are divided into 8 coarse categories and 23 fine categories, such as engine, machinery/non-machinery impact, chainsaw, alarm, music and human voice. The motivation for this task is to address a real-world problem by building automated monitoring tools that assist in monitoring, analyzing and reducing urban noise pollution.
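As a rough illustration of how recording time and location might be combined with audio for multi-label tagging, here is a minimal sketch. It is not the team's system. It assumes the context is available as hour of day, day of week, latitude and longitude, that an audio embedding has already been computed, and that a linear layer with one sigmoid per coarse category produces the tags. All names, the reference coordinates and the weights are hypothetical.

```python
import math
import random

NUM_COARSE = 8  # the task defines 8 coarse categories


def encode_context(hour, weekday, latitude, longitude, center):
    """Encode time cyclically and location as an offset from a reference point."""
    hour_angle = 2 * math.pi * hour / 24
    day_angle = 2 * math.pi * weekday / 7
    return [
        math.sin(hour_angle), math.cos(hour_angle),  # 23:00 ends up next to 00:00
        math.sin(day_angle), math.cos(day_angle),
        latitude - center[0], longitude - center[1],
    ]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def tag_probabilities(audio_embedding, context, weights, biases):
    """Independent sigmoid per class, since several sources can be present at once."""
    features = audio_embedding + context
    return [
        sigmoid(sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(weights, biases)
    ]


rng = random.Random(0)
center = (40.73, -73.99)                       # hypothetical reference point
audio_embedding = [rng.uniform(-1, 1) for _ in range(16)]  # placeholder embedding
context = encode_context(hour=8, weekday=1, latitude=40.75,
                         longitude=-73.98, center=center)

# Untrained, randomly initialised weights, used only to show the data flow.
feature_size = len(audio_embedding) + len(context)
weights = [[rng.uniform(-0.1, 0.1) for _ in range(feature_size)]
           for _ in range(NUM_COARSE)]
biases = [0.0] * NUM_COARSE

probs = tag_probabilities(audio_embedding, context, weights, biases)
present = [i for i, p in enumerate(probs) if p >= 0.5]
print(present)
```

The sine and cosine encoding keeps late-night and early-morning hours close together, and the per-class sigmoid lets any number of categories be active in the same recording, which a single softmax over classes would not allow.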
Figure 3: Overview of a system for audio tagging with spatiotemporal context.
This was the second time New York University organized this task. The Acoustic Perception team went a step further this year, placing second in the world and first in the country. Advances in urban environmental sound classification and detection will help address urban noise pollution and support the development of smart cities.
Figure 4: Task 5 team rankings
Attachment: Our team's competition results over the years
Figure 5: Best ranking of our team over the years (International)
Writer: Bai Jisheng