The results of the TextVQA Challenge at the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) were recently announced. A team made up of Gao Chenyu, a first-year graduate student, and Zhu Qi, a senior undergraduate, both from the School of Computer Science, took first place under the guidance of Professor Wang Peng and was invited to take part in the CVPR 2020 VQA Workshop.
CVPR, first held by the IEEE in 1983, is a top-tier conference in the field of computer vision and pattern recognition. The TextVQA Challenge held at the conference requires an algorithm to understand both the text and the objects in an image and to generate answers to open-ended questions about it. The task draws on a range of computer vision and natural language processing techniques, including scene text recognition (STR), object detection and recognition, machine reasoning, and question answering, which makes it one of the most complex image-text interaction tasks to date.
As core members of the team, Gao Chenyu and Zhu Qi designed, under Professor Wang's guidance, a text-based visual question answering algorithm built on a heterogeneous graph neural network (see Fig. 1 for the algorithm framework). By explicitly modeling the relationships between text regions and object regions in the image, they improved the algorithm's overall reasoning ability and its performance on the competition tasks (see Fig. 2 for visualizations of the algorithm's results). Their related work has been submitted to top-tier computer vision conferences.
Fig. 1 Algorithm framework diagram
Fig. 2 Visualization of the algorithm's results
In addition, Professor Wang Peng has reportedly led undergraduates and junior graduate students in a number of domestic and international AI competitions since 2019, with excellent results. To address the problems that undergraduates and junior graduate students show in AI courses, such as weak theoretical foundations, limited practical ability, and insufficient innovation, Professor Wang has developed an all-round model for cultivating AI innovation ability that combines the classroom (classroom teaching), competition (AI competitions), and research (scientific research and paper writing). Through classroom teaching, he gives his students multidisciplinary knowledge of artificial intelligence and cultivates their critical thinking and awareness of innovation. By encouraging students to take part in AI-related competitions, he stimulates their interest in research and their morale, which quickly improves their practical ability. By involving them in in-depth scientific research and paper writing, he helps them develop innovative ability and a rigorous academic attitude. Classroom teaching, competition, and research are carried out step by step and complement one another, enriching teaching methods, accelerating comprehensive and systematic training, and improving students' AI innovation ability across the board. As a result, a new model for training innovative talent among undergraduates and junior graduate students is taking shape.
Written by: Gao Chenyu, Reviewed by: Gao Wu