Ying Hu - School of Computer Science and Technology

Ying Hu (胡英)

Published: 2025-09-17

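Associate Professor, Ph.D.; Deputy Head of the Department of Electronics

Research areas: audio signal processing, content-based music information retrieval, deep learning, and intelligent computing

Office & lab: Rooms 112 and 406, Section B, Information Building, Xinjiang University (Boda Campus)

Email: huying@xju.edu.cn, huying_75@sina.com

Phone: (+86)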

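Education

2009/03–2016/03, School of Electronic and Information Engineering, Xi'an Jiaotong University, Information and Communication Engineering, Ph.D.
1998/09–2002/06, Xinjiang University, Communication and Information Systems, M.S.
1993/09–1997/06, Xinjiang University, Electronics and Information Systems, B.S.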

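Work Experience

2018/09–present, Xinjiang University, School of Computer Science and Technology, Associate Professor.
2006/10–2018/08, Xinjiang University, School of Information Science and Engineering, Lecturer.
2004/03–2006/10, ZTE Corporation, Urumqi Office, Business and Technical Manager.
1997/07–2004/02, Confidential Communications Office, State Security Department of the Autonomous Region, Staff Member.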

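Research Topics

Sound source separation, sound source localization, sound event detection, speech emotion analysis, and content-based music information retrieval, all based on deep learning and intelligent computing.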

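Awards

Research awards:

1. 2022: Third Prize, 16th Autonomous Region Outstanding Academic Paper in Natural Science.

Teaching awards:

1. 2020: University-level Outstanding Advisor.

2. 2023: Led the university-level "golden course" project on blended online and offline teaching of "Circuits and Electronics", which passed final acceptance; ranked 1st.

3. 2023: Participated in the project "Building a new integrated 'course, innovation, competition' model to explore the training of applied and innovative electronic information talent", which won the First Prize of the Autonomous Region Education and Teaching Achievement Award; ranked 6th.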

Academic Service

Member of IEEE, the International Speech Communication Association (ISCA), and the China Computer Federation (CCF; Technical Committee on Speech Dialogue and Auditory Processing).
Reviewer for the international journals Speech Communication, Digital Signal Processing, and Computer Speech and Language, and for the international conferences IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Interspeech, and IEEE International Conference on Multimedia and Expo (ICME).

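Projects Led

Central Government Guidance Fund for Local Science and Technology Development (ZYYD2025JD10), Construction of an innovation platform for edge data governance based on intelligent signal detection and processing, funding: CNY 3.24 million, period: 2025/01–2026/12.
Open project of the Beijing National Research Center for Information Science and Technology (04410307724), Research on far-field speech recognition methods for real-world scenarios, funding: CNY 100,000, period: 2024/11–2025/10.
Industry-sponsored project, Shanghai HiSilicon Technologies Co., Ltd., commissioned speech enhancement development project, funding: CNY 1.11 million, period: 2022/07–2023/07.
State Language Commission research project (research center project), project no. ZDI145-21, Research on multi-source separation of monaural mixed signals, funding: CNY 150,000, period: 2022/01–2023/12.
National Natural Science Foundation of China (Regional Science Fund), project no. 61761041, Research on technologies for singing assistance systems, funding: CNY 370,000, period: 2018/01–2021/12.
CERNET Next Generation Internet Technology Innovation Project, project no. NGII20190309, Research on IPv6-based communication technology for mutual perception of vehicle motion attitude, funding: CNY 100,000, period: 2020/01–2021/01.
Natural Science Foundation of Xinjiang Uygur Autonomous Region, project no. 2016D01C061, Research on speech separation under strong noise based on non-negative matrix partial co-factorization, funding: CNY 70,000, period: 2017/01–2019/12.
Xinjiang University Natural Science Foundation, project no. BS16023, Research on singing voice separation from monaural songs, funding: CNY 30,000, period: 2017/04–2019/04.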

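Projects Participated In

National Natural Science Foundation of China, Xinjiang Joint Fund, project no. U1903213, Research on key technologies for robust abnormal event detection via audio-visual multimodal collaboration, period: 2020/01–2023/12, funding: CNY 2.78 million.
National Natural Science Foundation of China, Young Scientists Fund, project no. 61603323, Research on **** scene text detection methods, period: 2017/01–2019/12, funding: CNY 204,000.
Natural Science Foundation of Xinjiang Uygur Autonomous Region, Youth Fund, project no. 2016D01C079, Research on visual object detection methods based on a hierarchical cognitive model, period: 2017/01–2019/12, funding: CNY 25,000.
Teaching Reform Research Project for Regular Higher Education Institutions of the Autonomous Region (general teaching reform project), project no. 2017JG029, Exploration of training models for innovation and entrepreneurship teams, period: 2017/09–2019/12, funding: CNY 20,000.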

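Publications

More than 40 papers published, including more than 10 SCI journal papers and 20 international conference papers. These include 6 papers in high-level SCI journals (TASLP, SPL) and 10 high-level international conference papers, among them 4 ICASSP papers (CCF B), 3 ICME papers (CCF B), and 7 Interspeech papers (CCF C).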

2025:

Ying Hu*, Qin Yang, Wenbing Wei, Li Lin, Liang He, Zhijian Ou, Wenzhong Yang. MN-Net: Speech Enhancement Network Via Modeling the Noise. IEEE Trans. Audio Speech and Language Processing (TASLP), vol. 33, pp. 1208–1219, Mar. 2025.
Ying Hu, Jiabo Jing, Fan Li, Lijun He, Li Lin, Wenzhong Yang. A Singing Melody Extraction Network Via Self-Distillation and Multi-Level Supervision. The 50th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Hyderabad, India, April 4-11, 2025.

2024:

Wei W, Hu Y*, Huang H, He L. IIFC-Net: A Monaural Speech Enhancement Network with High-Order Information Interaction and Feature Calibration. IEEE Signal Processing Letters (SPL), vol. 31, 2024.
Ying Hu*, Haotao Xu, Liang He, and Hao Huang. SMMA-Net: An Audio Clue-Based Target Speaker Extraction Network with Spectrogram Matching and Mutual Attention. In ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 1496-1500). IEEE.
Ma, Mengzhen, Ying Hu*, Liang He, and Hao Huang. "GLFER-Net: a polyphonic sound source localization and detection network based on global-local feature extraction and recalibration." EURASIP Journal on Audio, Speech, and Music Processing 2024, no. 1: 34.
Hu Y, Yang H, Huang H, et al. Cross-modal Features Interaction-and-Aggregation Network with Self-consistency Training for Speech Emotion Recognition[C]//Proc. Interspeech 2024. 2024: 2335-2339.
Fujie Xu, Huamin Yang, and Ying Hu, "DA-KWFormer: A Domain Adaptation Network with K-Weight Transformer for Speech Emotion Recognition," in National Conference on Man-Machine Speech Communication 2024.
Xin Fan, Wenjie Fang, and Ying Hu, "ASD-Diff: Unsupervised Anomalous Sound Detection With Masked Diffusion Model," in National Conference on Man-Machine Speech Communication 2024.
Liusong Wang, Yuan Gao, Kaimin Cao, and Ying Hu, "SESNet: A Speech Enhancement and Separation Network in Noisy Reverberant Environments," in National Conference on Man-Machine Speech Communication 2024.
Huamin Yang and Ying Hu, "D-AGNet: A Dual-branch Network with Attention Guidance for Speech Emotion Recognition," in National Conference on Man-Machine Speech Communication 2024.

2023:

Ying Hu*, Shijing H, Huaming Y, Liang He, and Hao Huang. "A Joint Network Based on Interactive Attention For Speech Emotion Recognition." IEEE International Conference on Multimedia and Expo, ICME (2023).
Yuan Gao, Ying Hu*, Liusong Wang, Hao Huang, and Liang He. "MTANet: Multi-band Time-frequency Attention Network for Singing Melody Extraction from Polyphonic Music." Proc. Interspeech 2023.
Wang, M., Li, Y. and Hu, Y., 2023, September. Improved Self-Consistency Training with Selective Feature Fusion for Sound Event Detection. In 2023 6th International Conference on Information Communication and Signal Processing (ICICSP) (pp. 460-464). IEEE.
Hou, S., Yang, H. and Hu, Y., 2023, September. A Lightweight Speech Emotion Recognition Model with Bias-Focal Loss. In 2023 6th International Conference on Information Communication and Signal Processing (ICICSP) (pp. 644-648). IEEE.
Zhu, Mengying, Liusong Wang, and Ying Hu. "A Lightweight Music Source Separation Model with Graph Convolution Network." In National Conference on Man-Machine Speech Communication, pp. 23-36. Singapore: Springer Nature Singapore, 2023.
Fang, Wenjie, Xin Fan, and Ying Hu. "Multi-branch Network with Cross-Domain Feature Fusion for Anomalous Sound Detection." In National Conference on Man-Machine Speech Communication, pp. 215-226. Singapore: Springer Nature Singapore, 2023.

2022:

Tang, Yuwu, Ying Hu*, Liang He, and Hao Huang. "A bimodal network based on Audio–Text-Interactional-Attention with ArcFace loss for speech emotion recognition." Speech Communication, 143 (2022): 21-32.
Hu, Ying*, Yadong Chen, Wenzhong Yang, Liang He, and Hao Huang. "Hierarchic Temporal Convolutional Network with Cross-Domain Encoder for Music Source Separation." IEEE Signal Processing Letters, SPL (2022).
Chen, Yadong, Ying Hu*, Liang He, and Hao Huang. "Multi-stage music separation network with dual-branch attention and hybrid convolution." Journal of Intelligent Information Systems, JIIS (2022): 1-22.
Ying Hu*, Sun, Xinghao, Liang He, and Hao Huang. "A Generalized Network with Multi-scale Densely Connection and Residual Attention for Sound Source Localization and Detection", Journal of the Acoustical Society of America, JASA, 151(3), Mar. 2022.
Qiu, Wenbo, and Ying Hu*. "Dual-Path Hybrid Attention Network for Monaural Speech Separation." IEEE Access 10 (2022): 78754-78763.
Xinghao Sun, Mengzhen Ma, Ying Hu*, Sound source localization and detection based on parameter transfer learning, Proceedings of the 24th International Congress on Acoustics (ICA), October 24–28, 2022, Gyeongju, Korea.
Zihao Chen, Wenbo Qiu, Haitao Xu, Ying Hu*, Hierarchic Temporal Convolutional Network Attention Fusion for Target Speaker Extraction, Asia-Pacific Signal and Information Processing Association (APSIPA2022), Chiang Mai, Thailand, 2022/11/07–2022/11/10.
Yunlong Li, Xiujuan Zhu, Mingyu Wang, and Ying Hu*, Self-Consistency Training With Hierarchical Temporal Aggregation for Sound Event Detection, Asia-Pacific Signal and Information Processing Association (APSIPA2022), Chiang Mai, Thailand, 2022/11/07–2022/11/10.
Liusong Wang, Wenbing Wei, Yadong Chen, and Ying Hu*, D²Net: A Denoising and Dereverberation Network Based on Two-branch Encoder and Dual-path Transformer, Asia-Pacific Signal and Information Processing Association (APSIPA2022), Chiang Mai, Thailand, 2022/11/07–2022/11/10.
Hu, Ying*, Yuwu Tang, Hao Huang, and Liang He. "A Graph Isomorphism Network with Weighted Multiple Aggregators for Speech Emotion Recognition." Proc. Interspeech 2022
Hu, Ying*, Xiujuan Zhu, Yunlong Li, Hao Huang, and Liang He. "A Multi-grained based Attention Network for Semi-supervised Sound Event Detection." Proc. Interspeech 2022
Wang, K., Peng, Y., Huang, H., Hu, Y. and Li, S., 2022, May. Mining Hard Samples Locally And Globally For Improved Speech Separation. In ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 6037-6041). IEEE.

2021:

Ma, Wenfang, Ying Hu*, and Hao Huang. "Dual Attention Network for Pitch Estimation of Monophonic Music." Symmetry 13.7 (2021): 1296.
Kang, X., Huang, H.*, Hu, Y., & Huang, Z. (2021). Connectionist temporal classification loss for vector quantized variational auto-encoder in zero-shot voice conversion. Digital Signal Processing, 116(6), 103110.
Sun, Xinghao, Ying Hu*, Xiujuan Zhu, and Liang He. "Sound Event Localization and Detection Based on Adaptive Hybrid Convolution and Multi-scale Feature Extractor." In DCASE 2021-6th Workshop on Detection and Classification of Acoustic Scenes and Events. 2021.
Zhu, Xiujuan, Ying Hu*, Xinghao Sun, and Liang He. "Multi-Scale Network Based on Split Attention For semi-supervised Sound Event Detection." In DCASE 2021-6th Workshop on Detection and Classification of Acoustic Scenes and Events. 2021.
Huang, H.*, Wang, K., Hu, Y., & Li, S. (2021). Encoder-Decoder based pitch tracking and joint model training for Mandarin tone classification. In Proc. IEEE-ICASSP, 2021. IEEE.

2020:

Xinglei Dong, Ying Hu*, Hao Huang, Wushouer Silamu. Monaural speech separation under strong noise based on convolutional non-negative matrix partial co-factorization (基于卷积非负矩阵部分联合分解的强噪声单声道语音分离). Acta Automatica Sinica (自动化学报), 2020, vol. 46, no. 6. Impact factor (CJCR): 2.793
Geng H, Hu Y*, Huang H. Monaural Singing Voice and Accompaniment Separation Based on Gated Nested U-Net Architecture[J]. Symmetry, 2020, 12(6):1051. Impact factor: 2.645
Zhong, Y., Hu, Y.*, Huang, H. and Silamu, W., A Lightweight Model Based on Separable Convolution for Speech Emotion Recognition. Proc. Interspeech 2020, pp. 3331-3335.

Before 2019:

Huang, Hao*, Xu, Haihua, Hu, Ying, & Zhou, Gang (2017). A transfer learning approach to goodness of pronunciation based automatic mispronunciation detection. Journal of the Acoustical Society of America, JASA, 142(5), 3165. Impact factor: 1.902
Hu, Ying and Guizhong Liu*. Separation of Singing Voice Using Nonnegative Matrix Partial Co-Factorization for Singer Identification. IEEE Trans. Audio Speech and Language Processing. TASLP, vol. 23, no. 4, April 2015, pp. 643-653. Impact factor: 3.918
Hu, Ying, and Guizhong Liu*. Singer identification based on computational auditory scene analysis and missing feature methods. Journal of Intelligent Information Systems, JIIS 2014, vol. 42, no. 3, pp. 333-352. Impact factor: 1.667
Hu, Ying, and Guizhong Liu*. Instrument identification and pitch estimation in multi-timbre polyphonic musical signals based on probabilistic mixture model decomposition. Journal of Intelligent Information Systems, JIIS 2013, vol. 40, no. 1, pp. 141-158. Impact factor: 1.667
Hu, Ying*, Wang, Liejun, Huang, Hao, & Zhou, Gang. Monaural Singing Voice Separation by Non-negative Matrix Partial Co-Factorization with Temporal Continuity and Sparsity Criteria. International Conference on Intelligent Computing, ICIC (2016). Springer International Publishing.
Zhou, Gang*, Liu, Yajun, Shi, Fei, & Hu, Ying. (2016). Scene Text Detection Based on Text Probability and Pruning Algorithm. Intelligent Computing Methodologies. Springer International Publishing.
Hu, Ying, and Guizhong Liu*. Automatic singer identification using missing feature methods. In Multimedia and Expo (ICME), 2013 IEEE International Conference on, 2013, pp.1-6.
Hu, Ying, and Guizhong Liu*. Dynamic characteristics of musical note for musical instrument classification. In Signal Processing, Communications and Computing (ICSPCC), IEEE International Conference on, 2011, pp. 1-6. IEEE.

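Granted invention patents:

Xinglei Dong, Ying Hu, Hao Huang. Multi-speaker speech separation method based on convolutional neural networks and deep clustering [P]. Xinjiang Uygur Autonomous Region: CN110459240B, 2021-01-12.

Invention patent applications accepted:

Yadong Chen, Wenbo Qiu, Ying Hu, et al. Sound source separation method based on a dual-attention mechanism and a multi-stage hybrid convolutional network. 2021-06-15.
Wenbo Qiu, Yadong Chen, Ying Hu, et al. Sound source separation method based on shallow feature reactivation and multi-stage hybrid attention. 2021-11-09.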

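Evaluations

Participated in the Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE2021), Task 3 (sound event localization and detection) and Task 4 (sound event detection in real-world scenarios; ranked 19th of 50). Participated in DCASE2022 Task 3 (ranked 11th) and Task 4 (ranked 13th).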