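Ying Hu (胡英)
Associate Professor, Ph.D.; Deputy Head of the Department of Electronics
Research areas: audio signal processing, content-based music information retrieval, deep learning and intelligent computing
Office and lab: Rooms 112 and 406, Section B, Information Building, Xinjiang University (Boda Campus)
Email: huying@xju.edu.cn, huying_75@sina.com
Phone: (+86)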
  
 
 
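Education
 - 2009/03–2016/03, School of Electronic and Information Engineering, Xi'an Jiaotong University, Information and Communication Engineering, Ph.D.
- 1998/09–2002/06, Xinjiang University, Communication and Information Systems, M.S.
- 1993/09–1997/06, Xinjiang University, Electronics and Information Systems, B.S.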
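Work Experience
 - 2018/09–present, Xinjiang University, School of Computer Science and Technology, Associate Professor.
- 2006/10–2018/08, Xinjiang University, School of Information Science and Engineering, Lecturer.
- 2004/03–2006/10, ZTE Corporation, Urumqi Office, Business and Technical Manager.
- 1997/07–2004/02, Confidential Communications Division, Autonomous Region Department of State Security, Staff Member.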
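Research Interests
Sound source separation, sound source localization, sound event detection, speech emotion analysis, and content-based music information retrieval, all based on deep learning and intelligent computing.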
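Awards
Research awards:
1. 2022: Third Prize, 16th Autonomous Region Outstanding Academic Paper Award in Natural Science.
Teaching awards:
1. 2020: University-level Outstanding Advisor.
2. 2023: Led the university-level "Golden Course" project on blended online and offline teaching for Circuits and Electronics, which passed acceptance review (ranked 1st).
3. 2023: Participated in the project "Building a New Integrated 'Course, Innovation, Competition' Model to Explore the Training of Applied and Innovative Talent in Electronic Information," which won the First Prize of the Autonomous Region Education and Teaching Achievement Award (ranked 6th).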
Academic Service
 - Member of IEEE, the International Speech Communication Association (ISCA), and the China Computer Federation (CCF; Technical Committee on Speech Dialogue and Auditory Processing).
- Reviewer for the international journals Speech Communication, Digital Signal Processing, and Computer Speech and Language, and for the international conferences IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Interspeech, and IEEE International Conference on Multimedia and Expo (ICME).
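Projects as Principal Investigator
 - Central Government Guiding Funds for Local Science and Technology Development (ZYYD2025JD10): Construction of an Innovation Platform for Edge Data Governance Based on Intelligent Signal Detection and Processing. Funding: CNY 3.24 million. Duration: 2025/01–2026/12.
- Open Project of the Beijing National Research Center for Information Science and Technology (04410307724): Research on Far-Field Speech Recognition Methods for Real-World Scenarios. Funding: CNY 100,000. Duration: 2024/11–2025/10.
- Industry-sponsored project, Shanghai HiSilicon Technologies Co., Ltd.: Commissioned Development of Speech Enhancement. Funding: CNY 1.11 million. Duration: 2022/07–2023/07.
- State Language Commission Research Project (Research Center Project), No. ZDI145-21: Research on Multi-Source Separation of Monaural Mixed Signals. Funding: CNY 150,000. Duration: 2022/01–2023/12.
- National Natural Science Foundation of China (Regional Program), No. 61761041: Research on Technologies Related to Singing Assistance Systems. Funding: CNY 370,000. Duration: 2018/01–2021/12.
- CERNET Next Generation Internet Technology Innovation Project, No. NGII20190309: Research on IPv6-Based Communication Technology for Mutual Sensing of Vehicle Motion Attitude. Funding: CNY 100,000. Duration: 2020/01–2021/01.
- Natural Science Foundation of Xinjiang Uygur Autonomous Region, No. 2016D01C061: Research on Speech Separation under Strong Noise Based on Non-negative Matrix Partial Co-Factorization. Funding: CNY 70,000. Duration: 2017/01–2019/12.
- Xinjiang University Natural Science Foundation, No. BS16023: Research on Singing Voice Separation from Monaural Songs. Funding: CNY 30,000. Duration: 2017/04–2019/04.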
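Projects as Participant
 - National Natural Science Foundation of China, Xinjiang Joint Fund, No. U1903213: Research on Key Technologies for Robust Abnormal Event Detection via Audio-Visual Multimodal Collaboration. Duration: 2020/01–2023/12. Funding: CNY 2.78 million.
- National Natural Science Foundation of China, Young Scientists Fund, No. 61603323: Research on **** Scene Text Detection Methods. Duration: 2017/01–2019/12. Funding: CNY 204,000.
- Natural Science Foundation of Xinjiang Uygur Autonomous Region, Young Scientists Fund, No. 2016D01C079: Research on Visual Object Detection Methods Based on a Hierarchical Cognitive Model. Duration: 2017/01–2019/12. Funding: CNY 25,000.
- Autonomous Region Teaching Reform Research Project for Regular Institutions of Higher Education (general teaching reform project), No. 2017JG029: Exploring Training Models for Innovation and Entrepreneurship Teams. Duration: 2017/09–2019/12. Funding: CNY 20,000.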
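Publications
More than 40 papers published, including more than 10 SCI-indexed journal papers and 20 international conference papers. These include 6 papers in high-level SCI journals (TASLP, SPL) and 10 high-level international conference papers. The conference papers include 4 at ICASSP (CCF B), 3 at ICME (CCF B), and 7 at Interspeech (CCF C).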
2025:
 - Ying Hu*, Qin Yang, Wenbing Wei, Li Lin, Liang He, Zhijian Ou, Wenzhong Yang. MN-Net: Speech Enhancement Network Via Modeling the Noise. IEEE Trans. Audio Speech and Language Processing. TASLP, vol. 33, pp. 1208–1219, Mar. 2025.
- Ying Hu, Jiabo Jing, Fan Li, Lijun He, Li Lin, Wenzhong Yang. A Singing Melody Extraction Network Via Self-Distillation and Multi-Level Supervision. The 50th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Hyderabad, India, April 4-11, 2025.
2024:
 - Wei W, Hu Y*, Huang H, He L. IIFC-Net: A Monaural Speech Enhancement Network with High-Order Information Interaction and Feature Calibration. IEEE Signal Processing Letters, SPL, vol. 31, 2024.
- Ying Hu*, Haotao Xu, Liang He, and Hao Huang. SMMA-Net: An Audio Clue-Based Target Speaker Extraction Network with Spectrogram Matching and Mutual Attention. In ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 1496-1500). IEEE. 
- Ma, Mengzhen, Ying Hu*, Liang He, and Hao Huang. "GLFER-Net: a polyphonic sound source localization and detection network based on global-local feature extraction and recalibration." EURASIP Journal on Audio, Speech, and Music Processing 2024, no. 1: 34. 
- Hu Y, Yang H, Huang H, et al. Cross-modal Features Interaction-and-Aggregation Network with Self-consistency Training for Speech Emotion Recognition[C]//Proc. Interspeech 2024. 2024: 2335-2339. 
- Fujie Xu, Huamin Yang, and Ying Hu, "DA-KWFormer: A Domain Adaptation Network with K-Weight Transformer for Speech Emotion Recognition," in National Conference on Man-Machine Speech Communication 2024.
- Xin Fan, Wenjie Fang, and Ying Hu, "ASD-Diff: Unsupervised Anomalous Sound Detection With Masked Diffusion Model," in National Conference on Man-Machine Speech Communication 2024.
- Liusong Wang, Yuan Gao, Kaimin Cao, and Ying Hu, "SESNet: A Speech Enhancement and Separation Network in Noisy Reverberant Environments," in National Conference on Man-Machine Speech Communication 2024.
- Huamin Yang and Ying Hu, "D-AGNet: A Dual-branch Network with Attention Guidance for Speech Emotion Recognition," in National Conference on Man-Machine Speech Communication 2024.
2023:
 - Ying Hu*, Shijing H, Huaming Y, Liang He, and Hao Huang. "A Joint Network Based on Interactive Attention For Speech Emotion Recognition." IEEE International Conference on Multimedia and Expo, ICME (2023).
- Yuan Gao, Ying Hu*, Liusong Wang, Hao Huang, and Liang He. "MTANet: Multi-band Time-frequency Attention Network for Singing Melody Extraction from Polyphonic Music." Proc. Interspeech 2023.
- Wang, M., Li, Y. and Hu, Y., 2023, September. Improved Self-Consistency Training with Selective Feature Fusion for Sound Event Detection. In 2023 6th International Conference on Information Communication and Signal Processing (ICICSP) (pp. 460-464). IEEE. 
- Hou, S., Yang, H. and Hu, Y., 2023, September. A Lightweight Speech Emotion Recognition Model with Bias-Focal Loss. In 2023 6th International Conference on Information Communication and Signal Processing (ICICSP) (pp. 644-648). IEEE. 
- Zhu, Mengying, Liusong Wang, and Ying Hu. "A Lightweight Music Source Separation Model with Graph Convolution Network." In National Conference on Man-Machine Speech Communication, pp. 23-36. Singapore: Springer Nature Singapore, 2023. 
- Fang, Wenjie, Xin Fan, and Ying Hu. "Multi-branch Network with Cross-Domain Feature Fusion for Anomalous Sound Detection." In National Conference on Man-Machine Speech Communication, pp. 215-226. Singapore: Springer Nature Singapore, 2023. 
2022:
 - Tang, Yuwu, Ying Hu*, Liang He, and Hao Huang. "A bimodal network based on Audio–Text-Interactional-Attention with ArcFace loss for speech emotion recognition." Speech Communication, 143 (2022): 21-32. 
- Hu, Ying*, Yadong Chen, Wenzhong Yang, Liang He, and Hao Huang. "Hierarchic Temporal Convolutional Network with Cross-Domain Encoder for Music Source Separation." IEEE Signal Processing Letters, SPL (2022). 
- Chen, Yadong, Ying Hu*, Liang He, and Hao Huang. "Multi-stage music separation network with dual-branch attention and hybrid convolution." Journal of Intelligent Information Systems, JIIS (2022): 1-22. 
- Ying Hu*, Xinghao Sun, Liang He, and Hao Huang. "A Generalized Network with Multi-scale Densely Connection and Residual Attention for Sound Source Localization and Detection." Journal of the Acoustical Society of America, JASA, 151(3), Mar. 2022.
- Qiu, Wenbo, and Ying Hu*. "Dual-Path Hybrid Attention Network for Monaural Speech Separation." IEEE Access 10 (2022): 78754-78763. 
- Xinghao Sun, Mengzhen Ma, Ying Hu*, Sound source localization and detection based on parameter transfer learning, Proceedings of the 24th International Congress on Acoustics (ICA), October 24–28, 2022, Gyeongju, Korea.
- Zihao Chen, Wenbo Qiu, Haitao Xu, Ying Hu*, Hierarchic Temporal Convolutional Network Attention Fusion for Target Speaker Extraction, Asia-Pacific Signal and Information Processing Association (APSIPA 2022), Chiang Mai, Thailand, 2022/11/07–2022/11/10.
- Yunlong Li, Xiujuan Zhu, Mingyu Wang, and Ying Hu*, Self-Consistency Training With Hierarchical Temporal Aggregation for Sound Event Detection, Asia-Pacific Signal and Information Processing Association (APSIPA 2022), Chiang Mai, Thailand, 2022/11/07–2022/11/10.
- Liusong Wang, Wenbing Wei, Yadong Chen, and Ying Hu*, D²Net: A Denoising and Dereverberation Network Based on Two-branch Encoder and Dual-path Transformer, Asia-Pacific Signal and Information Processing Association (APSIPA 2022), Chiang Mai, Thailand, 2022/11/07–2022/11/10.
- Hu, Ying*, Yuwu Tang, Hao Huang, and Liang He. "A Graph Isomorphism Network with Weighted Multiple Aggregators for Speech Emotion Recognition." Proc. Interspeech 2022 
- Hu, Ying*, Xiujuan Zhu, Yunlong Li, Hao Huang, and Liang He. "A Multi-grained based Attention Network for Semi-supervised Sound Event Detection." Proc. Interspeech 2022 
- Wang, K., Peng, Y., Huang, H., Hu, Y. and Li, S., 2022, May. Mining Hard Samples Locally And Globally For Improved Speech Separation. In ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 6037-6041). IEEE. 
2021:
 - Ma, Wenfang, Ying Hu*, and Hao Huang. "Dual Attention Network for Pitch Estimation of Monophonic Music." Symmetry 13.7 (2021): 1296. 
- Kang, X., Huang, H.*, Hu, Y., & Huang, Z. (2021). Connectionist temporal classification loss for vector quantized variational auto-encoder in zero-shot voice conversion. Digital Signal Processing, 116(6), 103110.
- Sun, Xinghao, Ying Hu*, Xiujuan Zhu, and Liang He. "Sound Event Localization and Detection Based on Adaptive Hybrid Convolution and Multi-scale Feature Extractor." In DCASE 2021-6th Workshop on Detection and Classification of Acoustic Scenes and Events, 2021.
- Zhu, Xiujuan, Ying Hu*, Xinghao Sun, and Liang He. "Multi-Scale Network Based on Split Attention For semi-supervised Sound Event Detection." In DCASE 2021-6th Workshop on Detection and Classification of Acoustic Scenes and Events, 2021.
- Huang, H.*, Wang, K., Hu, Y., & Li, S. (2021). Encoder-Decoder based pitch tracking and joint model training for Mandarin tone classification. In Proc. IEEE-ICASSP, 2021. IEEE.
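2020:
 - [1] Xinglei Dong, Ying Hu*, Hao Huang, Wushour Silamu. Monaural speech separation under strong noise based on convolutive non-negative matrix partial co-factorization. Acta Automatica Sinica, 2020, vol. 46, no. 6. Impact factor (CJCR): 2.793
- [2] Geng H, Hu Y*, Huang H. Monaural Singing Voice and Accompaniment Separation Based on Gated Nested U-Net Architecture[J]. Symmetry, 2020, 12(6):1051. Impact factor: 2.645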
- [3] Zhong, Y., Hu, Y.*, Huang, H. and Silamu, W., A Lightweight Model Based on Separable Convolution for Speech Emotion Recognition. Proc. Interspeech 2020, pp.3331-3335. 
2019 and earlier:
 - Huang, Hao*, Xu, Haihua, Hu, Ying, & Zhou, Gang (2017). A transfer learning approach to goodness of pronunciation based automatic mispronunciation detection. Journal of the Acoustical Society of America, JASA, 142(5), 3165. Impact factor: 1.902
- Hu, Ying and Guizhong Liu*. Separation of Singing Voice Using Nonnegative Matrix Partial Co-Factorization for Singer Identification. IEEE Trans. Audio Speech and Language Processing. TASLP, vol. 23, no. 4, April 2015, pp. 643-653. Impact factor: 3.918
- Hu, Ying, and Guizhong Liu*. Singer identification based on computational auditory scene analysis and missing feature methods. Journal of Intelligent Information Systems, JIIS 2014, vol. 42, no. 3, pp. 333-352. Impact factor: 1.667
- Hu, Ying, and Guizhong Liu*. Instrument identification and pitch estimation in multi-timbre polyphonic musical signals based on probabilistic mixture model decomposition. Journal of Intelligent Information Systems, JIIS 2013, vol. 40, no. 1, pp. 141-158. Impact factor: 1.667
- Hu, Ying*, Wang, Liejun, Huang, Hao, & Zhou, Gang. Monaural Singing Voice Separation by Non-negative Matrix Partial Co-Factorization with Temporal Continuity and Sparsity Criteria. International Conference on Intelligent Computing, ICIC (2016). Springer International Publishing.
- Zhou, Gang*, Liu, Yajun, Shi, Fei, & Hu, Ying. (2016). Scene Text Detection Based on Text Probability and Pruning Algorithm. Intelligent Computing Methodologies. Springer International Publishing.
- Hu, Ying, and Guizhong Liu*. Automatic singer identification using missing feature methods. In Multimedia and Expo (ICME), 2013 IEEE International Conference on, 2013, pp.1-6. 
- Hu, Ying, and Guizhong Liu*. Dynamic characteristics of musical note for musical instrument classification. In Signal Processing, Communications and Computing (ICSPCC), IEEE International Conference on, 2011, pp. 1-6. IEEE. 
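Granted invention patents:
 - Xinglei Dong, Ying Hu, Hao Huang. Multi-speaker speech separation method based on a convolutional neural network and deep clustering [P]. Xinjiang Uygur Autonomous Region: CN110459240B, 2021-01-12.
Invention patent applications accepted for examination:
 - Yadong Chen, Wenbo Qiu, Ying Hu, et al. Sound source separation method based on a dual-attention mechanism and a multi-stage hybrid convolutional network. 2021-06-15.
- Wenbo Qiu, Yadong Chen, Ying Hu, et al. Sound source separation method based on shallow feature reactivation and multi-stage hybrid attention. 2021-11-09.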
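Challenges
 - Participated in the Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE 2021), Task 3 (sound event localization and detection) and Task 4 (sound event detection in real-world scenarios; ranked 19th of 50). Participated in DCASE 2022 Task 3 (ranked 11th) and Task 4 (ranked 13th).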