Friday, September 20, 2024

Apple’s Interspeech 2024 Breakthrough: Revolutionizing Speech Recognition with Groundbreaking Machine Learning Innovations

Introduction

Every year, the international speech processing community comes together to share its latest advances. The 25th annual Interspeech conference is no exception: this premier event brings together experts from around the world to discuss the science and technology of spoken language processing. This year, the conference is taking place from September 1 to 5 in Kos, Greece, and Apple is taking part as a sponsor of several workshops and events.

Schedule

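As part of its sponsorship, Apple has an expo booth at the Kipriotis Hotels & Conference Center, Floor 1, Booth #4. Attendees can visit the booth at the following times:

* Monday, September 2: 10:30 – 19:00 GMT+3
* Tuesday, September 3: 09:30 – 18:00 GMT+3
* Wednesday, September 4: 09:30 – 18:00 GMT+3
* Thursday, September 5: 10:30 – 16:00 GMT+3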

Apple-sponsored workshops and events

Apple Welcome Reception

Wednesday, September 4

* 11:00 – 11:20 GMT+3: ORAL, ESPnet-SPK: Full Pipeline Speaker Verification Toolkit with Multiple Reproducible Recipes, Self-Supervised Front-Ends, and Off-the-Shelf Models – Jee-weon Jung, Wangyou Zhang, Jiatong Shi, Zak Aldeneh, Takuya Higuchi, Barry Theobald, Ahmed Hussen Abdelaziz, Shinji Watanabe
* 13:30 – 15:30 GMT+3: POSTER, Multimodal Large Language Models with Fusion Low Rank Adaptation for Device Directed Speech Detection – Shruti Palaskar, Oggi Rudovic, Sameer Dharur, Florian Pesce, Gautam Krishna, Aswin Sivaraman, Jack Berkowitz, Ahmed Hussen Abdelaziz, Saurabh Adya, Ahmed Tewfik

Thursday, September 5

Accepted Papers

The following papers will be presented during the Interspeech conference:

1. Can You Remove the Downstream Model for Speaker Recognition with Self-Supervised Speech Features? – Zak Aldeneh, Takuya Higuchi, Jee-weon Jung, Skyler Seto, Tatiana Likhomanenko, Stephen Shum, Ahmed Hussen Abdelaziz, Shinji Watanabe
2. Comparative Analysis of Personalized Voice Activity Detection Systems: Assessing Real-World Effectiveness – Satyam Kumar, Sai Srujana Buddi, Oggy Sarawgi, Vineet Garg, Shivesh Ranjan, Oggi Rudovic, Ahmed Hussen Abdelaziz, Saurabh Adya
3. Enhancing CTC-based Speech Recognition with Diverse Modeling Units – Michael Han, Zhihong Lei, Mingbin Xu, Xingyu Na, Zhen Huang
4. ESPnet-SPK: Full Pipeline Speaker Verification Toolkit with Multiple Reproducible Recipes, Self-Supervised Front-Ends, and Off-the-Shelf Models – Jee-weon Jung, Wangyou Zhang, Jiatong Shi, Zak Aldeneh, Takuya Higuchi, Barry Theobald, Ahmed Hussen Abdelaziz, Shinji Watanabe
5. Multimodal Large Language Models with Fusion Low Rank Adaptation for Device Directed Speech Detection – Shruti Palaskar, Oggi Rudovic, Sameer Dharur, Florian Pesce, Gautam Krishna, Aswin Sivaraman, Jack Berkowitz, Ahmed Hussen Abdelaziz, Saurabh Adya, Ahmed Tewfik
6. Novel-view Acoustic Synthesis from 3D Reconstructed Rooms – Byeongjoo Ahn, Karren Yang, Brian Hamilton, Jonathan Sheaffer, Anurag Ranjan, Oncel Tuzel, Miguel Sarabia del Castillo, Rick Chang
7. Positional Description for Numerical Normalization – Deepanshu Gupta, Javier Latorre Martinez
8. RepCNN: Micro-sized, Mighty Models for Wakeword Detection – Arnav Kundu, Prateeth Nayak, Priyanka Padmanabhan, Devang Naik
9. Transformer-based Model for ASR N-Best Rescoring and Rewriting – Edwin Kang, Christophe Van Gysel, Man-Hung Siu

Acknowledgements

Arnav Kundu, Ilya Oparin, Javier Latorre Martinez, Lyan Verwimp, Markus Nussbaum-Thom, Mirko Hannemann, Thiago Fraga da Silva, Sameer Badaskar, Tuomo Raitio, and Tatiana Likhomanenko served as reviewers for the Interspeech conference.

Frequently Asked Questions

Q1. Who is sponsoring the workshops and events at the Interspeech conference?

Apple is sponsoring various workshops and events at the Interspeech conference, including the expo booth at the Kipriotis Hotels & Conference Center.

Q2. Where can attendees visit the Apple-sponsored booth?

The Apple-sponsored booth can be found at the Kipriotis Hotels & Conference Center, located on Floor 1, Booth #4.

Q3. What are the timings of the Apple-sponsored events?

The Apple booth is open on Monday, September 2, from 10:30 to 19:00 GMT+3; on Tuesday, September 3, from 09:30 to 18:00 GMT+3; on Wednesday, September 4, from 09:30 to 18:00 GMT+3; and on Thursday, September 5, from 10:30 to 16:00 GMT+3.

Q4. What papers will be presented during the Interspeech conference?

Nine papers will be presented during the Interspeech conference. They include "Can You Remove the Downstream Model for Speaker Recognition with Self-Supervised Speech Features?"; "Comparative Analysis of Personalized Voice Activity Detection Systems: Assessing Real-World Effectiveness"; "Enhancing CTC-based Speech Recognition with Diverse Modeling Units"; and "ESPnet-SPK: Full Pipeline Speaker Verification Toolkit with Multiple Reproducible Recipes, Self-Supervised Front-Ends, and Off-the-Shelf Models". The remaining five are listed under Accepted Papers.

Q5. Who served as reviewers for the Interspeech conference?

Arnav Kundu, Ilya Oparin, Javier Latorre Martinez, Lyan Verwimp, Markus Nussbaum-Thom, Mirko Hannemann, Thiago Fraga da Silva, Sameer Badaskar, Tuomo Raitio, and Tatiana Likhomanenko served as reviewers for the Interspeech conference.

Conclusion

The Apple-sponsored events at the Interspeech conference provide an ideal platform for attendees to connect, learn, and network with experts in the field of spoken language processing. With a diverse range of papers and workshops, attendees can expect an exciting and engaging experience.
