Tech Trends Watcher
Pre-Trained Foundation Model Representations to Uncover Breathing Patterns in Speech

by Tech Trends Watcher
2 August 2024
in Machine Learning

The process of human speech production involves coordinated respiratory action to elicit acoustic speech signals. Typically, speech is produced when air is forced from the lungs and modulated by the vocal tract, and this is interspersed with moments of inhalation that refill the lungs. Respiratory rate (RR), the number of breaths one takes in a minute, is a vital metric used to assess the overall health, fitness, and general well-being of an individual. Existing approaches to measuring RR require specialized equipment or training. Studies have demonstrated that machine learning algorithms can estimate RR from bio-sensor signals. Speech-based estimation of RR can offer an effective way to measure this vital metric without any specialized equipment or sensors. This work investigates a machine-learning-based approach to estimating RR from speech segments obtained from subjects speaking into a close-talking microphone device. Data were collected from N=26 individuals; the ground-truth RR was obtained through commercial-grade chest belts and then manually corrected for any errors. A convolutional long short-term memory network (Conv-LSTM) is proposed to estimate respiration time-series data from the speech signal. We demonstrate that pre-trained representations obtained from a foundation model, such as WAV2VEC2, can be used to estimate the respiration time series with low root-mean-squared error and high correlation coefficient compared with the baseline. The model-driven time series can then be used to estimate RR with a low mean absolute error (MAE) of ≈ 1.6 breaths/min.
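The evaluation quantities named in the abstract (root-mean-squared error and correlation on the respiration time series, and mean absolute error on RR) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the 25 Hz sampling rate, the synthetic sine-wave "chest-belt" signal, and the use of a spectral peak within an assumed 5 to 45 breaths/min band to derive RR are all assumptions made for illustration; the abstract does not say how RR is computed from the estimated time series.

```python
import numpy as np


def rmse(pred, target):
    """Root-mean-squared error between two equal-length time series."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.sqrt(np.mean((pred - target) ** 2)))


def pearson_r(pred, target):
    """Pearson correlation coefficient between two time series."""
    return float(np.corrcoef(pred, target)[0, 1])


def mae(pred, target):
    """Mean absolute error, e.g. between per-segment RR estimates."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.mean(np.abs(pred - target)))


def respiratory_rate_bpm(signal, fs, min_bpm=5.0, max_bpm=45.0):
    """Estimate breaths/min as the dominant frequency of a respiration signal.

    Assumption: the spectral-peak method and the band limits are illustrative
    choices, not taken from the source.
    """
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()  # remove the DC offset so it cannot dominate the spectrum
    magnitude = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)  # bin frequencies in Hz
    band = (freqs >= min_bpm / 60.0) & (freqs <= max_bpm / 60.0)
    if not band.any():
        raise ValueError("signal is too short to resolve the requested band")
    peak_hz = freqs[band][np.argmax(magnitude[band])]
    return float(peak_hz * 60.0)


if __name__ == "__main__":
    fs = 25.0                        # assumed sampling rate in Hz
    t = np.arange(int(60 * fs)) / fs  # one minute of samples
    # Synthetic stand-in for a reference respiration trace at 0.25 Hz.
    reference = np.sin(2 * np.pi * 0.25 * t)
    # Synthetic stand-in for a model output: the reference plus noise.
    rng = np.random.default_rng(0)
    predicted = reference + 0.1 * rng.normal(size=t.size)

    print("RMSE:", rmse(predicted, reference))
    print("Correlation:", pearson_r(predicted, reference))
    rr_ref = respiratory_rate_bpm(reference, fs)
    rr_pred = respiratory_rate_bpm(predicted, fs)
    print("RR reference / predicted:", rr_ref, rr_pred)
    print("RR absolute error:", mae([rr_pred], [rr_ref]))
```

In practice the reference signal would come from the chest-belt recordings and the prediction from the speech model, and the MAE would be averaged over many speech segments rather than the single segment shown here.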

© 2024 Tech Trends Watcher