Diego Toribio

  • Multimodal Alzheimer's Detection

    Used spontaneous speech from clinician-patient conversations as a diagnostic signal for automated Alzheimer's detection.

    • Designed a scalable ETL pipeline for recordings that handles audio processing, transcription, and feature extraction across multiple models to produce audio and text embeddings.
    • Developed a training pipeline to evaluate multiple classifiers on audio, text, and combined embeddings.
    • Found that text embeddings performed best (82% / 0.83 F1), while combining audio and text improved results over the audio-only baseline (70% / 0.74 F1).
    Machine Learning Audio Text
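As a rough illustration of the evaluation step described above, the sketch below shows one way audio and text embeddings could be combined and scored. It assumes early fusion by simple concatenation and assumes the reported percentage is accuracy; neither is confirmed by the project description. The names `fuse` and `accuracy_and_f1` are hypothetical helpers, not the project's actual code.

```python
from typing import List, Sequence, Tuple


def fuse(audio: Sequence[float], text: Sequence[float]) -> List[float]:
    """Early fusion (assumed): concatenate one recording's audio and text embeddings."""
    return list(audio) + list(text)


def accuracy_and_f1(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Tuple[float, float]:
    """Return (accuracy, F1 for the positive class) for binary predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, f1
```

The same scoring function can be applied to predictions from a classifier trained on audio-only, text-only, or fused vectors, which keeps the three settings directly comparable.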
  • Agnus The Troll

    An interactive audio-video agent designed to mock and provoke attendees.

    • Built a real-time audio-video loop connecting a dynamic microphone and webcam for live speech and visual input.
    • Integrated the Gemini 2.5 Live API with WebSockets to stream and play spoken responses in continuous conversation.
    • Configured session memory and a simple Gradio interface to let users start, stop and resume interaction seamlessly.
    Generative AI Audio Video
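The start, stop, and resume behaviour with session memory mentioned above could be modelled along the following lines. This is a minimal sketch under stated assumptions: the `Session` class and its methods are hypothetical, and it deliberately leaves out the Gemini Live API, WebSocket streaming, and Gradio wiring, none of which are shown in the project description.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Session:
    """Hypothetical conversation session that keeps its history across pauses."""

    history: List[Tuple[str, str]] = field(default_factory=list)
    running: bool = False

    def start(self) -> None:
        # A fresh start clears any earlier conversation.
        self.history.clear()
        self.running = True

    def stop(self) -> None:
        # Stopping pauses the loop but keeps the history for a later resume.
        self.running = False

    def resume(self) -> None:
        # Resuming continues with the stored history intact.
        self.running = True

    def record(self, speaker: str, text: str) -> None:
        # Turns are only accepted while the session is running.
        if not self.running:
            raise RuntimeError("session is not running")
        self.history.append((speaker, text))
```

Keeping the history on `stop` and clearing it only on `start` is one simple way to let a conversation pick up where it left off.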

GitHub Email

© 2025 Diego Toribio