Simplifying the training and deployment of large foundation models

As fine-tuning of foundation models gains popularity, the ability to quickly scale OpenShift cluster resources up or down becomes increasingly critical. Red Hat and IBM Research have worked together to open source a generative AI infrastructure stack that is available in Open Data Hub and is being productized as part of OpenShift AI. In this demo, Mustafa Eyceoz shows how you can use Ray, CodeFlare, and Multi-Cluster App Dispatcher (MCAD) technology to prioritize real-time access to cluster resources or schedule workloads for batch processing.
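To give a sense of the workflow shown in the demo, the sketch below uses the CodeFlare SDK to request a Ray cluster on OpenShift; the resulting AppWrapper is queued by MCAD and dispatched once the requested resources are free. This is a minimal, illustrative example only: the cluster name, namespace, and resource sizes are hypothetical, and parameter names may differ between CodeFlare SDK versions.

```python
# Minimal sketch: requesting a Ray cluster via the CodeFlare SDK.
# MCAD queues the AppWrapper until the requested resources are available.
# Names, namespace, and sizes below are illustrative assumptions.
from codeflare_sdk.cluster.cluster import Cluster, ClusterConfiguration

cluster = Cluster(ClusterConfiguration(
    name="finetune-demo",       # hypothetical cluster name
    namespace="demo-project",   # hypothetical OpenShift project
    num_workers=2,
    min_cpus=4, max_cpus=4,
    min_memory=16, max_memory=16,
    num_gpus=1,
))

cluster.up()          # submit the AppWrapper; MCAD dispatches it when resources allow
cluster.wait_ready()  # block until the Ray cluster is running
print(cluster.details())

# ... run distributed fine-tuning or batch jobs against the Ray cluster here ...

cluster.down()        # tear down the Ray cluster and release the resources
```

Because MCAD holds the AppWrapper in a queue rather than creating half-scheduled pods, batch workloads can wait their turn while higher-priority, real-time workloads get resources first.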

Speakers

Mustafa Eyceoz | Software Engineer, Red Hat

Mustafa Eyceoz is a Machine Learning Engineer at Red Hat. He has worked on distributed model training for both OpenShift AI and RHEL AI, as well as on inference performance and RAG solutions. He is currently working on the InstructLab project and conducting research with IBM Research on Text2SQL, question decomposition, and LLM reasoning.