2-Day Instructor-Led Programme
Learn how to build multimodal RAG systems that retrieve and generate insights from both text and images. Explore multimodal retrieval pipelines and AI applications using Vertex AI infrastructure.
Duration
2 Days
Price
$1,799
Multimodal AI systems are increasingly used to analyse and retrieve knowledge from multiple data types, including text, images, and structured content. Multimodal Retrieval-Augmented Generation (RAG) extends traditional text-only RAG pipelines so they can draw on these richer knowledge sources.
This programme teaches engineers how to design and implement multimodal RAG systems capable of retrieving both textual and visual information. Participants learn how to ingest multimodal datasets, generate embeddings for images and text, and perform retrieval operations that support multimodal reasoning.
Using modern AI development tools and Vertex AI infrastructure, learners build applications that combine document search with visual understanding to support advanced AI assistants and knowledge systems.
Participants design a multimodal retrieval pipeline capable of processing both image and text knowledge sources.
Industry mentors demonstrate real-world multimodal AI architectures used in enterprise AI platforms and advanced generative AI systems.
Develop the ability to build next-generation AI knowledge systems that understand and retrieve information across multiple data modalities.
Understand multimodal AI architecture concepts
Build retrieval pipelines that support both text and image data
Generate embeddings for multimodal datasets
Implement multimodal vector search and retrieval systems
Design multimodal RAG pipelines for enterprise AI applications
Deploy reliable multimodal AI knowledge systems
Basic programming knowledge (Python recommended)
Familiarity with AI or machine learning concepts
Understanding of RAG pipelines helpful but not mandatory
Choose the learning format that works best for you and your team
Instructor-Led Training
Join live instructor-led sessions from anywhere. Interactive, engaging, and flexible.
Price per person
Group enrolments and early planning options available.
All prices are exclusive of VAT where applicable. Group enrolments and custom packages available on request.
Not everyone learns best in a group. If you want focused guidance, faster clarity, and confidence you can use on the job, our 1-to-1 Fast-Track Training gives you private, mentor-led support tailored to your experience and goals.
"Many learners choose 1-to-1 when they want understanding, not memorisation."
Everything you need to know about the certification exams
You will receive an Xcademia certificate of completion once you have participated in the sessions and successfully completed the labs and scenario simulations.
Everything you need to know about this course
Multimodal RAG retrieves relevant information from multiple data types, such as text and images, and passes it to a generative model as context before a response is generated.
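As a rough illustration of that idea, the sketch below ranks a small mixed index of text and image items against a query and assembles the retrieved items into a prompt. It is a toy example under stated assumptions, not the programme's lab code: the `Item` class, the file names, and the three-dimensional vectors are all hypothetical, and the embeddings are written by hand rather than produced by a real embedding model or Vertex AI service.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Item:
    item_id: str
    modality: str           # "text" or "image"
    embedding: List[float]  # hand-written toy vector, NOT real model output


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_embedding: List[float], index: List[Item], k: int = 2) -> List[Item]:
    """Return the k items most similar to the query, regardless of modality."""
    ranked = sorted(
        index,
        key=lambda item: cosine(query_embedding, item.embedding),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, hits: List[Item]) -> str:
    """Assemble retrieved items into context for a generative model."""
    context = "\n".join(f"- [{h.modality}] {h.item_id}" for h in hits)
    return f"Context:\n{context}\n\nQuestion: {question}"


# Hypothetical index: text and image items share one embedding space.
INDEX = [
    Item("manual_page_3.txt", "text", [0.9, 0.1, 0.0]),
    Item("wiring_diagram.png", "image", [0.8, 0.2, 0.1]),
    Item("holiday_policy.txt", "text", [0.0, 0.1, 0.9]),
]

query_vec = [1.0, 0.0, 0.0]  # stand-in for an embedded user question
hits = retrieve(query_vec, INDEX, k=2)
prompt = build_prompt("How is the device wired?", hits)
print(prompt)
```

In a real system the vectors would come from a multimodal embedding model and the index would live in a vector database; the retrieval step and the prompt assembly would keep the same overall shape.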