13th edition – April 16 to 18, 2025
3 days of conferences, 70 exhibitors, 4,500 visitors per day
Cedric Clyburn
Red Hat

Cedric Clyburn (@cedricclyburn), Senior Developer Advocate at Red Hat, is an enthusiastic software technologist with a background in Kubernetes, DevOps, and container tools. He has experience speaking at and organizing conferences, including DevNexus, WeAreDevelopers, The Linux Foundation, KCD NYC, and more. Cedric loves all things open source and works to make developers' lives easier. He is based in New York.

Structuring the Unstructured: Advanced Document Parsing for AI Workflows
Tools-in-Action (BEGINNER level)
Neuilly 251

Modern organizations generate vast amounts of data stored in diverse and often unstructured formats, such as PDFs, scanned documents, and proprietary file types. For engineers working with AI, the challenge isn’t simply extracting text; it’s preserving the structure, context, and relationships within the data. Whether fine-tuning models or building retrieval-augmented generation (RAG) pipelines, effective document processing is essential to producing actionable insights.

This session dives into the techniques and open source tools needed to transform unstructured documents into structured formats like JSON or Markdown, ready for AI workflows. You’ll learn how to handle challenges like multi-page tables, image-heavy layouts, and scanned documents using context-aware methods. Join this session as we explore how to efficiently bridge the gap between unstructured data and AI-powered applications, and help you achieve better results in your AI projects.
