Closed
Description
Discussed in #11012
Originally posted by glorious-beard March 17, 2025
Now that OpenAI can handle file inputs (for PDFs) in addition to text and images, are there plans to add the ability to parse additional content tags in ChatPromptParser to handle additional content types, like BinaryContent, AudioContent, etc.? (Claude can handle PDFs too - see here)
Additional tags could include:
- '<audio> (base64 audio stream) </audio>' - Parsed into an AudioContent instance
- '<binary mimeType="(mime type)"> (base64 content) </binary>' - Parsed into a BinaryContent instance, with mimeType defaulting to "application/octet-stream" if not present
- '<pdf> (base64 content) </pdf>' - Parsed into a new PdfContent class derived from BinaryContent
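To make the proposed tag-to-content mapping concrete, here is a minimal Python sketch using only the standard library. It is not Semantic Kernel's ChatPromptParser or its real content classes: the dataclasses below merely borrow the names from the proposal, `parse_content_tags` is a hypothetical helper, and the "audio/wav" default for audio is my own assumption, since the proposal does not name one.

```python
import base64
import xml.etree.ElementTree as ET
from dataclasses import dataclass


# Hypothetical stand-ins for the content types named in the proposal.
@dataclass
class BinaryContent:
    data: bytes
    mime_type: str = "application/octet-stream"  # default stated in the proposal


@dataclass
class AudioContent(BinaryContent):
    mime_type: str = "audio/wav"  # assumption: the proposal gives no audio default


@dataclass
class PdfContent(BinaryContent):
    mime_type: str = "application/pdf"


def parse_content_tags(prompt: str) -> list[BinaryContent]:
    """Turn <audio>, <binary> and <pdf> tags into content objects."""
    # Wrap in a synthetic root so sibling tags form one well-formed XML document.
    root = ET.fromstring(f"<root>{prompt}</root>")
    items: list[BinaryContent] = []
    for node in root:
        # Tag bodies are base64 text; surrounding whitespace is ignored.
        data = base64.b64decode((node.text or "").strip())
        if node.tag == "audio":
            items.append(AudioContent(data))
        elif node.tag == "pdf":
            items.append(PdfContent(data))
        elif node.tag == "binary":
            # mimeType falls back to the generic default when absent.
            mime = node.get("mimeType", "application/octet-stream")
            items.append(BinaryContent(data, mime))
    return items


payload = base64.b64encode(b"%PDF-1.7").decode()
parts = parse_content_tags(f"<pdf> {payload} </pdf><binary> {payload} </binary>")
```

Here `parts` holds a PdfContent followed by a BinaryContent carrying the fallback MIME type. A real implementation would also need to decide how to treat unknown tags and malformed base64, which this sketch silently skips or lets raise.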
My application makes heavy use of the YAML prompt templates, so this would save me from having to manually build chat histories for any operation involving inputs beyond text and images.
I volunteer to add the above if it's not already planned for a near-term release.
(Maybe this is an extension of this discussion?)