Handling REST API Tool Responses

Learn how to use special extension tags in your OpenAPI schemas to handle media content and binary data. Discover techniques for converting URLs and base64 content into attachments while optimizing LLM context usage.

Handling Media Content and Attachments in REST API Tools

When creating REST API tools on allmates.ai, you often need to handle responses that contain media content or large binary data. The platform provides special extension tags that allow you to control how these responses are processed, ensuring optimal display of content and efficient use of the LLM's context window.

Understanding Response Processing Extensions

Response processing extensions are special tags added to properties in your OpenAPI schema that instruct the system how to handle specific types of response data. These extensions are particularly valuable when working with:

Images returned as URLs
Base64-encoded images or documents
Audio or video files
Large datasets that shouldn't consume context space

By properly implementing these extensions, you can create tools that seamlessly integrate rich media content into conversations without cluttering the LLM's context with raw binary data or lengthy URLs.

Available Extension Tags

The allmates.ai platform supports several extension tags designed specifically for response processing:

1. Converting URLs to Attachments

The x-process-as-attachment-from-url extension transforms a URL in the response into an attachment that appears directly in the conversation:

image_url:
  type: string
  description: URL to the generated image
  x-process-as-attachment-from-url: true
  example: "https://example.com/images/generated.jpg"

When this extension is applied, instead of returning the raw URL in the response, the system will:

Fetch the content from the specified URL
Convert it into an appropriate attachment based on the content type
Display it directly in the conversation

This is particularly useful for image generation APIs, document storage services, or any API that returns URLs to media content.
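
The three steps listed above can be pictured with a short Python sketch. This is not the allmates.ai implementation: the `Attachment` class, the injected `fetch` callable, and the `process_url_property` function are hypothetical names, used only to show the general shape of turning a URL into a typed attachment.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Attachment:
    """Hypothetical container for this sketch, not a platform type."""
    mime_type: str
    data: bytes


# fetch(url) returns (Content-Type header value, body bytes). It is injected
# so the sketch does not depend on any particular HTTP client.
Fetch = Callable[[str], Tuple[str, bytes]]


def process_url_property(url: str, fetch: Fetch) -> Attachment:
    # Step 1: fetch the content behind the URL.
    content_type, body = fetch(url)
    # Step 2: derive the attachment type, dropping parameters such as charset.
    mime_type = content_type.split(";", 1)[0].strip().lower()
    # Step 3: hand the typed attachment over for display.
    return Attachment(mime_type=mime_type, data=body)


def fake_fetch(url: str) -> Tuple[str, bytes]:
    # Stand-in for a real HTTP GET that returns a JPEG-like payload.
    return "image/jpeg; charset=binary", b"\xff\xd8\xff\xe0"


attachment = process_url_property(
    "https://example.com/images/generated.jpg", fake_fetch
)
print(attachment.mime_type)
```

Passing the fetcher in as a parameter keeps the sketch runnable offline; a real client would replace `fake_fetch` with an actual HTTP request.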

2. Converting Base64 Content to Attachments

The x-process-as-attachment-from-base-64 extension converts base64-encoded content in the response into an attachment:

image_data:
  type: string
  description: Base64-encoded image data
  x-process-as-attachment-from-base-64: true
  x-process-as-attachment-mime-type: 'image/png'
  example: "iVBORw0KGgoAAAANSUhEUgAAA..."

This extension must be used together with x-process-as-attachment-mime-type, which specifies the content type of the encoded data. Common MIME types include:

image/png - For PNG images
image/jpeg - For JPEG images
application/pdf - For PDF documents
audio/mpeg - For MP3 audio files
video/mp4 - For MP4 video files
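
The MIME type is needed because base64 text carries no type information of its own. As a rough illustration (the `attachment_from_base64` helper and the `attachment` filename are assumptions made for this sketch, not platform behavior), decoding the data and naming the file from the declared type might look like this in Python:

```python
import base64
import mimetypes
from typing import Tuple


def attachment_from_base64(encoded: str, mime_type: str) -> Tuple[str, bytes]:
    """Decode base64 text and pick a filename from the declared MIME type."""
    data = base64.b64decode(encoded, validate=True)
    # Fall back to a generic extension when the MIME type is unknown.
    extension = mimetypes.guess_extension(mime_type) or ".bin"
    return f"attachment{extension}", data


# "iVBORw0KGgo=" is the base64 form of the 8-byte signature that opens
# every PNG file, so it serves as a tiny, valid test payload.
filename, data = attachment_from_base64("iVBORw0KGgo=", "image/png")
print(filename, len(data))
```

If the declared type were wrong, the bytes would still decode, but the attachment would be labeled as the wrong kind of file.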

3. Hiding Properties from the LLM

The x-hide-property-from-llm extension prevents specific properties from being included in the response sent back to the LLM's context:

large_data:
  type: string
  description: Large dataset in JSON format
  x-hide-property-from-llm: true
  example: '{"data": [...]}'

This extension is crucial for:

Preventing large binary data from consuming valuable tokens
Hiding technical metadata that doesn't add value to the conversation
Ensuring base64-encoded content doesn't pollute the LLM's context
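
To make the effect concrete, the following Python sketch filters a response against its schema before it would reach the LLM. It is an illustration under assumptions: the `visible_to_llm` function and the `summary` field are hypothetical, and only top-level properties are handled.

```python
HIDE_TAG = "x-hide-property-from-llm"


def visible_to_llm(response: dict, properties: dict) -> dict:
    """Drop every top-level field whose schema sets the hide tag to true."""
    return {
        name: value
        for name, value in response.items()
        if not properties.get(name, {}).get(HIDE_TAG, False)
    }


# Schema properties as they would appear after parsing the OpenAPI document.
properties = {
    "summary": {"type": "string"},
    "large_data": {"type": "string", HIDE_TAG: True},
}
response = {"summary": "42 rows matched", "large_data": '{"data": [1, 2, 3]}'}

print(visible_to_llm(response, properties))
```

Only `summary` survives the filter, so the LLM's context holds the short description while the bulky payload stays out of it.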

Combining Extensions for Optimal Results

These extensions can be combined to achieve the desired behavior. For example, when handling base64-encoded images, you typically want to both convert them to attachments and hide them from the LLM's context:

image_data:
  type: string
  description: Base64-encoded image data
  x-process-as-attachment-from-base-64: true
  x-process-as-attachment-mime-type: 'image/png'
  x-hide-property-from-llm: true
  example: "iVBORw0KGgoAAAANSUhEUgAAA..."

Practical Example: Image Generation API

Here's a complete example showing how to implement response processing extensions for an image generation API that can return either a URL or a base64-encoded image:

responses:
  '200':
    description: Successfully generated image
    content:
      application/json:
        schema:
          type: object
          properties:
            cost:
              type: number
              description: The cost of the image generation
              example: 0.0025165824
            seed:
              type: integer
              description: The seed used for image generation
              example: 312585864
            url:
              type: string
              description: |
                URL to the generated image (when response_format is "url")
                
                ## For LLMs
                - This URL will be automatically converted to an image attachment
                - No need to describe the URL or suggest opening it
              x-process-as-attachment-from-url: true
              x-hide-property-from-llm: true
              example: "https://api-images.example.com/generated-image.jpeg"
            image:
              type: string
              description: |
                Base64-encoded image data (when response_format is "b64")
                
                ## For LLMs
                - This data will be automatically converted to an image attachment
                - Do not attempt to include this data in your response
              x-process-as-attachment-from-base-64: true
              x-process-as-attachment-mime-type: 'image/png'
              x-hide-property-from-llm: true
              example: "iVBORw0KGgoAAAANSUhEUgAAA..."

In this example:

The url property uses x-process-as-attachment-from-url to convert the URL to an attachment
The image property uses x-process-as-attachment-from-base-64 to convert base64 data to an attachment
Both properties use x-hide-property-from-llm to prevent them from consuming tokens in the LLM's context
The cost and seed properties are still included in the response to the LLM

Best Practices for Response Processing

To ensure optimal performance and user experience, follow these best practices when implementing response processing extensions:

1. Always Hide Binary Data

Use x-hide-property-from-llm: true for any property containing binary data, base64-encoded content, or large text blocks. This prevents unnecessary token consumption and keeps the LLM's context focused on relevant information.

2. Specify Correct MIME Types

When using x-process-as-attachment-from-base-64, always specify the correct MIME type with x-process-as-attachment-mime-type. An incorrect MIME type can result in attachments that don't render properly.
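
Practices 1 and 2 are easy to check mechanically before a schema is uploaded. The Python sketch below is a hypothetical lint helper (`lint_properties` is not part of allmates.ai) that flags base64 properties missing a MIME type or left visible to the LLM; the `thumbnail` property is invented for the demonstration.

```python
BASE64_TAG = "x-process-as-attachment-from-base-64"
MIME_TAG = "x-process-as-attachment-mime-type"
HIDE_TAG = "x-hide-property-from-llm"


def lint_properties(properties: dict) -> list:
    """Return warnings for base64 properties that look incomplete."""
    warnings = []
    for name, schema in properties.items():
        if not schema.get(BASE64_TAG):
            continue  # Only base64 attachment properties are checked.
        if not schema.get(MIME_TAG):
            warnings.append(f"{name}: {BASE64_TAG} set without {MIME_TAG}")
        if not schema.get(HIDE_TAG):
            warnings.append(f"{name}: base64 data is not hidden from the LLM")
    return warnings


properties = {
    "image": {
        "type": "string",
        BASE64_TAG: True,
        MIME_TAG: "image/png",
        HIDE_TAG: True,
    },
    "thumbnail": {"type": "string", BASE64_TAG: True},
}

for warning in lint_properties(properties):
    print(warning)
```

Here `image` passes both checks, while `thumbnail` produces one warning for the missing MIME type and one for the missing hide tag.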

3. Prefer URL Attachments When Possible

When given the choice between URL-based and base64-encoded content, prefer URLs for better performance:

URL attachments typically load faster and consume fewer resources
Base64 encoding increases data size by approximately 33%
URL attachments can be loaded on demand, while base64 data must be processed immediately
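
The 33% figure follows from how base64 works: every 3 bytes of input become 4 output characters. A short Python check, using zero-filled placeholder data rather than a real file, confirms the arithmetic:

```python
import base64
import math


def base64_length(raw_bytes: int) -> int:
    """Length of standard padded base64 output for raw_bytes of input."""
    return 4 * math.ceil(raw_bytes / 3)


raw = bytes(300_000)  # 300 KB of placeholder data
encoded = base64.b64encode(raw)
overhead = len(encoded) / len(raw) - 1

print(len(encoded), f"{overhead:.1%}")
```

The same ratio applies at any size, so a 6 MB image becomes roughly 8 MB of base64 text inside the JSON response.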

4. Include Clear LLM Instructions

Add explicit instructions for LLMs in your property descriptions:

url:
  type: string
  description: |
    URL to the generated image
    
    ## For LLMs
    - This URL will be automatically converted to an image attachment
    - No need to describe the URL or suggest opening it
  x-process-as-attachment-from-url: true
  x-hide-property-from-llm: true

5. Test Thoroughly

Always test your response processing extensions with various inputs to ensure they work as expected:

Verify that attachments display correctly
Check that hidden properties don't appear in the LLM's responses
Test with different content types and sizes

Common Use Cases

Response processing extensions are particularly valuable in these scenarios:

Image Generation Tools

APIs that generate images can return either URLs or base64-encoded data. Using the appropriate extensions ensures that users see the generated images directly in the conversation.

Document Processing

When working with document processing APIs, you can convert PDFs, spreadsheets, or other documents into attachments that users can download directly from the conversation.

Data Visualization

APIs that generate charts or graphs can return them as images, which can be displayed directly in the conversation using these extensions.

Media Content

Audio or video content returned by APIs can be made available as attachments, enhancing the multimedia capabilities of your Mates.

Troubleshooting

If you encounter issues with response processing extensions, check for these common problems:

Attachments Not Displaying

Verify that the URL is accessible and returns the expected content
Ensure the correct MIME type is specified for base64-encoded content
Check that the base64 encoding is valid and complete
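
A quick way to rule out the last cause is to decode the string strictly on your own machine. The `is_valid_base64` helper below is a hypothetical name for this sketch; it relies only on Python's standard `base64` module.

```python
import base64
import binascii


def is_valid_base64(encoded: str) -> bool:
    """True only if the string is complete, correctly padded standard base64."""
    try:
        # validate=True rejects characters outside the base64 alphabet
        # instead of silently skipping them.
        base64.b64decode(encoded, validate=True)
        return True
    except (binascii.Error, ValueError):
        return False


print(is_valid_base64("iVBORw0KGgo="))        # complete PNG signature
print(is_valid_base64("iVBORw0KGg"))          # truncated, padding is missing
print(is_valid_base64("iVBORw0KGgoAAAA..."))  # literal dots are not base64
```

The third call shows why schema example values ending in `...` are placeholders only: they cannot be decoded as real attachment data.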

Large Responses

Consider using URL attachments instead of base64 for large files
Break up very large responses into smaller chunks
Use pagination for APIs that return large datasets

Incorrect Content Types

Double-check the MIME type specified in x-process-as-attachment-mime-type
Ensure the content matches the specified MIME type
Test with different browsers or clients if attachments display incorrectly
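
Whether the content matches the declared MIME type can often be checked from the first few bytes of the decoded data. The Python sketch below covers only three of the MIME types listed earlier, using their well-known file signatures; `matches_declared_type` is a hypothetical helper, not a platform function.

```python
import base64

# File signatures ("magic numbers") for three common formats.
SIGNATURES = {
    "image/png": b"\x89PNG\r\n\x1a\n",
    "image/jpeg": b"\xff\xd8\xff",
    "application/pdf": b"%PDF-",
}


def matches_declared_type(encoded: str, mime_type: str) -> bool:
    """Compare decoded bytes with the signature expected for mime_type.

    Raises KeyError for MIME types this sketch does not know about.
    """
    data = base64.b64decode(encoded)
    return data.startswith(SIGNATURES[mime_type])


# Build a minimal payload from the PNG signature itself.
png_header = base64.b64encode(b"\x89PNG\r\n\x1a\n").decode("ascii")

print(matches_declared_type(png_header, "image/png"))
print(matches_declared_type(png_header, "image/jpeg"))
```

A mismatch such as the second call usually means the API returns a different format than the one written in `x-process-as-attachment-mime-type`.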

Saving Complete Responses as Attachments

In addition to the previously described extensions that handle specific properties, allmates.ai also supports the x-save-tool-response-as-attachment extension. This extension works differently as it applies to the entire operation rather than individual response properties.

The `x-save-tool-response-as-attachment` Extension

This extension allows you to save the complete API response as an attachment in the conversation. Unlike other extensions, it should be placed at the operation level (same level as operationId), not within the responses section:

paths:
  /analytics/report:
    get:
      summary: Generate analytics report
      description: |
        Generates a comprehensive analytics report.
      operationId: generateAnalyticsReport
      x-save-tool-response-as-attachment: true  # At operation level, not in responses
      parameters:
        # ... parameters
      responses:
        '200':
          description: Successful response
          content:
            application/json:
              schema:
                # ... schema properties

When this extension is applied:

The complete API response is saved as a JSON attachment
The LLM still receives the response to analyze and process it
Users can download the complete response if needed
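
Because this tag belongs at the operation level, a copy that ends up inside the responses section is easy to overlook. The Python sketch below scans a parsed OpenAPI document for that mistake. The `misplaced_save_tags` and `contains_key` helpers are hypothetical, and the sample `spec` deliberately places the tag in the wrong spot.

```python
SAVE_TAG = "x-save-tool-response-as-attachment"
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def contains_key(node, key) -> bool:
    """Recursively search nested dicts and lists for a key."""
    if isinstance(node, dict):
        return key in node or any(contains_key(v, key) for v in node.values())
    if isinstance(node, list):
        return any(contains_key(item, key) for item in node)
    return False


def misplaced_save_tags(spec: dict) -> list:
    """List operations where the tag sits under responses."""
    problems = []
    for path, item in spec.get("paths", {}).items():
        for method, operation in item.items():
            if method not in HTTP_METHODS:
                continue  # Skip path-level keys such as "parameters".
            if contains_key(operation.get("responses", {}), SAVE_TAG):
                problems.append(f"{method.upper()} {path}")
    return problems


spec = {
    "paths": {
        "/analytics/report": {
            "get": {
                "operationId": "generateAnalyticsReport",
                # Wrong placement: the tag is nested inside responses.
                "responses": {"200": {"description": "OK", SAVE_TAG: True}},
            }
        }
    }
}

print(misplaced_save_tags(spec))
```

Moving the tag next to `operationId` makes the function return an empty list for the same document.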

Difference from Other Processing Extensions

It's important to understand the differences between x-save-tool-response-as-attachment and the other extensions:

| Extension | Application Level | Effect | Use Cases |
| --- | --- | --- | --- |
| `x-save-tool-response-as-attachment` | Operation level | Saves entire response as attachment | Complex structured data, analysis results, complete reports |
| `x-process-as-attachment-from-url` | Property level | Converts a URL to an attachment | Images, documents, media files referenced by URL |
| `x-process-as-attachment-from-base-64` | Property level | Converts base64 data to an attachment | Images or files encoded directly in the response |
| `x-hide-property-from-llm` | Property level | Prevents property from being included in LLM context | Large data, technical metadata, binary content |

Use Cases for `x-save-tool-response-as-attachment`

This extension is particularly useful for:

Complex Responses: When the API returns complex data structures that users might want to analyze in detail
Large Datasets: For responses containing many records or data points
Reports: When the response constitutes a complete report that might be useful to consult later
Debugging: During development, to examine the exact responses received from the API

Implementation Example

paths:
  /analytics/report:
    get:
      summary: Generate analytics report
      description: |
        Generates a comprehensive analytics report with multiple data points and visualizations.
        
        ## For LLMs
        - The complete response will be saved as a downloadable JSON attachment
        - You should summarize the key findings in your response
        - Reference the attachment for users who want the complete data set
      operationId: generateAnalyticsReport
      x-save-tool-response-as-attachment: true  # Correct placement at operation level
      parameters:
        - name: timeframe
          in: query
          description: Time period for the report (daily, weekly, monthly)
          schema:
            type: string
            enum: [daily, weekly, monthly]
      responses:
        '200':
          description: Analytics report generated successfully
          content:
            application/json:
              schema:
                type: object
                properties:
                  summary:
                    type: object
                    properties:
                      total_visitors:
                        type: integer
                      conversion_rate:
                        type: number
                  detailed_metrics:
                    type: array
                    items:
                      type: object
                  trend_analysis:
                    type: object

Conclusion

Response processing extensions let your REST API tools on allmates.ai handle media content, binary data, and complex responses: URLs and base64 content become attachments in the conversation, and bulky properties stay out of the LLM's context.

When designing your OpenAPI schemas, consider these key takeaways:

Choose the right extension for each scenario: Use property-level extensions (x-process-as-attachment-from-url, x-process-as-attachment-from-base-64) for individual media elements, and operation-level extensions (x-save-tool-response-as-attachment) for preserving entire responses.
Optimize LLM context usage: Apply x-hide-property-from-llm to prevent large binary data and technical metadata from consuming valuable token space, keeping conversations focused and efficient.
Balance functionality and performance: Consider the trade-offs between different approaches, such as URL versus base64 encoding, and choose the method that best serves your specific use case.
Provide clear guidance: Include explicit instructions for LLMs in your property descriptions to ensure they interact appropriately with the processed content.
Test thoroughly: Verify that your extensions work as expected across different content types, sizes, and scenarios before deploying to production.

Whether you're displaying images, handling documents, or preserving complex data structures, these extensions give you control over what users see in the conversation and what the LLM receives in its context.