Open
Labels
bug: Something isn't working
Description
Affected Notebook/File
03_skills_custom_development.ipynb
Bug Description
When invoking skills/tools through the Claude agent, the input token consumption appears to be excessively high, causing frequent rate limit errors even for relatively simple operations.
RateLimitError: Error code: 429 - {
'type': 'error',
'error': {
'type': 'rate_limit_error',
'message': 'This request would exceed the rate limit for your organization of 50,000 input tokens per minute. Please reduce the prompt length or the maximum tokens requested, or try again later.'
},
'request_id': 'req_011CVwgfm56xSnTkeud2yzZX'
}
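As a possible stopgap on the client side, requests could be throttled against the 50,000 input tokens per minute quoted in the error. The sketch below is only an illustration: `TokenRateBudget` is a hypothetical helper, not part of any SDK, and it assumes a simple rolling 60-second window, which may not match how the server actually accounts for usage.

```python
import time
from collections import deque


class TokenRateBudget:
    """Hypothetical client-side tracker for a tokens-per-minute limit.

    Assumes a rolling window; the server's real accounting may differ.
    """

    def __init__(self, limit_per_minute=50_000, window_seconds=60.0,
                 clock=time.monotonic):
        self.limit = limit_per_minute
        self.window = window_seconds
        self.clock = clock
        self._events = deque()  # (timestamp, input_tokens) per sent request

    def _expire(self, now):
        # Drop requests that have left the rolling window.
        while self._events and now - self._events[0][0] >= self.window:
            self._events.popleft()

    def used(self):
        """Input tokens recorded inside the current window."""
        self._expire(self.clock())
        return sum(tokens for _, tokens in self._events)

    def seconds_until_available(self, tokens):
        """Seconds to wait before a request of `tokens` fits (0.0 if it fits now)."""
        if tokens > self.limit:
            raise ValueError("request alone exceeds the per-minute limit")
        now = self.clock()
        self._expire(now)
        used = sum(t for _, t in self._events)
        wait = 0.0
        for timestamp, spent in self._events:
            if used + tokens <= self.limit:
                break
            # Wait until this older request falls out of the window.
            used -= spent
            wait = timestamp + self.window - now
        return max(wait, 0.0)

    def record(self, tokens):
        """Record the input tokens of a request that was just sent."""
        self._events.append((self.clock(), tokens))
```

Before each call, the caller would sleep for `seconds_until_available(estimated_tokens)` and then call `record(...)` with the input token count of the request. This does not reduce consumption; it only spreads it out so that fewer requests are rejected with a 429.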
I'd like to understand and discuss potential strategies to reduce input token consumption:
- System Prompt Optimization
  - Is there a recommended way to minimize system prompt size when using agent skills?
  - Can skill definitions be loaded dynamically rather than included in every request?
- Conversation History Management
  - What's the recommended approach for truncating or summarizing conversation history?
  - Is there a sliding window implementation available?
- Tool/Skill Definition Efficiency
  - Are there best practices for defining tools with minimal token overhead?
  - Can tool schemas be compressed or cached?
- Caching Mechanisms
  - Does the API support prompt caching to reduce repeated token charges?
  - Are there plans to implement context caching for agent interactions?
- Token Budgeting
  - Is there a way to set a maximum input token budget per request?
  - Can we get token count estimates before sending requests?
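To make the sliding-window and token-budget questions concrete, here is a minimal sketch of the kind of helper being asked about. Everything in it is an assumption for illustration: `estimate_tokens`, `message_tokens`, and `sliding_window` are hypothetical names, the 4-characters-per-token ratio is a crude heuristic rather than the real tokenizer, and messages are assumed to be dicts with `role` and `content` keys.

```python
def estimate_tokens(text):
    """Crude estimate: about 4 characters per token (assumption, not the real tokenizer)."""
    return max(1, len(text) // 4)


def message_tokens(message):
    """Estimated tokens for one message; non-string content is stringified first."""
    content = message["content"]
    if not isinstance(content, str):
        content = str(content)
    return estimate_tokens(content)


def sliding_window(messages, max_input_tokens, reserved_tokens=0):
    """Keep the most recent messages that fit the budget.

    `reserved_tokens` sets aside room for the system prompt and tool
    definitions, which are sent with every request.
    Returns (kept_messages, estimated_total_tokens).
    """
    budget = max_input_tokens - reserved_tokens
    kept = []
    total = 0
    # Walk from newest to oldest and stop at the first message that does not fit.
    for message in reversed(messages):
        cost = message_tokens(message)
        if total + cost > budget:
            break
        kept.append(message)
        total += cost
    kept.reverse()
    # Assumption: the trimmed history should still begin with a user turn.
    while kept and kept[0]["role"] != "user":
        total -= message_tokens(kept[0])
        kept.pop(0)
    return kept, total
```

For example, `sliding_window(history, max_input_tokens=20_000, reserved_tokens=5_000)` would keep roughly the last 15,000 estimated tokens of conversation. The sketch does not keep tool call and tool result messages paired, and it drops old turns instead of summarizing them, so it would need extending before use with agent skills.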
Steps to Reproduce
The error appears in the notebook section "Test Brand Guidelines with Document Creation", at the cell that follows this text:
"Let's test the brand skill by creating a branded PowerPoint presentation:"
Error Message
RateLimitError: Error code: 429 - {
'type': 'error',
'error': {
'type': 'rate_limit_error',
'message': 'This request would exceed the rate limit for your organization of 50,000 input tokens per minute. Please reduce the prompt length or the maximum tokens requested, or try again later.'
},
'request_id': 'req_011CVwgfm56xSnTkeud2yzZX'
}
Environment
No response
Would you be willing to submit a PR to fix this?
None