Global AI Network

Telegram Expense Tracker AI Agent

Automate expense tracking via Telegram with AI-powered voice transcription, OCR receipt scanning, and intelligent expense categorization using GPT-4o.

Total Deployments: 59+
Setup Time: 10 min
Version: v1.0

Required Integrations

This agent works seamlessly with these platforms to deliver powerful automation.

OpenAI

Leverage OpenAI's powerful language models to generate text, answer questions, a...

Telegram

Connect your Telegram bot to send messages, photos, documents and receive update...

Step by Step

Setup Tutorial

What This Agent Does

This Telegram bot accepts text, voice messages, photos, and documents, then processes them into structured expense data. Each incoming message is routed by type: voice messages are transcribed, images are analyzed with OCR, and text is passed through directly. Every input then flows to an AI-powered expense manager that extracts the relevant details, categorizes the expense, and replies with a confirmation.

Key benefits include:

  • Save 15+ minutes daily by eliminating manual expense entry and receipt organization
  • Never lose a receipt with automatic photo and document processing
  • Hands-free tracking through voice message support for on-the-go expense logging
  • Instant categorization powered by GPT-4o for accurate expense classification

This workflow is perfect for tracking business expenses, managing personal budgets, processing receipts in real-time, and maintaining expense records without switching between multiple apps.

Who Is It For

This automation is ideal for:

  • Freelancers and consultants who need to track billable expenses across multiple clients
  • Small business owners managing company spending without dedicated accounting software
  • Sales professionals capturing receipts during travel and client meetings
  • Anyone managing a budget who wants effortless expense tracking through their favorite messaging app
  • Teams that need a simple, shared expense reporting system without complex software

Whether you're tech-savvy or just getting started with automation, this workflow provides enterprise-level expense management through the familiar Telegram interface.

Required Integrations

Telegram

Telegram powers the user interface for this workflow, receiving messages and sending responses back to users. This integration enables your bot to accept text, voice, photos, and documents while providing real-time feedback.

Setup steps:

  1. Open Telegram and search for @BotFather
  2. Send /newbot and follow the prompts to create your bot
  3. Choose a display name (e.g., "My Expense Tracker")
  4. Choose a unique username ending in "bot" (e.g., "myexpense_tracker_bot")
  5. Save the API token provided by BotFather (format: 123456789:ABCdefGHIjklMNOpqrsTUVwxyz)
  6. In TaskAGI, navigate to Integrations → Add Integration → Telegram
  7. Paste your bot token in the Bot Token field
  8. Click Connect and verify the connection shows your bot's name
  9. Set up the webhook by copying the webhook URL from the Telegram trigger node
  10. The webhook will automatically register when you activate the workflow

Important: Keep your bot token secure—anyone with this token can control your bot.
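
If you want to sanity-check a token before pasting it into TaskAGI, the sketch below may help. It assumes only the public Bot API URL scheme (https://api.telegram.org/bot<token>/<method>). The helper names and the loose shape check are illustrative and not part of TaskAGI. Opening the resulting getMe URL in a browser should return your bot's name if the token is valid.

```javascript
// Loose shape check: digits, a colon, then letters, digits, "_" or "-".
// Assumption: this only catches copy-paste mistakes; it cannot prove a token is valid.
const TOKEN_SHAPE = /^\d+:[A-Za-z0-9_-]+$/;

function looksLikeBotToken(token) {
  return typeof token === "string" && TOKEN_SHAPE.test(token.trim());
}

// Bot API methods are called at https://api.telegram.org/bot<token>/<method>.
function botApiUrl(token, method) {
  if (!looksLikeBotToken(token)) {
    throw new Error("Bot token does not match the <digits>:<secret> shape");
  }
  return `https://api.telegram.org/bot${token.trim()}/${method}`;
}

// Placeholder token copied from the setup steps, not a real credential.
const exampleUrl = botApiUrl("123456789:ABCdefGHIjklMNOpqrsTUVwxyz", "getMe");
```

Treat any URL built this way like the token itself, because it embeds the secret.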

OpenAI

OpenAI provides the AI capabilities for transcribing voice messages, extracting text from images via OCR, and powering the intelligent expense manager that categorizes and processes your expense data.

Setup steps:

  1. Visit https://platform.openai.com and sign in or create an account
  2. Navigate to API Keys in your account settings
  3. Click Create new secret key
  4. Name your key (e.g., "TaskAGI Expense Bot")
  5. Copy and save the key immediately—you won't be able to see it again
  6. In TaskAGI, go to Integrations → Add Integration → OpenAI
  7. Paste your API key in the API Key field
  8. Click Connect to verify the integration
  9. Ensure you have billing set up in your OpenAI account with available credits

Cost considerations: This workflow uses GPT-4o. Expect roughly $0.005 per expense processed; the actual cost varies with message length, image complexity, and OpenAI's current pricing.
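
For budgeting, the approximate per-expense figure above can be turned into a rough monthly estimate. This is a back-of-the-envelope sketch: the function name and the 30-day default are assumptions, and real spend depends on your actual usage.

```javascript
// Approximate per-expense figure quoted in this guide; not an official price.
const COST_PER_EXPENSE_USD = 0.005;

// Rough monthly estimate, rounded to whole cents.
function estimateMonthlyCost(expensesPerDay, daysPerMonth = 30) {
  const total = expensesPerDay * daysPerMonth * COST_PER_EXPENSE_USD;
  return Math.round(total * 100) / 100;
}

// Ten expenses a day for a 30-day month.
const tenPerDay = estimateMonthlyCost(10);
```

At ten expenses a day, the estimate comes to about $1.50 per month.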

Configuration Steps

1. Telegram Webhook Trigger (Telegram Message Received)

This node listens for incoming messages to your bot. No configuration is needed beyond connecting your Telegram integration—it automatically captures all message types (text, voice, photo, document).

What it captures: Message content, sender information, chat ID, and media file IDs.

2. Route by Message Type (Switch Node)

This intelligent router directs messages to the appropriate processing path based on content type.

Configure three cases:

  • Case 1 (Text): Condition: {{message.text}} exists
  • Case 2 (Voice): Condition: {{message.voice}} exists
  • Case 3 (Photo/Document): Condition: {{message.photo}} OR {{message.document}} exists
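
The three cases can be pictured as a single routing function over the Telegram message object. This is only a sketch of the logic, not how TaskAGI implements its Switch node. The "welcome" branch for /start is an assumption about how that command is separated from ordinary text.

```javascript
// Returns the name of the processing path for a Telegram message object.
function routeMessage(message) {
  if (!message) return "ignore";
  // Assumption: /start is checked before the generic text case.
  if (typeof message.text === "string" && message.text.trim().startsWith("/start")) {
    return "welcome";
  }
  if (message.text) return "text";                       // Case 1
  if (message.voice) return "voice";                     // Case 2
  if (message.photo || message.document) return "media"; // Case 3
  return "ignore"; // stickers, locations, and other unsupported types
}
```

Each message matches at most one branch, which keeps the later Merge step simple.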

3. Welcome Message Path (Send Welcome Message)

For the /start command, configure this node to send:

Chat ID: {{trigger.message.chat.id}}
Message: Welcome! Send me your expenses as text, voice, photos, or documents. I'll help you track them automatically!

4. Voice Processing Path

Check for Voice (If Condition): Set condition to {{trigger.message.voice}} !== undefined

Get Voice File: Configure with:

  • File ID: {{trigger.message.voice.file_id}}

Transcribe Voice (OpenAI):

  • Model: whisper-1
  • Audio File: {{previous.file_path}}
  • Language: en (or leave blank for auto-detection)

Prepare Voice Input:

  • Create field processed_text with value: {{transcription.text}}
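
How the Get Voice File node resolves the file internally is not documented here. Against the raw Bot API, the equivalent step is a getFile call, whose result contains a file_path that is then downloaded from https://api.telegram.org/file/bot<token>/<file_path>. The sketch below assumes that flow; the helper name and the sample response are illustrative.

```javascript
// Builds the download URL from a Bot API getFile response.
function fileDownloadUrl(token, getFileResponse) {
  const filePath = getFileResponse?.result?.file_path;
  if (!getFileResponse?.ok || !filePath) {
    throw new Error("getFile did not return a file_path");
  }
  return `https://api.telegram.org/file/bot${token}/${filePath}`;
}

// Sample response with made-up values, shaped like a getFile result.
const sampleResponse = {
  ok: true,
  result: { file_id: "abc123", file_path: "voice/file_0.oga" },
};
const voiceUrl = fileDownloadUrl("123456789:ABCdefGHIjklMNOpqrsTUVwxyz", sampleResponse);
```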

5. Photo/Document Processing Path

Check for Photo/Document: Condition: {{trigger.message.photo}} !== undefined OR {{trigger.message.document}} !== undefined

Get Photo File ID (Function Node):

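// Telegram sends each photo as an array of sizes; the last entry is the highest resolution.
// Return an object so the next node can read {{previous.file_id}}.
const message = input.message;
if (message.photo && message.photo.length > 0) {
  return { file_id: message.photo[message.photo.length - 1].file_id };
}
// No photo present, so fall back to the attached document.
return { file_id: message.document.file_id };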

Get Media File: File ID: {{previous.file_id}}

OCR with Vision (OpenAI Completion):

  • Model: gpt-4o
  • Messages:
    • System: "You extract text and expense information from images."
    • User: "Extract all text and expense information from this image: [image_url: {{previous.file_path}}]"
  • Max Tokens: 1000

Prepare Media Input:

  • Create field processed_text with value: {{completion.choices[0].message.content}}
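
Behind the node settings, a vision request to the OpenAI Chat Completions API carries the image as an image_url content part alongside the text instruction. The sketch below shows that request body using the values from this step. The function name and sample URL are illustrative, and TaskAGI may assemble the request differently.

```javascript
// Builds a Chat Completions request body for extracting text from one image.
function buildVisionRequest(imageUrl) {
  return {
    model: "gpt-4o",
    max_tokens: 1000,
    messages: [
      { role: "system", content: "You extract text and expense information from images." },
      {
        role: "user",
        content: [
          { type: "text", text: "Extract all text and expense information from this image:" },
          { type: "image_url", image_url: { url: imageUrl } },
        ],
      },
    ],
  };
}

// Hypothetical image location, used only to show the shape of the request.
const visionRequest = buildVisionRequest("https://example.com/receipt.jpg");
```

The extracted text is then read from choices[0].message.content in the response, which matches the processed_text mapping above.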

6. Text Processing Path

Prepare Text Input:

  • Create field processed_text with value: {{trigger.message.text}}

7. Merge and Process

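Merge Inputs: This node passes along the output of whichever processing path ran for the message (text, voice, or media). Each path supplies the same processed_text field, so the next node reads it the same way regardless of input type.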
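
The merge behavior can be pictured as choosing the one path output that carries processed_text. The sketch below is a mental model rather than TaskAGI's implementation. It also rejects the case where two paths produced output, since that usually points to a routing mistake.

```javascript
// Accepts the outputs of the three paths (unused paths are undefined)
// and returns the single processed_text for the AI agent.
function mergeInputs(pathOutputs) {
  const ready = pathOutputs.filter(
    (out) => out && typeof out.processed_text === "string" && out.processed_text.trim() !== ""
  );
  if (ready.length === 0) throw new Error("No path produced processed_text");
  if (ready.length > 1) throw new Error("More than one path ran for a single message");
  return { processed_text: ready[0].processed_text };
}

// Only the voice path ran for this message.
const merged = mergeInputs([undefined, { processed_text: "Uber ride to airport, $42" }, undefined]);
```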

Expense Manager AI Agent:

  • Model: gpt-4o
  • Temperature: 0.3 (for consistent categorization)
  • System Prompt:
You are an expense tracking assistant. Extract and structure expense information from user input.

Always respond with:
- Amount (with currency)
- Category (food, transport, entertainment, business, utilities, other)
- Date (default to today if not specified)
- Description
- Merchant/vendor (if mentioned)

Format as a clear confirmation message.
  • User Message: {{merged.processed_text}}

8. Send Response

Send AI Response:

  • Chat ID: {{trigger.message.chat.id}}
  • Message: {{ai_agent.response}}
  • Parse Mode: Markdown (for formatted responses)
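
These settings correspond to the Bot API's sendMessage parameters chat_id, text, and parse_mode. The sketch below builds that payload. The helper name is illustrative, and the truncation reflects the Bot API's 4096-character limit on message text.

```javascript
const MAX_MESSAGE_LENGTH = 4096; // Bot API limit for message text

// Builds a sendMessage payload; pass useMarkdown = false to send plain text.
function buildSendMessage(chatId, text, useMarkdown = true) {
  const payload = { chat_id: chatId, text: text.slice(0, MAX_MESSAGE_LENGTH) };
  if (useMarkdown) payload.parse_mode = "Markdown";
  return payload;
}

// Sample chat ID and text, for illustration only.
const reply = buildSendMessage(42, "*Logged:* Lunch at Chipotle, $15.50 (food)");
```

Telegram rejects a message whose Markdown does not parse, for example an unmatched * or _ in a merchant name. If sends fail intermittently, retrying without parse_mode is a reasonable fallback.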

Testing Your Agent

Initial Test Sequence

  1. Activate the workflow in TaskAGI and ensure the webhook is registered
  2. Open Telegram and search for your bot by username
  3. Send /start and verify you receive the welcome message
  4. Test text input: Send "Lunch at Chipotle $15.50"
  5. Check the execution log in TaskAGI—you should see the message flow through: Trigger → Switch → Text Prepare → AI Agent → Response
  6. Verify the response contains structured expense data with amount, category, date, and description

Comprehensive Testing

Test voice messages:

  1. Record a voice message: "Uber ride to airport, forty-two dollars"
  2. Verify transcription appears in the execution log
  3. Confirm AI extracts: Amount ($42), Category (transport), Description

Test photo processing:

  1. Take a photo of a receipt
  2. Send to your bot
  3. Check execution log for OCR extraction
  4. Verify AI identifies merchant, amount, and items

Test document processing:

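  1. Send a receipt or invoice as a file attachment (an image file is the safest choice; a PDF may need converting to an image first, because the OCR step passes the file to GPT-4o as an image)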
  2. Confirm OCR processes the document
  3. Verify expense extraction accuracy

Success Indicators

✅ All message types receive responses within 5-10 seconds
✅ Voice transcription accuracy exceeds 95%
✅ OCR correctly extracts amounts and merchant names
✅ AI categorizes expenses consistently
✅ Responses include all required fields (amount, category, date, description)

Troubleshooting

"Bot doesn't respond to messages"

Cause: Webhook not properly registered or workflow not activated

Solution:

  • Verify workflow status shows "Active" in TaskAGI
  • Check the Telegram trigger node shows a valid webhook URL
  • Test the webhook URL directly—it should return a 200 status
  • Deactivate and reactivate the workflow to re-register the webhook
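
The Bot API's getWebhookInfo method reports what Telegram currently has registered. Its result includes url, pending_update_count, and, after a failed delivery, last_error_message. The sketch below interprets such a result. The function name, the messages, and the sample URLs are illustrative.

```javascript
// Interprets the "result" object returned by getWebhookInfo.
function diagnoseWebhook(info, expectedUrl) {
  if (!info.url) {
    return "No webhook is registered; reactivate the workflow.";
  }
  if (info.url !== expectedUrl) {
    return "Webhook points at a different URL than the trigger node shows.";
  }
  if (info.last_error_message) {
    return `Telegram reported a delivery error: ${info.last_error_message}`;
  }
  return "Webhook looks healthy.";
}

// Hypothetical values, shaped like a getWebhookInfo result.
const diagnosis = diagnoseWebhook(
  { url: "https://example.com/hook", pending_update_count: 0 },
  "https://example.com/hook"
);
```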

"Voice messages aren't transcribed"

Cause: OpenAI integration issue or unsupported audio format

Solution:

  • Verify OpenAI integration is connected and has available credits
  • Check execution logs for specific error messages
  • Ensure the Whisper model (whisper-1) is specified correctly
  • Telegram voice messages arrive in .ogg format, which Whisper supports, so no conversion should be needed

"Photos return 'cannot process image'"

Cause: File size too large or GPT-4o vision not properly configured

Solution:

  • Verify you're using gpt-4o (not gpt-4 or gpt-3.5-turbo)
  • Check image file size—compress if over 20MB
  • Ensure the image URL is properly passed to the vision API
  • Verify your OpenAI account and API key have access to the gpt-4o model

"AI responses are inconsistent"

Cause: Temperature setting too high or prompt needs refinement

Solution:

  • Lower temperature to 0.2-0.3 for more consistent categorization
  • Add specific examples to the system prompt
  • Include explicit formatting instructions
  • Add validation rules for required fields

"Execution fails at Merge node"

Cause: Multiple paths executing simultaneously or data structure mismatch

Solution:

  • Verify Switch node properly routes to only one path per message
  • Check that all preparation nodes output processed_text field
  • Ensure Merge node is configured to wait for any single input (not all inputs)

Next Steps

Immediate Actions

After successful setup, customize the AI prompt to match your specific expense categories and business rules. Add your preferred currency, tax handling requirements, or project codes.

Set up expense storage by adding a Google Sheets or Airtable node after the AI Agent to automatically log all processed expenses for later analysis and reporting.

Create expense reports by adding a scheduled workflow that queries your stored expenses and sends weekly or monthly summaries.
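
Once expenses are stored, a weekly or monthly summary is mostly a filter and a group-by. The sketch below assumes a hypothetical record shape of { amount, category, date } with ISO dates. Your Google Sheets or Airtable columns may differ, and the sample rows are made up.

```javascript
// Sums amounts per category for records dated within [fromDate, toDate].
// Totals are accumulated in cents to avoid floating-point drift.
function summarizeByCategory(expenses, fromDate, toDate) {
  const totalsInCents = {};
  for (const expense of expenses) {
    // ISO "YYYY-MM-DD" strings compare correctly as plain strings.
    if (expense.date < fromDate || expense.date > toDate) continue;
    const cents = Math.round(expense.amount * 100);
    totalsInCents[expense.category] = (totalsInCents[expense.category] || 0) + cents;
  }
  const totals = {};
  for (const [category, cents] of Object.entries(totalsInCents)) {
    totals[category] = cents / 100;
  }
  return totals;
}

// Sample rows for illustration only.
const weekly = summarizeByCategory(
  [
    { amount: 15.5, category: "food", date: "2024-05-06" },
    { amount: 42, category: "transport", date: "2024-05-07" },
    { amount: 9.25, category: "food", date: "2024-05-08" },
    { amount: 100, category: "business", date: "2024-04-01" },
  ],
  "2024-05-06",
  "2024-05-12"
);
```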

Optimization Suggestions

Improve accuracy by giving the AI worked examples in the prompt (few-shot prompting):

  • Add 5-10 sample expenses to the system prompt
  • Include edge cases (split bills, tips, foreign currency)
  • Specify how to handle ambiguous categories
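
One way to keep the examples maintainable is to store them as data and generate the system prompt from them. The sketch below assumes a hypothetical { input, output } shape for each example. The sample entries are illustrative and should be replaced with expenses typical for you.

```javascript
const BASE_PROMPT =
  "You are an expense tracking assistant. Extract and structure expense information from user input.";

// Appends numbered input/output examples to the base system prompt.
function buildSystemPrompt(examples) {
  const blocks = examples.map(
    (example, index) =>
      `Example ${index + 1}\nInput: ${example.input}\nOutput: ${example.output}`
  );
  return [BASE_PROMPT, "", "Examples:", ...blocks].join("\n");
}

// Illustrative samples, including one edge case (a tip).
const systemPrompt = buildSystemPrompt([
  { input: "Lunch at Chipotle $15.50", output: "Amount: $15.50, Category: food" },
  { input: "Dinner $40 plus $8 tip", output: "Amount: $48.00, Category: food" },
]);
```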

Add validation with an additional If Condition node that checks for required fields and asks clarifying questions when information is missing.
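
The check itself can be very small. The sketch below assumes the AI output has already been parsed into an object with the required fields listed in this guide. The field names and the wording of the question are illustrative.

```javascript
const REQUIRED_FIELDS = ["amount", "category", "date", "description"];

// Lists required fields that are absent or empty.
function findMissingFields(expense) {
  return REQUIRED_FIELDS.filter(
    (field) => expense[field] === undefined || expense[field] === null || expense[field] === ""
  );
}

// Returns a follow-up question, or null when nothing is missing.
function clarifyingQuestion(expense) {
  const missing = findMissingFields(expense);
  if (missing.length === 0) return null;
  return `I still need the ${missing.join(", ")} for this expense. Could you send it?`;
}

// Hypothetical parsed expense with no amount.
const question = clarifyingQuestion({ category: "transport", date: "2024-05-07", description: "Uber" });
```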

Enable multi-user support by storing user preferences (default currency, favorite categories) in a database keyed by Telegram user ID.
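
In Telegram updates, the sender's ID is available as message.from.id, which makes a natural key. The sketch below uses an in-memory Map as a stand-in for a real database. The preference fields and defaults are assumptions.

```javascript
// Assumed defaults; adjust to your own needs.
const DEFAULT_PREFS = { currency: "USD", favoriteCategories: [] };

// In-memory stand-in for a database table keyed by Telegram user ID.
const prefsByUserId = new Map();

// Returns stored preferences merged over the defaults.
function getPrefs(message) {
  const stored = prefsByUserId.get(message.from.id);
  return { ...DEFAULT_PREFS, ...stored };
}

// Applies changes for the sender and returns the updated preferences.
function updatePrefs(message, changes) {
  const next = { ...getPrefs(message), ...changes };
  prefsByUserId.set(message.from.id, next);
  return next;
}

// User 1001 switches to euros; user 1002 keeps the defaults.
updatePrefs({ from: { id: 1001 } }, { currency: "EUR" });
```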

Advanced Usage Tips

Implement approval workflows for business expense tracking by adding a confirmation step with inline keyboard buttons before logging expenses.
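
In the Bot API, inline buttons are sent through reply_markup.inline_keyboard, and each button's callback_data (limited to 64 bytes) comes back in a callback query when tapped. The sketch below shows one possible encoding. The action:id format and the helper names are assumptions.

```javascript
// Builds a sendMessage payload with Approve / Reject buttons.
function buildApprovalMessage(chatId, summary, expenseId) {
  return {
    chat_id: chatId,
    text: `${summary}\n\nLog this expense?`,
    reply_markup: {
      inline_keyboard: [[
        { text: "Approve", callback_data: `approve:${expenseId}` },
        { text: "Reject", callback_data: `reject:${expenseId}` },
      ]],
    },
  };
}

// Decodes callback_data produced above; returns null for anything else.
function parseDecision(callbackData) {
  const [action, expenseId] = callbackData.split(":");
  if (action !== "approve" && action !== "reject") return null;
  return { action, expenseId };
}

// Hypothetical chat ID and expense ID.
const approval = buildApprovalMessage(42, "Uber ride, $42 (transport)", "exp17");
```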

Add receipt storage by uploading processed images to cloud storage (Dropbox, Google Drive) with organized folder structures.

Create budget alerts by checking expense totals against predefined limits and sending warnings when approaching budget thresholds.
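
The threshold check reduces to a ratio comparison. In the sketch below, the 80% warning level is an assumed default and the status names are illustrative.

```javascript
// Classifies spending against a limit: "ok", "warning", or "over".
function budgetStatus(spent, limit, warnAt = 0.8) {
  if (limit <= 0) throw new Error("Budget limit must be positive");
  const ratio = spent / limit;
  if (ratio >= 1) return "over";
  if (ratio >= warnAt) return "warning";
  return "ok";
}

// Spending of 85 against a limit of 100 crosses the 80% warning level.
const status = budgetStatus(85, 100);
```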

Generate analytics by connecting to visualization tools that show spending patterns, category breakdowns, and trend analysis.

Support multiple languages by modifying the Whisper transcription to auto-detect language and adjusting the AI prompt to respond in the user's language.

Your expense tracking bot is now ready to save you hours of manual data entry while ensuring no expense goes unrecorded!