Groq

AI

Ultra-fast LLM inference for Llama, Mixtral, and Gemma

Groq delivers ultra-fast LLM inference — run Llama 3, Mixtral, Gemma, and more at hundreds of tokens per second.

Details

Auth Type
API Key (Bearer token)
Rate Limit
30 req/min (free)
Pricing
$0.05–$0.27 per 1M tokens
Full Docs
Step 1: Save your provider key

This is NOT your Callio key. Enter the API key from the provider's dashboard (for Groq, the key from https://console.groq.com/keys).

API Key (Bearer token)

1. Go to https://console.groq.com/keys
2. Create a key
3. Paste it in the API Key field

Get API Credentials

Getting Started

1

Try It Instantly

Click "Try It" above to test the API in the playground

2

Add to Your Agent

Click "Add to Agent" to get your API key and integrate

Common Use Cases

Real-time chat applications
Low-latency agents
Streaming text
Fast summarisation
Code completion
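For the streaming use case above, Groq's OpenAI-compatible API returns server-sent events when `"stream": true` is set in the request body. A minimal sketch of pulling the text tokens out of that stream, assuming the standard OpenAI-style `data:` chunk format (the helper name `parse_sse_tokens` is illustrative, not part of any SDK):

```python
import json

def parse_sse_tokens(lines):
    """Yield content tokens from OpenAI-compatible SSE 'data:' lines."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip keep-alives and blank lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # sentinel that ends the stream
            break
        delta = json.loads(payload)["choices"][0]["delta"]
        if "content" in delta:
            yield delta["content"]

# Example: two chunks followed by the end-of-stream sentinel
chunks = [
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    'data: [DONE]',
]
print("".join(parse_sse_tokens(chunks)))  # → Hello
```

In a real client you would iterate over the HTTP response body line by line and print each token as it arrives, which is what gives streaming its low perceived latency.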

💻 Code Examples

Get started quickly with these code examples in your favorite language

curl -X POST \
  'https://www.callio.dev/api/proxy/groq-llm/endpoint' \
  -H 'Authorization: Bearer YOUR_CALLIO_KEY' \
  -H 'Content-Type: application/json' \
  -d '{"model": "llama3-8b-8192", "messages": [{"role": "user", "content": "Hello!"}]}'

💡 Tip: Replace YOUR_CALLIO_KEY with your actual Callio API key from the dashboard.
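The same call in Python, using only the standard library. This is a sketch under two assumptions: that the proxy forwards to Groq's OpenAI-compatible chat completions route (the `/chat/completions` suffix and the `CALLIO_KEY` environment variable name are illustrative, not documented here), and that the request body follows Groq's public API shape:

```python
import json
import os
import urllib.request

CALLIO_BASE = "https://www.callio.dev/api/proxy/groq-llm"

def build_chat_request(prompt, model="llama3-8b-8192"):
    """Build the JSON body for an OpenAI-compatible chat completion."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def chat(prompt):
    """Send one chat completion through the Callio proxy (assumed route)."""
    body = json.dumps(build_chat_request(prompt)).encode()
    req = urllib.request.Request(
        f"{CALLIO_BASE}/chat/completions",  # assumed proxy path
        data=body,
        headers={
            "Authorization": f"Bearer {os.environ['CALLIO_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Splitting payload construction from the HTTP call keeps the request shape easy to test without a network round trip.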

Ready to integrate Groq?

Test endpoints live or generate your API key and start building in minutes

Browse More APIs