AI Rest-API – A Fully Free Service
Introduction
Hello everyone. Today I’m opening access to something I’ve been building, testing, and tuning for months: my own AI Rest-API, completely free to use. It is an attempt to provide a fast, reliable, privacy-friendly, and developer-oriented AI service that anyone can integrate into apps, tools, automations, or experiments without a paywall. I designed it to be practical and dependable rather than a demo that falls apart under real load.
Core Architecture
Local, Customized GPT4ALL
The primary engine is a fully local GPT4ALL instance, deeply customized and fine-tuned to behave like a production-ready AI endpoint.
Running locally gives complete independence from cloud providers, removes third-party exposure of data, and provides predictable performance for many common tasks.
Advanced Caching with Prompt-Relay Logic
The API uses an advanced caching layer that does more than simple key-value caching. It understands prompt similarity and context logic:
- Similar prompts that lead to the same logical answer return an instant cached response.
- Exact prompts return immediately with near-zero latency from cache.
- Cached responses do not consume fallback provider calls.
- The cache reduces unnecessary computation and keeps the system responsive under load.
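The internals of the real caching layer aren't published, but the behavior described above can be sketched in a few lines. The following is a minimal illustration, not the actual implementation: exact prompts hit via a hash of the normalized text, and "similar" prompts hit when their token sets overlap above a threshold (token-set Jaccard similarity is a stand-in for whatever metric the service actually uses).

```python
import hashlib
import re

class PromptCache:
    """Sketch of a similarity-aware prompt cache (illustrative only)."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold
        self.entries = {}  # normalized-prompt hash -> (token set, cached response)

    @staticmethod
    def _normalize(prompt):
        # Collapse whitespace and lowercase so trivially different prompts match.
        return re.sub(r"\s+", " ", prompt.strip().lower())

    def _key(self, prompt):
        return hashlib.sha256(self._normalize(prompt).encode()).hexdigest()

    def get(self, prompt):
        key = self._key(prompt)
        if key in self.entries:
            return self.entries[key][1]  # exact hit: near-zero latency
        tokens = set(self._normalize(prompt).split())
        for other_tokens, response in self.entries.values():
            union = tokens | other_tokens
            if union and len(tokens & other_tokens) / len(union) >= self.threshold:
                return response  # similar-prompt hit: same logical answer
        return None  # miss: caller must compute (and then put) the response

    def put(self, prompt, response):
        tokens = set(self._normalize(prompt).split())
        self.entries[self._key(prompt)] = (tokens, response)
```

A cache hit short-circuits before any provider is contacted, which is why cached responses never consume fallback calls.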
Fallback Chain: Local → OpenAI → Grok → Deeplink
Local models can be resource-intensive under high concurrency. To keep the API stable, I implemented an automated fallback chain:
- Primary: local GPT4ALL
- Fallback A: OpenAI
- Fallback B: Grok
- Fallback C: Deeplink
- If all providers are busy, the API returns 502. In that case, retry after a few seconds.
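The chain itself is straightforward to picture. Here is a hedged sketch of the pattern, using the provider names from the list above; how a provider actually signals "busy" is an assumption (modeled here as a raised exception):

```python
def call_with_fallback(prompt, providers):
    """Try each provider in order; return the first successful response.

    `providers` is an ordered list of (name, callable) pairs standing in for
    local GPT4ALL, OpenAI, Grok, and Deeplink. A provider that is busy or
    failing raises; if every provider fails, a 502-style payload is returned
    so the caller knows to retry shortly.
    """
    for name, provider in providers:
        try:
            return {"provider": name, "response": provider(prompt)}
        except Exception:
            continue  # provider busy or failed; fall through to the next one
    return {"status_code": 502, "error": True, "status_msg": "All providers busy"}
```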
How to Use the API
Authentication
Every request must include your personal token as a query parameter:
https://api.mihajlo.mk/v1/ai-text?token=YOUR_TOKEN
You can find your token in your profile at https://mihajlo.mk/. Keep it private. Requests with an invalid token are rejected.
Endpoint
POST https://api.mihajlo.mk/v1/ai-text?token=YOUR_TOKEN
The endpoint accepts multipart/form-data.
Request Parameters
| Field | Type | Required | Description |
|---|---|---|---|
| prompt | text | Yes | Up to 5000 characters. The main input. |
| temperature | number | No | Creativity control (0–2). Defaults to provider settings. |
| system | text | No | Optional system instruction (e.g. "You are a creative fitness coach"). |
| tokens | int | No | Approximate max tokens for the response. |
Example cURL
curl -X POST "https://api.mihajlo.mk/v1/ai-text?token=YOUR_TOKEN" \
-F "prompt=Write a 3 day beginner push/pull/legs program" \
-F "temperature=0.8" \
-F "system=You are a friendly certified trainer" \
-F "tokens=800"
Response Format
{
  "error": false,
  "status_text": "OK",
  "status_code": 200,
  "status_msg": "OK",
  "ai-text": {
    "prompt": "Write a 3 day beginner push/pull/legs program",
    "response": "Day 1 - Push ... (full AI response)"
  }
}
Validation errors return 422. If all resources are busy, the API returns 502. In the latter case, retry after a few seconds.
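Client-side handling of this shape is simple. The sketch below parses a raw response body using the documented fields (note the hyphenated "ai-text" key, which rules out attribute-style access in most languages):

```python
import json

def parse_ai_text(raw):
    """Parse a /v1/ai-text response body.

    Returns the generated text on success; raises on the documented error
    statuses (422 validation, 502 all-resources-busy).
    """
    body = json.loads(raw)
    code = body.get("status_code")
    if code == 200 and not body.get("error"):
        return body["ai-text"]["response"]  # hyphenated key: use subscript access
    if code == 502:
        raise RuntimeError("All resources busy; retry after a few seconds")
    raise ValueError(f"API error {code}: {body.get('status_msg')}")
```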
Error Codes
| HTTP Status | Meaning | Action |
|---|---|---|
| 401 | Missing or invalid token | Verify token on your profile. |
| 405 | Wrong HTTP method | Only POST is supported. |
| 422 | Validation error (prompt missing, etc.) | Fix the payload and retry. |
| 502 | Upstream failure (all resources failed) | Wait a few seconds and submit again. |
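Only 502 is worth retrying automatically; 401, 405, and 422 indicate a caller mistake that a retry won't fix. A minimal retry helper might look like this, where `send` is a hypothetical wrapper around your HTTP client that returns a (status, body) pair:

```python
import time

def post_with_retry(send, retries=3, delay=2.0):
    """Resubmit on 502 (all upstream providers busy); pass other statuses through.

    `send` is any zero-argument callable returning (status_code, body),
    e.g. a closure around your HTTP library's POST call.
    """
    for _ in range(retries):
        status, body = send()
        if status != 502:
            return status, body  # success, or a non-retryable caller error
        time.sleep(delay)  # brief pause before resubmitting, per the docs
    return status, body  # still 502 after all retries; surface it to the caller
```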
Integration Example: PHP
<?php
$curl = curl_init();
$data = [
    'prompt' => 'Generate a short motivational quote',
    'temperature' => 1.0,
    'system' => 'You speak like a calm mentor',
    'tokens' => 300
];
curl_setopt_array($curl, [
    CURLOPT_URL => "https://api.mihajlo.mk/v1/ai-text?token=YOUR_TOKEN",
    CURLOPT_POST => true,
    CURLOPT_POSTFIELDS => $data, // array payload is sent as multipart/form-data
    CURLOPT_RETURNTRANSFER => true
]);
$response = curl_exec($curl);
curl_close($curl);
echo $response;
?>
Notes and Best Practices
- Do not exceed the 5,000 character prompt limit.
- Cached responses return instantly and do not consume additional provider requests.
- If you receive 502, retry after a short pause; heavy load may temporarily route requests to fallback providers.
- Keep your token secret and never publish it in client-side code.
Final Thoughts
I built this service because I enjoy creating practical tools that help other developers and builders. The goal is reliability and usefulness rather than hype. The combination of a local model, a context-aware caching layer, and a robust fallback chain provides a very practical API for many real-world use cases.
If you have ideas, feedback, or feature requests, I’d appreciate hearing them.
Sincerely,
Mihajlo