AI Rest-API – A Fully Free Service
Introduction
Hello everyone. Today I’m opening access to something I’ve been building, testing, and tuning for months: my own AI Rest-API, completely free to use. It is an attempt to provide a fast, reliable, privacy-friendly, and developer-oriented AI service that anyone can integrate into apps, tools, automations, or experiments without a paywall. I designed it to be practical and dependable rather than a demo that falls apart under real load.
Core Architecture
Local, Customized GPT4ALL
The primary engine is a fully local GPT4ALL instance, deeply customized and fine-tuned to behave like a production-ready AI endpoint.
Running locally gives complete independence from cloud providers, removes third-party exposure of data, and provides predictable performance for many common tasks.
Advanced Caching with Prompt-Relay Logic
The API uses an advanced caching layer that does more than simple key-value caching. It understands prompt similarity and context logic:
- Similar prompts that lead to the same logical answer return an instant cached response.
- Exact prompts return immediately with near-zero latency from cache.
- Cached responses do not consume fallback provider calls.
- The cache reduces unnecessary computation and keeps the system responsive under load.
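The internals of the real caching layer aren't published, but the behavior described above can be sketched in a few lines. The following is a minimal illustration, not the actual implementation: exact prompts hit via a hash of the normalized text, and "similar" prompts hit when their token sets overlap above a threshold (token-set Jaccard similarity is a stand-in for whatever metric the service actually uses).

```python
import hashlib
import re

class PromptCache:
    """Sketch of a similarity-aware prompt cache (illustrative only)."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold
        self.entries = {}  # normalized-prompt hash -> (token set, cached response)

    @staticmethod
    def _normalize(prompt):
        # Collapse whitespace and lowercase so trivially different prompts match.
        return re.sub(r"\s+", " ", prompt.strip().lower())

    def _key(self, prompt):
        return hashlib.sha256(self._normalize(prompt).encode()).hexdigest()

    def get(self, prompt):
        key = self._key(prompt)
        if key in self.entries:
            return self.entries[key][1]  # exact hit: near-zero latency
        tokens = set(self._normalize(prompt).split())
        for other_tokens, response in self.entries.values():
            union = tokens | other_tokens
            if union and len(tokens & other_tokens) / len(union) >= self.threshold:
                return response  # similar-prompt hit: same logical answer
        return None  # miss: caller must compute (and then put) the response

    def put(self, prompt, response):
        tokens = set(self._normalize(prompt).split())
        self.entries[self._key(prompt)] = (tokens, response)
```

A cache hit short-circuits before any provider is contacted, which is why cached responses never consume fallback calls.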
Fallback Chain: Local → OpenAI → Grok → Deeplink
Local models can be resource-intensive under high concurrency. To keep the API stable, I implemented an automated fallback chain:
- Primary: local GPT4ALL
- Fallback A: OpenAI
- Fallback B: Grok
- Fallback C: Deeplink
- If all providers are busy, the API returns 502. In that case, retry after a few seconds.
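The chain itself is straightforward to picture. Here is a hedged sketch of the pattern, using the provider names from the list above; how a provider actually signals "busy" is an assumption (modeled here as a raised exception):

```python
def call_with_fallback(prompt, providers):
    """Try each provider in order; return the first successful response.

    `providers` is an ordered list of (name, callable) pairs standing in for
    local GPT4ALL, OpenAI, Grok, and Deeplink. A provider that is busy or
    failing raises; if every provider fails, a 502-style payload is returned
    so the caller knows to retry shortly.
    """
    for name, provider in providers:
        try:
            return {"provider": name, "response": provider(prompt)}
        except Exception:
            continue  # provider busy or failed; fall through to the next one
    return {"status_code": 502, "error": True, "status_msg": "All providers busy"}
```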
How to Use the API
Authentication
Every request must include your personal token as a query parameter:
https://api.mihajlo.mk/v1/ai-text?token=YOUR_TOKEN
You can find your token in your profile at https://mihajlo.mk/. Keep it private. Requests with an invalid token are rejected.
Endpoint
POST https://api.mihajlo.mk/v1/ai-text?token=YOUR_TOKEN
The endpoint accepts multipart/form-data.
Request Parameters
| Field | Type | Required | Description |
|---|---|---|---|
| prompt | text | Yes | Up to 5000 characters. The main input. |
| temperature | number | No | Creativity control (0–2). Defaults to provider settings. |
| system | text | No | Optional system instruction (e.g. "You are a creative fitness coach"). |
| tokens | int | No | Approximate max tokens for the response. |
Example cURL
curl -X POST "https://api.mihajlo.mk/v1/ai-text?token=YOUR_TOKEN" \
-F "prompt=Write a 3 day beginner push/pull/legs program" \
-F "temperature=0.8" \
-F "system=You are a friendly certified trainer" \
-F "tokens=800"
Response Format
{
  "error": false,
  "status_text": "OK",
  "status_code": 200,
  "status_msg": "OK",
  "ai-text": {
    "prompt": "Write a 3 day beginner push/pull/legs program",
    "response": "Day 1 - Push ... (full AI response)"
  }
}
Validation errors return 422. If all resources are busy, the API returns 502. In the latter case, retry after a few seconds.
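Client-side handling of this shape is simple. The sketch below parses a raw response body using the documented fields (note the hyphenated "ai-text" key, which rules out attribute-style access in most languages):

```python
import json

def parse_ai_text(raw):
    """Parse a /v1/ai-text response body.

    Returns the generated text on success; raises on the documented error
    statuses (422 validation, 502 all-resources-busy).
    """
    body = json.loads(raw)
    code = body.get("status_code")
    if code == 200 and not body.get("error"):
        return body["ai-text"]["response"]  # hyphenated key: use subscript access
    if code == 502:
        raise RuntimeError("All resources busy; retry after a few seconds")
    raise ValueError(f"API error {code}: {body.get('status_msg')}")
```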
Error Codes
| HTTP Status | Meaning | Action |
|---|---|---|
| 401 | Missing or invalid token | Verify token on your profile. |
| 405 | Wrong HTTP method | Only POST is supported. |
| 422 | Validation error (prompt missing, etc.) | Fix the payload and retry. |
| 502 | Upstream failure (all resources failed) | Wait a few seconds and submit again. |
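Only 502 is worth retrying automatically; 401, 405, and 422 indicate a caller mistake that a retry won't fix. A minimal retry helper might look like this, where `send` is a hypothetical wrapper around your HTTP client that returns a (status, body) pair:

```python
import time

def post_with_retry(send, retries=3, delay=2.0):
    """Resubmit on 502 (all upstream providers busy); pass other statuses through.

    `send` is any zero-argument callable returning (status_code, body),
    e.g. a closure around your HTTP library's POST call.
    """
    for _ in range(retries):
        status, body = send()
        if status != 502:
            return status, body  # success, or a non-retryable caller error
        time.sleep(delay)  # brief pause before resubmitting, per the docs
    return status, body  # still 502 after all retries; surface it to the caller
```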
Integration Example: PHP
<?php
$curl = curl_init();
$data = [
    'prompt' => 'Generate a short motivational quote',
    'temperature' => 1.0,
    'system' => 'You speak like a calm mentor',
    'tokens' => 300
];
curl_setopt_array($curl, [
    CURLOPT_URL => "https://api.mihajlo.mk/v1/ai-text?token=YOUR_TOKEN",
    CURLOPT_POST => true,
    CURLOPT_POSTFIELDS => $data, // array payload is sent as multipart/form-data
    CURLOPT_RETURNTRANSFER => true
]);
$response = curl_exec($curl);
curl_close($curl);
echo $response;
?>
Notes and Best Practices
- Do not exceed the 5,000 character prompt limit.
- Cached responses return instantly and do not consume additional provider requests.
- If you receive 502, retry after a short pause; heavy load may temporarily route requests to fallback providers.
- Keep your token secret and never publish it in client-side code.
Final Thoughts
I built this service because I enjoy creating practical tools that help other developers and builders. The goal is reliability and usefulness rather than hype. The combination of a local model, a context-aware caching layer, and a robust fallback chain provides a very practical API for many real-world use cases.
If you have ideas, feedback, or feature requests, I’d appreciate hearing them.
Sincerely,
Mihajlo