AI Rest-API – A Fully Free Service

Introduction

Hello everyone. Today I’m opening access to something I’ve been building, testing, and tuning for months: my own AI Rest-API, completely free to use. It aims to be a fast, reliable, privacy-friendly, developer-oriented AI service that anyone can integrate into apps, tools, automations, or experiments without a paywall. I designed it to be practical and dependable rather than a demo that falls apart under real load.

Core Architecture

Local, Customized GPT4ALL

The primary engine is a fully local GPT4ALL instance, deeply customized and fine-tuned to behave like a production-ready AI endpoint. Requests served by the local model do not depend on cloud providers and are not exposed to third parties, and performance is predictable for many common tasks. Only requests that overflow to the fallback providers described later leave the server.

Advanced Caching with Prompt-Relay Logic

The API uses an advanced caching layer that does more than simple key-value caching. It understands prompt similarity and context logic:

  • Similar prompts that lead to the same logical answer return an instant cached response.
  • Exact prompts return immediately with near-zero latency from cache.
  • Cached responses do not consume fallback provider calls.
  • The cache reduces unnecessary computation and keeps the system responsive under load.
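
The article does not describe how the similarity matching is implemented, so the following is only a minimal sketch of the two lookup paths listed above. It assumes prompts are normalized (lowercased, whitespace collapsed) for exact hits and compared with Python's `difflib.SequenceMatcher` for near matches. The class name `PromptCache` and the 0.9 threshold are hypothetical, not taken from the service.

```python
from difflib import SequenceMatcher
from typing import Dict, Optional


def normalize(prompt: str) -> str:
    """Lowercase and collapse whitespace so trivially different prompts share a key."""
    return " ".join(prompt.lower().split())


class PromptCache:
    """Illustrative cache: exact lookup first, then a similarity scan."""

    def __init__(self, threshold: float = 0.9):
        self.threshold = threshold  # hypothetical cutoff, not from the article
        self.entries: Dict[str, str] = {}

    def put(self, prompt: str, response: str) -> None:
        self.entries[normalize(prompt)] = response

    def get(self, prompt: str) -> Optional[str]:
        key = normalize(prompt)
        # Exact hit: a plain dictionary lookup.
        if key in self.entries:
            return self.entries[key]
        # Near hit: keep the best-scoring stored prompt, if it clears the threshold.
        best_score, best_response = 0.0, None
        for stored, response in self.entries.items():
            score = SequenceMatcher(None, key, stored).ratio()
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None
```

A character-level ratio is a crude stand-in for "same logical answer"; a real deployment would more likely compare embeddings, but the control flow (exact match, then similarity, then a miss) would stay the same.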

Fallback Chain: Local → OpenAI → Grok → Deeplink

Local models can be resource-intensive under high concurrency. To keep the API stable, I implemented an automated fallback chain:

  1. Primary: local GPT4ALL
  2. Fallback A: OpenAI
  3. Fallback B: Grok
  4. Fallback C: Deeplink
  5. If all providers are busy, the API returns 502. In that case, retry after a few seconds.
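
The chain above can be sketched as an ordered list of callables that are tried one after another. This is an illustration under assumptions, not the service's actual code: the `Provider` type, the `AllProvidersBusy` exception, and the convention that a provider either raises or returns `None` when it is busy are all hypothetical.

```python
from typing import Callable, Optional, Sequence, Tuple

# A provider is a (name, callable) pair; the callable returns text or None.
Provider = Tuple[str, Callable[[str], Optional[str]]]


class AllProvidersBusy(Exception):
    """Raised when every provider fails; the API reports this case as HTTP 502."""


def generate(prompt: str, providers: Sequence[Provider]) -> Tuple[str, str]:
    """Try each provider in order and return (provider_name, response)."""
    for name, call in providers:
        try:
            result = call(prompt)
        except Exception:
            continue  # assumption: any provider error counts as "busy"
        if result is not None:
            return name, result
    raise AllProvidersBusy("all providers failed")
```

With stub functions in place of real backends, `generate(prompt, [("local", local_model), ("openai", openai_call), ...])` returns the first successful answer together with the name of the provider that produced it.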

How to Use the API

Authentication

Every request must include your personal token as a query parameter:

https://api.mihajlo.mk/v1/ai-text?token=YOUR_TOKEN

You can find your token in your profile at https://mihajlo.mk/. Keep it private. Requests with an invalid token are rejected.

Endpoint

POST https://api.mihajlo.mk/v1/ai-text?token=YOUR_TOKEN

The endpoint accepts multipart/form-data.

Request Parameters

Field | Type | Required | Description
prompt | text | Yes | Up to 5000 characters. The main input.
temperature | number | No | Creativity control (0–2). Defaults to provider settings.
system | text | No | Optional system instruction (e.g. "You are a creative fitness coach").
tokens | int | No | Approximate max tokens for the response.
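
Checking fields on the client before sending can avoid a round trip that ends in a 422. The sketch below mirrors the limits in the parameter list; the function name `validate_fields` is hypothetical. Treating the 0–2 range as inclusive and requiring `tokens` to be a positive integer are assumptions, since the article does not spell out the server's exact rules.

```python
def validate_fields(prompt, temperature=None, system=None, tokens=None):
    """Return a list of problems; an empty list means the fields look valid."""
    problems = []
    if not prompt:
        problems.append("prompt is required")
    elif len(prompt) > 5000:
        problems.append("prompt exceeds 5000 characters")
    # Assumption: the 0-2 range is inclusive on both ends.
    if temperature is not None and not 0 <= temperature <= 2:
        problems.append("temperature must be between 0 and 2")
    if system is not None and not isinstance(system, str):
        problems.append("system must be text")
    # Assumption: tokens must be a positive integer.
    if tokens is not None and (not isinstance(tokens, int) or tokens <= 0):
        problems.append("tokens must be a positive integer")
    return problems
```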

Example cURL

curl -X POST "https://api.mihajlo.mk/v1/ai-text?token=YOUR_TOKEN" \
-F "prompt=Write a 3 day beginner push/pull/legs program" \
-F "temperature=0.8" \
-F "system=You are a friendly certified trainer" \
-F "tokens=800"
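
For readers who prefer Python, a roughly equivalent request can be built with the standard library alone. This is a sketch rather than an official client: the helper names `build_request` and `send` are hypothetical, and the hand-rolled multipart encoding assumes plain text values with no quotes in the field names.

```python
import uuid
import urllib.parse
import urllib.request

API_URL = "https://api.mihajlo.mk/v1/ai-text"


def build_request(token: str, fields: dict) -> urllib.request.Request:
    """Build a multipart/form-data POST request without sending it."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        # One form-data part per field, separated by the boundary marker.
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
    parts.append(f"--{boundary}--\r\n")  # closing boundary
    body = "".join(parts).encode("utf-8")
    url = f"{API_URL}?{urllib.parse.urlencode({'token': token})}"
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> str:
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `send(build_request("YOUR_TOKEN", {"prompt": "Write a 3 day beginner push/pull/legs program", "temperature": 0.8, "tokens": 800}))` performs the same call as the cURL command. Note that `urlopen` raises `urllib.error.HTTPError` for 4xx and 5xx responses.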

Response Format

{
  "error": false,
  "status_text": "OK",
  "status_code": 200,
  "status_msg": "OK",
  "ai-text": {
    "prompt": "Write a 3 day beginner push/pull/legs program",
    "response": "Day 1 - Push ... (full AI response)"
  }
}

Validation errors return 422. If all resources are busy, the API returns 502. In the latter case, retry after a few seconds.
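
A small helper can pull the generated text out of the envelope shown above. This sketch assumes that error responses use the same JSON shape with `"error": true` and a non-200 `status_code`, which the article does not confirm; `ApiError` and `extract_text` are hypothetical names.

```python
import json


class ApiError(Exception):
    """Carries the status code reported inside the response body."""

    def __init__(self, status_code, message):
        super().__init__(f"{status_code}: {message}")
        self.status_code = status_code


def extract_text(raw: str) -> str:
    """Return the AI response text, or raise ApiError if the body reports a failure."""
    payload = json.loads(raw)
    if payload.get("error") or payload.get("status_code") != 200:
        raise ApiError(payload.get("status_code"), payload.get("status_msg"))
    # The key contains a hyphen, so it must be accessed as a string key.
    return payload["ai-text"]["response"]
```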

Error Codes

HTTP Status | Meaning | Action
401 | Missing or invalid token | Verify token on your profile.
405 | Wrong HTTP method | Only POST is supported.
422 | Validation error (prompt missing, etc.) | Fix the payload and retry.
502 | Upstream failure (all resources failed) | Wait a few seconds and submit again.
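
Since 502 is the only code in the table that calls for a retry, a client can wrap its request in a small retry loop. The sketch below assumes a caller-supplied `send` function that returns a `(status_code, body)` tuple; the name `call_with_retry`, the three attempts, and the linear 2-second backoff are illustrative choices, not values recommended by the service.

```python
import time


def call_with_retry(send, attempts=3, delay=2.0, sleep=time.sleep):
    """Call send() until it returns a status other than 502 or attempts run out.

    send must return a (status_code, body) tuple. The last result is returned
    either way, so the caller still sees a final 502 if every attempt failed.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        status, body = send()
        if status != 502:
            break  # success or a non-retryable error such as 401 or 422
        if attempt < attempts:
            sleep(delay * attempt)  # linear backoff: delay, 2*delay, ...
    return status, body
```

Other error codes (401, 405, 422) are returned immediately, because repeating the same request would fail the same way.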

Integration Example: PHP

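<?php
// Form fields; passing an array to CURLOPT_POSTFIELDS sends multipart/form-data.
$data = [
    'prompt' => 'Generate a short motivational quote',
    'temperature' => 1.0,
    'system' => 'You speak like a calm mentor',
    'tokens' => 300
];

$curl = curl_init();
curl_setopt_array($curl, [
    CURLOPT_URL => "https://api.mihajlo.mk/v1/ai-text?token=YOUR_TOKEN",
    CURLOPT_POST => true,
    CURLOPT_POSTFIELDS => $data,
    CURLOPT_RETURNTRANSFER => true
]);

// curl_exec() returns the response body as a string, or false on a transport error.
$response = curl_exec($curl);
if ($response === false) {
    $response = 'cURL error: ' . curl_error($curl);
}
curl_close($curl);

echo $response;
?>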

Notes and Best Practices

  • Do not exceed the 5,000-character prompt limit; longer prompts fail validation.
  • Cached responses return instantly and do not consume additional provider requests.
  • If you receive 502, every provider in the fallback chain was busy; retry after a short pause. Under heavy load, requests may also be routed temporarily to fallback providers.
  • Keep your token secret and never publish it in client-side code.
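
One common way to keep the token out of source code is to read it from an environment variable on the server side. This is a minimal sketch; the variable name `AI_API_TOKEN` and the function `load_token` are hypothetical.

```python
import os


def load_token(env_var: str = "AI_API_TOKEN") -> str:
    """Read the API token from the environment instead of hard-coding it."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"set the {env_var} environment variable first")
    return token
```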

Final Thoughts

I built this service because I enjoy creating practical tools that help other developers and builders. The goal is reliability and usefulness rather than hype. The combination of a local model, a context-aware caching layer, and a robust fallback chain provides a very practical API for many real-world use cases.

If you have ideas, feedback, or feature requests, I’d appreciate hearing them.

Sincerely,
Mihajlo

Mihajlo

I’m Mihajlo, a developer driven by curiosity, discipline, and the constant urge to create something meaningful. I share insights, tutorials, and free services to help others simplify their work and grow in the ever-evolving world of software and AI.