
Cloudflare Worker Endpoints

This document provides a detailed contract for the endpoints exposed by the Cloudflare worker.


1. RAG Endpoints

Vector Query

Performs a vector search query against the Vectorize index, with an optional reranking step.

  • Endpoint: POST /rag/query
  • Request Body:

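    | Field        | Type    | Required | Description                                                                 |
    | ------------ | ------- | -------- | --------------------------------------------------------------------------- |
    | query        | string  | Yes      | The text to search for.                                                     |
    | k            | integer | No       | The number of nearest neighbors to retrieve from Vectorize. Defaults to 10. |
    | rerank_top_n | integer | No       | The number of results to rerank. Defaults to 20.                            |
    | filters      | object  | No       | Key-value pairs for metadata filtering.                                     |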
  • Example Request:

    {
      "query": "Latest news on ceasefire talks",
      "k": 10,
      "rerank_top_n": 5,
      "filters": { "lang": "ar", "published_at_bucket": 202508 }
    }
    

  • Success Response (200 OK):

    {
      "ok": true,
      "timing_ms": { "embed": 210, "search": 350, "rerank": 180, "total": 740 },
      "results": [
        {
          "id": "article:<uuid>:chunk:0",
          "article_id": "...",
          "chunk_no": 0,
          "url": "https://example.com/article/123",
          "title": "Ceasefire Talks Progress",
          "snippet": "A short snippet of the article text...",
          "lang": "ar",
          "source": "SANA",
          "published_at_bucket": 202508,
          "vector_score": 0.87,
          "rerank_pos": 0
        }
      ]
    }
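
The request and response shapes above can be written down as types. The following is a minimal client-side sketch, not the worker's own code: `buildRagQuery` and `parseRagResponse` are hypothetical helper names, the base URL is whatever host the worker is deployed on, and the integer check reflects only what the field table states. The contract does not document an error response, so the parser simply rejects anything that is not `ok: true` with a `results` array.

```typescript
// Shapes transcribed from the request table and the example response above.
interface RagQueryRequest {
  query: string;
  k?: number;
  rerank_top_n?: number;
  filters?: Record<string, string | number>;
}

interface RagResult {
  id: string;
  article_id: string;
  chunk_no: number;
  url: string;
  title: string;
  snippet: string;
  lang: string;
  source: string;
  published_at_bucket: number;
  vector_score: number;
  rerank_pos: number;
}

interface RagQueryResponse {
  ok: boolean;
  timing_ms: { embed: number; search: number; rerank: number; total: number };
  results: RagResult[];
}

// Hypothetical helper: rejects a non-integer optional field.
function checkInteger(name: string, value: number | undefined): void {
  if (value !== undefined && !Number.isInteger(value)) {
    throw new Error(`${name} must be an integer`);
  }
}

// Hypothetical helper: returns the two arguments for fetch(url, init).
function buildRagQuery(baseUrl: string, req: RagQueryRequest) {
  if (req.query.trim() === "") {
    throw new Error("query is required");
  }
  checkInteger("k", req.k);
  checkInteger("rerank_top_n", req.rerank_top_n);
  return {
    url: `${baseUrl}/rag/query`,
    init: {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify(req),
    },
  };
}

// Hypothetical helper: narrows a decoded JSON body to the success shape.
function parseRagResponse(payload: unknown): RagQueryResponse {
  const p = payload as Partial<RagQueryResponse> | null;
  if (!p || p.ok !== true || !Array.isArray(p.results)) {
    throw new Error("unexpected /rag/query response");
  }
  return p as RagQueryResponse;
}
```

A caller would pass the result to `fetch(url, init)` and hand the decoded JSON body to `parseRagResponse`.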
    


2. LLM, Translation & ASR Endpoints

Protected Endpoints

These endpoints require an x-api-key header whose value matches the API_KEY secret configured in the worker.

| Endpoint       | Method | Model            | Description                               |
| -------------- | ------ | ---------------- | ----------------------------------------- |
| /llm/health    | GET    | -                | Health check for the LLM service.         |
| /llm/chat      | POST   | Llama 3.1 8B     | General-purpose chat and text generation. |
| /llm/translate | POST   | M2M100 1.2B      | Text translation between languages.       |
| /llm/asr       | POST   | Whisper v3 Turbo | Audio transcription (ASR).                |

Example Payloads

  • /llm/chat:
    { "messages": [{"role": "user", "content": "Summarize this for me..."}], "max_tokens": 256 }
    
  • /llm/translate:
    { "text": "قمة دمشق تبحث آخر التطورات", "target_lang": "en", "source_lang": "ar" }
    
  • /llm/asr: Send the raw audio file as the request body, or send a multipart form with a file field.
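
The payloads above could be assembled on the client roughly as follows. This is a sketch under assumptions: the helper names (`llmJson`, `chatRequest`, `translateRequest`, `asrRequest`) are hypothetical, the payload fields mirror the examples without knowing which of them are optional, and the default `application/octet-stream` content type for raw audio is a guess rather than something the contract states. Only the raw-body ASR variant is shown.

```typescript
// Field names mirror the example payloads; optionality is not documented.
interface ChatMessage {
  role: string;
  content: string;
}

interface PreparedRequest {
  url: string;
  init: {
    method: string;
    headers: Record<string, string>;
    body: string | Uint8Array;
  };
}

// Hypothetical helper: JSON POST that carries the x-api-key header.
function llmJson(baseUrl: string, apiKey: string, path: string, payload: unknown): PreparedRequest {
  return {
    url: `${baseUrl}${path}`,
    init: {
      method: "POST",
      headers: { "content-type": "application/json", "x-api-key": apiKey },
      body: JSON.stringify(payload),
    },
  };
}

function chatRequest(
  baseUrl: string,
  apiKey: string,
  messages: ChatMessage[],
  maxTokens: number,
): PreparedRequest {
  return llmJson(baseUrl, apiKey, "/llm/chat", { messages, max_tokens: maxTokens });
}

function translateRequest(
  baseUrl: string,
  apiKey: string,
  text: string,
  targetLang: string,
  sourceLang: string,
): PreparedRequest {
  return llmJson(baseUrl, apiKey, "/llm/translate", {
    text,
    target_lang: targetLang,
    source_lang: sourceLang,
  });
}

// Raw-body variant of /llm/asr. The content type is an assumption.
function asrRequest(
  baseUrl: string,
  apiKey: string,
  audio: Uint8Array,
  contentType: string = "application/octet-stream",
): PreparedRequest {
  return {
    url: `${baseUrl}/llm/asr`,
    init: {
      method: "POST",
      headers: { "content-type": contentType, "x-api-key": apiKey },
      body: audio,
    },
  };
}
```

Each helper returns a `url` and `init` pair that can be passed to `fetch(url, init)`. Keeping the header logic in one place makes it harder to forget x-api-key on a new call site.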

3. Development Endpoints

Development Only

These endpoints are for development and testing only. Each request must include the header x-dev-key: local.

  • POST /rag/dev/echo-embed: Returns the vector embedding for a given text.
  • POST /rag/dev/vectorize-smoke: Runs a full smoke test (embed, upsert, query) for a single item.
  • POST /rag/dev/upsert-batch: Inserts a batch of items into the Vectorize index.
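
A small wrapper can attach the development header to all three routes. This is a sketch only: `devRequest` is a hypothetical helper, and because the request bodies for these endpoints are not documented here, the payload is left entirely to the caller. Sending it as JSON is an assumption.

```typescript
// The three development routes listed above.
type DevPath =
  | "/rag/dev/echo-embed"
  | "/rag/dev/vectorize-smoke"
  | "/rag/dev/upsert-batch";

// Hypothetical helper: POST to a dev route with the x-dev-key: local header.
// The payload shape is undocumented, so it is passed through unchanged.
function devRequest(baseUrl: string, path: DevPath, payload: unknown) {
  return {
    url: `${baseUrl}${path}`,
    init: {
      method: "POST",
      headers: { "content-type": "application/json", "x-dev-key": "local" },
      body: JSON.stringify(payload), // JSON encoding is an assumption
    },
  };
}
```

Restricting `path` to the `DevPath` union means a mistyped route fails at compile time instead of returning a 404 from the worker.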