
OpenAI compatible API

The GenAI4Science portal has built-in compatibility with the OpenAI Chat Completions API, making it possible to use a wide range of existing OpenAI-based tooling.

API URLs

GenAI4Science provides two OpenAI-compatible API endpoints.

Default endpoint

This endpoint provides access to all installed models, making it ideal for testing and exploring the capabilities of each model or for applications with limited API call volumes.

Note that this endpoint is rate-limited to ensure stability and protect the user interface.

The default OpenAI API base URL is:

https://genai.science-cloud.hu/api/
You can authenticate your API requests using the Bearer Token mechanism. Obtain your API key from Settings > Account in the Portal.
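
You can verify your key by listing the available models. A minimal sketch using the requests library (it assumes the endpoint exposes the standard OpenAI "GET /models" route and response format, which the Python example further below also relies on):

import requests

# Your key from Settings > Account in the Portal.
API_KEY = "sk-9a..."

# Standard Bearer token authentication; /models lists the installed models.
response = requests.get(
    "https://genai.science-cloud.hu/api/models",
    headers={"Authorization": f"Bearer {API_KEY}"},
)
response.raise_for_status()
for model in response.json()["data"]:
    print(model["id"])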

Performance endpoint

The GenAI4Science portal is currently optimized to let users try out multiple models. Since not every model fits into GPU memory at the same time, there may be a slight delay when switching between models. Under intensive API load, this model switching noticeably slows down processing.

This endpoint is separated from the user interface and provides models with dedicated GPU memory. Embeddings are only supported through this endpoint.

Please contact us to request access if you would like to make bulk API calls for an extended period of time or use the API for a production service. Briefly describe your use case and specify the required models.
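
Once access is granted, embeddings can be requested in the standard OpenAI format. A minimal sketch (the performance endpoint base URL and the embedding model name below are placeholders; use the values provided with your access):

from openai import OpenAI

# Placeholder values: the actual endpoint URL and the available embedding
# models are provided when performance-endpoint access is granted.
client = OpenAI(base_url="https://<performance-endpoint>/api/", api_key="sk-9a...")

response = client.embeddings.create(
    model="<embedding-model>",
    input=["GenAI4Science provides OpenAI-compatible endpoints."],
)
print(len(response.data[0].embedding))  # dimensionality of the vector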

Context Window

The OpenAI API does not provide an option for setting the context length. All OpenAI API endpoints offer a minimum context window of 25k tokens for all models, with the following exceptions:

  • llama4:maverick - 4k tokens
  • codellama:13B - 12k tokens
  • mistral-small3.2:24b, qwen3:32b - 16k tokens
  • gpt-oss:120b - 50k tokens
  • gemma3:27b, qwen3-vl:30b-a3b - 64k tokens

We are continuously working on expanding the context window. Contact us if you require a larger context window.
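
Since the context length cannot be set through the API, prompts that exceed a model's window may be truncated by the backend. A minimal client-side budget check, using a rough 4-characters-per-token heuristic (the limits mirror the list above; for exact counts, use the model's own tokenizer):

# Approximate context windows in tokens; 25k is the default minimum.
CONTEXT_WINDOWS = {
    "llama4:maverick": 4_000,
    "codellama:13B": 12_000,
    "mistral-small3.2:24b": 16_000,
    "qwen3:32b": 16_000,
    "gpt-oss:120b": 50_000,
    "gemma3:27b": 64_000,
    "qwen3-vl:30b-a3b": 64_000,
}

def fits_context(model: str, prompt: str, reserve: int = 1_000) -> bool:
    """Rough check that a prompt fits, leaving `reserve` tokens for the reply."""
    window = CONTEXT_WINDOWS.get(model, 25_000)
    estimated_tokens = len(prompt) / 4  # crude heuristic, not a tokenizer
    return estimated_tokens + reserve <= window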

Limitations

Each backend instance can handle a limited number of concurrent requests per model. Once this limit is reached, new requests wait in a backend queue. When the queue is full, the following error message is returned:

Ollama: 503, message='Service Unavailable'

After a short wait, try again or choose another model.

To protect the UI from overload, we have implemented a rate limit on the default endpoint. When it is exceeded, the API returns:

"Rate limit exceeded. Please try again later."

After a short wait, try again or request access to the performance endpoint.
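
Both errors are transient, so a short retry with backoff usually suffices. A minimal sketch using the official openai Python client (the backoff parameters are illustrative):

import time

from openai import InternalServerError, OpenAI, RateLimitError

client = OpenAI(base_url="https://genai.science-cloud.hu/api/", api_key="sk-9a...")

def chat_with_retry(messages, model="llama3.1:8b", attempts=5):
    """Retry on 503 (full backend queue) and 429 (rate limit)."""
    for attempt in range(attempts):
        try:
            return client.chat.completions.create(model=model, messages=messages)
        except (InternalServerError, RateLimitError):
            time.sleep(2 ** attempt)  # exponential backoff: 1s, 2s, 4s, ...
    raise RuntimeError("Backend still unavailable after retries")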

Usage

To invoke the OpenAI-compatible API endpoints, use the same request format as the OpenAI API.

Developer tools, extensions

Set the OpenAI API base URL and the API key. Unless a tool specifically asks for it, do not include the "chat/completions" path, only the base URL.

Continue

Continue is a leading open-source AI code assistant for VS Code and JetBrains. To install it, see its documentation.

Configuration

See the following model configuration example:

models:
  - name: qwen3-coder:30b
    provider: openai
    model: qwen3-coder:30b
    apiBase: https://genai.science-cloud.hu/api/
    apiKey: sk-...
    roles:
      - chat
      - apply
      - autocomplete

cURL

Set your API key in the "Authorization" header, after the "Bearer" prefix:

curl https://genai.science-cloud.hu/api/chat/completions \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer sk-9a..." \
    -d '{
        "model": "llama3.1:8b",
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            {
                "role": "user",
                "content": "Hello!"
            }
        ]
    }'

Python

pip install openai

Set the base URL and your API key when creating the client:

from openai import OpenAI

client = OpenAI(base_url="https://genai.science-cloud.hu/api/", api_key="sk-9a...")

# List the models available to your account.
print(client.models.list())

# Send a chat completion request in the standard OpenAI format.
response = client.chat.completions.create(
  model="llama3.1:8b",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is an LLM?"}
  ]
)

print(response)
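
Streaming also works through the compatibility layer; pass stream=True and iterate over the returned chunks:

# Reuses the client created above; prints tokens as they are generated.
stream = client.chat.completions.create(
  model="llama3.1:8b",
  messages=[{"role": "user", "content": "Explain tokenization in one paragraph."}],
  stream=True,
)
for chunk in stream:
  delta = chunk.choices[0].delta.content
  if delta:
    print(delta, end="", flush=True)
print()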

More details: https://ollama.com/blog/openai-compatibility