
Models

The GenAI4Science portal is currently optimized for serving multiple models, so that users can try as many as possible. Various model sizes from the DeepSeek, Gemma, Llama, Mistral, and Qwen model families are available, including versions optimized for coding and for visual recognition.

Since not every model fits into GPU memory at the same time, you may have to wait briefly when switching models.

Each backend instance can handle a limited number of concurrent requests per model (usually 4). Once this limit is reached, new requests wait in the backend queue. When the queue is full, the following error message is displayed:

Ollama: 503, message='Service Unavailable'

After a short wait, press the "Regenerate" button below the message to try again. If the message persists, it is advisable to choose another model.
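When calling the backend programmatically rather than through the UI, the same wait-and-retry advice can be automated. A minimal sketch with exponential backoff, assuming only a generic request function that reports an HTTP status (the helper name and backoff values are illustrative, not part of the portal's API):

```python
import random
import time

def retry_on_unavailable(request_fn, max_attempts=5, base_delay=1.0):
    """Call request_fn, retrying with exponential backoff when it
    returns HTTP 503 (i.e. the backend queue for the model is full)."""
    for attempt in range(max_attempts):
        status, body = request_fn()
        if status != 503:
            return status, body
        # Exponential backoff with a little jitter before the next attempt.
        delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
        time.sleep(delay)
    return status, body  # still unavailable after all attempts

# Example with a stub that fails twice, then succeeds:
calls = {"n": 0}
def fake_request():
    calls["n"] += 1
    return (503, "Service Unavailable") if calls["n"] < 3 else (200, "ok")

print(retry_on_unavailable(fake_request, base_delay=0.01))  # (200, 'ok')
```

If the request still fails after several attempts, switching to a less busy model (as advised above) is usually faster than continuing to retry.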

Click the active-users indicator in the users menu to see which models are currently in use.

To protect the UI from overload, we have implemented a rate limit. If it is exceeded, the following message is displayed:

"Rate limit exceeded. Please try again later."

After a short wait, press the "Regenerate" button to try again.

Available models

Our models are regularly updated, and the list below may not reflect the currently available models. For the most up-to-date information on available models and their brief descriptions, please log in to access the UI. If you require a specific model that is not currently installed, please contact us.

DeepSeek

deepseek-r1:32b, 70b

DeepSeek’s first-generation reasoning models, achieving performance comparable to OpenAI-o1 across math, code, and reasoning tasks.

Distilled models

The DeepSeek team has demonstrated that the reasoning patterns of larger models can be distilled into smaller models, resulting in better performance than the reasoning patterns discovered through RL on small models.

GenAI4Science supports DeepSeek-R1-Distill-Llama-70B and DeepSeek-R1-Distill-Qwen-32B.

License

The model weights are licensed under the MIT License. The DeepSeek-R1 series supports commercial use and allows any modifications and derivative works, including, but not limited to, distillation for training other LLMs. Please note that:

The Qwen distilled models are derived from the Qwen-2.5 series, originally licensed under the Apache 2.0 License, and are finetuned with 800k samples curated with DeepSeek-R1.

The Llama 8B distilled model is derived from Llama3.1-8B-Base and is originally licensed under the Llama 3.1 license.

The Llama 70B distilled model is derived from Llama3.3-70B-Instruct and is originally licensed under the Llama 3.3 license.

Google Gemma

gemma3:27B

Gemma is a lightweight family of models from Google built on Gemini technology. The Gemma 3 models are multimodal - processing text and images - and feature a 128K context window with support for over 140 languages. They excel in tasks like question answering, summarization, and reasoning, while their compact design allows deployment on resource-limited devices.

The model has good knowledge of the Hungarian language.

Meta Llama

The most capable openly available LLM to date

llama3.1:8B

The upgraded version of the 8B model is multilingual and has a significantly longer context length of 128K, state-of-the-art tool use, and overall stronger reasoning capabilities. This enables Meta’s models to support advanced use cases, such as long-form text summarization, multilingual conversational agents, and coding assistants.

Meta also has made changes to their license, allowing developers to use the outputs from Llama models to improve other models.

Model evaluations

For this release, Meta evaluated performance on over 150 benchmark datasets that span a wide range of languages. In addition, Meta performed extensive human evaluations comparing Llama 3.1 with competing models in real-world scenarios. Meta's experimental evaluation suggests that its flagship 405B model is competitive with leading foundation models, including GPT-4, GPT-4o, and Claude 3.5 Sonnet, across a range of tasks. Additionally, Meta's smaller models are competitive with closed and open models that have a similar number of parameters.

llama3.2:3B

The Meta Llama 3.2 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction-tuned generative models in 1B and 3B sizes (text in/text out). The Llama 3.2 instruction-tuned text only models are optimized for multilingual dialogue use cases, including agentic retrieval and summarization tasks. They outperform many of the available open source and closed chat models on common industry benchmarks.

The 3B model outperforms the Gemma 2 2.6B and Phi 3.5-mini models on tasks such as:

  • Following instructions
  • Summarization
  • Prompt rewriting
  • Tool use

Supported Languages: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai are officially supported. Llama 3.2 has been trained on a broader collection of languages than these 8 supported languages.

llama3.3:70B

The model offers performance similar to the Llama 3.1 405B model.

The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction-tuned generative model in 70B (text in/text out). The Llama 3.3 instruction-tuned, text-only model is optimized for multilingual dialogue use cases.

Supported languages: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.

llama4:scout

The Llama 4 collection consists of natively multimodal AI models that enable text and multimodal experiences.

  • Input: multilingual text, image
  • Output: multilingual text, code

Llama 4 Scout (a 109B-parameter MoE model with 17 billion active parameters and 16 experts) is the best multimodal model in the world in its class and is more powerful than all previous-generation Llama models. Additionally, Llama 4 Scout offers an industry-leading context window of 10M tokens and delivers better results than Gemma 3, Gemini 2.0 Flash-Lite, and Mistral 3.1 across a broad range of widely reported benchmarks.

codellama:13B

Code Llama is a model for generating and discussing code, built on top of Llama 2. It is designed to make workflows faster and more efficient for developers and to make it easier for people to learn how to code. It can generate both code and natural language about code. Code Llama supports many of the most popular programming languages used today, including Python, C++, Java, PHP, TypeScript (JavaScript), C#, Bash, and more.

GenAI4Science supports the 13B model size.

More information: How to prompt Code Llama

Mistral

Mistral AI is a French company specializing in artificial intelligence products.

mistral-large:123B

Mistral Large 2 is Mistral's new flagship model, significantly more capable in code generation, mathematics, and reasoning, with a 128k context window and support for dozens of languages.

Mistral-Large-Instruct-2407 is an advanced dense Large Language Model (LLM) of 123B parameters with state-of-the-art reasoning, knowledge, and coding capabilities. Key features:

  • Multi-lingual by design: Dozens of languages supported, including English, French, German, Spanish, Italian, Chinese, Japanese, Korean, Portuguese, Dutch and Polish.
  • Proficient in coding: Trained on 80+ coding languages such as Python, Java, C, C++, JavaScript, and Bash. Also trained on more specific languages such as Swift and Fortran.
  • Agentic-centric: Best-in-class agentic capabilities with native function calling and JSON outputting.
  • Advanced Reasoning: State-of-the-art mathematical and reasoning capabilities.
  • Mistral Research License: Allows usage and modification for research and non-commercial usages.
  • Large Context: A large 128k context window.

mistral-small3.1:24B

Building upon Mistral Small 3, Mistral Small 3.1 (2503) adds state-of-the-art vision understanding and enhances long context capabilities up to 128k tokens without compromising text performance.

devstral:24B

Devstral (May 21, 2025) is an agentic LLM for software engineering tasks built under a collaboration between Mistral AI and All Hands AI 🙌. It excels at using tools to explore codebases, editing multiple files, and powering software engineering agents. The model achieves remarkable performance on SWE-bench, which positions it as the #1 open-source model.

It is finetuned from Mistral Small 3.1; the vision encoder was removed.

codestral:22B

Codestral (May 29, 2024) is Mistral AI's first-ever code model designed for code generation tasks. It is a 22B model, fluent in 80+ programming languages.

Codestral is trained on a dataset of over 80 programming languages, including Python, Java, C, C++, JavaScript, Swift, Fortran and Bash.

The model can complete coding functions, write tests, and complete any partial code using a fill-in-the-middle mechanism.

More information: https://mistral.ai/news/codestral/
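Since the portal's backend is Ollama-based, the fill-in-the-middle mechanism can be sketched as a generate request carrying both the code before the gap (the prompt) and the code after it (the suffix). A sketch only: the endpoint path follows Ollama's generate API, the host is deployment-specific, and the suffix field requires a backend version with FIM support.

```python
import json

def build_fim_request(prefix, suffix, model="codestral:22b"):
    """Build an Ollama-style /api/generate payload for fill-in-the-middle:
    the model is asked to complete the code between `prompt` and `suffix`."""
    return {
        "model": model,
        "prompt": prefix,   # code before the gap
        "suffix": suffix,   # code after the gap
        "stream": False,
    }

payload = build_fim_request(
    prefix="def add(a, b):\n    ",
    suffix="\n\nprint(add(2, 3))",
)
print(json.dumps(payload, indent=2))
# POST this JSON to http://<backend-host>/api/generate (host is deployment-specific).
```

The completion returned for the gap would be the function body, so the surrounding prefix and suffix are left untouched in the editor.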

Qwen

Qwen is a family of large language models developed by Alibaba Cloud.

qwen3

Qwen 3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. The flagship model, Qwen3-235B-A22B, achieves competitive results in benchmark evaluations of coding, math, general capabilities, etc., compared to other top-tier models such as DeepSeek-R1, o1, o3-mini, Grok-3, and Gemini-2.5-Pro. Additionally, the small MoE model, Qwen3-30B-A3B, outcompetes QwQ-32B, which uses 10 times as many activated parameters, and even a tiny model like Qwen3-4B can rival the performance of Qwen2.5-72B-Instruct.

  • Unique support for seamless switching between thinking mode (for complex logical reasoning, math, and coding) and non-thinking mode (for efficient, general-purpose dialogue) within a single model, ensuring optimal performance across various scenarios. Use "/no-think" to disable thinking.

    Tell me a random fun fact about the Roman Empire /no-think

  • Significant enhancement of its reasoning capabilities, surpassing the previous QwQ (in thinking mode) and Qwen2.5 instruct models (in non-thinking mode) on mathematics, code generation, and commonsense logical reasoning.

  • Superior human preference alignment, excelling in creative writing, role-playing, multi-turn dialogues, and instruction following, to deliver a more natural, engaging, and immersive conversational experience.

  • Expertise in agent capabilities, enabling precise integration with external tools in both thinking and non-thinking modes and achieving leading performance among open-source models in complex agent-based tasks.

  • Support of 100+ languages and dialects with strong capabilities for multilingual instruction following and translation.
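On the client side, the soft switch described above amounts to appending the tag to the user prompt before sending it. A minimal sketch (the helper name is ours; the portal simply passes the prompt text through to the model):

```python
def with_thinking(prompt, think=True):
    """Append the Qwen 3 soft-switch tag to disable thinking mode.
    With think=True the prompt is left unchanged, since thinking is the default."""
    return prompt if think else f"{prompt} /no-think"

print(with_thinking("Tell me a random fun fact about the Roman Empire", think=False))
# Tell me a random fun fact about the Roman Empire /no-think
```

The switch applies per request, so alternating between thinking and non-thinking turns within one conversation is possible.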