Evaluate with Selene

Get precise judgments on your AI app's performance. Run evals with the most accurate LLM Judges on the market.

Selene models

Explore the right size and implementation methods for your evaluation needs.
Optimized for speed
Selene 1 Mini
The best evaluation model of its size (8B). Suitable for running evals at inference time.
Industry-leading accuracy
Selene 1
The best model for evaluation on the market. Capable of accurately judging a wide variety of eval tasks, as well as adapting to custom eval criteria. Suitable for pre-production evals.
Chart: Selene models compared on cost vs. intelligence.

A new standard for AI evaluations

01

State-of-the-art models

Selene outperforms frontier models on commonly-used evaluation benchmarks, making it the most accurate and reliable model for evaluation.

02

Customize to your use case

Make your evals more fine-grained, format your score as you wish, and fit eval criteria to your use case with few-shot examples in our Eval Copilot (beta).

03

Accurate scores, actionable critiques

Designed for straightforward integration into existing workflows. Use our API to generate accurate eval scores with actionable critiques.
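To make the integration concrete, here is a minimal sketch of what a single eval request could look like from Python. The endpoint URL, model identifier, field names, and response shape below are illustrative assumptions, not the documented API schema; check the docs for the real one.

```python
# Illustrative only: the endpoint path, model id, request fields, and response
# shape are assumptions, not Atla's documented API. See the docs for the real schema.
import os
import requests

API_URL = "https://api.atla-ai.com/v1/eval"  # hypothetical endpoint

def evaluate(model_input: str, model_output: str, criteria: str) -> dict:
    """Send one input/output pair to Selene and return its score and critique."""
    response = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {os.environ['ATLA_API_KEY']}"},
        json={
            "model": "selene-1",       # assumed model identifier
            "input": model_input,      # the prompt your app received
            "output": model_output,    # the response your app produced
            "criteria": criteria,      # what Selene should judge
        },
        timeout=30,
    )
    response.raise_for_status()
    return response.json()  # e.g. {"score": 4, "critique": "..."}

result = evaluate(
    model_input="What is the capital of France?",
    model_output="The capital of France is Paris.",
    criteria="Is the answer factually correct and concise?",
)
print(result["score"], result["critique"])
```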

Introducing Selene 1: the world’s best LLM-as-a-Judge

Evaluating with Selene

Custom metrics

Use Eval Copilot (beta) to easily align eval prompts to your custom use case.

Read more in our docs
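As a rough illustration, a custom metric built this way boils down to a name, a score format, and a few labeled examples that show the judge what good and bad look like. The structure below is an assumed sketch, not Eval Copilot's actual output format.

```python
# Illustrative structure only: field names and layout are assumptions, not
# Eval Copilot's real export format.
custom_metric = {
    "name": "support_tone",
    "description": "Judge whether a customer-support reply is empathetic and on-brand.",
    "scoring": {"type": "integer", "min": 1, "max": 5},  # format the score as you wish
    "few_shot_examples": [
        {
            "input": "My order arrived damaged.",
            "output": "That's frustrating, and I'm sorry. A replacement ships today.",
            "score": 5,
            "critique": "Acknowledges the problem, apologizes, and resolves it.",
        },
        {
            "input": "My order arrived damaged.",
            "output": "Please consult the returns policy.",
            "score": 2,
            "critique": "No empathy; pushes the work back onto the customer.",
        },
    ],
}
```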

Pre-built metrics

Use these to get started on common eval use cases, such as detecting hallucinations in RAG applications or comparing answers to a ground truth.

Relevance
Correctness
Helpfulness
Faithfulness
Logical coherence
Conciseness
Read more in our docs
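For example, a faithfulness check on a RAG answer might look like the sketch below. The endpoint, payload fields, and metric identifier are illustrative assumptions; the docs list the real pre-built metric names.

```python
# Illustrative only: the endpoint, payload fields, and the "faithfulness" metric id
# are assumptions; consult the docs for the real pre-built metrics.
import os
import requests

context = "Selene 1 Mini is an 8B-parameter evaluation model."
answer = "Selene 1 Mini has 70B parameters."  # not supported by the context

response = requests.post(
    "https://api.atla-ai.com/v1/eval",  # hypothetical endpoint
    headers={"Authorization": f"Bearer {os.environ['ATLA_API_KEY']}"},
    json={
        "model": "selene-1",
        "metric": "faithfulness",  # pre-built metric (assumed identifier)
        "input": "How large is Selene 1 Mini?",
        "output": answer,
        "context": context,        # retrieved passages the answer should be grounded in
    },
    timeout=30,
)
response.raise_for_status()
# Expect a low score and a critique flagging the unsupported "70B" claim.
print(response.json())
```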

Boost your GenAI accuracy

Run evals with Selene 1 and Selene 1 Mini
Custom eval metric deployment using Eval Copilot (beta)
Free credits & usage-based pricing
Docs & guides