How it works

This open-source LLM comparison tool by SUPA lets you enter prompts and compare the performance of two language models in a blind test. Select two models, then test them across prompts and scenarios tailored to your domain. Each round presents the two responses in anonymized form for you to evaluate, helping you gain a deeper understanding of each model's capabilities. All collected data will be published to contribute to open-source research.
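
For readers curious about what a blind round involves, here is a minimal Python sketch of the idea. It is an illustration under stated assumptions, not SUPA's actual implementation: the function names `blind_round` and `record_vote`, the "Response A" / "Response B" labels, and the representation of a model as a plain function from prompt to response are all hypothetical.

```python
import random
from typing import Callable, Dict, Tuple

# Assumption: a "model" is any function that maps a prompt to a response string.
Model = Callable[[str], str]


def blind_round(
    prompt: str, models: Dict[str, Model], rng: random.Random
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Run one blind round.

    Returns the anonymized responses (label -> text) and a hidden key
    (label -> model name) that is only consulted after the vote.
    """
    if len(models) != 2:
        raise ValueError("exactly two models are required")
    names = list(models)
    rng.shuffle(names)  # random order, so a label never hints at the model
    labels = ["Response A", "Response B"]
    key = dict(zip(labels, names))
    responses = {label: models[name](prompt) for label, name in key.items()}
    return responses, key


def record_vote(tally: Dict[str, int], key: Dict[str, str], chosen_label: str) -> None:
    """Credit the model behind the chosen label once the evaluator has voted."""
    winner = key[chosen_label]
    tally[winner] = tally.get(winner, 0) + 1
```

The evaluator only ever sees the labelled responses; the key stays hidden until the vote is recorded, which is what keeps the comparison blind.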

Learn more

Select models to compare

Model 1

Model 2