We still need more information from the comparison tests (Alban will run them when he has time).
Criteria (a minimal measurement sketch follows this list):
task
length of the processed documents (context window, in number of tokens)
price
response time
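To compare models on these criteria, something like the sketch below could collect response time and token usage for a single prompt. It is only a sketch: it assumes the official openai Python SDK (v1+) and an OPENAI_API_KEY in the environment, and the model name and prompt are illustrative. Price is not hard-coded; it can be estimated afterwards from the token counts and each provider's current price sheet.

```python
import time
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def run_comparison(model: str, prompt: str) -> dict:
    """Send one prompt and record response time plus token usage.

    Price can then be estimated offline from the token counts and the
    provider's current price sheet (not hard-coded here).
    """
    start = time.perf_counter()
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    elapsed = time.perf_counter() - start
    usage = response.usage
    return {
        "model": model,
        "response_time_s": round(elapsed, 2),
        "prompt_tokens": usage.prompt_tokens,        # size of the processed document
        "completion_tokens": usage.completion_tokens,  # length of the generated text
        "answer": response.choices[0].message.content,
    }

if __name__ == "__main__":
    print(run_comparison("gpt-4o-mini", "Summarise: ..."))
```

Running the same prompt through each candidate model and comparing the returned dictionaries gives a first rough view of the task quality, token volume, and response-time trade-offs listed above.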
Best-performing models (a rough selection sketch follows this list):
GPT-4o → the best overall performance
GPT-4o mini → very cheap and very fast, with good performance
Mistral Large → slightly worse than GPT-4o, but slightly cheaper
Groq-hosted models → ideal when the text to generate is long, thanks to very fast token generation
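As a rough summary of these recommendations, the sketch below encodes the selection logic the notes imply. The flags and returned labels are illustrative assumptions (not API model IDs), and the trade-offs should be confirmed by the comparison tests above.

```python
# Rough sketch of the model-selection logic implied by the notes above.
def pick_model(long_output: bool, low_budget: bool, top_quality: bool) -> str:
    if long_output:
        return "Groq-hosted model"  # very fast generation for long outputs
    if low_budget:
        return "GPT-4o mini"        # very cheap, very fast, good performance
    if top_quality:
        return "GPT-4o"             # best overall performance
    return "Mistral Large"          # close to GPT-4o, slightly cheaper

print(pick_model(long_output=False, low_budget=True, top_quality=False))  # GPT-4o mini
```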