| Developer | Model ID | Context length | View on Hugging Face |
|---|---|---|---|
| DeepSeek | DeepSeek-V3-0324 | 8k tokens | Model card |
| OpenAI | Whisper-Large-V3 | N/A | Model card |
| Meta | Llama-4-Scout-17B-16E-Instruct | 8k tokens | Model card |
| Meta | Llama-4-Maverick-17B-128E-Instruct | 8k tokens | Model card |
| Qwen | Qwen2-Audio-7B-Instruct | N/A | Model card |

| Developer | Model ID | Context length | View on Hugging Face |
|---|---|---|---|
| DeepSeek | DeepSeek-R1 | 16k tokens | Model card |
| DeepSeek | DeepSeek-R1-Distill-Llama-70B | 128k tokens | Model card |
| Meta | Meta-Llama-3.3-70B-Instruct | 128k tokens | Model card |
| Meta | Meta-Llama-3.2-3B-Instruct | 8k tokens | Model card |
| Meta | Meta-Llama-3.2-1B-Instruct | 16k tokens | Model card |
| Meta | Meta-Llama-3.1-405B-Instruct | 16k tokens | Model card |
| Meta | Meta-Llama-3.1-8B-Instruct | 16k tokens | Model card |
| Meta | Meta-Llama-Guard-3-8B | 8k tokens | Model card |
| Qwen | QwQ-32B | 16k tokens | Model card |
| Tokyotech-llm | Llama-3.1-Swallow-8B-Instruct-v0.3 | 16k tokens | Model card |
| Other | E5-Mistral-7B-Instruct | 4k tokens | Model card |