Introducing ‘Typhoon 2’: Advancing Thai LLMs, Driving the Future of AI for Thai
Typhoon 2’s Continuous Development
Typhoon is a large language model (LLM) designed specifically for the Thai language and under continuous development. It currently delivers high performance in Thai language processing, with fine-tuning focused on the specific context and nuances of Thai. Multiple versions have been released:
- Typhoon 1.0 or Typhoon-7B: This 7B parameter model comes in two versions: Pretrained and Instruction-tuned.
- Typhoon 1.5 and 1.5X: Featuring 8B and 70B parameter models, these versions offer increased performance and accuracy, rivaling leading models. They are designed for practical applications.
- Typhoon 2: This latest iteration offers the highest performance in the series, with five model sizes for diverse applications. It also introduces Multimodal Models capable of processing both image and audio data.
Typhoon 2 Model Sizes to Suit Every Need
Typhoon 2 is the latest release, offering a broader range of parameter sizes to support more use cases. It comes in five sizes:
- 1B (Model name: Typhoon2-1B)
- 3B (Model name: Typhoon2-3B)
- 7B (Model name: Typhoon2-7B)
- 8B (Model name: Typhoon2-8B)
- 70B (Model name: Typhoon2-70B)
The 1B and 3B models are ideal for less complex tasks like summarization and translation, especially in environments with limited processing resources. They are suitable for devices such as smartphones and low-power computers.
The 7B and 8B models are suitable for general tasks or proof-of-concept development before production deployment. They also cater to cost-conscious users and are ideal for workflows that prioritize simplicity and local context adaptation.
The 70B model is ideal for enterprise-level applications that demand exceptional accuracy or support mission-critical workflows. Its use requires higher processing resources, making it best suited for production environments capable of handling complex organizational requirements.
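The sizing guidance above can be summarized as a simple lookup. The sketch below is illustrative only: the `UseCase` labels and the `suggest_typhoon2_models` helper are hypothetical names invented for this example and are not part of any Typhoon tooling. Only the model names come from the list above.

```python
from enum import Enum


class UseCase(Enum):
    """Hypothetical labels for the three deployment profiles described above."""
    ON_DEVICE = "on_device"    # smartphones, low-power computers
    GENERAL = "general"        # general tasks, proofs of concept
    ENTERPRISE = "enterprise"  # mission-critical, accuracy-sensitive workloads


# Model names are taken from the article; the grouping mirrors its guidance.
MODELS_BY_USE_CASE = {
    UseCase.ON_DEVICE: ["Typhoon2-1B", "Typhoon2-3B"],
    UseCase.GENERAL: ["Typhoon2-7B", "Typhoon2-8B"],
    UseCase.ENTERPRISE: ["Typhoon2-70B"],
}


def suggest_typhoon2_models(use_case: UseCase) -> list[str]:
    """Return the model sizes suggested above for a deployment profile."""
    return list(MODELS_BY_USE_CASE[use_case])


if __name__ == "__main__":
    for case in UseCase:
        print(case.value, "->", ", ".join(suggest_typhoon2_models(case)))
```

In practice the choice also depends on latency, budget, and available hardware, so a lookup like this is a starting point rather than a rule.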
Key Features of Typhoon 2
- Enhanced Performance and Accuracy for Thai: Evaluated against benchmarks like ThaiExam and M3Exam, the model demonstrates superior Thai language processing compared to other open-source models.
- Improved Instruction Following: Compared to Typhoon 1.0 and Typhoon 1.5X, it shows better performance in following instructions, as assessed by IFEval-TH and MT-Bench standards.
- Increased Data Handling and Text Generation: With an expanded context length from 8,192 to 128,000 tokens, it can handle more complex information.
- State-of-the-Art Function Calling: Offers best-in-class function calling capabilities.
- Small Models for Mobile Devices: Includes small models for less complex tasks on mobile devices, such as summarization and translation.
- Safety Classifier Model (Preview): A safety classifier model adapted for the Thai language helps assess the appropriateness of input, providing alerts for potentially unsuitable content.
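To make the context-length figures above concrete, the sketch below checks whether a prompt fits in a given window while leaving room for the generated reply. The helper name `fits_in_context`, the `reserve_for_output` parameter, and the example token count are assumptions made for illustration. Real token counts depend on the model's tokenizer.

```python
OLD_CONTEXT_TOKENS = 8_192    # earlier context length cited above
NEW_CONTEXT_TOKENS = 128_000  # Typhoon 2 context length cited above


def fits_in_context(prompt_tokens: int, context_tokens: int,
                    reserve_for_output: int = 1_024) -> bool:
    """Return True if the prompt plus a reserved output budget fits in the window."""
    if prompt_tokens < 0 or reserve_for_output < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens + reserve_for_output <= context_tokens


if __name__ == "__main__":
    doc_tokens = 50_000  # hypothetical length of a long document, in tokens
    print(fits_in_context(doc_tokens, OLD_CONTEXT_TOKENS))  # False
    print(fits_in_context(doc_tokens, NEW_CONTEXT_TOKENS))  # True
    print(NEW_CONTEXT_TOKENS / OLD_CONTEXT_TOKENS)          # 15.625
```

Under these assumptions, a 50,000-token document that would have to be split into chunks for an 8,192-token window fits in a single 128,000-token request.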
Understanding the Key Benchmarks for Evaluating Typhoon 2
Typhoon 2 is evaluated on several key benchmarks, with particular attention to the Thai language. The results help developers identify where to refine and enhance the model.
- ThaiExam and M3Exam: Focus on assessing comprehension and problem-solving abilities in Thai.
- IFEval-EN and IFEval-TH: Evaluate instruction-following capabilities in both Thai and English.
- MT-Bench-EN and MT-Bench-TH (VISTEC): Assess the overall performance of the LLM across various tasks and usage dimensions in both Thai and English.
Enhanced Performance and Accuracy for the Thai Language in Typhoon 2
Results from the ThaiExam and M3Exam benchmarks (which evaluate comprehension and problem-solving abilities in Thai)
Typhoon 2’s Improved Instruction Following Capabilities
Typhoon 2 shows enhanced instruction-following capabilities, particularly the 70B model, which achieves excellent results in both IFEval and MT-Bench. It matches or surpasses competitors on several metrics, making it suitable for advanced applications such as intelligent conversation and developing AI tools specifically for the Thai language.
Compared to previous models, Typhoon 2 demonstrates superior performance, especially in Thai language processing and instruction following, as measured by IFEval-TH and MT-Bench standards.
‘Typhoon 2 Audio’ and ‘Typhoon 2 Vision’ (Research Preview): Multimodal Capabilities
Typhoon 2 expands its capabilities beyond text processing with the introduction of Multimodal Models that encompass both audio and visual data. Two key models stand out:
- Typhoon2-Audio:
  - Listen, understand, and respond better: It accepts both text and voice as input and produces both text and voice as output simultaneously, enabling full two-way interaction. This is similar to the Advanced Voice Mode feature in ChatGPT, which allows for human-like conversation.
  - Deeper Audio Understanding: It analyzes audio in greater detail, leading to a better understanding of its meaning, including emotional nuances in tone of voice.
  - Improved Instruction Following: It can follow more complex instructions, such as multi-turn conversations and questions requiring in-depth information.
  - Text-to-Speech Support: It effectively converts Thai text to speech, outperforming other open-source models.
    - Use Case: Converting Thai text into clear, intelligible audio, making it suitable for prototyping conversational systems or accessibility experiments.
- Typhoon2-Vision:
  - Enhanced Image Processing and Understanding: This model analyzes images and understands their content in detail.
  - Built-in OCR (Optical Character Recognition): This function converts text within images or documents into digital text, enabling applications involving text extraction from documents or photos.
Overall, Typhoon2-Audio and Typhoon2-Vision offer significant potential for various applications, including creating virtual assistants, developing diverse applications, and analyzing image and audio data.
Important Disclaimer: Both Typhoon2-Audio and Typhoon2-Vision are still in research and development and may not be suitable for commercial use. However, they can be experimented with or integrated into various applications.
Target User Groups for Typhoon 2
Typhoon 2 is designed to support a wide range of users, including:
- Researchers: Provides access to powerful and up-to-date natural language processing (NLP) tools for Thai, facilitating research and development in NLP technology.
- Data Scientists: Offers benchmarks and relevant datasets to develop and evaluate AI models.
- Software Engineers: Enables the creation of LLM-powered applications or the integration of LLMs into existing systems. For example, developing legal chatbots like the Sommai platform by VISAI.
- AI Engineers: Provides access to high-performance LLMs as tools for AI development and deployment of various AI solutions.
Target Industries for Typhoon 2
Typhoon 2 has the potential to be applied across various industries, addressing diverse needs:
- Consumer Finance: Developing chatbots for customer service and financial risk analysis.
- Healthcare: Analyzing medical data.
- Legal: Conducting legal research.
- Insurance: Assessing risks and developing new insurance products.
- Public Sector: Providing public information services and developing intelligent government service systems.
- Education: Developing teaching aids, creating learning content, and improving teaching efficiency.
Feedback from Typhoon Users
“Typhoon has a deep understanding of the Thai language and can be applied widely, such as in Text2SQL and RAG.”
- Data Analytics Team, SCB
“The Typhoon API is a powerful and user-friendly tool that allows students to easily experiment with LLMs.”
- Dr. Titipat Achakulvisut, Department of Biomedical Engineering, Faculty of Engineering, Mahidol University
“We are very impressed with the development of the Typhoon model by the development team.”
- SI Data+ Team
“Typhoon is the best-performing LLM for the Thai language, especially for tasks that require knowledge and understanding of the Thai context, such as legal work, which is a domain that requires local knowledge for accurate and precise answers. Moreover, Typhoon can be used as a chatbot for answering questions, as well as acting as an agent in complex frameworks such as RAG or multi-agentic workflow effectively. This makes it the first choice for tasks that require LLMs on Thai language data.”
- Pawitsapak Akarajaradwong, Senior Data Scientist, VISAI AI
“The Typhoon development team from SCB 10X has proven itself to be a leader in the research and development of Large Language Models (LLMs) by innovating and setting new standards for AI capabilities. In addition to developing cutting-edge LLM models and systems, they have also played a crucial role in advancing research in natural language processing. As the head of the Natural Language Processing and Representation Learning Lab (NRL) at VISTEC, I have had the opportunity to collaborate with researchers from the Typhoon team. In 2024, our collaboration led to 3 important research papers published at EMNLP on Multilingual Reasoning, Bias Mitigation, and Cross-lingual Retrieval Question Answering. We continue to collaborate by focusing on addressing critical AI research problems such as AI safety, emergent behavior evaluation, and AI system transparency. Our shared commitment to advancing both the practice and theory of AI ensures that we will help create research that has a positive impact on the global research community and helps elevate Thailand to be recognized on the world stage in AI research.”
- Assoc. Prof. Sarana Nutanong, VISTEC
Partners Involved in Developing and Applying Typhoon 2
- VISTEC
- Mahidol University
- Artificial Intelligence Association of Thailand (AIAT)
- AI Singapore (AISG)
- Stanford Human-Centered Artificial Intelligence (Stanford HAI)
- Together AI
- SEA AI Lab
- InnovestX
- SambaNova Systems
Typhoon 2 is a significant step forward in the development of large language models for Thai. It focuses on increased performance and addresses the needs of various industries and businesses. With enhanced safety features, it can be effectively applied across all platforms. Typhoon continues to collaborate with partners across various industries to develop and deploy large language models for Thai, enhancing business potential and driving future applications sustainably.
Read more about Typhoon 2:
1.) Typhoon 2 Text Models: https://medium.com/opentyphoon/typhoon-2-release-9dd36e3882c0
2.) Typhoon 2 Multimodal Models: https://medium.com/opentyphoon/typhoon-2-multimodal-release-research-preview-200fe9015ad9
For more information and to try it out, visit the website: https://opentyphoon.ai/
Try the Typhoon2 Audio and Typhoon2 Vision models:
- Typhoon2 Audio: https://audio.opentyphoon.ai/ (update coming soon)
- Typhoon2 Vision: https://vision.opentyphoon.ai/ (coming soon)