A research team from the Tokyo Institute of Technology and the National Institute of Advanced Industrial Science and Technology has released Swallow, a large language model with strong Japanese proficiency that can serve as the foundation for generative AI. It is the largest large language model that supports Japanese, and it is open and available for commercial use.

In recent years, research and development of large language models such as OpenAI's ChatGPT and GPT-4 and Google's PaLM 2 and Gemini has progressed rapidly. Development of models that are strong in Japanese is also advancing, but few of them have been both open and high-performing.

The Llama 2 series developed by Meta AI performs well in English but is weak at reading and writing Japanese. The research team therefore built the large language model "Swallow" on top of several Llama 2 models. Applying additional pre-training (continual pre-training) to these already-trained models yielded high performance in Japanese.

In addition, because Llama 2 is an English-focused model, its vocabulary (the set of tokens a language model can handle) lacks many common Japanese words and characters. Japanese text is therefore split into unnatural units (tokens) and represented with more tokens, which lowers training and generation efficiency and raises computational cost. By adding Japanese characters and words to the vocabulary, the team reduced the token length of Japanese text to 56.2% of its original length.
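The effect of vocabulary expansion on token length can be illustrated with a toy example. The sketch below is not the Swallow tokenizer: it assumes a simple greedy longest-match tokenizer with a UTF-8 byte fallback, and the vocabularies, the sample text, and the function names (`tokenize`, `token_length_ratio`) are hypothetical. It only shows how a ratio like the reported 56.2% would be computed, not how that figure was obtained.

```python
def tokenize(text, vocab):
    """Greedy longest-match tokenization with a UTF-8 byte fallback.

    Toy illustration only; real tokenizers such as the one used by
    Llama 2 are trained subword models, not hand-written vocabularies.
    """
    tokens = []
    max_len = max((len(t) for t in vocab), default=1)
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shorter ones.
        for n in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + n]
            if piece in vocab:
                tokens.append(piece)
                i += n
                break
        else:
            # Character not covered by the vocabulary:
            # emit one token per UTF-8 byte.
            tokens.extend(f"<0x{b:02X}>" for b in text[i].encode("utf-8"))
            i += 1
    return tokens


def token_length_ratio(text, base_vocab, expanded_vocab):
    """Token count with the expanded vocabulary divided by the base count."""
    base = len(tokenize(text, base_vocab))
    expanded = len(tokenize(text, expanded_vocab))
    return expanded / base


# Hypothetical vocabularies: an English-only base and a Japanese extension.
BASE_VOCAB = set("abcdefghijklmnopqrstuvwxyz ")
EXPANDED_VOCAB = BASE_VOCAB | {"東京", "工業", "大学"}

sample = "東京工業大学"
ratio = token_length_ratio(sample, BASE_VOCAB, EXPANDED_VOCAB)
print(f"{ratio:.1%}")
```

With the base vocabulary, each of the six kanji falls back to three byte tokens, so the text costs 18 tokens; with the added words it costs 3. Real corpora mix covered and uncovered text, which is why a measured ratio is far less extreme than in this toy case.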

Furthermore, the research team independently extracted and refined Japanese text from archives distributed by the non-profit organization Common Crawl, building a Japanese web corpus of approximately 312.1 billion characters (approximately 173 million pages). This is the largest commercially usable corpus for training Japanese language models.
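The article does not describe how the team extracted Japanese text, so the following is only a minimal sketch of one common first step in such a pipeline: keeping pages whose characters are mostly Japanese. The function names (`is_japanese`, `japanese_ratio`, `filter_pages`) and the 0.5 threshold are assumptions for illustration, not the team's actual method.

```python
def is_japanese(ch):
    """Return True if ch is hiragana, katakana, or a CJK unified ideograph.

    Note: CJK ideographs are shared with Chinese, so this check alone
    cannot distinguish the two languages.
    """
    cp = ord(ch)
    return (
        0x3040 <= cp <= 0x309F      # hiragana
        or 0x30A0 <= cp <= 0x30FF   # katakana
        or 0x4E00 <= cp <= 0x9FFF   # CJK unified ideographs
    )


def japanese_ratio(text):
    """Fraction of non-whitespace characters that look Japanese."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum(1 for c in chars if is_japanese(c)) / len(chars)


def filter_pages(pages, threshold=0.5):
    """Keep pages whose Japanese character ratio meets the threshold.

    The threshold value is a hypothetical choice for this sketch.
    """
    return [p for p in pages if japanese_ratio(p) >= threshold]


pages = [
    "これは日本語のページです",
    "This is an English page",
    "",
]
kept = filter_pages(pages)
print(len(kept))
```

A production pipeline would typically add language identification, deduplication, and quality filtering on top of a simple character-ratio check like this one.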

The arrival of an open large language model that is strong in Japanese is expected to further promote research, development, and use of large language models in Japan, leading to further product development and technological innovation.

Reference: [National Institute of Advanced Industrial Science and Technology] Release of “Swallow”, a large language model strong in Japanese: teaching Japanese to a large language model that is good at English

University Journal Online Editorial Department

This is the online editorial department of the university journal.
Articles are written by editorial staff who have a high level of knowledge and interest in universities and education.