Tokyo, Japan

The RIKEN Center for Computational Science (R-CCS) envisions the future of the supercomputer "Fugaku" and the utilization of generative AI.

February 4, 2025

From the left in the photo: Satoshi Matsuoka, Director of the RIKEN Center for Computational Science; Fumiyo Shoji, Head of the Operations Technology Division, RIKEN Center for Computational Science

Introduction

The RIKEN Center for Computational Science (hereafter, R-CCS) continues to take on societal challenges by harnessing the world-leading supercomputer "Fugaku". Developed jointly by RIKEN and Fujitsu, "Fugaku" drew significant public attention through its droplet-dispersion simulations during the COVID-19 pandemic; the visualization studies conducted while the infection was spreading left a strong impression of how directly science and technology can contribute to societal challenges. In recent years, research and development has been progressing not only in traditional simulation areas such as weather forecasting, materials development, and astrophysical calculations, but also in new areas that fuse supercomputing with generative AI (large language models, LLMs). In this article, through interviews with Satoshi Matsuoka, Director of R-CCS, and Fumiyo Shoji, who leads the division operating "Fugaku", we discuss the evolution of "Fugaku" into a supercomputer that directly serves society, its response to the global boom in generative AI, and expectations for domestic LLMs. We also ask why, in February 2024, R-CCS began moving first-line inquiry responses on the Fugaku support site to the generative AI chat "AskDona", built on Retrieval-Augmented Generation (RAG) technology, and explore the outcomes, the challenges, and a vision pointing toward a future "AI co-pilot".

(Interview supervision: GFLOPS Inc., Morimoto and Suzuki | Interview text: Google "Gemini 2.0", OpenAI "o1 pro mode" | Photo: Karina Yada @instagram)


  • The world-leading performance of "Fugaku" driving social change
    ─ How supercomputers impact real-world issues such as healthcare and disaster prevention

  • The power of supercomputers supporting the backstage of LLM development
    ─ Why are supercomputers indispensable for large language models with enormous numbers of parameters?

  • The significance and strategy of domestic LLM development
    ─ What benefits does AI optimized for the Japanese language and cultural context bring?

  • Challenging the hallucination problem, RIKEN R-CCS's practical approach
    ─ How can we use AI that is not 100% accurate?

  • The leading edge of the Fugaku support site: The role of generative AI chat "AskDona"
    ─ Why can even beginners in simulation master advanced computational resources?

  • The future opened up by supercomputers × generative AI: A new trend of social return and industrial revitalization
    ─ The path toward creating new innovations and the possibilities that lie ahead



< Interview >

The world-leading performance of "Fugaku" driving social change

─ What technical breakthroughs and judgments led to the wide recognition of the supercomputer "Fugaku" in society?

Director Matsuoka (hereafter, Matsuoka): "Fugaku" became widely known during the COVID-19 pandemic for its simulations of droplet and aerosol dispersion. The visualization studies of infection spread conducted on "Fugaku" were widely reported on television, in newspapers, and in online media, significantly raising its recognition across Japan. Supercomputers had long been well known among researchers and in specialized fields, but during the pandemic "Fugaku" drew the attention of society at large. Around 2020, "Fugaku" was in the assembly stage, approaching completion. Just then, COVID-19 became a global crisis, and infection countermeasures became urgent in Japan. We therefore repurposed the advanced fluid-calculation techniques cultivated in internal-combustion-engine and aerodynamic simulations to analyze droplet dispersion. As a result, we could visualize the insight that "mask-wearing and ventilation are effective," back it with scientific evidence, and present it to the public, which was immensely significant.

"Fugaku" has taken first place in the international benchmark HPCG (High Performance Conjugate Gradient), which measures the processing speed of the conjugate gradient method used in industrial applications, for ten consecutive terms as of November 19, 2024, maintaining its world-leading performance. That this computing power can be turned to social issues in healthcare, industry, and disaster prevention is deeply meaningful. Supercomputers have long been used in diverse simulation fields such as weather forecasting, materials development, and astrophysics, but "Fugaku" clearly demonstrated its potential through a new societal need: infectious disease control. Directly alleviating social unrest through high-precision simulation, and presenting the importance of mask-wearing and ventilation on the basis of scientific evidence, is a model example of how supercomputers can contribute to society.

Data source: Google Trends (https://www.google.com/trends)


The power of supercomputers supporting the backstage of LLM development
─ In recent years, generative AI (LLMs) has caused a global boom. How is this rapid development supported by supercomputers?


Matsuoka: AI has experienced several booms in its history, and we are now in the so-called "fourth AI boom." Neural networks drew attention in the past, followed by technological milestones such as CNNs (Convolutional Neural Networks), GANs (Generative Adversarial Networks), and RNNs (Recurrent Neural Networks). With the advent of the transformer, LLMs flourished, leading to the current generative AI boom.

The relationship between supercomputers and AI has been deepening since the early 2010s. One example: around 2012, Google Brain trained neural networks on computing resources at the scale of roughly 1,000 machines. The subsequent spread of GPUs dramatically improved the processing speed of deep learning, making supercomputer-class computing environments essential for AI research.

LLMs in particular require massive computational resources for training, owing to their enormous parameter counts and data volumes. Creating a model with trillions of parameters can mean occupying a supercomputer for months and investing hundreds of millions of yen; it is an area that cannot be sustained without supercomputers. Such LLMs are expected to have extremely significant value in business and social applications. Supercomputers have long generated enormous economic effects by solving scientific and industrial problems through advanced simulation, but LLMs can be applied to an even broader range of uses and are easier to turn into services and products, attracting investment and development funding and causing economic value to grow explosively.

We have been early adopters of AI technology in research and practice. "Fugaku" aims to expand beyond its successful simulation fields into a variety of AI applications, including generative AI. Future supercomputers will no longer be simple simulation machines; they will actively take on diverse AI workloads, particularly LLMs, by incorporating GPU-type accelerators and specialized hardware. We believe we are entering an era in which fusion with AI is indispensable.

The significance and strategy of domestic LLM development
─ Interest in nurturing domestic LLMs is rising, but amidst fierce performance competition with overseas LLMs, what is the significance of Japan independently pursuing development?


Matsuoka: There are several clear reasons why domestic LLM development matters. First, LLMs are not mere language-processing technologies; they are strongly shaped by the cultural and social contexts and values reflected in their training data. The unique grammatical structures, honorific expressions, and subtle nuances of Japanese can be difficult to reproduce with models developed overseas. Domestic LLMs can be optimized for the Japanese language and cultural context, enabling more natural and accurate communication.

Ensuring technological autonomy and competitiveness is another crucial point. LLMs sit at the core of modern AI technology and will play a foundational role across ever broader fields. If we rely entirely on foreign sources for these core technologies, we risk facing constraints or losing competitiveness down the line. Maintaining the capability to develop LLMs domestically means preserving the ability to continually create and improve cutting-edge models. One could liken this to military preparedness: having the technological strength to build what is needed, when it is needed, is strategically important.

Furthermore, applications in science and technology, such as our focus on AI for Science, are a major motivation for developing domestic LLMs. LLMs hold the potential to create new value across sectors including science, healthcare, education, and manufacturing. In highly specialized fields especially, models must be trained on expert knowledge that general-purpose LLMs cannot cover. Domestic LLMs can effectively incorporate the specialized data and insights held by Japanese research institutions and companies, strongly supporting advances in science and technology.

Of course, developing high-quality domestic LLMs requires large-scale training datasets, massive computational resources on supercomputers, and collaboration with experts in fields such as linguistics, information science, and computational science. We aim to make maximum use of "Fugaku" and optimize for the Japanese language and specialized domains. In the future, we expect domestic LLMs to be used in areas such as healthcare and education, enriching people's lives. Achieving this requires not just research and development but also an environment conducive to dissemination. We intend to act as a reliable entity: ensuring reliability and quality, delivering the technology smoothly to society, and laying the groundwork for domestic LLMs to be widely accepted.

Challenging the hallucination problem, RIKEN R-CCS's practical approach
─ Many companies are concerned about hallucinations (when AI generates information that doesn’t exist) and risks of information leakage. What do you believe R-CCS can do to alleviate these concerns and promote the practical use of generative AI?



Director Shoji (hereafter, Shoji): Indeed, many companies expect "100% accurate answers," so the hallucination problem becomes a significant barrier. However, we believe the right mindset is not to demand 100% accuracy, but to offload the bulk of the extensive search, summarization, and analysis work that humans currently perform. If the AI is 95% correct, it still dramatically reduces manual work, which is valuable enough. Furthermore, by employing technologies like RAG that draw on reference information, mistakes can be reduced; the introduction of generative AI to the Fugaku support site is one example. Moreover, research institutions like RIKEN can serve as model cases, openly sharing the insights gained from both successes and failures and creating an environment in which domestic companies can adopt the technology with confidence. Generative AI has only recently emerged, and it is natural for companies to lack technical understanding. We therefore have to take the lead in making both its effectiveness and its challenges visible. In this way, a shared understanding can spread: "it may not be such a big problem after all," and "hallucinations do occur, but they stay within an acceptable range."


The leading edge of the Fugaku support site: The role of generative AI chat "AskDona"
─ Why did you decide to integrate generative AI into the support site for the world-class computing power of "Fugaku"? Please share the intention and background behind this decision.

Shoji: "Fugaku" is a supercomputer with world-class performance, but how to use it and how to optimize for it vary greatly. Access is open even to high school students and individuals, provided they go through the necessary applications and reviews. To use it effectively, however, one must work through vast manuals and FAQs in search of the appropriate settings and commands. This high barrier has been a significant obstacle for new users.

Then came generative AI in the mold of ChatGPT, which returns immediate answers to inquiries posed in natural language. I felt this intuitive interface could absolutely be put to work here. By building a system that answers while referencing internal documents through RAG technology, users need only ask, "I want to execute ___, how do I do it?", and receive accurate guidance. RAG (Retrieval-Augmented Generation) is a method in which an LLM references relevant source documents or data in real time while generating its response, which helps reduce hallucinations and improve answer accuracy. Answers are thus no longer produced by the large language model alone; they are backed by the specialized manuals and technical guides of "Fugaku", yielding more accurate, context-aware responses. Users can now find in an instant information that previously took hours to locate, dramatically shortening the time needed to solve problems.

Of course, hallucinations cannot be eliminated entirely at this point. But backing answers with RAG reduces them significantly, and in operating the "Fugaku" site we list links to primary sources alongside AskDona's answers, further suppressing their impact. In actual prototype operation, there are more and more cases in which users obtain answers even more smoothly than expected.

Over time, improving the model and building a feedback loop will further enhance precision and reliability. Areas involving complex regulations, specifications, and documents benefit in particular, because the summarization and extraction capabilities of generative AI are highly effective there. Within companies, for instance in manufacturing, there are abundant documents that are difficult to process by hand: complex machinery manuals, extensive security regulations, patent summaries, and more. Applying generative AI × RAG here can achieve efficiencies unattainable by manpower alone.
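The retrieve-then-answer flow described here can be sketched in miniature. The snippet below is an illustrative Python sketch, not AskDona's actual implementation: the document snippets, the simple term-overlap scoring (standing in for real embedding search), and the prompt format are all assumptions for illustration. A production system would retrieve from the actual "Fugaku" manuals and pass the assembled prompt to an LLM.

```python
# Minimal RAG sketch: retrieve the most relevant manual passage for a
# question, then prepend it to the LLM prompt so the answer is grounded
# in primary sources. All documents and scoring here are illustrative.
from collections import Counter
import math

DOCS = [
    "To submit a batch job, write a job script and pass it to the scheduler.",
    "Compile Fortran programs with the vendor compiler before submission.",
    "Check your remaining disk quota before writing large output files.",
]

def tokenize(text):
    return [w.strip(".,").lower() for w in text.split()]

def score(query, doc):
    # Term-overlap score weighted by inverse document frequency: a crude
    # stand-in for the embedding similarity a real retriever would use.
    q, d, n, s = Counter(tokenize(query)), Counter(tokenize(doc)), len(DOCS), 0.0
    for term in q:
        if term in d:
            df = sum(1 for x in DOCS if term in tokenize(x))
            s += q[term] * d[term] * math.log((n + 1) / (df + 1))
    return s

def retrieve(query, k=1):
    return sorted(DOCS, key=lambda doc: score(query, doc), reverse=True)[:k]

def build_prompt(query):
    # The retrieved passages ground the model's answer, which is what
    # suppresses hallucinations in the RAG approach described above.
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("How do I submit a batch job?"))
```

The key design point is that the model is asked to answer *from the retrieved context*, and the same retrieved passages can be surfaced to the user as primary-source links, as done on the Fugaku support site.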


The future opened up by supercomputers × generative AI: New trends of social return and industrial revitalization
─ What future are you envisioning beyond the support improvements using RAG technology?


Shoji: One ideal scenario is that users need only convey a broad intent, such as "I want to perform this simulation" or "I want to try this parameter," and the AI automatically handles the necessary computational procedures and resource settings, even assembling the actual simulation commands. Supercomputer operation and optimization normally require advanced expertise, but with such a system in place, high-level computations that once demanded specialized skills become accessible to a much broader group of people. For example, submitting computational jobs, retrieving results, loading the appropriate modules and libraries, making reservations in the job scheduler, securing disk space: tasks users previously had to configure by hand, with an understanding of the system, could all be automated and optimized by AI. This would substantially reduce time, effort, and technical learning costs, letting anyone swiftly command advanced computing resources.

The introduction of generative AI to the Fugaku support site does more than improve the convenience of supercomputer operations. It serves as a model case for improving research and development environments with cutting-edge technology. The barrier to using supercomputers falls, users can devote their time to new challenges, and the discoveries and technologies that follow flow back to society. Through this cycle, other sectors and industries may come to judge that if RIKEN is involved, safety and utility are assured to some extent, which could act as a catalyst for a proactive stance toward AI use throughout society.

We, too, are still midway through this journey. But by introducing and improving incrementally, we hope to build an ever more refined support system and computing environment. RIKEN prioritizes scientific evidence and validation at all times. By demonstrating RAG's efficiency through practical examples, publishing metrics, and keeping the improvement process transparent, we aim to provide a solid foundation for AI utilization in Japan. As society grows more positive toward AI, new value will be created.
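As a toy illustration of the kind of automation described here, the sketch below assembles a batch-job script from a high-level request. The `build_job_script` helper, the Slurm-style `#SBATCH` directives, and the module name are hypothetical placeholders, not Fugaku's actual scheduler syntax or module stack; in the envisioned system, an AI would choose these values from the site's documentation instead of the user writing them by hand.

```python
# Illustrative sketch: turning a high-level request into the scheduler
# directives users previously had to write manually. Directive syntax and
# module names below are placeholders, not Fugaku's real configuration.

def build_job_script(command, nodes=1, hours=1):
    # Assemble a batch script: resource request, environment setup, payload.
    return "\n".join([
        "#!/bin/bash",
        f"#SBATCH --nodes={nodes}",      # hypothetical Slurm-style directive
        f"#SBATCH --time={hours}:00:00",  # wall-clock limit
        "module load simulation-stack",   # placeholder module name
        command,
    ])

script = build_job_script("./run_simulation --param alpha=0.1", nodes=4, hours=2)
print(script)
```

In the full vision, the user's natural-language request ("I want to try this parameter") would be parsed by an LLM, which would fill in `command`, `nodes`, and `hours` and submit the result to the scheduler automatically.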


─ Lastly, what future vision do you sketch regarding the growing importance of the relationship between generative AI and supercomputers?

Matsuoka: The current "Fugaku" was designed for general-purpose high-performance computing and suits a wide range of applications. With the rapid advance of AI technology, however, future supercomputers will be required to meet more diverse and advanced computational needs. For example, designs that allow more efficient fine-tuning of LLMs, or configurations specialized for running generative AI smoothly, are among the new directions we are considering. Our mission is to pursue system designs that meet the needs of the times and to realize the computing environments users want. By leveraging the technological knowledge and operational know-how cultivated in simulation fields, we hope to accelerate the application of AI to scientific, industrial, and societal issues.

RIKEN, as a national research institution, not only leads cutting-edge science and technology but also has a duty to return its results to society. The role "Fugaku" played during the COVID-19 pandemic demonstrated the power of the connection between science and society, showing how information grounded in scientific evidence can move people. Now we wish to replicate that experience in the field of AI utilization. If we can apply generative AI technology in practice and demonstrate its safety, efficacy, and efficiency through real examples, it could trigger AI adoption at many domestic companies.

Going forward, we plan to expand our collaboration partners and advance data sharing and infrastructure development. By building a comprehensive research support system, we aim to make it easier for domestic companies and researchers to use generative AI. This will accelerate the creation of innovations and ultimately enhance Japan's overall technological capabilities and industrial competitiveness.




Satoshi Matsuoka Received his Ph.D. from the Graduate School of Science, University of Tokyo, in 1993. Professor at the Global Scientific Information and Computing Center, Tokyo Institute of Technology (now Institute of Science Tokyo), from 2001. Lab head at the AIST-Tokyo Tech RWBC-OIL laboratory from 2017. Currently Specially Appointed Professor at the School of Computing, Institute of Science Tokyo. Specializes in high-performance computing systems. Led research and development of the TSUBAME series of supercomputers, which achieved top ranks in energy efficiency and various other benchmarks, alongside foundational research on parallel algorithms for massively parallel computers, programming, fault tolerance, energy conservation, and integration with big data and AI. Named an ACM Fellow in 2009; received the ACM Gordon Bell Prize in 2011 and the Commendation by the Minister of Education, Culture, Sports, Science and Technology in 2013; and in 2014 became the first Japanese recipient of the IEEE Sidney Fernbach Award for achievements in supercomputing. Received the Career Award at the ACM HPDC international conference in 2018 and the Asia HPC Leadership Award at SCAsia 2019. Won the ACM Gordon Bell Prize for a second time in 2021. In 2022, received the Information Processing Society of Japan Achievement Award, the NEC C&C Foundation C&C Prize, and the IEEE-CS Seymour Cray Computer Engineering Award, the highest honor for supercomputing achievement, becoming the first person to receive both the Fernbach and Cray Awards. Also honored with the Order of the Sacred Treasure for lifetime contributions to computer science research. In 2024, selected as one of the "HPCwire 35 Legends" by the US magazine HPCwire. Fellow of the Information Processing Society of Japan.

Fumiyo Shoji Withdrew from the doctoral program at the Graduate School of Natural Science, Kanazawa University, in 1998, and obtained a Ph.D. in Science in 2000. Became a research associate at Hiroshima University's Center for Information Education and Research (now the Center for Education and Research in Information Media) in 1998. Moved to RIKEN's Next-Generation Supercomputer Development Headquarters in 2006, then to the Operations Technology Division of the RIKEN Center for Computational Science, which he has headed since 2014. Engaged in improving the operational efficiency and utilization of large-scale HPC systems. Received the ACM Gordon Bell Prize in 2010, the IEICE Achievement Award in 2012, and the RIKEN Umebayashi Award in 2024.

About GFLOPS Inc. GFLOPS Inc. leverages cutting-edge AI technology and data-analysis capabilities to provide AI solutions that help businesses improve operational efficiency and create innovation. In particular, its unique solutions combining large language models (LLMs) with RAG (Retrieval-Augmented Generation) technology achieve high response accuracy and flexibility, and deployments are progressing at many companies. Company name: GFLOPS Inc. (English: GFLOPS Co., Ltd.)
Headquarters: Shibuya, Tokyo
Business: Development and provision of AI services utilizing large language model (LLM) and generative AI technology, etc.


Note 1: Supercomputer "Fugaku"
The successor to the supercomputer "K". Developed with the aim of contributing to Japan's growth by solving social and scientific challenges and producing world-leading results in the 2020s, it entered service in March 2021 as a supercomputer of the world's highest standard in power efficiency, computational performance, user convenience, capability to produce groundbreaking results, and acceleration of big data and AI.

Note 2: LLM (Large Language Model)
Refers to neural networks with vast numbers of parameters, ranging from billions to hundreds of billions. A core technology in natural language processing and generative AI, an LLM learns from immense volumes of text data to achieve advanced language understanding and generation, and has become a cornerstone of the generative AI field's rapid development.

Note 3: RAG (Retrieval Augmented Generation)
A technology that allows large language models to refer to external documents and data sources in real-time while generating responses. This contributes to suppressing hallucinations (generation of incorrect information) and improving the accuracy of responses.

Note 4: HPCG (High Performance Conjugate Gradient)
A benchmark that measures the processing speed of the conjugate gradient method used in industrial applications. Unlike LINPACK, which assesses performance close to theoretical peak, it is regarded as a crucial index for evaluating more realistic computational workloads. As of November 2024, "Fugaku" has held first place in this ranking for ten consecutive terms.
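For readers unfamiliar with the method, the conjugate gradient iteration that HPCG exercises at massive scale looks, in textbook form, like this. The snippet is a minimal dense-matrix sketch in Python, not the benchmark's optimized sparse, parallel implementation.

```python
# Textbook conjugate gradient for Ax = b with A symmetric positive-definite.
# HPCG times this kind of iteration on huge sparse systems; here A is a
# small dense matrix purely for illustration.

def conjugate_gradient(A, b, tol=1e-10, max_iter=100):
    n = len(b)
    x = [0.0] * n
    r = b[:]                      # residual r = b - A x (with x = 0)
    p = r[:]                      # initial search direction
    rs = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rs / sum(p[i] * Ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        rs_new = sum(ri * ri for ri in r)
        if rs_new < tol:          # stop once the squared residual is tiny
            break
        p = [r[i] + (rs_new / rs) * p[i] for i in range(n)]
        rs = rs_new
    return x

# 2x2 SPD system [[4,1],[1,3]] x = [1,2]  ->  x = [1/11, 7/11]
x = conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
```

In exact arithmetic the method converges in at most n iterations; its irregular, memory-bound access patterns on sparse matrices are what make HPCG a more realistic stress test than LINPACK's dense kernels.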

Note 5: CNN (Convolutional Neural Network)
Abbreviation of Convolutional Neural Network. It is a type of neural network that exhibits high performance in image recognition and speech recognition, utilizing convolution operations to effectively capture local features of input data (mainly images), thus allowing for precise classification or detection.

Note 6: GAN (Generative Adversarial Network)
Abbreviation of Generative Adversarial Network. This learning method features two opposing networks, a Generator and a Discriminator, to generate data that is detailed enough to be indistinguishable from actual data. Its applications are spreading to areas such as image synthesis and the creation of new designs.

Note 7: RNN (Recurrent Neural Network)
Abbreviation of Recurrent Neural Network. This type of neural network is suitable for handling data in which "sequence" and "context" are essential, such as time-series data or sentences. It processes calculations while retaining past information like memory, making it widely utilized in natural language processing and speech recognition.

Note 8: Google Brain
The general term for an AI research project and team initiated by Google. Around 2012, it gained fame for training neural networks on a large scale and achieving significant advancements in image recognition, becoming one of the leading forces driving the deep learning boom.

Note 9: GPU (Graphics Processing Unit)
Originally designed to rapidly process computer graphics, it has recently become indispensable hardware in supercomputing and AI research due to its high performance in large-scale parallel computation, particularly in deep learning.

Note 10: AI for Science
Aiming to apply AI not only in engineering and industry but also directly to fundamental sciences and academic research, promoting scientific discoveries and new insights. This is a focus area for RIKEN R-CCS, with the development of domestic LLMs and the advancement of AI research utilizing supercomputers being part of this effort.

Note 11: AskDona
The generative AI chat integrated into the Fugaku support site by RIKEN R-CCS. Utilizing RAG technology, it is designed to answer users' questions while referencing the manuals and technical documents of "Fugaku."
