Call for Proposals: Joint Research Collaboration Projects with Microsoft Research

  1. Purpose

Selection of joint research collaboration projects* with Microsoft Research

* Conducted as a 'corporate (research institute)-linked' project under the 2025 Digital Global Research Support Program (Ministry of Science and ICT); this call falls under Article 9 (Advance Notice and Public Announcement, etc.), Paragraph 4 of the National R&D Innovation Act and Article 9 (Procedures for the Public Announcement of R&D Projects and R&D Institutions) of the Act's Enforcement Decree.

 

  2. Operating Principles

Projects are selected through an open call for creative ideas that fit the research themes designated by Microsoft Research

Microsoft Research matches each project with its experts for joint research collaboration

Joint researchers* are dispatched to Microsoft Research for 180 days (subject to change depending on the regulations of the host country at the time of dispatch)

* Researchers (master's/doctoral students) from the joint research institution (a Korean university)

 

  3. Program Overview

A. Project Funding

– Funding scale: KRW 3 billion in total*, 25-30 projects

* Budget of the Digital Global Research Support Program (Ministry of Science and ICT): KRW 3 billion

– Research areas:

Media Foundation and Immersive AI

We aspire for AI to acquire knowledge and intelligence from various media sources in the real world. To achieve this goal, we must transform the complex and noisy real world into abstract representations capable of capturing essential information and dynamics. The exploration of Media Foundation and Immersive AI serves as a new breakthrough in the synergy of multimedia and AI, offering novel perspectives for the research on multimodal large models. (more information can be found here)

Topics:

  • neural codec
  • immersive telepresence
  • multimedia understanding
  • 3D and computer vision
  • audio and speech

AI-Transformed Medical Care and Education

As we enter an era characterized by rapid advancements in artificial intelligence, the medical field is at the forefront of these technological transformations. The integration of AI in healthcare brings forth unique challenges and opportunities that require interdisciplinary collaboration. Our mission is to explore solutions that cultivate medical talent prepared for future healthcare demands, foster research breakthroughs, and accelerate the translation of research into practical applications. We invite professors and researchers from diverse fields—including computer science, medicine, education, and ethics—to join us in this pivotal initiative. Together, we can shape the future of AI-transformed medical care and education, ensuring that technological advancements serve the greater good in healthcare.

Topics:

  • Innovative models of medical education and training
  • AI enhanced medical diagnosis, treatment, and patient care
  • Transformational effects of AI on medical research and clinical trials
  • Ensuring safety, reliability, and control in AI applications within healthcare
  • Aligning artificial intelligence with medical ethics and human values

Towards a Synergy between AI and the Brain

The future of interdisciplinary research between artificial intelligence (AI) and brain science is exceptionally promising, offering potential breakthroughs that could revolutionize multiple fields. The integration of these disciplines may ultimately unlock new horizons in human knowledge and capability, driving progress in healthcare, technology, and beyond. Our goal is to explore the synergy between AI and brain science, focusing on how they can empower each other. Within this research theme, we are particularly interested in the following.

Topics:

  • Brain-inspired AI
  • Brain-computer interfaces
  • Spiking neural networks
  • Embodied AI towards open-ended tasks and environments

Societal AI

In an era marked by rapid advancements in AI, the impact of these technologies on society encompasses a broad spectrum of challenges and opportunities. To navigate this complex landscape, it is vital to foster an environment of interdisciplinary research. By integrating insights from computer science, social sciences, ethics, law, and other fields, we can develop a more holistic understanding of AI’s societal implications. Our goal is to conduct comprehensive, cross-disciplinary research that ensures the future trajectory of AI development is not only technologically advanced but also ethically sound, legally compliant, and beneficial to society at large.  (more information can be found here)

Topics:

  • AI’s impact on human cognition, learning, and creativity
  • Ensuring AI safety, reliability, and control
  • Aligning AI with human values and ethics
  • AI’s transformation of social science research
  • Evaluating AI in unforeseen tasks and environments

Advancing Healthcare through Innovative Sensing Technologies and AI

We aim to transform the healthcare landscape through the development of innovative sensing technologies and machine learning approaches. Our vision is to enable continuous, non-invasive monitoring of vital signs and to harness advanced health data analysis, creating a more proactive, personalized, and efficient healthcare system.

Topics of Interest include, but are not limited to:

  • Development of cutting-edge wireless/wearable sensing hardware and/or software
  • ML models for human healthcare data
  • LLM-based healthcare agents
  • Multi-modal analytics

Embodied AI

With the rapid advancements in AI and robotics, the development of highly intelligent robots capable of seamlessly interacting with the physical environment is becoming increasingly achievable. As the next AI wave, embodied AI innovations promise to revolutionize various industries and significantly impact human life. Our research aims to build a new generation of foundational embodied AI models with enhanced 3D spatial and physical proficiencies in perception, reasoning, and action, enabling generalist robots to act precisely and efficiently for physical interaction with the 3D spatial world. We are dedicated to making technical breakthroughs and bringing the future closer to us.

(More information can be found here)

Topics:

  • Vision-Language-Action model architecture design
  • Spatial Multimodal LLMs
  • 3D human-object-environment reconstruction and understanding
  • Multimodal-sensory intelligence and tactile sensors
  • Robotic world models and neural simulators
  • Dexterous hand manipulation and reinforcement learning

3D AIGC and world models

We live in a 3D physical world. The reconstruction and generation of 3D objects and scenes have broad applications in domains such as AR/VR, film, gaming, and robotics. While 3D generation has seen remarkable progress in recent years with advanced generative models and large 3D asset datasets, its diversity and quality are still not on par with the latest video generation models. The capability to generate large-scale, interactable scenes that closely resemble the physical world is also lacking. With years of experience in 3D vision and generative model research, our team is committed to pushing the frontier of 3D reconstruction and generation techniques. Our recent interests include handling complex, dynamic objects and large-scale, interactive scenes with a 3D world model.

Topics:

  • 4D articulated or deformable object generation
  • 3D/4D scene reconstruction and generation
  • Large 3D scene dataset construction
  • Interactive 3D world generation
  • Unified 3D and video generation models

Related projects: TRELLIS, MoGe, PF3plat, GaussianCube, IVID, AdaMPI, etc. (see more here)

Efficient Embodied AI

The rapid growth of foundation models has opened new frontiers in artificial intelligence, showcasing the potential for embodied AI to become a reality. However, while foundation models thrive in other domains with minimal constraints, their application in embodied AI faces unique challenges rooted in physical dependencies. Addressing these limitations requires a concerted effort to optimize the efficiency of embodied AI systems, reducing costs that impede their development. By improving computational, data, and operational efficiencies, we aim to pave the way for everyday physical AI agents to transition from concept to reality.

Topics:

  • Reducing computational costs in controlling physical agents with embodied AI
  • Enhancing data efficiency during training processes
  • Streamlining effective data collection for embodied AI systems
  • Enabling real-time control of embodied AI powered by large models
  • Accelerating the pipeline of data collection, training, and real-world testing

Embodied AI for robotic manipulation

Enabling robots to manipulate the range of physical objects that people can – and with human-level dexterity – is widely viewed as a hallmark of embodied AI. It is also a key that unlocks the economic potential of robotics, estimated to be in the trillions of US dollars. Truly dexterous robotic manipulation requires advances in many research areas, from spatial reasoning to efficient and safe lifelong learning to large models that can understand tactile and force feedback along with multi-camera observations and language. In turn, research progress on these topics, accompanied by continuing investments in VLM technology, is poised to improve AI models’ reasoning about the physical world in use cases beyond robotics.

Topics:

  • Bimanual manipulation
  • Mobile manipulation
  • Manipulation with dexterous (e.g., multi-fingered) end-effectors
  • Inference-time compute scaling for robot foundation models (RFMs)
  • Synthetic data generation for RFM training
  • Long-horizon & spatially aware decision-making for robotic manipulation
  • Integrating touch, force, and audio feedback into RFMs
  • Data-efficient learning of new robot skills

Foundation Model for Vision-based Embodied AI

Recent advancements in embodied AI have sparked interest in developing a learning foundation model that could lead to breakthroughs in the field. Motivated by this, we propose a foundation model for vision-based embodied AI as a research theme and call for collaborations with professors and researchers from the academic community. Our goal is to learn foundation models from scratch based on large-scale ego-centric embodied AI video data. By extracting commonsense knowledge from large-scale real-world data, the learned foundation model should be easily adaptable to various downstream tasks with limited demonstrations, significantly improving performance. From a research perspective, we hope the collaborations can lead to impactful research papers or achieve state-of-the-art results on the latest research-driven leaderboards.

Topics:

  • Learning foundation world models
  • Learning foundation policy/action models
  • Adapting foundation models to various robotics tasks

LLM-powered Decision-Making

The intersection of advanced natural language processing and decision science holds vast potential for enhancing human decision-making processes. We propose pioneering research that leverages large language models (LLMs) to interpret complex data, predict outcomes, and guide choices in sequential decision-making tasks. By integrating recent advancements in LLM reasoning, such as the “chain of thought” approach, we aim to develop models applicable to real industry applications, providing actionable insights and improving decision-making across various sectors. We invite collaboration with esteemed professors and researchers to explore the integration of LLMs with Reinforcement Learning, achieving sample-efficient, generalizable, and robust policy learning in dynamic environments. Our goal is to produce substantial research publications and foster innovative solutions with transformative impacts on real-world applications.
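As a hedged illustration of the chain-of-thought direction above, the sketch below wraps one step of a sequential decision task in a reasoning prompt. The `complete` function and the inventory scenario are hypothetical stand-ins invented for illustration, not part of this call.

```python
# Minimal chain-of-thought decision sketch. `complete` is a hypothetical
# stand-in for a real LLM API; its canned reply keeps the sketch runnable.

def complete(prompt: str) -> str:
    # Replace with a real LLM call in practice.
    return ("Stock (3) is below daily demand (5) and replenishment takes "
            "2 days, so waiting risks a stock-out.\norder 10")

def decide(state: str, actions: list[str]) -> str:
    prompt = (
        "You are controlling an inventory system.\n"
        f"Current state: {state}\n"
        f"Available actions: {', '.join(actions)}\n"
        "Let's think step by step about each action's consequences, "
        "then answer with the single best action on the last line."
    )
    reasoning = complete(prompt)               # chain-of-thought text
    return reasoning.strip().splitlines()[-1]  # last line = chosen action

print(decide("stock=3, daily_demand=5, lead_time_days=2",
             ["order 10", "order 0"]))
```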

Topics:

  • Leveraging LLMs for interpreting complex data and predicting outcomes
  • Strategic planning and resource allocation using LLMs
  • Integration of LLMs with Reinforcement Learning for efficient policy learning
  • Exploring LLM reasoning for real industry applications

Grounded Visual Interaction

Visual interaction relies on semantics, but visual data—comprised of pixels—lacks inherent semantic meaning. “Grounded Visual Interaction” focuses on semantic concepts rather than pixels to achieve efficient understanding and generation of visual content. By integrating spatial and temporal grounding, this approach could transform computer vision and vision-language systems, making them more interpretable, usable, and impactful in real-world scenarios.

Topics including but not limited to:

  • Unified visual representation learning for understanding and generation
  • Long video understanding and spatiotemporal grounding
  • Planning and reasoning in vision-language systems
  • Efficient text and image to video generation

Data and AI driven Optimization

Optimization is central to most engineering tasks and is crucial for decision making and planning. In the big data and AI era, we have access to both massive amounts of data and powerful generative AI tools. Therefore, optimization tasks should increasingly rely on the guidance and insights from data and on the assistance of AI tools. In this research theme, we expect collaborations with academic researchers on theoretical and principled ways of integrating data and AI tools into optimization tasks. Examples of optimization tasks include online advertising and recommendations, robotics planning, wireless routing, etc. In these tasks, online data is constantly produced and can be collected as feedback to improve the performance of optimization algorithms, and generative AI tools can potentially be applied to conduct planning and task decomposition.
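To make the online-feedback loop concrete, here is a toy epsilon-greedy bandit sketch: an optimizer improves its ad choice purely from simulated click feedback. The click-through rates are invented placeholders, not data from any real system.

```python
import random

# Toy epsilon-greedy bandit: online feedback (click / no click) steadily
# improves which ad the optimizer selects. Purely illustrative.

TRUE_CTR = {"ad_a": 0.04, "ad_b": 0.07, "ad_c": 0.02}    # hidden environment
counts = {a: 0 for a in TRUE_CTR}
values = {a: 0.0 for a in TRUE_CTR}                      # running mean reward
EPSILON = 0.1

for t in range(10_000):
    if random.random() < EPSILON:                        # explore
        arm = random.choice(list(TRUE_CTR))
    else:                                                # exploit best estimate
        arm = max(values, key=values.get)
    reward = 1.0 if random.random() < TRUE_CTR[arm] else 0.0
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean

print(values)  # the estimate for "ad_b" should approach 0.07
```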

Sample research topics under the theme include:

  • Combinatorial online learning and online optimization
  • Reinforcement learning for online optimization
  • Learning and optimization with offline batch data
  • Integration of generative AI tools into optimization task planning

Rebuilding Edge AI with Foundation Models

Foundation models, exemplified by large language models (LLMs), are increasingly being integrated into customer devices, such as Microsoft Copilot PCs, Apple Intelligence, Google Gemini Nano for Android, and the LLM for Honor Magic 6. These models are now embedded within operating systems as core services, enabling applications to deliver intuitive, seamless, and private AI experiences.

Beyond traditional devices, a growing number of emerging edge devices in areas such as autonomous driving, embodied AI, and healthcare are adopting on-device LLMs to understand and perform tasks interactively with users and the physical world, marking a new LLM-based Edge AI ecosystem.

This transition introduces a range of exciting challenges and research opportunities. Over the past two years, our team has been actively pioneering this space, achieving significant breakthroughs in on-device inference systems, novel AI accelerators, streaming video understanding, and multi-modality fusion. We believe there is immense potential to collaborate and push the boundaries of this ecosystem further.

Key directions for exploration include, but are not limited to:

  • Real-Time Understanding of the Physical World
  • Redesigning OS Services for LLMs, e.g., virtual memory and the file system
  • LLM-based Agents and Task Automation
  • Edge AI Inference Systems and Hardware

Efficient and Scalable LLM Inference 

In the realm of AGI, inference efficiency is pivotal, particularly with the advent of models such as o1 and o3 that leverage inference-time scaling laws. These laws present a game-changing approach to improving model performance by allocating more computational resources during the inference phase. As we move forward, a comprehensive approach integrating model, system, and hardware co-design becomes essential. This holistic method is key to reducing inference time and resource consumption, thereby making large language models (LLMs) more accessible and cost-effective.
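As a minimal sketch of the low-bit direction in the topics below, the following quantizes a weight tensor to symmetric per-tensor int4 codes. Production systems typically use per-channel or group-wise scales and fused low-bit kernels; this only conveys the idea.

```python
import torch

# Symmetric per-tensor int4 weight quantization: store small integer codes
# plus one floating-point scale, then dequantize for (or inside) the matmul.

def quantize_int4(w: torch.Tensor):
    qmax = 7                                   # symmetric int4 range [-7, 7]
    scale = w.abs().max() / qmax
    q = torch.clamp(torch.round(w / scale), -qmax, qmax).to(torch.int8)
    return q, scale                            # 4-bit codes + one fp scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q.to(torch.float32) * scale

w = torch.randn(256, 256)
q, s = quantize_int4(w)
err = (w - dequantize(q, s)).abs().mean().item()
print(f"mean abs quantization error: {err:.4f}")
```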

Topics:

  • Advancing low-bit quantization and sparsity algorithms
  • System/hardware design for low-bit and sparse models
  • Efficient long context processing (both long prefill and long decode)
  • Memory/hardware design for dynamic sparse inference (e.g., MoE and sparse attention)

Pretraining Large Language Models for Reasoning

Large language models (LLMs) can acquire general knowledge from huge text and code datasets through self-supervised learning and handle various tasks via their “emergent abilities”. However, these models still struggle with tasks that require strong reasoning skills, such as math word problems, competition-level programming problems, or physical-world tasks. Motivated by this, we propose “pretraining large language models for reasoning” as a research theme and call for collaborations with professors and researchers from the academic community. The goal is to develop cutting-edge language-centered models with strong reasoning abilities that can solve complex reasoning tasks. We will explore data selection, data attribution, data synthesis, learning algorithms, self-evolving strategies, reward models, test-time computing, and so on. From a research perspective, we hope the collaborations can lead to impactful research outcomes such as open-source models and papers with state-of-the-art results on the latest research-driven leaderboards.
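For the test-time computing topic below, a common baseline is best-of-n sampling against a reward model: spend more inference compute by drawing several candidate solutions and keeping the highest-scored one. Both `sample_solution` and `reward_model` here are hypothetical stand-ins for a real LLM and a learned verifier.

```python
import random

# Best-of-n test-time computing sketch with hypothetical stand-ins.

def sample_solution(problem: str) -> str:
    # Stand-in for sampling one chain-of-thought solution from an LLM.
    return f"candidate-{random.randint(0, 999)} for: {problem}"

def reward_model(problem: str, solution: str) -> float:
    # Stand-in for a learned reward model scoring reasoning quality.
    return random.random()

def best_of_n(problem: str, n: int = 16) -> str:
    candidates = [sample_solution(problem) for _ in range(n)]
    return max(candidates, key=lambda s: reward_model(problem, s))

print(best_of_n("If 3x + 5 = 20, what is x?"))
```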

Topics:

  • Data Selection for Pretraining/Post-training
  • Data Attribution
  • Reasoning-related Data Synthesis
  • Learning Algorithms
  • Self-Evolving Strategy for LLMs
  • Reward Model for Fine-Grained Reasoning Steps
  • Test-Time Computing for LLMs

Education and Lifelong Learning with AI

The integration of foundation models into educational and learning systems is transformative. To enhance true learning through critical thinking, the community of researchers, educators, and professionals interested in lifelong learning needs to address the potential benefits and challenges of integrating AI into these systems, building solutions that account for trust, transparency, personalized and collaborative learning, and multilingual and multicultural needs. Ongoing research and development will need to prioritize redefining collaboration frameworks between humans and AIs, bringing together scientific and sociotechnical research to uncover the future directions necessary to create an effective, ethical, and engaging environment.

Topics:

  • AI-Powered Educational Tools and Resources including scalability, adaptability, robustness, and equity.
  • Assessment and Evaluation addressing biases while ensuring high accuracy and diverse educational needs.
  • Data-Driven, Ethical, and Trustworthy Insights, providing personalized feedback and ensuring transparency, reliability, inclusivity, and accountability to balance technological benefits with ethical integrity.
  • Content Generation, personalizing learning materials and providing adaptive experiences while maintaining trust and effectiveness.
  • Interactive and Collaborative Learning to improve engagement and playfulness.
  • Multilingual and Multimodal Education in support of inclusivity, adaptability, linguistic and cultural contexts.
  • Agentic AI Education and Learning to help democratize AI, worldwide, using AIs in the classroom (e.g., children, college students) or outside the classroom (e.g., professionals, creators) to expand one’s knowledge and nurture intellectual curiosity.

AI-Driven Materials Research 

Materials research is a multifaceted and demanding field that requires significant time, effort, and resources. Inspired by recent advances in fields such as computer vision, natural language processing, and game playing, Artificial Intelligence (AI) has garnered substantial interest in materials research. AI-driven models have the potential to emulate and generate novel materials, revolutionizing the discovery and development of advanced materials, thereby accelerating the process and reducing costs. In this theme, we encourage researchers to collaborate and address significant challenges in materials research. We welcome research proposals that aim to enhance the materials discovery and development process by leveraging recent AI techniques.

Topics:

  • AI for novel material design
  • AI in material synthesis planning
  • Generative AI for defects and doping to enhance optical materials
  • AI for sustainable material design
  • AI for plasma facing materials

AI-Driven Energy Research 

Energy research is a multifaceted and demanding field that requires significant time, effort, and resources. Inspired by recent advances in fields such as computational modeling, artificial intelligence (AI), and quantum computing, the fields of energy storage and nuclear fusion have garnered substantial interest as potential solutions to the global energy crisis. AI-driven models have the potential to emulate and generate novel energy solutions, revolutionizing the discovery and development of advanced energy technologies, thereby accelerating the process and reducing costs. In this theme, we encourage researchers to collaborate and address significant challenges in energy research. We welcome research proposals that aim to enhance the energy discovery and development process by leveraging recent AI techniques.

Energy Storage: An Essential Component of Modern Energy Systems

The role of energy storage in modern energy systems cannot be overstated. As the world transitions from fossil fuels to renewable energy sources like solar and wind, the need for efficient energy storage solutions has become increasingly paramount.

AI for Novel Battery Design

AI techniques can significantly enhance the design and development of next-generation battery technologies. By employing machine learning algorithms and predictive models, researchers can identify new materials and chemistries that can improve battery performance, lifespan, and safety. AI can also optimize the manufacturing processes of batteries, reducing production costs and improving quality control.

AI in Battery Management Systems

Battery management systems (BMS) are critical for monitoring and controlling the performance of battery storage systems. AI-driven BMS can predict battery failures, optimize charging and discharging cycles, and enhance overall system efficiency. These intelligent systems can adapt to changing conditions and self-correct in real-time, ensuring the longevity and reliability of battery storage solutions.

Nuclear Fusion: The Holy Grail of Clean Energy

Nuclear fusion, the process that powers the sun, has long been considered the holy grail of clean energy. Unlike nuclear fission, which involves splitting atomic nuclei, fusion involves combining lighter nuclei to form a heavier nucleus, releasing vast amounts of energy in the process. Fusion offers the promise of virtually limitless, clean energy with minimal environmental impact.

AI for Fusion Reactor Design

Designing and optimizing fusion reactors is a complex and challenging task. AI-driven models can simulate various reactor configurations and predict their performance, enabling researchers to identify the most efficient and cost-effective designs. These models can also optimize the placement and configuration of magnetic fields, which are crucial for containing the hot plasma in fusion reactors.

AI in Plasma Control Systems

Controlling the plasma within a fusion reactor is one of the most significant challenges in nuclear fusion research. AI techniques can enhance plasma control systems by predicting plasma behavior and adjusting control parameters in real-time. This can improve the stability and confinement of the plasma, increasing the overall efficiency of the fusion process.

AI for Fusion Fuel Cycle Optimization

The fusion fuel cycle, which involves producing, processing, and recycling fusion fuels like deuterium and tritium, is another critical aspect of nuclear fusion research. AI-driven models can optimize each step of the fuel cycle, reducing costs and minimizing waste. These models can also predict and mitigate potential issues, ensuring a continuous and efficient supply of fusion fuels.

We encourage researchers to explore the following candidate topics within the realms of battery storage and nuclear fusion:

Topics:

  • AI for novel battery materials design
  • AI in battery synthesis planning
  • Generative AI for defect detection and doping enhancement in battery materials
  • AI for sustainable battery design
  • AI for plasma control in fusion reactors
  • AI-driven optimization of fusion reactor configurations
  • AI for fusion fuel cycle efficiency
  • AI for plasma control
  • AI for high-temperature superconducting magnets

Enhancing Performance in Large-Scale AI Systems 

As large-scale AI models continue to advance, effectively identifying and addressing system bottlenecks has become increasingly critical, particularly in heterogeneous AI systems. These systems, which integrate diverse hardware and network architectures, present complex challenges that demand systematic and scalable optimization strategies.

This collaboration aims to develop innovative methodologies for measuring, analyzing, and resolving bottlenecks in large-scale and heterogeneous AI infrastructures. By employing precision measurement techniques to diagnose performance issues and applying model-driven and system-level solutions, we seek to enhance resource utilization and overall system efficiency. Our ultimate goal is to minimize inefficiencies, accelerate AI training and inference, and enable seamless deployment in real-world applications.

Topics:

  • Bottleneck Identification and Analysis in Large-Scale AI Systems
  • Optimization for Heterogeneous AI Infrastructures

Smart Hardware for AI Systems 

The integration of smart, programmable hardware is transforming the landscape of AI-driven intelligent data centers. We are conducting research on smart hardware, including NPU (Network Processing Unit) and DPU (Data Processing Unit) technologies, which are currently under active development. This cutting-edge technology is poised to deliver substantial performance improvements, particularly in minimizing latency and accelerating data processing for AI workloads.

Beyond fundamental cloud services, our vision—collaborating closely with academic partners—focuses on harnessing Smart Devices to optimize AI training and inference. Together, we aim to pioneer the scalable design and implementation of programmable hardware for AI systems in modern data centers. By addressing practical challenges and unlocking new opportunities, we strive to redefine the performance and efficiency of AI-driven infrastructures.

Topics:

  • Next-Generation Programmable Hardware for AI Systems
  • Enhancing AI Workloads with Advanced Hardware Acceleration
  • Optimizing Network and Data Processing Efficiency for AI Applications

Leveraging LLMs for AI and Data Center Networks 

The common perception is that data centers are inherently scalable by design. However, as data centers continue to expand, a significant paradox emerges: human resources are not scalable, necessitating the use of LLMs to provide assistance in areas such as ACL, BGP, and even the dataplane. A key example is the overwhelming number of counters and states in the dataplane, making it difficult to determine whether a fault has occurred and to identify the root cause of the issue.

To address these challenges, LLMs can be utilized to create intelligent agents capable of analyzing states, resolving faults, and providing explanations for the root causes. This research aims to bridge the gap between academic exploration and industrial implementation, addressing pivotal resource management issues in data centers.
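A minimal sketch of this idea, assuming a hypothetical `llm_complete` endpoint and made-up counter values: summarize dataplane counter time series into a prompt and ask the model for a structured fault diagnosis.

```python
import json

# Sketch of LLM-assisted dataplane fault triage. `llm_complete` is a
# hypothetical stand-in; its canned JSON reply keeps the sketch runnable.

def llm_complete(prompt: str) -> str:
    # Replace with a real LLM call in practice.
    return '{"fault_suspected": true, "root_cause": "rising rx drops on eth7"}'

counters = {
    "eth7.rx_drops":   [0, 2, 40, 900],   # per-interval samples (illustrative)
    "eth7.crc_errors": [0, 0, 1, 3],
    "eth12.rx_drops":  [1, 0, 2, 1],
}

prompt = (
    "You are a data-center network assistant. Given these dataplane counter\n"
    "time series, say whether a fault is likely and the most probable root\n"
    "cause, as JSON with keys fault_suspected and root_cause.\n"
    f"Counters: {json.dumps(counters)}"
)

diagnosis = json.loads(llm_complete(prompt))
print(diagnosis["fault_suspected"], "-", diagnosis["root_cause"])
```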

Topics:

  • AI/LLM Applications in Networking
  • Fault Diagnosis in Data Centers
  • Automation in Data Centers

LLM-based log analysis and offline learning 

In the rapidly evolving landscape of communication technologies, understanding the intricate details of network performance and user behavior is paramount. This proposal outlines a project aimed at utilizing Large Language Models (LLMs) to analyze communication application logs. By leveraging the advanced capabilities of LLMs, we aim to uncover patterns, detect anomalies, and gain insights into state-of-the-art network features. This analysis will not only enhance our understanding of current network performance but also pave the way for future innovations in communication technologies.

Topics:

  • LLM-based network replay
  • LLM-based offline learning for congestion control

Advancing the Understanding and Measurement of Reasoning in Large Language Models (LLMs)

Large Language Models (LLMs) have rapidly evolved, achieving remarkable milestones over the past few years. These advances include improvements in language understanding, translation, summarization, and even creative writing. Despite their impressive performance, there remains a significant gap in our understanding of LLMs’ reasoning capabilities. Specifically, reasoning—a core cognitive function—is still poorly understood, both in terms of how it is exhibited by LLMs and how it can be accurately assessed. Fundamentally, questions such as What constitutes reasoning for LLMs?, How can we measure reasoning accurately?, and What approaches can enhance reasoning in these models? remain unanswered. These unresolved questions have critical implications for the development of more reliable and effective LLMs.

Research Objectives:

  1. Investigate the Nature of Reasoning in LLMs
  2. Establish Robust Measurement Frameworks
  3. Develop Techniques to Enhance Reasoning

Our work will aim to provide a deeper understanding of reasoning in LLMs, establish reliable benchmarks for assessment, and propose methods to advance reasoning abilities. Ultimately, this research aims to contribute to the development of more intelligent and trustworthy language models.

Evaluating and Enhancing Large Language Models for System and Networking Design

Large Language Models (LLMs) have achieved significant milestones in natural language processing, demonstrating capabilities in understanding, generating, and summarizing text with impressive fluency. These advances have been fueled by breakthroughs in model architectures, such as transformers, and innovations in pre-training and fine-tuning techniques. Yet, a crucial question remains: Are LLMs ready to contribute meaningfully to complex tasks such as system and networking design?

System design involves not just comprehending existing concepts but also applying structured reasoning and domain-specific knowledge to create new architectures. Current LLMs, while adept at generating plausible-sounding responses, often fall short in domains requiring precise technical understanding and the ability to synthesize new solutions. This proposal aims to explore whether LLMs are equipped to assist in system and networking design and, if not, how to optimize their use for this purpose.

Research Objectives:

  1. Assess the Current State of LLMs in System Design
  2. Investigate Use Cases Where LLMs Succeed or Fail
  3. Develop Strategies to Enhance LLMs’ Contribution to System Design

Our research will aim to clarify the extent to which LLMs can be relied upon for system and networking design tasks. We aim to produce a set of best practices for leveraging LLMs in these contexts, as well as recommendations for future model improvements to enhance their effectiveness in this critical domain.

Efficient Optimization for World Models

Recent world models such as UniSim, Genie, and DreamerV3 have demonstrated impressive results in robotics, video gaming, and autonomous driving by generating and predicting visual content through their understanding of physical and simulated environments. However, their practical deployment remains challenging due to their intensive computational requirements. These challenges are particularly pronounced in world modeling tasks that need to process large visual data streams and require responsive, high-resolution simulation and prediction under strict computational constraints.

Efficient optimization is crucial for the practical application of world models. This research aims to develop an efficient optimization framework that enables world models to operate effectively within limited computational resources. A primary challenge in processing dynamic scenes is balancing state representation fidelity with computational efficiency, especially in resource-constrained environments. This research focuses on creating optimization techniques that preserve state information while maintaining computational efficiency through compressed representations. The optimization strategies developed will enable world models to be deployed more effectively in real-time, interactive scenarios. These techniques can be extended to various applications, providing solutions for real-time simulation and decision-making in robotics, gaming, autonomous systems, and beyond.
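One way to read "compressed representations" here is a learned bottleneck over observations. Below is a toy PyTorch autoencoder sketch, with arbitrary dimensions, in which a world model would roll the small latent state forward instead of raw observations; the reconstruction loss makes the fidelity/efficiency trade-off explicit.

```python
import torch
import torch.nn as nn

# Toy state compressor: squeeze a high-dimensional observation into a small
# latent that is cheap to predict forward. Dimensions are illustrative.

class StateCompressor(nn.Module):
    def __init__(self, obs_dim=4096, latent_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, 512), nn.ReLU(),
                                     nn.Linear(512, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 512), nn.ReLU(),
                                     nn.Linear(512, obs_dim))

    def forward(self, obs):
        z = self.encoder(obs)              # compact state for fast rollout
        return self.decoder(z), z

model = StateCompressor()
obs = torch.randn(8, 4096)                 # batch of flattened observations
recon, z = model(obs)
loss = nn.functional.mse_loss(recon, obs)  # fidelity vs. compression trade-off
print(z.shape, loss.item())
```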

Artificial Intelligence for Specialized Domain

Artificial Intelligence (AI) for specialized domains aims to design AI technologies that meet the specific needs and challenges of particular industries or fields. Unlike general-purpose AI, which is designed to perform a wide range of tasks, specialized AI focuses on optimizing performance and delivering precise solutions within a defined context.

One of the key advantages of specialized AI is its ability to leverage domain-specific data and knowledge to achieve higher accuracy and efficiency. It involves several critical steps. First, it requires a deep understanding of the domain’s unique requirements and challenges. This includes identifying the specific tasks that AI will perform and the data it will use. Next, AI models are trained using domain-specific datasets, which may involve techniques such as continual pretraining, supervised learning, transfer learning, and reinforcement learning. These models are then fine-tuned and validated to ensure they meet the desired performance standards.

It requires close collaboration between AI researchers and domain experts. Domain experts provide the necessary insights and context, while AI researchers develop and refine the models. This partnership ensures that the AI solutions are not only technically sound but also practically relevant and effective.

In summary, AI for specialized domains represents a powerful tool for enhancing efficiency, accuracy, and innovation across various industries. By harnessing the strengths of AI and combining them with domain-specific expertise, we can create solutions that drive significant advancements and address complex challenges in a targeted and effective manner.

Topics:

  • Domain data enrichment and refinement
  • Specific domain knowledge continual learning
  • Model architecture exploration for specific domain
  • Model reasoning for specific domain

Data-driven AI-assisted Hardware Infrastructure Design

Recent rapid advancements in AI workloads, such as ChatGPT and DALL-E 3, have dramatically accelerated the transformation of cloud infrastructure because they impose markedly new computational requirements. There is therefore a strong and urgent need to build novel system tools that scale and speed up the evolution of AI infrastructure, improving its performance and efficiency alongside advances in AI applications. For example, cloud architects need to adeptly determine the viability of the cluster architecture, chip design, and hardware specifications of next-generation AI infrastructure for future AI models, based on a mixture of diverse objectives such as performance, power, efficiency, and cost.

This project aims to enable AI to co-evolve with AI infrastructure hardware design. We propose that by harnessing the interplay between AI infrastructure and AI, we can consistently generate high-quality data and effective methods for hardware performance prediction. This will aid cloud architects in accurately forecasting the performance trends of AI workloads on new hardware specifications, even without physical hardware access. The scope of this project encompasses performance predictions for cloud architecture and chip micro-architecture, as well as identifying hardware component requirements for targeted AI workloads. Based on our hardware copilot’s assessment, architects can make informed and unbiased decisions regarding new hardware specification requests to hardware vendors.
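As a baseline for the kind of prediction a data-driven simulator would refine, a simple roofline bound estimates whether a workload is compute- or bandwidth-limited on a candidate hardware spec. All numbers below are illustrative placeholders, not vendor figures.

```python
# Back-of-envelope roofline estimate: a lower bound on execution time from
# peak compute and peak memory bandwidth. A learned predictor would refine
# this with measured data across workloads and hardware generations.

def roofline_time(flops: float, bytes_moved: float,
                  peak_flops: float, peak_bw: float) -> float:
    """Lower-bound time: limited either by compute or by memory traffic."""
    return max(flops / peak_flops, bytes_moved / peak_bw)

# Hypothetical accelerator: 300 TFLOP/s peak compute, 2 TB/s memory bandwidth.
t = roofline_time(flops=1e12, bytes_moved=4e9,
                  peak_flops=300e12, peak_bw=2e12)
print(f"estimated lower-bound time: {t * 1e3:.3f} ms")  # compute-bound here
```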

Topics:

  • Data-driven High-fidelity Fast Microarchitecture Simulator
  • Data-driven AI-assisted Microarchitecture Hyperparameter Searcher

Multi-Agent Systems for AI Infrastructure 

LLM-based agents have demonstrated remarkable capabilities in many areas, achieving state-of-the-art performance. However, there remains a significant gap in their ability to handle complex tasks, such as production-level system design and development. Just as great projects in human history are the result of collective talent, super artificial intelligence will also require the collaboration of multiple AI agents.

Therefore, in this project, we will use AI infrastructure development as a case study to explore general agent-system design. The project aims to develop and implement cutting-edge agent systems for AI infrastructure, automating the entire process of AI infrastructure development and optimization.

Topics:

  • (Sys) Rethinking AI infrastructure systems with LLM-based agents
  • (Sys) Abstraction and interface design for LLM-based multi-agent systems
  • (PL & Sys) Planning language and formal verification for multi-agent systems
  • (AI & Sys) System planning capability for agent orchestration and collaboration
  • (AI) RL with system feedback for system planning improvement

Code Adjustment Agent 

Code adjustment is a universal task that software developers spend most of their time on. The following are three specific examples of code adjustment tasks. (1) Resolving broken software dependencies caused by external updates. For instance, upgrading to the latest CUDA or a new GPU often requires updating PyTorch implementations. (2) Environment-/workload-specific optimization. When the hardware and software environment changes, the code optimization strategy should be adjusted accordingly. For instance, the matrix multiplication algorithm should be changed when the GPU is upgraded from A100 to H100. (3) Adding new features that reuse existing implementations. For instance, adding FP4 precision support for a computing operator will reuse a large part of the existing code that added FP8 precision support.

The goal of this project is to design AI agents that dramatically improve the scalability of code adjustment. Code adjustment agents have a huge space of applicable scenarios, highlighting the importance of the related topics. Implementing such agents is challenging for three reasons. First, the agent must be knowledgeable about the specific context the codebase runs in, including the target hardware/software environments, the third-party libraries the codebase relies on, and the codebase's commit history. Second, existing in-context learning or RAG methods often fail to effectively encode such sophisticated knowledge, while fine-tuning methods make it difficult to update the knowledge continuously and strictly, which is critical because the knowledge keeps changing as the codebase evolves. Lastly, the agent should enable strict verification by human experts with low effort. This requires designing a well-defined human-agent interface that can evaluate and verify the entire code development process.

Topics:

  • Coding agents for writing quality unit tests
  • Memory management for coding agents
  • Automated definition of intermediate representations

AI Hardware Design Optimization Using LLMs for Tabular Data Analysis

AI-assisted hardware design is a promising frontier in the development of high-performance and efficient systems. The design of AI hardware is heavily influenced by workload performance metrics, which provide critical insights into how hardware can be optimized for various applications. These metrics are essential for understanding the demands placed on hardware by different AI workloads and for identifying opportunities to enhance performance and efficiency.

However, the sheer volume of workload performance data presents a significant challenge for human analysts. The data generated from various empirical studies and real-world applications is vast and complex, making it difficult to extract meaningful insights efficiently. Traditional analysis methods often fall short in handling this complexity, leading to potential delays and missed opportunities in hardware optimization. To address this problem, we propose leveraging LLMs to analyze tabular data containing workload performance metrics.

LLMs, with their advanced natural language processing capabilities, offer a powerful solution for processing and understanding large datasets. These models can uncover patterns, correlations, and trends that might be missed by traditional analysis methods. By utilizing LLMs, we aim to facilitate the design and optimization of AI hardware, ultimately enhancing performance and efficiency. The integration of LLMs into the analysis process will enable us to quickly and accurately extract insights from the data, driving more informed and effective hardware design decisions.
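A minimal sketch of the proposed pipeline, assuming a hypothetical `llm_complete` endpoint and invented metric values: serialize the performance table to text and prompt the model for a bottleneck analysis.

```python
import pandas as pd

# Sketch of LLM-based tabular workload analysis. The metric values are
# illustrative placeholders and `llm_complete` is a hypothetical stand-in.

def llm_complete(prompt: str) -> str:
    # Replace with a real LLM call; canned reply keeps the sketch runnable.
    return "Workload B looks memory-bandwidth-bound; prioritize HBM bandwidth."

metrics = pd.DataFrame({
    "workload":        ["A", "B", "C"],
    "tflops_achieved": [120.0, 35.0, 80.0],
    "hbm_gbps":        [900, 1900, 600],
    "gpu_util_pct":    [92, 48, 75],
})

prompt = (
    "Given these AI workload performance metrics, identify the main hardware\n"
    "bottleneck for each workload and suggest one design change:\n"
    + metrics.to_string(index=False)
)
print(llm_complete(prompt))
```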

Topics:

  • LLM-based Tabular Data Analysis
  • LLM-assisted Workload Performance Analysis

Reshaping the Future of “Asynchronous Tensor Processing Engine” assisted by LLM Reasoning (Project-focus)

GPU and high-speed interconnect technology have redefined what constitutes a new generation of supercomputers, and the ever-increasing capabilities of supercomputers further aid in the support and expansion of large-scale tasks (such as LLMs, diffusion models, databases, etc.). To optimally combine strong computation and communication, an intelligent upper-level scheduling framework is essential. Although numerous papers in recent years have discussed, from various angles, how to design faster scheduling frameworks, most of the work has limitations such as applicability only to specific structures, over-complicated designs, infeasibility on different downstream architectures, and high implementation costs, preventing these ideas from being widely adopted. Even current mainstream engines such as PyTorch/TensorFlow and frameworks such as Megatron/Fairseq/DeepSpeed still rely on “expert design” and “model-specific optimization”. In our collaboration, we hope to combine “traditional excellent asynchronous algorithms”, “new GPU architectures”, “diverse workloads (e.g. DNN/DB)”, and “LLM reasoning as assistant” to ultimately create a simple but amazing distributed asynchronous tensor processing engine that non-expert users in particular will find easy to use. Hopefully, it will reshape the next-generation “asynchronous tensor processing engine” (ATPE). This work is positioned primarily for industrial purposes, with papers as optional supporting proof.

Infrastructures it may target: Distributed GPUs and Unified-memory/Grace-style GPUs.

Workloads it may benefit: Large-scale DNN Inference / Database Processing / Storage and Remote IO / …

Topics it may consider: Prefetch / swapping / asynchronous and parallelism suggestion driven by LLM reasoning / …

Superintelligence Reasoning

The ability to perform complex, context-rich, and multi-step reasoning is emerging as a critical differentiator for next-generation AI systems. By pushing beyond narrow task boundaries, AI technologies that exhibit more sophisticated reasoning capabilities promise transformative impacts across diverse areas, from competition-level problem-solving and scientific discovery to complex decision-making and personalized education. However, achieving and systematically understanding these enhanced reasoning processes introduces new questions about how best to develop models that are both powerful and sufficiently transparent to foster trust and reliability.

Our goal is to conduct comprehensive research that deepens our understanding of how AI systems can acquire, refine, and apply complex reasoning. We aim to create robust, adaptive AI models that excel under increasing task demands and in unpredictable environments. This research aims to clarify how to scale reasoning to unprecedented levels of complexity and to foster breakthroughs that enable stronger generalization, more efficient learning, and greater overall resilience.

Topics:

  • Extended Chain-of-Thought Reasoning
  • Test-Time Scaling
  • Weak-to-Strong Generalization
  • Self-Improving Architectures
  • Scalable Oversight
  • Interpretability

Memory and RAG (Retrieval-Augmented Generation)

Retrieval-augmented generation (RAG) and long-term memory modeling have emerged as important directions to enhance the capabilities of AI systems by integrating external knowledge. This research proposal seeks to address key challenges and advance the state of the art in several critical areas. One focus is to develop evaluation methodologies that can accurately assess RAG systems across diverse scenarios and to investigate the factors involved in training efficient and effective RAG models. Another related area is long-term memory modeling in agentic AI systems, which aims to improve these systems' ability to store and utilize information over time, thereby enhancing their decision-making and adaptability. By addressing these interconnected topics, our research goal is to build trustworthy and grounded RAG systems capable of supporting a wide array of applications.
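For orientation, a minimal RAG loop looks like the sketch below: retrieve the best-supporting document for a query, then condition generation on it. The toy retriever uses word overlap and `generate` is a hypothetical stand-in; real systems use dense embeddings and a real LLM, and evaluating them robustly is precisely the open problem described above.

```python
# Minimal RAG sketch. Retrieval here is toy word overlap; production systems
# use dense embeddings. `generate` is a hypothetical stand-in for an LLM.

DOCS = [
    "The program funds 25-30 joint projects in total.",
    "Dispatched researchers spend 180 days at the host lab.",
]

def overlap(query: str, doc: str) -> float:
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q | d)             # Jaccard similarity

def generate(prompt: str) -> str:
    # Replace with a real LLM call; echoes the prompt so the sketch runs.
    return f"[answer grounded in -> {prompt}]"

def rag_answer(query: str) -> str:
    context = max(DOCS, key=lambda d: overlap(query, d))
    return generate(f"Context: {context}\nQuestion: {query}")

print(rag_answer("How many days is the dispatch period?"))
```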

Topics:

  • Robust evaluation of RAG systems
  • Building trustworthy and grounded RAG models in Enterprise applications
  • Long-term memory modeling in agentic AI systems
  • Long-context modeling of foundation models

Automated GUI Agents for OS-Level Intelligence

The burgeoning field of Artificial Intelligence presents significant opportunities to revolutionize human-computer interaction beyond the confines of web browsers. This call for proposals seeks innovative research focused on automatically collecting rich, Operating System (OS)-level Graphical User Interface (GUI) interaction data to train advanced Large Language Models (LLMs) and Multimodal Language Models (MLLMs). Moving beyond current web-centric GUI agents, this initiative aims to develop AI capable of understanding and automating general computer usage, encompassing application interaction, file management, system settings, and OS navigation. By integrating expertise in AI agent design, virtualization technologies, and machine learning, we can pave the way for a new generation of intelligent computer assistants. Our goal is to foster comprehensive research that will enable the development of AI systems deeply integrated with the operating system, leading to enhanced user experiences and powerful automation capabilities.
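One possible shape for such a collection pipeline, sketched with the `mss` and `pynput` libraries: log each mouse click together with a synchronized screenshot. A real pipeline would also capture keyboard input, application state, and OS events, and would run inside a sandboxed virtual machine.

```python
import time
import mss
import mss.tools                 # pip install mss
from pynput import mouse         # pip install pynput

# Sketch of multimodal OS-interaction logging: each mouse click is recorded
# together with a screenshot of the primary monitor.

events = []

def on_click(x, y, button, pressed):
    if pressed:
        with mss.mss() as sct:
            shot = sct.grab(sct.monitors[1])          # primary monitor
            path = f"frame_{int(time.time() * 1000)}.png"
            mss.tools.to_png(shot.rgb, shot.size, output=path)
        events.append({"t": time.time(), "action": "click",
                       "pos": (x, y), "screenshot": path})

with mouse.Listener(on_click=on_click) as listener:
    listener.join()              # stop with Ctrl+C (or listener.stop())
```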

Topics:

  • Development of Autonomous OS-Level GUI Agents: Designing intelligent agents capable of navigating and interacting with diverse desktop applications and operating system interfaces within virtualized environments.
  • Multimodal Data Collection Methodologies for OS Interactions: Creating robust techniques for capturing synchronized streams of GUI actions (mouse clicks, keyboard inputs), visual context (screenshots/video frames), application states, and OS events.
  • Virtualization Strategies for Scalable and Safe Data Generation: Exploring and optimizing virtualization techniques to enable the safe, reproducible, and parallelized collection of OS-level interaction data.
  • LLM/MLLM Training for General Computer Usage Understanding: Investigating novel approaches to train LLMs/MLLMs on the collected OS-level interaction data, enabling them to understand and generate sequences of actions for complex computer tasks.
  • Evaluation of AI Agents in Diverse OS Tasks and Applications: Developing robust evaluation metrics and methodologies to assess the performance and generalizability of trained AI agents across a range of OS-level tasks and software applications.

Interpretable LLM for Reasoning

Large Language Models (LLMs) have demonstrated remarkable capabilities in performing complex reasoning tasks. Despite their impressive performance, the mechanisms underlying their reasoning processes remain opaque. There is a growing need for explainable and interpretable LLMs to ensure their safe, reliable, and effective deployment in critical applications. Understanding how LLMs perform reasoning, the mechanisms driving these processes, and the interplay between memory and exploration during reasoning can provide valuable insights into enhancing model transparency and functionality.
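As a concrete starting point for the topics below, the sketch extracts per-layer attention maps from GPT-2 with the Hugging Face `transformers` library and inspects where the final token attends in a tiny arithmetic prompt; interpreting such maps rigorously is the open part of this theme.

```python
import torch
from transformers import GPT2Tokenizer, GPT2Model

# Extract attention maps from GPT-2 and print where the last token attends.

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2")

inputs = tokenizer("Two plus three equals", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_attentions=True)

# out.attentions: one (batch, heads, seq, seq) tensor per layer
last_layer = out.attentions[-1][0]          # (heads, seq, seq)
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
attn_from_last = last_layer.mean(0)[-1]     # avg heads, last token's view
for tok, w in zip(tokens, attn_from_last.tolist()):
    print(f"{tok:>10s}  {w:.3f}")
```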

Topics:

  • Mechanism of Reasoning: Investigate how LLMs execute reasoning tasks, including the role of attention mechanisms, transformer architectures, and token representations.
  • Memory vs. Exploration: Analyze the relationship between memory and exploration during reasoning, identifying how prior knowledge is retrieved and integrated with new information.
  • Explainability and Interpretability: Develop methods to visualize and interpret the reasoning pathways of LLMs, ensuring their outputs can be understood and validated by humans.
  • Benchmarks and Evaluation: Create benchmarks to systematically evaluate the reasoning capabilities and interpretability of LLMs, and identify the remaining weaknesses in current LLMs.

– Funding period: March 1, 2025 – February 28, 2026

– Funding details: project expenses (to be confirmed separately for each selected project) – KRW 90 million to 180 million in total

  • Government portion: to be confirmed separately for each selected project (KRW 80 million – 170 million)
  • Corporate portion: to be confirmed separately for each selected project (KRW 9 million – 10 million)

* Calculation, use, etc. of the government-portion project funds follow the ICT and Broadcasting R&D Management Regulations (further guidance to be provided by IITP)

* The corporate portion follows a separate contract for the corporate project.

– Selection review: written expert review by Microsoft Research

* Review results will be communicated individually to selected projects only and will not be made public

* Korean universities selected as joint research institutions for a project must complete the procedures required to carry out a government project, such as concluding an agreement with IITP (each university signs an agreement as a joint research institution)

 

B. Joint Researcher Dispatch

  • Joint researchers: selected through a separate review
  • Dispatch period: 180 days (planned between July 2025 and February 2026; the period may change depending on the regulations of the host country at the time of dispatch)
  • Host institution: Microsoft Research (depending on the research area, dispatch to Beijing, Shanghai, or other locations)

 

  4. Eligibility

Each project team consists of students (2-5) and an advising professor

Students: full-time master's or doctoral students currently enrolled in an IT-related graduate program in Korea

* Korean nationals only (students on leave of absence and postdoctoral researchers are excluded)

Professors: full-time faculty members in IT-related departments in Korea who can lead the project and supervise the students' research throughout the funding period

 

  5. Application Process

Call for project proposals -> Proposal submission (online submission, application form, proposal written 100% in English) -> Selection review -> Notification of selected projects -> Agreement signing and payment of project funds

* In the application form, prepare the budget based on the corporate portion of KRW 10 million. Guidance on the government portion will be provided separately after notification of selection.

 

  1. 신청 유의사항

프로젝트팀은 총1개 분야에 한해 신청할 수 있음. 그러나 특수한 경우, 주관연구개발기관 연구책임자(마이크로소프트연구소, 이미란 전무)의 승인 하에 최대 2개 분야에 신청할 수 있도록 함.

신청자격에 부합하지 않을 경우 선정심사 대상에서 제외될 수 있음

* 국가연구개발사업에 참여제한 중인 자와 기관은 신청 불가

 

  7. How to Apply

How to apply: by email ([email protected])

Application deadline: Friday, February 21, 2025, 17:00

* Submitted documents will not be returned.

 

  8. Contact

Program contact: Miran Lee, Executive Director, Microsoft Research (010-3600-4226, [email protected])
