I will focus on deep learning, agents, multi-agent systems, RAG, and memory

China

I speak English

AI developer and researcher

I am an AI engineer and researcher specializing in deep learning, large language models, multimodal AI, diffusion models, Mamba-based architectures, Agentic AI, reinforcement learning, RAG, and embodi...
About this Gig

## Innovative Design and Improvement Guidance for Agentic RL and LLM Reinforcement Learning


LLMs are gradually evolving from single-turn Q&A machines into agentic systems that alternate between reasoning and external tool use over multiple turns. From Search-R1 to ToolRL and SkyRL, models now need not only to think, but also to search, calculate, call APIs, and continuously improve through RL across multi-step trajectories.


 ## 1. Innovative Design Improvements for Agentic RL Algorithms


 ### 1.1 Hierarchical Reinforcement Learning Architecture


A hierarchical decision-making mechanism divides an agent's decisions into three levels: a strategic layer for task decomposition, a tactical layer for tool selection, and an execution layer for concrete operations. Each layer adopts a different RL policy.
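
The three-level split can be illustrated with a minimal sketch. It assumes each layer is a simple tabular epsilon-greedy policy with its own exploration rate and learning rate, and that all three layers receive the same scalar reward. The names `TabularPolicy` and `HierarchicalAgent`, and the hyperparameter values, are hypothetical and chosen only for illustration; a real system would likely use LLM-based policies and per-layer reward shaping.

```python
import random
from collections import defaultdict


class TabularPolicy:
    """Epsilon-greedy policy over a fixed action set with a running value table."""

    def __init__(self, actions, epsilon=0.1, lr=0.5, seed=0):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.lr = lr
        self.rng = random.Random(seed)
        self.q = defaultdict(float)  # (state, action) -> estimated value

    def act(self, state):
        # Explore with probability epsilon, otherwise pick the best-known action.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward):
        # Move the estimate a fraction lr toward the observed reward.
        key = (state, action)
        self.q[key] += self.lr * (reward - self.q[key])


class HierarchicalAgent:
    """Three stacked policies; each layer conditions on the choice of the layer above."""

    def __init__(self, subtasks, tools, operations):
        # Assumption: higher layers explore less and learn more slowly.
        self.strategic = TabularPolicy(subtasks, epsilon=0.05, lr=0.1, seed=1)
        self.tactical = TabularPolicy(tools, epsilon=0.1, lr=0.3, seed=2)
        self.execution = TabularPolicy(operations, epsilon=0.2, lr=0.5, seed=3)

    def step(self, task):
        subtask = self.strategic.act(task)                # task decomposition
        tool = self.tactical.act(subtask)                 # tool selection
        operation = self.execution.act((subtask, tool))   # concrete operation
        return subtask, tool, operation

    def learn(self, task, decision, reward):
        subtask, tool, operation = decision
        self.strategic.update(task, subtask, reward)
        self.tactical.update(subtask, tool, reward)
        self.execution.update((subtask, tool), operation, reward)
```

Calling `agent.step(task)` returns one decision per layer, and `agent.learn(task, decision, reward)` propagates the episode reward to all three value tables.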


Automatic sub-goal discovery allows agents to identify reusable intermediate sub-goals during training and collect them into a skill library.
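
One possible way to sketch this is a frequency heuristic: intermediate states that recur across most successful trajectories are treated as sub-goals, and the shortest action sequence observed to reach each one is stored as a skill. This heuristic, the trajectory format (a dict with `states`, `actions`, and `success`, where `states[i]` is the state reached after `actions[:i]`), and the function names are assumptions made for illustration, not a specific published method.

```python
from collections import Counter


def discover_subgoals(trajectories, min_support=0.6):
    """Return intermediate states seen in at least min_support of successful runs."""
    successful = [t for t in trajectories if t["success"]]
    if not successful:
        return []
    counts = Counter()
    for traj in successful:
        # Count each state once per trajectory; skip the start and final states.
        counts.update(set(traj["states"][1:-1]))
    threshold = min_support * len(successful)
    return sorted(s for s, c in counts.items() if c >= threshold)


def build_skill_library(trajectories, subgoals):
    """Map each sub-goal to the shortest action sequence observed reaching it."""
    library = {}
    for traj in trajectories:
        if not traj["success"]:
            continue
        for i, state in enumerate(traj["states"][1:], start=1):
            if state not in subgoals:
                continue
            actions = traj["actions"][:i]  # actions taken before reaching this state
            if state not in library or len(actions) < len(library[state]):
                library[state] = actions
    return library
```

The `min_support` threshold trades off library size against reusability: a lower value admits more sub-goals, but each one is backed by fewer successful trajectories.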


Automated curriculum learning lets agents progress autonomously from simple tasks to complex tasks without a manually designed curriculum.
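
As a rough sketch of how such a curriculum could be automated, the sampler below tracks a recent success rate per difficulty level and prefers levels where the agent succeeds about half the time, so mastered levels fade out and harder ones take over as training progresses. The weighting rule `p * (1 - p)`, the class name `AutoCurriculum`, and the window and floor values are assumptions for illustration; other signals, such as learning progress, are equally plausible.

```python
import random
from collections import deque


class AutoCurriculum:
    """Samples task levels in proportion to how informative they currently are."""

    def __init__(self, levels, window=20, seed=0):
        self.levels = list(levels)
        # Keep only the most recent outcomes per level.
        self.history = {lvl: deque(maxlen=window) for lvl in self.levels}
        self.rng = random.Random(seed)

    def success_rate(self, level):
        h = self.history[level]
        # Unseen levels default to 0.5 so they get tried early.
        return sum(h) / len(h) if h else 0.5

    def weights(self):
        # p * (1 - p) peaks at p = 0.5; the floor keeps every level reachable.
        rates = [self.success_rate(lvl) for lvl in self.levels]
        return [max(p * (1 - p), 0.01) for p in rates]

    def sample(self):
        return self.rng.choices(self.levels, weights=self.weights(), k=1)[0]

    def record(self, level, success):
        self.history[level].append(1 if success else 0)
```

In a training loop, each episode would call `sample()` to pick a level, run the agent on a task of that level, and then call `record(level, success)` with the outcome.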


 ### 1.2 Multimodal Environment Interaction

Programming Language:

Python

JavaScript

LISP

PyTorch

TypeScript

AI Model Frameworks & Tools:

Hugging Face Transformers

Data Type:

Text

Images

Tabular Data

AI Engine:

GPT
