Speakers Introduction
Key Discussion Points
Core Issue 1: When Will Humanoid Robots Surpass Humans?
1. Robots already exceed humans at specialized movements, but gaps remain in perception and dexterous hand manipulation, requiring coordinated progress in both hardware and software. (Ding Gang)
2. Full-body manufacturing technology meets standards, yet balancing affordability and cost remains critical. Breakthroughs in dexterous hands must address flexibility, battery life, and durability. (Zhuang Ziwen, Ding Gang)
3. Clear application scenarios enable faster adoption, whereas general control algorithms require longer development cycles. Generalization capability depends heavily on data acquisition and scaling. (Zhuang Ziwen, Su Zhi)
4. Initial implementation is possible within 2–3 years after hardware breakthroughs, with conservative estimates pointing to tangible progress within 5 years. Widespread generalization may take 5–10 years. (Ding Gang, Su Zhi, Hou Taixian)
Core Issue 2: Will Future Robot Motion Perception Methods Converge with or Diverge from Human Approaches?
1. Human perception carries evolutionary "legacy traits", while robots employ diverse sensors; humans achieve few-shot learning through genetic endowment, whereas robots require massive simulation data. (Zhuang Ziwen)
2. Reinforcement learning shares underlying logic with human learning; robot sensors may evolve toward "human-like dual RGB cameras"; morphological alignment with humans can enhance environmental and data compatibility. (Hou Taixian, Ding Gang)
3. Robots may follow a "pre-training + efficient RL" path, breaking away from "fixed pre-trained models" to achieve true "acquired learning." (Su Zhi, Zhang Xiaobai)
Core Issue 3: Research and Practice in Online Learning Frameworks for Robot Motion Algorithms
1. High iteration costs limit validation to simple scenarios; balance-critical tasks are "unaffordable to fail" due to hardware fragility and stringent sample efficiency requirements. (Zhuang Ziwen, Ding Gang)
2. Early approaches of "real-world first, simulation later" remain relevant with the rise of parallelized engines and accessible tools; core methods still apply, but require "foundation model + real-world fine-tuning." (Hou Taixian, Ding Gang)
3. Foundational control models are needed to reduce costs; online reinforcement learning requires foundation models to improve efficiency, though technical routes have not yet converged. (Zhuang Ziwen, Su Zhi)
4. Mass production may revive "real-device online learning" to accommodate personalized needs. (Hou Taixian)
Core Issue 4: Research and Exploration in Humanoid Robot Perception Capabilities
1. Perception-decision-control requires layered design: complex scenarios suffer from insufficient decision-making, while simple scenarios allow perception to directly supply control parameters. (Zhuang Ziwen)
2. Simple scenarios rely on system collaboration, whereas complex scenarios demand optimized perception performance. (Zhuang Ziwen)
3. SLAM models are large and computationally intensive, making edge deployment challenging; traditional SLAM performs poorly in dynamic environments. (Hou Taixian, Zhang Xiaobai)
4. "VLM + control" transfers underlying motion skills; borrows biological logic of "capturing local key information"; uses hierarchical networks for navigation decisions. (Ding Gang, Zhuang Ziwen, Zhang Xiaobai)
Core Issue 5: Can We Develop Universal Motion Algorithms for Similar Robot Morphologies That Are Directly Deployable Across Platforms?
1. "The 'Cerebrum' Can Be Shared, the 'Cerebellum' Cannot": A universal high-level planner ("cerebrum") combined with platform-specific low-level controllers ("cerebellum") can achieve indirect cross-platform deployment. (Ding Gang)
2. Training Frameworks Can Cross Humanoid Platforms; Quadrupeds and Bipeds are Difficult: A shared training framework is feasible for humanoid robots, but cross-platform application between quadrupeds and bipeds is challenging, with the prerequisite of "similar body proportions." (Zhuang Ziwen, Zhang Xiaobai)
3. "Pre-planned Trajectories + Inverse Kinematics" for Legged Robots Limits Freedom; "Teacher-Student Distillation" Enables Cross-Morphology with Fine-tuning, Possibly at a Performance Cost: Pre-defined trajectories adapted via IK work for robots like robotic dogs but limit motion freedom. Knowledge distillation from a "teacher" model enables adaptation across morphologies, though it requires fine-tuning and might sacrifice some performance. (Hou Taixian, Su Zhi)
4. Morphology Extension is a Form of Cross-Platform; "Shared Lower Layers + Branched Upper Layers" Architecture Adapts to Different Forms: Expanding a robot's own capabilities (e.g., adding an arm) is a cross-platform problem. An architecture with shared lower-level layers and branched upper-level controllers can suit different morphologies. (Su Zhi, Hou Taixian)
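Point 3 above mentions adapting pre-planned trajectories through inverse kinematics. The following is only a minimal sketch of that idea for a planar two-link leg, not any speaker's implementation. The link lengths, the half-sine swing trajectory, and the names `leg_ik`, `leg_fk`, and `swing_foot_target` are all assumptions made for illustration.

```python
import math

# Hypothetical link lengths in meters (not taken from any real robot).
THIGH = 0.15
SHANK = 0.15


def leg_ik(x, z, l1=THIGH, l2=SHANK):
    """Return (hip, knee) angles in radians that place the foot at (x, z).

    The hip is the origin, x points forward, z points up (so the foot has
    z < 0). Angles are measured from the straight-down direction; knee = 0
    means a fully extended leg.
    """
    d2 = x * x + z * z
    d = math.sqrt(d2)
    if d > l1 + l2 or d < abs(l1 - l2):
        raise ValueError("foot target is outside the leg's reachable range")
    cos_knee = (d2 - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    knee = math.acos(max(-1.0, min(1.0, cos_knee)))
    hip = math.atan2(x, -z) - math.atan2(l2 * math.sin(knee),
                                         l1 + l2 * math.cos(knee))
    return hip, knee


def leg_fk(hip, knee, l1=THIGH, l2=SHANK):
    """Forward kinematics for the same leg, used here to check the IK."""
    x = l1 * math.sin(hip) + l2 * math.sin(hip + knee)
    z = -l1 * math.cos(hip) - l2 * math.cos(hip + knee)
    return x, z


def swing_foot_target(s):
    """Pre-planned swing trajectory: s in [0, 1] maps to a foot position.

    The foot moves 10 cm forward while lifting up to 4 cm (assumed values).
    """
    x = -0.05 + 0.10 * s
    z = -0.25 + 0.04 * math.sin(math.pi * s)
    return x, z


# Convert the fixed Cartesian trajectory into joint-angle targets.
joint_targets = [leg_ik(*swing_foot_target(i / 10)) for i in range(11)]
```

Because the foot path is fixed ahead of time, moving to a different leg only changes the link lengths passed to the IK. The same property is what limits motion freedom: the leg can only do what the pre-planned path allows.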

► Origin of the Roundtable: Why Focus on "Humanoid Robots' Ultimate Capabilities and Implementation Pathways"?
► Core Issue 1: When Will Humanoid Robots Surpass Humans?
► Core Issue 2: Will Future Robot Motion Perception Methods Converge with or Diverge from Human Approaches?
► Core Issue 3: Research and Practice in Online Learning Frameworks for Robot Motion Algorithms
► Core Issue 4: Research and Exploration in Humanoid Robot Perception Capabilities
Following the previous discussion, I would like to further explore a question: In scenarios like parkour, does the robot's policy inherently contain certain decision-making capabilities?
Taking the example of a robotic dog crossing boxes:
- When the distance between two boxes is 60 cm, the robotic dog directly uses visual perception to assess the distance, processes environmental information through a GRU, and adopts a "direct jump" action strategy.
- However, when the distance is adjusted to 70 cm, I observed a key phenomenon: the underlying policy chooses to descend from the first box onto a step, land with its hind legs, and then place its front legs on the second box.
This resembles an end-to-end problem, suggesting that the robot’s policy may have embedded certain decision-making capabilities directly into the control layer.
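To make the structure of such a policy concrete, here is a minimal sketch of a GRU-based controller in plain Python. It is not the policy described above: the weights are random placeholders rather than trained values, biases are omitted, and the observation layout, dimensions, and the name `GRUPolicy` are assumptions. The point it illustrates is that the code contains no explicit "jump or step down" branch; any such choice would have to live in the learned weights and the recurrent state.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)]
            for _ in range(rows)]


class GRUPolicy:
    """Single GRU cell plus a linear action head (untrained placeholder)."""

    def __init__(self, obs_dim, hidden_dim, act_dim, seed=0):
        rng = random.Random(seed)
        in_dim = obs_dim + hidden_dim
        self.hidden_dim = hidden_dim
        self.w_update = rand_matrix(hidden_dim, in_dim, rng)
        self.w_reset = rand_matrix(hidden_dim, in_dim, rng)
        self.w_cand = rand_matrix(hidden_dim, in_dim, rng)
        self.w_out = rand_matrix(act_dim, hidden_dim, rng)

    def initial_state(self):
        return [0.0] * self.hidden_dim

    def step(self, obs, h):
        """Map one observation and the previous hidden state to an action."""
        x = list(obs) + list(h)
        z = [sigmoid(v) for v in matvec(self.w_update, x)]  # update gate
        r = [sigmoid(v) for v in matvec(self.w_reset, x)]   # reset gate
        x_reset = list(obs) + [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(v) for v in matvec(self.w_cand, x_reset)]
        h_new = [(1.0 - zi) * hi + zi * ci
                 for zi, hi, ci in zip(z, h, cand)]
        # Joint targets squashed to [-1, 1]; no if/else on the gap distance.
        action = [math.tanh(v) for v in matvec(self.w_out, h_new)]
        return action, h_new


# Hypothetical observation: [perceived gap in meters, body pitch, body roll].
policy = GRUPolicy(obs_dim=3, hidden_dim=8, act_dim=12)
h = policy.initial_state()
action_60, h = policy.step([0.60, 0.0, 0.0], h)
action_70, _ = policy.step([0.70, 0.0, 0.0], policy.initial_state())
```

In a trained policy, a change in the perceived gap would shift the hidden state and therefore the joint targets, which is one way a different behavior could emerge without a separate decision module.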
► Core Issue 5: Can We Develop Universal Motion Algorithms for Similar Robot Morphologies That Are Directly Deployable Across Platforms?
This topic greatly interests me, as cross-embodiment research is primarily conducted by research institutes and universities. During my time at BAAI, all embodied intelligence departments were focused on cross-embodiment. My view is that the "cerebrum" can be transferred across embodiments, but the "cerebellum" cannot.
Here's a simple example: if your consciousness were transferred to a body with a significantly different physique, I believe your knowledge could be carried over. If you think you could immediately control that body to walk, then you believe cross-embodiment is feasible. But if you feel it would take time to adapt, then the conclusion is that it cannot be fully transferred.
Thus, our technical approach centered on developing a general "cerebrum" while creating separate control algorithms for different embodiments. As long as knowledge can be transmitted through the "cerebrum," we can indirectly achieve cross-embodiment capability. That is my perspective.
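The split described here can be sketched as a shared planner that emits an embodiment-agnostic command, plus one controller per body that turns that command into joint targets. This is only an illustration under stated assumptions: the class names, the `BodyCommand` fields, the joint counts, and the arithmetic inside each controller are hypothetical placeholders, not the approach actually used at BAAI.

```python
from dataclasses import dataclass
from typing import List, Protocol


@dataclass(frozen=True)
class BodyCommand:
    """Embodiment-agnostic command passed from "cerebrum" to "cerebellum"."""
    forward_speed: float  # m/s
    turn_rate: float      # rad/s


class Cerebellum(Protocol):
    """Interface every platform-specific low-level controller implements."""

    def track(self, cmd: BodyCommand) -> List[float]:
        ...


class SharedCerebrum:
    """High-level planner shared by all embodiments (placeholder logic)."""

    def plan(self, goal_distance_m: float, goal_bearing_rad: float) -> BodyCommand:
        speed = max(0.0, min(1.0, goal_distance_m))  # clamp to 0..1 m/s
        return BodyCommand(forward_speed=speed, turn_rate=0.5 * goal_bearing_rad)


class QuadrupedCerebellum:
    NUM_JOINTS = 12  # assumed joint count

    def track(self, cmd: BodyCommand) -> List[float]:
        # Stand-in for a controller trained specifically for this body.
        return [0.10 * cmd.forward_speed + 0.02 * cmd.turn_rate] * self.NUM_JOINTS


class BipedCerebellum:
    NUM_JOINTS = 10  # assumed joint count

    def track(self, cmd: BodyCommand) -> List[float]:
        # A different body needs its own mapping from command to joints.
        return [0.05 * cmd.forward_speed - 0.01 * cmd.turn_rate] * self.NUM_JOINTS


def run_step(brain: SharedCerebrum, body: Cerebellum,
             goal_distance_m: float, goal_bearing_rad: float) -> List[float]:
    """One control step: shared planning, then body-specific tracking."""
    return body.track(brain.plan(goal_distance_m, goal_bearing_rad))


brain = SharedCerebrum()
quad_targets = run_step(brain, QuadrupedCerebellum(), 2.0, 0.2)
biped_targets = run_step(brain, BipedCerebellum(), 2.0, 0.2)
```

Both bodies receive the same command from the same planner, and only the `track` implementation differs. Swapping in a new embodiment means writing a new controller while leaving the planner untouched.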
► Roundtable Summary
This roundtable discussion centered on "Future Directions and Technical Pathways for Humanoid Robot Motion and Perception." Five participants examined the topic from the perspectives of technology, application scenarios, and development roadmaps. The exchanges covered core technologies such as hardware iteration and algorithmic breakthroughs, timeline predictions for deployment in simple scenarios and for achieving generalization, and both the industry consensus and the unresolved challenges in key areas such as perception-decision coordination and cross-embodiment adaptation. These insights are provided for reference by professionals both inside and outside the industry.
Moving forward, HighTorque Robotics will host more exchange events for academics and developers. Beyond supplying reliable hardware, we are committed to building a vibrant ecosystem platform for academic and development collaboration. Scholars and developers interested in cooperation and exchange are welcome to contact us via WeChat: dionysuslearning.