This article was written by Alberto Rodriguez, Director of Robot Behavior for Atlas, Shane Rozen-Levy, Research Engineer, and Vinay Kamidi, Research Engineer.

This humanoid robot is unlike anything you’ve seen before. There are things that are obvious in our latest video: our Atlas® robot rotates its torso 180 degrees, squats down to lift a mini-fridge, and carries it to a lounging engineer. There are nuances that are less obvious—the robot’s full use of its arms, legs, and torso to manage a lift that a person would struggle with—and ones that don’t show up on camera at all—the speed of development and fidelity of the behavior.

It’s a novel sight for sure, but why did we do it?


Our other robots are built to automate the most taxing work. Our Stretch® robot autonomously unloads trucks full of 23-kilogram (50-pound) boxes in triple-digit temperatures. Our Spot® robot undertakes the same inspection route every day to take the same measurements at the exact same time, catching the earliest signs of trouble on the factory floor. These jobs are tedious, but they require the close attention to detail that Stretch and Spot provide every day.

Atlas targets a very broad set of capabilities across factories, warehouses, and construction sites that require high levels of strength, endurance, and dexterity. We are building Atlas as a general-purpose tool for physical work. Attaining the performance and reliability that real environments demand requires leaps in capability in both hardware and behavior.

This sequence is a deliberate experiment that shows important advances on both those fronts. Over the span of just a few weeks after Atlas’ public debut in January, we demonstrated unprecedented performance for a humanoid robot’s strength, mobility, and whole-body control. Read on to learn how we train Atlas, how we bring up a new platform, and why this research is groundbreaking.

Physical Intelligence for the Real World

In the last few years, we’ve seen a fundamental transition to behavior architectures fueled by demonstration data, with an emerging capacity for generalization. This is essential to delivering on the promise of humanoids: being adaptable, quick to learn, and easily retasked. We’ve shown these architectures can drive the behavior of not just table-mounted arms, but also complete humanoids on real-world tasks.

While they produce capable behavior, today’s dominant state-of-the-art approaches also carry some limitations: they rely too heavily on continuous camera feedback, not just to understand the world but also to guide control loops; they interact with the world through a very limited set of the robot’s surfaces, mostly the fingers and often just the fingertips; and they are almost exclusively focused on lightweight tasks.

Real work, especially the back-breaking kind, requires a broadening of what we mean by physical intelligence. When we carry objects, we use any surface of our body to shoulder loads, and we adapt to their shape, mass, and rigidity through haptic sensations.

You cannot lift a fridge just by looking at it and using your hands. You have to prepare for it: anticipate the weight, lean into it, and let your body do the work of conforming to its shape, adapting to its weight, and testing whether you’ll be able to lift it. The actual work happens during interaction. Humanoids should be able to carry boxes between their forearms and biceps, use their knees to lift a heavy object from the floor to their quads, and throw a long, heavy object onto their shoulders, just as they should be able to bear-hug a fridge.

Atlas uses reinforcement learning (RL) to learn how to lift a fridge by practicing the move with an absurdly large number of variations of the fridge in simulation. The hardest part is not seeing the fridge or knowing how to lift it, but learning to adapt to whatever version of the fridge Atlas encounters in the real world. This is a combined control and perception problem, where perception is done implicitly from body proprioception. The policy driving the behavior has learned to adapt to variations like the location of the fridge, its mass, the amount of grip on the ground and on the fridge, or the configuration in which the fridge settles between the torso, arms, and hands. That level of adaptation is one of the most fundamental building blocks of physical intelligence.

A Robot to Bear the Load

The hardware we are presenting today is also unique. This generation of Atlas is in its own league, not just because it has been designed for the mobility and strength required for real work, but because it comes with the simplicity and reliability required for mass scale. There is clear value in the humanoid form factor, but you can also squeeze in a lot of performance and efficiency with some strategic departures from it. 

Here are some highlights that might not be apparent at first look:

  • We use only two types of actuators for the body. This allows us to focus on making more efficient and powerful actuators at a larger scale, which ends up lowering their cost. All are rotary actuators, which are much easier to represent well in simulation. That is key to the high-performance RL work with proprioceptive feedback discussed above.
  • We repeat as many sub-assemblies as possible in the body. Both legs and both arms are identical. The shoulder-to-shoulder and pelvis-to-pelvis structures are also identical. 
  • The actuators have infinite rotation. We achieve this by eliminating all cables across joints, which also removes the key driver of hardware failures in actuators. In turn, this lowers the cost of Atlas to customers and gives Atlas unique ways to move efficiently.
  • The feet are symmetrical in the front and back because Atlas is equally capable of moving forward and backward.
  • Arms, legs, hands, and head are all field replaceable units that can be swapped out within a few minutes.

Moving a mini-fridge demonstrates strength, whole-body coordination, and the use of proprioceptive feedback. It serves as a benchmark for industrial work, where awkward, heavy loads in manufacturing settings often call for two-person lifts.

But less pragmatic tasks also have a purpose. For example, handstands and backflips are possible on a 90-kilogram (198-pound) robot because we have excellent thermal management, which means Atlas will be able to work in hot environments. And these behaviors train other transferable skills: how to move with agility and balance, how to use a full range of motion in constrained conditions, and how to recover from slips and falls.

We have been working with the new Atlas for the last few months, with great results, and we are very excited to begin the journey of mass-scaling it.  

The Training Montage

One of our goals for Atlas as a product, as well as a research platform, is to be able to train and deploy new behaviors in as little as a day. This demo wasn’t quite that quick, but Atlas got the fridge move down consistently much faster than we anticipated.

Here’s how we trained the robot:

  • Reference: To start training a new behavior, we use a reference trajectory: data that tells the policy what it should be doing. This can be a teleoperated demonstration, an animated trajectory, or a description of a more abstract goal. For the fridge move, we started with a simple animation, allowing us to take full advantage of Atlas’ superhuman range of motion.
  • Reward: Then we set an objective for the robot to stay as close to the animation as possible while completing the task. We established rewards to reinforce the desired behavior (keeping the weight in Atlas’ grippers, in the same position and orientation). We also pushed and pulled on the robot and fridge during training, so the policy learns to stay on the main task while it experiences disturbances.
  • Simulation: Atlas practiced the moves for millions of hours across simulations running in parallel on graphics processing units (GPUs). Through this extensive experience in simulation, Atlas learned to adapt its behavior to the many variations of the fridge.
  • Real Robot: Once the simulation looked good, we tested on the hardware. Boston Dynamics has always had a “build it, break it, fix it” philosophy, and we’re continuing with that in our modern product-focused, AI-focused research. Simulation will only take you so far. Testing on hardware is how we make things better.
  • Iteration: Once we have real data about the policy’s performance on the real robot, we can go back to our training to make adjustments and harden the behavior.
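
The reward and disturbance steps above can be sketched roughly as follows. This is a minimal illustration, not the training code used for Atlas: the function names, weights, tolerances, and push magnitudes are all assumptions chosen for readability. It shows one common way to score how closely a policy follows a reference and keeps a payload in place, plus a random push that could be applied to the robot or fridge during a simulated episode.

```python
import math
import random


def tracking_reward(joint_pos, ref_joint_pos, payload_pose, ref_payload_pose,
                    w_track=0.6, w_hold=0.4, sigma_track=0.25, sigma_hold=0.05):
    """Return a reward in (0, 1]; 1.0 means the reference is matched exactly.

    All weights and tolerances are hypothetical values for illustration.
    """
    # Squared error between the robot's joints and the reference animation.
    track_err = sum((q - r) ** 2 for q, r in zip(joint_pos, ref_joint_pos))
    # Squared error between the payload's pose in the grippers and its target.
    hold_err = sum((p - r) ** 2 for p, r in zip(payload_pose, ref_payload_pose))
    return (w_track * math.exp(-track_err / sigma_track ** 2)
            + w_hold * math.exp(-hold_err / sigma_hold ** 2))


def sample_push(rng, max_force=150.0, probability=0.02):
    """Occasionally return a random planar push (fx, fy) in newtons.

    The force limit and probability are assumptions, not values from the article.
    """
    if rng.random() >= probability:
        return (0.0, 0.0)
    angle = rng.uniform(0.0, 2.0 * math.pi)
    magnitude = rng.uniform(0.0, max_force)
    return (magnitude * math.cos(angle), magnitude * math.sin(angle))


rng = random.Random(0)
perfect = tracking_reward([0.1, 0.2], [0.1, 0.2], [0.0, 0.0], [0.0, 0.0])
drifted = tracking_reward([0.1, 0.2], [0.3, 0.2], [0.02, 0.0], [0.0, 0.0])
push = sample_push(rng, probability=1.0)
```

In this sketch the exponential terms keep the reward bounded, so a policy that drifts from the animation or lets the payload shift earns less without the score becoming unbounded. A real setup would likely include many more terms.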

Closing the Sim-to-Real Gap

One of the most significant improvements on the enterprise version of Atlas is the high fidelity of its simulation environment. We have a very small sim-to-real gap, so it’s easy to train, test, and iterate quickly. Generally, if a behavior looks good in simulation, it looks good on the robot.

The sim-to-real gap is the discrepancy between a policy’s performance in a simulated environment and its performance on real hardware. Assumptions and mathematical simplifications in simulation fail to capture the true complexity of the real world. Incremental variations and variables like friction, latency, or sensor noise add up to cause failures in the physical world.
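
One simple way to put a number on that discrepancy is to compare a policy's success rate across simulated and hardware trials. The sketch below assumes success rate as the performance metric, which the article does not specify. The function names are hypothetical and the trial outcomes are made up purely for illustration; they are not measured results.

```python
from statistics import mean


def success_rate(outcomes):
    """Fraction of trials that succeeded; outcomes is a sequence of booleans."""
    return mean(1.0 if ok else 0.0 for ok in outcomes)


def sim_to_real_gap(sim_outcomes, real_outcomes):
    """Positive values mean the policy does better in simulation than on hardware."""
    return success_rate(sim_outcomes) - success_rate(real_outcomes)


# Made-up outcomes for illustration only, not measured results.
sim_trials = [True] * 19 + [False]
real_trials = [True] * 17 + [False] * 3
gap = sim_to_real_gap(sim_trials, real_trials)
```

Any task-level metric (tracking error, time to complete, payload drift) could stand in for success rate here; the point is only that the gap is a difference between the same measurement taken in both settings.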

While it may not ever be possible to completely close this gap, we have gotten extremely close. Across the entire Atlas team, we have established a rigorous pipeline and system support for testing and development. We can go from training a policy today to testing on the robot with a baked policy tomorrow, collecting data to fuel the next iteration and next behavior.


What makes this minimal sim-to-real gap possible? 

  • High-Fidelity Hardware: Unlike its predecessors, this platform uses only two types of powerful, highly efficient actuators and is fully symmetrical. That simplicity in its design and structure, as well as the actuators’ efficiency, means that we can model the robot with incredible accuracy in simulation. Because the robot model and the real hardware are extremely close, we have fewer fidelity issues deploying trained policies. What you see in sim is what you get in reality.
  • Domain Randomization: To make the policy robust, we don’t train the robot in a perfect world. We use domain randomization to vary parameters like the weight of the fridge, the friction of the floor, or the strength of the motors slightly throughout the training process. These small, random variations make the final behaviors robust to real-world variability. For example, the policy for moving the fridge was trained for 50-70 pound loads, but the robot successfully moved a loaded fridge with a total weight of more than 100 pounds. We also don’t test in perfect conditions. We loaded the fridge with an assortment of objects from around the lab; the weight wasn’t consistent, it wasn’t evenly distributed, and it could shift within the fridge during the movement. With a well-trained policy, all of that noise can be accounted for by Atlas, not by an engineer.
  • People and Process: Finally, our processes and operations are set up to make training, testing, and experimentation easy. We have established a rigorous pipeline, and there are a lot of people working behind the scenes. We work closely with the many teams that make robots actually work, from the hardware design team to repair technicians and robot captains. The entire organization is united behind making Atlas as reliable and performant as possible while pushing the envelope of new capabilities.
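
The domain randomization described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual training configuration. Only the 50 to 70 pound payload range comes from the text; the friction and motor-strength ranges, along with every name in the code, are hypothetical.

```python
import random
from dataclasses import dataclass

LB_TO_KG = 0.45359237  # exact definition of the international pound


@dataclass(frozen=True)
class EpisodeParams:
    fridge_mass_kg: float
    floor_friction: float
    motor_strength_scale: float


def sample_episode_params(rng):
    """Draw one set of randomized physics parameters for a training episode."""
    return EpisodeParams(
        # Payload range from the article: 50 to 70 pounds.
        fridge_mass_kg=rng.uniform(50.0, 70.0) * LB_TO_KG,
        # Assumed range for the floor's friction coefficient.
        floor_friction=rng.uniform(0.6, 1.0),
        # Assumed +/-10% variation in available motor torque.
        motor_strength_scale=rng.uniform(0.9, 1.1),
    )


rng = random.Random(0)
batch = [sample_episode_params(rng) for _ in range(4)]
```

Each simulated episode would then be configured from one `EpisodeParams` draw, so the policy never sees exactly the same fridge, floor, or motors twice.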

Heavy-Duty Hands

A note on hands: the grippers used in the fridge experiment and in the gymnast demo are not our newest iteration. These are the workhorse grippers we’ve been using for the last year and a half. They are strong enough to withstand a handstand and support the robot’s body weight, which at 90 kilograms (198 pounds) is far above that of the mini-fridge. We’ve started experimenting with a newer dexterous gripper, so stay tuned for exciting hand updates.