What Robots Actually Are
Ask ten roboticists what a robot is and you will get twelve answers. That confusion is semantic, but it is costing real money.
The Word Nobody Agrees On
On any given weekday, a six-axis arm at Tesla’s Fremont plant bolts a door onto a Model Y, a Roomba bumps into a kitchen island in Des Moines, a surgeon in New York closes a suture on a patient in Zurich through an Intuitive Surgical da Vinci system, and a marketing team in Austin watches “an AI robot” draft their Q2 campaign. Four machines. Four press releases that use the word robot. Only three of those machines are.
Ask ten roboticists what a robot is and you will get twelve answers.
Most of the field would agree on three of the four. The six-axis arm? Obviously a robot. The Roomba? Sure. The surgical system, even with a surgeon’s hands on the controls? Most people would say yes. The algorithm that writes marketing copy? Almost nobody in the field would call that a robot, but the press does it anyway.
This confusion is semantic, but it also costs real money. Investors lump companies building physical machines together with companies building software chatbots under the same “robotics and AI” umbrella, then wonder why the hardware company burns cash ten times faster and takes five times longer to reach scale. Operators adopt a “robotic process automation” tool expecting the capabilities of an actual robot, only to discover they have purchased a macro with a marketing budget.
Quick origin story on the word “robot”: it comes from R.U.R. (Rossum’s Universal Robots)[1], the 1920 play by Czech playwright Karel Čapek about artificial workers who overthrow their creators, and it entered English a few years later through the play’s translation. (Science fiction writers have been thinking about dystopian situations for a long time.) The Czech word robota means forced labor, so, from the start, the concept carried an assumption that has never quite gone away: a robot is a thing that replaces a human worker. That assumption shapes public debate, regulatory frameworks, and a surprising number of pitch decks, and it is also incomplete to the point of being misleading.
Here is a more useful starting point: a robot is a physical machine that senses its environment, processes that information to make decisions or follow instructions, and takes physical action in the world. To boil it down to three words: sense, think, and act[2]. If a system does all three in the physical world, it is a robot. If it is missing any one of the three, it is something else (perhaps a sensor, a computer, or a power tool, all perfectly fine things to be).
This is the framework I will use throughout this Core. It is not the only possible definition, but it is the one that best serves people who need to evaluate robotics companies, deploy robotic systems, or understand where the related industries are heading.
Sense, Think, Act
Every robot, from a $500 hobbyist arm to a $2 million surgical system, runs on the same basic loop. It gathers data about itself and its surroundings. It processes that data to decide what to do next. Then it does something physical: it moves, grips, welds, cuts, flies, drives, or simply holds still with precision.
The loop repeats. Hundreds of times per second in a fast application, a few times per minute in a slow one. Each cycle, the robot updates its understanding of the world and adjusts its actions accordingly.
Let’s consider a warehouse robot picking items off a shelf. The “sense” phase involves cameras identifying the target object, a depth sensor measuring the distance to the shelf, and force sensors in the gripper confirming whether something is actually in hand. The “think” phase involves software deciding which object to pick first, planning a collision-free path for the arm, and calculating how much grip force to apply. The “act” phase involves motors driving the arm along that path and the gripper closing with the right pressure. The accompanying figure traces this cycle through a single pick: sensors scan the bin, software selects a target and plots a path, motors execute the grasp, and then the loop resets for the next item.
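To make the cycle concrete, here is a minimal sketch of that picking loop in Python. The camera, planner, arm, and gripper objects and all of their method names are hypothetical placeholders rather than any vendor’s API; the point is the shape of the sense-think-act cycle, not a real implementation.

```python
import time


class PickingCell:
    """Toy sense-think-act loop for a warehouse picking robot (illustrative only)."""

    def __init__(self, camera, depth_sensor, gripper, arm, planner):
        # All of these are hypothetical hardware/software interfaces.
        self.camera = camera
        self.depth_sensor = depth_sensor
        self.gripper = gripper
        self.arm = arm
        self.planner = planner

    def run_cycle(self) -> bool:
        """Run one pass of the loop; return False when there is nothing left
        to pick or the grasp check fails."""
        # SENSE: gather data about the bin and the robot's own state.
        image = self.camera.capture()
        depth_map = self.depth_sensor.read()
        joint_state = self.arm.joint_state()

        # THINK: choose a target, plan a collision-free path, pick a grip force.
        target = self.planner.select_target(image, depth_map)
        if target is None:
            return False  # bin is empty
        path = self.planner.plan_path(joint_state, target)
        grip_force = self.planner.estimate_grip_force(target)

        # ACT: drive the motors along the path and close the gripper.
        self.arm.follow(path)
        self.gripper.close(force=grip_force)

        # Confirm the grasp with the force sensor before the loop resets.
        return self.gripper.force_reading() > 0.5  # newtons; illustrative threshold


def run(cell: PickingCell, hz: float = 2.0) -> None:
    """Repeat the cycle at a fixed rate until there is nothing left to pick."""
    period = 1.0 / hz
    while cell.run_cycle():
        time.sleep(period)
```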
What makes the loop interesting, and what makes robotics genuinely hard, is that each phase constrains the others. Bad sensors produce bad data, which means the “think” layer works from a distorted picture of reality. Slow computation means the robot’s decisions lag behind a changing environment. Imprecise actuators mean that even perfect perception and perfect planning produce imperfect results. Simone Giertz’s breakfast-serving machine, the canonical exhibit from the self-styled “Queen of Shitty Robots,” is funny precisely because every link in the loop fails at once: it sees poorly, plans worse, and grips with the dignity of a flailing toddler.
In our robotics coursework at MIT and elsewhere, I see students arrive believing that a better algorithm can rescue a bad sensor. Over the course of the semester, they tend to discover that the limiting factor was calibration, mounting, or lighting all along. The weakest link dominates the whole system, and developing intuition for which link is actually weakest in a given design is among the most transferable skills a roboticist builds and one of the most valuable in robotics evaluation.
I will go deep on each phase in the modules that follow. Module 2 covers the “act” layer (mechanical systems and actuators). Module 3 covers the “sense” layer (sensors and perception). Modules 4 and 5 cover the “think” layer (control systems and software). For now, the point is that these three functions are inseparable. A robot that can see perfectly but can’t move precisely is useless for surgery. A robot with extraordinary dexterity but no perception is useless outside a rigidly controlled environment.
Robotics Is Not AI (But They Make A Great Pair)
This distinction matters more than almost anything else in the space, and it is the one most commonly botched in popular coverage and investor presentations alike.
Artificial intelligence is software that performs cognitive tasks: recognizing faces, translating languages, predicting equipment failures, generating text. It runs on servers, laptops, and phones. It does not, by itself, interact with the physical world. A large language model has no arms. A computer vision system has no legs. An AI that beats the world champion at Go has never picked up a physical game piece.
Robotics is the engineering of physical machines that act in the world. For most of the field’s history (roughly 1960 through 2015), robots operated with minimal intelligence. An industrial arm on a car assembly line followed the same programmed path thousands of times per day. It did not “think” in any meaningful sense; it just executed instructions, and those instructions were painstakingly written by a human programmer who specified every millimeter of movement.
However, the fields of robotics and artificial intelligence are converging briskly. Modern robots increasingly rely on AI for perception (such as using machine learning to identify objects), for planning (using reinforcement learning to figure out how to grasp novel items), and for adaptation (using foundation models trained on millions of examples to handle situations they have never explicitly seen before). Boston Dynamics’ Atlas humanoid, Hyundai’s showcase for “physical AI,” uses learned behaviors to navigate terrain that would have required years of hand-coded rules in the 2010s[3].
A new category of company, sometimes called “physical AI” or “embodied intelligence,” is trying to build general-purpose software that controls robots across many tasks. Figure AI, which raised $675 million in its Series B in early 2024 and followed with a $1 billion+ Series C at a $39 billion valuation in September 2025[4], is betting that a single AI system can make a humanoid robot useful in factories, warehouses, and eventually homes. Whether that bet pays off depends on solving problems in both AI and robotics simultaneously, which is exactly why it is so expensive and so uncertain.
For the purposes of this Core, we treat AI as one entry, admittedly a very broad one, in the catalog of enabling technologies that are making robots more capable. After all, a robot with brilliant autonomous decision-making and a flimsy rubber gripper still drops things!
The Five Subsystems
Regardless of form factor or application, every robot contains some version of five subsystems. Knowing what they are gives you a framework for evaluating any robotics product or company (a minimal code sketch of how they fit together follows the list):
1. Sensors collect data about the robot and its environment. Cameras, LIDAR (Light Detection and Ranging) units, force sensors, encoders that track joint positions, temperature probes, microphones. Some are pointed outward (what is happening around me?) and some are pointed inward (what are my own joints doing?). We will cover these in depth in Module 3.
2. Computation and control processes sensor data, runs algorithms, and sends commands to the actuators. This can range from a simple microcontroller executing pre-programmed routines to a full onboard computer running real-time perception, planning, and learning systems. Modules 4 and 5 cover this layer.
3. Actuators produce movement. Electric motors (the most common in modern robots), hydraulic cylinders (for applications requiring enormous force), or pneumatic systems (for lighter, faster motion). The actuator determines how strong, fast, and precise the robot’s movements can be. Module 2 goes deep here.
4. End effectors are our namesake because they are the tools at the ‘business end’ of the robot that do the work: the gripper that picks up a box, the welding torch that joins metal, the suction cup that lifts a silicon wafer, the scalpel that makes an incision. The end effector determines what tasks the robot can actually perform. A robot arm without an end effector is an expensive paperweight.
5. Power keeps everything running. Batteries for mobile robots, wall power for fixed industrial arms, sometimes fuel cells for heavy machinery. Power constrains where the robot can operate, how long it can work, and how much it can carry (because the battery itself has weight). Module 7 covers power systems.
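Here is that minimal sketch: the five subsystems expressed as Python dataclasses. The field names and every number are illustrative assumptions, not values from any vendor’s datasheet.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Sensor:
    name: str              # e.g. "wrist camera", "joint encoder"
    faces_outward: bool    # True: observes the environment; False: observes the robot itself


@dataclass
class Actuator:
    name: str              # e.g. "joint 3 motor"
    kind: str              # "electric", "hydraulic", or "pneumatic"
    peak_torque_nm: float  # illustrative spec, not a real datasheet value


@dataclass
class EndEffector:
    name: str              # e.g. "parallel gripper", "welding torch"
    payload_kg: float


@dataclass
class PowerSource:
    kind: str                     # "battery", "mains", or "fuel cell"
    capacity_wh: Optional[float]  # None for tethered / mains power


@dataclass
class Robot:
    sensors: list = field(default_factory=list)
    compute: str = "onboard controller"  # the "think" layer, from microcontroller to full PC
    actuators: list = field(default_factory=list)
    end_effector: Optional[EndEffector] = None
    power: Optional[PowerSource] = None


# A mains-powered cobot arm, loosely in the spirit of the UR10e example
# (all numbers are placeholders for illustration).
cobot = Robot(
    sensors=[
        Sensor("joint encoders", faces_outward=False),
        Sensor("wrist force-torque sensor", faces_outward=False),
    ],
    actuators=[Actuator(f"joint {i} motor", "electric", 150.0) for i in range(1, 7)],
    end_effector=EndEffector("parallel gripper", payload_kg=10.0),
    power=PowerSource("mains", capacity_wh=None),
)
```

Nothing in this toy model is load-bearing; it simply makes the point that a robot is a composition of all five subsystems, and that dropping any one of them leaves you with something other than a robot.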
The figure maps all five onto a single cobot, the Universal Robots UR10e, showing where each subsystem physically resides and why a failure in any one constrains the whole machine.
These five subsystems interact in a continuous data flow that traces a loop through the robot and its environment. Sensors pull data in from the physical world. That data feeds into a perception stack, which builds a model of what is happening around the robot. The perception model feeds a decision layer, where the robot (or a human operator) determines what to do next. That decision drives the actuators, which act on the environment, changing the world in a way that the sensors will detect on the next cycle.
This is the Robot System Architecture: a circular chain from environment to sensors to perception to decision to actuators and back to the environment. The human’s position in this architecture (directly making decisions, supervising from outside, or absent entirely) is what distinguishes one robotic application from another.
The accompanying figure traces this circular flow and shows where the human can sit: inside the decision loop, supervising from outside, or absent entirely.
The way these five subsystems interact also makes the whole system harder to build than any individual component. That integration challenge is the subject of Module 8 (Building Complete Systems), and it is the reason robotics companies burn through cash in ways that pure software companies do not.
Why Atoms Are Harder Than Bits
There is a reason that software ‘ate the world’ before robots did. Software scales at near-zero marginal cost: a developer writes code once, copies it a billion times, and distributes it over the Internet in seconds. If the code has a bug, the fix can be as simple as pushing a patch. If the market shifts, the product can often be adjusted with a few sprints of engineering work. The feedback loop from idea to deployed product is measured in weeks, a pace with little precedent in the history of manufactured products.
However, software has its limits: code alone cannot solve the greatest challenges our society faces.
Robots live in the physical world, and the physical world is unforgiving.
Gravity pulls things down. Friction wears surfaces. Impacts break components. Temperature warps tolerances. Dust clogs sensors. Water corrodes electronics. Every one of these problems must be solved not in simulation but in the actual environment where the robot operates: a factory floor, a farm field, a hospital operating room, the bottom of the ocean.
Manufacturing a robot means sourcing hundreds of components from dozens of suppliers, machining parts to tight tolerances, assembling them in a clean environment, and testing each unit individually. A software update ships instantly to every user; a hardware revision requires retooling a production line, which takes months and costs millions.
The split-panel figure illustrates why: a pristine lab eliminates the variables that a warehouse, farm, or factory floor restores with a vengeance.
A Working Taxonomy
Roboticists categorize robots in dozens of ways: by application, by form factor, by industry, by size, by payload. Most of these taxonomies are useful for specialists and confusing for everyone else.
For the purposes of this Core, we use the two axes that matter most for evaluation and investment decisions.
Axis 1: Autonomy level. How much human involvement does the robot require during operation?
At one end: teleoperation, where a human controls every movement (think bomb disposal robots or underwater ROVs). The robot contributes its physical capabilities; the human contributes all the intelligence. At the other end: full autonomy, where the robot perceives, decides, and acts without human input for extended periods (a Mars rover, for instance, because the communication delay makes teleoperation impractical). In between lies a spectrum that includes pre-programmed automation (the industrial arm that repeats a fixed routine), supervised autonomy (the self-driving car with a safety driver), and collaborative operation (cobots working alongside humans on assembly lines).
Roboticists describe the human’s role with two useful phrases. A human in the loop is directly part of the robot’s decision-making process: the surgeon controlling a laparoscopic robot, the demolition operator driving each swing of an excavator arm. The robot is an extension of the human’s body and intent.
A human on the loop provides supervisory oversight: a fleet manager watching a swarm of warehouse robots, intervening only when one encounters a situation it cannot resolve. Most commercial robotics falls somewhere between these two configurations, and the distinction matters because it determines staffing ratios, liability frameworks, and the level of autonomy the software must actually deliver.
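A short sketch makes the difference concrete. In the code below, the robot and operator objects and their methods are hypothetical stand-ins, not a real fleet-management API; only the placement of the human changes between configurations.

```python
from enum import Enum


class HumanRole(Enum):
    IN_THE_LOOP = "human issues every command"           # e.g. teleoperated surgery
    ON_THE_LOOP = "human intervenes only on escalation"  # e.g. warehouse fleet supervisor
    ABSENT = "robot operates without supervision"        # e.g. Mars rover between uplinks


def step(robot, operator, role: HumanRole) -> None:
    """One decision cycle, with the human placed in, on, or outside the loop."""
    observation = robot.sense()

    if role is HumanRole.IN_THE_LOOP:
        # The human is part of every decision; the robot contributes only its body.
        command = operator.command_for(observation)
    else:
        # The robot decides for itself...
        command = robot.decide(observation)
        if role is HumanRole.ON_THE_LOOP and robot.needs_help(observation):
            # ...and escalates to the supervisor only when it is stuck.
            command = operator.resolve(observation, command)

    robot.act(command)
```

The staffing implication falls out directly: in-the-loop operation ties up one human per robot, while on-the-loop supervision lets one human watch many.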
Axis 2: Mobility. Is the robot fixed in place or does it move through the environment?
Fixed robots (industrial arms, surgical systems, benchtop assembly stations) operate in a defined workspace. Their world is constrained and largely predictable. Mobile robots (warehouse AGVs, delivery drones, self-driving trucks, humanoids) must navigate unstructured environments where things change constantly.
The combination of these two axes produces the interesting categories. A fixed, pre-programmed industrial arm is the workhorse of automotive manufacturing: low autonomy, no mobility, but highly productive in its narrow domain. A mobile, highly autonomous humanoid is the moonshot bet that Figure AI, Tesla, and Agility Robotics are chasing: maximum autonomy, full mobility, but not yet proven at commercial scale. Pras Velagapudi, CTO of Agility Robotics, argues on the Audrow Nash Podcast that form factor is a constraint-driven engineering choice rather than an aspiration-driven one, which is the right way to read the autonomy-mobility map: every category below is the answer to a specific operating constraint, not a step toward a generalized humanoid finish line.
Between those extremes lies an enormous range of commercially viable products. The positioning map plots these categories on the two axes that matter most, revealing the industry’s trajectory: every generation pushes toward the upper right, more autonomous, more mobile, and exponentially harder to engineer.
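For readers who prefer code to quadrant charts, here is a toy encoding of the two-axis map. The ordinal values are crude labels rather than measurements, and the placements simply mirror the table below.

```python
from enum import IntEnum


class Autonomy(IntEnum):
    TELEOPERATED = 0
    PRE_PROGRAMMED = 1
    SUPERVISED = 2
    HIGH = 3


class Mobility(IntEnum):
    FIXED = 0
    MOBILE_WHEELS = 1
    MOBILE_AIR = 2
    MOBILE_LEGS = 3


# Category placements, mirroring the table below (ordinal, not quantitative).
CATEGORIES = {
    "industrial arm":  (Autonomy.PRE_PROGRAMMED, Mobility.FIXED),
    "cobot":           (Autonomy.SUPERVISED,     Mobility.FIXED),
    "surgical system": (Autonomy.TELEOPERATED,   Mobility.FIXED),
    "warehouse AMR":   (Autonomy.HIGH,           Mobility.MOBILE_WHEELS),
    "delivery drone":  (Autonomy.HIGH,           Mobility.MOBILE_AIR),
    "humanoid":        (Autonomy.HIGH,           Mobility.MOBILE_LEGS),
}


def engineering_difficulty(category: str) -> int:
    """Crude proxy for the 'upper right' of the map: more autonomy plus more
    mobility means a harder integration problem, per the trajectory the text
    describes."""
    autonomy, mobility = CATEGORIES[category]
    return int(autonomy) + int(mobility)
```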
Below are several areas of heavy development as of early 2026:
| Category | Autonomy | Mobility | Example | Market Maturity |
|---|---|---|---|---|
| Industrial arms | Pre-programmed | Fixed | FANUC, ABB, KUKA | Mature, $16.7B/yr |
| Cobots | Supervised | Fixed | Universal Robots, FANUC CRX | Growing fast, ~10% of industrial installs |
| Warehouse AMRs | High | Mobile (wheels) | Amazon Robotics, Locus Robotics, 6 River Systems | Scaling, 750K+ units at Amazon alone |
| Surgical systems | Teleoperated | Fixed | Intuitive Surgical (da Vinci), Medtronic Hugo | Established, high-margin |
| Delivery drones | High | Mobile (air) | Zipline, Wing | Niche but expanding |
| Humanoids | Variable | Mobile (legs) | Boston Dynamics Atlas, Figure AI, Agility Robotics Digit | Pre-commercial |
Industrial arms are gaining sensors and software that push them toward higher autonomy. Cobots are gaining mobility. Warehouse robots are gaining manipulation capabilities (arms mounted on mobile bases). The trend is toward more autonomy and more mobility simultaneously, which is why the integration challenge (Module 8) is the central engineering problem of the field.
The scene in the accompanying figure captures the division of labor at scale: hundreds of mobile robots ferry shelving pods across the floor, but at the picking station, a human hand still does the work that perception and manipulation systems cannot yet match.
Of the five subsystems introduced here, the one the pitch decks glide past is the mechanical one: the actuators and the structure they move, the body itself. The most expensive component in a Universal Robots UR10e is not the computer, not the force-torque sensors, not even the six electric motors. It is a harmonic drive gear unit that costs between $500 and $1,500; six of them live inside every arm, drawn from a supplier list short enough to fit on an index card, and together they decide what the robot can lift, how precisely it places things, and how long it lasts before a field-service truck has to drive to it. Module 2 opens the body and shows why mechanical choices explain more company failures than any AI benchmark ever will.
Further Viewing
- 🎥 Farewell to HD Atlas (2 min). A decade of Boston Dynamics’ hydraulic Atlas condensed into parkour, falls, and the physical punishment that advanced every generation. The fastest 120 seconds you can spend calibrating your intuition for what robots can and cannot do.
References
1. Čapek, K. (1920). R.U.R. (Rossum’s Universal Robots). Premiered at the National Theatre, Prague, January 25, 1921. English translation by Paul Selver (1923). — The play that gave English the word “robot” ends not with dystopia but with the robots learning to love. A century later, the labor-replacement anxiety Čapek dramatized is still the first question investors ask about every robotics company — and still the wrong framing for understanding where value actually accrues.
2. Murphy, R. (2019). Introduction to AI Robotics. 2nd ed. MIT Press. — The standard graduate textbook that formalizes the sense-plan-act (or sense-think-act) paradigm as the foundational architecture for autonomous robots. Murphy’s framework is used throughout this Core because it maps cleanly onto the evaluation question every investor needs to answer: which part of the loop is this company good at, and which part will kill them?
3. Boston Dynamics. (2024). “Atlas | Partners in Parkour” and “All New Atlas.” Boston Dynamics YouTube, various 2024 releases. — The electric Atlas (unveiled April 2024, replacing the hydraulic version) demonstrates learned locomotion behaviors that would have required years of hand-coded rules in the previous generation. The videos are the clearest public demonstration of the convergence between robotics and AI: the hardware is impressive, but the learned behaviors are what make it useful. Watch the terrain navigation sequences for the state of the art in physical AI.
4. Figure AI. (2025). “Figure Raises Over $1 Billion in Series C.” Figure AI Press Release, September 2025. Prior: Series B of $675M at $2.6B valuation, February 2024. — The 15x valuation jump ($2.6B to $39B) in 18 months reflects investor enthusiasm for the “physical AI” thesis: a single AI system controlling a humanoid across many tasks. Whether the thesis is correct remains unproven — no humanoid robot has demonstrated commercially viable productivity in unstructured environments. The funding trajectory is a bet on a future capability, not a validation of current revenue. Total raised: approximately $1.9 billion.
5. Weise, K. (2024). “The Robots Are Coming for the Warehouse Jobs. It’s Taking a While.” The New York Times, June 2024. — A corrective to the “automation is imminent” narrative. The piece documents how Amazon’s own robotics division has repeatedly pushed back its autonomy timelines — a pattern any investor in warehouse robotics should internalize. The gap between demo-ready and deployment-ready is measured not in software sprints but in years of field testing.
6. Wingfield, N. (2012). “Amazon to Buy Kiva Systems for $775 Million.” The New York Times, March 19, 2012. — The acquisition that launched the modern warehouse robotics era. Amazon paid $775 million for Kiva, then pulled the product from the market (Kiva had been selling to other retailers) and rebranded it as Amazon Robotics. The decision to vertically integrate robotics — rather than buy from a vendor — set the template that other logistics giants have since followed. The acquisition price looks like a bargain relative to the operational savings generated.
7. Amazon. (2023). “Amazon Introduces Sparrow.” Amazon Science Blog, November 2022; updated deployments through 2023. — Sparrow is Amazon’s robotic manipulation system designed to handle individual items in fulfillment centers. It uses computer vision and machine learning to identify and grasp items of varying shapes and sizes — the specific perception-manipulation challenge that the Kiva robots were never designed to solve. As of early 2026, Sparrow handles a subset of Amazon’s inventory; the full range of warehouse SKUs remains beyond current capability, illustrating the “last inch” problem described in this module.
Miller, J. (2026). What Robots Actually Are. Robotics, The End Effector. https://endeff.com/cores/robotics/what-robots-actually-are

