What Robots Actually Are
Ask ten roboticists what a robot is and you will get twelve answers. That confusion is semantic, but it is costing real money.
The Word Nobody Agrees On
On any given weekday, a six-axis arm at Tesla’s Fremont plant bolts a door onto a Model Y, a Roomba bumps into a kitchen island in Des Moines, a surgeon in New York closes a suture on a patient in Zurich through an Intuitive Surgical da Vinci system, and a marketing team in Austin watches “an AI robot” draft their Q2 campaign. Four machines. Four press releases that use the word robot. Only three of those machines are.
Ask ten roboticists what a robot is and you will get twelve answers.
Most of the field would agree on three of the four. The six-axis arm? Obviously a robot. The Roomba? Sure. The surgical system, even with a surgeon’s hands on the controls? Most people would say yes. The algorithm that writes marketing copy? Almost nobody in the field would call that a robot, but the press does it anyway.
This confusion is semantic, but it also costs real money. Investors lump companies building physical machines together with companies building software chatbots under the same “robotics and AI” umbrella, then wonder why the hardware company burns cash ten times faster and takes five times longer to reach scale. Operators adopt a “robotic process automation” tool expecting the capabilities of an actual robot, only to discover they have purchased a macro with a marketing budget.
Quick origin story on the word “robot”: it comes from R.U.R. (Rossum’s Universal Robots)[1], the 1920 play by Czech playwright Karel Čapek about artificial workers who overthrow their creators, and it entered English a few years later through the play’s translation. (Science fiction writers have been thinking about dystopian situations for a long time.) The Czech word robota means forced labor, so, from the start, the concept carried an assumption that has never quite gone away: a robot is a thing that replaces a human worker. That assumption shapes public debate, regulatory frameworks, and a surprising number of pitch decks, and it is also incomplete to the point of being misleading.
Here is a more useful starting point: a robot is a physical machine that senses its environment, processes that information to make decisions or follow instructions, and takes physical action in the world. To boil it down to three words: sense, think, and act[2]. If a system does all three in the physical world, it is a robot. If it is missing any one of the three, it is something else (perhaps a sensor, a computer, or a power tool, all perfectly fine things to be).
This is the framework I will use throughout this Core. It is not the only possible definition, but it is the one that best serves people who need to evaluate robotics companies, deploy robotic systems, or understand where the related industries are heading.
Sense, Think, Act
Every robot, from a $500 hobbyist arm to a $2 million surgical system, runs on the same basic loop. It gathers data about itself and its surroundings. It processes that data to decide what to do next. Then it does something physical: it moves, grips, welds, cuts, flies, drives, or simply holds still with precision.
The loop repeats. Hundreds of times per second in a fast application, a few times per minute in a slow one. Each cycle, the robot updates its understanding of the world and adjusts its actions accordingly.
Let’s consider a warehouse robot picking items off a shelf. The “sense” phase involves cameras identifying the target object, a depth sensor measuring the distance to the shelf, and force sensors in the gripper confirming whether something is actually in hand. The “think” phase involves software deciding which object to pick first, planning a collision-free path for the arm, and calculating how much grip force to apply. The “act” phase involves motors driving the arm along that path and the gripper closing with the right pressure. The accompanying figure traces this cycle through a single pick: sensors scan the bin, software selects a target and plots a path, motors execute the grasp, and then the loop resets for the next item.
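To make the cycle concrete, here is a minimal sketch of that picking loop in Python. The camera, planner, arm, and gripper objects and all of their method names are hypothetical placeholders rather than any vendor’s API; the point is the shape of the sense-think-act cycle, not a real implementation.

```python
import time


class PickingCell:
    """Toy sense-think-act loop for a warehouse picking robot (illustrative only)."""

    def __init__(self, camera, depth_sensor, gripper, arm, planner):
        # All of these are hypothetical hardware/software interfaces.
        self.camera = camera
        self.depth_sensor = depth_sensor
        self.gripper = gripper
        self.arm = arm
        self.planner = planner

    def run_cycle(self) -> bool:
        """Run one pass of the loop; return False when there is nothing left
        to pick or the grasp check fails."""
        # SENSE: gather data about the bin and the robot's own state.
        image = self.camera.capture()
        depth_map = self.depth_sensor.read()
        joint_state = self.arm.joint_state()

        # THINK: choose a target, plan a collision-free path, pick a grip force.
        target = self.planner.select_target(image, depth_map)
        if target is None:
            return False  # bin is empty
        path = self.planner.plan_path(joint_state, target)
        grip_force = self.planner.estimate_grip_force(target)

        # ACT: drive the motors along the path and close the gripper.
        self.arm.follow(path)
        self.gripper.close(force=grip_force)

        # Confirm the grasp with the force sensor before the loop resets.
        return self.gripper.force_reading() > 0.5  # newtons; illustrative threshold


def run(cell: PickingCell, hz: float = 2.0) -> None:
    """Repeat the cycle at a fixed rate until there is nothing left to pick."""
    period = 1.0 / hz
    while cell.run_cycle():
        time.sleep(period)
```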
What makes the loop interesting, and what makes robotics genuinely hard, is that each phase constrains the others. Bad sensors produce bad data, which means the “think” layer works from a distorted picture of reality. Slow computation means the robot’s decisions lag behind a changing environment. Imprecise actuators mean that even perfect perception and perfect planning produce imperfect results. Simone Giertz’s breakfast-serving machine, the canonical exhibit from the self-styled “Queen of Shitty Robots,” is funny precisely because every link in the loop fails at once: it sees poorly, plans worse, and grips with the dignity of a flailing toddler.
In our robotics coursework at MIT and elsewhere, I see students arrive believing that a better algorithm can rescue a bad sensor. Over the course of the semester, they tend to discover that the limiting factor was calibration, mounting, or lighting all along. The weakest link dominates the whole system, and developing intuition for which link is actually weakest in a given design is among the most transferable skills a roboticist builds and one of the most valuable in robotics evaluation.
I will go deep on each phase in the modules that follow. Module 2 covers the “act” layer (mechanical systems and actuators). Module 3 covers the “sense” layer (sensors and perception). Modules 4 and 5 cover the “think” layer (control systems and software). For now, the point is that these three functions are inseparable. A robot that can see perfectly but can’t move precisely is useless for surgery. A robot with extraordinary dexterity but no perception is useless outside a rigidly controlled environment.
Robotics Is Not AI (But They Make A Great Pair)
This distinction matters more than almost anything else in the space, and it is the one most commonly botched in popular coverage and investor presentations alike.
Artificial intelligence is software that performs cognitive tasks: recognizing faces, translating languages, predicting equipment failures, generating text. It runs on servers, laptops, and phones. It does not, by itself, interact with the physical world. A large language model has no arms. A computer vision system has no legs. An AI that beats the world champion at Go has never picked up a physical game piece.
Robotics is the engineering of physical machines that act in the world. For most of the field’s history (roughly 1960 through 2015), robots operated with minimal intelligence. An industrial arm on a car assembly line followed the same programmed path thousands of times per day. It did not “think” in any meaningful sense; it just executed instructions, and those instructions were painstakingly written by a human programmer who specified every millimeter of movement.
However, the fields of robotics and artificial intelligence are converging briskly. Modern robots increasingly rely on AI for perception (such as using machine learning to identify objects), for planning (using reinforcement learning to figure out how to grasp novel items), and for adaptation (using foundation models trained on millions of examples to handle situations they have never explicitly seen before). Boston Dynamics’ Atlas humanoid, Hyundai’s showcase for “physical AI,” uses learned behaviors to navigate terrain that would have required years of hand-coded rules in the 2010s[3].
A new category of company, sometimes called “physical AI” or “embodied intelligence,” is trying to build general-purpose software that controls robots across many tasks. Figure AI, which raised $675 million in its Series B in early 2024 and followed with a $1 billion+ Series C at a $39 billion valuation in September 2025[4], is betting that a single AI system can make a humanoid robot useful in factories, warehouses, and eventually homes. Whether that bet pays off depends on solving problems in both AI and robotics simultaneously, which is exactly why it is so expensive and so uncertain.
For the purposes of this Core, we treat AI as one entry, admittedly a very broad one, in the catalog of enabling technologies that are making robots more capable. After all, a robot with brilliant autonomous decision-making and a flimsy rubber gripper still drops things!
The Five Subsystems
Regardless of form factor or application, every robot contains some version of five subsystems. Knowing what they are gives you a framework for evaluating any robotics product or company (a minimal code sketch of how they fit together follows the list):
1. Sensors collect data about the robot and its environment. Cameras, LIDAR (Light Detection and Ranging) units, force sensors, encoders that track joint positions, temperature probes, microphones. Some are pointed outward (what is happening around me?) and some are pointed inward (what are my own joints doing?). We will cover these in depth in Module 3.
2. Computation and control processes sensor data, runs algorithms, and sends commands to the actuators. This can range from a simple microcontroller executing pre-programmed routines to a full onboard computer running real-time perception, planning, and learning systems. Modules 4 and 5 cover this layer.
3. Actuators produce movement. Electric motors (the most common in modern robots), hydraulic cylinders (for applications requiring enormous force), or pneumatic systems (for lighter, faster motion). The actuator determines how strong, fast, and precise the robot’s movements can be. Module 2 goes deep here.
4. End effectors are our namesake because they are the tools at the ‘business end’ of the robot that do the work: the gripper that picks up a box, the welding torch that joins metal, the suction cup that lifts a silicon wafer, the scalpel that makes an incision. The end effector determines what tasks the robot can actually perform. A robot arm without an end effector is an expensive paperweight.
5. Power keeps everything running. Batteries for mobile robots, wall power for fixed industrial arms, sometimes fuel cells for heavy machinery. Power constrains where the robot can operate, how long it can work, and how much it can carry (because the battery itself has weight). Module 7 covers power systems.
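Here is that minimal sketch: the five subsystems expressed as Python dataclasses. The field names and every number are illustrative assumptions, not values from any vendor’s datasheet.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Sensor:
    name: str              # e.g. "wrist camera", "joint encoder"
    faces_outward: bool    # True: observes the environment; False: observes the robot itself


@dataclass
class Actuator:
    name: str              # e.g. "joint 3 motor"
    kind: str              # "electric", "hydraulic", or "pneumatic"
    peak_torque_nm: float  # illustrative spec, not a real datasheet value


@dataclass
class EndEffector:
    name: str              # e.g. "parallel gripper", "welding torch"
    payload_kg: float


@dataclass
class PowerSource:
    kind: str                     # "battery", "mains", or "fuel cell"
    capacity_wh: Optional[float]  # None for tethered / mains power


@dataclass
class Robot:
    sensors: list = field(default_factory=list)
    compute: str = "onboard controller"  # the "think" layer, from microcontroller to full PC
    actuators: list = field(default_factory=list)
    end_effector: Optional[EndEffector] = None
    power: Optional[PowerSource] = None


# A mains-powered cobot arm, loosely in the spirit of the UR10e example
# (all numbers are placeholders for illustration).
cobot = Robot(
    sensors=[
        Sensor("joint encoders", faces_outward=False),
        Sensor("wrist force-torque sensor", faces_outward=False),
    ],
    actuators=[Actuator(f"joint {i} motor", "electric", 150.0) for i in range(1, 7)],
    end_effector=EndEffector("parallel gripper", payload_kg=10.0),
    power=PowerSource("mains", capacity_wh=None),
)
```

Nothing in this toy model is load-bearing; it simply makes the point that a robot is a composition of all five subsystems, and that dropping any one of them leaves you with something other than a robot.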
The figure maps all five onto a single cobot, the Universal Robots UR10e, showing where each subsystem physically resides and why a failure in any one constrains the whole machine.
These five subsystems interact in a continuous data flow that traces a loop through the robot and its environment. Sensors pull data in from the physical world. That data feeds into a perception stack, which builds a model of what is happening around the robot. The perception model feeds a decision layer, where the robot (or a human operator) determines what to do next. That decision drives the actuators, which act on the environment, changing the world in a way that the sensors will detect on the next cycle.
This is the Robot System Architecture: a circular chain from environment to sensors to perception to decision to actuators and back to the environment. The human’s position in this architecture (directly making decisions, supervising from outside, or absent entirely) is what distinguishes one robotic application from another.
The accompanying figure traces this circular flow and shows where the human can sit: inside the decision loop, supervising from outside, or absent entirely.
The way these five subsystems interact also makes the whole system harder to build than any individual component. That integration challenge is the subject of Module 8 (Building Complete Systems), and it is the reason robotics companies burn through cash in ways that pure software companies do not.
Why Atoms Are Harder Than Bits
There is a reason that software ‘ate the world’ before robots did. Software scales at near-zero marginal cost: a developer writes code once, copies it a billion times, and distributes it over the Internet in seconds. If the code has a bug, the fix can be as simple as pushing a patch. If the market shifts, the product can often be adjusted with a few sprints of engineering work. The feedback loop from idea to deployed product is measured in weeks, a pace with little precedent in the history of manufactured products.
However, software has its limits: code alone cannot solve the greatest challenges our society faces.
Robots live in the physical world, and the physical world is unforgiving.
Gravity pulls things down. Friction wears surfaces. Impacts break components. Temperature warps tolerances. Dust clogs sensors. Water corrodes electronics. Every one of these problems must be solved not in simulation but in the actual environment where the robot operates: a factory floor, a farm field, a hospital operating room, the bottom of the ocean.
Manufacturing a robot means sourcing hundreds of components from dozens of suppliers, machining parts to tight tolerances, assembling them in a clean environment, and testing each unit individually. A software update ships instantly to every user; a hardware revision requires retooling a production line, which takes months and costs millions.
The split-panel figure illustrates why: a pristine lab eliminates the variables that a warehouse, farm, or factory floor restores with a vengeance.
A Working Taxonomy
Roboticists categorize robots in dozens of ways: by application, by form factor, by industry, by size, by payload. Most of these taxonomies are useful for specialists and confusing for everyone else.
For the purposes of this Core, we use the two axes that matter most for evaluation and investment decisions.
Axis 1: Autonomy level. How much human involvement does the robot require during operation?
At one end: teleoperation, where a human controls every movement (think bomb disposal robots or underwater ROVs). The robot contributes its physical capabilities; the human contributes all the intelligence. At the other end: full autonomy, where the robot perceives, decides, and acts without human input for extended periods (a Mars rover, for instance, because the communication delay makes teleoperation impractical). In between lies a spectrum that includes pre-programmed automation (the industrial arm that repeats a fixed routine), supervised autonomy (the self-driving car with a safety driver), and collaborative operation (cobots working alongside humans on assembly lines).
Roboticists describe the human’s role with two useful phrases. A human in the loop is directly part of the robot’s decision-making process: the surgeon controlling a laparoscopic robot, the demolition operator driving each swing of an excavator arm. The robot is an extension of the human’s body and intent.
A human on the loop provides supervisory oversight: a fleet manager watching a swarm of warehouse robots, intervening only when one encounters a situation it cannot resolve. Most commercial robotics falls somewhere between these two configurations, and the distinction matters because it determines staffing ratios, liability frameworks, and the level of autonomy the software must actually deliver.
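A short sketch makes the difference concrete. In the code below, the robot and operator objects and their methods are hypothetical stand-ins, not a real fleet-management API; only the placement of the human changes between configurations.

```python
from enum import Enum


class HumanRole(Enum):
    IN_THE_LOOP = "human issues every command"           # e.g. teleoperated surgery
    ON_THE_LOOP = "human intervenes only on escalation"  # e.g. warehouse fleet supervisor
    ABSENT = "robot operates without supervision"        # e.g. Mars rover between uplinks


def step(robot, operator, role: HumanRole) -> None:
    """One decision cycle, with the human placed in, on, or outside the loop."""
    observation = robot.sense()

    if role is HumanRole.IN_THE_LOOP:
        # The human is part of every decision; the robot contributes only its body.
        command = operator.command_for(observation)
    else:
        # The robot decides for itself...
        command = robot.decide(observation)
        if role is HumanRole.ON_THE_LOOP and robot.needs_help(observation):
            # ...and escalates to the supervisor only when it is stuck.
            command = operator.resolve(observation, command)

    robot.act(command)
```

The staffing implication falls out directly: in-the-loop operation ties up one human per robot, while on-the-loop supervision lets one human watch many.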
Axis 2: Mobility. Is the robot fixed in place or does it move through the environment?
Fixed robots (industrial arms, surgical systems, benchtop assembly stations) operate in a defined workspace. Their world is constrained and largely predictable. Mobile robots (warehouse AGVs, delivery drones, self-driving trucks, humanoids) must navigate unstructured environments where things change constantly.
The combination of these two axes produces the interesting categories. A fixed, pre-programmed industrial arm is the workhorse of automotive manufacturing: low autonomy, no mobility, but highly productive in its narrow domain. A mobile, highly autonomous humanoid is the moonshot bet that Figure AI, Tesla, and Agility Robotics are chasing: maximum autonomy, full mobility, but not yet proven at commercial scale. Pras Velagapudi, CTO of Agility Robotics, argues on the Audrow Nash Podcast that form factor is a constraint-driven engineering choice rather than an aspiration-driven one, which is the right way to read the autonomy-mobility map: every category below is the answer to a specific operating constraint, not a step toward a generalized humanoid finish line.
Between those extremes lies an enormous range of commercially viable products. The positioning map plots these categories on the two axes that matter most, revealing the industry’s trajectory: every generation pushes toward the upper right, more autonomous, more mobile, and exponentially harder to engineer.
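For readers who prefer code to quadrant charts, here is a toy encoding of the two-axis map. The ordinal values are crude labels rather than measurements, and the placements simply mirror the table below.

```python
from enum import IntEnum


class Autonomy(IntEnum):
    TELEOPERATED = 0
    PRE_PROGRAMMED = 1
    SUPERVISED = 2
    HIGH = 3


class Mobility(IntEnum):
    FIXED = 0
    MOBILE_WHEELS = 1
    MOBILE_AIR = 2
    MOBILE_LEGS = 3


# Category placements, mirroring the table below (ordinal, not quantitative).
CATEGORIES = {
    "industrial arm":  (Autonomy.PRE_PROGRAMMED, Mobility.FIXED),
    "cobot":           (Autonomy.SUPERVISED,     Mobility.FIXED),
    "surgical system": (Autonomy.TELEOPERATED,   Mobility.FIXED),
    "warehouse AMR":   (Autonomy.HIGH,           Mobility.MOBILE_WHEELS),
    "delivery drone":  (Autonomy.HIGH,           Mobility.MOBILE_AIR),
    "humanoid":        (Autonomy.HIGH,           Mobility.MOBILE_LEGS),
}


def engineering_difficulty(category: str) -> int:
    """Crude proxy for the 'upper right' of the map: more autonomy plus more
    mobility means a harder integration problem, per the trajectory the text
    describes."""
    autonomy, mobility = CATEGORIES[category]
    return int(autonomy) + int(mobility)
```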
Below are several areas of heavy development as of early 2026:
| Category | Autonomy | Mobility | Example | Market Maturity |
|---|---|---|---|---|
| Industrial arms | Pre-programmed | Fixed | FANUC, ABB, KUKA | Mature, $16.7B/yr |
| Cobots | Supervised | Fixed | Universal Robots, FANUC CRX | Growing fast, ~10% of industrial installs |
| Warehouse AMRs | High | Mobile (wheels) | Amazon Robotics, Locus Robotics, 6 River Systems | Scaling, 750K+ units at Amazon alone |
| Surgical systems | Teleoperated | Fixed | Intuitive Surgical (da Vinci), Medtronic Hugo | Established, high-margin |
| Delivery drones | High | Mobile (air) | Zipline, Wing | Niche but expanding |
| Humanoids | Variable | Mobile (legs) | Boston Dynamics Atlas, Figure AI, Agility Robotics Digit | Pre-commercial |
Industrial arms are gaining sensors and software that push them toward higher autonomy. Cobots are gaining mobility. Warehouse robots are gaining manipulation capabilities (arms mounted on mobile bases). The trend is toward more autonomy and more mobility simultaneously, which is why the integration challenge (Module 8) is the central engineering problem of the field.
The scene in the accompanying figure captures the division of labor at scale: hundreds of mobile robots ferry shelving pods across the floor, but at the picking station, a human hand still does the work that perception and manipulation systems cannot yet match.
Of the five subsystems introduced here, the one the pitch decks glide past is the mechanical one: the actuators and the structure they move, the body itself. The most expensive component in a Universal Robots UR10e is not the computer, not the force-torque sensors, not even the six electric motors. It is a harmonic drive gear unit that costs between $500 and $1,500; six of them live inside every arm, drawn from a supplier list short enough to fit on an index card, and together they decide what the robot can lift, how precisely it places things, and how long it lasts before a field-service truck has to drive to it. Module 2 opens the body and shows why mechanical choices explain more company failures than any AI benchmark ever will.
Further Viewing
- 🎥 Farewell to HD Atlas (2 min). A decade of Boston Dynamics’ hydraulic Atlas condensed into parkour, falls, and the physical punishment that advanced every generation. The fastest 120 seconds you can spend calibrating your intuition for what robots can and cannot do.
References
1. Čapek, K. (1920). R.U.R. (Rossum’s Universal Robots). Premiered at the National Theatre, Prague, January 25, 1921. English translation by Paul Selver (1923). — The play that gave English the word “robot” ends not with dystopia but with the robots learning to love. A century later, the labor-replacement anxiety Čapek dramatized is still the first question investors ask about every robotics company — and still the wrong framing for understanding where value actually accrues.
2. Murphy, R. (2019). Introduction to AI Robotics. 2nd ed. MIT Press. — The standard graduate textbook that formalizes the sense-plan-act (or sense-think-act) paradigm as the foundational architecture for autonomous robots. Murphy’s framework is used throughout this Core because it maps cleanly onto the evaluation question every investor needs to answer: which part of the loop is this company good at, and which part will kill them?
3. Boston Dynamics. (2024). “Atlas | Partners in Parkour” and “All New Atlas.” Boston Dynamics YouTube, various 2024 releases. — The electric Atlas (unveiled April 2024, replacing the hydraulic version) demonstrates learned locomotion behaviors that would have required years of hand-coded rules in the previous generation. The videos are the clearest public demonstration of the convergence between robotics and AI: the hardware is impressive, but the learned behaviors are what make it useful. Watch the terrain navigation sequences for the state of the art in physical AI.
4. Figure AI. (2025). “Figure Raises Over $1 Billion in Series C.” Figure AI Press Release, September 2025. Prior: Series B of $675M at $2.6B valuation, February 2024. — The 15x valuation jump ($2.6B to $39B) in 18 months reflects investor enthusiasm for the “physical AI” thesis: a single AI system controlling a humanoid across many tasks. Whether the thesis is correct remains unproven — no humanoid robot has demonstrated commercially viable productivity in unstructured environments. The funding trajectory is a bet on a future capability, not a validation of current revenue. Total raised: approximately $1.9 billion.
5. Weise, K. (2024). “The Robots Are Coming for the Warehouse Jobs. It’s Taking a While.” The New York Times, June 2024. — A corrective to the “automation is imminent” narrative. The piece documents how Amazon’s own robotics division has repeatedly pushed back its autonomy timelines — a pattern any investor in warehouse robotics should internalize. The gap between demo-ready and deployment-ready is measured not in software sprints but in years of field testing.
6. Wingfield, N. (2012). “Amazon to Buy Kiva Systems for $775 Million.” The New York Times, March 19, 2012. — The acquisition that launched the modern warehouse robotics era. Amazon paid $775 million for Kiva, then pulled the product from the market (Kiva had been selling to other retailers) and rebranded it as Amazon Robotics. The decision to vertically integrate robotics — rather than buy from a vendor — set the template that other logistics giants have since followed. The acquisition price looks like a bargain relative to the operational savings generated.
7. Amazon. (2023). “Amazon Introduces Sparrow.” Amazon Science Blog, November 2022; updated deployments through 2023. — Sparrow is Amazon’s robotic manipulation system designed to handle individual items in fulfillment centers. It uses computer vision and machine learning to identify and grasp items of varying shapes and sizes — the specific perception-manipulation challenge that the Kiva robots were never designed to solve. As of early 2026, Sparrow handles a subset of Amazon’s inventory; the full range of warehouse SKUs remains beyond current capability, illustrating the “last inch” problem described in this module.
Miller, J. (2026). What Robots Actually Are. Robotics, The End Effector. https://endeff.com/cores/robotics/what-robots-actually-are

