Energy dissipation, sensor bandwidth, and learned compliance
A research-grade follow-up to my SymbioKinetics interview, on how to make a Franka Research 3 reliably 1‑Newton-compliant for diagnostic ultrasound and physiotherapy contact — and what I wish I had said when the question first came up.
TL;DR. The interview question, “what are the ways to dissipate energy to make a robot compliant?”, has a deep answer that I only partially gave in the room (capacitance, heat, ML on raw sensors). This document is the full version: a literature-grounded taxonomy of compliance, a comprehensive inventory of energy dissipation pathways, a defense of why tapping pre-controller sensors with a learned state estimator is the right move on a 1 kHz robot, and a concrete 90-day plan I’d execute on FR3 if hired. If you’re impatient, jump straight to the proposal in §8.
§1 · The interview
I spoke with the SymbioKinetics team about your work on robot-assisted diagnostic ultrasound, US‑guided procedures, and physiotherapy. Toward the end of the conversation, one of the interviewers — a professor — asked a deceptively simple question:
Energy has to leave the robot somewhere for compliance to be real. What are all the ways you can dissipate energy in a compliant arm?
I gave two short answers in the moment — electrical capacitance and mechanical heat — and then pivoted to a hypothesis: that if we tapped sensor signals before they reached the 1 kHz controller and trained an ML model on them, the controller could be made more compliant. I believe the instinct was right, but I could feel that I wasn’t doing the answer justice and couldn’t, on the spot, point to the specific literature or define the architecture precisely.
This document is what I would have said if I had a few hours and a citation manager. It is also, in part, a public note to myself about how I want to argue ideas like this in the future: with the taxonomy, the math, the literature, and the failure modes laid out cleanly.
A few notes on framing:
- These are research notes, not a sales deck. Where the literature disagrees with my pitch, I say so. Where I’d refine my interview answers, I say so (§7).
- I assume the reader is a controls or robotics engineer. Equations, citations, and SI units throughout.
- The four interactive widgets are the only thing in this document that I think is genuinely better as a webpage than as a PDF. They are also the only thing where prior reading is not strictly required — drag the sliders.
- The target platform is Franka Research 3 (FR3)[30], since that matches what SymbioKinetics deploys, but most of the proposal is platform-agnostic.
§2 · The problem space
Three things stack on top of each other to make compliant medical robotics genuinely hard:
1. The force target is tight. Diagnostic ultrasound probes need 5–20 N of normal force on tissue to produce a clean image, and tissue characterization techniques like elastography require force regulation on the order of ±1 N to be meaningful[28]. Physiotherapy contact — sustained pressure on soft tissue, with motion — has the same regulation budget, plus the additional constraint that the patient is also moving (breathing, micro-adjustments, occasional reflexes). For a 7-DOF arm with a 3 kg payload and Cartesian impedance from libfranka[31], 1 N regulation across a clinically-realistic motion envelope is at the edge of what off-the-shelf control gives you.
2. The contact dynamics are not benign. Skin is not a linear spring. Tissue impedance varies by an order of magnitude across body sites and by 2–3× across the breathing cycle on the same site. A controller tuned for one stiffness will overshoot or chatter on another. The literature on robot-assisted ultrasound — particularly from Navab’s CAMP group at TUM[29][28] — treats this as a planning problem (path) and a perception problem (acoustic shadows, segmentation), but the underlying control problem is variable impedance against an uncertain compliant load.
3. The safety budget is small. ISO/TS 15066[32] sets quasi-static and transient force/pressure limits for collaborative robots in contact with the human body, with the strictest limits at the face, neck, and abdomen — all relevant for diagnostic US and PT. The standard treats forces above the pain threshold as a safety failure, not just a clinical one. Add IEC 60601-1[34] (medical electrical safety) and IEC 62366-1[35] (usability) on top, and the design space narrows quickly.
The core technical question is therefore: how do we get reliable ~1 N Cartesian force regulation, robust to a 10× variation in environmental stiffness, on a 1 kHz industrial control stack, within a regulatory envelope that doesn’t forgive failures? Everything below is a piece of that answer.
§3 · The compliance taxonomy
Compliance in a manipulator is the relationship between contact force and motion: how much the end-effector yields to an external push, and how that yielding is shaped over time. The literature carves the space along two axes — where the compliance physically lives (passive vs active) and how the controller drives it (impedance vs admittance vs hybrid).
3.1 Passive, active, and hybrid
A robot is passively compliant if the compliance comes from physical components — springs, elastomers, pneumatic chambers, cable elasticity — that yield even when the controller is off. It is actively compliant if the controller modulates the joint torques (or joint positions) so that the end-effector behaves as if compliant, while the underlying mechanism remains stiff. Most successful medical and human-interaction arms are hybrid: a passively low-impedance mechanism (light links, harmonic drives with backdrivability) plus an active torque controller. FR3 falls in this camp.
The single most useful textbook treatment remains Siciliano & Villani’s Robot Force Control[9]; for compliant joints specifically, Ott’s monograph[4] and Calanca, Muradore & Fiorini’s 2016 survey[8] are the modern references. Calanca et al. introduce a clean distinction between stiff robots (rigid transmission, compliance via control), fixed-compliance robots (compliance in series with the actuator, e.g., series elastic), and variable-compliance robots (variable impedance actuators)[7] — this is the taxonomy I’ll use throughout.
3.2 Impedance control
Impedance control, introduced by Hogan in 1985[1], formalizes the idea that the controller should impose a desired mechanical impedance on the end-effector — not a position, and not a force, but a relationship between the two. In the most common form,
$$M\,\ddot{e} + B\,\dot{e} + K\,e = F_{\text{ext}}$$
where $e = x - x_0$ is the position error against a virtual equilibrium $x_0$; $M$, $B$, and $K$ are the desired Cartesian inertia, damping, and stiffness; and $F_{\text{ext}}$ is the measured (or estimated) contact force. The controller realizes this by commanding joint torques that, under the manipulator dynamics, make the end-effector behave like that mass-spring-damper.
Impedance control needs torque-controllable actuators and low intrinsic impedance — which is why DLR’s LWR[5] and the Franka FR3 are so important: they put a joint-torque sensor at every joint and expose torque commands, so impedance control is the native mode[31].
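A minimal sketch of how the stiffness and damping terms of the impedance law turn into joint torques, assuming gravity and Coriolis compensation happen elsewhere and the arm's natural Cartesian inertia is left unshaped. The function name, the 2×2 Jacobian, and the gain values are illustrative assumptions, not libfranka code.

```python
import numpy as np

def impedance_torque(J, x, x_eq, xdot, K, B):
    """Joint torques that make the end-effector act like a spring-damper.

    J    : task Jacobian (m x n)
    x    : current task-space position (m,)
    x_eq : virtual equilibrium position (m,)
    xdot : current task-space velocity (m,)
    K, B : desired stiffness and damping matrices (m x m)
    """
    e = x - x_eq                   # deflection from the virtual equilibrium
    wrench = -(K @ e + B @ xdot)   # restoring force pulls back toward x_eq
    return J.T @ wrench            # map the task-space wrench to joint torques

# Hypothetical planar 2-joint example: 1 cm deflection along x, at rest.
J = np.array([[1.0, 0.5],
              [0.0, 1.0]])
K = np.diag([500.0, 500.0])   # N/m (assumed)
B = np.diag([40.0, 40.0])     # N*s/m (assumed)
tau = impedance_torque(J, np.array([0.01, 0.0]), np.zeros(2), np.zeros(2), K, B)
print(tau)
```

With these numbers a 1 cm deflection against 500 N/m produces a 5 N restoring force, which the transposed Jacobian spreads across the two joints. Note that nothing here commands a position: the arm is free to deflect, and only the force-displacement relationship is imposed.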
3.3 Admittance control
Admittance is the dual. The controller measures the contact force and computes a commanded velocity (or position) that satisfies the same impedance equation. Under the hood, an inner high-gain position loop tracks that command. Admittance control is the right choice when the underlying actuator is not torque-controllable — e.g., a position-controlled industrial arm — at the cost of having a stiff inner loop that can transiently misbehave on hard contact.
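The force-in, motion-out direction can be sketched as one integration step of a virtual mass-spring-damper, run at the outer-loop rate. This is a 1-DoF illustration under assumed parameters (`M`, `B`, `K`, and the 1 ms step are placeholders); the returned position is what the stiff inner position loop would be asked to track.

```python
def admittance_step(x_cmd, v_cmd, f_ext, x_eq, M, B, K, dt=1e-3):
    """Advance the virtual mass-spring-damper by one control period.

    Solves M*a + B*v + K*(x - x_eq) = f_ext for the acceleration, then
    integrates with semi-implicit Euler to get the next commanded state.
    """
    acc = (f_ext - B * v_cmd - K * (x_cmd - x_eq)) / M
    v_cmd = v_cmd + acc * dt      # update velocity first
    x_cmd = x_cmd + v_cmd * dt    # then position, using the new velocity
    return x_cmd, v_cmd

# Hypothetical run: a constant 5 N push against a 500 N/m virtual spring.
x, v = 0.0, 0.0
for _ in range(3000):             # 3 s at 1 kHz
    x, v = admittance_step(x, v, f_ext=5.0, x_eq=0.0, M=2.0, B=60.0, K=500.0)
print(round(x, 4))
```

The commanded position settles where the virtual spring balances the push (force divided by stiffness). Everything the patient feels beyond that depends on how faithfully the inner loop tracks `x_cmd`, which is where the hard-contact misbehavior comes from.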
The widget below makes the difference visceral: in impedance mode the controller drives a force and the position is allowed to deflect; in admittance mode it drives a position and the force is allowed to swell. Cranking K high in admittance shows the chattering on contact that real arms exhibit when the inner loop is over-aggressive.
3.4 Hybrid force/position
Raibert & Craig’s 1981 hybrid scheme[2] handles the case where the task naturally decomposes into orthogonal directions — position-controlled in some axes (e.g., move the ultrasound probe along the abdomen) and force-controlled in others (e.g., maintain 8 N normal pressure into the skin). It is implemented by projecting commands through a selection matrix S that picks which DoFs run on position and which on force. For ultrasound and PT, hybrid control is often the cleanest framing: the probe path is position; the probe pressure is force.
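A small sketch of the selection-matrix idea, assuming a 6-DoF task frame whose z axis is the probe normal. The gains, the error vectors, and the function name are made up for illustration; a real implementation would also rotate the selection matrix into the contact frame as the probe moves.

```python
import numpy as np

# 1 = position-controlled axis, 0 = force-controlled axis.
# Axis order: [x, y, z, rx, ry, rz]; z is the probe's normal direction here.
S = np.diag([1.0, 1.0, 0.0, 1.0, 1.0, 1.0])

def hybrid_command(S, pos_error, force_error, kp, kf):
    """Blend position and force corrections along complementary axes."""
    I = np.eye(S.shape[0])
    position_part = S @ (kp * pos_error)        # only position-controlled axes
    force_part = (I - S) @ (kf * force_error)   # only force-controlled axes
    return position_part + force_part

# Hypothetical errors: the z position error and the x/y force errors are ignored.
pos_error = np.array([0.002, -0.001, 0.005, 0.0, 0.0, 0.0])   # m, rad
force_error = np.array([1.0, 1.0, 2.0, 0.0, 0.0, 0.0])        # N, N*m
cmd = hybrid_command(S, pos_error, force_error, kp=1000.0, kf=0.5)
print(cmd)
```

Because `S` and `I - S` are complementary projectors, each axis listens to exactly one of the two loops, so the path-following and pressure-regulation objectives cannot fight each other along the same direction.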
3.5 Operational-space and Cartesian impedance
Khatib’s operational-space formulation[3] derives end-effector dynamics directly, expressing both motion and force at the task level and using the null space for redundancy (a 7-DoF arm with a 6-DoF task has 1 DoF of redundancy). Cartesian impedance control on a redundant flexible-joint robot — the FR3’s situation — is laid out rigorously by Ott[4]: stiffness in the task space, plus a null-space stiffness that prefers a “natural” posture without affecting end-effector behavior. This is the control structure I’ll be assuming throughout the rest of the document.
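The null-space part can be illustrated with a projector that removes any component of a posture torque that would show up at the end-effector. This sketch uses the plain Moore-Penrose pseudo-inverse, which is a simplifying assumption: the operational-space formulation uses an inertia-weighted (dynamically consistent) inverse instead. The random Jacobian stands in for a real 7-DoF arm.

```python
import numpy as np

def nullspace_torque(J, tau_posture):
    """Project a posture torque so it produces no static end-effector wrench."""
    J_pinv = np.linalg.pinv(J)                    # shape (n, m)
    N = np.eye(J.shape[1]) - J.T @ J_pinv.T       # (n, n) null-space projector
    return N @ tau_posture

# Hypothetical 6-DoF task on a 7-joint arm: one redundant degree of freedom.
rng = np.random.default_rng(0)
J = rng.standard_normal((6, 7))
tau_posture = rng.standard_normal(7)              # e.g. pull toward a rest pose

tau_null = nullspace_torque(J, tau_posture)
wrench_leak = np.linalg.pinv(J).T @ tau_null      # what the task space would feel
print(np.linalg.norm(tau_null), np.linalg.norm(wrench_leak))
```

The projected torque is nonzero (the elbow can still move), while the wrench it maps to at the end-effector is zero to numerical precision.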
3.6 The disturbance-observer thread
Running parallel to all of this is the disturbance-observer (DOB) tradition out of Ohnishi’s group[41][11]. A DOB infers external disturbances from the discrepancy between commanded and observed motor behavior, allowing the controller to compensate for them without dedicated force sensing. DOBs are not a competing paradigm to impedance control; they are an enabling one, particularly when the joint torque sensors are noisy or expensive. The 2020 35th-anniversary overview[11] is the survey to read if you have not already.
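A single-joint sketch of the classic first-order DOB structure, assuming a rigid motor model (inertia times acceleration equals motor torque minus disturbance) and a known torque constant. The class name and all numeric values are illustrative assumptions; the discretization is plain forward Euler, so the estimate carries a small bias.

```python
class DisturbanceObserver:
    """First-order DOB: low-pass filtered (motor torque - inertia * acceleration)."""

    def __init__(self, inertia, torque_const, cutoff, dt):
        self.J = inertia          # motor-side inertia, kg*m^2 (assumed known)
        self.kt = torque_const    # torque constant, N*m/A (assumed known)
        self.g = cutoff           # observer cutoff, rad/s
        self.dt = dt
        self.z = 0.0              # internal filter state

    def update(self, current, omega):
        # Filtering (kt*i + g*J*omega) and subtracting g*J*omega afterwards
        # avoids differentiating the measured velocity explicitly.
        u = self.kt * current + self.g * self.J * omega
        self.z += self.dt * self.g * (u - self.z)
        return self.z - self.g * self.J * omega

# Hypothetical plant: constant 0.04 N*m load opposing a 0.1 N*m motor torque.
J_m, kt, dt = 0.01, 0.1, 1e-4
true_disturbance = 0.04
dob = DisturbanceObserver(J_m, kt, cutoff=200.0, dt=dt)
omega, current = 0.0, 1.0
for _ in range(1000):                                   # 0.1 s at 10 kHz
    omega += dt * (kt * current - true_disturbance) / J_m
    tau_hat = dob.update(current, omega)
print(tau_hat)
```

The estimate converges to the true load within a few filter time constants, using only current and velocity, with no torque sensor in the loop.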
§4 · Where the energy actually goes
The interview question was: how do you dissipate energy to make a robot compliant? The honest answer is that “dissipate” is too narrow a word — what compliant systems do is temporarily store energy and strategically release it (so that contact transients don’t appear as instantaneous force spikes). Some of that stored energy is later recovered; some is genuinely dissipated as heat or fluid work. Knowing which is which matters because the recovery pathway is what determines efficiency, and the dissipation pathway is what determines worst-case force on the human.
The widget below shows the energy flow per actuator type, with rough fractions based on the literature. The exact splits depend on tuning and load — these are illustrative, not measured.
4.1 Series Elastic Actuators
Pratt & Williamson’s series-elastic actuator (SEA)[6] decouples the motor from the load with a spring of stiffness $k_s$. On contact, the spring stores the incoming kinetic energy as elastic potential energy $\tfrac{1}{2} k_s \delta^2$, where $\delta$ is the spring deflection. The spring also serves as a high-bandwidth force sensor (deflection × $k_s$ = force). Most of the input energy is stored and later recovered; the genuinely lost energy goes to motor-winding $I^2R$ heating, bearing friction, and seal friction. SEAs trade peak force for safety and force fidelity, which is exactly the trade we want for contact with humans.
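A worked example of the peak-force trade, under idealized assumptions: the link side hits a rigid surface, the motor side is held fixed during the impact, and nothing is lost to friction, so all of the link's kinetic energy ends up in the spring. The mass, speed, and stiffness values are invented for illustration.

```python
import math

def sea_impact(effective_mass, speed, spring_stiffness):
    """Peak spring deflection, peak contact force, and stored energy on impact.

    Energy balance: 0.5 * m * v^2 = 0.5 * k * delta^2
    """
    peak_deflection = speed * math.sqrt(effective_mass / spring_stiffness)
    peak_force = spring_stiffness * peak_deflection   # spring acts as the force sensor
    stored_energy = 0.5 * spring_stiffness * peak_deflection ** 2
    return peak_deflection, peak_force, stored_energy

# Hypothetical 2 kg effective mass arriving at 0.25 m/s, two spring choices.
stiff = sea_impact(2.0, 0.25, 20000.0)   # 20 kN/m
soft = sea_impact(2.0, 0.25, 5000.0)     # 5 kN/m
print(stiff)
print(soft)
```

Both springs absorb the same energy, but the softer one does it over twice the deflection and at half the peak force. Peak force scales with the square root of stiffness, which is the quantitative version of "trade peak force for safety".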
4.2 Variable impedance / variable stiffness actuators
A VSA goes one step further: a second motor (or cam, or antagonistic pulley) modulates the spring stiffness on the fly[7]. You spend extra electrical energy preloading the stiffness, but you gain the ability to be soft during fast motion and stiff during precise positioning. The 2013 Vanderborght et al. survey[7] classifies the design space into four categories and remains the standard reference. The cost is mechanical complexity and an extra control axis per joint, which is why VSAs are common in research and rare in production.
4.3 Damping: viscous, magnetorheological, viscoelastic
Where SEAs and VSAs store energy, dampers dissipate it.
- Viscous dampers (hydraulic, fluid-filled) produce a force proportional to velocity. Standard tooling, but the damping is fixed at design time.
- Magnetorheological (MR) dampers modulate effective viscosity in roughly 10 ms by varying a magnetic field that orients iron particles in a carrier fluid. This makes damping a controllable variable, not a fixed parameter — useful for variable-impedance designs that don’t want a second motor.
- Viscoelastic polymers (silicone, rubber) combine elastic storage and internal friction. They are cheap, passive, and excellent for absorbing accidental high-frequency impacts, at the cost of hysteresis loss and creep over time.
All three are pure heat-into-the-environment paths; nothing is recovered. They earn their place in compliance design by bounding worst-case force on impact, not by improving efficiency.
4.4 The “capacitance” answer, done correctly
In the interview I said capacitance. What I meant — and what I should have named — is regenerative braking onto a shared DC bus, buffered by a capacitor bank, with surplus dumped through a braking resistor. This is how every modern multi-axis servo system handles the energy that comes back when one axis decelerates while another accelerates.
Mechanically: the motor is run as a generator. Electrically: the back-EMF current is rectified onto the DC bus. If a sibling axis is consuming power, the bus voltage stays nominal and the energy is transferred, not dissipated. If nothing is consuming, the bus capacitor absorbs the transient. If the cap reaches its overvoltage threshold, an IGBT switches a braking resistor in and dumps the surplus as heat. So three pathways:
- Transfer to another axis (lossless to the system).
- Storage in the DC-bus cap (recoverable on the next acceleration).
- Dissipation through the braking resistor (genuine heat loss).
For a compliant robot, this matters because the controller can deliberately run motors in regenerative mode when the load is pushing back — turning the arm’s interaction kinetic energy into bus-cap charge rather than heat — and recover it on the next move. It’s not “capacitance” as a property of the actuator; it’s the bus-cap as a buffer. I should have used that language.
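The three pathways can be written as a simple energy-routing rule: siblings take what they can, the capacitor absorbs up to its voltage headroom, and the braking resistor burns the rest. This is bookkeeping only, not a drive model; the function name, bus voltages, and capacitance are assumed values chosen to make the arithmetic easy to follow.

```python
def route_regen_energy(e_regen, sibling_demand, c_bus, v_bus, v_max):
    """Split regenerated energy (J) into transferred, stored, and dissipated parts.

    c_bus : DC-bus capacitance in farads
    v_bus : present bus voltage in volts
    v_max : overvoltage threshold at which the braking chopper switches in
    """
    transferred = min(e_regen, sibling_demand)            # consumed by another axis
    remaining = e_regen - transferred
    headroom = 0.5 * c_bus * (v_max ** 2 - v_bus ** 2)     # energy the cap can still take
    stored = min(remaining, max(headroom, 0.0))
    dissipated = remaining - stored                        # heat in the braking resistor
    return transferred, stored, dissipated

# Hypothetical event: 3 J comes back while a sibling axis needs 1 J.
transferred, stored, dissipated = route_regen_energy(
    e_regen=3.0, sibling_demand=1.0, c_bus=2e-3, v_bus=48.0, v_max=60.0
)
print(transferred, stored, dissipated)
```

With a 2 mF bus at 48 V and a 60 V threshold, the capacitor has about 1.3 J of headroom, so roughly 0.7 J of this event still ends up as heat. Sizing the capacitor and resistor is a matter of running this balance against the worst-case deceleration profile.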
4.5 Pneumatic and tendon-driven
McKibben pneumatic muscles store energy in compressed air (as compression work, $\int p\,\mathrm{d}V$), and the air column itself is inherently compliant. They are common in soft robotics and rehabilitation devices because they are intrinsically safe: air leaks and decompresses gracefully. The dissipative paths are valve flow friction and seal leakage. They are uncommon in industrial medical robotics because of the bulk and noise.
Tendon-driven systems (cable-and-pulley, e.g., MIT’s quadrupeds, surgical wrist mechanisms) decouple the motors from the joints. Compliance comes from cable elasticity; dissipation comes from Bowden routing friction. Useful when you want to put heavy motors at the base and keep the moving structure light.
4.6 The right answer to the interview question
If I had a whiteboard, the answer to how do you dissipate energy? would be a 2×2:
| | Storage (recoverable) | Dissipation (heat) |
|---|---|---|
| Mechanical | Series springs (SEA), pneumatic compression | Viscous, MR, viscoelastic |
| Electrical | DC-bus cap, sibling-axis transfer | Motor $I^2R$, braking resistor |
For SymbioKinetics on FR3, the design decision has already been made: most of the compliance budget sits in the active impedance controller (high-bandwidth torque control against measured joint torque), some passive compliance lives in the harmonic-drive flexibility and the wrist tooling, and energy regeneration happens transparently through the drive electronics. There is almost no room to add new dissipation hardware. That is the constraint that motivates the rest of this document: if we can’t change the dissipation hardware, the leverage left is in how the controller knows what’s happening at the contact, which is the bandwidth question and, in turn, the ML question.
§5 · Sensor placement and the real bandwidth ceiling
In the interview I said what if we tap the sensors before they reach the controller, and use ML? The instinct was right; the formulation was vague. Here is the rigorous version.
5.1 The Nyquist constraint isn’t the only one
The naive view of control bandwidth is the Nyquist limit: a controller sampling at $f_s$ can resolve signal content up to $f_s/2$[12]. In practice, controllers need closed-loop bandwidth roughly an order of magnitude below their sample rate: anti-aliasing filter delay, finite-precision arithmetic, computation latency, and stability margins eat the rest. A 1 kHz FCI loop on FR3[31] realistically yields ~100 Hz of useful closed-loop force-tracking bandwidth, not 500 Hz.
What this means concretely: a contact transient that rings at, say, 600 Hz (a perfectly reasonable frequency for a stiff joint hitting a stiff object) does not show up as a slightly smeared version of itself in the controller’s input. It shows up as an alias. Sampled at 1 kHz, a 600 Hz component folds down to 400 Hz, and a component at 950 Hz would fold to 50 Hz, well inside the loop bandwidth, where it masquerades as a slow disturbance. The controller responds to the ghost, not to reality.
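The folding arithmetic is easy to check numerically. The helper below (a hypothetical name, not from any library) computes where a tone lands after ideal sampling with no anti-aliasing filter, and the NumPy lines confirm that the sampled 600 Hz ringing is indistinguishable from a 400 Hz tone at a 1 kHz sample rate.

```python
import numpy as np

def alias_frequency(f_signal, f_sample):
    """Apparent frequency of a real tone after ideal sampling (no AA filter)."""
    f = f_signal % f_sample          # fold into [0, f_sample)
    return min(f, f_sample - f)      # then reflect into [0, f_sample / 2]

# Sample a 600 Hz cosine and a 400 Hz cosine at 1 kHz and compare the samples.
n = np.arange(50)
ringing = np.cos(2 * np.pi * 600.0 * n / 1000.0)
ghost = np.cos(2 * np.pi * 400.0 * n / 1000.0)

print(alias_frequency(600.0, 1000.0))
print(alias_frequency(950.0, 1000.0))
print(np.allclose(ringing, ghost))
```

The two sample sequences match exactly, so no amount of downstream filtering in the 1 kHz loop can tell them apart. The information has to be captured upstream, at a higher rate.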
Drag the ringing-frequency slider in the widget below past 500 Hz with the rate set to 1 kHz, and watch the orange trace stop resembling the grey one.
5.2 Collocated vs non-collocated sensing
Spong’s 1987 flexible-joint robot paper[10] makes the structural point: if your sensor is on the wrong side of a compliance, you cannot stabilize arbitrarily fast. The Franka FR3 mostly avoids this: joint torque sensors live just outboard of the harmonic drive at every joint, very close to the load. But there are still residual non-collocated paths — the wrist’s tool plate, the probe holder, the probe itself — where compliance lives downstream of all the sensors. These are the high-frequency contact paths the outer loop literally cannot see.
5.3 Motor current as a fast torque proxy
Inside every servo drive there is already a current loop running at 10–40 kHz. The motor current is, modulo gear friction and inductance, a fast proxy for joint torque. Ohnishi’s disturbance-observer[41][11] uses exactly this signal — comparing commanded vs. observed current — to infer disturbances at a rate far higher than the outer position loop. Disturbance-observer-based robust control has been a mainstream technique for 35 years; it works.
This is where the “tap before the controller” instinct lands in the literature: not as a new idea, but as the recognition that the fastest available copy of the contact information is already inside the drive electronics, and the outer loop is throwing most of it away by undersampling.
5.4 Why ML matters
If the only goal were to recover undersampled signals, we’d reach for a Kalman filter or a frequency-domain observer and stop. The reason ML enters is that the contact dynamics — tissue stiffness, probe-skin slip, breathing-induced motion — are nonlinear, non-stationary, and impossible to model analytically. We don’t have a closed-form model that maps “raw joint torques + raw motor currents + probe inertia” to “current contact state on this patient’s abdomen at this point in the breathing cycle.” We have data. So we learn the map. That is the formal name for what I was trying to describe: a learned state estimator. The literature on this is dense and recent; §6 walks through it.
5.5 The bandwidth budget summary
| Stage | Typical rate (FR3) | What it can resolve |
|---|---|---|
| FOC current loop (inside drive) | 10–40 kHz | Audio-frequency contact transients, friction signatures |
| Joint torque sensor stream (FCI) | 1 kHz | Steady-state and slow transients up to ~100 Hz |
| Outer Cartesian impedance loop | 1 kHz | Closed-loop bandwidth ~100 Hz |
| Wrist 6-axis F/T (typical) | 500 Hz–1 kHz | Aggregated end-effector force, slow |
| Vision (probe pose) | 30–60 Hz | Gross probe placement, posture |
The point: the table lists four sensor streams spanning roughly three orders of magnitude in rate, and the only one currently feeding the controller is the 1 kHz torque stream. The rest is on the cutting-room floor. Recovering it is what the proposal in §8 is about.
§6 · ML for compliance — the literature you’d cite
The pitch in §5 — learned state estimator fed by raw multi-rate sensors — sits inside a much larger body of recent work on ML for contact-rich manipulation. This section is a short tour of the parts of that literature most directly relevant to SymbioKinetics’ problem space.
6.1 Learned inverse dynamics and friction models
The oldest thread is learning the manipulator’s own dynamics — Coriolis terms, joint friction, transmission backlash — directly from data instead of from a CAD-derived model. Nguyen‑Tuong & Peters surveyed this in 2011[25]; the modern incarnation uses neural networks or Gaussian processes to model the residual between the rigid-body model and reality. For an arm like FR3, where the manufacturer already provides a high-quality dynamic model, the wins are mostly in friction, harmonic-drive flexibility, and temperature-dependent effects.
6.2 Residual policy learning
Silver & Allen’s residual policy framework[16] and Johannink et al.’s residual RL[17] formalize a pragmatic idea: keep your hand-designed controller as the base, and learn a correction on top of it. This is exactly the right abstraction for a safety-critical medical context, because the base controller remains analyzable and the learned residual can be bounded, monitored, and (if necessary) disabled.
For compliance: keep libfranka’s Cartesian impedance controller as the base, and learn a residual torque (or a stiffness adjustment) from the multi-rate sensors that compensates for whatever the base controller misses — sensor lag, contact transients, tissue stiffness variation. This composes cleanly with the architecture I’ll propose.
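A sketch of what "bounded, monitored, and (if necessary) disabled" can mean in code. The base torque is assumed to come from the existing impedance controller and the residual from a learned model; the function name, the per-joint authority limit, and the example vectors are all hypothetical.

```python
import numpy as np

def apply_residual(tau_base, tau_residual, max_residual, enabled=True):
    """Add a learned correction to the base torque with hard-limited authority."""
    if not enabled:
        return tau_base.copy()                       # pure base controller
    bounded = np.clip(tau_residual, -max_residual, max_residual)
    return tau_base + bounded

# Hypothetical 3-joint example with a 0.2 N*m authority limit per joint.
tau_base = np.array([1.0, -2.0, 0.5])
tau_residual = np.array([0.05, -0.9, 0.3])           # second and third exceed the limit
tau_cmd = apply_residual(tau_base, tau_residual, max_residual=0.2)
tau_off = apply_residual(tau_base, tau_residual, max_residual=0.2, enabled=False)
print(tau_cmd, tau_off)
```

Whatever the learned model outputs, the commanded torque never departs from the base controller by more than the limit, which is the property a worst-case safety analysis needs.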
6.3 Diffusion policies for contact-rich tasks
Diffusion Policy[13], from Chi et al., is probably the most important recent shift in robot policy learning. Instead of training a deterministic mapping from observations to actions (which collapses when multiple actions are equally valid), they train a score function whose Langevin sampling produces a distribution over action sequences. The architectural payoff is that contact-rich tasks — where the right action depends sensitively on contact state — become tractable in ways they weren’t with standard behavioral cloning.
Reactive Diffusion Policy[14] pushes this further: a slow planner generates a coarse trajectory, and a fast reactive head consumes tactile/force feedback to perturb that trajectory in real time. For ultrasound scanning, this maps almost too cleanly: the slow planner is the scan path; the fast reactive head is the probe-pressure controller.
6.4 Implicit behavioral cloning and energy-based policies
Florence et al.’s Implicit BC[15] represents the policy as an energy function over (observation, action) pairs and selects actions by minimizing energy at inference time. For contact tasks where action multimodality is common (multiple grips give the same outcome), this dramatically outperforms explicit BC on standard benchmarks.
6.5 Multimodal sensor fusion
For our specific problem — fusing 1 kHz torque, 10–40 kHz current, 500 Hz wrist F/T, and 30 Hz vision — the right inductive bias is a model that natively handles irregular and asynchronous sensor streams. Neural Controlled Differential Equations[23][24] are exactly that: a continuous-time latent state that gets updated whenever any sensor produces a measurement, queried at whatever rate the consumer (the impedance controller) wants. The 2022 multiscale-sensor-fusion paper[24] demonstrates this on robotic peg-in-hole; the architectural fit to FR3’s situation is direct.
Lee, Zhu & Srinivasan[26] attack the same problem from a different angle, self-supervised representations of vision and touch, and show that fusing modalities helps even when no labels are available. Useful for the data-engine question of §9.
6.6 Imitation from teleop: the data-engine references
For the data side, three projects are essential reading:
- Mobile ALOHA[18]: leader-follower teleop with a low-cost rig collecting bimanual mobile-manipulation demos; the authors report that ~50 demos per task, combined with co-training, are enough for high success rates on their tasks.
- Universal Manipulation Interface (UMI)[19]: in-the-wild data collection with a handheld gripper and a GoPro; policies transfer zero-shot across robot platforms via relative-trajectory action representation.
- DROID[21]: a 76k-episode pre-training dataset across 86 tasks and 564 scenes; the closest thing the field has to ImageNet for manipulation.
For ultrasound specifically, the obvious adaptation is a UMI-style handheld probe with built-in force sensing that a sonographer can use during a normal scan. Every scan becomes a teleoperated demonstration; the data engine costs nothing extra.
6.7 Variable impedance learning
Buchli et al.[27] showed that the impedance gains themselves can be learned from demonstration — the operator’s natural variation in stiffness while performing a task encodes useful information about which directions need to be stiff and which compliant. This is directly applicable to physiotherapy: a PT specialist applying graded pressure to a patient is implicitly demonstrating an impedance trajectory; with the right rig, that trajectory becomes the training signal.
6.8 Foundation models for robotics
For completeness: Octo[20], RT-2[22], and the general trend toward vision-language-action foundation models are relevant for task semantics (mapping language to actions, multi-task generalization). They are not (yet) the right tool for the low-level state estimation problem this document is about. They may become it once on-device inference latencies drop another order of magnitude.
§7 · Corrections to my interview answers
Three places where the in-the-moment answer was directionally right but technically loose, and how I’d rephrase them now.
7.1 “Capacitance” → DC-bus capacitor + regenerative braking
I said capacitance as a way of capturing “the energy goes into stored electrical charge.” The literal mechanism is a DC-bus capacitor on the shared servo drive, working alongside a braking resistor and sibling-axis power transfer. The energy isn’t dissipated in the capacitor — it’s stored there transiently, and either consumed by another axis or dumped through the brake resistor as heat. The reason to know this distinction is that the engineering knob isn’t “add capacitance”; it’s “size the bus cap and braking resistor for the worst-case deceleration profile, and use multi-axis power sharing where possible.” Full treatment in §4.4.
7.2 “ML on raw sensors” → neural state estimator
The phrase I should have used is learned state estimator (or neural observer, depending on which community you talk to). The architecture is well-defined in the literature[24][11]: take irregular, asynchronous, high-rate sensor streams; map them through a learned latent dynamics model; query the latent state at whatever rate the downstream controller demands; feed it into the existing impedance controller, optionally as a residual on top of the model-based estimate. The reason this is more constrained than “just use ML” is that the interface to the safety-rated controller stays the same — we feed it the same state vector it already consumes, only better. This is what the architecture in §8 implements.
7.3 “Maybe we add sensors at the joints” → reuse what’s already there
I floated putting “optical or Hall-effect sensors on every joint.” The cleaner move is don’t add hardware; expose what’s already inside. The FR3’s drives already measure motor current at 10–40 kHz inside the FOC loop[11][31], and every joint already has a calibrated strain-bridge torque sensor at 1 kHz[30]. The bandwidth is there; what’s missing is exposing the fast signal to a learned estimator. Adding new sensors is a 12-month hardware program; tapping the existing ones (via either an extension to libfranka or a parallel FPGA pickup on the drive bus) is a 3-month software program. Always prefer the cheaper alternative when the information content is the same.
7.4 What I’d avoid claiming next time
In the interview I implied that a learned approach would automatically improve compliance. It won’t — it will improve compliance if the bottleneck is state estimation, if enough representative training data is available, and if the policy is wrapped in a residual structure that bounds its authority. Those are real ifs, and §11 spells them out. If the bottleneck is somewhere else — bad mechanical design, poorly characterized harmonic-drive flexibility, a wrist that resonates at 80 Hz — ML will not save the system; it will just disguise the underlying problem behind a slightly better-looking force trace. The honest version of the pitch begins with: we should first determine what the dominant error source actually is, and then choose tools accordingly.
§8 · The proposal
A concrete, FR3-specific architecture that keeps libfranka’s safety-rated impedance controller in the loop, adds a learned state estimator fed by the sensors that already exist in the system, and treats new hardware as a last resort. The widget below is interactive — click any block to see what it does and why.
8.1 The three things this design refuses to do
It does not replace the controller. libfranka’s Cartesian impedance is the safety-critical path. We feed it cleaner inputs and we adapt its gains (K, B) online from the estimator’s tissue-stiffness output, but the controller itself, its 1 kHz cycle, and its safety stops remain bit-for-bit the manufacturer’s.
It does not add new sensors as the first move. The FR3 already has joint torque sensors at every joint and motor-current sensing inside every drive. The first deployment uses only these. An optional wrist 6-axis F/T sensor is added only as a ground-truth channel during training-data collection — it can be removed in production.
It does not put a giant neural network in the inner loop. The estimator is small (≤ 1 M parameters), runs in well under the 1 ms inner-loop budget, and has bounded output range. If it diverges, the controller falls back to the model-based state estimate it always had.
8.2 The data path, in detail
- Joint torque sensors stream at 1 kHz via the standard FCI. No changes here.
- Motor current is exposed at 10–40 kHz. The cleanest way to get this off the FR3 today is via the Franka FCI extension that provides motor-side data; if that proves inadequate, an FPGA bus-tap on the drive’s encoder/current bus is the fallback (estimated 6 weeks of HW work).
- Optional wrist F/T sensor (e.g., ATI Nano17) at 500–1000 Hz. Used during teleop data collection; removable in production.
- A small FPGA (or PREEMPT_RT Linux with a PTP grandmaster) time-aligns the three streams to microsecond accuracy. Skipping this step is a common reason learned observers fail: they end up learning sensor lag instead of contact physics.
- The neural state estimator is a Neural CDE[23][24] or a small transformer (we’d benchmark both). Input: the asynchronous multi-rate sensor streams. Output: at the controller’s 1 kHz rate, a 6-DoF contact state — Cartesian force vector + an estimate of the local environmental stiffness.
- The Cartesian impedance controller consumes the estimated state and adapts accordingly. A residual-policy layer[16][17] can add a learned torque correction on top of the impedance output if and only if a watchdog confirms the estimator’s covariance is below a threshold.
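The watchdog at the end of this data path can be sketched as a gate that passes the learned estimate only when it is finite and its uncertainty is small, and otherwise hands the controller the model-based estimate. Using the covariance trace as the scalar health metric is an assumption made here for illustration, as are the function name and the threshold.

```python
import numpy as np

def gate_estimate(est_state, est_cov, fallback_state, max_trace):
    """Return (state_to_use, healthy) for the downstream controller."""
    healthy = bool(
        np.all(np.isfinite(est_state))
        and np.all(np.isfinite(est_cov))
        and np.trace(est_cov) < max_trace    # total variance below the threshold
    )
    return (est_state if healthy else fallback_state), healthy

# Hypothetical 3-axis contact-force estimate and its model-based fallback.
fallback = np.array([0.0, 0.0, 7.5])
confident = (np.array([0.1, -0.2, 8.0]), np.diag([0.01, 0.01, 0.02]))
uncertain = (np.array([0.1, -0.2, 8.0]), np.diag([0.5, 0.5, 2.0]))
diverged = (np.array([np.nan, 0.0, 8.0]), np.diag([0.01, 0.01, 0.02]))

state_ok, ok = gate_estimate(*confident, fallback, max_trace=0.1)
state_unc, unc_ok = gate_estimate(*uncertain, fallback, max_trace=0.1)
state_div, div_ok = gate_estimate(*diverged, fallback, max_trace=0.1)
print(ok, unc_ok, div_ok)
```

The `healthy` flag is the same signal that would enable or disable the residual-torque layer, so a single check governs both the state input and the learned correction.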
8.3 What this design buys you
Sub-Newton force regulation in the presence of unmodeled contact dynamics. The headline pitch. The widget in §5 shows why the existing controller can’t get there from its current sensor diet; the architecture above gives it the diet it needs.
Online stiffness adaptation without explicit tissue modeling. The estimator outputs a stiffness scalar (or matrix). The controller uses it directly. No patient-specific calibration loop; the system learns the mapping from sensor signatures to tissue stiffness during training.
Graceful degradation. If the estimator’s confidence drops, the controller defaults to its baseline behavior. This is critical for clinical deployment — the system never “tries something new”; it falls back to the safety-rated default.
A clear path to FDA-relevant validation. Because the estimator is a bounded residual and not the primary controller, the verification story is: characterize the base controller, characterize the bound on the residual, then characterize their combined behavior in the worst case. This is much easier to defend in a 510(k) submission than “an end-to-end policy controls the arm.”
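The bounded-residual argument fits in a few lines. This is a minimal sketch, assuming the estimator exposes a scalar variance and the residual is a per-joint torque; `blend_torque`, `var_threshold`, and `residual_limit` are hypothetical names with placeholder values, not tuned numbers.

```python
def blend_torque(tau_base, tau_residual, est_variance,
                 var_threshold=0.05, residual_limit=0.5):
    """Base impedance torques plus a gated, clipped learned residual.

    If the estimator's variance exceeds the threshold, the residual is
    dropped entirely and the baseline output passes through unchanged.
    Otherwise each residual term is clipped to +/- residual_limit (N·m).
    """
    if est_variance > var_threshold:
        return list(tau_base)  # watchdog tripped: baseline only
    clipped = [max(-residual_limit, min(residual_limit, r)) for r in tau_residual]
    return [b + r for b, r in zip(tau_base, clipped)]

# Confident estimator: the residual is applied, but clipped per joint.
confident = blend_torque([1.0, 2.0], [0.2, -3.0], est_variance=0.01)
# Uncertain estimator: output equals the baseline.
fallback = blend_torque([1.0, 2.0], [0.2, -3.0], est_variance=0.20)
```

By construction the output never differs from the baseline by more than `residual_limit` on any joint, which is the bound the verification story has to characterize.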
8.4 Where this design is wrong
Two places where the architecture is opinionated and could be argued against:
- Hard-coupling the estimator to the existing controller means we can’t take advantage of policies that reshape the controller itself (e.g., a diffusion policy that replaces the impedance loop with a learned one). For SymbioKinetics’ regulatory situation this seems right, but for a research lab the choice would be different.
- Treating the wrist F/T as optional assumes the estimator can learn to do without it once trained. This is a real assumption that has to be tested empirically before committing to a hardware design. If it fails, the production system needs the F/T sensor as a permanent component.
§9 · The data engine
A learned state estimator is only as good as its training data. For SymbioKinetics, the data engine has three layers — each cheaper and broader than the last — and a clean composition story between them.
9.1 Layer A: phantom benchtop
The cheapest data, by far, is silicone-tissue phantoms scanned by a teleoperated FR3. Build a simple “leader-follower” rig in the style of Mobile ALOHA[18]: a passive arm (or even a single-DoF haptic puck) that an operator drives by hand, with the FR3 mirroring its motion in real time. Every session records:
- 1 kHz joint torques + 10 kHz motor currents from FR3
- 500 Hz wrist F/T as ground truth
- 30 Hz RGB from a fixed camera
- the leader’s commanded motion (= the demonstrator’s intended trajectory)
Phantoms are cheap, can be reformulated to match different tissue stiffness profiles, and can be instrumented with embedded load cells to validate the estimator against absolute force. Target: 200 hours of phantom data, ≈30 distinct phantom variants, in the first month.
9.2 Layer B: in-the-wild via UMI-style probe
Adapting UMI[19] to ultrasound is the high-leverage move. A handheld diagnostic probe instrumented with a 6-axis F/T sensor and a wide-FOV camera turns every sonographer’s normal workday into a data collection session. They scan as they normally would; the rig records pose, force, and image, with their consent.
Critical implementation details:
- The action representation should be relative trajectories in the probe frame, exactly as UMI does — this is what makes the data transfer to a different robot platform without modification.
- The F/T sensor must be calibrated against the same model used on the FR3’s wrist; cross-rig calibration drift is the single biggest source of handheld-to-robot transfer error in this kind of system.
- IRB/consent paperwork should be sorted out early; for a startup, this is the gating constraint, not the engineering.
Target: 50 sonographers × 4 hours each = 200 hours of in-the-wild data, by month 3.
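The relative-trajectory representation mentioned above amounts to expressing each future probe pose in the frame of the pose at the start of the action chunk. The sketch below assumes poses are (3×3 rotation, translation) pairs in some world frame; the helper names are hypothetical and this is not UMI's actual code.

```python
def invert(R, t):
    """Inverse of a rigid transform (R, t): (R^T, -R^T t)."""
    Rt = [[R[j][i] for j in range(3)] for i in range(3)]
    t_inv = [-sum(Rt[i][j] * t[j] for j in range(3)) for i in range(3)]
    return Rt, t_inv

def compose(Ra, ta, Rb, tb):
    """Matrix product T_a @ T_b for rigid transforms."""
    R = [[sum(Ra[i][k] * Rb[k][j] for k in range(3)) for j in range(3)]
         for i in range(3)]
    t = [ta[i] + sum(Ra[i][k] * tb[k] for k in range(3)) for i in range(3)]
    return R, t

def relative_actions(poses):
    """Express every pose after the first in the first pose's frame."""
    R0_inv, t0_inv = invert(*poses[0])
    return [compose(R0_inv, t0_inv, R, t) for R, t in poses[1:]]

# Probe yawed 90 degrees about z, then moved 1 m along world y.
Rz90 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
poses = [(Rz90, [1.0, 2.0, 0.0]), (Rz90, [1.0, 3.0, 0.0])]
(rel_R, rel_t), = relative_actions(poses)
```

In the probe frame that motion is a pure 1 m step along the probe's own x axis with no rotation, independent of where the probe sat in the room. That independence is what lets the same action transfer to a different robot base.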
9.3 Layer C: public pretraining
Pre-training the encoder on DROID[21] gives the network a strong manipulation prior before it ever sees ultrasound-specific data. The DROID release covers 564 distinct scenes across 86 tasks, collected on Franka arms; that platform alignment is rare and valuable. Octo[20], a generalist policy pretrained on the Open X-Embodiment mixture, is the reference recipe for fine-tuning a foundation model on data like this; we’d fine-tune from an Octo checkpoint rather than pretrain from scratch.
9.4 Composition: how the layers stack
The training recipe, in order:
- Initialize from Octo[20] weights (vision + proprioception encoder).
- Domain pre-training on DROID[21], freezing nothing.
- Task pre-training on layer-A phantom data — heavy supervision because ground-truth force is available.
- Task fine-tuning on layer-B in-the-wild data, with lighter supervision: the handheld rig records probe force but no robot joint torques or motor currents, so self-supervised representations from vision + touch[26] fill the gap.
- Online residual learning during deployment, with strict watchdogs: collect the residual policy’s actions and rewards from operators’ annotations, retrain the residual offline weekly.
9.5 Augmentation: sim-to-real and domain randomization
The empirical lesson from the dexterous-manipulation literature is that the single most leveraged augmentation is randomizing the contact model in simulation: vary contact stiffness, friction coefficient, sensor noise, and sensor delay across episodes during training. We’d build this in MuJoCo against the FR3 URDF, with ultrasound-probe-specific contact tooling.
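A per-episode sampler for those four quantities might look like the sketch below. Every range is a placeholder assumption, not a measured tissue or sensor value, and writing the sampled values into the MuJoCo model is deliberately left out.

```python
import math
import random
from dataclasses import dataclass

@dataclass
class ContactParams:
    stiffness_n_per_m: float
    friction_coeff: float
    force_noise_std_n: float
    sensor_delay_s: float

def sample_contact_params(rng: random.Random) -> ContactParams:
    """Draw one episode's contact and sensing parameters."""
    # Log-uniform draw, assuming stiffness spans orders of magnitude.
    log_k = rng.uniform(math.log(1e2), math.log(1e4))
    return ContactParams(
        stiffness_n_per_m=math.exp(log_k),
        friction_coeff=rng.uniform(0.1, 1.0),
        force_noise_std_n=rng.uniform(0.0, 0.2),
        sensor_delay_s=rng.uniform(0.0, 2e-3),
    )

rng = random.Random(0)  # seeded so a training run is reproducible
episodes = [sample_contact_params(rng) for _ in range(100)]
```

Sampling once per episode, rather than once per step, keeps each rollout physically consistent while still exposing the policy to the full range across training.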
9.6 What “enough data” looks like
For an FR3-specific contact policy, prior art on similar tasks[18][14] suggests 50–200 hours of task-specific demonstrations is the regime where performance saturates. That’s the order of magnitude we should target, not 10,000 hours. The constraint is diversity (patient body habitus, anatomical sites, breathing states), not raw volume.
§10 · 90-day plan, if I were hired tomorrow
The plan below assumes I’m joining as a robotics/controls engineer with access to one FR3, one technician day per week, and a single phantom workstation. It is deliberately scoped to what one person can finish in 90 days, not what a team could do. The deliverables compound: each week’s output is consumed by the next.
Week 1–2 · Baselining
- Spin up FR3 with libfranka[31], default Cartesian impedance, ROS 2 bridge.
- Build a 1-DoF normal-force-tracking benchmark on a silicone phantom: hit and hold 5 N, 10 N, 20 N targets while the phantom translates beneath the probe at varying speeds and surface curvatures.
- Deliverable: a quantitative force-RMSE chart over the operating envelope, baseline numbers everyone agrees on.
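The baseline metric itself is small enough to pin down in code. A minimal sketch, assuming each benchmark run yields a list of measured normal forces during the hold phase; the traces and the (target, speed) keys are made-up placeholders.

```python
import math

def force_rmse(measured_n, target_n):
    """Root-mean-square force error (N) of a hold against a constant target."""
    return math.sqrt(sum((f - target_n) ** 2 for f in measured_n) / len(measured_n))

# Hypothetical traces keyed by (target force in N, phantom speed in mm/s).
traces = {
    (5.0, 10.0): [4.8, 5.1, 5.2, 4.9],
    (10.0, 10.0): [9.5, 10.4, 10.6, 9.7],
}
chart = {key: force_rmse(trace, key[0]) for key, trace in traces.items()}
```

Keying the results by operating point keeps the later default-versus-learned comparison a like-for-like lookup.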
Week 3–4 · Data pipeline
- Build the teleop rig (passive leader arm + foot pedal trigger).
- Wire 1 kHz torque, 10 kHz motor current (via FCI extension or, as fallback, a Beckhoff EtherCAT bus tap), 500 Hz ATI wrist F/T, and 30 Hz camera into a single timestamped recorder.
- Run a smoke test: 1 hour of phantom teleop data, verify time alignment to ≤ 100 µs.
- Deliverable: a working data-collection rig and a reusable Parquet schema for sensor data.
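One way to check the ≤ 100 µs alignment target is to feed a shared sync pulse into every recorder input and compare the timestamp each stream assigns to each edge. The sketch assumes such a pulse exists and that every device timestamps its edges on the common clock; both are assumptions about the rig, and `max_skew_us` is a hypothetical helper.

```python
def max_skew_us(edge_times_by_stream):
    """Worst disagreement (µs) between streams on the time of any sync edge.

    edge_times_by_stream maps a stream name to the timestamps (seconds)
    that stream assigned to the same sequence of sync edges.
    """
    n_edges = min(len(stamps) for stamps in edge_times_by_stream.values())
    worst = 0.0
    for k in range(n_edges):
        stamps = [s[k] for s in edge_times_by_stream.values()]
        worst = max(worst, (max(stamps) - min(stamps)) * 1e6)
    return worst

# Hypothetical edge timestamps from three recorders.
edges = {
    "joint_torque": [1.0, 2.0],
    "motor_current": [1.00005, 2.00002],
    "wrist_ft": [1.00001, 2.00003],
}
skew = max_skew_us(edges)
aligned = skew <= 100.0
```

A cross-correlation of the raw signals is an alternative when no sync line is available, but its resolution is limited by the slowest stream's sample period.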
Week 5–6 · Estimator v0
- Train a Neural CDE[23][24] on the phantom corpus to predict instantaneous Cartesian force from the raw sensor streams.
- Evaluate against the wrist F/T ground truth.
- Deliverable: estimator predictions match wrist F/T within 0.3 N RMSE on held-out phantoms, with explicit reporting of failure cases (high stiffness, near-resonance, etc.).
Week 7–9 · Closed-loop deployment
- Wrap the estimator into the FR3 control loop. Feed its output to libfranka’s Cartesian impedance as the external-force estimate (replacing the default).
- Re-run the W1–2 benchmark. The headline metric: does force-RMSE on the original benchmark improve, and by how much?
- Add a watchdog: if estimator covariance > threshold, revert to the default behavior. Confirm the revert path on a deliberately-out-of-distribution test (extreme phantom stiffness).
- Deliverable: a comparison table — default controller vs. learned-estimator controller — across all benchmark configurations, with the SymbioKinetics safety lead signing off on the watchdog behavior.
Week 10–11 · First clinically-shaped task
- Pick one of: autonomous probe-pressure regulation during a thyroid scan, or sustained-pressure massage along a forearm phantom. Choose with the team based on regulatory priority.
- Collect 20 hours of teleop demos from a domain expert (a sonographer or PT).
- Fine-tune a residual policy[16][17] on top of the W7–9 controller.
- Deliverable: first end-to-end task demo on phantom, with a video and a force trace overlay.
Week 12 · Hardening
- Failure-mode rehearsal: deliberately introduce sensor dropouts, noisy currents, F/T calibration drift; confirm graceful degradation.
- Write up the 90-day report: what worked, what didn’t, where the technical debt sits, what hardware (if any) would buy the next 6 weeks of progress.
- Deliverable: a one-pager that the founders can hand to investors or to a regulatory consultant.
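The failure-mode rehearsal can start offline by corrupting recorded streams before they reach the estimator. A minimal sketch, assuming a scalar stream; `inject_faults` and its parameters are hypothetical, and a dropped sample is represented as `None`.

```python
import random

def inject_faults(stream, rng, dropout_prob=0.0, noise_std=0.0,
                  drift_per_sample=0.0):
    """Return a copy of `stream` with dropouts, Gaussian noise, and linear drift.

    Dropped samples become None; surviving samples get zero-mean noise plus
    a drift term that grows linearly with the sample index.
    """
    out = []
    for k, x in enumerate(stream):
        if rng.random() < dropout_prob:
            out.append(None)  # simulated sensor dropout
            continue
        out.append(x + rng.gauss(0.0, noise_std) + drift_per_sample * k)
    return out

rng = random.Random(0)
clean = [1.0, 1.0, 1.0]
drifted = inject_faults(clean, rng, drift_per_sample=0.1)
dropped = inject_faults(clean, rng, dropout_prob=1.0)
```

Replaying corrupted logs through the estimator and watchdog gives a first pass at graceful degradation before any fault is introduced on the physical robot.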
What this plan deliberately does not commit to
- No human subjects. The 90-day plan is phantoms only.
- No new mechanical hardware (no custom wrist, no new sensors beyond what’s already on the FR3 plus an ATI Nano17).
- No claim to FDA-clearance work. §12 sketches what that path looks like; it’s a separate program of work that runs in parallel.
The bet is that one engineer can get from “default FR3” to “demonstrably better closed-loop force tracking on a clinically-shaped task” in a quarter. If that’s true, the next quarter is about generalization across body sites; if it’s not, the 90-day report will say so clearly enough for the team to redirect.
§11 · Risks and failure modes
A pitch isn’t honest without a risk register. Below are the six places I’d expect the architecture in §8 to break, in rough order of how likely they are to bite.
11.1 Sim-to-real gap on contact dynamics
The risk. Whatever model we train in MuJoCo will be wrong about contact in ways that matter. MuJoCo’s soft-constraint contact behaves roughly like a linear spring-damper (closer to Kelvin-Voigt than to Hunt-Crossley, whose damping term is nonlinear), and friction is Coulomb-type; neither captures tissue’s nonlinear, viscoelastic, and rate-dependent behavior. A policy that looks excellent in sim degrades on real phantoms; one that looks excellent on phantoms degrades on actual patients.
Mitigation. Heavy domain randomization on the contact parameters during sim training, and treat sim as a pre-training tool, not a deployment target. Real phantom data is the only thing that validates the system. Pretrained encoders from DROID[21] help with the non-contact generalization (kinematics, vision), not the contact part. Confidence-weighted residuals: if the model’s uncertainty is high, the residual policy has bounded authority and the base impedance controller takes over.
11.2 Joint torque sensor calibration drift
The risk. FR3’s joint torque sensors have known thermal drift; they can drift several Newton-meters over a day of operation. A static gravity compensation calibration at the start of a session will not be valid at hour 4. If the estimator is trained on data that included drift but the deployment doesn’t compensate for it, force estimates will be systematically biased.
Mitigation. Periodic in-session recalibration (the operator parks the arm in a known pose every N scans; the controller re-zeros). Plus: train the estimator to be invariant to a constant additive bias on the joint torques, by data augmentation — randomly add a static bias to torque streams during training.
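The bias augmentation is simple enough to sketch directly. This assumes a training window is a list of per-joint torque samples; `add_static_bias` and the 1 N·m bound are hypothetical placeholders rather than a measured drift figure.

```python
import random

def add_static_bias(torque_window, rng, max_bias_nm=1.0):
    """Add one constant per-joint bias to every sample in the window.

    The same bias is applied at every timestep, mimicking a slowly drifting
    sensor offset that is effectively constant over one training window.
    """
    n_joints = len(torque_window[0])
    bias = [rng.uniform(-max_bias_nm, max_bias_nm) for _ in range(n_joints)]
    return [[tau + b for tau, b in zip(sample, bias)] for sample in torque_window]

rng = random.Random(0)
window = [[0.0, 1.0], [0.5, 1.5], [1.0, 2.0]]  # 3 timesteps, 2 joints
augmented = add_static_bias(window, rng)
```

Because the offset is constant within a window, sample-to-sample torque changes are untouched, so the network is pushed toward the dynamics of the signal rather than its absolute level.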
11.3 The watchdog you cannot fully validate
The risk. The architecture relies on a watchdog that disables the residual policy when its uncertainty is high. Uncertainty estimation in deep networks is hard. Well-calibrated uncertainty on in-distribution data is not the same as well-calibrated uncertainty out-of-distribution, and the out-of-distribution case is exactly the case the watchdog needs to handle.
Mitigation. Multi-modal uncertainty: ensemble disagreement, plus a separate model-based residual check (compare the estimator’s output against a Kalman filter over the joint torques alone — when they diverge by more than a fixed threshold, disable). Plus: a physical envelope check (estimated force outside [-50, 50] N → immediate disable, no questions). Plus: human-in-the-loop emergency stop.
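The layered checks compose into one gate. In this sketch the ±50 N envelope comes from the text above, while the disagreement and divergence thresholds are placeholder assumptions. `model_based_force` stands in for the output of the Kalman filter over joint torques, which is not implemented here.

```python
import statistics

def watchdog_allows_residual(ensemble_forces, model_based_force,
                             max_disagreement_n=0.5, max_divergence_n=2.0,
                             envelope_n=50.0):
    """Return True only if every independent check passes.

    ensemble_forces: force estimates (N) from each ensemble member.
    model_based_force: force (N) from an independent model-based observer.
    """
    mean_force = statistics.fmean(ensemble_forces)
    if abs(mean_force) > envelope_n:
        return False  # physical envelope violated
    if statistics.pstdev(ensemble_forces) > max_disagreement_n:
        return False  # ensemble members disagree
    if abs(mean_force - model_based_force) > max_divergence_n:
        return False  # learned and model-based estimates diverge
    return True

ok = watchdog_allows_residual([1.0, 1.1, 0.9], 1.2)
out_of_envelope = watchdog_allows_residual([60.0, 61.0, 59.0], 60.0)
disagreeing = watchdog_allows_residual([1.0, 3.0, -1.0], 1.0)
diverged = watchdog_allows_residual([1.0, 1.0, 1.0], 5.0)
```

Any single failed check disables the residual, so one miscalibrated uncertainty signal cannot keep the learned path enabled on its own.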
11.4 Regulatory pathway risk
The risk. FR3 is sold as a research robot and is not FDA-cleared as a component of a medical device. Anything we ship clinically with FR3 has to either include FR3 in the regulatory submission (which is a long road) or replace it with a medical-grade equivalent. The latter changes the system identification, the sensor characteristics, and possibly the controller — invalidating much of the training data.
Mitigation. Start the conversation with a regulatory consultant in month 2, not month 18. Choose any custom hardware (probe holder, wrist F/T mounting) so that it’s transferable across base platforms. Keep the estimator’s input space platform-agnostic — wrist-frame Cartesian, not joint-space — so the same model can be re-trained against a different base. Be honest with investors that the engineering work is decoupled from the regulatory work and that both are long timelines.
11.5 Data scarcity for rare clinical presentations
The risk. Common scans will dominate the training distribution. Edge cases — patients with unusual anatomy, post-surgical scarring, obese body habitus, tremor — will be under-represented. The estimator will work well on the median patient and badly on the tail.
Mitigation. Active data collection on under-represented presentations, with intentional sampling targets in the field-collection rig. Documented coverage maps for the regulatory submission. No silent failure on under-represented cases — the watchdog should fire and the controller should revert.
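A coverage map can begin as a count of sessions per presentation against explicit sampling targets. The tags and targets below are hypothetical examples, not a proposed clinical taxonomy.

```python
from collections import Counter

def coverage_gaps(session_tags, targets):
    """Return how many sessions each under-target presentation still needs."""
    counts = Counter(session_tags)
    return {tag: need - counts[tag]
            for tag, need in targets.items() if counts[tag] < need}

sessions = ["median", "median", "median", "post_surgical_scar", "tremor"]
targets = {"median": 3, "post_surgical_scar": 2, "tremor": 2, "high_bmi": 2}
gaps = coverage_gaps(sessions, targets)
# gaps == {"post_surgical_scar": 1, "tremor": 1, "high_bmi": 2}
```

The same table doubles as the documented coverage map: presentations still listed in `gaps` are the ones where the watchdog is expected to fire.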
11.6 Latency budget tightness
The risk. A 1 kHz cycle has a 1 ms budget. A small Neural CDE + the standard impedance computation + the watchdog logic plus FCI roundtrip might not fit, particularly on the embedded compute available on the robot’s onboard PC.
Mitigation. Profile aggressively from day one. ONNX export, quantization, and (if necessary) move inference to an external real-time PC connected over EtherCAT. The architecture is explicit about this — the estimator is allowed to be on a different physical box than the controller, as long as the link latency is bounded.
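Profiling from day one can be as simple as a harness that times one estimator-plus-controller step and reports tail latency against the 1 ms budget. This is a sketch only: `step` is a stand-in callable, and Python timing on a non-real-time OS will not match the deployed C++ loop.

```python
import time

def latency_percentiles_us(step, n_iters=1000):
    """Time `step()` n_iters times; return p50, p99, and max in microseconds."""
    samples = []
    for _ in range(n_iters):
        t0 = time.perf_counter_ns()
        step()
        samples.append((time.perf_counter_ns() - t0) / 1000.0)
    samples.sort()
    return {
        "p50": samples[n_iters // 2],
        "p99": samples[int(n_iters * 0.99)],
        "max": samples[-1],
    }

def fits_budget(stats, budget_us=1000.0):
    """A 1 kHz loop must finish every cycle, so the worst case is what counts."""
    return stats["max"] <= budget_us

# Stand-in workload; replace with the real inference + control step.
stats = latency_percentiles_us(lambda: sum(range(100)), n_iters=200)
```

Checking the maximum rather than the mean reflects the hard deadline: a single overrun in a 1 kHz loop is a missed control cycle.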
A general principle for risk management
Each of these risks is known and has documented mitigations. None of them are show-stoppers; all of them require deliberate engineering attention. The single biggest risk-management move is honest reporting of negative results during development — phantoms where the system underperforms, scenarios where the watchdog disables, edge cases where the residual policy adds nothing. A culture of “show the failures, then show the fix” produces better systems and stronger regulatory submissions than one that only shows the wins.
§12 · Safety and regulatory landscape
A note on the regulatory envelope, since SymbioKinetics’ product crosses into medical-device territory. None of the below is a substitute for an actual regulatory consultant; it’s the engineer’s-eye view of what shapes design decisions.
12.1 The standards that apply
| Standard | Scope | Why it matters here |
|---|---|---|
| ISO/TS 15066[32] | Collaborative robot power & force limits; biomechanical (pain-onset) limits for 29 body areas. | Sets quasi-static and transient force ceilings on contact with each body region. The 1 N regulation target sits comfortably inside these ceilings; what matters is keeping worst-case transient force within the transient limit during edge cases. |
| ISO 13482[33] | Safety requirements for personal-care robots. | Relevant for the PT/rehab use case; less so for diagnostic US (which is typically operator-supervised). |
| IEC 60601-1[34] | General safety for medical electrical equipment. | The umbrella standard for any electrically-powered medical device. Touches grounding, leakage current, single-fault behavior. Required for any device used in patient contact. |
| IEC 62366-1[35] | Usability engineering for medical devices. | Invoked through the 60601-1 usability collateral standard (IEC 60601-1-6). Forces the design team to document use scenarios, foreseeable misuse, and human-factors testing. |
| IEC 62304[36] | Medical device software lifecycle. | Defines software safety classes (A, B, C) and the documentation each requires. A learned estimator that influences contact force will almost certainly land in Class B or C, with corresponding documentation overhead. |
| FDA 510(k)[37] | Premarket notification, substantial-equivalence pathway. | The most likely commercial path for an autonomous US scanner. Substantial equivalence to RIVANNA Accuro XV (FDA-cleared 2024 for robotic musculoskeletal US) is a plausible argument. |
12.2 What this means for the architecture
Three design choices in §8 are driven directly by regulatory considerations:
- The base controller is unchanged. libfranka’s Cartesian impedance is a documented, well-characterized control loop. Putting a learned residual on top — bounded, watchdogged, with a documented disable path — is much easier to defend than replacing the controller wholesale.
- The estimator outputs are bounded and monitored. Out-of-distribution detection is not optional; it’s a 62304 documentation deliverable. Every release of the estimator ships with a coverage map of the training distribution and a documented behavior on inputs outside that map.
- Hardware changes are minimized. Every new sensor (wrist F/T, FPGA tap, etc.) is a regulatory variable. The architecture is deliberately conservative on hardware to keep the regulatory delta small.
12.3 What this document doesn’t address
This is a research / engineering document, not a regulatory strategy. I have not addressed: predicate device analysis for 510(k); risk management per ISO 14971; clinical evaluation plans; QMS (ISO 13485); cybersecurity per FDA’s premarket cybersecurity guidance. All of those are necessary work; none of them are work that an engineer should be deciding in isolation. The right time to engage a regulatory consultant is now, before architecture decisions accumulate that would be expensive to reverse.
References
- Neville Hogan (1985). Impedance Control: An Approach to Manipulation (Parts I, II, III). ASME Journal of Dynamic Systems, Measurement, and Control 107(1).
- Marc H. Raibert & John J. Craig (1981). Hybrid Position/Force Control of Manipulators. ASME Journal of Dynamic Systems, Measurement, and Control 103(2).
- Oussama Khatib (1987). A Unified Approach for Motion and Force Control of Robot Manipulators: The Operational Space Formulation. IEEE Journal of Robotics and Automation 3(1).
- Christian Ott (2008). Cartesian Impedance Control of Redundant and Flexible-Joint Robots. Springer Tracts in Advanced Robotics 49.
- Alin Albu-Schäffer et al. (2007). The DLR Lightweight Robot: Design and Control Concepts for Robots in Human Environments. Industrial Robot 34(5).
- Gill A. Pratt & Matthew M. Williamson (1995). Series Elastic Actuators. IEEE/RSJ IROS.
- Bram Vanderborght, Alin Albu-Schäffer, Antonio Bicchi et al. (2013). Variable Impedance Actuators: A Review. Robotics and Autonomous Systems 61(12). DOI: 10.1016/j.robot.2013.06.009.
- Andrea Calanca, Riccardo Muradore & Paolo Fiorini (2016). A Review of Algorithms for Compliant Control of Stiff and Fixed-Compliance Robots. IEEE/ASME Transactions on Mechatronics 21(2).
- Bruno Siciliano & Luigi Villani (1999). Robot Force Control. Kluwer Academic.
- Mark W. Spong (1987). Modeling and Control of Elastic Joint Robots. ASME Journal of Dynamic Systems, Measurement, and Control 109(4).
- Emre Sariyildiz & Kouhei Ohnishi (2020). Disturbance Observer-Based Robust Control and Its Applications: 35th Anniversary Overview. IEEE Transactions on Industrial Electronics 67(3).
- Claude E. Shannon (1949). Communication in the Presence of Noise. Proceedings of the IRE 37(1).
- Cheng Chi, Siyuan Feng, Yilun Du, Zhenjia Xu, Eric Cousineau, Benjamin Burchfiel & Shuran Song (2023). Diffusion Policy: Visuomotor Policy Learning via Action Diffusion. Robotics: Science and Systems (RSS).
- Han Xue et al. (2025). Reactive Diffusion Policy: Slow-Fast Visual-Tactile Policy Learning for Contact-Rich Manipulation.
- Pete Florence, Corey Lynch, Andy Zeng et al. (2021). Implicit Behavioral Cloning. Conference on Robot Learning (CoRL).
- Tom Silver, Kelsey Allen, Josh Tenenbaum & Leslie Kaelbling (2018). Residual Policy Learning.
- Tobias Johannink, Shikhar Bahl, Ashvin Nair et al. (2019). Residual Reinforcement Learning for Robot Control. ICRA 2019.
- Zipeng Fu, Tony Z. Zhao & Chelsea Finn (2024). Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation. Conference on Robot Learning (CoRL).
- Cheng Chi, Zhenjia Xu, Chuer Pan, Eric Cousineau, Benjamin Burchfiel, Siyuan Feng, Russ Tedrake & Shuran Song (2024). Universal Manipulation Interface: In-the-Wild Robot Teaching Without In-the-Wild Robots. Robotics: Science and Systems (RSS).
- Octo Model Team (2024). Octo: An Open-Source Generalist Robot Policy.
- Alexander Khazatsky et al. (DROID Consortium) (2024). DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset.
- Google DeepMind (2023). RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control.
- Patrick Kidger, James Morrill, James Foster & Terry Lyons (2020). Neural Controlled Differential Equations for Irregular Time Series. NeurIPS.
- Sumeet Singh et al. (2022). Multiscale Sensor Fusion and Continuous Control with Neural CDEs.
- Duy Nguyen-Tuong & Jan Peters (2011). Model Learning for Robot Control: A Survey. Cognitive Processing 12(4).
- Michelle A. Lee, Yuke Zhu, Krishnan Srinivasan et al. (2019). Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal Representations for Contact-Rich Tasks. ICRA.
- Jonas Buchli, Freek Stulp, Evangelos Theodorou & Stefan Schaal (2011). Learning Variable Impedance Control. International Journal of Robotics Research 30(7).
- Christoph Hennersperger, Bernhard Fuerst, Salvatore Virga et al. (2017). Towards MRI-Based Autonomous Robotic US Acquisitions: A First Feasibility Study. IEEE Transactions on Medical Imaging 36(2).
- Franka Robotics (2022). Franka Research 3 Datasheet (v1.1).
- Franka Robotics (2024). libfranka — Franka Control Interface (FCI) Documentation.
- ISO/TC 299 (2016). ISO/TS 15066:2016 — Robots and Robotic Devices — Collaborative Robots (Power & Force Limiting).
- ISO/TC 299 (2014). ISO 13482:2014 — Robots and Robotic Devices — Safety Requirements for Personal Care Robots.
- IEC (2020). IEC 60601-1 — Medical Electrical Equipment — General Requirements for Basic Safety and Essential Performance.
- IEC (2015). IEC 62366-1 — Application of Usability Engineering to Medical Devices.
- IEC (2006). IEC 62304 — Medical Device Software — Software Life Cycle Processes.
- U.S. Food and Drug Administration (2023). Marketing Clearance of Diagnostic Ultrasound Systems and Transducers — Guidance for Industry.
- Hermano I. Krebs et al. (2003). Rehabilitation Robotics: Performance-Based Progressive Robot-Assisted Therapy. Autonomous Robots 15(1).
- Lukas Jaeger, Robert Riener et al. (2009). Patient-Cooperative Strategies for Robot-Aided Treadmill Training: First Experimental Results. IEEE TNSRE.
- Tobias Nef, Marco Guidali & Robert Riener (2009). ARMin III — Arm Therapy Exoskeleton with an Ergonomic Shoulder Actuation. Applied Bionics and Biomechanics 6(2).
- Kouhei Ohnishi, Masaaki Shibata & Toshiyuki Murakami (1996). Motion Control for Advanced Mechatronics. IEEE/ASME Transactions on Mechatronics 1(1).
About
Robotics & controls engineer. I work on autonomous systems, perception-action loops, and the messy gap between simulated control and real contact.
- email: curious.antimony@gmail.com
- linkedin: /in/shivamcurious
- github: @Shivam-Bhardwaj
- site: too.foo
This document is an interview follow-up in the form of unsolicited research notes. It represents my independent reading of the public literature and is not endorsed by SymbioKinetics, Franka Robotics, or any of the cited authors. Where I have made specific architectural claims (latency budgets, sensor bandwidths, fallback policies), they reflect what I would build given the constraints I understand today; corrections welcome.
Code for this site is at github.com/Shivam-Bhardwaj. Built with Astro 6, Tailwind 4, and a small amount of vanilla TypeScript. The widgets do real physics (mass-spring-damper integration, time-domain sampling) — they are not animations.