Chapter 0.3: Sensor Systems for Humanoid Robots

Humanoid robots need "senses" to perceive the world—just like humans have eyes, ears, and balance systems. This chapter covers the sensor systems that enable humanoid robots to see, hear, feel, and maintain balance. You'll learn about cameras, LiDAR, IMUs, force/torque sensors, and how to simulate them in PyBullet.

Learning Outcomes

By the end of this chapter, you will be able to:

Understand different sensor types used in humanoid robots
Explain how vision sensors (cameras, depth cameras) work
Describe LiDAR for 3D mapping and obstacle detection
Understand IMUs for balance and orientation tracking
Implement basic sensor simulation in PyBullet
Appreciate sensor fusion for robust perception

Prerequisites

Chapters 0.1 to 0.2 completed (Physical AI concepts, humanoid landscape)
Basic Python knowledge
PyBullet installed (from Chapter 0.1)

Part 1: Sensor Overview

Why Sensors Matter

Without sensors, a robot is blind and deaf:

Can't see obstacles → will collide
Can't hear commands → can't respond
Can't feel objects → can't grasp properly
Can't sense balance → will fall over

With sensors, a robot can:

Perceive the environment
Navigate safely
Manipulate objects
Interact with humans
Maintain balance

Sensor Categories

Category	Purpose	Examples
Vision	See the world	RGB cameras, depth cameras, stereo vision
Range	Measure distances	LiDAR, ultrasonic sensors, depth cameras
Motion	Track movement	IMU, encoders, gyroscopes
Force/Torque	Feel forces	Force sensors, tactile sensors
Audio	Hear sounds	Microphones, microphone arrays

Humanoid robots typically use all of these for robust perception.

Part 2: Vision Sensors

RGB Cameras

What they do: Capture color images (like human eyes)

How they work:

Lens: Focuses light onto sensor
Sensor: Converts light to electrical signals
Processor: Converts signals to digital image (RGB pixels)

Specifications:

Resolution: 640x480 to 4K+ (more pixels = more detail)
Frame rate: 30 to 60 FPS (frames per second)
Field of view: 60 to 120° (how wide the view)

Use cases:

Object detection and recognition
Human detection and tracking
Visual navigation (following paths, reading signs)
Gesture recognition

Limitations:

Lighting: Poor performance in dark or bright conditions
Occlusion: Can't see behind objects
2D only: No depth information (unless stereo)

Depth Cameras

What they do: Measure distance to objects (like human depth perception)

How they work:

Structured light: Projects pattern, measures distortion (Intel RealSense)
Time-of-flight: Measures time for light to bounce back
Stereo vision: Two cameras, triangulate distance

Output: Depth map (each pixel = distance in meters)

Use cases:

3D mapping and SLAM
Obstacle avoidance
Object manipulation (knowing how far to reach)
Human pose estimation

Specifications:

Range: 0.5m to 10m+ (depending on technology)
Accuracy: ±1 to 5cm (varies with distance)
Resolution: 640x480 to 1080p

Limitations:

Reflective surfaces: Poor performance (glass, mirrors)
Outdoor lighting: Struggles in bright sunlight
Range limits: Can't see beyond maximum range

Stereo Vision

What they do: Use two cameras to compute depth (like human eyes)

How it works:

Two cameras capture images from slightly different positions
Find matching points in both images
Compute depth using triangulation

Advantages:

Passive: No special lighting needed
Works outdoors: Better than structured light in sunlight
Color + depth: RGB images with depth

Disadvantages:

Texture required: Needs features to match (fails on blank walls)
Computational: More processing than single camera
Calibration: Cameras must be precisely aligned

Part 3: LiDAR (Light Detection and Ranging)

What is LiDAR?

LiDAR uses lasers to measure distances in 360° around the robot.

How it works:

Laser emitter: Sends out laser pulses
Reflection: Light bounces off objects
Receiver: Detects reflected light
Time measurement: Calculate distance from time-of-flight
Rotation: Spins to scan 360°

Output: Point cloud (3D points representing surfaces)

LiDAR Specifications

Range: 10m to 200m+ (depending on model) Angular resolution: 0.1° to 1° (how fine the scan) Scan rate: 10 to 40 Hz (scans per second) Points per scan: 1,000 to 1,000,000+ points

Use Cases

Mapping:

Create 3D maps of environments
SLAM (Simultaneous Localization and Mapping)

Navigation:

Obstacle detection and avoidance
Path planning in unknown environments

Object Detection:

Identify objects by shape
Track moving objects

Limitations

Cost: Expensive ($1,000-$100,000+) Size: Large and heavy (challenging for humanoids) Power: High power consumption Outdoors: Can struggle in rain, fog, bright sunlight

Note: Some humanoids use solid-state LiDAR (smaller, cheaper, but limited field of view).

Part 4: IMU (Inertial Measurement Unit)

What is an IMU?

IMU measures motion and orientation (like human inner ear/balance system).

Components:

Accelerometer: Measures linear acceleration (gravity + motion)
Gyroscope: Measures angular velocity (rotation)
Magnetometer: Measures magnetic field (compass, optional)

How IMUs Work

Accelerometer:

Uses tiny masses that move when accelerated
Measures force on masses → acceleration
Gravity: Always measures 9.81 m/s² downward (when stationary)

Gyroscope:

Uses spinning masses or MEMS (Micro-Electro-Mechanical Systems)
Measures rotation rate (degrees/second)
Integration: Can compute orientation over time

Magnetometer:

Measures Earth's magnetic field
Provides absolute heading (north direction)
Calibration: Needs calibration for accuracy

Use Cases

Balance Control:

Detect tilt and rotation
Maintain upright posture
Prevent falls

Navigation:

Dead reckoning (estimate position from motion)
Orientation tracking
Step counting (for walking robots)

Motion Tracking:

Measure movement during manipulation
Detect impacts or collisions
Monitor robot health

Limitations

Drift: Orientation estimate drifts over time (gyroscope integration error) Noise: Sensor measurements are noisy Calibration: Requires calibration for accuracy Magnetic interference: Magnetometer affected by metal objects

Solution: Sensor fusion (combine IMU with other sensors like cameras).

Part 5: Force/Torque Sensors

What are Force/Torque Sensors?

Force sensors measure forces applied to the robot (like human touch).

Types:

Force sensors: Measure linear forces (push/pull)
Torque sensors: Measure rotational forces (twist)
Tactile sensors: Measure contact pressure distribution

Use Cases

Manipulation:

Grasping: Know how hard to grip (don't crush objects)
Contact detection: Know when touching objects
Force control: Apply specific forces

Balance:

Foot contact: Detect when foot touches ground
Weight distribution: Measure forces on each foot
Stability: Detect if robot is tipping

Safety:

Collision detection: Detect unexpected contact
Force limits: Stop if force exceeds threshold
Human interaction: Gentle touch detection

Specifications

Range: 0.1N to 10,000N+ (depending on application) Accuracy: ±1 to 5% of full scale Update rate: 1 to 10 kHz (very fast)

Placement:

Wrists: Measure manipulation forces
Ankles: Measure walking forces
Fingers: Measure grasping forces

Part 6: Sensor Fusion

Why Combine Sensors?

Single sensor limitations:

Camera: No depth (unless stereo/depth camera)
LiDAR: No color information
IMU: Drifts over time
Force sensor: Only measures contact

Sensor fusion combines multiple sensors for:

Robustness: If one sensor fails, others compensate
Accuracy: Multiple measurements improve accuracy
Completeness: Get both color and depth, position and orientation

Common Fusion Strategies

Vision + LiDAR:

Vision: Color, texture, object recognition
LiDAR: Accurate 3D structure
Combined: Colored 3D point cloud

IMU + Vision:

IMU: Fast motion tracking (high frequency)
Vision: Accurate position (low frequency, no drift)
Combined: Smooth, accurate motion estimate

Force + Vision:

Vision: See object before contact
Force: Feel object during contact
Combined: Robust manipulation

Part 7: Hands-On Tutorial

Project: Simulate Sensors in PyBullet

Goal: Add sensors to a robot simulation and visualize sensor data.

Tools: PyBullet, Python 3.10+, NumPy, OpenCV (optional)

Step 1: Camera Simulation

Create file: camera_sensor.py

#!/usr/bin/env python3
"""
Camera sensor simulation in PyBullet
Demonstrates RGB and depth camera rendering
"""
import pybullet as p
import pybullet_data
import numpy as np
import cv2
import time

# Connect to physics server
physicsClient = p.connect(p.GUI)
p.setAdditionalSearchPath(pybullet_data.getDataPath())

# Set gravity
p.setGravity(0, 0, -9.81)

# Load ground
planeId = p.loadURDF("plane.urdf")

# Create a simple scene with objects
boxId = p.loadURDF("cube.urdf", [2, 0, 0.5], [0, 0, 0, 1])
sphereId = p.loadURDF("sphere.urdf", [-2, 0, 0.5], [0, 0, 0, 1])

# Camera parameters
camera_pos = [0, 0, 2]  # Camera position
camera_target = [0, 0, 0]  # Where camera looks
camera_up = [0, 1, 0]  # Up direction

# Camera view matrix
view_matrix = p.computeViewMatrix(
    cameraEyePosition=camera_pos,
    cameraTargetPosition=camera_target,
    cameraUpVector=camera_up
)

# Camera projection matrix
aspect = 640.0 / 480.0  # Width/height
fov = 60  # Field of view (degrees)
near = 0.01  # Near plane
far = 10  # Far plane

projection_matrix = p.computeProjectionMatrixFOV(
    fov, aspect, near, far
)

print("Camera sensor simulation started!")
print("Capturing RGB and depth images...")

# Capture images
for i in range(100):
    p.stepSimulation()
    
    # Render camera view
    width, height, rgbImg, depthImg, segImg = p.getCameraImage(
        width=640,
        height=480,
        viewMatrix=view_matrix,
        projectionMatrix=projection_matrix
    )
    
    # Convert RGB image (4 channels: RGBA)
    rgb_array = np.array(rgbImg)
    rgb_array = rgb_array[:, :, :3]  # Remove alpha channel
    rgb_array = cv2.cvtColor(rgb_array, cv2.COLOR_RGB2BGR)  # OpenCV uses BGR
    
    # Convert depth image
    depth_array = np.array(depthImg)
    
    # Normalize depth for visualization (0 to 255)
    depth_normalized = (depth_array * 255 / far).astype(np.uint8)
    depth_colormap = cv2.applyColorMap(depth_normalized, cv2.COLOR_MAP_INFERNO)
    
    # Display images (every 10 steps)
    if i % 10 == 0:
        cv2.imshow("RGB Camera", rgb_array)
        cv2.imshow("Depth Camera", depth_colormap)
        cv2.waitKey(1)
        
        print(f"Step {i}: Captured RGB ({rgb_array.shape}) and depth ({depth_array.shape})")
    
    time.sleep(1./240.)

cv2.destroyAllWindows()
p.disconnect()
print("Camera simulation complete!")

Run simulation:

python3 camera_sensor.py

Expected Output:

Two windows showing RGB and depth camera views
RGB shows colored objects
Depth shows distance (closer = brighter)

Step 2: IMU Simulation

Create file: imu_sensor.py

#!/usr/bin/env python3
"""
IMU sensor simulation in PyBullet
Demonstrates accelerometer and gyroscope measurements
"""
import pybullet as p
import pybullet_data
import numpy as np
import time

# Connect to physics server
physicsClient = p.connect(p.GUI)
p.setAdditionalSearchPath(pybullet_data.getDataPath())

# Set gravity
p.setGravity(0, 0, -9.81)

# Load ground
planeId = p.loadURDF("plane.urdf")

# Create a box that will tilt
boxId = p.loadURDF("cube.urdf", [0, 0, 1], [0, 0, 0, 1])

# Simulate IMU on the box
print("IMU sensor simulation started!")
print("Measuring acceleration and angular velocity...")

previous_linear_velocity = np.array([0, 0, 0])
previous_angular_velocity = np.array([0, 0, 0])

for i in range(1000):
    p.stepSimulation()
    
    # Get box state
    pos, orient = p.getBasePositionAndOrientation(boxId)
    linear_velocity, angular_velocity = p.getBaseVelocity(boxId)
    
    # Convert to numpy arrays
    linear_velocity = np.array(linear_velocity)
    angular_velocity = np.array(angular_velocity)
    
    # Simulate accelerometer (measures linear acceleration + gravity)
    # In real IMU: acceleration = linear_acceleration + gravity (in sensor frame)
    # For simplicity, we'll compute linear acceleration from velocity change
    dt = 1./240.  # Simulation timestep
    linear_acceleration = (linear_velocity - previous_linear_velocity) / dt
    
    # Add gravity (in world frame, pointing down in Z)
    gravity_world = np.array([0, 0, -9.81])
    
    # Transform gravity to box frame (simplified - would use rotation matrix)
    # For now, assume box is upright (gravity = [0, 0, -9.81] in box frame)
    # In reality, need to rotate gravity vector by box orientation
    accelerometer_reading = linear_acceleration + gravity_world
    
    # Gyroscope reading (angular velocity in rad/s)
    gyroscope_reading = angular_velocity
    
    # Print readings every 50 steps
    if i % 50 == 0:
        print(f"\nStep {i}:")
        print(f"  Position: ({pos[0]:.2f}, {pos[1]:.2f}, {pos[2]:.2f})")
        print(f"  Accelerometer: ({accelerometer_reading[0]:.2f}, {accelerometer_reading[1]:.2f}, {accelerometer_reading[2]:.2f}) m/s²")
        print(f"  Gyroscope: ({gyroscope_reading[0]:.2f}, {gyroscope_reading[1]:.2f}, {gyroscope_reading[2]:.2f}) rad/s")
    
    previous_linear_velocity = linear_velocity
    previous_angular_velocity = angular_velocity
    
    time.sleep(1./240.)

p.disconnect()
print("IMU simulation complete!")

Run simulation:

python3 imu_sensor.py

Expected Output:

Box falls and tilts
Accelerometer readings show gravity + motion
Gyroscope readings show rotation

Step 3: LiDAR Simulation

Create file: lidar_sensor.py

#!/usr/bin/env python3
"""
LiDAR sensor simulation in PyBullet
Demonstrates ray casting for distance measurement
"""
import pybullet as p
import pybullet_data
import numpy as np
import matplotlib.pyplot as plt
import time

# Connect to physics server
physicsClient = p.connect(p.GUI)
p.setAdditionalSearchPath(pybullet_data.getDataPath())

# Set gravity
p.setGravity(0, 0, -9.81)

# Load ground
planeId = p.loadURDF("plane.urdf")

# Create objects to detect
boxId = p.loadURDF("cube.urdf", [2, 0, 0.5], [0, 0, 0, 1])
sphereId = p.loadURDF("sphere.urdf", [-2, 0, 0.5], [0, 0, 0, 1])

# LiDAR parameters
lidar_pos = [0, 0, 1]  # LiDAR position
num_rays = 360  # Number of rays (360° scan)
max_range = 10  # Maximum range (meters)
ray_length = max_range

print("LiDAR sensor simulation started!")
print(f"Scanning {num_rays} rays in 360°...")

# Perform LiDAR scan
ray_from = []
ray_to = []
ray_results = []

for i in range(num_rays):
    angle = 2 * np.pi * i / num_rays  # Angle in radians
    
    # Compute ray direction
    ray_dir = [np.cos(angle), np.sin(angle), 0]  # Horizontal scan
    ray_end = [
        lidar_pos[0] + ray_length * ray_dir[0],
        lidar_pos[1] + ray_length * ray_dir[1],
        lidar_pos[2] + ray_length * ray_dir[2]
    ]
    
    ray_from.append(lidar_pos)
    ray_to.append(ray_end)

# Cast rays
ray_results = p.rayTestBatch(ray_from, ray_to)

# Extract distances
distances = []
for result in ray_results:
    if result[0] != -1:  # Hit something
        hit_pos = result[3]  # Hit position
        distance = np.linalg.norm(np.array(hit_pos) - np.array(lidar_pos))
        distances.append(distance)
    else:  # No hit (max range)
        distances.append(max_range)

# Visualize LiDAR scan
angles = np.linspace(0, 2*np.pi, num_rays)
plt.figure(figsize=(10, 10))
plt.polar(angles, distances)
plt.title("LiDAR Scan (360°)")
plt.xlabel("Distance (m)")
plt.show()

print(f"LiDAR scan complete!")
print(f"Detected {sum(1 for d in distances if d < max_range)} hits")
print(f"Average distance: {np.mean([d for d in distances if d < max_range]):.2f} m")

p.disconnect()
print("LiDAR simulation complete!")

Run simulation:

python3 lidar_sensor.py

Expected Output:

Polar plot showing 360° LiDAR scan
Detects objects at different distances
Visual representation of environment

Step 4: Sensor Fusion Example

Create file: sensor_fusion.py

#!/usr/bin/env python3
"""
Simple sensor fusion example
Combines camera and IMU for better state estimation
"""
import pybullet as p
import pybullet_data
import numpy as np
import time

# Connect to physics server
physicsClient = p.connect(p.GUI)
p.setAdditionalSearchPath(pybullet_data.getDataPath())

p.setGravity(0, 0, -9.81)
planeId = p.loadURDF("plane.urdf")
boxId = p.loadURDF("cube.urdf", [0, 0, 1], [0, 0, 0, 1])

print("Sensor fusion simulation started!")
print("Combining camera and IMU measurements...")

# Camera (low frequency, accurate position)
camera_update_rate = 10  # Hz
camera_last_update = 0

# IMU (high frequency, drifts)
imu_update_rate = 240  # Hz

# State estimate (fused)
estimated_pos = np.array([0, 0, 1])
estimated_vel = np.array([0, 0, 0])

for i in range(1000):
    p.stepSimulation()
    t = i / 240.0  # Time in seconds
    
    # Get true state (for comparison)
    true_pos, _ = p.getBasePositionAndOrientation(boxId)
    true_vel, _ = p.getBaseVelocity(boxId)
    true_pos = np.array(true_pos)
    true_vel = np.array(true_vel)
    
    # IMU measurement (high frequency)
    if i % (240 // imu_update_rate) == 0:
        # Update velocity estimate from IMU
        # (In reality, would integrate accelerometer)
        estimated_vel = true_vel + np.random.normal(0, 0.01, 3)  # Add noise
    
    # Camera measurement (low frequency, accurate)
    if i % (240 // camera_update_rate) == 0:
        # Reset position estimate from camera
        estimated_pos = true_pos + np.random.normal(0, 0.001, 3)  # Very accurate
        camera_last_update = t
    
    # Fuse estimates (simple: use camera when available, IMU otherwise)
    # Update position from velocity
    dt = 1./240.
    if t - camera_last_update < 1./camera_update_rate:
        # Recent camera update, trust it
        pass  # Keep camera position
    else:
        # No recent camera update, use IMU
        estimated_pos = estimated_pos + estimated_vel * dt
    
    # Print comparison every 100 steps
    if i % 100 == 0:
        error = np.linalg.norm(estimated_pos - true_pos)
        print(f"Step {i}: True pos = {true_pos}, Estimated pos = {estimated_pos}, Error = {error:.3f} m")
    
    time.sleep(1./240.)

p.disconnect()
print("Sensor fusion simulation complete!")

What this demonstrates:

Camera: Accurate but slow (low frequency)
IMU: Fast but drifts (high frequency)
Fusion: Combines both for best estimate

Part 8: Debugging Common Issues

Issue 1: "Camera image is black"

Symptoms: Camera returns all zeros or black image

Solutions:

# Check camera position and target
# Ensure objects are in view
# Verify camera is not inside an object
# Check lighting (some simulators need light sources)

Issue 2: "IMU readings are noisy"

Symptoms: Accelerometer/gyroscope values jump around

Solutions:

# Add filtering (moving average, Kalman filter)
# Reduce simulation timestep (more frequent updates)
# Add noise model (real sensors are noisy)

Issue 3: "LiDAR doesn't detect objects"

Symptoms: All rays return max range

Solutions:

# Check ray origin and direction
# Verify objects are within range
# Check collision detection is enabled
# Ensure objects have collision geometry

Summary

You learned:

✅ Vision sensors: RGB cameras, depth cameras, stereo vision for seeing the world
✅ LiDAR: 3D mapping and obstacle detection using lasers
✅ IMU: Balance, orientation, and motion tracking
✅ Force/torque sensors: Tactile feedback for manipulation
✅ Sensor fusion: Combining multiple sensors for robust perception
✅ Simulation: How to simulate sensors in PyBullet

Next steps: Complete the Module 0 project—build a robot simulation with sensors!

Exercises

Exercise 1: Camera Calibration (Required)

Objective: Understand camera parameters and calibration.

Tasks:

Create simulation with camera
Experiment with different:
- Field of view (FOV)
- Resolution
- Camera positions
Observe how parameters affect image
Document findings

Acceptance Criteria:

Camera simulation working
Different FOVs tested
Different resolutions tested
Observations documented

Estimated Time: 45 minutes

Exercise 2: IMU Drift Analysis (Required)

Objective: Understand IMU drift and its impact.

Tasks:

Create simulation with IMU
Track orientation over time
Measure drift (error accumulation)
Compare with ground truth
Document drift rate

Acceptance Criteria:

IMU simulation working
Orientation tracked over time
Drift measured and documented
Comparison with ground truth

Estimated Time: 60 minutes

Exercise 3: Sensor Fusion Implementation (Challenge)

Objective: Implement simple sensor fusion.

Tasks:

Create simulation with camera and IMU
Implement fusion algorithm (weighted average or Kalman filter)
Compare fused estimate to individual sensors
Measure improvement in accuracy
Document fusion approach

Requirements:

Camera and IMU simulated
Fusion algorithm implemented
Accuracy improvement demonstrated
Approach documented

Estimated Time: 120 minutes

Additional Resources

PyBullet Camera API - Camera documentation
Intel RealSense - Depth camera technology
Velodyne LiDAR - LiDAR specifications
Sensor Fusion Tutorial - Kalman filtering

Next: Module 0 Project: Basic Robot Simulation →

Return to: Module 0 Introduction →

Learning Outcomes​

Prerequisites​

Part 1: Sensor Overview​

Why Sensors Matter​

Sensor Categories​

Part 2: Vision Sensors​

RGB Cameras​

Depth Cameras​

Stereo Vision​

Part 3: LiDAR (Light Detection and Ranging)​

What is LiDAR?​

LiDAR Specifications​

Use Cases​

Limitations​

Part 4: IMU (Inertial Measurement Unit)​

What is an IMU?​

How IMUs Work​

Use Cases​

Limitations​

Part 5: Force/Torque Sensors​

What are Force/Torque Sensors?​

Use Cases​

Specifications​

Part 6: Sensor Fusion​

Why Combine Sensors?​

Common Fusion Strategies​

Part 7: Hands-On Tutorial​

Project: Simulate Sensors in PyBullet​

Step 1: Camera Simulation​

Step 2: IMU Simulation​

Step 3: LiDAR Simulation​

Step 4: Sensor Fusion Example​

Part 8: Debugging Common Issues​

Issue 1: "Camera image is black"​

Issue 2: "IMU readings are noisy"​

Issue 3: "LiDAR doesn't detect objects"​

Summary​

Exercises​

Exercise 1: Camera Calibration (Required)​

Exercise 2: IMU Drift Analysis (Required)​

Exercise 3: Sensor Fusion Implementation (Challenge)​

Additional Resources​

Learning Outcomes

Prerequisites

Part 1: Sensor Overview

Why Sensors Matter

Sensor Categories

Part 2: Vision Sensors

RGB Cameras

Depth Cameras

Stereo Vision

Part 3: LiDAR (Light Detection and Ranging)

What is LiDAR?

LiDAR Specifications

Use Cases

Limitations

Part 4: IMU (Inertial Measurement Unit)

What is an IMU?

How IMUs Work

Use Cases

Limitations

Part 5: Force/Torque Sensors

What are Force/Torque Sensors?

Use Cases

Specifications

Part 6: Sensor Fusion

Why Combine Sensors?

Common Fusion Strategies

Part 7: Hands-On Tutorial

Project: Simulate Sensors in PyBullet

Step 1: Camera Simulation

Step 2: IMU Simulation

Step 3: LiDAR Simulation

Step 4: Sensor Fusion Example

Part 8: Debugging Common Issues

Issue 1: "Camera image is black"

Issue 2: "IMU readings are noisy"

Issue 3: "LiDAR doesn't detect objects"

Summary

Exercises

Exercise 1: Camera Calibration (Required)

Exercise 2: IMU Drift Analysis (Required)

Exercise 3: Sensor Fusion Implementation (Challenge)

Additional Resources