- Github: https://github.com/Lifelong-Robot-Learning/LIBERO?tab=readme-ov-file#Dataset
- The sim environment is run with robosuite, and training is run with robomimic;
- Predicate-learning (feedback/outcomes are defined with 0 and 1), not reward-learning (hand-designing a reward function easily introduces reward bias); a minimal illustration of the difference follows below
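To make the predicate-vs-reward distinction concrete, here is a minimal sketch (not LIBERO code; `drawer_joint_pos` and the 0.1 threshold are hypothetical stand-ins) contrasting a 0/1 success predicate with a hand-shaped reward:

import numpy as np

# Predicate-style feedback: a binary success check, e.g. "is the drawer open enough?"
def task_success(drawer_joint_pos: float) -> int:
    return int(drawer_joint_pos > 0.1)   # 1 = predicate holds, 0 = it does not

# Reward-style feedback: a hand-designed shaped reward; the distance term and the 10.0
# weight are arbitrary choices, which is exactly where reward bias can creep in
def shaped_reward(ee_pos: np.ndarray, handle_pos: np.ndarray, drawer_joint_pos: float) -> float:
    return -float(np.linalg.norm(ee_pos - handle_pos)) + 10.0 * drawer_joint_pos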
HDF5 file format (data)
- The structure is simple: it is split into Datasets and Groups, which act like data arrays and folders respectively;
- The overall file format is fairly heavyweight, an all-in-one design;
- Presumably only older datasets still use this format? An object store would be a better fit;
Demo of parsing the data:
import h5py
import numpy as np

# hdf5_file points at a LIBERO demo file, e.g. open_the_middle_drawer_of_the_cabinet_demo.hdf5
# Walk the file structure
with h5py.File(hdf5_file, "r") as f:
    # Top-level keys (Groups)
    print(f"\nTop-level keys: {list(f.keys())}")

    # Sub-paths are accessed POSIX-style
    agentview_rgb = f["data/demo_0/obs/agentview_rgb"]
    if isinstance(agentview_rgb, h5py.Group):
        print("agentview_rgb is a Group, keys:", list(agentview_rgb.keys()))
    elif isinstance(agentview_rgb, h5py.Dataset):
        print("agentview_rgb is a Dataset")
        print(f"  Shape: {agentview_rgb.shape}")
        print(f"  Dtype: {agentview_rgb.dtype}")
        print(f"  Size: {agentview_rgb.size}")

    # Read metadata
    data_attrs = f["data"].attrs
    print("Data group attributes:")
    for key in data_attrs.keys():
        value = data_attrs[key]
        # Handle bytes/string conversion
        if isinstance(value, bytes):
            value = value.decode('utf-8')
        elif isinstance(value, np.ndarray) and value.dtype.kind == 'S':
            value = value.tobytes().decode('utf-8')
        print(f"  {key}: {value}")
The structure of a LIBERO hdf5 file, using open_the_middle_drawer_of_the_cabinet_demo.hdf5 from LIBERO-GOAL as an example (a sketch for printing this tree programmatically follows the listing):
data/demo_0
  actions
  dones
  obs
    agentview_rgb
    ee_ori
    ee_pos
    ee_states
    eye_in_hand_rgb
    gripper_states
    joint_states
  rewards
  robot_states
  states
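The listing above (and the per-key shapes discussed below) can be reproduced with a small h5py walk; this is a sketch assuming `hdf5_file` points at the demo file:

import h5py

def print_node(name, obj):
    # visititems calls this for every Group/Dataset under data/demo_0
    if isinstance(obj, h5py.Dataset):
        print(f"{name}: shape={obj.shape}, dtype={obj.dtype}")
    else:
        print(name)

with h5py.File(hdf5_file, "r") as f:
    f["data/demo_0"].visititems(print_node)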
- Almost every array has a leading dimension (T, ...), where T is the number of timesteps in the demonstration episode
- actions: action_dim is 7 (seven degrees of freedom): x, y, z (position) + roll/pitch/yaw (orientation of the EE, i.e. End Effector) + gripper
- dones: whether the task is finished at each step; mainly used by robomimic to split trajectories?
- obs: what the policy sees (receives)
  - Vision
    - agentview_rgb: (T, H, W, 3), image frames from the external (third-person) camera
    - eye_in_hand_rgb: (T, H, W, 3), wrist-mounted (eye-in-hand) camera view
  - EE state
    - ee_ori: (T, 3), the roll/pitch/yaw degrees of freedom above
    - ee_pos: (T, 3), x, y, z position
    - ee_states: aggregated EE state (velocity / force / pose concatenated)
    - gripper_states: (T, 1), gripper open/close state
    - joint_states: (T, n_joints), all joint states
- rewards: the reward signal during training
- robot_states: robot-centric state, a smaller set of quantities, (T, 9)
- states: god's-eye view, the state of everything in the environment, (T, 79)
Note: one gotcha here. I originally assumed that the difference between ee_pos[t] and ee_pos[t+1] would line up with the corresponding entry in actions, but there are a couple of misconceptions (see the sanity-check sketch below):
- ee_pos and actions are not expressed in the same coordinate frame
- actions are commands; the ee_pos actually reached after execution need not match them (the controller applies various gain computations, damping, etc.)
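A quick sanity check of this mismatch, as a sketch (assumes `hdf5_file` points at the demo file; it checks correlation rather than equality precisely because the two quantities are not directly comparable):

import h5py
import numpy as np

with h5py.File(hdf5_file, "r") as f:
    ee_pos = f["data/demo_0/obs/ee_pos"][()]    # (T, 3) achieved EE positions
    actions = f["data/demo_0/actions"][()]      # (T, 7) commanded deltas in [-1, 1]

observed_delta = np.diff(ee_pos, axis=0)        # what the EE actually moved each step
commanded_delta = actions[:-1, :3]              # what the policy commanded (unscaled)

# Correlation is typically well below 1: different frames, controller gains, damping, etc.
for axis, label in enumerate("xyz"):
    corr = np.corrcoef(observed_delta[:, axis], commanded_delta[:, axis])[0, 1]
    print(f"{label}: corr(observed, commanded) = {corr:.3f}")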
BDDL Task description
- Defines the problem
  - robosuite is the environment simulator (the :domain field)
- Defines the instruction (the :language field)
- regions: spatial region definitions (plate, bowl, bottle, ...)
- fixtures: static scene items (never grasped by the robot)
- objects: manipulable objects
- obj_of_interest: objects relevant to the task goal
- init: state initialization; On expresses the relative placement of objects
- goal: the goal condition (the :goal field)
(define (problem LIBERO_Tabletop_Manipulation)
  (:domain robosuite)
  (:language Open the middle layer of the drawer)
  (:regions
    (plate_region
      (:target main_table)
      (:ranges (
          (0.04 -0.03 0.060000000000000005 -0.01)
        )
      )
    )
    (akita_black_bowl_region
      (:target main_table)
      (:ranges (
          (-0.09999999999999999 -0.01 -0.08 0.01)
        )
      )
    )
    (wine_bottle_region
      (:target main_table)
      (:ranges (
          (-0.21000000000000002 -0.060000000000000005 -0.19 -0.04)
        )
      )
    )
    (cream_cheese_region
      (:target main_table)
      (:ranges (
          (-0.060000000000000005 0.12000000000000001 -0.04 0.14)
        )
      )
    )
    (stove_front_region
      (:target main_table)
      (:ranges (
          (-0.09 0.16999999999999998 -0.010000000000000002 0.25)
        )
      )
    )
    (cabinet_region
      (:target main_table)
      (:ranges (
          (0.02 -0.25 0.04 -0.23)
        )
      )
      (:yaw_rotation (
          (3.141592653589793 3.141592653589793)
        )
      )
    )
    (stove_region
      (:target main_table)
      (:ranges (
          (-0.42 0.2 -0.4 0.22)
        )
      )
    )
    (wine_rack_region
      (:target main_table)
      (:ranges (
          (-0.27 -0.27 -0.25 -0.25)
        )
      )
      (:yaw_rotation (
          (3.141592653589793 3.141592653589793)
        )
      )
    )
    (top_region
      (:target wooden_cabinet_1)
    )
    (middle_region
      (:target wooden_cabinet_1)
    )
    (bottom_region
      (:target wooden_cabinet_1)
    )
    (top_side
      (:target wooden_cabinet_1)
    )
    (cook_region
      (:target flat_stove_1)
    )
    (right_region
      (:target bowl_drainer_1)
    )
    (left_region
      (:target bowl_drainer_1)
    )
    (top_region
      (:target wine_rack_1)
    )
  )
  (:fixtures
    main_table - table
    wooden_cabinet_1 - wooden_cabinet
    flat_stove_1 - flat_stove
    wine_rack_1 - wine_rack
  )
  (:objects
    akita_black_bowl_1 - akita_black_bowl
    cream_cheese_1 - cream_cheese
    wine_bottle_1 - wine_bottle
    plate_1 - plate
  )
  (:obj_of_interest
    wooden_cabinet_1_middle_region
  )
  (:init
    (On wine_bottle_1 main_table_wine_bottle_region)
    (On akita_black_bowl_1 main_table_akita_black_bowl_region)
    (On plate_1 main_table_plate_region)
    (On cream_cheese_1 main_table_cream_cheese_region)
    (On wooden_cabinet_1 main_table_cabinet_region)
    (On flat_stove_1 main_table_stove_region)
    (On wine_rack_1 main_table_wine_rack_region)
  )
  (:goal
    (And (Open wooden_cabinet_1_middle_region))
  )
)
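For reference, the on-disk path of this BDDL file can be derived from the benchmark API instead of being hard-coded (a sketch, assuming get_libero_path("bddl_files") resolves to the bddl root as configured in the LIBERO repo):

import os
from libero.libero import get_libero_path
from libero.libero.benchmark import get_benchmark

bm = get_benchmark("libero_goal")()
task = bm.tasks[0]

# bddl root + problem_folder + bddl_file gives the path used by the env below
bddl_path = os.path.join(get_libero_path("bddl_files"), task.problem_folder, task.bddl_file)
print(bddl_path)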
Rendering a demo locally
Using open_the_middle_drawer_of_the_cabinet.bddl from LIBERO-GOAL as an example:
from libero.libero.benchmark import get_benchmark
import os
bm = get_benchmark("libero_goal")()
print(bm.get_task_names())
print(len(bm.tasks))
for t in bm.tasks[:5]:
    print(t.name)
from libero.libero.envs import OffScreenRenderEnv, DemoRenderEnv
task = bm.tasks[0]
print("----")
print("name:", task.name)
print("language:", task.language)
print("problem:", task.problem)
print("problem_folder:", task.problem_folder)
print("bddl_file:", task.bddl_file)
print("init_states_file:", task.init_states_file)
print("----")
# For GUI rendering on Mac, use ControlEnv with has_renderer=True
# DemoRenderEnv uses offscreen rendering (no GUI window)
#
# Note: On Mac, if you get OpenGL errors, you may need to:
# 1. Set environment variable: export PYOPENGL_PLATFORM=osmesa (for headless)
# OR use: export PYOPENGL_PLATFORM=glfw (for GUI - requires XQuartz or similar)
# 2. Install: brew install glfw (if using glfw backend)
# 3. Alternative: Use OffScreenRenderEnv and visualize frames with matplotlib (see below)
from libero.libero.envs.env_wrapper import ControlEnv
env = ControlEnv(
    bddl_file_name='./libero/libero/bddl_files/libero_goal/open_the_middle_drawer_of_the_cabinet.bddl',
    camera_names=["agentview"],
    has_renderer=True,             # Enable GUI window for visualization
    has_offscreen_renderer=True,   # Required for camera observations
    render_camera="frontview",     # Camera view for rendering
)
# Alternative: If GUI doesn't work on Mac, use offscreen rendering and visualize:
# from libero.libero.envs import OffScreenRenderEnv
# import matplotlib.pyplot as plt
# env = OffScreenRenderEnv(
#     bddl_file_name='./libero/libero/bddl_files/libero_goal/open_the_middle_drawer_of_the_cabinet.bddl',
#     camera_names=["agentview"],
# )
# obs = env.reset()
# plt.imshow(obs['agentview_image'][::-1])  # Flip vertically for display
# plt.show()
obs = env.reset()
print(obs.keys())
# Render the environment (this opens the GUI window on Mac)
# Access the underlying robosuite environment's render method
env.env.render()
import numpy as np
import time
import h5py
from libero.libero import get_libero_path
# Load actions from HDF5 demonstration file
print("\n" + "=" * 60)
print("Loading demonstration from HDF5 file...")
print("=" * 60)
# Get the demonstration file path for this task
demo_file_path = os.path.join(
    get_libero_path("datasets"),
    bm.get_task_demonstration(0)   # Get the demo file for task 0
)
print(f"Loading demo from: {demo_file_path}")
# Load actions from HDF5 file
with h5py.File(demo_file_path, "r") as f:
    # Get the first demo (you can change demo_0 to demo_1, demo_2, etc.)
    demo_key = "demo_0"
    actions = f[f"data/{demo_key}/actions"][()]   # Load all actions into memory
    states = f[f"data/{demo_key}/states"][()]     # Load the full simulator state sequence
    print(f"Loaded {len(actions)} actions from {demo_key}")
    print(f"Action shape: {actions.shape}")
    print(f"Action range: [{actions.min():.4f}, {actions.max():.4f}]")
# Optionally set the initial state from the demo
if len(states) > 0:
    print("Setting initial state from demonstration...")
    obs = env.set_init_state(states[0])   # set_init_state returns observations
else:
    obs = env.reset()
# Action format for OSC_POSE: [dx, dy, dz, droll, dpitch, dyaw, gripper]
# All values are typically in range [-1, 1]
# - First 3: position delta (translation)
# - Next 3: orientation delta (rotation)
# - Last 1: gripper action (-1=close, 1=open)
print(f"\nReplaying demonstration with {len(actions)} steps...")
print("=" * 60)
# Replay the demonstration actions
for t, action in enumerate(actions):
    # Optionally inject random actions during the demo
    if np.random.random() < 0.1:
        # Replace with a random action
        action = np.random.uniform(-1, 1, size=7)
        action_type = "random"
    else:
        action_type = "demo"
    obs, reward, done, info = env.step(action)
    # Render after each step to see the animation
    env.env.render()
    # Small delay to make the animation visible (adjust speed here)
    # 0.01 = real-time, 0.05 = slower, 0.001 = faster
    time.sleep(0.01)
    # Print progress every 50 steps, including the source of the last action
    if (t + 1) % 50 == 0:
        print(f"Step {t+1}/{len(actions)} ({(t+1)/len(actions)*100:.1f}%) [{action_type}]")
    if done:
        print(f"\nTask completed at step {t+1}!")
        break
print(f"\nDemonstration replay complete!")
# Close the environment when done
env.close()
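If the GUI window does not come up on Mac, one fallback is to replay the same demo offscreen and dump the frames to a video file. This is a sketch, assuming imageio (plus imageio-ffmpeg for mp4 output) is installed and that `agentview_image` is the observation key, as in the commented OffScreenRenderEnv example above:

import h5py
import imageio
from libero.libero.envs import OffScreenRenderEnv

env = OffScreenRenderEnv(
    bddl_file_name='./libero/libero/bddl_files/libero_goal/open_the_middle_drawer_of_the_cabinet.bddl',
    camera_names=["agentview"],
)
env.reset()

with h5py.File(demo_file_path, "r") as f:
    actions = f["data/demo_0/actions"][()]
    states = f["data/demo_0/states"][()]
env.set_init_state(states[0])

frames = []
for action in actions:
    obs, reward, done, info = env.step(action)
    frames.append(obs["agentview_image"][::-1])   # flip vertically, as above
    if done:
        break

imageio.mimsave("replay.mp4", frames, fps=30)
env.close()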