Optimizations
- Normalization statistics: we use compute_norm_stats.py to create a norm_stats.json file for each of our datasets. This is especially important since the default pi0_base trossen norm_stats.json seems to be for the older Aloha robot, which has different joints.
- LoRA: if LoRA fine-tuning is used, it appears necessary to use the same LoRA model definition for both train.py and serve_policy.py, although it is possible we are misunderstanding this. For example, if model=pi0_config.Pi0Config(paligemma_variant="gemma_2b_lora") is used in TrainConfig() (in config.py) for training, we find it necessary to have serve_policy.py use the same TrainConfig() for policy rollouts.
- Joint_flip_mask: in aloha_policy.py, some joint angles are multiplied by -1 to make joint directions consistent with those expected by the pi0 policy. For the original Aloha robot, this required the shoulder and elbow joints (numbers 1 and 2) to be flipped. For the newer Trossen AI Stationary we believe that the shoulder still needs flipping, but the elbow does not. Also, we do not use the gripper transform in aloha_policy.py, just joint_flip_mask. See adapt_trossen_to_pi in our version of aloha_policy.py, and see more below.
- Image resize: the pi0 model expects 224x224 images. To resize, openpi uses image_tools.resize_with_pad, so in the trossen_ai example main.py we replaced cv2.resize, which does not pad, with the image_tools version; see the sketch after this list and more below.
- Sim to real joint calibration: As discussed in the calibration section, above, the real and simulated robot joints are just slightly out of alignment. Hence, for sim to real to work well, we adjusted actions for the right robot arm's joints 1 and 2 using the above calibration: action[7+1]-=0.025 and action[7+2]+=0.025. For base joint 0, we also needed to adjust action[7+0]=1.05*(action[7+0]+0.01). For the base, this angular shift compensates for the shift in location of the sim vs real bases. On the other hand, the small multiplier is a mystery, but works.
- Sim to real home pose: It is important to use the same initial pose for the real and simulated robots. We achieve this using the home_pose variable we added to the (deprecated) TrossenAIStationaryRobotConfig. The value needed for our sim to real experiments is home_pose=[0, np.pi/12, np.pi/12, 0, 0, 0, 0.044].
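For intuition on the padded resize in the Image resize item above, here is a minimal sketch of the idea (illustrative only; openpi's actual implementation is image_tools.resize_with_pad, and resize_with_pad_sketch is our hypothetical name):
import cv2
import numpy as np

def resize_with_pad_sketch(image: np.ndarray, height: int, width: int) -> np.ndarray:
    """Scale to fit inside (height, width) without distortion, then zero-pad to size."""
    h, w = image.shape[:2]
    scale = min(height / h, width / w)  # largest scale that fits both dimensions
    new_h, new_w = int(h * scale), int(w * scale)
    resized = cv2.resize(image, (new_w, new_h))  # cv2 takes (width, height)
    out = np.zeros((height, width) + image.shape[2:], dtype=image.dtype)
    top, left = (height - new_h) // 2, (width - new_w) // 2
    out[top:top + new_h, left:left + new_w] = resized  # center on black padding
    return out
Unlike a plain cv2.resize, this preserves the aspect ratio, which is why we switched the trossen_ai example over (see below).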
Training Details
Below is the training configuration used to learn the simulated-robot transfer-cube task, available on Hugging Face. This config is for LoRA fine-tuning. For a full fine-tune, use model=pi0_config.Pi0Config() and remove the freeze_filter and ema_decay lines (a sketch of that variant follows the config below).
TrainConfig(
    name="pi0_aloha_sim_trossen_ai_mem_finetune_v2",
    model=pi0_config.Pi0Config(paligemma_variant="gemma_2b_lora", action_expert_variant="gemma_300m_lora"),
    data=LeRobotAlohaDataConfig(
        repo_id="ANRedlich/trossen_ai_stationary_sim_transfer_40mm_cube_13",
        default_prompt="Transfer cube",
        use_delta_joint_actions=False,
        adapt_to_pi=False,
        adapt_trossen_to_pi=True,  # see above
        repack_transforms=_transforms.Group(
            inputs=[
                _transforms.RepackTransform(
                    {
                        "images": {
                            "cam_high": "observation.images.cam_high",
                            "cam_low": "observation.images.cam_low",
                            "cam_left_wrist": "observation.images.cam_left_wrist",
                            "cam_right_wrist": "observation.images.cam_right_wrist",
                        },
                        "state": "observation.state",
                        "actions": "action",
                    }
                )
            ]
        ),
    ),
    weight_loader=weight_loaders.CheckpointWeightLoader("gs://openpi-assets/checkpoints/pi0_base/params"),
    num_train_steps=20_000,
    freeze_filter=pi0_config.Pi0Config(
        paligemma_variant="gemma_2b_lora", action_expert_variant="gemma_300m_lora"
    ).get_freeze_filter(),
    # Turn off EMA for LoRA fine-tuning.
    ema_decay=None,
),
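For reference, a full fine-tune version of the same config would look roughly like the sketch below (the name is hypothetical; everything else follows the note above):
TrainConfig(
    name="pi0_aloha_sim_trossen_ai_full_finetune",  # hypothetical name
    model=pi0_config.Pi0Config(),  # full fine-tune: no LoRA variants
    data=LeRobotAlohaDataConfig(
        ...  # same data config as the LoRA version above
    ),
    weight_loader=weight_loaders.CheckpointWeightLoader("gs://openpi-assets/checkpoints/pi0_base/params"),
    num_train_steps=20_000,
    # no freeze_filter (all weights train) and no ema_decay=None (keep default EMA)
),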
The norm_stats.json file must be created specifically for the above dataset. It will be placed in openpi/assets/pi0_aloha_sim_trossen_ai_mem_finetune_v2/ANRedlich/trossen_ai_stationary_sim_transfer_40mm_cube_13/:
uv run scripts/compute_norm_stats.py --config-name=pi0_aloha_sim_trossen_ai_mem_finetune_v2
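To sanity-check the result, the generated file can be inspected directly. A minimal sketch, assuming it is run from the openpi root and that the JSON layout is a mapping of feature names to statistics (the exact layout may differ by openpi version):
import json
from pathlib import Path

stats_path = Path(
    "assets/pi0_aloha_sim_trossen_ai_mem_finetune_v2"
    "/ANRedlich/trossen_ai_stationary_sim_transfer_40mm_cube_13/norm_stats.json"
)
with stats_path.open() as f:
    stats = json.load(f)
print(list(stats.keys()))  # top-level keys; expect statistics covering state and actions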
Run training, but make sure the above norm_stats.json is used and not the default norm stats! The checkpoint will be placed in openpi/checkpoints/pi0_aloha_sim_trossen_ai_mem_finetune_v2/trossen_ai_stationary_x1/19999/:
XLA_PYTHON_CLIENT_MEM_FRACTION=0.9 uv run scripts/train.py pi0_aloha_sim_trossen_ai_mem_finetune_v2 --exp-name=trossen_ai_stationary_x1
If train.py insists on loading the default norm_stats file instead of the dataset norm_stats computed above, it may be necessary to make a small change to src/openpi/policies/policy_config.py:
# in create_trained_policy(...):
...
if data_config.norm_stats is None:  # added
    norm_stats = _checkpoints.load_norm_stats(checkpoint_dir / "assets", data_config.asset_id)
else:  # added
    norm_stats = data_config.norm_stats  # added
Evaluation Details
In serve_policy.py:
class EnvMode(enum.Enum):
    ...
    # add this line
    ALOHA_SIM_TROSSEN_AI_FINETUNE = "aloha_sim_trossen_ai_finetune"
...
DEFAULT_CHECKPOINT: dict[EnvMode, Checkpoint] = {
    ...
    # add:
    EnvMode.ALOHA_SIM_TROSSEN_AI_FINETUNE: Checkpoint(
        config="pi0_aloha_sim_trossen_ai_mem_finetune_v2",
        dir="./checkpoints/pi0_aloha_sim_trossen_ai_mem_finetune_v2/trossen_ai_stationary_x1/19999",
    ),
Run the policy server, making sure it uses the dataset-specific norm_stats above as well as the above checkpoint. (Note: --no-sync was needed to keep uv from re-installing the default gym-aloha in place of ours.):
uv run --no-sync scripts/serve_policy.py --env ALOHA_SIM_TROSSEN_AI_FINETUNE
In a second terminal, run the real robot control example. (See Sim to real joint calibration, below, for adjust_for_sim_to_real):
MUJOCO_GL=egl uv run python examples/trossen_ai/main.py --adjust_for_sim_to_real=True
Trossen openpi fork vs ours
To get the same training results as us, and to use our pi0 models with the Trossen fork of openpi (recommended), the following small mods are required. We do not know whether these mods are fully correct, and we continue to experiment with them.
- Joint_flip_mask: we could be wrong, but we currently flip the shoulder, not the elbow, and do not transform the gripper; see the config.py and aloha_policy.py changes below.
- Image resize: we believe (but could be wrong) that image resizing during training uses resize_with_pad. Also, our datasets and robot images (recorded with our older lerobot code) are RGB, so we made a few mods to main.py in the trossen_ai example; see the camera loop below.
- Sim to real joint calibration: to align the sim and real robots, see the run_episode change below.
- Sim to real home pose: the sim default home position in our dataset and in our gym-aloha differs from the default lerobot staged_position; see the home_pose override below.
in training/config.py:
TrainConfig(
    ...
    data=LeRobotAlohaDataConfig(
        adapt_to_pi=False,
        ...
becomes:
TrainConfig(
    ...
    data=LeRobotAlohaDataConfig(
        adapt_to_pi=False,
        adapt_trossen_to_pi=True,
        ...
also, in policies/aloha_policy.py:
class AlohaInputs(transforms.DataTransformFn):
    adapt_trossen_to_pi: bool = False  # added by us
...
def _joint_flip_mask_trossen() -> np.ndarray:
    """Flip joint 1 (the shoulder) of each 7-dof arm by -1."""
    return np.array([1, -1, 1, 1, 1, 1, 1, 1, -1, 1, 1, 1, 1, 1])
...
def _decode_state(state: np.ndarray, *, adapt_to_pi: bool = False, adapt_trossen_to_pi: bool = False) -> np.ndarray:
    ...
    elif adapt_trossen_to_pi:
        state = _joint_flip_mask_trossen() * state
    ...
def _encode_actions(actions: np.ndarray, *, adapt_to_pi: bool = False, adapt_trossen_to_pi: bool = False) -> np.ndarray:
    ...
    elif adapt_trossen_to_pi:
        actions = _joint_flip_mask_trossen() * actions
    ...
def _encode_actions_inv(actions: np.ndarray, *, adapt_to_pi: bool = False, adapt_trossen_to_pi: bool = False) -> np.ndarray:
    ...
    elif adapt_trossen_to_pi:
        actions = _joint_flip_mask_trossen() * actions
    ...
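As a quick check of the mask above, applying it to a dummy 14-dim state (two 7-dof arms) should change only index 1 and index 8, the shoulder joint of each arm. A minimal sketch, assuming _joint_flip_mask_trossen is in scope:
import numpy as np

state = np.arange(14, dtype=np.float64)  # dummy state, one value per joint
flipped = _joint_flip_mask_trossen() * state
print(np.nonzero(flipped != state)[0])  # -> [1 8]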
in examples/trossen_ai/main.py:
# Transform and resize images from all cameras
for cam in cameras:
    image_hwc = observation_dict[cam]
    image_resized = cv2.resize(image_hwc, (224, 224))
    # convert BGR to RGB
    image_rgb = cv2.cvtColor(image_resized, cv2.COLOR_BGR2RGB)
    image_chw = np.transpose(image_rgb, (2, 0, 1))
    observation_dict[cam] = image_chw
becomes:
# requires: from openpi_client import image_tools
for cam in cameras:
    image_hwc = observation_dict[cam].numpy()
    # resize_with_pad preserves aspect ratio; our images are already RGB,
    # so the BGR-to-RGB conversion is dropped
    image_resized = image_tools.convert_to_uint8(image_tools.resize_with_pad(image_hwc, 224, 224))
    image_chw = np.transpose(image_resized, (2, 0, 1))
    observation_dict[cam] = image_chw
in examples/trossen_ai/main.py, run_episode(...):
self.execute_action(a_t)
becomes:
if self.adjust_for_sim_to_real:
    a_t = a_t.copy()
    # right arm occupies indices 7..13; see "Sim to real joint calibration" above
    a_t[7] = 1.05 * (a_t[7] + 0.01)  # base joint 0
    a_t[8] = a_t[8] - 0.025          # shoulder joint 1
    a_t[9] = a_t[9] + 0.025          # elbow joint 2
self.execute_action(a_t)
in packages/lerobot_robot_trossen/src/lerobot_robot_trossen/config_widowxai_follower.py:
staged_positions: list[float] = field(
    default_factory=lambda: [0, np.pi / 3, np.pi / 6, np.pi / 5, 0, 0, 0]
)
is overridden in examples/trossen_ai/main.py:
robot_config = TrossenAIStationaryRobotConfig(max_relative_target, home_pose=[0, np.pi / 12, np.pi / 12, 0, 0, 0, 0.044])
Using pi0 models from Hugging Face
openpi does not download and use Hugging Face models directly, but such models can easily be used in openpi as follows, using trossen_ai_stationary_sim_pi013 as an example:
Download the model to a local directory:
huggingface-cli download ANRedlich/trossen_ai_stationary_sim_pi013 --local-dir ~/openpi/openpi/checkpoints/hf_checkpoint
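Equivalently, the download can be scripted (a minimal sketch using the huggingface_hub Python API):
from pathlib import Path
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="ANRedlich/trossen_ai_stationary_sim_pi013",
    local_dir=Path("~/openpi/openpi/checkpoints/hf_checkpoint").expanduser(),
)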
Add assets to TrainConfig in training/config.py:
TrainConfig(
    name="pi0_aloha_sim_trossen_ai_mem_finetune_v2",
    model=pi0_config.Pi0Config(...),
    data=LeRobotAlohaDataConfig(
        repo_id="ANRedlich/trossen_ai_stationary_sim_transfer_40mm_cube_13",
        assets=AssetsConfig(  # note: only used to override the default assets location
            assets_dir="./checkpoints/hf_checkpoint/assets",
            asset_id="ANRedlich/trossen_ai_stationary_sim_transfer_40mm_cube_13",
        ),
        ...
In serve_policy.py, point dir to the downloaded model:
EnvMode.ALOHA_SIM_TROSSEN_AI_FINETUNE: Checkpoint(
    config="pi0_aloha_sim_trossen_ai_mem_finetune_v2",
    dir="./checkpoints/hf_checkpoint",
),