Experimental Details

🚧 Under Construction

Optimizations

  • Normalization statistics: we use compute_norm_stats.py to create a norm_stats.json file for each of our datasets. This is especially important because the default pi0_base trossen norm_stats.json appears to be for the older Aloha robot, which has different joints.
  • LoRA: if LoRA fine-tuning is used, it seems to be necessary to use the same LoRA model definition for both train.py and serve_policy.py, although it is possible we are misunderstanding this. For example, if model=pi0_config.Pi0Config(paligemma_variant="gemma_2b_lora") is used in the TrainConfig() (in config.py) for training, we find it necessary to have serve_policy.py use the same TrainConfig() for policy rollouts.
  • Joint_flip_mask: in aloha_policy.py, some joint angles are multiplied by -1 to make joint directions consistent with those expected by the pi0 policy. For the original Aloha robot, this required flipping the shoulder and elbow joints (numbers 1 and 2). For the newer Trossen AI Stationary, we believe the shoulder still needs flipping but the elbow does not. Also, we do not use the gripper transform in aloha_policy.py, only joint_flip_mask. See adapt_trossen_to_pi in our version of aloha_policy.py, and see more below.
  • Image resize: the pi0 model expects 224x224 images. To resize, openpi uses image_tools.resize_with_pad, so in the trossen_ai example main.py file we changed cv2.resize, which does not pad, to the image_tools version; see the sketch after this list and more below.
  • Sim to real joint calibration: as discussed in the calibration section above, the real and simulated robot joints are slightly out of alignment. Hence, for sim to real to work well, we adjusted actions for the right arm's joints 1 and 2 using the above calibration: action[7+1] -= 0.025 and action[7+2] += 0.025. For base joint 0, we also needed action[7+0] = 1.05*(action[7+0] + 0.01). The angular shift compensates for the difference in location between the sim and real bases; the small multiplier, on the other hand, is a mystery, but works.
  • Sim to real home pose: it is important to use the same initial pose for the real and simulated robots. We achieve this using the home_pose variable we added to the (deprecated) TrossenAIStationaryRobotConfig. The value needed for our sim to real experiments is home_pose=[0, np.pi/12, np.pi/12, 0, 0, 0, 0.044].
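
To make the padding point concrete, below is a minimal sketch (ours, illustrative only, not openpi's implementation) of the resize-then-pad idea behind image_tools.resize_with_pad; a bare cv2.resize would instead stretch the image and change its aspect ratio:

import cv2
import numpy as np

def resize_with_pad_sketch(img: np.ndarray, height: int, width: int) -> np.ndarray:
    """Scale the image to fit inside (height, width) while preserving
    aspect ratio, then zero-pad (letterbox) to the exact target size."""
    h, w = img.shape[:2]
    scale = min(height / h, width / w)
    new_h, new_w = int(round(h * scale)), int(round(w * scale))
    resized = cv2.resize(img, (new_w, new_h))  # note: cv2 takes (width, height)
    out = np.zeros((height, width) + img.shape[2:], dtype=img.dtype)
    top, left = (height - new_h) // 2, (width - new_w) // 2
    out[top:top + new_h, left:left + new_w] = resized
    return out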

Training Details

Below is the training configuration used to learn the simulated-robot transfer-cube task, available on Hugging Face. This config is for LoRA fine-tuning. For a full fine-tune, use model=pi0_config.Pi0Config() and remove the freeze_filter and ema_decay lines (see the sketch after the config).


TrainConfig(
    name="pi0_aloha_sim_trossen_ai_mem_finetune_v2",
    model=pi0_config.Pi0Config(paligemma_variant="gemma_2b_lora", action_expert_variant="gemma_300m_lora"),
    data=LeRobotAlohaDataConfig(
        repo_id="ANRedlich/trossen_ai_stationary_sim_transfer_40mm_cube_13",
        default_prompt="Transfer cube",
        use_delta_joint_actions=False,
        adapt_to_pi=False,
        adapt_trossen_to_pi=True, # see above
        repack_transforms=_transforms.Group(
            inputs=[
                _transforms.RepackTransform(
                    {
                        "images": {
                            "cam_high": "observation.images.cam_high",
                            "cam_low": "observation.images.cam_low",
                            "cam_left_wrist": "observation.images.cam_left_wrist",
                            "cam_right_wrist": "observation.images.cam_right_wrist",
                        },
                        "state": "observation.state",
                        "actions": "action",
                    }
                )
            ]
        ),
    ),
    weight_loader=weight_loaders.CheckpointWeightLoader("gs://openpi-assets/checkpoints/pi0_base/params"),
    num_train_steps=20_000,
    freeze_filter=pi0_config.Pi0Config(
        paligemma_variant="gemma_2b_lora", action_expert_variant="gemma_300m_lora"
    ).get_freeze_filter(),
    # Turn off EMA for LoRA finetuning.
    ema_decay=None,
),
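
For reference, the full fine-tune variant mentioned above would look roughly as follows (the config name here is our hypothetical; the data block is unchanged from above):

TrainConfig(
    name="pi0_aloha_sim_trossen_ai_mem_full_finetune",  # hypothetical name
    model=pi0_config.Pi0Config(),
    data=LeRobotAlohaDataConfig(...),  # same data config as above
    weight_loader=weight_loaders.CheckpointWeightLoader("gs://openpi-assets/checkpoints/pi0_base/params"),
    num_train_steps=20_000,
    # no freeze_filter, and no ema_decay=None: all parameters train and EMA stays on
),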

The norm_stats.json file must be created specifically for the above dataset. It will be placed in openpi/assets/pi0_aloha_sim_trossen_ai_mem_finetune_v2/ANRedlich/trossen_ai_stationary_sim_transfer_40mm_cube_13/:

uv run scripts/compute_norm_stats.py --config-name=pi0_aloha_sim_trossen_ai_mem_finetune_v2
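
As a quick sanity check (our snippet; it assumes the norm-stats JSON layout openpi used at the time of writing, with statistics nested under a "norm_stats" key), confirm the file exists and covers the expected keys:

import json
from pathlib import Path

p = Path("assets/pi0_aloha_sim_trossen_ai_mem_finetune_v2"
         "/ANRedlich/trossen_ai_stationary_sim_transfer_40mm_cube_13/norm_stats.json")
stats = json.loads(p.read_text())
print(sorted(stats["norm_stats"].keys()))  # expect entries such as "state" and "actions"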

Run training, but make sure the above norm_stats.json is used and not the default norm stats! The checkpoint will be placed in openpi/checkpoints/pi0_aloha_sim_trossen_ai_mem_finetune_v2/trossen_ai_stationary_x1/19999/:

XLA_PYTHON_CLIENT_MEM_FRACTION=0.9 uv run scripts/train.py pi0_aloha_sim_trossen_ai_mem_finetune_v2 --exp-name=trossen_ai_stationary_x1

If train.py insists on loading the default norm_stats file instead of the dataset norm_stats calculated above, it might be necessary to make a small change to src/openpi/policies/policy_config.py:

# in create_trained_policy(...):
    ...
    if data_config.norm_stats is None:  # added
        norm_stats = _checkpoints.load_norm_stats(checkpoint_dir / "assets", data_config.asset_id)
    else:  # added
        norm_stats = data_config.norm_stats  # added

Evaluation Details

In serve_policy.py:

class EnvMode(enum.Enum):
    ...
    # add this line:
    ALOHA_SIM_TROSSEN_AI_FINETUNE = "aloha_sim_trossen_ai_finetune"
    ...

DEFAULT_CHECKPOINT: dict[EnvMode, Checkpoint] = {
    ...
    # add:
    EnvMode.ALOHA_SIM_TROSSEN_AI_FINETUNE: Checkpoint(
        config="pi0_aloha_sim_trossen_ai_mem_finetune_v2",
        dir="./checkpoints/pi0_aloha_sim_trossen_ai_mem_finetune_v2/trossen_ai_stationary_x1/19999",
    ),
    ...
}

Run the policy server, making sure it uses the norm_stats above for the specific dataset as well as the above checkpoint (note: --no-sync was needed to keep uv from re-installing the default gym-aloha in place of ours):

uv run --no-sync scripts/serve_policy.py --env ALOHA_SIM_TROSSEN_AI_FINETUNE

In a second terminal, run the real-robot control example (see Sim to real joint calibration, below, for adjust_for_sim_to_real):

MUJOCO_GL=egl uv run python examples/trossen_ai/main.py --adjust_for_sim_to_real=True

Trossen openpi fork vs ours

To get the same training results as us, and to use our pi0 models with the Trossen fork of openpi (recommended), the following small mods are required. We do not know if these mods are correct, and we continue to experiment with them.

  • Joint_flip_mask: We could be wrong, but we are currently flipping the shoulder, not the elbow, and not transforming the gripper (a concrete check of the mask appears after this list). In training/config.py:

        TrainConfig(
            ...
            data=LeRobotAlohaDataConfig(
                adapt_to_pi=False,
                ...

    becomes:

        TrainConfig(
            ...
            data=LeRobotAlohaDataConfig(
                adapt_to_pi=False,
                adapt_trossen_to_pi=True,
                ...

    also, in policies/aloha_policy.py:

        class AlohaInputs(transforms.DataTransformFn):
            adapt_trossen_to_pi: bool = False  # added by us
            ...

        def _joint_flip_mask_trossen() -> np.ndarray:
            """Joint 1 of each arm gets flipped by -1."""
            return np.array([1, -1, 1, 1, 1, 1, 1, 1, -1, 1, 1, 1, 1, 1])

        def _decode_state(state: np.ndarray, *, adapt_to_pi: bool = False, adapt_trossen_to_pi: bool = False) -> np.ndarray:
            ...
            elif adapt_trossen_to_pi:
                state = _joint_flip_mask_trossen() * state
            ...

        def _encode_actions(actions: np.ndarray, *, adapt_to_pi: bool = False, adapt_trossen_to_pi: bool = False) -> np.ndarray:
            ...
            elif adapt_trossen_to_pi:
                actions = _joint_flip_mask_trossen() * actions
            ...

        def _encode_actions_inv(actions: np.ndarray, *, adapt_to_pi: bool = False, adapt_trossen_to_pi: bool = False) -> np.ndarray:
            ...
            elif adapt_trossen_to_pi:
                actions = _joint_flip_mask_trossen() * actions
            ...
  • Image resize: We believe (could be wrong) that image resizing during training uses resize_with_pad. Also, our datasets and robot images (with our older lerobot code) are RGB, so we made a few mods to main.py in the trossen_ai example. The original:

        # Transform and resize images from all cameras
        for cam in cameras:
            image_hwc = observation_dict[cam]
            image_resized = cv2.resize(image_hwc, (224, 224))
            # convert BGR to RGB
            image_rgb = cv2.cvtColor(image_resized, cv2.COLOR_BGR2RGB)
            image_chw = np.transpose(image_rgb, (2, 0, 1))
            observation_dict[cam] = image_chw

    becomes:

        # Resize with padding; images are already RGB, so no color conversion.
        # image_tools is openpi's resize/convert helper module (e.g. from openpi_client import image_tools).
        for cam in cameras:
            image_hwc = observation_dict[cam].numpy()
            image_resized = image_tools.convert_to_uint8(image_tools.resize_with_pad(image_hwc, 224, 224))
            image_chw = np.transpose(image_resized, (2, 0, 1))
            observation_dict[cam] = image_chw
  • Sim to real joint calibration: To align the sim and real robots, in examples/trossen_ai/main.py -> run_episode(...):

        self.execute_action(a_t)

    becomes:

        if self.adjust_for_sim_to_real:
            a_t = a_t.copy()
            a_t[7] = 1.05 * (a_t[7] + 0.01)
            a_t[8] = a_t[8] - 0.025
            a_t[9] = a_t[9] + 0.025
        self.execute_action(a_t)
  • Sim to real home pose: The sim default home position in our dataset and in our gym-aloha is different from the default lerobot staged_position. In packages/lerobot_robot_trossen/src/lerobot_robot_trossen/config_widowxai_follower.py:

        staged_positions: list[float] = field(
            default_factory=lambda: [0, np.pi / 3, np.pi / 6, np.pi / 5, 0, 0, 0]
        )

    becomes, in examples/trossen_ai/main.py:

        robot_config = TrossenAIStationaryRobotConfig(max_relative_target, home_pose=[0, np.pi/12, np.pi/12, 0, 0, 0, 0.044])
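
As a concrete check of the flip mask above (our snippet), _joint_flip_mask_trossen() negates only entry 1 of each 7-dof arm, i.e. indices 1 and 8 of the 14-dim vector:

import numpy as np

mask = np.array([1, -1, 1, 1, 1, 1, 1, 1, -1, 1, 1, 1, 1, 1])
state = np.arange(14, dtype=float)
print(mask * state)  # [0. -1. 2. ... 7. -8. 9. ... 13.]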

Using pi0 models from Hugging Face

openpi does not download and use Hugging Face models directly, but Hugging Face models can easily be used in openpi as follows, using trossen_ai_stationary_sim_pi013 as an example:

Download the model to a local directory:

huggingface-cli download ANRedlich/trossen_ai_stationary_sim_pi013 --local-dir ~/openpi/openpi/checkpoints/hf_checkpoint
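
After the download, it is worth confirming the local checkpoint has the layout the configs below expect (at minimum an assets/ subdirectory; exact contents depend on what was uploaded):

ls ~/openpi/openpi/checkpoints/hf_checkpoint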

Add assets to TrainConfig in training/config.py:


TrainConfig(
    name="pi0_aloha_sim_trossen_ai_mem_finetune_v2",
    model=pi0_config.Pi0Config(...),
    data=LeRobotAlohaDataConfig(
        repo_id="ANRedlich/trossen_ai_stationary_sim_transfer_40mm_cube_13",
        assets=AssetsConfig(  # note: only use this to override the default assets location
            assets_dir="./checkpoints/hf_checkpoint/assets",
            asset_id="ANRedlich/trossen_ai_stationary_sim_transfer_40mm_cube_13",
        ),
        ...

In serve_policy.py, point dir at the downloaded model:


EnvMode.ALOHA_SIM_TROSSEN_AI_FINETUNE: Checkpoint(
    config="pi0_aloha_sim_trossen_ai_mem_finetune_v2",
    dir="./checkpoints/hf_checkpoint",
),
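
Then run the policy server as before; it should now load the downloaded checkpoint, with the per-dataset norm stats coming from its assets directory:

uv run --no-sync scripts/serve_policy.py --env ALOHA_SIM_TROSSEN_AI_FINETUNE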