1. Hardware
Our Trossen AI Stationary was purchased in May 2025. We are still running Trossen Arm Driver v1.7.8. Our local computer is a System76 Thelio Mira desktop running Ubuntu 22.04 with an RTX 5090 GPU. The RTX 5090 is used for LoRA fine-tuning of pi0, but for full fine-tuning we have been using H100 GPUs remotely on runpod.io.
2. Software
We have started by augmenting and tweaking the gym-aloha environment, as well as the (deprecated) Trossen lerobot framework, with the goal of providing seamless sim to sim, sim to real, and real to sim support for the Trossen AI Stationary robot. We have also been tweaking the lerobot software for smoother Trossen AI Stationary real robot dataset acquisition. In addition, we have added real and simulated Trossen AI Stationary support to the openpi framework. Very recently, we started experimenting with Isaac GR00T N.17 on our robot too! Our forks are at github.com/anredlich. Highlights:
- gym-aloha: we added *.xml mujoco files to the assets folder and augmented the sim.py and sim_end_effector.py simulator code to give gym-aloha the ability to simulate the Trossen AI Stationary robot (mujoco files and code adapted from trossen_arm_mujoco). This includes both joint controlled and end-effector controlled simulations for the transfer-cube task. For this task, we added environmental options such as box size, box position, box orientation, and box color, as well as some control over lighting, robot joint reference angles, and robot base positions.
- lerobot: we added control_sim_robot.py which uses the augmented gym-aloha environment to create and replay simulated datasets for the Trossen AI Stationary robot. We also added scripted_policy.py, a heuristic waypoint policy adapted from trossen_arm_mujoco, for the simulated robot rollouts. In addition, we modified train.py and eval.py so that they can train and evaluate policies for the simulated Trossen AI Stationary robot. Together these additions allow full sim to sim, sim to real, and real to sim evaluations. Combining simulated and real robot replay can also be used to calibrate/match the simulated robot to the real robot. We added better text-to-voice and additional voice prompts to control_robot.py to improve the real robot dataset acquisition workflow. We also added 4 new evaluate_*.py and train_*.py example files for both the old aloha and the new Trossen AI simulated robots. Moreover, we added a task splitting tool, dataset_splitter.py, which takes a long high-level task episode and splits it into sub-task episodes with their own sub-task prompts. We also added a dataset_merge.py tool to pool lerobot datasets into a single dataset. (Note that all our mods are to the deprecated Trossen lerobot fork. We have not yet ported to the newer plugin version of lerobot.)
- openpi: we have added hardware driver support to run pi0/pi0.5 policies on the Trossen AI Stationary Robot within the openpi framework. This was done by adapting the trossen_ai example, in particular the TrossenOpenPIBridge class in main.py from their openpi fork. We have more recently added record.py which allows human intervention and teleoperation during a pi0/pi0.5 policy rollout. It records this rollout/teleoperated episode in lerobot format, and can also record pure rollouts and pure teleoperation, see Policy Improvement using Human Interventions. We have also added full simulated Trossen AI Stationary robot support in our aloha_sim_trossen_ai example, which calls our gym-aloha fork in place of the gym-aloha that downloads with openpi. (Note that we still use the v1.7.8 robot driver with cut-and-pasted (deprecated) lerobot code, but for newer drivers we recommend using the Trossen Robotics fork of openpi, with a couple of minor modifications to main.py and aloha_policy.py, outlined here. For record.py a few mods can get it to work in the Trossen fork. One of these days we will adapt our fork to the newer drivers.)
- Isaac GR00T N.17: we forked the Isaac GR00T repository and added the example folder trossen_ai with everything needed to run GR00T models on our Trossen AI Stationary (runs our 1.7.8 driver for the moment, also only in the develop branch at this time!). We then trained on our ANRedlich/trossen_ai_stationary_transfer_40mm_cube_02 dataset using an H100 GPU on runpod.io. The resulting policy was crazy shaky and almost destroyed the robot, so we added EMA low-pass filtering, re-trained with the action chunk increased from 16 to 32, and finally used the Trossen robot clipping variable max_relative_target to remove any remaining large action deltas. This smoothed out the robot! More details will be given later. We have just gotten started!
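The EMA low-pass filtering mentioned in the GR00T item can be sketched roughly as follows. This is a minimal illustration and not the code in our fork: the class name EmaActionFilter and the smoothing factor alpha are hypothetical, and actions are assumed to be plain lists of joint targets.

```python
from typing import List, Optional


class EmaActionFilter:
    """Exponential moving average over successive action vectors (hypothetical helper)."""

    def __init__(self, alpha: float) -> None:
        # alpha=1.0 passes actions through unchanged; smaller values smooth more.
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self._state: Optional[List[float]] = None

    def reset(self) -> None:
        """Forget the history, e.g. at the start of a new episode."""
        self._state = None

    def __call__(self, action: List[float]) -> List[float]:
        if self._state is None:
            # The first action initializes the filter state.
            self._state = list(action)
        else:
            self._state = [
                self.alpha * a + (1.0 - self.alpha) * s
                for a, s in zip(action, self._state)
            ]
        return list(self._state)
```

A smaller alpha removes more high-frequency jitter but adds lag, so it has to be tuned against how quickly the task needs the arms to respond.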
3. Optimizations
This is a non-exhaustive list of small optimizations and problem resolutions that may be helpful to other Trossen AI Stationary Robot users.
- robot: WARNING: Do NOT let pets or small children near the leader arms: they can swing and swoop down violently, especially if you play with the arm joint_characteristics. Almost learned this the hard way.
- robot: sticky gripper: The right arm gripper was a bit sticky (it feels like static friction) and would over-shoot. Improved this by adjusting the embedded arm joint_characteristics variable, friction_viscous_coef, for the gripper (joint 6) from 202.61772... to 25.0. See the Trossen documentation for how to do this.
- lerobot: version warning: There was a dataset version error which prevented lerobot simulation testing and dataset visualization for older aloha and pusht datasets. Converted this to a warning.
- lerobot: missing type in config: Fixed a model writing error in train.py: the checkpoint config.json file was missing the "type: act" or "type: diffusion" line, so the model could not be read, e.g. by eval.py. Solved this by adding a type: str = "act" line to configuration_act.py and a type: str = "diffusion" line to configuration_diffusion.py.
- lerobot: max joint angle change: For real robot rollouts, we found that setting the robot.max_relative_target to 0.05-0.1 radians makes a huge difference in whether a learned policy succeeds. This argument clips the maximum joint angle change in one step, thereby reducing jerky motions which seem to take the robot out of the learning distribution and often lead to failure.
- lerobot: multi-task dataset workflow: To build a real robot multi-task dataset, we have lerobot randomly choose from a set of prompts and use log_say to voice the prompt. Then we use teleoperation to enact the task corresponding to that prompt. See control_robot.py->record(...) and control_utils.py->record_episode(...) in our lerobot for more details.
- lerobot: building a dataset with sub-tasks: Sub-tasks, such as 'pick up the spoon', which are part of a high-level task, such as 'clean up the kitchen', must flow naturally into one another. Hence, to build a dataset of sub-tasks, we first record episodes of full tasks, and then use our dataset_splitter.py to split each full task episode into a number of sub-task episodes. Each sub-task is also labeled with a sub-task prompt. For each sub-task, dataset_splitter.py takes as input a range of full task frames/events, and also a prompt. We use visualize_dataset.py to look at the robot video to determine a frame range for each sub-task. The splitter tool, along with dataset_merger.py, is in the develop branch of our lerobot.
- openpi: normalization statistics: We use compute_norm_stats.py to create a norm_stats.json file for each of our datasets! This is especially important since the default pi0_base trossen norm_stats.json seems to be for the older Aloha robot which has different joints.
- openpi: LoRA: If lora fine tuning is used, it seems to be necessary to use the same lora model definition for both train.py and for serve_policy.py, although it is possible we are misunderstanding this. For example, if model=pi0_config.Pi0Config(paligemma_variant="gemma_2b_lora") is used in TrainConfig() (in config.py) for training, we find it necessary to have serve_policy.py use the same TrainConfig() for policy rollouts.
- openpi: joint_flip_mask: In aloha_policy.py, some joint angles are multiplied by -1 to make joint directions consistent with those expected by the pi0 policy. For the original Aloha robot, this required the shoulder and elbow joints (numbers 1 and 2) to be flipped. For the newer Trossen AI Stationary we believe that the shoulder still needs flipping, but the elbow does not. Also, we do not use the gripper transform in aloha_policy.py, just joint_flip_mask. See adapt_trossen_to_pi in our version of aloha_policy.py, and see more below.
- openpi: image resize: The pi0 model expects 224x224 images. To resize, openpi uses image_tools.resize_with_pad, so in the trossen_ai example main.py file, we changed the cv2.resize call, which does not pad, to the image_tools version, see more below.
- openpi: sim to real joint calibration: As discussed in the calibration section, below, the real and simulated robot joints are just slightly out of alignment. Hence, for sim to real to work well, we adjusted actions for the right robot arm's joints 1 and 2 using the above calibration: action[7+1]-=0.025 and action[7+2]+=0.025. For base joint 0, we also needed to adjust action[7+0]=1.05*(action[7+0]+0.01). For the base, this angular shift compensates for the shift in location of the sim vs real bases. On the other hand, the small multiplier is a mystery, but works.
- openpi: sim to real home pose: It is important to use the same initial pose for the real and simulated robots. We achieve this using the home_pose variable we added to the (deprecated) TrossenAIStationaryRobotConfig. The value needed for our sim to real experiments is home_pose=[0, np.pi/12, np.pi/12, 0, 0, 0, 0.044].
- openpi: multiple task prompts: To train a dataset with individual task prompts for each episode, such as in the dataset trossen_ai_stationary_pick_and_place_07, below, a couple of additional lines are needed in TrainConfig, as shown in Training Details, where the line base_config= ..., and the line "prompt": ... need to be uncommented.
- openpi: single arm tasks: In datasets which use only one of the Trossen AI Stationary arms, the state and action standard deviations for the unused arm can be very small or zero in norm_stats.json. This causes the normalized states and actions to blow up leading to huge losses during training and extreme sensitivity to noise/vibration during robot rollouts. pi0 divides by the standard deviation to normalize, so our solution is to replace the left arm standard deviations in the norm_stats.json file with 0.01. pi05, on the other hand, divides by (q99-q01) where q01 and q99 are quantiles, hence we set q01=mean-0.5 and q99=mean+0.5 for the left arm actions and states.
- openpi: image crop: The image transform used by openpi takes the 640x480 RealSense camera image and transforms it using 'resize with padding' to 224x224, which is the size expected by PaliGemma. Unfortunately, this pads the images with black space, and hence wastes image tokens. We added CenterCropImages in transforms.py and added a few lines to the TrainConfig in config.py. CenterCropImages performs a square crop so there is no padding. Also, by cropping only the image center, it effectively zooms in for higher resolution.
- openpi: policy improvement using human interventions: This DAgger-like approach has produced the most improvement in performance among the approaches tried here, such as enlarging the imitation learning dataset.
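For the max joint angle change item above, the clipping idea can be sketched as follows. This is a minimal sketch, assuming the goal is clipped relative to the present joint positions; the function name clip_relative_target is hypothetical and this is not the lerobot implementation itself.

```python
from typing import List


def clip_relative_target(
    goal: List[float], present: List[float], max_relative_target: float
) -> List[float]:
    """Limit each joint's commanded change to +/- max_relative_target (radians)."""
    clipped = []
    for g, p in zip(goal, present):
        # Clamp the requested change, then re-apply it to the present position.
        delta = max(-max_relative_target, min(max_relative_target, g - p))
        clipped.append(p + delta)
    return clipped
```

With max_relative_target=0.1, a policy output that asks a joint to jump by 0.5 rad in one step is reduced to a 0.1 rad step, while small requested changes pass through untouched.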
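For the sub-task dataset item above, the core slicing step can be sketched like this. It only illustrates the idea of cutting one full-task episode into prompted sub-task episodes from frame ranges; the SubTask and split_episode names are hypothetical, frames are treated as an opaque sequence, and the real dataset_splitter.py works on lerobot datasets rather than plain lists.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Sequence


@dataclass
class SubTask:
    start: int   # first frame index of the sub-task (inclusive)
    end: int     # frame index just past the sub-task (exclusive)
    prompt: str  # sub-task prompt, e.g. 'pick up the spoon'


def split_episode(
    frames: Sequence[Any], sub_tasks: Sequence[SubTask]
) -> List[Dict[str, Any]]:
    """Cut one full-task episode into sub-task episodes, each with its own prompt."""
    episodes = []
    for st in sub_tasks:
        if not 0 <= st.start < st.end <= len(frames):
            raise ValueError(f"bad frame range for {st.prompt!r}: {st.start}-{st.end}")
        episodes.append({"task": st.prompt, "frames": list(frames[st.start:st.end])})
    return episodes
```

Using adjacent ranges (the end of one sub-task equal to the start of the next) keeps the sub-tasks flowing into one another, since each new sub-task episode starts from the pose where the previous one ended.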
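For the joint_flip_mask item above, the difference between the two robots can be sketched as follows. This assumes a 14-value layout with 7 values per arm (6 joints plus gripper); the constant and function names are hypothetical, and the Trossen mask encodes our belief that only the shoulder needs flipping.

```python
from typing import List, Sequence

# Per-arm layout assumed: joints 0-5 followed by the gripper.
ALOHA_FLIP_MASK = [1, -1, -1, 1, 1, 1, 1] * 2    # shoulder and elbow flipped
TROSSEN_FLIP_MASK = [1, -1, 1, 1, 1, 1, 1] * 2   # shoulder only (our belief)


def apply_flip_mask(joints: Sequence[float], mask: Sequence[int]) -> List[float]:
    """Multiply each joint value by +1 or -1 to match the policy's sign convention."""
    if len(joints) != len(mask):
        raise ValueError("joints and mask must have the same length")
    return [j * m for j, m in zip(joints, mask)]
```

Because every mask entry is +1 or -1, applying the same mask twice returns the original values, so one mask can serve for both robot-to-policy states and policy-to-robot actions.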
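For the sim to real joint calibration item above, the three adjustments can be collected into one small function. This is a sketch using the offsets quoted in that item; the function name and the RIGHT_ARM_OFFSET constant are hypothetical, and the action is assumed to be a 14-value list with the right arm starting at index 7.

```python
from typing import List, Sequence

RIGHT_ARM_OFFSET = 7  # right-arm joints assumed to start at index 7


def calibrate_sim_to_real(action: Sequence[float]) -> List[float]:
    """Apply the right-arm sim-to-real corrections; the input is not modified."""
    out = list(action)
    # Base joint 0: angular shift plus the small (unexplained) multiplier.
    out[RIGHT_ARM_OFFSET + 0] = 1.05 * (out[RIGHT_ARM_OFFSET + 0] + 0.01)
    # Joints 1 and 2: constant offsets from the calibration.
    out[RIGHT_ARM_OFFSET + 1] -= 0.025
    out[RIGHT_ARM_OFFSET + 2] += 0.025
    return out
```

Returning a copy keeps the raw policy output available for logging, which helps when comparing simulated and real replays.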
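For the single arm tasks item above, the edit to the normalization statistics can be sketched as follows. The layout is an assumption: the function takes a dict with "state" and "actions" entries, each holding "mean" and "std" lists and optionally "q01" and "q99", and the unused left arm is assumed to occupy indices 0-6. The function name is hypothetical, so check the layout against your own norm_stats.json before using anything like this.

```python
import copy
from typing import Any, Dict, Iterable

LEFT_ARM = range(0, 7)  # assumed indices of the unused (left) arm


def patch_unused_arm(
    stats: Dict[str, Any],
    indices: Iterable[int] = LEFT_ARM,
    std_value: float = 0.01,
    half_range: float = 0.5,
) -> Dict[str, Any]:
    """Return a copy of the stats with safe std and quantiles for the unused arm."""
    patched = copy.deepcopy(stats)
    idx = list(indices)
    for key in ("state", "actions"):
        entry = patched[key]
        has_quantiles = entry.get("q01") is not None and entry.get("q99") is not None
        for i in idx:
            # pi0 divides by std, so a near-zero std must be replaced.
            entry["std"][i] = std_value
            if has_quantiles:
                # pi05 divides by (q99 - q01), so give it a unit-width range.
                entry["q01"][i] = entry["mean"][i] - half_range
                entry["q99"][i] = entry["mean"][i] + half_range
    return patched
```

The patched dict can then be written back to norm_stats.json before training, and the same file should be used when serving the policy.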
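For the image crop item above, the cost of padding can be checked with a little arithmetic. The two functions below are hypothetical helpers that only compute geometry (they do not touch image data), and round stands in for whatever rounding the real resize code applies.

```python
from typing import Tuple


def resize_with_pad_geometry(
    width: int, height: int, target: int = 224
) -> Tuple[int, int, float]:
    """Size of the image after aspect-preserving resize, and the padded fraction."""
    scale = min(target / width, target / height)
    new_w, new_h = round(width * scale), round(height * scale)
    pad_fraction = 1.0 - (new_w * new_h) / (target * target)
    return new_w, new_h, pad_fraction


def center_crop_box(width: int, height: int) -> Tuple[int, int, int, int]:
    """(left, top, right, bottom) of the largest centered square crop."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side
```

For a 640x480 frame, the padded resize yields a 224x168 image, so 25% of the 224x224 input is black padding. The center crop instead keeps the 480x480 region from x=80 to x=560, which then maps onto the full 224x224 input at about 1.33 times the linear resolution, at the price of discarding the left and right edges of the view.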
4. Datasets
We have been acquiring and uploading -- to huggingface -- both real robot and simulated robot datasets. The real robot datasets were acquired using the lerobot control_robot.py with the record option. The simulated datasets were acquired using our control_sim_robot.py with the record option. These datasets can be visualized using lerobot's visualize_dataset.py or online at lerobot/visualize_dataset. See the anredlich/lerobot readme for more details. Datasets have 50-500 episodes. Here are the dataset repo_ids:
Real robot:
- ANRedlich/trossen_ai_stationary_transfer_20mm_cube_01
see video on home page
- ANRedlich/trossen_ai_stationary_transfer_40mm_cube_02
- ANRedlich/trossen_ai_stationary_transfer_multi_cube_03
- ANRedlich/trossen_ai_stationary_place_lids_04
- ANRedlich/trossen_ai_stationary_pour_box_05
see video on home page
- ANRedlich/trossen_ai_stationary_pop_lid_06
see video on home page
- ANRedlich/trossen_ai_stationary_pick_and_place_07
multi-task: e.g. 'pick up red cube and place in silver pan'
- ANRedlich/trossen_ai_stationary_pick_and_place_08
multiple sub-tasks that flow into each other without going home first (250 episodes, ~30min total)
- ANRedlich/trossen_ai_stationary_pick_and_place_09
larger version of 08 (500 episodes, ~60min total)
- ANRedlich/trossen_ai_stationary_place_bead_on_string_10
high dexterity, small objects (50 episodes)
- ANRedlich/trossen_ai_stationary_place_lids_13
rollouts from place_lids_04 policy with human intervention at failure
- ANRedlich/trossen_ai_stationary_place_bead_on_string_14
one iteration of human interventions at failure added to _10 (100 episodes)
- ANRedlich/trossen_ai_stationary_place_bead_on_string_15
second iteration of human interventions at failure added to _14 (150 episodes)
- ANRedlich/trossen_ai_stationary_close_tie_wrap_16
high dexterity (50 episodes)
- ANRedlich/trossen_ai_stationary_close_tie_wrap_17
one iteration of human interventions at failure added to _16 (100 episodes)
Simulated robot:
- ANRedlich/trossen_ai_stationary_sim_transfer_40mm_cube_07
cube_color=red, size=40mm, tabletop=black, background=none, lighting=bright
- ANRedlich/trossen_ai_stationary_sim_transfer_40mm_cube_08
cube_color=dark red, size=40mm, tabletop=mine, background=mine, lighting=medium
- ANRedlich/trossen_ai_stationary_sim_transfer_40mm_cube_10
cube_color=r,g,b, size=25,40mm, tabletop=mine, background=none, lighting=bright
see video below
- ANRedlich/trossen_ai_stationary_sim_transfer_40mm_cube_13
cube_color=red, tabletop=mine, background=mine, lighting=medium
Note: tabletop=mine is an image of my tabletop, and background=mine is a crude set of images of my office walls.
5. Models
We have been acquiring and uploading -- to huggingface -- learned models/policies for both the real and simulated robot datasets. So far, these are ACT models used as a baseline, with chunk_size=100, trained for 100K steps. Both the real and simulated models can be tested in the simulator using lerobot eval.py, or for individual episodes, using our evaluate_trossen_ai_stationary_policy.py. See our lerobot readme for more details. Here are the huggingface policy paths:
Real robot ACT models:
- ANRedlich/trossen_ai_stationary_real_act2_3
best real to sim, try in evaluate_pretrained_trossen_ai_policy.py, still only about 20% correct!
- ANRedlich/trossen_ai_stationary_real_act5
see video on home page
- ANRedlich/trossen_ai_stationary_real_act6
see video on home page
Simulated robot ACT models:
- ANRedlich/trossen_ai_stationary_sim_act7
- ANRedlich/trossen_ai_stationary_sim_act8
- ANRedlich/trossen_ai_stationary_sim_act10
see video below
- ANRedlich/trossen_ai_stationary_sim_act13
best sim to real policy, but still very sensitive to conditions
Real robot pi0 models:
- ANRedlich/trossen_ai_stationary_real_pi03
LoRA fine-tuned from pi0_base, see High Dexterity Fig 5.
- ANRedlich/trossen_ai_stationary_real_pi04
Full fine-tuned from pi0_base, see High Dexterity Figs 6b-d and 7.
Simulated robot pi0 models:
- ANRedlich/trossen_ai_stationary_sim_pi013
LoRA fine tuned from pi0_base, see Figs 5-7.
note: openpi doesn't support huggingface models directly, see how to use. Also, see pi0 training/running details.