5.6 Training and Evaluation
Ready to teach your KikoBot C1 how to move intelligently? It’s time to train a policy and evaluate its performance! This guide walks you through the essential steps to train and evaluate a policy for your robot using the dataset you recorded earlier. Whether you're training on a powerful GPU or your local machine, this process will help you develop a control policy that drives your robot's movements with precision. Let’s dive in!
Step 1: Training the Policy
Training a policy is the first step to enable your KikoBot C1 to make decisions based on the environment and the data it has learned. You can use the following command to begin the training process:
```shell
python lerobot/scripts/train.py \
  dataset_repo_id=${HF_USER}/kikobot_test \
  policy=act_kikobot_real \
  env=kikobot_real \
  hydra.run.dir=outputs/train/act_kikobot_test \
  hydra.job.name=act_kikobot_test \
  device=cuda \
  wandb.enable=true
```
Let’s break it down:

- `dataset_repo_id=${HF_USER}/kikobot_test`: Here, you’re specifying the dataset you uploaded earlier. It contains all the robot’s movements and positions that the model will learn from.
- `policy=act_kikobot_real`: This is the policy that controls the robot’s actions. It loads its configuration from a YAML file (e.g., `lerobot/configs/policy/act_kikobot_real.yaml`). The policy defines how the robot should react to various situations, and it specifies that 2 cameras (like your laptop and phone) will be used as input, giving the robot a visual understanding of its environment.
- `env=kikobot_real`: This is the environment configuration. It tells the model what kind of environment the robot will operate in, such as the real-world setup where it uses its sensors and cameras to interact with its surroundings.
- `hydra.run.dir=outputs/train/act_kikobot_test`: This specifies the output directory where the training results and logs will be saved, including any model checkpoints.
- `hydra.job.name=act_kikobot_test`: Here, we name the job to keep track of this specific training run.
- `device=cuda`: If you’re training on an Nvidia GPU, the `cuda` flag enables GPU acceleration, making the training process much faster. For Mac users with Apple silicon, use `device=mps`, and if you don’t have a GPU, use `device=cpu`.
- `wandb.enable=true`: Weights and Biases (Wandb) is a tool that helps you visualize the training process, track progress, and compare different models. By enabling Wandb, you'll get a real-time dashboard showing training metrics such as loss. Don’t forget to log in to Wandb using `wandb login` before using it.

Training Duration: Training can take several hours depending on the complexity of the task and the power of your machine. Be patient! After training, you will find the model checkpoints saved in the specified output directory (e.g., `outputs/train/act_kikobot_test/checkpoints`), where you can resume or evaluate your policy.
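To make the hand-off from training to evaluation explicit, here is a tiny shell sketch that builds the checkpoint path you will pass to the evaluation command. The directory layout is an assumption based on the `hydra.run.dir` value used above; adjust it if your run directory differs.

```shell
# Checkpoint layout implied by hydra.run.dir above (an assumption:
# adjust RUN_DIR if you changed the output directory).
RUN_DIR=outputs/train/act_kikobot_test
CKPT="${RUN_DIR}/checkpoints/last/pretrained_model"

# This is the value you will hand to the evaluation script's -p flag.
echo "Evaluate with: -p ${CKPT}"
```

The `last` checkpoint is the most recent one saved during training, which is usually what you want to evaluate first.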
Step 2: Evaluating the Policy
Once your policy is trained, it's time to evaluate how well it performs. Evaluation helps you understand how the model applies its learned knowledge in the real world. Here’s how you can evaluate your trained policy on the KikoBot C1.
Run this command to record 10 evaluation episodes based on the policy you've just trained:
```shell
python lerobot/scripts/control_robot.py record \
  --robot-path lerobot/configs/robot/kikobot.yaml \
  --fps 30 \
  --repo-id ${HF_USER}/eval_kikobot_test \
  --tags kikobot tutorial eval \
  --warmup-time-s 5 \
  --episode-time-s 40 \
  --reset-time-s 10 \
  --num-episodes 10 \
  -p outputs/train/act_kikobot_test/checkpoints/last/pretrained_model
```
Breaking it down:

- `--robot-path lerobot/configs/robot/kikobot.yaml`: This is the configuration file that tells the system which robot you're using, in this case the KikoBot C1. It ensures the system knows how to interact with the robot.
- `--fps 30`: Specifies the frames per second for the evaluation. It dictates how fast the system records data during evaluation.
- `--repo-id ${HF_USER}/eval_kikobot_test`: This points to the repository where the evaluation dataset will be saved. It helps you keep evaluation results separate from the training data.
- `--tags kikobot tutorial eval`: Tags help organize your dataset by categorizing it as a tutorial or evaluation run, making it easier to find later.
- `--warmup-time-s 5`: A 5-second warmup period before recording starts, ensuring the system is ready to record proper data.
- `--episode-time-s 40`: Each evaluation episode runs for 40 seconds.
- `--reset-time-s 10`: After each episode, there is a 10-second pause, giving you a brief moment to reset the environment to its starting state.
- `--num-episodes 10`: Record 10 evaluation episodes to see how well the model performs under different conditions.
- `-p outputs/train/act_kikobot_test/checkpoints/last/pretrained_model`: This flag loads the pretrained model checkpoint from your training run and uses it to drive the robot’s actions during the evaluation episodes. Alternatively, if you uploaded the model to the Hugging Face Hub, you can reference it by repo ID (e.g., `-p ${HF_USER}/act_kikobot_test`).
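Before you start recording, it can help to estimate how long the session will take and how much data it produces from the flags above. A quick shell sketch (pure arithmetic, nothing robot-specific):

```shell
# Estimate session length and per-camera frame counts from the
# recording flags used above.
WARMUP=5       # --warmup-time-s
EPISODE=40     # --episode-time-s
RESET=10       # --reset-time-s
NUM=10         # --num-episodes
FPS=30         # --fps

TOTAL=$((WARMUP + NUM * (EPISODE + RESET)))
FRAMES=$((EPISODE * FPS))

echo "Total session time: ${TOTAL}s"             # 505s, under 9 minutes
echo "Frames per episode per camera: ${FRAMES}"  # 1200 frames
```

With two cameras, that is 2400 frames per episode, which is worth keeping in mind when planning disk space and upload time.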
Step 3: Viewing Evaluation Results
After you’ve completed the evaluation, you can visualize the results to see how well your KikoBot C1 performed during the evaluation episodes.
If you uploaded the evaluation dataset to the Hugging Face Hub, simply copy your repo ID:
```shell
echo ${HF_USER}/eval_kikobot_test
```
Then, navigate to Hugging Face’s website and paste the repo ID into the search bar to view your evaluation results.
Locally: If you didn't upload the dataset, you can still visualize it locally by running:
```shell
python lerobot/scripts/visualize_dataset_html.py \
  --repo-id ${HF_USER}/eval_kikobot_test
```
This will generate an HTML file with visual representations of the evaluation episodes, allowing you to analyze how well the trained policy controlled the robot in various scenarios.
Step 4: Replay an Evaluation Episode
Want to see how your KikoBot C1 performs in real time based on the evaluated policy? Simply replay one of the evaluation episodes using this command:
```shell
python lerobot/scripts/control_robot.py replay \
  --robot-path lerobot/configs/robot/kikobot.yaml \
  --fps 30 \
  --repo-id ${HF_USER}/eval_kikobot_test \
  --episode 0
```
This command will make your robot replay episode 0 from the evaluation dataset. It’s a great way to test the model’s behavior and see how well it applies its learning in a live setting.
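If you want to review more than one episode, the replay command loops naturally over episode indices. A hedged sketch that prints the command for each of the 10 recorded episodes, so you can inspect them before running anything (drop the `echo`, or pipe the output into `sh`, to actually execute them):

```shell
# Print a replay command for each of the 10 evaluation episodes
# recorded above; remove the echo to execute them directly.
for EP in $(seq 0 9); do
  echo python lerobot/scripts/control_robot.py replay \
    --robot-path lerobot/configs/robot/kikobot.yaml \
    --fps 30 \
    --repo-id "${HF_USER}/eval_kikobot_test" \
    --episode "${EP}"
done
```

Replaying episodes back to back is a quick way to spot whether failures cluster in particular starting conditions.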
By following these steps, you will be able to train, evaluate, and replay the performance of your KikoBot C1 policy. Whether you're looking to improve the robot’s decision-making or test its real-world capabilities, this process provides a powerful framework to develop and fine-tune your robot’s behavior.