Search | arXiv e-print repository

OGMP: Oracle Guided Multimodal Policies for Agile and Versatile Robot Control

Authors: Lokesh Krishna, Nikhil Sobanbabu, Quan Nguyen

Abstract: The efficacy of model-free learning for robot control relies on the tailored integration of task-specific priors and heuristics, hence calling for a unified approach. In this paper, we define a general class for priors called oracles and propose bounding the permissible state around the oracle's ansatz, resulting in task-agnostic oracle-guided policy optimization. Additionally, to enhance modularity, we introduce the notion of task-vital modes. A policy mastering a compact set of modes and intermediate transitions can then solve perpetual tasks. The proposed approach is validated in challenging biped control tasks: parkour and diving on a 16-DoF dynamic bipedal robot, Hector. OGMP results in a single policy per task, solving indefinite parkour over diverse tracks and omnidirectional diving from varied heights, exhibiting versatile agility. Finally, we introduce a novel latent mode space reachability analysis to study our policy's mode generalization by computing a feasible mode set function through which we certify a set of failure-free modes for our policy to perform at any given state.

Submitted 14 June, 2024; v1 submitted 6 March, 2024; originally announced March 2024.

Comments: 12 pages, 12 figures

arXiv:2303.05711

Learning Multimodal Bipedal Locomotion and Implicit Transitions: A Versatile Policy Approach

Authors: Lokesh Krishna, Quan Nguyen

Abstract: In this paper, we propose a novel framework for synthesizing a single multimodal control policy capable of generating diverse behaviors (or modes) and emergent inherent transition maneuvers for bipedal locomotion. In our method, we first learn efficient latent encodings for each behavior by training an autoencoder from a dataset of rough reference motions. These latent encodings are used as commands to train a multimodal policy through an adaptive sampling of modes and transitions to ensure consistent performance across different behaviors. We validate the policy performance in simulation for various distinct locomotion modes such as walking, leaping, jumping on a block, standing idle, and all possible combinations of inter-mode transitions. Finally, we integrate a task-based planner to rapidly generate open-loop mode plans for the trained multimodal policy to solve high-level tasks like reaching a goal position on a challenging terrain. Complex parkour-like motions by smoothly combining the discrete locomotion modes were generated in 3 min. to traverse tracks with a gap of width 0.45 m, a plateau of height 0.2 m, and a block of height 0.4 m, which are all significant compared to the dimensions of our mini-biped platform.

Submitted 11 August, 2023; v1 submitted 10 March, 2023; originally announced March 2023.

Comments: 8 pages, 9 figures

arXiv:2109.12665

Linear Policies are Sufficient to Realize Robust Bipedal Walking on Challenging Terrains

Authors: Lokesh Krishna, Guillermo A. Castillo, Utkarsh A. Mishra, Ayonga Hereid, Shishir Kolathaya

Abstract: In this work, we demonstrate robust walking in the bipedal robot Digit on uneven terrains by just learning a single linear policy. In particular, we propose a new control pipeline, wherein the high-level trajectory modulator shapes the end-foot ellipsoidal trajectories, and the low-level gait controller regulates the torso and ankle orientation. The foot-trajectory modulator uses a linear policy and the regulator uses a linear PD control law. As opposed to neural network-based policies, the proposed linear policy has only 13 learnable parameters, thereby not only guaranteeing sample efficient learning but also enabling simplicity and interpretability of the policy. This is achieved with no loss of performance on challenging terrains like slopes, stairs and outdoor landscapes. We first demonstrate robust walking in the custom simulation environment, MuJoCo, and then directly transfer to hardware with no modification of the control pipeline. We subject the biped to a series of pushes and terrain height changes, both indoors and outdoors, thereby validating the presented work.

Submitted 5 October, 2021; v1 submitted 26 September, 2021; originally announced September 2021.

Comments: 8 pages, 10 figures

arXiv:2104.01662

Learning Linear Policies for Robust Bipedal Locomotion on Terrains with Varying Slopes

Authors: Lokesh Krishna, Utkarsh A. Mishra, Guillermo A. Castillo, Ayonga Hereid, Shishir Kolathaya

Abstract: In this paper, with a view toward deployment of light-weight control frameworks for bipedal walking robots, we realize end-foot trajectories that are shaped by a single linear feedback policy. We learn this policy via a model-free and a gradient-free learning algorithm, Augmented Random Search (ARS), in the two robot platforms Rabbit and Digit. Our contributions are two-fold: a) By using torso and support plane orientation as inputs, we achieve robust walking on slopes of up to 20 degrees in simulation. b) We demonstrate additional behaviors like walking backwards, stepping-in-place, and recovery from external pushes of up to 120 N. The end result is a robust and a fast feedback control law for bipedal walking on terrains with varying slopes. Towards the end, we also provide preliminary results of hardware transfer to Digit.

Submitted 9 August, 2021; v1 submitted 4 April, 2021; originally announced April 2021.

Comments: 6 pages, 5 figures, Accepted in 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2021) in Prague, Czech Republic

arXiv:2010.16342

Robust Quadrupedal Locomotion on Sloped Terrains: A Linear Policy Approach

Authors: Kartik Paigwar, Lokesh Krishna, Sashank Tirumala, Naman Khetan, Aditya Sagi, Ashish Joglekar, Shalabh Bhatnagar, Ashitava Ghosal, Bharadwaj Amrutur, Shishir Kolathaya

Abstract: In this paper, with a view toward fast deployment of locomotion gaits in low-cost hardware, we use a linear policy for realizing end-foot trajectories in the quadruped robot, Stoch $2$. In particular, the parameters of the end-foot trajectories are shaped via a linear feedback policy that takes the torso orientation and the terrain slope as inputs. The corresponding desired joint angles are obtained via an inverse kinematics solver and tracked via a PID control law. Augmented Random Search, a model-free and a gradient-free learning algorithm is used to train this linear policy. Simulation results show that the resulting walking is robust to terrain slope variations and external pushes. This methodology is not only computationally light-weight but also uses minimal sensing and actuation capabilities in the robot, thereby justifying the approach.

Submitted 10 November, 2020; v1 submitted 30 October, 2020; originally announced October 2020.

Comments: Accepted in 4th Conference on Robot Learning 2020, MIT, USA

arXiv:2008.07030

Training CNN Classifiers for Semantic Segmentation using Partially Annotated Images: with Application on Human Thigh and Calf MRI

Authors: Chun Kit Wong, Stephanie Marchesseau, Maria Kalimeri, Tiang Siew Yap, Serena S. H. Teo, Lingaraj Krishna, Alfredo Franco-Obregón, Stacey K. H. Tay, Chin Meng Khoo, Philip T. H. Lee, Melvin K. S. Leow, John J. Totman, Mary C. Stephenson

Abstract: Objective: Medical image datasets with pixel-level labels tend to have a limited number of organ or tissue label classes annotated, even when the images have wide anatomical coverage. With supervised learning, multiple classifiers are usually needed given these partially annotated datasets. In this work, we propose a set of strategies to train one single classifier in segmenting all label classes that are heterogeneously annotated across multiple datasets without moving into semi-supervised learning. Methods: Masks were first created from each label image through a process we termed presence masking. Three presence masking modes were evaluated, differing mainly in weightage assigned to the annotated and unannotated classes. These masks were then applied to the loss function during training to remove the influence of unannotated classes. Results: Evaluation against publicly available CT datasets shows that presence masking is a viable method for training class-generic classifiers. Our class-generic classifier can perform as well as multiple class-specific classifiers combined, while the training duration is similar to that required for one class-specific classifier. Furthermore, the class-generic classifier can outperform the class-specific classifiers when trained on smaller datasets. Finally, consistent results are observed from evaluations against human thigh and calf MRI datasets collected in-house. Conclusion: The evaluation outcomes show that presence masking is capable of significantly improving both training and inference efficiency across imaging modalities and anatomical regions. Improved performance may even be observed on small datasets. Significance: Presence masking strategies can reduce the computational resources and costs involved in manual medical image annotations. All codes are publicly available at https://github.com/wong-ck/DeepSegment.

Submitted 16 August, 2020; originally announced August 2020.

Comments: Submitted to IEEE Transactions on Medical Imaging (Special Issue on Annotation-Efficient Deep Learning for Medical Imaging)

Showing 1–6 of 6 results for author: Krishna, L