License: CC BY 4.0
arXiv:2307.09754v4 [cs.RO] 26 Jan 2024

ProNav: Proprioceptive Traversability Estimation for Legged Robot Navigation in Outdoor Environments

Mohamed Elnoor, Adarsh Jagan Sathyamoorthy, Kasun Weerakoon, Dinesh Manocha
Technical report and video can be found at http://gamma.umd.edu/pronav/
Abstract

We propose a novel method, ProNav, which uses proprioceptive signals for traversability estimation in challenging outdoor terrains for autonomous legged robot navigation. Our approach uses sensor data from a legged robot’s joint encoders, force sensors, and current sensors to measure the joint positions, forces, and current consumption, respectively, and from these to assess a terrain’s stability, its resistance to the robot’s motion, and the risk of entrapment and of a crash. Based on these factors, we compute the appropriate robot gait to maximize stability, which leads to reduced energy consumption. Our approach can also be used to predict imminent crashes in challenging terrains and execute behaviors to preemptively avoid them. We integrate ProNav with an exteroceptive-based method to navigate real-world environments with dense vegetation, high granularity, negative obstacles, etc. Our method shows an improvement of up to 40% in success rate and a reduction of up to 15.1% in energy consumption compared to exteroceptive-based methods.

I Introduction

In recent years, autonomous legged robots have found applications in areas such as surveillance and monitoring [1], exploration [2], and search and rescue [3]. The key advantage that enables such applications is their superior ability to traverse complex terrains that are inaccessible to wheeled and tracked robots.

It is important to develop autonomous methods for navigation in complex terrains, which can be broken down into three major categories: uneven/rocky outdoor terrains, dense vegetation, and granular terrains like sand and mud. The uneven or rocky terrains challenge the robot’s stability as they often lack solid footholds with sudden variations in elevation [4]. Dense vegetation introduces another layer of complexity, presenting risks of entanglement in branches, dried grass, or bushes [5, 6], leading to unstable behaviors such as slipping and tripping. The third category, granular terrains, often leads to the robot’s legs sinking into surfaces like sand or mud due to their deformability under the robot’s weight [7]. Each of these terrain types presents unique difficulties for legged robots, which can affect their navigational capabilities.

Figure 1: Comparison of our method ProNav with other methods navigating a Spot robot through dense vegetation: ProNav adapts between two gaits: trot (in red), and amble (in green), Spot’s in-built planner (black), GA-Nav [8] (trot: yellow, crawl: brown), RFC[9] (trot: light blue and amble: dark blue), and VERN [5] (purple). In this scenario, we observe that our method successfully traverses the dense vegetation due to its efficient gait adaptation and accurate proprioception-based traversability estimation.

To tackle these challenges, the robot must be able to accurately evaluate a terrain’s traversability (a measure of the ease of navigation) and then plan its trajectories. Existing methods typically utilize exteroceptive modalities (RGB images, lidar point clouds, and scans) [10, 11, 12, 13] for traversability estimation. Such exteroceptive methods can provide valuable information about the terrain before walking over it. However, these methods experience degradation in perception accuracy in environments with high occlusions, poor illumination, scarce features, etc. For instance, the terrain geometry could be occluded by dense vegetation. Moreover, certain entities (e.g. negative obstacles such as ditches, and potholes) and changes in a terrain’s properties (dry sand versus wet sand) cannot be accurately detected by exteroceptive modalities.

To overcome these limitations, several methods have fused exteroception with proprioception to evaluate a terrain’s traversability [14, 15]. Proprioception measures the robot’s own state, such as joint and body positions and force feedback [16], while exteroceptive sensing measures the state of the environment using sensors such as cameras and LiDAR. Although proprioception cannot provide a look-ahead for the terrain, it more accurately represents the robot’s stability on a terrain since unstable walking behaviors are reflected by significant changes in joint positions, by the forces experienced at certain joints, and by high energy consumption. Existing research on proprioceptive traversability analysis has predominantly focused on environments where the robot encounters slippage [17, 18, 19], and has not handled regions where the robot’s legs could get entangled (e.g. in dense vegetation).

In addition, certain terrains such as concrete and asphalt can be traversed using a single “best” gait. However, this does not apply to all terrains. For example, a grassy terrain may appear uniform but can vary significantly, transitioning from dry to muddy areas with similar visual appearances. Rocky terrain presents comparable challenges, as shown in Figure 2. These situations indicate that a legged robot must adapt its gait based on proprioceptive feedback instead of relying only on visual sensing.

Figure 2: Images (a)-(c) depict the RGB images captured sequentially from the robot’s camera. (d) Plot of the fluctuations in knee force readings experienced by the robot while traversing the terrain. The high fluctuations represent instances when the robot became unstable. This shows that visually identical terrains could have different stability characteristics.

Main Contributions: To address these limitations, we propose ProNav, an approach for using proprioception for improved terrain traversability estimation in a variety of environments (rocky, granular, densely vegetated, etc). The proprioceptive signals are measured from a legged robot’s joint encoders, force, and current sensors. The novel components of our work include:

  • A novel terrain traversability estimation method using only proprioceptive signals (joint positions, forces, current consumption) to characterize the stability and the resistance to the robot’s motion on a terrain. Our method uses the aforementioned signals to estimate traversability using Principal Component Analysis (PCA) within 1 second of walking on a new terrain type using edge computing hardware with limited computation power.

  • A novel crash prediction mechanism that can foresee slipping, tripping, and leg entrapment-related crashes. This leads to an improvement of 40% in terms of success rate in densely vegetated regions where all other methods experienced difficulties in reaching the goal.

  • A novel gait adaptation approach that selects the appropriate gait leading to increased stability (lower vibrations), and lower energy consumption while traversing challenging terrains. We highlight ProNav’s performance by integrating it with an exteroception-based navigation method for traversing through dense vegetation, and rocky and granular terrains.

II Related Works

In this section, we discuss the existing methods for estimating terrain traversability. Next, we analyze the existing navigation and planning techniques for legged robots.

II-A Perception for Navigation

Autonomous robot navigation in challenging environments requires robots to perceive the real world through their sensors. To this end, robots often incorporate onboard exteroceptive, and proprioceptive sensors. We briefly review the existing work on exteroceptive and proprioceptive perception in the following sub-sections.

II-A1 Exteroceptive Sensors

A popular approach is the use of geometry-based methods which reconstruct a 3D representation of the environment by using technologies such as LiDAR or stereo cameras [12]. Another approach, as presented in [13], generates a 3D triangle mesh of the environment from a 3D point cloud, which is then input into an online path planner for local navigation. Recently, [20] proposed learning terrain traversability by training a sparse 3D network of occupancy maps. However, these geometry-based methods have limitations, including difficulties with deformable surfaces such as sand, obstacles like tall grass, and the risk of poor estimation [12, 2].

Concurrently, vision-based approaches have seen widespread application in robot perception [10, 11]. Previous work in semantic segmentation categorizes terrain properties into traversable and non-traversable classes. For instance, Guan et al. [8] leverage a multi-head vision transformer architecture to segregate the terrain into six distinct categories. Traversability classification can also be performed using anomaly detection from multi-modal images [21]. Even though such vision-based systems perform well under ideal weather conditions, they often produce erroneous classifications when the lighting changes [22].

Several studies have also explored the potential of sensor fusion for terrain classification [23, 24, 25]. Notably, in [23], geometric and vision-based techniques are used to deliver improved performance. In [25], reliability-aware sensor fusion is performed to mitigate the performance degradation due to cluttered sensing. Recently, [5] proposed VERN, which utilizes a lightweight Siamese network to classify complex outdoor vegetation based on traversability. The method in [26] employs IMU sensor data to learn surface traction, bumpiness, and deformability using an online self-supervised learning strategy. While this approach has shown promising results for a number of terrains, others like rocks and bushes, with irregular texture/structure, were not investigated.

II-A2 Proprioceptive sensors

In outdoor environments, exteroceptive sensors could receive noisy data because of factors such as degraded lighting conditions and occlusions. Also, the environment can be extreme and challenging. For instance, the ground could be covered by vegetation (e.g., short/tall grass, bushes) and the robot cannot recognize the terrain type using vision or LiDAR. To overcome such issues, there has been a continuous development in proprioceptive perception [27]. Moreover, proprioception can be coupled with vision in legged robots as in [14], where Fu et al. use the camera to create a cost map around the robot, while the terrain traversability is mainly evaluated based on proprioceptive feedback. That also helps in avoiding unexpected obstacles such as glass walls. [28] proposed a cross-modal algorithm that uses an RGB camera and shifted proprioception to learn a walking locomotion policy. More recently, Dey et al. [29] leverage the proprioceptive information from a legged robot’s joints to predict slip and fall events with high accuracy. However, the robot is operated in a limited number of terrains such as rubble and other uneven, underground terrains, and not in densely vegetated environments. Moreover, their proposed model primarily predicted slipping and tripping and it is not used for navigation. Our novel approach uses proprioceptive feedback and current consumption from the actuators to also detect entanglement in dense vegetation and recover the robot.

II-B Outdoor Navigation

Recently, many approaches have been proposed to leverage the agile mobility of legged robots [30, 31] in unstructured outdoor environments, which are challenging for wheeled robots [32]. Some of these works use cost maps to represent the traversability of the environments [33]. Semantic belief graphs are utilized in [34] to train a policy for trajectory generation in extreme environments. Moreover, a traversability uncertainty-based method is proposed in [35]. In [36], the authors presented a traversability estimator that uses a classifier (or a regressor) neural network based on elevation maps. Artplanner [31] is a navigation planner designed for the DARPA Subterranean Challenge that uses geometric reachability checking and a motion cost neural network to compute optimal paths. Proprioceptive feedback is also used in the literature [6, 14, 29]. In [6], Lee et al. utilized proprioceptive feedback to train a robot controller using reinforcement learning. Their approach shows zero-shot capabilities when tested in outdoor settings. However, an inherent limitation of proprioception is its inability to preview terrain features before the robot directly interacts with them. This limitation motivates the integration of ProNav with an exteroceptive-based navigation method, ensuring a more comprehensive navigation strategy.

III Background

In this section, we explain our assumptions, define the notation used, and state our problem formulation.

III-A Setup and Conventions

We assume a quadrupedal robot with 12 degrees of freedom (DOF), with 2-DOFs in the hip, and 1-DOF in the knee of each leg. We assign numbers 1, 2, 3, and 4 to denote the front-left, front-right, rear-left, and rear-right legs respectively, and $i$ to denote each leg. A robot coordinate frame is established at its center of mass with positive X, Y, Z pointing forward, left, and up respectively. Frames with similar conventions are established at each hip and knee joint. The hip has two actuators, one is moving along the X-axis direction and the other one along the Y-axis direction. Moreover, the knee actuator moves along the Z direction. We also measure the positions $p^{hip,i}_{x}, p^{hip,i}_{y}$, $p^{knee,i}_{z}$, velocity $v^{hip,i}_{x}, v^{hip,i}_{y}$, $v^{knee,i}_{z}$, and force $f^{hip,i}_{x}, f^{hip,i}_{y}, f^{knee,i}_{z}$ exerted at a time instant $t$. Several widely used legged robot platforms possess these specifications and capability to measure these parameters [37, 38, 39].

Position and velocity data at the joints are measured using encoder sensors, and the forces experienced are measured using the internal tactile sensing mechanism. Finally, we assume that the current drawn ($I(t)$) from the robot’s battery can be measured using an ammeter or a current sensor while traversing various terrains. We define $\mathbf{X}_{t} \in \mathbb{R}^{36}$ as the set of all positions (3), velocities (3), and forces (3) obtained from all four legs of the robot. Based on our setup and notation, we have formulated the state vector at a given time instant $t$ for our traversability estimation method as,

$$\text{State Vector} = \left[\mathbf{X}_{t} \in \mathbb{R}^{36},\; I(t) \in \mathbb{R}\right]. \tag{1}$$
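As an illustration of the state vector in equation (1), the following minimal sketch stacks the nine per-leg readings of all four legs and appends the battery current. The `LegReading` container, the function name, and the ordering of entries inside the 36-dimensional part are assumptions made for this example; the text does not prescribe an ordering.

```python
from dataclasses import dataclass
from typing import List, Tuple

Triple = Tuple[float, float, float]


@dataclass
class LegReading:
    """Hypothetical container for one leg's readings at a time instant t."""
    position: Triple  # (hip X, hip Y, knee Z) joint positions
    velocity: Triple  # (hip X, hip Y, knee Z) joint velocities
    force: Triple     # (hip X, hip Y, knee Z) joint forces


def build_state_vector(legs: List[LegReading], current: float) -> List[float]:
    """Stack 4 legs x 9 readings (36 values) and append the current I(t)."""
    if len(legs) != 4:
        raise ValueError("expected readings for exactly four legs")
    x_t: List[float] = []
    for leg in legs:  # assumed order: legs 1..4, each as position, velocity, force
        x_t.extend(leg.position)
        x_t.extend(leg.velocity)
        x_t.extend(leg.force)
    return x_t + [current]
```

The result has 37 entries: the 36 entries of the joint-level part followed by the scalar current.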

III-B Problem Domain

The focus of our approach is to enhance the navigational capabilities of legged robots traversing through a variety of terrains (e.g. densely vegetated, granular, rocky) using proprioceptive feedback to adapt to changes in surface conditions. In these terrains, the robot’s legs could slip, trip, sink, or get entangled. A crash, which we define as the robot falling to the ground, could be caused by one of the following:

Poor Foothold: This causes the robot’s feet to slip in rocky or slippery terrains because the robot’s feet do not have a firm, flat surface to support themselves on.

Granularity: This causes the robot’s feet to sink into the terrain (e.g. sand, mud, snow) leading to erroneous measurements of joint positions. This could cause the robot’s controller to overcompensate to stabilize itself.

Resistance to Motion: This is typically caused by dense, pliable vegetation that can be passed through (e.g. tall grass and bushes). Additionally, the robot’s legs could get entangled with vegetation causing higher resistance to motion.

Figure 3: (a) Changes in the hip X-axis position of the robot while traversing grass (green box), and rocks (brown box) plotted over time. (b) Force exerted by the four knee actuators while traversing grass (green box), and rocks (brown box). Steady readings observed on stable grass terrain reflect ease of traversal, while the increased volatility and noticeable spikes on the rocky terrain are indicative of increased resistance and slippage, causing variable load on the actuators. (c) Changes in the hip Y-axis position of the robot while traversing dense vegetation (violet box), and concrete (gray box). High fluctuations are observed while traversing dense vegetation due to the legs’ entanglement instances. Conversely, a steady and consistent reading is observed during concrete traversal.

To traverse various terrains, we assume a legged robot with a locomotion model that can alternate between three gaits: trot, crawl, and amble [29, 30]. Trot is the standard walking gait where the robot walks with two of its feet on the ground at a time instance, allowing fast movements. It is stable on hard surfaces, with moderate power consumption. On the other hand, during crawl and amble, the robot has three of its feet on the ground at a time instance, leading to more stable behaviors on uneven, granular, and deformable surfaces. Amble helps to traverse through environments with high resistance to motion while also maintaining stability, which also helps handle poor foothold terrains. Similarly, crawl maintains high stability in granular terrains and regions with poor footholds while consuming minimal power. The maximum velocities for each gait follow the trend $v^{trot}_{max} = v^{amble}_{max} > v^{crawl}_{max}$, and the current consumption for each gait follows $I^{Amble} > I^{Trot} > I^{Crawl}$.
Based on these definitions, our formulation can be stated as follows,

Formulation III.1.

To adaptively select a stable gait $g^{*}$ given collision-free, goal-directed velocities $(v^{*}, \omega^{*})$, by assessing a terrain’s traversability based on a set of proprioceptive signals from a legged robot to improve stability and prevent crashes.

IV ProNav: Proprioception-based Stable Navigation

In this section, we analyze and choose the relevant proprioceptive signals, process them to assess stability, and explain our gait adaptation strategy to stabilize the robot.

IV-A Analysis of Proprioceptive Signals

Our goal is to choose the fewest number of proprioceptive signals (i.e., the minimum subset $\mathbf{Y}_{t} \subset \mathbf{X}_{t}$ at every time instant $t$) that are also excellent indicators of stability. Deducing the minimum subset helps reduce the input dimensionality of our approach, which in turn improves its real-time factor.

Hip’s Position: Our empirical analysis revealed a strong correlation between the amount of slip on a terrain and the change in hip position along the X-axis and Y-axis. Figure 3 visually represents these changes as the robot navigates terrains with different levels of traversability: grass, rocks, dense vegetation, and concrete.

Knee’s force: Sudden peaks in the forces experienced by the robot’s knee actuators (Fig. 3) along the Z direction indicate an absence of stable footholds due to unevenness, causing the robot to exert more effort to stabilize itself.

Current Consumption: The amount of current consumed while traversing various terrains at a consistent elevation is proportional to the resistance to motion experienced in each terrain (Fig. 4). Also, the robot’s gait consistently impacts current consumption on different terrains as mentioned before.

Figure 4: The average current consumption in amperes, with 95% confidence interval as the robot traverses concrete, grass, and sand. Lower current consumption on concrete indicates ease of traversal. However, higher values on sand highlight increased resistance and energy usage. Additionally, a consistent trend in current consumption is exhibited while using crawl, trot, and amble on various terrains.

IV-B Preprocessing Proprioceptive Signals

Our goal in preprocessing the chosen force and position data is to obtain quantities that change drastically on various terrains, thus indicating their properties. Vectors of the processed data are then analyzed using Principal Component Analysis (PCA).

IV-B1 Preprocessing Force Data

At any time instant $t$, we consider the past $n$ samples of $f^{knee,i}_{z}$ of the $i^{th}$ leg. That is, we consider the vector $[f^{knee,i}_{z}(t), f^{knee,i}_{z}(t-1), \ldots, f^{knee,i}_{z}(t-n+1)]$.
Next, we obtain the mean force for the $i^{th}$ leg’s knee as $\mu^{i}_{f} = (\sum_{j=0}^{n-1} f^{knee,i}_{z}(t-j))/n$, and then the mean force experienced by the robot as a whole as $\mu^{rob}_{f} = (\sum_{i=1}^{4} \mu^{i}_{f})/4$. Finally, we calculate the difference $\Delta^{i}_{f} = \mu^{rob}_{f} - f^{knee,i}_{z}(t)$ for each leg.
As the robot walks on various terrains, $\Delta^{i}_{f} < 0$ indicates that the robot has entered a poorly traversable terrain which leads to high knee forces, and $\Delta^{i}_{f} > 0$ indicates a highly traversable terrain. To further amplify changes in traversability, we use $\sum_{i=1}^{4} \Delta^{i}_{f}$, and a counter that denotes the number of spikes in the force experienced, defined as: $count = count + 1$ if $\Delta^{i}_{f} < 0$, $i \in \{1, 2, 3, 4\}$.
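The force preprocessing steps can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name and the window layout (most recent sample last) are assumptions, and because the text does not state when the spike counter is reset, the running count is simply passed in and returned.

```python
from typing import List, Sequence, Tuple


def preprocess_knee_forces(
    windows: Sequence[Sequence[float]], count: int = 0
) -> Tuple[List[float], float, int]:
    """windows[i] holds the past n knee Z-force samples of leg i+1,
    ordered oldest to newest, so windows[i][-1] is the sample at time t."""
    if len(windows) != 4 or any(len(w) == 0 for w in windows):
        raise ValueError("expected four non-empty force windows")
    # Per-leg mean over the window, then the mean over the four legs.
    leg_means = [sum(w) / len(w) for w in windows]
    robot_mean = sum(leg_means) / 4
    # Difference between the robot-level mean and each leg's current force.
    deltas = [robot_mean - w[-1] for w in windows]
    delta_sum = sum(deltas)
    # A negative difference marks a force spike on that leg.
    count += sum(1 for d in deltas if d < 0)
    return deltas, delta_sum, count
```

For example, if three legs read a constant force and the fourth leg's latest sample jumps well above the robot-level mean, only the fourth difference becomes negative and the counter increases by one.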

IV-B2 Preprocessing Position Data

At time instant $t$, we consider the past $m$ samples of $p^{hip,i}_{x}$. For each leg $i$, we calculate the maximum $max_{i,x}$ and minimum $min_{i,x}$ values of these $m$ samples and finally calculate $\Omega^{rob}_{p,x}(t) = \sum_{i=1}^{4} \left| max_{i,x} - min_{i,x} \right|$. $\Omega^{rob}_{p,x}$ represents the magnitude of variation in the hip positions along the X direction. Similarly, we obtain $\Omega^{rob}_{p,y}$ along the Y direction.
A high value of $\Omega^{rob}_{p,x}(t)$ or $\Omega^{rob}_{p,y}(t)$ indicates the unavailability of stable footholds which leads to slippage (e.g. in rocky terrains), or the presence of a granular surface that leads to sinkage.
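The position measure is a sum of per-leg ranges, which the short sketch below illustrates for one axis. The function name and the input layout (one window of the past m hip-position samples per leg) are assumptions for illustration.

```python
from typing import Sequence


def hip_position_spread(windows: Sequence[Sequence[float]]) -> float:
    """Sum over the four legs of |max - min| of each leg's position window.

    Call once with the hip X windows and once with the hip Y windows
    to obtain the X and Y measures respectively.
    """
    if len(windows) != 4 or any(len(w) == 0 for w in windows):
        raise ValueError("expected four non-empty position windows")
    return sum(abs(max(w) - min(w)) for w in windows)
```

A leg whose hip position stays constant over the window contributes zero, so the measure grows only with legs whose hip position actually varies.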

IV-B3 Processed Input Vector

We combine the processed quantities in knee forces and hip positions with the current drawn from the robot’s battery to construct the input vector $A \in \mathbb{R}^{9}$ to estimate terrain traversability as,

$$A(t) = \left[\Delta^{1}_{f},\, \Delta^{2}_{f},\, \Delta^{3}_{f},\, \Delta^{4}_{f},\, \sum_{i=1}^{4} \Delta^{i}_{f},\, count,\, \Omega^{rob}_{p,x},\, \Omega^{rob}_{p,y},\, I\right]. \tag{2}$$

All the quantities on the right in equation 2 are functions of $t$; the argument is omitted for readability.
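To make the layout of equation 2 concrete, the following sketch assembles the nine entries in the order in which they appear there. The function and parameter names are hypothetical; the inputs are the per-leg force differences, the spike counter, the two hip-position measures, and the battery current.

```python
from typing import List, Sequence


def build_input_vector(
    deltas: Sequence[float],
    count: int,
    omega_x: float,
    omega_y: float,
    current: float,
) -> List[float]:
    """Return the 9-dimensional input vector in the order of equation 2."""
    if len(deltas) != 4:
        raise ValueError("expected one force difference per leg")
    return [*deltas, sum(deltas), float(count), omega_x, omega_y, current]
```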

IV-C Terrain Traversability Estimation

To estimate a terrain’s traversability using our preprocessed proprioceptive signals, we first apply Principal Component Analysis (PCA) to reduce the 9-dimensional input vector to two principal components as,

$$\mathbf{p}_{t} = \text{PCA}(A(t)). \tag{3}$$

Here, $\mathbf{p}_{t}$ is a 2D point in the PCA space (Fig. 5). PCA allows us to simplify and effectively compare different terrains based on these components. We chose two principal components because they captured all the information needed for traversability estimation. Using just one component was insufficient, and three components did not add new useful information in terms of visualizing distributions for each type of terrain, as shown in Figure 6.
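
As a minimal sketch of equation 3, the projection of a 9-dimensional input vector onto two principal components could look as follows. It is dependency-free and uses power iteration with deflation, which is our choice for illustration only; the source does not state how PCA is computed, and a library routine would normally be used instead. The names `fit_pca` and `project` are hypothetical.

```python
import math


def fit_pca(rows, n_components=2, iters=500):
    """Fit PCA by power iteration with deflation on the sample covariance.

    rows: list of equal-length feature vectors (e.g. recorded A(t) samples).
    Returns (mean, components); components is a list of unit vectors.
    """
    n, d = len(rows), len(rows[0])
    mean = [sum(r[k] for r in rows) / n for k in range(d)]
    centered = [[r[k] - mean[k] for k in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    components = []
    for _ in range(n_components):
        # Fixed, non-uniform start vector keeps the result deterministic.
        norm = math.sqrt(sum((k + 1) ** 2 for k in range(d)))
        v = [(k + 1) / norm for k in range(d)]
        for _ in range(iters):
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue of the converged direction.
        lam = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d)) for a in range(d))
        components.append(v)
        # Deflate: remove the found direction before searching for the next one.
        cov = [[cov[a][b] - lam * v[a] * v[b] for b in range(d)] for a in range(d)]
    return mean, components


def project(a_t, mean, components):
    """Map one input vector A(t) to its PCA point p_t."""
    return tuple(sum((a_t[k] - mean[k]) * c[k] for k in range(len(a_t)))
                 for c in components)
```

In this sketch the PCA basis would be fitted once on recorded data, and `project` would then be called on each new $A(t)$ at run time; that split is an assumption on our part.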

Continuously plotting the PCA points corresponding to traversing a terrain $T$ with gait $g$ for a time period results in a distribution/cluster of points as shown in Fig. 5a. We obtain several key insights from our analysis: 1. The variance of the data along the two principal components differentiates stable (low variance/small cluster) and unstable (high variance/big cluster) terrains, 2. Terrain-gait pairs that have similar stability characteristics have similar clusters (e.g. concrete-trot and asphalt-trot), and 3. The position of the PCA points can also aid in predicting imminent crashes with noticeable shifts during the moments immediately before and after crashes (see Fig. 5b).

Figure 5: (a) PCA applied to key proprioceptive metrics (hip actuator positions, knee actuator force, and battery current) across two different terrains when using the trot gait. The variances along the two principal components indicate the level of stability on a terrain. (b) The figure shows the shift in the PCA distribution between stable navigation (grey points), before a crash (yellow), which represents 3 seconds before the crash, and 10 seconds after a crash (red), where a robot falls to the ground. If the robot’s proprioceptive signals lie outside the ellipse $\Gamma_{safe}$, the robot is heading towards a crash.
Figure 6: PCA with (a) one, (b) two, and (c) three principal components. Red indicates the proprioceptive data recorded on rocky terrain and black denotes concrete terrain, both collected while the robot used the trot gait. Notably, the two-component representation in (b) offers a more distinct and clearer separation of the terrains than using either one or three components.

We extend this analysis to all three gaits on terrains with poor footholds, granularity, and high resistance to motion, and obtain a unique cluster of points for each terrain-gait pair. By fitting a 2D Gaussian to each cluster, we obtain a characteristic ellipse (see Fig. 7a) that forms the boundary of the cluster. Consistent with our previous insights, the size/area of each ellipse denotes the robot’s stability on a terrain while using a certain gait.
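
The ellipse fitting can be sketched as follows, assuming the ellipse is the contour of the fitted Gaussian at a fixed Mahalanobis radius `k`. The value of `k` and the names `GaussianEllipse` and `fit_ellipse` are our assumptions for illustration; the text does not specify which contour is used as the cluster boundary.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class GaussianEllipse:
    mx: float   # cluster mean along PC1
    my: float   # cluster mean along PC2
    sxx: float  # variance along PC1
    sxy: float  # covariance between PC1 and PC2
    syy: float  # variance along PC2
    k: float    # Mahalanobis radius of the boundary (assumed, not from the paper)

    def _det(self):
        return self.sxx * self.syy - self.sxy ** 2

    def area(self):
        # Area of {x : (x - mu)^T Sigma^{-1} (x - mu) <= k^2} is pi * k^2 * sqrt(det Sigma).
        return math.pi * self.k ** 2 * math.sqrt(self._det())

    def contains(self, p):
        dx, dy = p[0] - self.mx, p[1] - self.my
        d2 = (self.syy * dx * dx - 2.0 * self.sxy * dx * dy
              + self.sxx * dy * dy) / self._det()
        return d2 <= self.k ** 2


def fit_ellipse(points, k=2.0):
    """Fit a 2D Gaussian (sample mean and covariance) to PCA points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return GaussianEllipse(mx, my, sxx, sxy, syy, k)
```

Under this sketch, a smaller `area()` corresponds to a lower-variance cluster and hence a more stable terrain-gait pair, and `contains()` gives the membership test used when checking whether a PCA point lies inside an ellipse.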

Since our objective is to maintain high stability on all terrain types while also making fast progress towards the robot’s goal, we consider, for each terrain type, only the ellipse with the lowest ratio of area to the maximum velocity of the gait (Fig. 7b). We refer to them as high stability ellipses for each terrain type. Of these ellipses, we observe that trotting on stable, flat terrains such as concrete/asphalt creates the ellipse with the smallest area and the highest stability. We refer to this ellipse and its enclosing region as the Low Variance Zone (LVZ), highlighted in Fig. 7b. Ideally, the current PCA point $\mathbf{p}_{t}$ indicating the robot’s stability should lie within the LVZ. However, on other challenging terrains, $\mathbf{p}_{t}$ would most likely lie outside the LVZ. Next, we detail how $\mathbf{p}_{t}$ and the other ellipses in Fig. 7b can be used to select a stabilizing gait when $\mathbf{p}_{t} \notin LVZ$ without any exteroceptive feedback.
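
The selection of one high stability ellipse per terrain can be sketched as below. The ellipse areas and maximum gait velocities are placeholder numbers chosen for illustration only; they are not measurements from this work, and the function name is hypothetical.

```python
def high_stability_gaits(ellipse_area, max_velocity):
    """Pick, per terrain, the gait whose ellipse has the lowest area / max-velocity ratio.

    ellipse_area: {(terrain, gait): area of the fitted ellipse}
    max_velocity: {gait: maximum velocity of that gait}
    """
    best = {}
    for (terrain, gait), area in ellipse_area.items():
        ratio = area / max_velocity[gait]
        if terrain not in best or ratio < best[terrain][1]:
            best[terrain] = (gait, ratio)
    return {terrain: gait for terrain, (gait, _) in best.items()}


# Placeholder values for illustration only (not taken from the paper).
areas = {
    ("solid_flat", "trot"): 1.0, ("solid_flat", "amble"): 1.5, ("solid_flat", "crawl"): 1.2,
    ("granular", "trot"): 9.0, ("granular", "amble"): 6.0, ("granular", "crawl"): 2.0,
}
v_max = {"trot": 1.6, "amble": 1.0, "crawl": 0.5}

selected = high_stability_gaits(areas, v_max)
```

With these placeholder numbers the ratio favors trot on the flat terrain and crawl on the granular terrain, which mirrors the qualitative outcome described in the text.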

Figure 7: The clusters’ ellipses for the PCA components of four different terrains: granular is red, poor foothold is blue, solid-flat is green, and high resistance is black. For each terrain, three types of gait data are shown using different ellipse boundaries. The solid line denotes trot, the dashed line denotes amble, and the dash-dot line is for crawl. (a) All 12 ellipses. (b) The stable zone’s (SZ) ellipses $\mathcal{E}_{1}$-$\mathcal{E}_{4}$, where $\text{SZ} = \left(\mathcal{E}_{1} \cup \mathcal{E}_{2} \cup \mathcal{E}_{3} \cup \mathcal{E}_{4}\right) \subset \Gamma_{\text{safe}}$.

IV-D Stable Gait Adaptation

A key insight from Fig. 7 is that when the appropriate gait for a terrain is chosen (e.g. crawl for granular terrains) at time $t$, the point $\mathbf{p}_{t+i}\ \forall i > 0$ would be contained within the corresponding high stability ellipse (e.g. granular-crawl ellipse) as represented in Fig. 7b. Conversely, if an inappropriate gait is chosen, $\mathbf{p}_{t+i}\ \forall i > 0$ will not lie within the high stability ellipse for that terrain. A key challenge is selecting the appropriate gait without knowing the terrain type.

To determine the most appropriate stabilizing gait $g^{*}$ (with the corresponding ellipse $\mathcal{E}^{*}$) for a terrain, we only consider the four ellipses and their enclosing areas in Fig. 7b, and refer to them as $\mathcal{E}_{1} = LVZ, \mathcal{E}_{2}, \mathcal{E}_{3}, \mathcal{E}_{4}$, as marked. We refer to their union as the Stable Zone ($SZ$). Let us consider two points in subsequent time instances $\mathbf{p}_{t}$ and $\mathbf{p}_{t+1}$. At time instant $t$, if $\mathbf{p}_{t} \notin LVZ$, and $\mathbf{p}_{t}$ lies in the intersection of any of the other ellipses, we calculate the minimum area ellipse as $\mathcal{E}^{MA}_{t} = \arg\min_{i}(area(\mathcal{E}_{i})), \forall \mathcal{E}_{i} \in \{\text{Ellipses in the intersection}\}$. We set $\mathcal{E}^{*}_{t} = \mathcal{E}^{MA}_{t}$, and $g^{*}_{t}$ as the associated gait calculated as $gait(\mathcal{E}^{*}_{t})$. If $g^{*}_{t}$ is the appropriate gait for the terrain, $\mathbf{p}_{t+1} \in \mathcal{E}^{*}_{t}$, and the robot can continue to execute the same gait, i.e., $g^{*}_{t+1} = g^{*}_{t}$.

However, if $g^{*}_{t}$ is not the appropriate gait, then $\mathbf{p}_{t+1} \notin \mathcal{E}^{*}_{t}$. This leads to two scenarios: $\mathbf{p}_{t+1} \in SZ$, or $\mathbf{p}_{t+1} \notin SZ$. If $\mathbf{p}_{t+1} \in SZ$, then we compute the minimum area ellipse as before, $\mathcal{E}^{MA}_{t+1} = \arg\min_{i}(area(\mathcal{E}_{i})), \forall \mathcal{E}_{i} \in \{\text{Ellipses in the intersection}\}$. If $\mathbf{p}_{t+1} \notin SZ$, we select a high stability ellipse based on its distance from $\mathbf{p}_{t+1}$. That is, $\mathcal{E}^{MD}_{t+1} = \arg\min_{i}(dist(\{\mathcal{E}_{2}, \mathcal{E}_{3}, \mathcal{E}_{4}\} - \{\mathcal{E}^{*}_{t}\}, \mathbf{p}_{t+1}))$. In both cases, $g^{*}_{t+1} = gait(\mathcal{E}^{*}_{t+1})$. We temporarily remove $\mathcal{E}^{*}_{t}$ from consideration, as the gait corresponding to it caused $\mathbf{p}_{t}$ to leave the stable zone $SZ$ at time $t$. Intuitively, removing the ellipse allows the formulation to converge to the correct ellipse corresponding to the current terrain type, and its associated gait. The ellipse is added back to consideration for future gait calculations when a gait change is required. To summarize,

$$g^{*}_{t+1} = \begin{cases}
\text{Trot} & \text{if } \mathbf{p}_{t+1} \in \text{LVZ}, \\
g^{*}_{t} & \text{if } \mathbf{p}_{t+1} \notin \text{LVZ},\ \mathbf{p}_{t+1} \in \mathcal{E}^{*}_{t}, \\
gait(\mathcal{E}^{*}_{t+1}) & \text{if } \mathbf{p}_{t+1} \in \text{SZ},\ \mathbf{p}_{t+1} \notin \mathcal{E}^{*}_{t},\ \mathcal{E}^{*}_{t+1} = \mathcal{E}^{MA}_{t+1}, \\
gait(\mathcal{E}^{*}_{t+1}) & \text{if } \mathbf{p}_{t+1} \notin \text{SZ},\ \mathcal{E}^{*}_{t+1} = \mathcal{E}^{MD}_{t+1}, \\
\text{None} & \text{if } \mathbf{p}_{t+1} \notin \Gamma_{\text{safe}}.
\end{cases} \tag{4}$$

We also noted a significant pattern in the behavior of $(pc_{1}(t), pc_{2}(t))$ leading up to a crash. Specifically, for three seconds before a crash, there is a noticeable shift along the PC2 axis (Fig. 5b). Also, the data distribution preceding a crash exhibits high variances, indicating a lack of stability. Based on this observation, we adopted a preventative measure of halting the robot in these scenarios to mitigate the risk of crashes.
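
The case analysis of equation 4 can be sketched in code as follows. Several details are assumptions made for illustration and are not specified in the text: the ellipses are simplified to axis-aligned ones, `dist` is taken as the Euclidean distance to the ellipse centre, and the $\Gamma_{\text{safe}}$ test is evaluated first, since a point outside $\Gamma_{\text{safe}}$ is also outside SZ and the last two cases would otherwise overlap. The names `Ellipse` and `next_gait` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Ellipse:
    """Axis-aligned ellipse in PCA space (a simplification for illustration)."""
    cx: float
    cy: float
    a: float        # semi-axis along PC1
    b: float        # semi-axis along PC2
    gait: str = ""

    def contains(self, p):
        return ((p[0] - self.cx) / self.a) ** 2 + ((p[1] - self.cy) / self.b) ** 2 <= 1.0

    def area(self):
        return math.pi * self.a * self.b

    def dist(self, p):
        # Assumption: distance to the ellipse centre; the text does not define dist().
        return math.hypot(p[0] - self.cx, p[1] - self.cy)


def next_gait(p_next, current, ellipses, gamma_safe, lvz="E1"):
    """Return (gait, ellipse name) for p_{t+1}; gait is None when the robot should halt.

    current is the name of E*_t, the ellipse selected at time t (or None).
    """
    if not gamma_safe.contains(p_next):
        return None, None                        # outside Gamma_safe: halt
    if ellipses[lvz].contains(p_next):
        return "trot", lvz                       # inside the LVZ
    if current is not None and ellipses[current].contains(p_next):
        return ellipses[current].gait, current   # current gait is still appropriate
    # p_next left E*_t: drop it (and the LVZ, which was ruled out above).
    candidates = {n: e for n, e in ellipses.items() if n not in (lvz, current)}
    inside = {n: e for n, e in candidates.items() if e.contains(p_next)}
    if inside:                                   # still in SZ: minimum-area ellipse
        name = min(inside, key=lambda n: inside[n].area())
    else:                                        # outside SZ: minimum-distance ellipse
        name = min(candidates, key=lambda n: candidates[n].dist(p_next))
    return ellipses[name].gait, name
```

Calling `next_gait` once per time step with the previously selected ellipse name reproduces the temporary removal of $\mathcal{E}^{*}_{t}$: the discarded ellipse is excluded only for the step in which it failed and is available again at the next gait change.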

Figure 8: The overall system architecture integrating ProNav with an exteroception-based planner. We utilize hip X- and Y-axis positions, knee force, and current drawn as our proprioceptive signals. Our PCA-based approach estimates the traversability of the terrain, and the gait adaptation selects the improved gait for stability. The camera and lidar are used with the integrated exteroception-based planner for obstacle avoidance.

IV-E ProNav + Exteroception

Although proprioceptive modalities help to estimate the traversability of the terrain the robot is walking on, they lack look-ahead capability (whenever terrain is visible) that exteroceptive sensing affords. That is because the traversability of terrains ahead of the robot cannot be assessed using past and current proprioceptive data. Therefore, fusing exteroception and proprioception helps bring out the best of both worlds.

ProNav can be easily combined with any navigation method that uses exteroceptive sensing such as RGB images, 3D point clouds, etc. An overview of such a hybrid architecture is shown in Fig. 8. The robot uses the collision-free, goal-directed velocities $(v^{*}, \omega^{*})$ extracted from an exteroception-based planner [40, 25, 5] together with the gait that ProNav evaluates to be the most stable for the current scenario (equation 4). $(v^{*}, \omega^{*})$ ensures the robot’s safety in terms of avoiding solid obstacles, and ProNav ensures walking stability and low power consumption.
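
A minimal sketch of this fusion step is given below. The `RobotCommand` type and the `fuse` function are hypothetical names introduced for illustration; the angular velocity unit and the choice to zero the velocities when no gait is returned (the halting behavior described for imminent crashes) are our assumptions about how the two outputs would be combined.

```python
from typing import NamedTuple, Optional, Tuple


class RobotCommand(NamedTuple):
    v: float               # linear velocity (m/s) from the exteroceptive planner
    omega: float           # angular velocity (rad/s, assumed unit) from the planner
    gait: Optional[str]    # gait selected from proprioception; None means "halt"


def fuse(planner_velocity: Tuple[float, float], gait: Optional[str]) -> RobotCommand:
    """Combine planner velocities (v*, w*) with the proprioceptively selected gait."""
    if gait is None:
        # No safe gait: stop the robot instead of following the planner.
        return RobotCommand(0.0, 0.0, None)
    v, omega = planner_velocity
    return RobotCommand(v, omega, gait)
```

The planner and the gait selector stay decoupled in this sketch: either side could be swapped out without touching the other, which is consistent with the claim that ProNav can be paired with any exteroceptive navigation method.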

V Results and Analysis

This section outlines ProNav’s implementation, our chosen evaluation parameters, the varied test environments, and comparisons with other methods.

V-A Robot Setup and Dataset

Our approach is implemented on a Spot robot, a 12-degree-of-freedom (DOF) quadruped from Boston Dynamics. The robot provides joint feedback from its 12 actuators and the battery current data during its operation. Our data collection was carried out at the University of Maryland College Park campus, on different terrains including concrete, asphalt, grass, rocks, sand, bushes, mulch, etc. The resulting dataset represents approximately 9 hours of operation, during which the joint feedback and battery current were continuously recorded. ProNav runs at 16 Hz on an Intel NUC edge computer equipped with an Intel i7 CPU, and an Nvidia RTX2060 GPU.

Figure 9: An instance of Spot navigating in different outdoor terrains using: ProNav (trot: red, crawl: light green, amble: green), RFC (trot: light blue, crawl: army green, amble: dark blue), GA-Nav (trot: yellow, crawl: brown, amble: orange), VERN (purple), Spot’s in-built planner (black). We observe that our method chooses the appropriate gait and velocities based on terrain traversability, leading to better success rates, lower power consumption, and vibration cost.

V-B Comparison Methods

We combine ProNav with VERN [5] and compare its performance with exteroceptive, and proprioceptive navigation techniques:

  • Spot’s in-built planner: Uses 360° RGB-D cameras to detect and avoid obstacles. It also automatically adapts the robot’s leg raise heights based on the terrain.

  • VERN [5]: Uses RGB images and 3D point clouds to differentiate pliable vegetation from solid obstacles and traverse vegetated environments.

  • GA-Nav [8]: Uses RGB images for semantic segmentation, computing traversability costs for various terrains. It is combined with a planner [40] to compute trajectories on low-cost terrains and avoid obstacles.

  • Random Forest Classifier (RFC) [9]: Uses the proprioceptive input vector $\mathbf{A}(t)$ to classify the terrain type. We used the dataset described in Section V-A to train the classifier to distinguish between four different terrain types (granular, poor foothold, high resistance, and stable). RFC is also combined with VERN to generate goal-directed velocities $(v^{*}, \omega^{*})$.

We use GA-Nav [8] and RFC [9] to classify the terrains based on RGB images and proprioception, respectively. We then choose the gait as follows: trot if the robot’s trajectory is on stable terrains (concrete, asphalt), crawl for granular terrains (sand), and amble for terrains with poor footholds and high resistance to motion (forest, dense vegetation). We also perform ablation studies by removing the joint positions and current components from the input to our PCA-based system.
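
The fixed terrain-to-gait rule used for the GA-Nav and RFC baselines can be written as a simple lookup. The label strings below are hypothetical identifiers for the four terrain classes named above; only the mapping itself comes from the text.

```python
# Terrain class -> gait, as described for the classifier-based baselines.
TERRAIN_TO_GAIT = {
    "stable": "trot",            # concrete, asphalt
    "granular": "crawl",         # sand
    "poor_foothold": "amble",    # forest floor, rocks
    "high_resistance": "amble",  # dense vegetation
}


def gait_for(terrain_label: str) -> str:
    """Return the baseline gait for a predicted terrain label."""
    return TERRAIN_TO_GAIT[terrain_label]
```

Unlike the ellipse-based selection, this rule depends entirely on the classifier output, so a misclassified terrain maps directly to an inappropriate gait.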

V-C Evaluation Metrics

  • Success Rate: The ratio of successful navigation trials where the robot was able to reach its goal without freezing or colliding with obstacles.

  • Mean Power Consumption: The power (in watts) drawn from the robot’s battery, averaged over all trials.

  • Mean Velocity: The robot’s average velocity along its trajectory as it traverses various surfaces.

  • Time to Goal: The robot’s average time to reach its goal in the successful trials.

  • Vibration Cost: The cumulative sum of deviations in hip joint positions from a stable reference, measured at each time instant $t$. Deviations for each hip joint $j$ in the x- and y-axes are calculated as follows:

    $$\delta_{j}(t) = \begin{cases} \left|p^{j}(t) - \min(p^{j}_{\text{ref}})\right| & \text{if } p^{j}(t) < \min(p^{j}_{\text{ref}}), \\ \left|p^{j}(t) - \max(p^{j}_{\text{ref}})\right| & \text{if } p^{j}(t) > \max(p^{j}_{\text{ref}}). \end{cases}$$

    Here, $p^{j}(t)$ is the position of joint $j$ at time $t$, and $p^{j}_{\text{ref}}$ represents reference positions from a stable terrain (for this work concrete is considered a stable terrain). The total Vibration Cost at time $t$ is then computed as $\text{Vibration Cost}(t) = \sum_{j}\delta_{j}(t)$.

  • IMU Energy Density: The mean of the aggregated squared acceleration values measured by the IMU sensors across the x, y, and z axes, calculated over the successful trials. The relevant equations are adopted from [41]:

    $$E_{i} = \sum_{n=1}^{N} a_{i}^{2}, \tag{5}$$
    $$E_{\text{Total}} = E_{ax} + E_{ay} + E_{az}, \tag{6}$$

    where $a_{i}$ represents one of the three acceleration signals (x, y, and z axes), and $N$ is the number of IMU readings along the trajectory.
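
The Vibration Cost and IMU Energy Density metrics can be sketched as follows. The function names are hypothetical. One assumption is made explicit in the code: the deviation is taken to be zero when a joint position lies inside the reference range, a case the piecewise definition above leaves implicit.

```python
def joint_deviation(p, ref_min, ref_max):
    """Deviation of one hip joint position from its stable reference range."""
    if p < ref_min:
        return abs(p - ref_min)
    if p > ref_max:
        return abs(p - ref_max)
    return 0.0  # assumption: no deviation inside the reference range


def vibration_cost(positions, ref_ranges):
    """Sum of deviations over all hip joints at one time instant.

    positions: joint positions p^j(t); ref_ranges: (min, max) of p^j_ref per joint.
    """
    return sum(joint_deviation(p, lo, hi) for p, (lo, hi) in zip(positions, ref_ranges))


def imu_energy(ax, ay, az):
    """Total energy: sum of squared accelerations over the three axes (eqs. 5 and 6)."""
    return sum(a * a for a in ax) + sum(a * a for a in ay) + sum(a * a for a in az)
```

For example, `vibration_cost([0.5, -0.3, 0.1], [(0.0, 0.4), (-0.2, 0.2), (0.0, 0.2)])` adds a deviation of about 0.1 for each of the first two joints and nothing for the third.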

V-D Test Scenarios

  • Scenario 1: Granular terrains (small rocks and sand).

  • Scenario 2: Concrete, rocks, and vegetation.

  • Scenario 3: Sparse tall grass, fallen logs, and trees.

  • Scenario 4: Dense vegetation and bushes.

Table I: Our method’s navigation performance measured against other methods using six metrics. ProNav+VERN excels in success rate while achieving among the lowest power consumption and vibration cost.
| Metrics | Method | Scen. 1 | Scen. 2 | Scen. 3 | Scen. 4 |
|---|---|---|---|---|---|
| Success Rate (%) | In-built system | 30 | 20 | 10 | - |
| | GA-Nav [8] | 80 | 50 | 50 | 30 |
| | VERN [5] | 70 | 60 | 40 | 20 |
| | RFC [9] | 80 | 70 | 70 | 50 |
| | w/o current+VERN | 100 | 80 | 60 | 70 |
| | w/o position+VERN | 90 | 70 | 80 | 60 |
| | ProNav+VERN | 100 | 90 | 90 | 70 |
| Mean Power Consumption (watts) | In-built system | 503 | 462 | 542 | - |
| | GA-Nav [8] | 384 | 373 | 374 | 442 |
| | VERN [5] | 462 | 372 | 362 | 450 |
| | RFC [9] | 365 | 380 | 369 | 419 |
| | w/o current+VERN | 371 | 356 | 357 | 398 |
| | w/o position+VERN | 388 | 361 | 370 | 403 |
| | ProNav+VERN | 379 | 349 | 353 | 375 |
| Mean Velocity (m/s) | In-built system | 0.43 | 0.35 | 0.33 | - |
| | GA-Nav [8] | 0.29 | 0.43 | 0.33 | 0.35 |
| | VERN [5] | 0.42 | 0.47 | 0.34 | 0.34 |
| | RFC [9] | 0.30 | 0.39 | 0.35 | 0.33 |
| | w/o current+VERN | 0.32 | 0.45 | 0.36 | 0.38 |
| | w/o position+VERN | 0.24 | 0.49 | 0.32 | 0.36 |
| | ProNav+VERN | 0.27 | 0.38 | 0.31 | 0.32 |
| Time to Goal (seconds) | In-built system | 20.13 | 25.11 | 25.17 | - |
| | GA-Nav [8] | 24.85 | 16.19 | 25.01 | 26.23 |
| | VERN [5] | 19.04 | 16.31 | 24.18 | 27.14 |
| | RFC [9] | 24.31 | 17.40 | 24.92 | 25.83 |
| | w/o current+VERN | 26.82 | 18.45 | 23.42 | 22.78 |
| | w/o position+VERN | 34.45 | 15.94 | 25.31 | 25.39 |
| | ProNav+VERN | 25.68 | 21.02 | 27.73 | 25.97 |
| Vibration Cost | In-built system | 66.3 | 53.2 | 50.6 | - |
| | GA-Nav [8] | 19.6 | 14.9 | 29.5 | 30.2 |
| | VERN [5] | 23.4 | 25.5 | 27.8 | 33.1 |
| | RFC [9] | 18.1 | 16.3 | 25.6 | 29.4 |
| | w/o current+VERN | 22.7 | 12.6 | 64.0 | 15.4 |
| | w/o position+VERN | 39.4 | 3.1 | 23.7 | 12.8 |
| | ProNav+VERN | 16.9 | 4.4 | 10.7 | 7.1 |
| IMU Energy Density | In-built system | 55283 | 32367 | 46835 | - |
| | GA-Nav [8] | 41175 | 23919 | 19307 | 28366 |
| | VERN [5] | 51374 | 26950 | 17052 | 33948 |
| | RFC [9] | 38957 | 25314 | 16329 | 30424 |
| | w/o current+VERN | 28075 | 16834 | 24499 | 24005 |
| | w/o position+VERN | 25378 | 15223 | 13207 | 18750 |
| | ProNav+VERN | 23503 | 12388 | 14274 | 17186 |

V-E Analysis and Discussion

In this section, we evaluate the performance of our method qualitatively and quantitatively and compare it with other methods. Figure 9 provides a visual representation of the trajectories in different terrains. Our method showcases its superiority in navigating through dense vegetation (Fig. 1), granular (Fig. 9a), rocky (Fig. 9b), and unstructured forested terrains (Fig. 9c). Notably, our method adapts by choosing crawl on sand (scenario 1) and amble in other scenarios whenever poor footholds and resistance to motion dominate and $\mathbf{p}_{t} \notin LVZ$. RFC exhibited effective terrain analysis using proprioception in the granular scenario. However, the classifier’s performance declined in the other scenarios due to their unstructured nature. This complexity led to either frequent gait changes or no gait change, resulting in instability or failure, often caused by entanglements in vegetation. Spot’s built-in system faces challenges in vegetation-rich scenarios (2, 3, and 4) due to its default trot gait leading to leg entanglement, as reflected in its lower success rates. It also exhibits unstable behavior when the ground contains branches and rocks of various sizes, as it treats them as obstacles that should be circumvented. VERN also encounters failures due to leg entanglements, particularly in scenario 4 with denser vegetation. GA-Nav, similar to RFC, is effective in open and uncovered terrains such as scenario 1, but in vegetated scenarios (3 and 4) it often struggles to identify the correct terrain type, primarily due to motion blur caused by entanglements. Also, strong lighting (scenario 3) and low lighting (scenario 4) drastically affect VERN’s and GA-Nav’s performance. This leads to either frequent gait changes or changing to an inappropriate gait (e.g. crawl in dense vegetation, which causes further leg entanglement).
Conversely, our method achieves the highest success rate in all scenarios with its appropriate gait adaptation without requiring visual feedback. ProNav halts in extreme cases to prevent imminent crashes, particularly in dense vegetation scenarios (3 and 4). Compared to the second-best method, our method improves the success rate by 25%, 28.57%, 28.57%, and 40% in scenarios 1, 2, 3, and 4, respectively.

We compute the success rate improvement using the following equation:

$$\text{Improvement (\%)} = \left(\frac{SR_{\text{ours}} - SR_{\text{2nd}}}{SR_{\text{2nd}}}\right) \times 100 \tag{7}$$

where $SR_{\text{ours}}$ and $SR_{\text{2nd}}$ represent the success rates of our method and the second-best method, respectively.
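
As a worked check of equation 7, the snippet below recomputes the reported improvements from the success rates in Table I. It assumes that the second-best method in each scenario is the best-performing baseline other than the ablated variants (80, 70, 70, and 50 for scenarios 1 to 4), which is our reading of the reported numbers.

```python
def improvement(sr_ours, sr_second):
    """Relative success-rate improvement in percent (equation 7)."""
    return 100.0 * (sr_ours - sr_second) / sr_second


ours = [100, 90, 90, 70]    # ProNav+VERN, scenarios 1-4 (Table I)
second = [80, 70, 70, 50]   # best non-ablated baseline per scenario (assumed reading)

gains = [round(improvement(o, s), 2) for o, s in zip(ours, second)]
print(gains)  # [25.0, 28.57, 28.57, 40.0]
```

Because the improvement is relative to the baseline's success rate, the same 20-point gap yields a larger percentage in scenario 4, where the baseline succeeds only half of the time.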

We note that our approach yields the lowest power consumption in scenarios 2, 3, and 4, and remains close to the lowest in scenario 1 (Table I). This efficiency is a result of its capability to assess stability and its gait selection. VERN, while comparable in certain scenarios, has increased power consumption in the fourth scenario due to its default trot gait leading to more entanglements and consequent motion resistance. Likewise, GA-Nav exhibits increased power consumption in scenario 4, primarily due to its multiple changes in gait selection. Moreover, compared with the baseline methods, our method consistently records the lowest vibration levels (in terms of the vibration cost and the IMU energy density metrics). Conversely, the frequent changes in gait exhibited by RFC lead to increased vibration costs when traversing dense vegetation. In scenario 4, RFC and GA-Nav show high vibration due to entanglements and gait alternations. Also, Spot’s in-built system experiences the highest vibration costs due to sinkage (scenario 1), slippage (scenario 2), and motion resistance (scenarios 3 and 4). In the mean velocity metric, ProNav shows a reduced pace, particularly during the gait switch to crawl in scenario 1. ProNav’s time to goal is comparable to other methods, except in scenario 1, where the exteroceptive-based methods used a faster, yet high-vibration gait.

Ablation Study on Proprioceptive Signals: Our ablation analysis focuses on two of ProNav’s proprioceptive signals: the current drawn from the battery and the hip joints’ positions. In our evaluations (Table I), omitting the battery current resulted in delayed or incorrect traversability estimations, which notably affected power consumption and vibration costs, especially in the dense vegetation of scenario 4. Removing the hip joints’ positions also hindered performance, but to a lesser extent. Neither ablated configuration exceeded the performance of the full ProNav system. We did not remove the knee force in our ablation study, since a PCA cluster could not be formed without it, which would prevent a meaningful comparison.

Table II compares navigation using a single stable gait (crawl or amble) with ProNav and its adaptive gait adjustment. We observe that the crawl gait has lower power consumption and vibration levels than the amble gait. However, its application in dense vegetation presents challenges: the robot moves slowly, which leads to its legs getting entangled in the vegetation. For instance, in scenario 4, we note elevated power consumption and vibration levels alongside a significantly lower velocity. In contrast, the amble gait consistently achieves superior velocities and reaches the goal quickly relative to crawl and ProNav. It also has a high mean power consumption; because the robot exerts more torque, the risk of entanglement is reduced, and the vibration cost is consequently lower. ProNav, on the other hand, provides the best trade-off between average power consumption, vibration cost, and mean velocity.

Table II: Navigation performance when using only a single stable gait (crawl or amble), compared with ProNav, which adaptively selects the appropriate gait by assessing terrain traversability and stability.
| Metrics | Method | Scen. 1 | Scen. 2 | Scen. 3 | Scen. 4 |
|---|---|---|---|---|---|
| Mean Power Consumption (watts) | Crawl | 382 | 358 | 374 | 488 |
| | Amble | 443 | 370 | 421 | 427 |
| | ProNav | 379 | 349 | 353 | 375 |
| Mean Velocity (m/s) | Crawl | 0.29 | 0.24 | 0.21 | 0.17 |
| | Amble | 0.33 | 0.56 | 0.28 | 0.54 |
| | ProNav | 0.27 | 0.38 | 0.31 | 0.32 |
| Time to Goal (seconds) | Crawl | 29.54 | 33.76 | 54.21 | 68.2 |
| | Amble | 24.05 | 14.91 | 18.75 | 19.90 |
| | ProNav | 15.2 | 2.7 | 8.1 | 4.8 |
| Vibration cost | Crawl | 18.3 | 34.4 | 46.2 | 37.8 |
| | Amble | 28.4 | 22.9 | 5.7 | 6.1 |
| | ProNav | 16.9 | 4.4 | 10.7 | 7.1 |
| IMU Energy Density | Crawl | 18078 | 25194 | 22359 | 37405 |
| | Amble | 34901 | 18846 | 7892 | 15984 |
| | ProNav | 23503 | 10388 | 14274 | 17186 |
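To make the trade-off in Table II easier to see, the sketch below averages three of the metrics over the four scenarios for each method. The numbers are copied from Table II; the dictionary layout and the helper name `scenario_mean` are assumptions made for this illustration, and an unweighted average across scenarios is a simplification that is not reported in the evaluation itself.

```python
from statistics import mean

# Values copied from Table II, ordered as scenarios 1 to 4.
TABLE_II = {
    "power_w": {
        "Crawl": [382, 358, 374, 488],
        "Amble": [443, 370, 421, 427],
        "ProNav": [379, 349, 353, 375],
    },
    "velocity_mps": {
        "Crawl": [0.29, 0.24, 0.21, 0.17],
        "Amble": [0.33, 0.56, 0.28, 0.54],
        "ProNav": [0.27, 0.38, 0.31, 0.32],
    },
    "vibration_cost": {
        "Crawl": [18.3, 34.4, 46.2, 37.8],
        "Amble": [28.4, 22.9, 5.7, 6.1],
        "ProNav": [16.9, 4.4, 10.7, 7.1],
    },
}


def scenario_mean(metric: str, method: str) -> float:
    """Unweighted mean of one metric over the four scenarios (hypothetical helper)."""
    return mean(TABLE_II[metric][method])


# Print one summary line per metric, listing the per-method averages.
for metric, rows in TABLE_II.items():
    summary = ", ".join(f"{m}: {scenario_mean(metric, m):.2f}" for m in rows)
    print(f"{metric} -> {summary}")
```

Under this simple average, ProNav has the lowest mean power consumption (364 W, against 400.5 W for crawl and 415.25 W for amble) and the lowest mean vibration cost, while amble has the highest mean velocity.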

VI Conclusion, Limitations & Future work

We present ProNav, a new method that uses proprioceptive data to evaluate terrain traversability in real time for legged robots. Our method selects the robot’s gait to improve stability and reduce energy consumption. In addition, its crash prediction component enables safer and more efficient navigation. We also combined ProNav with an exteroceptive-based navigation method, which improved navigation performance. We validate our method in different outdoor environments and provide a detailed comparison with other navigation methods.

However, ProNav has some limitations. It can only assess the stability of the terrain the robot is currently on, which could lead to failures and crashes in extreme environments. To address this, we are considering adding other sensor modalities (e.g., RGB, thermal, or hyperspectral images) that can provide meaningful lookahead for the robot. Our gait adaptation alternates between the existing gaits on our hardware platform, as custom gaits cannot be executed on it. In the future, we would like to create and use custom gaits for stabilization on an open hardware platform. We would also like to investigate techniques to improve crash prevention, adapting our approach to more diverse environments and to situations where halting is insufficient to prevent a crash.

References

  • [1] Z. Chen, T. Fan, X. Zhao, J. Liang, C. Shen, H. Chen, D. Manocha, J. Pan, and W. Zhang, “Autonomous social distancing in urban environments using a quadruped robot,” IEEE Access, vol. 9, pp. 8392–8403, 2021.
  • [2] S. B. Goldberg, M. W. Maimone, and L. Matthies, “Stereo vision and rover navigation software for planetary exploration,” in Proceedings, IEEE aerospace conference, vol. 5.   IEEE, 2002, pp. 5–5.
  • [3] C. D. Bellicoso, M. Bjelonic, L. Wellhausen, K. Holtmann, F. Günther, M. Tranzatto, P. Fankhauser, and M. Hutter, “Advances in real-world applications for legged robots,” Journal of Field Robotics, vol. 35, no. 8, pp. 1311–1326, 2018.
  • [4] E. Tennakoon, T. Peynot, J. Roberts, and N. Kottege, “Probe-before-step walking strategy for multi-legged robots on terrain with risk of collapse,” in 2020 IEEE International Conference on Robotics and Automation (ICRA).   IEEE, 2020, pp. 5530–5536.
  • [5] A. J. Sathyamoorthy, K. Weerakoon, T. Guan, M. Russell, D. Conover, J. Pusey, and D. Manocha, “Vern: Vegetation-aware robot navigation in dense unstructured outdoor environments,” arXiv preprint arXiv:2303.14502, 2023.
  • [6] J. Lee, J. Hwangbo, L. Wellhausen, V. Koltun, and M. Hutter, “Learning quadrupedal locomotion over challenging terrain,” Science robotics, vol. 5, no. 47, p. eabc5986, 2020.
  • [7] H. Kolvenbach, C. Bärtschi, L. Wellhausen, R. Grandia, and M. Hutter, “Haptic inspection of planetary soils with legged robots,” IEEE Robotics and Automation Letters, vol. 4, no. 2, pp. 1626–1632, 2019.
  • [8] T. Guan, D. Kothandaraman, R. Chandra, A. J. Sathyamoorthy, K. Weerakoon, and D. Manocha, “Ga-nav: Efficient terrain segmentation for robot navigation in unstructured outdoor environments,” IEEE Robotics and Automation Letters, vol. 7, no. 3, pp. 8138–8145, 2022.
  • [9] C. Kertész, “Rigidity-based surface recognition for a domestic legged robot,” IEEE Robotics and automation letters, vol. 1, no. 1, pp. 309–315, 2016.
  • [10] S. Fahmi, V. Barasuol, D. Esteban, O. Villarreal, and C. Semini, “Vital: Vision-based terrain-aware locomotion for legged robots,” IEEE Transactions on Robotics, 2022.
  • [11] A. Agarwal, A. Kumar, J. Malik, and D. Pathak, “Legged locomotion in challenging terrains using egocentric vision,” in Conference on Robot Learning.   PMLR, 2023, pp. 403–415.
  • [12] D. B. Gennery, “Traversability analysis and path planning for a planetary rover,” Autonomous Robots, vol. 6, pp. 131–146, 1999.
  • [13] S. Pütz, T. Wiemann, J. Sprickerhof, and J. Hertzberg, “3d navigation mesh generation for path planning in uneven terrain,” IFAC-PapersOnLine, vol. 49, no. 15, pp. 212–217, 2016.
  • [14] Z. Fu, A. Kumar, A. Agarwal, H. Qi, J. Malik, and D. Pathak, “Coupling vision and proprioception for navigation of legged robots,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 17273–17283.
  • [15] T. Homberger, L. Wellhausen, P. Fankhauser, and M. Hutter, “Support surface estimation for legged robots,” in 2019 International Conference on Robotics and Automation (ICRA).   IEEE, 2019, pp. 8470–8476.
  • [16] A. H. Al-dabbagh and R. Ronsse, “A review of terrain detection systems for applications in locomotion assistance,” Robotics and Autonomous Systems, vol. 133, p. 103628, 2020.
  • [17] J. Carius, R. Ranftl, V. Koltun, and M. Hutter, “Trajectory optimization for legged robots with slipping motions,” IEEE Robotics and Automation Letters, vol. 4, no. 3, pp. 3013–3020, 2019.
  • [18] S. Teng, M. W. Mueller, and K. Sreenath, “Legged robot state estimation in slippery environments using invariant extended kalman filter with velocity update,” in 2021 IEEE International Conference on Robotics and Automation (ICRA).   IEEE, 2021, pp. 3104–3110.
  • [19] D. W. Haldane, P. Fankhauser, R. Siegwart, and R. S. Fearing, “Detection of slippery terrain with a heterogeneous team of legged robots,” in 2014 IEEE International Conference on Robotics and Automation (ICRA).   IEEE, 2014, pp. 4576–4581.
  • [20] J. Frey, D. Hoeller, S. Khattak, and M. Hutter, “Locomotion policy guided traversability learning using volumetric representations of complex environments,” in 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).   IEEE, 2022, pp. 5722–5729.
  • [21] L. Wellhausen, R. Ranftl, and M. Hutter, “Safe robot navigation via multi-modal anomaly detection,” IEEE Robotics and Automation Letters, vol. 5, no. 2, pp. 1326–1333, 2020.
  • [22] M. Aladem, S. Baek, and S. A. Rawashdeh, “Evaluation of image enhancement techniques for vision-based navigation under low illumination,” Journal of Robotics, vol. 2019, 2019.
  • [23] F. Schilling, X. Chen, J. Folkesson, and P. Jensfelt, “Geometric and visual terrain classification for autonomous mobile navigation,” in 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).   IEEE, 2017, pp. 2678–2684.
  • [24] D. Wisth, M. Camurri, and M. Fallon, “Vilens: Visual, inertial, lidar, and leg odometry for all-terrain legged robots,” IEEE Transactions on Robotics, 2022.
  • [25] K. Weerakoon, A. J. Sathyamoorthy, J. Liang, T. Guan, U. Patel, and D. Manocha, “Graspe: Graph based multimodal fusion for robot navigation in unstructured outdoor environments,” arXiv preprint arXiv:2209.05722, 2022.
  • [26] A. J. Sathyamoorthy, K. Weerakoon, T. Guan, J. Liang, and D. Manocha, “Terrapn: Unstructured terrain navigation using online self-supervised learning,” in 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).   IEEE, 2022, pp. 7197–7204.
  • [27] T. Miki, J. Lee, J. Hwangbo, L. Wellhausen, V. Koltun, and M. Hutter, “Learning robust perceptive locomotion for quadrupedal robots in the wild,” Science Robotics, vol. 7, no. 62, p. eabk2822, 2022.
  • [28] A. Loquercio, A. Kumar, and J. Malik, “Learning visual locomotion with cross-modal supervision,” in 2023 IEEE International Conference on Robotics and Automation (ICRA).   IEEE, 2023, pp. 7295–7302.
  • [29] S. Dey, D. Fan, R. Schmid, A. Dixit, K. Otsu, T. Touma, A. F. Schilling, and A.-A. Agha-Mohammadi, “Prepare: Predictive proprioception for agile failure event detection in robotic exploration of extreme terrains,” in 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).   IEEE, 2022, pp. 4338–4343.
  • [30] J. Truong, A. Zitkovich, S. Chernova, D. Batra, T. Zhang, J. Tan, and W. Yu, “Indoorsim-to-outdoorreal: Learning to navigate outdoors without any outdoor experience,” arXiv preprint arXiv:2305.01098, 2023.
  • [31] L. Wellhausen and M. Hutter, “Artplanner: Robust legged robot navigation in the field,” arXiv preprint arXiv:2303.01420, 2023.
  • [32] P. Biswal and P. K. Mohanty, “Development of quadruped walking robots: A review,” Ain Shams Engineering Journal, vol. 12, no. 2, pp. 2017–2031, 2021.
  • [33] T. Overbye and S. Saripalli, “Path optimization for ground vehicles in off-road terrain,” in 2021 IEEE International Conference on Robotics and Automation (ICRA).   IEEE, 2021, pp. 7708–7714.
  • [34] M. F. Ginting, S.-K. Kim, O. Peltzer, J. Ott, S. Jung, M. J. Kochenderfer, and A.-a. Agha-mohammadi, “Safe and efficient navigation in extreme environments using semantic belief graphs,” arXiv preprint arXiv:2304.00645, 2023.
  • [35] J. Guzzi, R. O. Chavez-Garcia, L. M. Gambardella, and A. Giusti, “On the impact of uncertainty for path planning,” in 2019 International Conference on Robotics and Automation (ICRA).   IEEE, 2019, pp. 5929–5935.
  • [36] R. O. C. García, M. A. Estrada, M. Ebrahimi, F. Zuppichini, L. M. Gambardella, A. Giusti, and A. J. Ijspeert, “Gait-dependent traversability estimation on the k-rock2 robot,” in 2022 26th International Conference on Pattern Recognition (ICPR).   IEEE, 2022, pp. 4204–4210.
  • [37] M. Hutter, C. Gehring, D. Jud, A. Lauber, C. D. Bellicoso, V. Tsounis, J. Hwangbo, K. Bodie, P. Fankhauser, M. Bloesch, R. Diethelm, S. Bachmann, A. Melzer, and M. Hoepflinger, “Anymal - a highly mobile and dynamic quadrupedal robot,” in 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2016, pp. 38–44.
  • [38] “Ghost vision 60,” http://www.stonexperu.com/pdf/GR%20Vision%2060-P%20Quad%20UGV-%20Full%20Spec%20rev4.0_STN.pdf, Ghost Robotics Corp., [Online; accessed 17-January-2024].
  • [39] “About spot,” https://dev.bostondynamics.com/docs/concepts/about_spot, Boston Dynamics, [Online; accessed 17-January-2024].
  • [40] D. Fox, W. Burgard, and S. Thrun, “The dynamic window approach to collision avoidance,” IEEE Robotics & Automation Magazine, vol. 4, no. 1, pp. 23–33, 1997.
  • [41] P. Try and M. Gebhard, “A vibration sensing device using a six-axis imu and an optimized beam structure for activity monitoring,” Sensors, vol. 23, no. 19, p. 8045, 2023.