ProNav: Proprioceptive Traversability Estimation for Legged Robot Navigation in Outdoor Environments
Abstract
We propose a novel method, ProNav, which uses proprioceptive signals for traversability estimation in challenging outdoor terrains for autonomous legged robot navigation. Our approach uses sensor data from a legged robot’s joint encoders, force sensors, and current sensors to measure joint positions, forces, and current consumption, respectively, and thereby assess a terrain’s stability, its resistance to the robot’s motion, and the risk of entrapment and crashes. Based on these factors, we compute the appropriate robot gait to maximize stability, which leads to reduced energy consumption. Our approach can also be used to predict imminent crashes in challenging terrains and execute behaviors to preemptively avoid them. We integrate ProNav with an exteroceptive-based method to navigate real-world environments with dense vegetation, high granularity, negative obstacles, etc. Our method shows an improvement of up to 40% in success rate and a reduction of up to 15.1% in energy consumption compared to exteroceptive-based methods.
I Introduction
In recent years, autonomous legged robots have found applications in surveillance/monitoring [1], exploration [2], and search and rescue [3]. The key advantage that enables such applications is their superior capability in traversing complex terrains that are inaccessible to wheeled and tracked robots.
It is important to develop autonomous methods for navigation in complex terrains, which can be broken down into three major categories: uneven/rocky outdoor terrains, dense vegetation, and granular terrains like sand and mud. The uneven or rocky terrains challenge the robot’s stability as they often lack solid footholds with sudden variations in elevation [4]. Dense vegetation introduces another layer of complexity, presenting risks of entanglement in branches, dried grass, or bushes [5, 6], leading to unstable behaviors such as slipping and tripping. The third category, granular terrains, often leads to the robot’s legs sinking into surfaces like sand or mud due to their deformability under the robot’s weight [7]. Each of these terrain types presents unique difficulties for legged robots, which can affect their navigational capabilities.
To tackle these challenges, the robot must be able to accurately evaluate a terrain’s traversability (a measure of the ease of navigation) and then plan its trajectories. Existing methods typically utilize exteroceptive modalities (RGB images, lidar point clouds, and scans) [10, 11, 12, 13] for traversability estimation. Such exteroceptive methods can provide valuable information about the terrain before walking over it. However, these methods experience degradation in perception accuracy in environments with high occlusions, poor illumination, scarce features, etc. For instance, the terrain geometry could be occluded by dense vegetation. Moreover, certain entities (e.g. negative obstacles such as ditches, and potholes) and changes in a terrain’s properties (dry sand versus wet sand) cannot be accurately detected by exteroceptive modalities.
To overcome these limitations, several methods have fused exteroception with proprioception to evaluate a terrain’s traversability [14, 15]. Proprioception measures the state of the robot itself, such as joint and body positions and force feedback [16], while exteroceptive sensing measures the state of the environment using sensors such as cameras and LiDAR. Although proprioception cannot provide a look-ahead for the terrain, it more accurately represents the robot’s stability on a terrain, since unstable walking behaviors are reflected in significant changes in joint positions, in the forces experienced at certain joints, and in high energy consumption. Existing research on proprioceptive traversability analysis has predominantly focused on environments where the robot encounters slippage [17, 18, 19], and has not handled regions where the robot’s legs could get entangled (e.g. in dense vegetation).
Moreover, certain terrains such as concrete and asphalt can be traversed using a single “best” gait. However, this does not apply to all terrains. For example, a grassy terrain may appear uniform but can vary significantly, transitioning from dry to muddy areas with similar visual appearances. Navigating rocky terrain presents a comparable set of challenges, as shown in Figure 2. These situations indicate that a legged robot must adapt its gait based on proprioceptive feedback instead of relying only on visual sensing.
Main Contributions: To address these limitations, we propose ProNav, an approach for using proprioception for improved terrain traversability estimation in a variety of environments (rocky, granular, densely vegetated, etc). The proprioceptive signals are measured from a legged robot’s joint encoders, force, and current sensors. The novel components of our work include:
• A novel terrain traversability estimation method using only proprioceptive signals (joint positions, forces, current consumption) to characterize the stability, and resistance to the robot’s motion on a terrain. Our method uses the aforementioned signals to estimate traversability using Principal Component Analysis (PCA) within second of walking on a new terrain type using edge computing hardware with limited computation power.
• A novel crash prediction mechanism that can foresee slipping, tripping, and leg entrapment-related crashes. This leads to an improvement of 40% in terms of success rate in densely vegetated regions where all other methods experienced difficulties in reaching the goal.
• A novel gait adaptation approach that selects the appropriate gait leading to increased stability (lower vibrations), and lower energy consumption while traversing challenging terrains. We highlight ProNav’s performance by integrating it with an exteroception-based navigation method for traversing through dense vegetation, and rocky and granular terrains.
II Related Works
In this section, we discuss the existing methods for estimating terrain traversability. Next, we analyze the existing navigation and planning techniques for legged robots.
II-A Perception for Navigation
Autonomous robot navigation in challenging environments requires robots to perceive the real world through their sensors. To this end, robots often incorporate onboard exteroceptive, and proprioceptive sensors. We briefly review the existing work on exteroceptive and proprioceptive perception in the following sub-sections.
II-A1 Exteroceptive Sensors
A popular approach is the use of geometry-based methods, which reconstruct a 3D representation of the environment using technologies such as LiDAR or stereo cameras [12]. Another approach, presented in [13], generates a 3D triangle mesh of the environment from a 3D point cloud, which is then input into an online path planner for local navigation. Recently, [20] proposed learning terrain traversability by training a sparse 3D network of occupancy maps. However, these geometry-based methods have limitations, including difficulties with deformable surfaces such as sand, obstacles like tall grass, and the risk of poor estimation [12, 2].
Concurrently, vision-based approaches have seen widespread application in robot perception [10, 11]. Previous work in semantic segmentation categorizes terrain into traversable and non-traversable classes. For instance, Guan et al. [8] leverage a multi-head vision transformer architecture to segment the terrain into six distinct categories. Traversability classification can also be performed using anomaly detection from multi-modal images [21]. Even though such vision-based systems perform well under ideal weather conditions, they often produce erroneous classifications under lighting changes [22].
Several studies have also explored the potential of sensor fusion for terrain classification [23, 24, 25]. Notably, in [23], geometric and vision-based techniques are used to deliver improved performance. In [25], reliability-aware sensor fusion is performed to mitigate the performance degradation due to cluttered sensing. Recently, [5] proposed VERN, which utilizes a lightweight Siamese network to classify complex outdoor vegetation based on traversability. The method in [26] employs IMU sensor data to learn surface traction, bumpiness, and deformability using an online self-supervised learning strategy. While this approach has shown promising results for a number of terrains, others like rocks and bushes, with irregular texture/structure, were not investigated.
II-A2 Proprioceptive sensors
In outdoor environments, exteroceptive sensors could receive noisy data because of factors such as degraded lighting conditions and occlusions. Also, the environment can be extreme and challenging. For instance, the ground could be covered by vegetation (e.g., short/tall grass, bushes) and the robot cannot recognize the terrain type using vision or LiDAR. To overcome such issues, there has been a continuous development in proprioceptive perception [27]. Moreover, proprioception can be coupled with vision in legged robots as in [14], where Fu et al. use the camera to create a cost map around the robot, while the terrain traversability is mainly evaluated based on proprioceptive feedback. That also helps in avoiding unexpected obstacles such as glass walls. [28] proposed a cross-modal algorithm that uses an RGB camera and shifted proprioception to learn a walking locomotion policy. More recently, Dey et al. [29] leverage the proprioceptive information from a legged robot’s joints to predict slip and fall events with high accuracy. However, the robot is operated in a limited number of terrains such as rubble and other uneven, underground terrains, and not in densely vegetated environments. Moreover, their proposed model primarily predicted slipping and tripping and it is not used for navigation. Our novel approach uses proprioceptive feedback and current consumption from the actuators to also detect entanglement in dense vegetation and recover the robot.
II-B Outdoor Navigation
Recently, many approaches have been proposed to leverage the agile mobility of legged robots [30, 31] in unstructured outdoor environments, which are challenging for wheeled robots [32]. Some of these works use cost maps to represent the traversability of the environments [33]. Semantic belief graphs are utilized in [34] to train a policy for trajectory generation in extreme environments. Moreover, a traversability uncertainty-based method is proposed in [35]. In [36], the authors presented a traversability estimator that uses a classifier (or a regressor) neural network based on elevation maps. ArtPlanner [31] is a navigation planner designed for the DARPA Subterranean Challenge that uses geometric reachability checking and a motion cost neural network to compute optimal paths. Proprioceptive feedback is also used in the literature [6, 14, 29]. In [6], Lee et al. utilized proprioceptive feedback to train a robot controller using reinforcement learning. Their approach shows zero-shot capabilities when tested in outdoor settings. However, an inherent limitation of proprioception is its inability to preview terrain features before the robot directly interacts with them. This limitation motivates the integration of ProNav with an exteroceptive-based navigation method, ensuring a more comprehensive navigation strategy.
III Background
In this section, we explain our assumptions, define the notation used, and state our problem formulation.
III-A Setup and Conventions
We assume a quadrupedal robot with 12 degrees of freedom (DOF), with 2-DOFs in the hip, and 1-DOF in the knee of each leg. We assign numbers 1, 2, 3, and 4 to denote the front-left, front-right, rear-left, and rear-right legs respectively, and to denote each leg. A robot coordinate frame is established at its center of mass with positive X, Y, Z pointing forward, left, and up respectively. Frames with similar conventions are established at each hip and knee joint. The hip has two actuators, one is moving along the X-axis direction and the other one along the Y-axis direction. Moreover, the knee actuator moves along the Z direction. We also measure the positions , , velocity , , and force exerted at a time instant . Several widely used legged robot platforms possess these specifications and capability to measure these parameters [37, 38, 39].
Position and velocity data at the joints are measured using encoder sensors, and the forces experienced are measured using the internal tactile sensing mechanism. Finally, we assume that the current drawn () from the robot’s battery can be measured using an ammeter or a current sensor while traversing various terrains. We define as the set of all positions (3), velocities (3), and forces (3) obtained from all four legs of the robot. Based on our setup and notation, we have formulated the state vector at a given time instant for our traversability estimation method as,
(1) |
III-B Problem Domain
The focus of our approach is to enhance the navigational capabilities of legged robots traversing a variety of terrains (e.g. densely vegetated, granular, rocky) by using proprioceptive feedback to adapt to changes in surface conditions. In these terrains, the robot’s legs could slip, trip, sink, or get entangled. A robot falling to the ground (which we define as a crash) could be caused by one of the following:
Poor Foothold: This causes the robot’s feet to slip in rocky or slippery terrains because the robot’s feet do not have a firm, flat surface to support themselves on.
Granularity: This causes the robot’s feet to sink into the terrain (e.g. sand, mud, snow) leading to erroneous measurements of joint positions. This could cause the robot’s controller to overcompensate to stabilize itself.
Resistance to Motion: This is typically caused by dense, pliable vegetation that can be passed through (e.g. tall grass and bushes). Additionally, the robot’s legs could get entangled with vegetation causing higher resistance to motion.
To traverse various terrains, we assume a legged robot with a locomotion model that can alternate between three gaits: trot, crawl, and amble [29, 30]. Trot is the standard walking gait where the robot walks with two of its feet on the ground at a time instance, allowing fast movements. It is stable on hard surfaces, with moderate power consumption. On the other hand, during crawl and amble, the robot has three of its feet on the ground at a time instance, leading to more stable behaviors in uneven, granular, deformable surfaces. Amble helps to traverse through environments with high resistance to motion while also maintaining stability, which also helps handle poor foothold terrains. Similarly, crawl maintains high stability in granular terrains and regions with poor footholds while consuming minimal power. The maximum velocities for each gait follows the trend , and the current consumption for each gait follows . Based on these definitions, our formulation can be stated as follows,
Formulation III.1.
To adaptively select a stable gait given collision-free, goal-directed velocities , by assessing a terrain’s traversability based on a set of proprioceptive signals from a legged robot to improve stability and prevent crashes.
IV ProNav: Proprioception-based Stable Navigation
In this section, we analyze and choose the relevant proprioceptive signals, process them to assess stability, and explain our gait adaptation strategy to stabilize the robot.
IV-A Analysis of Proprioceptive Signals
Our goal is to choose the fewest number of proprioceptive signals (i.e., the minimum subset at every time instant ) that are also excellent indicators of stability. Deducing the minimum subset helps reduce the input dimensionality of our approach, which in turn improves its real-time factor.
Hip’s Position: Our empirical analysis revealed a strong correlation between the amount of slip on a terrain and the change in hip position along the X-axis and Y-axis. Figure 3 visually represents these changes as the robot navigates three types of terrain, each representing a different level of traversability: dense vegetation, rocks, and concrete.
Knee’s force: Sudden peaks in the forces experienced by the robot’s knee actuators (Fig. 3) along the Z direction indicate an absence of stable footholds due to unevenness, causing the robot to exert more effort to stabilize itself.
Current Consumption: The amount of current consumed while traversing various terrains at a consistent elevation is proportional to the resistance to motion experienced in each terrain (Fig. 4). Also, the robot’s gait consistently impacts current consumption on different terrains as mentioned before.
IV-B Preprocessing Proprioceptive Signals
Our goal in preprocessing the chosen force and position data is to obtain quantities that change drastically on various terrains, thus indicating their properties. Vectors of the processed data are then analyzed using Principal Component Analysis (PCA).
IV-B1 Preprocessing Force Data
At any time instant , we consider the past samples of of the leg. That is, we consider the vector . Next, we obtain the mean force for the leg’s knee as , and then the mean force experienced by the robot as a whole as . Finally, we calculate the difference for each leg. As the robot walks on various terrains, indicates that the robot has entered a poorly traversable terrain which leads to high knee forces, and indicates a highly traversable terrain. To further amplify changes in traversability, we use , and a counter that denotes the number of spikes in the force experienced, defined as: if .
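The windowed knee-force statistics described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes the per-leg difference is the leg's windowed mean minus the robot-wide mean, and that a spike is counted whenever that difference exceeds a threshold. The function name, the `spike_threshold` parameter, and the sample values are hypothetical.

```python
from statistics import fmean


def knee_force_features(windows, spike_threshold):
    """Summarize recent knee-force samples (one window per leg).

    windows: dict mapping leg id (1-4) to a list of recent force samples.
    spike_threshold: hypothetical cutoff; the text does not state its value.
    """
    # Mean knee force of each leg over its window.
    leg_means = {leg: fmean(samples) for leg, samples in windows.items()}
    # Mean knee force experienced by the robot as a whole.
    robot_mean = fmean(list(leg_means.values()))
    # Assumed difference: how far each leg sits above the robot-wide mean.
    diffs = {leg: mean - robot_mean for leg, mean in leg_means.items()}
    # Count the legs whose difference exceeds the threshold (force spikes).
    spikes = sum(1 for d in diffs.values() if d > spike_threshold)
    return leg_means, robot_mean, diffs, spikes


# Hypothetical readings: the rear-right leg (4) sees much higher forces.
windows = {1: [10.0, 10.0], 2: [10.0, 10.0], 3: [10.0, 10.0], 4: [30.0, 30.0]}
leg_means, robot_mean, diffs, spikes = knee_force_features(windows, 10.0)
```

In this toy input, only leg 4 sits far above the robot-wide mean, so it is the only leg counted as a spike.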
IV-B2 Preprocessing Position Data
At time instant , we consider the past samples of . For each leg , we calculate the maximum and minimum values of these samples and finally calculate . represents the magnitude of variation in the hip positions along the X direction. Similarly, we obtain along the Y direction. A high value of or indicates the unavailability of stable footholds which leads to slippage (e.g. in rocky terrains), or the presence of a granular surface that leads to sinkage.
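The range computation described above (maximum minus minimum of the recent hip positions, per leg and per axis) can be sketched as follows. The function name, the dictionary layout, and the sample values are illustrative assumptions.

```python
def hip_position_ranges(hip_x, hip_y):
    """Return per-leg (max - min) of recent hip positions along X and Y.

    hip_x, hip_y: dicts mapping leg id (1-4) to a list of recent samples.
    """
    range_x = {leg: max(samples) - min(samples) for leg, samples in hip_x.items()}
    range_y = {leg: max(samples) - min(samples) for leg, samples in hip_y.items()}
    return range_x, range_y


# Hypothetical windows for two legs: leg 2 swings much more along X.
hip_x = {1: [0.0, 0.25, 0.125], 2: [0.0, 1.5, 0.5]}
hip_y = {1: [0.5, 0.5, 0.5], 2: [0.25, 0.75, 0.5]}
range_x, range_y = hip_position_ranges(hip_x, hip_y)
```

A large range on either axis would then flag slippage or sinkage, as the paragraph above explains.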
IV-B3 Processed Input Vector
We combine the processed quantities in knee forces and hip positions with the current drawn from the robot’s battery to construct the input vector to estimate terrain traversability as,
(2) |
All the quantities on the right in equation 2 are functions of . It is omitted for readability.
IV-C Terrain Traversability Estimation
To estimate a terrain’s traversability using our preprocessed proprioceptive signals, we first apply Principal Component Analysis (PCA) to reduce its 9 dimensions into two principal components as,
(3) |
Here, is a 2D point in the PCA space (Fig.5). PCA allows us to simplify and effectively compare different terrains based on these components. We chose to use two principal components because it yielded all the required information needed for traversability estimation. Using just one component was insufficient, and three components did not add new useful information in terms of visualizing distributions for each type of terrain as shown in Figure 6.
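As a rough illustration of the projection step, the sketch below fits the first two principal components with power iteration and deflation, using only the Python standard library. It is not the authors' implementation: the function names are hypothetical, and the toy data is 3-dimensional rather than the 9-dimensional input vector described above.

```python
import math


def covariance(rows):
    """Return (column means, sample covariance matrix) of a list of vectors."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    return means, cov


def top_eigenpair(mat, iters=500):
    """Dominant eigenvalue and eigenvector of a symmetric PSD matrix."""
    d = len(mat)
    v = [1.0 / (k + 1) for k in range(d)]  # fixed start keeps runs repeatable
    for _ in range(iters):
        w = [sum(mat[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    eigval = sum(v[a] * mat[a][b] * v[b] for a in range(d) for b in range(d))
    return eigval, v


def fit_pca_2d(rows):
    """Fit the first two principal components; returns (means, [pc1, pc2])."""
    means, cov = covariance(rows)
    d = len(cov)
    components = []
    for _ in range(2):
        eigval, vec = top_eigenpair(cov)
        components.append(vec)
        # Deflate: remove the found direction so the next pass finds the next one.
        cov = [[cov[a][b] - eigval * vec[a] * vec[b] for b in range(d)]
               for a in range(d)]
    return means, components


def project(row, means, components):
    """Map one processed input vector to a 2D point in PCA space."""
    centered = [x - m for x, m in zip(row, means)]
    return tuple(sum(c * x for c, x in zip(comp, centered)) for comp in components)


# Toy data: most variance along the first axis, then the second.
data = [[4, 0, 0], [-4, 0, 0], [0, 2, 0], [0, -2, 0], [0, 0, 0.5], [0, 0, -0.5]]
means, components = fit_pca_2d(data)
point_2d = project([4, 0, 0], means, components)
```

In practice a library routine (for example an eigendecomposition of the covariance matrix) would replace the hand-written power iteration; the standard-library version is used here only to keep the sketch dependency-free.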
Continuously plotting the PCA points corresponding to traversing a terrain with gait for a time period results in a distribution/cluster of points as shown in Fig. 5a. We obtain several key insights from our analysis: 1. The variance of the data along the two principal components differentiates stable (low variance/small cluster) and unstable (high variance/big cluster) terrains, 2. Terrain-gait pairs that have similar stability characteristics have similar clusters (e.g. concrete-trot and asphalt-trot), and 3. The position of the PCA points can also aid in predicting imminent crashes with noticeable shifts during the moments immediately before and after crashes (see Fig. 5b).
We extend this analysis to all three gaits on terrains with poor footholds, granularity, and high resistance to motion, and obtain a unique cluster of points for each terrain-gait pair. By fitting a 2D Gaussian to each cluster, we obtain a characteristic ellipse (see Fig. 7a) that forms the boundary of the cluster. Similar to our previous insights, the size/area of each ellipse denotes the robot’s stability on a terrain while using a certain gait.
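One way to turn a cluster of PCA points into a characteristic ellipse is sketched below. It assumes the ellipse is a fixed-coverage contour of the fitted 2D Gaussian; the 95% coverage level, the function names, and the sample points are assumptions, since the text does not state which contour is used.

```python
import math


def fit_ellipse(points, coverage=0.95):
    """Fit a 2D Gaussian to PCA points and return its coverage ellipse."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    # Chi-square quantile with 2 degrees of freedom has this closed form.
    scale = -2.0 * math.log(1.0 - coverage)
    det = sxx * syy - sxy * sxy
    # Area of the contour: pi * scale * sqrt(det(covariance)).
    area = math.pi * scale * math.sqrt(max(det, 0.0))
    return {"mean": (mx, my), "cov": (sxx, sxy, syy), "scale": scale, "area": area}


def contains(ellipse, point):
    """True if the point's squared Mahalanobis distance is within the contour."""
    mx, my = ellipse["mean"]
    sxx, sxy, syy = ellipse["cov"]
    det = sxx * syy - sxy * sxy
    dx, dy = point[0] - mx, point[1] - my
    d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return d2 <= ellipse["scale"]


# Hypothetical cluster of PCA points for one terrain-gait pair.
cluster = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
ellipse = fit_ellipse(cluster)
```

Under this reading, a smaller `area` corresponds to a tighter cluster and therefore to more stable walking for that terrain-gait pair.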
Since our objective is to maintain high stability in all terrain types, while also maintaining a fast progress towards the robot’s goal while navigating, we consider only the ellipses with the lowest area to maximum velocity of the gait ratio (Fig. 7b). We refer to them as high stability ellipses for each terrain type. Of these ellipses, we observe that trotting on stable, flat terrains such as concrete/asphalt creates the ellipse with the smallest area, and highest stability. We refer to this ellipse and its enclosing region as the Low Variance Zone (LVZ), highlighted in Fig. 7b. Ideally, the current PCA point indicating the robot’s stability should lie within the LVZ. However, on other challenging terrains, would most likely lie outside the LVZ. Next, we detail how and the other ellipses in Fig. 7b can be used to select a stabilizing gait when without any exteroceptive feedback.
IV-D Stable Gait Adaptation
A key insight from Fig. 7 is that when the appropriate gait for a terrain is chosen (e.g. crawl for granular terrains) at time , the point would be contained within the corresponding high stability ellipse (e.g. granular-crawl ellipse) as represented in Fig. 7b. Conversely, if an inappropriate gait is chosen, will not lie within the high stability ellipse for that terrain. A key challenge is selecting the appropriate gait without knowing the terrain type.
To determine the most appropriate stabilizing gait (with the corresponding ellipse ) for a terrain, we only consider the four ellipses and their enclosing areas in Fig. 7b, and refer to them as , as marked. We refer to their union as the Stable Zone (). Let us consider two points in subsequent time instances and . At time instant , if , and lies in the intersection of any of the other ellipses, we calculate the minimum area ellipse as . We set , and as the associated gait calculated as . If is the appropriate gait for the terrain, , and the robot can continue to execute the same gait, i.e., .
However, if is not the appropriate gait, then . This leads to two scenarios: , or . If , then we can compute the minimum area ellipse as before, . If , we select a high stability ellipse based on its distance from . That is, . In both cases, . We temporarily remove from consideration, as the gait corresponding to it caused to leave the stable zone at time . Intuitively, removing the ellipse allows the formulation to converge to the correct ellipse corresponding to the current terrain type, and its associated gait. The ellipse is added back to consideration for future gait calculations when a gait change is required. To summarize,
(4) |
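The selection rule described above can be sketched as follows, under simplifying assumptions: ellipses are axis-aligned, distance is measured to the ellipse center, and all ellipse parameters are made-up values rather than ones fitted from data. The function names are hypothetical.

```python
import math


def contains(ellipse, point):
    """True if the point lies inside the (axis-aligned) ellipse."""
    dx = (point[0] - ellipse["center"][0]) / ellipse["a"]
    dy = (point[1] - ellipse["center"][1]) / ellipse["b"]
    return dx * dx + dy * dy <= 1.0


def area(ellipse):
    return math.pi * ellipse["a"] * ellipse["b"]


def select_ellipse(point, ellipses, excluded=None):
    """Smallest containing ellipse, else the nearest one (by center distance)."""
    candidates = [e for e in ellipses if e is not excluded]
    inside = [e for e in candidates if contains(e, point)]
    if inside:
        return min(inside, key=area)
    return min(candidates, key=lambda e: math.dist(e["center"], point))


def update_ellipse(point, current, ellipses):
    """Keep the current ellipse while the point stays inside it; otherwise
    temporarily exclude it and pick a new one."""
    if contains(current, point):
        return current
    return select_ellipse(point, ellipses, excluded=current)


# Hypothetical high stability ellipses in PCA space.
LVZ = {"gait": "trot", "center": (0.0, 0.0), "a": 1.0, "b": 1.0}
GRANULAR = {"gait": "crawl", "center": (3.0, 0.0), "a": 2.5, "b": 1.5}
VEGETATION = {"gait": "amble", "center": (0.0, 4.0), "a": 2.5, "b": 2.0}
POOR_FOOTHOLD = {"gait": "amble", "center": (-4.0, 0.0), "a": 2.0, "b": 1.5}
ELLIPSES = [LVZ, GRANULAR, VEGETATION, POOR_FOOTHOLD]

# The point has drifted out of the LVZ into the granular ellipse.
new_gait = update_ellipse((2.5, 0.0), LVZ, ELLIPSES)["gait"]
```

Excluding the current ellipse only for the single call mirrors the text's temporary removal: the ellipse is available again the next time a gait change is evaluated.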
We also noted a significant pattern in the behavior of leading up to a crash. Specifically, for three seconds before a crash, there is a noticeable shift along the PC2 axis (Fig. 5b). Also, the data distribution preceding a crash exhibits high variances, indicating a lack of stability. Based on this observation, we adopted a preventative measure of halting the robot in these scenarios to mitigate the risk of crashes.
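A halting check in the spirit of this observation could look like the sketch below. It is only an illustration: the thresholds, the baseline PC2 value, the rule of requiring both a PC2 shift and a high spread, and the default window of 48 samples (three seconds at an assumed 16 Hz update rate) are all assumptions rather than values given in the text.

```python
from collections import deque
from statistics import fmean, pvariance


class CrashMonitor:
    """Flags an imminent crash from a sliding window of PCA points."""

    def __init__(self, baseline_pc2, shift_threshold, spread_threshold, window=48):
        self.baseline_pc2 = baseline_pc2          # typical PC2 value when stable
        self.shift_threshold = shift_threshold    # hypothetical PC2 shift limit
        self.spread_threshold = spread_threshold  # hypothetical variance limit
        self.points = deque(maxlen=window)

    def update(self, pc1, pc2):
        """Add the newest PCA point; return True if the robot should halt."""
        self.points.append((pc1, pc2))
        if len(self.points) < self.points.maxlen:
            return False  # not enough history yet
        pc1_values = [p[0] for p in self.points]
        pc2_values = [p[1] for p in self.points]
        # Shift of the recent points along PC2 relative to the stable baseline.
        shift = abs(fmean(pc2_values) - self.baseline_pc2)
        # Total variance of the recent points across both components.
        spread = pvariance(pc1_values) + pvariance(pc2_values)
        return shift > self.shift_threshold and spread > self.spread_threshold


# Hypothetical usage with a short window for illustration.
monitor = CrashMonitor(baseline_pc2=0.0, shift_threshold=1.0,
                       spread_threshold=0.5, window=4)
stable = [monitor.update(0.0, 0.0) for _ in range(4)]
unstable = [monitor.update(x, y) for x, y in [(0, 3), (2, 5), (-2, 4), (1, 6)]]
```

Here the monitor stays quiet while the points sit at the baseline and raises the halt flag once the window is filled with shifted, widely spread points.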
IV-E ProNav + Exteroception
Although proprioceptive modalities help to estimate the traversability of the terrain the robot is walking on, they lack the look-ahead capability that exteroceptive sensing affords whenever the terrain is visible. That is because the traversability of terrains ahead of the robot cannot be assessed using past and current proprioceptive data. Therefore, fusing exteroception and proprioception combines the strengths of both.
ProNav can be easily combined with any navigation method that uses exteroceptive sensing such as RGB images, 3D point clouds, etc. An overview of such a hybrid architecture is shown in Fig. 8. The collision-free, goal-directed velocities are extracted from an exteroception-based planner [40, 25, 5], and the gait evaluated to be the most stable for the current scenario by ProNav (equation 4) are used by the robot. ensures the robot’s safety in terms of avoiding solid obstacles, and ProNav ensures walking stability and low power consumption.
V Results and Analysis
This section outlines ProNav’s implementation, our chosen evaluation parameters, the varied test environments, and comparisons with other methods.
V-A Robot Setup and Dataset
Our approach is implemented on a Spot robot, a 12-degree-of-freedom (DOF) quadruped from Boston Dynamics. The robot provides joint feedback from its 12 actuators and the battery current data during its operation. Our data collection was carried out at the University of Maryland College Park campus, on different terrains including concrete, asphalt, grass, rocks, sand, bushes, mulch, etc. The resulting dataset represents approximately 9 hours of operation, during which the joint feedback and battery current were continuously recorded. ProNav runs at 16 Hz on an Intel NUC edge computer equipped with an Intel i7 CPU, and an Nvidia RTX2060 GPU.
V-B Comparison Methods
We combine ProNav with VERN [5] and compare its performance with exteroceptive and proprioceptive navigation techniques:
• Spot’s in-built planner: Uses RGB-D cameras to detect and avoid obstacles. It also automatically adapts the robot’s leg raise heights based on the terrain.
• VERN [5]: Uses RGB images and 3D point clouds to differentiate pliable vegetation from solid obstacles and traverse vegetated environments.
• GA-Nav [8]: Uses RGB images to classify the terrain type through semantic segmentation.
• Random Forest Classifier (RFC) [9]: Uses the input vector of proprioceptive signals to classify the terrain type. We used the dataset described in Section V-A to train the classifier to distinguish between four terrain types (granular, poor foothold, high resistance, and stable). RFC is also combined with VERN to generate goal-directed velocities .
We use GA-Nav [8] and RFC [9] to classify the terrains based on RGB images and proprioception, respectively. We then choose the gait as follows: trot if the robot’s trajectory is on stable terrains (concrete, asphalt), crawl for granular terrains (sand), and amble for terrains with poor footholds and high resistance to motion (forest, dense vegetation). We also perform ablation studies by removing the joint positions and current components from the input to our PCA-based system.
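The rule-based gait choice used for the GA-Nav and RFC baselines amounts to a lookup from terrain class to gait, as in the sketch below. The class label strings and the function name are hypothetical; the mapping itself follows the rule stated above.

```python
# Terrain class -> gait, following the rule used for the baselines.
GAIT_FOR_TERRAIN = {
    "stable": "trot",            # concrete, asphalt
    "granular": "crawl",         # sand
    "poor foothold": "amble",    # e.g. forest floor
    "high resistance": "amble",  # dense vegetation
}


def baseline_gait(terrain_class):
    """Return the gait a classifier-based baseline would select."""
    return GAIT_FOR_TERRAIN[terrain_class]


gait = baseline_gait("granular")
```

Unlike ProNav's ellipse-based selection, this lookup depends entirely on the classifier output, so a misclassified terrain directly yields an inappropriate gait.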
V-C Evaluation Metrics
• Success Rate: The ratio of successful navigation trials where the robot was able to reach its goal without freezing or colliding with obstacles.
• Mean Power Consumption: The amount of power (in watts) consumed from the robot’s battery averaged over all trials.
• Mean Velocity: The robot’s average velocity along its trajectory as it traverses various surfaces.
• Time to Goal: The robot’s average time to reach its goal in the successful trials.
• Vibration Cost: The cumulative sum of deviations in hip joint positions from a stable reference, measured at each time instant . Deviations for each hip joint in the x- and y-axes are calculated as follows:
Here, is the position of joint at time , and represents reference positions from a stable terrain (for this work concrete is considered a stable terrain). The total Vibration Cost at time is then computed as .
• IMU Energy Density: The mean of the aggregated squared acceleration values measured by the IMU sensors across the x, y, and z axes, calculated over the successful trials. The relevant equations implemented are adopted from [41]:
(5) (6) where represents one of the three acceleration signals (x, y, and z axes), and is the IMU readings along the trajectory.
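The two stability metrics above can be read as the following sketch. It is one plausible interpretation of the prose definitions, not the exact equations used by the authors: deviations are taken as absolute differences from the reference, and the energy density is taken as the mean over the three axes of the summed squared acceleration. All names and sample values are hypothetical.

```python
def vibration_cost(hip_positions, reference):
    """Cumulative absolute deviation of hip positions from a stable reference.

    hip_positions: list over time of per-leg (x, y) hip positions.
    reference: per-leg (x, y) positions recorded on a stable terrain.
    """
    total = 0.0
    for sample in hip_positions:
        for (x, y), (ref_x, ref_y) in zip(sample, reference):
            total += abs(x - ref_x) + abs(y - ref_y)
    return total


def imu_energy_density(accel_x, accel_y, accel_z):
    """Mean over the three axes of the summed squared acceleration."""
    energies = [sum(a * a for a in axis) for axis in (accel_x, accel_y, accel_z)]
    return sum(energies) / len(energies)


# Hypothetical two-leg, two-sample example.
reference = [(1, 2), (1, 2)]
trajectory = [[(1, 2), (1, 2)], [(3, 2), (1, 0)]]
cost = vibration_cost(trajectory, reference)
energy = imu_energy_density([1, 2], [0, 0], [3, 0])
```

Both quantities grow with shakier motion, which is why lower values are read as more stable walking in the comparison.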
V-D Test Scenarios
• Scenario 1: Granular terrains (small rocks and sand).
• Scenario 2: Concrete, rocks, and vegetation.
• Scenario 3: Sparse tall grass, fallen logs, and trees.
• Scenario 4: Dense vegetation and bushes.
Metrics | Method | Scen. 1 | Scen. 2 | Scen. 3 | Scen. 4 |
Success Rate (%) | In-built system | 30 | 20 | 10 | - |
Success Rate (%) | GA-Nav [8] | 80 | 50 | 50 | 30 |
Success Rate (%) | VERN [5] | 70 | 60 | 40 | 20 |
Success Rate (%) | RFC [9] | 80 | 70 | 70 | 50 |
Success Rate (%) | w/o current+VERN | 100 | 80 | 60 | 70 |
Success Rate (%) | w/o position+VERN | 90 | 70 | 80 | 60 |
Success Rate (%) | ProNav+VERN | 100 | 90 | 90 | 70 |
Mean Power Consumption (watts) | In-built system | 503 | 462 | 542 | - |
Mean Power Consumption (watts) | GA-Nav [8] | 384 | 373 | 374 | 442 |
Mean Power Consumption (watts) | VERN [5] | 462 | 372 | 362 | 450 |
Mean Power Consumption (watts) | RFC [9] | 365 | 380 | 369 | 419 |
Mean Power Consumption (watts) | w/o current+VERN | 371 | 356 | 357 | 398 |
Mean Power Consumption (watts) | w/o position+VERN | 388 | 361 | 370 | 403 |
Mean Power Consumption (watts) | ProNav+VERN | 379 | 349 | 353 | 375 |
Mean Velocity (m/s) | In-built system | 0.43 | 0.35 | 0.33 | - |
Mean Velocity (m/s) | GA-Nav [8] | 0.29 | 0.43 | 0.33 | 0.35 |
Mean Velocity (m/s) | VERN [5] | 0.42 | 0.47 | 0.34 | 0.34 |
Mean Velocity (m/s) | RFC [9] | 0.30 | 0.39 | 0.35 | 0.33 |
Mean Velocity (m/s) | w/o current+VERN | 0.32 | 0.45 | 0.36 | 0.38 |
Mean Velocity (m/s) | w/o position+VERN | 0.24 | 0.49 | 0.32 | 0.36 |
Mean Velocity (m/s) | ProNav+VERN | 0.27 | 0.38 | 0.31 | 0.32 |
Time to Goal (seconds) | In-built system | 20.13 | 25.11 | 25.17 | - |
Time to Goal (seconds) | GA-Nav [8] | 24.85 | 16.19 | 25.01 | 26.23 |
Time to Goal (seconds) | VERN [5] | 19.04 | 16.31 | 24.18 | 27.14 |
Time to Goal (seconds) | RFC [9] | 24.31 | 17.40 | 24.92 | 25.83 |
Time to Goal (seconds) | w/o current+VERN | 26.82 | 18.45 | 23.42 | 22.78 |
Time to Goal (seconds) | w/o position+VERN | 34.45 | 15.94 | 25.31 | 25.39 |
Time to Goal (seconds) | ProNav+VERN | 25.68 | 21.02 | 27.73 | 25.97 |
Vibration Cost | In-built system | 66.3 | 53.2 | 50.6 | - |
Vibration Cost | GA-Nav [8] | 19.6 | 14.9 | 29.5 | 30.2 |
Vibration Cost | VERN [5] | 23.4 | 25.5 | 27.8 | 33.1 |
Vibration Cost | RFC [9] | 18.1 | 16.3 | 25.6 | 29.4 |
Vibration Cost | w/o current+VERN | 22.7 | 12.6 | 64.0 | 15.4 |
Vibration Cost | w/o position+VERN | 39.4 | 3.1 | 23.7 | 12.8 |
Vibration Cost | ProNav+VERN | 16.9 | 4.4 | 10.7 | 7.1 |
IMU Energy Density | In-built system | 55283 | 32367 | 46835 | - |
IMU Energy Density | GA-Nav [8] | 41175 | 23919 | 19307 | 28366 |
IMU Energy Density | VERN [5] | 51374 | 26950 | 17052 | 33948 |
IMU Energy Density | RFC [9] | 38957 | 25314 | 16329 | 30424 |
IMU Energy Density | w/o current+VERN | 28075 | 16834 | 24499 | 24005 |
IMU Energy Density | w/o position+VERN | 25378 | 15223 | 13207 | 18750 |
IMU Energy Density | ProNav+VERN | 23503 | 12388 | 14274 | 17186 |
V-E Analysis and Discussion
In this section, we evaluate qualitatively and quantitatively the performance of our method and compare it with other methods. Figure 9 provides a visual representation of the trajectories in different terrains. Our method showcases its superiority in navigating through dense vegetation (Fig. 1), granular (Fig. 9a), rocky (Fig. 9b), and unstructured forested terrains (Fig. 9c). Notably, our method adapts by choosing crawl on sand (scenario 1) and amble in other scenarios whenever poor footholds and resistance to motion dominate, and . RFC exhibited effective terrain analysis using proprioception in the granular scenario. However, the classifier’s performance declined in the other scenarios due to their unstructured nature. This complexity led to either frequent gait changes, or no gait change resulting in instability, or failure often caused by entanglements in vegetation. Spot’s built-in system faces challenges in vegetation-rich scenarios (2, 3 and 4) due to its default trot gait leading to leg entanglement, as reflected in its lower success rates. It also exhibits unstable behavior when the ground contains branches, and rocks of various sizes, as it considers them as obstacles that should be circumvented. VERN also encounters failure instances due to leg entanglements, particularly in scenario 4 with denser vegetation. GA-Nav, similar to RFC shows efficiency in open and uncovered terrains like in scenario 1, but in vegetated scenarios (3 and 4), often struggles to accurately identify the correct terrain type, primarily due to motion blur caused by entanglements. Also, strong lighting (Scenario 3), and low lighting (Scenario 4) drastically affect VERN’s and GA-Nav’s performance. This leads to either frequent gait changes, or changing to an inappropriate gait (e.g. crawl in dense vegetation which causes further leg entanglement). 
Conversely, our method achieves the highest success rate in all scenarios through appropriate gait adaptation, without requiring visual feedback. ProNav halts in extreme cases to prevent imminent crashes, particularly in the dense vegetation scenarios (3 and 4). Compared to the second-best method, our method improves the success rate by 25%, 28.57%, 28.57%, and 40% in scenarios 1, 2, 3, and 4, respectively.
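We compute the success rate improvement using the following equation:
$$\text{Improvement} = \frac{SR_{\text{ours}} - SR_{\text{second}}}{SR_{\text{second}}} \times 100\% \tag{7}$$
where $SR_{\text{ours}}$ and $SR_{\text{second}}$ represent the success rates of our method and the second-best method, respectively.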
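As a quick check of the arithmetic behind such percentages, the sketch below computes a relative improvement, assuming the improvement is measured against the second-best method’s success rate. The function name and the input values are hypothetical and chosen only for illustration; they are not the success rates reported in our tables.

```python
def success_rate_improvement(sr_ours: float, sr_second_best: float) -> float:
    """Relative improvement (in percent) of our success rate over the second-best.

    Assumption: the improvement is normalized by the second-best success rate.
    """
    if sr_second_best <= 0:
        raise ValueError("second-best success rate must be positive")
    return (sr_ours - sr_second_best) / sr_second_best * 100.0


if __name__ == "__main__":
    # Hypothetical (ours, second-best) success rates in percent, for illustration only.
    for ours, second in [(100.0, 80.0), (90.0, 70.0), (70.0, 50.0)]:
        gain = success_rate_improvement(ours, second)
        print(f"ours={ours:.0f}%, second-best={second:.0f}% -> improvement={gain:.2f}%")
```

Note that an improvement computed this way is relative, not an absolute difference in percentage points: a rise from 50% to 70% is a 20-point gain but a 40% relative improvement.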
We note that our approach consistently yields the lowest power consumption in all evaluated terrains. This efficiency results from its ability to assess stability and select an appropriate gait. VERN, while comparable in certain scenarios, has increased power consumption in the fourth scenario because its default trot gait leads to more entanglements and consequent motion resistance. Likewise, GA-Nav exhibits increased power consumption in scenario 4, primarily due to its multiple gait changes. Moreover, our method consistently records the lowest vibration levels (in terms of the vibration cost and the IMU energy density metrics). Conversely, the frequent gait changes exhibited by RFC lead to increased vibration costs when traversing dense vegetation. In scenario 4, RFC and GA-Nav show high vibration due to entanglements and gait alternations. Spot’s built-in system experiences the highest vibration costs due to sinkage (scenario 1), slippage (scenario 2), and motion resistance (scenarios 3 and 4). In the mean velocity metric, ProNav shows a reduced pace, particularly during the gait switch to crawl in scenario 1. ProNav’s time to goal is comparable to that of the other methods, except in scenario 1, where the exteroceptive-based methods used a faster, yet high-vibration, gait.
Ablation Study on Proprioceptive Signals: Our ablation analysis focused on two of ProNav’s proprioceptive signals: the current drawn from the battery and the hip joints’ positions. In our evaluations (Table I), omitting the battery current resulted in delayed or incorrect traversability estimations, which notably affected power consumption and vibration costs, especially in the dense vegetation of scenario 4. Removing the hip joints’ positions also hindered performance, but to a lesser extent. Neither ablated configuration exceeded the performance of the full ProNav system. We did not remove the knee force in our ablation study, since a PCA cluster could not be formed without it, which would prevent a meaningful comparison.
Table II compares navigation using a single stable gait (crawl or amble) with ProNav and its adaptive gait adjustment. We observe that the crawl gait has lower power consumption and vibration levels than the amble gait. However, its application in dense vegetation presents challenges: the robot moves slowly, which leads to its legs getting entangled with the vegetation. For instance, in scenario 4, we note elevated power consumption and vibration levels alongside a significantly lower velocity. In contrast, the amble gait consistently achieves superior velocities and reaches the goal quickly relative to crawl and ProNav. It also has a high mean power consumption: the robot exerts more torque, which reduces the risk of entanglement and consequently lowers the vibration cost. ProNav, on the other hand, provides the best trade-off between average power consumption, vibration cost, and mean velocity.
Metrics | Method | Scen. 1 | Scen. 2 | Scen. 3 | Scen. 4 |
Mean Power Consumption (watts) | Crawl | 382 | 358 | 374 | 488 |
Mean Power Consumption (watts) | Amble | 443 | 370 | 421 | 427 |
Mean Power Consumption (watts) | ProNav | 379 | 349 | 353 | 375 |
Mean Velocity (m/s) | Crawl | 0.29 | 0.24 | 0.21 | 0.17 |
Mean Velocity (m/s) | Amble | 0.33 | 0.56 | 0.28 | 0.54 |
Mean Velocity (m/s) | ProNav | 0.27 | 0.38 | 0.31 | 0.32 |
Time to Goal (seconds) | Crawl | 29.54 | 33.76 | 54.21 | 68.2 |
Time to Goal (seconds) | Amble | 24.05 | 14.91 | 18.75 | 19.90 |
Time to Goal (seconds) | ProNav | 15.2 | 2.7 | 8.1 | 4.8 |
Vibration cost | Crawl | 18.3 | 34.4 | 46.2 | 37.8 |
Vibration cost | Amble | 28.4 | 22.9 | 5.7 | 6.1 |
Vibration cost | ProNav | 16.9 | 4.4 | 10.7 | 7.1 |
IMU Energy Density | Crawl | 18078 | 25194 | 22359 | 37405 |
IMU Energy Density | Amble | 34901 | 18846 | 7892 | 15984 |
IMU Energy Density | ProNav | 23503 | 10388 | 14274 | 17186 |
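To make the power comparison in Table II easier to read, the following minimal sketch computes how much lower ProNav’s mean power consumption is than that of each fixed gait, per scenario. The power values are copied from Table II; the dictionary layout and the function names are hypothetical helpers introduced only for this illustration.

```python
# Mean power consumption (watts) for scenarios 1-4, copied from Table II.
MEAN_POWER_W = {
    "Crawl": [382, 358, 374, 488],
    "Amble": [443, 370, 421, 427],
    "ProNav": [379, 349, 353, 375],
}


def percent_reduction(baseline: float, value: float) -> float:
    """Reduction of `value` relative to `baseline`, in percent."""
    return (baseline - value) / baseline * 100.0


def pronav_power_savings(table: dict) -> dict:
    """Per-scenario power reduction of ProNav versus each fixed gait (rounded to 0.1)."""
    savings = {}
    for gait in ("Crawl", "Amble"):
        savings[gait] = [
            round(percent_reduction(base, ours), 1)
            for base, ours in zip(table[gait], table["ProNav"])
        ]
    return savings


if __name__ == "__main__":
    for gait, values in pronav_power_savings(MEAN_POWER_W).items():
        print(f"ProNav vs {gait} (% lower mean power, scenarios 1-4): {values}")
```

With the Table II values, the largest reduction relative to crawl is in scenario 4 (about 23.2%), and the largest relative to amble is in scenario 3 (about 16.2%).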
VI Conclusion, Limitations & Future work
We present ProNav, a new method that uses proprioceptive data to evaluate a terrain’s traversability in real time for legged robots. Our method optimizes the robot’s gait selection for improved stability and reduced energy consumption. In addition, its crash prediction system supports safer and more efficient navigation. We also combined ProNav with an exteroceptive-based navigation method, which improved its performance. We validated our method in different outdoor environments and provided a detailed comparison with other navigation methods.
However, ProNav has some limitations. It can only assess the stability of the terrain the robot is currently on. This could lead to failures and crashes in extreme environments. To solve this, we are considering adding other sensor modalities (e.g. RGB, thermal, or hyperspectral images) that can provide meaningful lookahead for the robot. Our gait adaptation alternates between the existing gaits on our hardware platform as custom gaits cannot be executed on it. In the future, we would like to create and utilize custom gaits for stabilization on an open hardware platform. We would also like to investigate techniques to improve crash prevention, adapting our approach to more diverse environments and situations where halting is insufficient to prevent a crash.
References
- [1] Z. Chen, T. Fan, X. Zhao, J. Liang, C. Shen, H. Chen, D. Manocha, J. Pan, and W. Zhang, “Autonomous social distancing in urban environments using a quadruped robot,” IEEE Access, vol. 9, pp. 8392–8403, 2021.
- [2] S. B. Goldberg, M. W. Maimone, and L. Matthies, “Stereo vision and rover navigation software for planetary exploration,” in Proceedings, IEEE aerospace conference, vol. 5. IEEE, 2002, pp. 5–5.
- [3] C. D. Bellicoso, M. Bjelonic, L. Wellhausen, K. Holtmann, F. Günther, M. Tranzatto, P. Fankhauser, and M. Hutter, “Advances in real-world applications for legged robots,” Journal of Field Robotics, vol. 35, no. 8, pp. 1311–1326, 2018.
- [4] E. Tennakoon, T. Peynot, J. Roberts, and N. Kottege, “Probe-before-step walking strategy for multi-legged robots on terrain with risk of collapse,” in 2020 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2020, pp. 5530–5536.
- [5] A. J. Sathyamoorthy, K. Weerakoon, T. Guan, M. Russell, D. Conover, J. Pusey, and D. Manocha, “Vern: Vegetation-aware robot navigation in dense unstructured outdoor environments,” arXiv preprint arXiv:2303.14502, 2023.
- [6] J. Lee, J. Hwangbo, L. Wellhausen, V. Koltun, and M. Hutter, “Learning quadrupedal locomotion over challenging terrain,” Science robotics, vol. 5, no. 47, p. eabc5986, 2020.
- [7] H. Kolvenbach, C. Bärtschi, L. Wellhausen, R. Grandia, and M. Hutter, “Haptic inspection of planetary soils with legged robots,” IEEE Robotics and Automation Letters, vol. 4, no. 2, pp. 1626–1632, 2019.
- [8] T. Guan, D. Kothandaraman, R. Chandra, A. J. Sathyamoorthy, K. Weerakoon, and D. Manocha, “Ga-nav: Efficient terrain segmentation for robot navigation in unstructured outdoor environments,” IEEE Robotics and Automation Letters, vol. 7, no. 3, pp. 8138–8145, 2022.
- [9] C. Kertész, “Rigidity-based surface recognition for a domestic legged robot,” IEEE Robotics and automation letters, vol. 1, no. 1, pp. 309–315, 2016.
- [10] S. Fahmi, V. Barasuol, D. Esteban, O. Villarreal, and C. Semini, “Vital: Vision-based terrain-aware locomotion for legged robots,” IEEE Transactions on Robotics, 2022.
- [11] A. Agarwal, A. Kumar, J. Malik, and D. Pathak, “Legged locomotion in challenging terrains using egocentric vision,” in Conference on Robot Learning. PMLR, 2023, pp. 403–415.
- [12] D. B. Gennery, “Traversability analysis and path planning for a planetary rover,” Autonomous Robots, vol. 6, pp. 131–146, 1999.
- [13] S. Pütz, T. Wiemann, J. Sprickerhof, and J. Hertzberg, “3d navigation mesh generation for path planning in uneven terrain,” IFAC-PapersOnLine, vol. 49, no. 15, pp. 212–217, 2016.
- [14] Z. Fu, A. Kumar, A. Agarwal, H. Qi, J. Malik, and D. Pathak, “Coupling vision and proprioception for navigation of legged robots,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 17273–17283.
- [15] T. Homberger, L. Wellhausen, P. Fankhauser, and M. Hutter, “Support surface estimation for legged robots,” in 2019 International Conference on Robotics and Automation (ICRA). IEEE, 2019, pp. 8470–8476.
- [16] A. H. Al-dabbagh and R. Ronsse, “A review of terrain detection systems for applications in locomotion assistance,” Robotics and Autonomous Systems, vol. 133, p. 103628, 2020.
- [17] J. Carius, R. Ranftl, V. Koltun, and M. Hutter, “Trajectory optimization for legged robots with slipping motions,” IEEE Robotics and Automation Letters, vol. 4, no. 3, pp. 3013–3020, 2019.
- [18] S. Teng, M. W. Mueller, and K. Sreenath, “Legged robot state estimation in slippery environments using invariant extended kalman filter with velocity update,” in 2021 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2021, pp. 3104–3110.
- [19] D. W. Haldane, P. Fankhauser, R. Siegwart, and R. S. Fearing, “Detection of slippery terrain with a heterogeneous team of legged robots,” in 2014 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2014, pp. 4576–4581.
- [20] J. Frey, D. Hoeller, S. Khattak, and M. Hutter, “Locomotion policy guided traversability learning using volumetric representations of complex environments,” in 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2022, pp. 5722–5729.
- [21] L. Wellhausen, R. Ranftl, and M. Hutter, “Safe robot navigation via multi-modal anomaly detection,” IEEE Robotics and Automation Letters, vol. 5, no. 2, pp. 1326–1333, 2020.
- [22] M. Aladem, S. Baek, and S. A. Rawashdeh, “Evaluation of image enhancement techniques for vision-based navigation under low illumination,” Journal of Robotics, vol. 2019, 2019.
- [23] F. Schilling, X. Chen, J. Folkesson, and P. Jensfelt, “Geometric and visual terrain classification for autonomous mobile navigation,” in 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2017, pp. 2678–2684.
- [24] D. Wisth, M. Camurri, and M. Fallon, “Vilens: Visual, inertial, lidar, and leg odometry for all-terrain legged robots,” IEEE Transactions on Robotics, 2022.
- [25] K. Weerakoon, A. J. Sathyamoorthy, J. Liang, T. Guan, U. Patel, and D. Manocha, “Graspe: Graph based multimodal fusion for robot navigation in unstructured outdoor environments,” arXiv preprint arXiv:2209.05722, 2022.
- [26] A. J. Sathyamoorthy, K. Weerakoon, T. Guan, J. Liang, and D. Manocha, “Terrapn: Unstructured terrain navigation using online self-supervised learning,” in 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2022, pp. 7197–7204.
- [27] T. Miki, J. Lee, J. Hwangbo, L. Wellhausen, V. Koltun, and M. Hutter, “Learning robust perceptive locomotion for quadrupedal robots in the wild,” Science Robotics, vol. 7, no. 62, p. eabk2822, 2022.
- [28] A. Loquercio, A. Kumar, and J. Malik, “Learning visual locomotion with cross-modal supervision,” in 2023 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2023, pp. 7295–7302.
- [29] S. Dey, D. Fan, R. Schmid, A. Dixit, K. Otsu, T. Touma, A. F. Schilling, and A.-A. Agha-Mohammadi, “Prepare: Predictive proprioception for agile failure event detection in robotic exploration of extreme terrains,” in 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2022, pp. 4338–4343.
- [30] J. Truong, A. Zitkovich, S. Chernova, D. Batra, T. Zhang, J. Tan, and W. Yu, “Indoorsim-to-outdoorreal: Learning to navigate outdoors without any outdoor experience,” arXiv preprint arXiv:2305.01098, 2023.
- [31] L. Wellhausen and M. Hutter, “Artplanner: Robust legged robot navigation in the field,” arXiv preprint arXiv:2303.01420, 2023.
- [32] P. Biswal and P. K. Mohanty, “Development of quadruped walking robots: A review,” Ain Shams Engineering Journal, vol. 12, no. 2, pp. 2017–2031, 2021.
- [33] T. Overbye and S. Saripalli, “Path optimization for ground vehicles in off-road terrain,” in 2021 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2021, pp. 7708–7714.
- [34] M. F. Ginting, S.-K. Kim, O. Peltzer, J. Ott, S. Jung, M. J. Kochenderfer, and A.-a. Agha-mohammadi, “Safe and efficient navigation in extreme environments using semantic belief graphs,” arXiv preprint arXiv:2304.00645, 2023.
- [35] J. Guzzi, R. O. Chavez-Garcia, L. M. Gambardella, and A. Giusti, “On the impact of uncertainty for path planning,” in 2019 International Conference on Robotics and Automation (ICRA). IEEE, 2019, pp. 5929–5935.
- [36] R. O. C. García, M. A. Estrada, M. Ebrahimi, F. Zuppichini, L. M. Gambardella, A. Giusti, and A. J. Ijspeert, “Gait-dependent traversability estimation on the k-rock2 robot,” in 2022 26th International Conference on Pattern Recognition (ICPR). IEEE, 2022, pp. 4204–4210.
- [37] M. Hutter, C. Gehring, D. Jud, A. Lauber, C. D. Bellicoso, V. Tsounis, J. Hwangbo, K. Bodie, P. Fankhauser, M. Bloesch, R. Diethelm, S. Bachmann, A. Melzer, and M. Hoepflinger, “Anymal - a highly mobile and dynamic quadrupedal robot,” in 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2016, pp. 38–44.
- [38] “Ghost vision 60,” http://www.stonexperu.com/pdf/GR%20Vision%2060-P%20Quad%20UGV-%20Full%20Spec%20rev4.0_STN.pdf, Ghost Robotics Corp., [Online; accessed 17-January-2024].
- [39] “About spot,” https://dev.bostondynamics.com/docs/concepts/about_spot, Boston Dynamics, [Online; accessed 17-January-2024].
- [40] D. Fox, W. Burgard, and S. Thrun, “The dynamic window approach to collision avoidance,” IEEE Robotics & Automation Magazine, vol. 4, no. 1, pp. 23–33, 1997.
- [41] P. Try and M. Gebhard, “A vibration sensing device using a six-axis imu and an optimized beam structure for activity monitoring,” Sensors, vol. 23, no. 19, p. 8045, 2023.