Improved reward estimation for efficient robot navigation using inverse reinforcement learning

Olimpiya Saha, Prithviraj Dasgupta

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

Robot navigation is a central problem in extraterrestrial environments, and a suitable navigation algorithm that allows the robot to quickly yet precisely avoid initially unknown obstacles is important for efficient navigation. In this paper, we consider a well-known machine learning framework called reinforcement learning for robot navigation and investigate a technique for adaptively adjusting the rewards associated with robot maneuvers or actions within this framework. Most reinforcement learning techniques rely on hand-coded, simplistic reward functions that might not determine the most appropriate actions for the robot when it is required to perform tasks with new features. To address this problem, we propose an algorithm called IRL-SMDPT (Inverse Reinforcement Learning in Semi-Markov Decision Processes with Transfer), which uses an inverse reinforcement learning technique called Distance Minimization Inverse Reinforcement Learning (DM-IRL) to estimate an appropriate reward function and thereby improve a robot's navigation in complicated environments. Our experimental results show that IRL-SMDPT improves robot navigation by estimating the rewards of trajectories more accurately than random and greedy reward variants, and that it is robust against small errors or noise in trajectory scores.
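The core idea behind score-based inverse reinforcement learning of the kind the abstract describes can be sketched as follows: each trajectory is summarized by its accumulated feature counts, a scorer assigns the trajectory a scalar score, and a linear reward is fitted so that predicted trajectory returns match those scores. This is a minimal illustrative sketch of that general idea, not the paper's actual IRL-SMDPT or DM-IRL implementation; the feature names and least-squares fit are assumptions made for the example.

```python
import numpy as np

def estimate_reward_weights(feature_counts, scores):
    """Least-squares fit of linear reward weights to trajectory scores.

    feature_counts: (n_trajectories, n_features) array, each row the
        summed feature vector of one trajectory.
    scores: (n_trajectories,) array of scalar trajectory scores.
    """
    w, *_ = np.linalg.lstsq(feature_counts, scores, rcond=None)
    return w

def trajectory_return(w, feature_count):
    # Predicted return of a trajectory under the estimated linear reward.
    return float(feature_count @ w)

# Toy example with two hypothetical features:
# column 0 = distance travelled, column 1 = obstacle proximity.
phi = np.array([[10.0, 1.0],   # short, fairly safe trajectory
                [12.0, 5.0],   # longer, risky trajectory
                [ 8.0, 0.0]])  # shortest, safest trajectory
scores = np.array([8.0, 2.0, 10.0])  # higher score = better trajectory

w = estimate_reward_weights(phi, scores)
# The fitted weight on obstacle proximity comes out negative, so the
# estimated reward ranks the safest trajectory above the risky one.
```

The point of the example is only that scored trajectories plus a feature representation suffice to recover a reward function that reproduces the scorer's preferences; the paper's method additionally handles semi-Markov decision processes and transfer across environments.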

Original language: English (US)
Title of host publication: 2017 NASA/ESA Conference on Adaptive Hardware and Systems, AHS 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 245-252
Number of pages: 8
ISBN (Electronic): 9781538634394
DOI: 10.1109/AHS.2017.8046385
State: Published - Sep 19, 2017
Event: 2017 NASA/ESA Conference on Adaptive Hardware and Systems, AHS 2017 - Pasadena, United States
Duration: Jul 24, 2017 - Jul 27, 2017

Publication series

Name: 2017 NASA/ESA Conference on Adaptive Hardware and Systems, AHS 2017

Other

Other: 2017 NASA/ESA Conference on Adaptive Hardware and Systems, AHS 2017
Country: United States
City: Pasadena
Period: 7/24/17 - 7/27/17


ASJC Scopus subject areas

  • Computer Networks and Communications
  • Hardware and Architecture
  • Aerospace Engineering
  • Electrical and Electronic Engineering
  • Instrumentation

Cite this

Saha, O., & Dasgupta, P. (2017). Improved reward estimation for efficient robot navigation using inverse reinforcement learning. In 2017 NASA/ESA Conference on Adaptive Hardware and Systems, AHS 2017 (pp. 245-252). [8046385] (2017 NASA/ESA Conference on Adaptive Hardware and Systems, AHS 2017). Institute of Electrical and Electronics Engineers Inc.. https://doi.org/10.1109/AHS.2017.8046385

