Showing 1–5 of 5 results for author: Shelke, O

  1. arXiv:2311.16171

    cs.AI cs.LG cs.MA

    Multi-Agent Learning of Efficient Fulfilment and Routing Strategies in E-Commerce

    Authors: Omkar Shelke, Pranavi Pathakota, Anandsingh Chauhan, Harshad Khadilkar, Hardik Meisheri, Balaraman Ravindran

    Abstract: This paper presents an integrated algorithmic framework for minimising product delivery costs in e-commerce (known as the cost-to-serve or C2S). One of the major challenges in e-commerce is the large volume of spatio-temporally diverse orders from multiple customers, each of which has to be fulfilled from one of several warehouses using a fleet of vehicles. This results in two levels of decision-m…

    Submitted 20 November, 2023; originally announced November 2023.

  2. arXiv:2311.02125

    cs.LG cs.AI math.OC

    Using General Value Functions to Learn Domain-Backed Inventory Management Policies

    Authors: Durgesh Kalwar, Omkar Shelke, Harshad Khadilkar

    Abstract: We consider the inventory management problem, where the goal is to balance conflicting objectives such as availability and wastage of a large range of products in a store. We propose a reinforcement learning (RL) approach that utilises General Value Functions (GVFs) to derive domain-backed inventory replenishment policies. The inventory replenishment decisions are modelled as a sequential decision…

    Submitted 3 November, 2023; originally announced November 2023.

  3. arXiv:2203.00874

    cs.LG cs.AI

    Follow your Nose: Using General Value Functions for Directed Exploration in Reinforcement Learning

    Authors: Durgesh Kalwar, Omkar Shelke, Somjit Nath, Hardik Meisheri, Harshad Khadilkar

    Abstract: Improving sample efficiency is a key challenge in reinforcement learning, especially in environments with large state spaces and sparse rewards. In literature, this is resolved either through the use of auxiliary tasks (subgoals) or through clever exploration strategies. Exploration methods have been used to sample better trajectories in large environments while auxiliary tasks have been incorpora…

    Submitted 27 February, 2023; v1 submitted 2 March, 2022; originally announced March 2022.

  4. arXiv:2102.11762

    cs.AI cs.LG cs.MA

    School of hard knocks: Curriculum analysis for Pommerman with a fixed computational budget

    Authors: Omkar Shelke, Hardik Meisheri, Harshad Khadilkar

    Abstract: Pommerman is a hybrid cooperative/adversarial multi-agent environment, with challenging characteristics in terms of partial observability, limited or no communication, sparse and delayed rewards, and restrictive computational time limits. This makes it a challenging environment for reinforcement learning (RL) approaches. In this paper, we focus on developing a curriculum for learning a robust and…

    Submitted 24 February, 2021; v1 submitted 23 February, 2021; originally announced February 2021.

    Comments: 8 pages, Submitted to ALA workshop 2021

    Journal ref: CODS-COMAD 2022: 5th Joint International Conference on Data Science & Management of Data (9th ACM IKDD CODS and 27th COMAD)

  5. arXiv:1911.04947

    cs.LG stat.ML

    Accelerating Training in Pommerman with Imitation and Reinforcement Learning

    Authors: Hardik Meisheri, Omkar Shelke, Richa Verma, Harshad Khadilkar

    Abstract: The Pommerman simulation was recently developed to mimic the classic Japanese game Bomberman, and focuses on competitive gameplay in a multi-agent setting. We focus on the 2$\times$2 team version of Pommerman, developed for a competition at NeurIPS 2018. Our methodology involves training an agent initially through imitation learning on a noisy expert policy, followed by a proximal-policy optimizat…

    Submitted 13 November, 2019; v1 submitted 12 November, 2019; originally announced November 2019.

    Comments: Presented at Deep Reinforcement Learning workshop, NeurIPS-2019