Washim Mondal and Vaneet Aggarwal are qualified to endorse.

Learning General Parameterized Policies for Infinite Horizon Average Reward Constrained MDPs via Primal-Dual Policy Gradient Algorithm

Washim Mondal: Is registered as an author of this paper.
Can endorse for cs.AI, cs.GT, cs.IT, cs.LG, cs.MA, cs.NI, cs.PF, math.IT.
Vaneet Aggarwal: Is registered as an author of this paper.
Can endorse for cs.AI, cs.CC, cs.CR, cs.CV, cs.CY, cs.DC, cs.DM, cs.DS, cs.ET, cs.GT, cs.HC, cs.IT, cs.LG, cs.MA, cs.MM, cs.NA, cs.NE, cs.NI, cs.PF, cs.RO, cs.SI, cs.SY, eess.IV, eess.SY, math.AG, math.CO, math.IT, math.NA, math.OC, physics.optics, q-bio.GN, q-bio.QM, quant-ph, stat.ML.

Qinbo Bai is not registered as an owner of this paper.