DecPOMDP Usage
Decentralized POMDP
The DecPOMDP struct gives the following objects:
γ: discount factor
ℐ: agents
𝒮: state space
𝒜: joint action space
𝒪: joint observation space
T: transition function
O: joint observation function
R: joint reward function
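As a rough illustration of how these eight objects fit together, the following Python sketch builds a toy two-agent problem. It is not the library's API: the class name `DecPOMDPSketch`, the ASCII field names standing in for the symbols, and the toy states, actions, and observations are all assumptions made for the example.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class DecPOMDPSketch:
    """Hypothetical container mirroring the eight objects listed above."""
    gamma: float                                    # discount factor
    agents: Sequence[int]                           # agents
    states: Sequence[str]                           # state space
    joint_actions: Sequence[Tuple[str, ...]]        # joint action space
    joint_observations: Sequence[Tuple[str, ...]]   # joint observation space
    T: Callable[[str, Tuple[str, ...], str], float]              # T(s, a, s_next)
    O: Callable[[str, Tuple[str, ...], Tuple[str, ...]], float]  # O(s, a, o)
    R: Callable[[str, Tuple[str, ...]], float]                   # R(s, a)


# Toy data (assumed for illustration): two agents sharing the same
# individual action and observation sets.
agents = [1, 2]
per_agent_actions = ["listen", "open"]
per_agent_observations = ["quiet", "noise"]

# A joint action (or observation) holds one entry per agent, so the joint
# spaces are Cartesian products of the individual spaces.
joint_actions = list(product(per_agent_actions, repeat=len(agents)))
joint_observations = list(product(per_agent_observations, repeat=len(agents)))

problem = DecPOMDPSketch(
    gamma=0.9,
    agents=agents,
    states=["left", "right"],
    joint_actions=joint_actions,
    joint_observations=joint_observations,
    # Placeholder dynamics: the state never changes.
    T=lambda s, a, s_next: 1.0 if s_next == s else 0.0,
    # Placeholder observation model: uniform over joint observations.
    O=lambda s, a, o: 1.0 / len(joint_observations),
    # Placeholder reward: constant step cost.
    R=lambda s, a: -1.0,
)
```

With two agents and two individual actions each, `problem.joint_actions` contains four tuples, such as `("listen", "open")`.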
The agents ℐ are the players of the game. The joint action space 𝒜 is the set of all possible joint actions, where a joint action is an ordered tuple containing one action for each agent. The joint observation space 𝒪 is the set of all possible joint observations. The transition function T takes a state s in 𝒮, a joint action a in 𝒜, and a new state s' in 𝒮, and returns the probability of transitioning from s to s' when the agents take joint action a. The joint observation function O takes a state s, a joint action a, and a joint observation o in 𝒪, and returns the probability of observing o when joint action a is taken from state s. The joint reward function R takes a state s in 𝒮 and a joint action a in 𝒜 and returns a reward value.
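The argument order described above can be sketched with three plain Python functions. This is only a toy model under stated assumptions: the states, the individual actions `"stay"` and `"swap"`, the 0.8 observation accuracy, and the helper names `other` and `is_distribution` are hypothetical and do not come from the library.

```python
from itertools import product

# Toy spaces (assumed for illustration), two agents.
states = ["left", "right"]
joint_actions = list(product(["stay", "swap"], repeat=2))
joint_observations = list(product(["left", "right"], repeat=2))


def other(s):
    """Return the state that is not s."""
    return "right" if s == "left" else "left"


def T(s, a, s_next):
    """Probability of reaching s_next from s under joint action a.

    Assumed dynamics: the state flips only when both agents choose "swap".
    """
    target = other(s) if a == ("swap", "swap") else s
    return 1.0 if s_next == target else 0.0


def O(s, a, o):
    """Probability of joint observation o after taking a from state s.

    Assumed model: each agent independently sees the true state with
    probability 0.8, regardless of the joint action.
    """
    p = 1.0
    for o_i in o:
        p *= 0.8 if o_i == s else 0.2
    return p


def R(s, a):
    """Reward for taking joint action a in state s (assumed: pay for coordination)."""
    return 1.0 if a == ("swap", "swap") else 0.0


def is_distribution(values, tol=1e-9):
    """Check that the values are non-negative and sum to one."""
    values = list(values)
    return all(v >= 0.0 for v in values) and abs(sum(values) - 1.0) < tol


# For every state and joint action, T(s, a, .) and O(s, a, .) should each
# be a probability distribution.
for s, a in product(states, joint_actions):
    assert is_distribution(T(s, a, s_next) for s_next in states)
    assert is_distribution(O(s, a, o) for o in joint_observations)
```

The normalization loop is a useful sanity check when writing custom T and O functions, since both must sum to one over their last argument for each fixed state and joint action.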