This repository was archived by the owner on Oct 16, 2022. It is now read-only.
Hello, thank you for making this repo.
I think the return calculation should take the done flags into account, so that the running return is reset at every episode boundary:
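
    def calculate_returns(self, rewards, dones, normalize=True):
        returns = []
        R = 0
        # Walk the rollout backwards, accumulating the discounted return.
        for r, d in zip(reversed(rewards), reversed(dones)):
            if d:
                # Episode ended at this step: do not carry over the return
                # accumulated from the episode that follows it in the buffer.
                R = 0
            R = r + R * self.gamma
            returns.insert(0, R)
        # `device` is assumed to be defined elsewhere, as in the original snippet.
        returns = torch.tensor(returns).to(device)
        if normalize:
            returns = (returns - returns.mean()) / returns.std()
        return returns
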
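To illustrate why the reset matters, here is a small standalone sketch in plain Python (no torch). The function name `discounted_returns` and the toy rollout are made up for this example and are not part of the repo. The rollout contains two episodes of two steps each, and without the reset the rewards of the second episode leak into the returns of the first.

```python
def discounted_returns(rewards, dones, gamma=0.99):
    """Discounted returns that reset whenever a step is marked done."""
    returns = []
    running = 0.0
    # Iterate backwards so each step sees the return of the step after it.
    for r, d in zip(reversed(rewards), reversed(dones)):
        if d:
            running = 0.0  # terminal step: nothing carries over from later steps
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


# Hypothetical rollout: two episodes of two steps each, reward 1 per step.
rewards = [1.0, 1.0, 1.0, 1.0]
dones = [False, True, False, True]

with_reset = discounted_returns(rewards, dones, gamma=0.5)
without_reset = discounted_returns(rewards, [False] * 4, gamma=0.5)

print(with_reset)     # [1.5, 1.0, 1.5, 1.0]
print(without_reset)  # [1.875, 1.75, 1.5, 1.0]
```

With the reset, both episodes get identical returns, as expected for identical reward sequences. Without it, the first episode's returns are inflated by the second episode's rewards.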
Also, could you please briefly describe how Generalized Advantage Estimation (GAE) would be used when calculating the advantages?
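For context, a minimal sketch of the standard GAE recursion follows; it is not code from this repo. It assumes per-step value estimates `values` from a critic and a bootstrap value `last_value` for the state after the final step, and the names `compute_gae` and `lam` are chosen here for illustration. The TD error is `delta_t = r_t + gamma * V(s_{t+1}) * (1 - done_t) - V(s_t)`, and the advantage accumulates as `A_t = delta_t + gamma * lam * (1 - done_t) * A_{t+1}`, so the same done handling applies as for the returns.

```python
def compute_gae(rewards, values, dones, last_value=0.0, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation over one rollout (plain-Python sketch)."""
    advantages = [0.0] * len(rewards)
    gae = 0.0
    next_value = last_value  # V(s_{t+1}) for the final step of the rollout
    for t in reversed(range(len(rewards))):
        # Mask out bootstrapping and accumulation across episode boundaries.
        not_done = 0.0 if dones[t] else 1.0
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        gae = delta + gamma * lam * not_done * gae
        advantages[t] = gae
        next_value = values[t]
    return advantages


# Hypothetical two-step episode with a constant value estimate of 0.5.
adv = compute_gae([1.0, 1.0], [0.5, 0.5], [False, True], gamma=0.5, lam=0.5)
print(adv)  # [0.875, 0.5]
```

Setting `lam=1` with zero value estimates recovers the plain discounted returns, while `lam=0` reduces to the one-step TD error; values in between trade variance against bias.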