some suggestions for improvements on the GPREPS case studies #18

@Seas00n

Hello, I have learned a lot from reviewing your project. Thanks for your great help!

I have found a few small issues in cartpole_learn.py for GPREPS.

After each systemRollout, the new X and Y are concatenated onto the existing data, but the current and next states do not correspond at the points where two rollouts are joined. This could lead to incorrect system dynamics being fitted. In fact, after skipping these discontinuous indices when fitting the system dynamics, the program learns much faster: it originally required 7 fittings of the GP model, but now requires only 3.

Here are some of my modifications.
I use an interval_list to record the length of each continuous rollout.

x, y = systemRollout(env, hpol, pol)
interval_list.append(x.shape[0])

Then I fit only the continuous data.

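def fit_continuous(self, X, Y, interval_list):
    # Inputs and targets (state differences) for the GP dynamics model.
    Xt = X[:, self.dyni]
    Yt = Y[:, self.difi] - X[:, self.difi]
    # interval_list holds per-rollout lengths, so the last row of each rollout
    # in the concatenated data sits at cumsum(lengths) - 1.
    boundaries = np.cumsum(interval_list) - 1
    Xt_continuous = np.delete(Xt, boundaries, axis=0)
    Yt_continuous = np.delete(Yt, boundaries, axis=0)
    for i in range(self.nout):
        try:
            self.gps[i].fit(Xt_continuous, Yt_continuous[:, i])
        except ValueError as e:
            print('ValueError caught for i:{0}: e:{1}'.format(i, e))
            raise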
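
Since interval_list stores per-rollout lengths, the rows to drop from the concatenated arrays are the cumulative sums of those lengths minus one. Below is a minimal, standalone sketch of that index computation on toy data. The helper name drop_rollout_boundaries and the toy arrays are only for illustration and are not part of the project.

```python
import numpy as np

def drop_rollout_boundaries(X, Y, interval_list):
    """Remove the last row of every rollout from the concatenated X and Y."""
    # Convert per-rollout lengths into the last row index of each rollout.
    boundary_idx = np.cumsum(interval_list) - 1
    X_cont = np.delete(X, boundary_idx, axis=0)
    Y_cont = np.delete(Y, boundary_idx, axis=0)
    return X_cont, Y_cont

# Toy data (hypothetical): three rollouts of lengths 3, 2 and 4, two state dims.
interval_list = [3, 2, 4]
X = np.arange(18, dtype=float).reshape(9, 2)
Y = X + 1.0

X_cont, Y_cont = drop_rollout_boundaries(X, Y, interval_list)
print(X_cont.shape)  # rows 2, 4 and 8 are dropped, leaving 6 rows
```

Using np.array(interval_list) - 1 directly would give [2, 1, 3] here, which are positions inside each rollout rather than positions in the concatenated array.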
