tf.function ValueError: Creating variables on a non-first call to a function decorated with tf.function. Unable to understand behaviour

Question

I would like to know why this function:

    @tf.function
    def train(self, TargetNet, epsilon):
        if len(self.experience['s']) < self.min_experiences:
            return 0
        ids = np.random.randint(low=0, high=len(self.replay_buffer['s']), size=self.batch_size)
        states = np.asarray([self.experience['s'][i] for i in ids])
        actions = np.asarray([self.experience['a'][i] for i in ids])
        rewards = np.asarray([self.experience['r'][i] for i in ids])
        next_states = np.asarray([self.experience['s1'][i] for i in ids])
        dones = np.asarray([self.experience['done'][i] for i in ids])
        q_next_actions = self.get_action(next_states, epsilon)
        q_value_next = TargetNet.predict(next_states)
        q_value_next = tf.gather_nd(q_value_next, tf.stack((tf.range(self.batch_size), q_next_actions), axis=1))
        targets = tf.where(dones, rewards, rewards + self.gamma * q_value_next)

        with tf.GradientTape() as tape:
            estimates = tf.math.reduce_sum(self.predict(states) * tf.one_hot(actions, self.num_actions), axis=1)
            loss = tf.math.reduce_sum(tf.square(estimates - targets))
        variables = self.model.trainable_variables
        gradients = tape.gradient(loss, variables)
        self.optimizer.apply_gradients(zip(gradients, variables))

raises "ValueError: Creating variables on a non-first call to a function decorated with tf.function", whereas this very similar code:

    @tf.function
    def train(self, TargetNet):
        if len(self.experience['s']) < self.min_experiences:
            return 0
        ids = np.random.randint(low=0, high=len(self.experience['s']), size=self.batch_size)
        states = np.asarray([self.experience['s'][i] for i in ids])
        actions = np.asarray([self.experience['a'][i] for i in ids])
        rewards = np.asarray([self.experience['r'][i] for i in ids])
        states_next = np.asarray([self.experience['s2'][i] for i in ids])
        dones = np.asarray([self.experience['done'][i] for i in ids])
        value_next = np.max(TargetNet.predict(states_next), axis=1)
        actual_values = np.where(dones, rewards, rewards + self.gamma * value_next)

        with tf.GradientTape() as tape:
            selected_action_values = tf.math.reduce_sum(
                self.predict(states) * tf.one_hot(actions, self.num_actions), axis=1)
            loss = tf.math.reduce_sum(tf.square(actual_values - selected_action_values))
        variables = self.model.trainable_variables
        gradients = tape.gradient(loss, variables)
        self.optimizer.apply_gradients(zip(gradients, variables))

does not raise an error. Please help me understand why.

EDIT: I removed the epsilon parameter from the function.

nessuno · Accepted Answer

You are using tf.function to convert the body of the decorated function. This means that TensorFlow will try to compile your eager code into its graph representation.

Variables, however, are special objects. In fact, when you were using TensorFlow 1.x (graph mode), you defined each variable only once and then used and updated it.

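In TensorFlow 2.0, if you use pure eager execution, you can declare and re-use the same variable more than once, because in eager mode a tf.Variable is just a plain Python object: it is destroyed as soon as the function ends and the variable goes out of scope.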

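For TensorFlow to correctly convert a function that creates state (that is, one that uses variables), you have to break the function scope by declaring the variables outside the function.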
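As a minimal sketch of the failure mode, assuming TensorFlow 2.x is installed: the function below (the name creates_variable_each_call is hypothetical) builds a new tf.Variable every time it is traced, and tf.function rejects that with a ValueError.

```python
import tensorflow as tf


@tf.function
def creates_variable_each_call():
    # A fresh tf.Variable would be created on every trace of this function,
    # which tf.function does not allow.
    b = tf.Variable(12.0)
    return b + 1.0


try:
    creates_variable_each_call()
    raised = False
except ValueError as err:
    raised = True
    print("ValueError:", err)
```

Without the decorator the same function runs fine in eager mode, because each call simply creates and discards its own variable.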

In short, if you have a function that works correctly in eager mode, such as:

    def f():
        a = tf.constant([[10, 10], [11., 1.]])
        x = tf.constant([[1., 0.], [0., 1.]])
        b = tf.Variable(12.)
        y = tf.matmul(a, x) + b
        return y

you have to change its structure to something like:

    b = None

    @tf.function
    def f():
        a = tf.constant([[10, 10], [11., 1.]])
        x = tf.constant([[1., 0.], [0., 1.]])
        global b
        if b is None:
            b = tf.Variable(12.)
        y = tf.matmul(a, x) + b
        print("PRINT: ", y)
        tf.print("TF-PRINT: ", y)
        return y

    f()

in order to make it work correctly with the tf.function decorator.
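The same idea can be expressed without a global by keeping the state on an object. The following is only a sketch under that assumption: the class name Affine is hypothetical, and the variable is created once in the constructor, outside any traced function, so the decorated method only reads it.

```python
import tensorflow as tf


class Affine(tf.Module):
    def __init__(self):
        super().__init__()
        # State is created exactly once, outside the traced function.
        self.b = tf.Variable(12.0)

    @tf.function
    def __call__(self):
        a = tf.constant([[10.0, 10.0], [11.0, 1.0]])
        x = tf.constant([[1.0, 0.0], [0.0, 1.0]])
        # Only reads the existing variable; nothing is created here.
        return tf.matmul(a, x) + self.b


layer = Affine()
first = layer()        # uses b = 12.0
layer.b.assign(0.0)    # update the state between calls
second = layer()       # the compiled function sees the new value
```

Because the variable lives on the object, repeated calls reuse it, and updates made with assign are visible to later calls.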

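I covered this (and other) scenarios in several blog posts: the first part analyzes this behavior in the section "Handling states breaking the function scope" (however, I suggest reading it from the beginning, and also reading parts 2 and 3).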