forked from arcman7/pysc2
RYAN-NOTES
========================================================================
Node / NPM package equivalents required:
1. numpy
2. collections.deque
3. queue
4. WeakSet and/or _weakrefset
5. threading
6. importlib
7. getpass
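For reference while looking for Node equivalents, here is a minimal Python sketch of the behaviours from the list above that a port would most likely have to match. Which of these features pysc2 actually relies on is an assumption, not something checked against the source, and the names `recent`, `jobs`, `worker`, `Listener` and `listeners` are made up for illustration.

```python
import importlib
import queue
import threading
from collections import deque
from weakref import WeakSet

# deque with maxlen: appending past the limit silently drops the oldest item.
recent = deque(maxlen=3)
for i in range(5):
    recent.append(i)
recent_items = list(recent)

# queue.Queue: thread-safe FIFO; get() blocks until an item is available.
jobs = queue.Queue()
results = []

def worker():
    while True:
        item = jobs.get()
        if item is None:  # sentinel value (an illustrative convention): stop
            break
        results.append(item * 2)

t = threading.Thread(target=worker)
t.start()
for n in (1, 2, 3):
    jobs.put(n)
jobs.put(None)
t.join()

# WeakSet: tracks members without keeping them alive on its own.
class Listener:
    pass

listeners = WeakSet()
keep = Listener()  # strong reference, so the member stays in the set
listeners.add(keep)
listener_count = len(listeners)

# importlib: import a module from its dotted name given as a string.
mod = importlib.import_module("collections")
same_deque = mod.deque is deque
```

The blocking `get()` and the weak-reference semantics are the parts with no drop-in JavaScript counterpart, so they are the ones worth checking first.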
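import numpy as np
Y = np.arange(12).reshape(3,4)
Y.shape                      # (3, 4)
row = Y[0]
row.shape                    # (4,)
Y.shape[-1:] == row.shape    # True: both are the tuple (4,)
tup = (1,2,3,4)
tup[:-1]                     # (1, 2, 3): everything except the last element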
ALPHA STAR PAPER Notes
========================================================================
The policy of AlphaStar is a function π_θ(a_t | s_t, z) that maps all previous
observations and actions s_t = (o_{1:t}, a_{1:t−1}) (defined in Extended Data Tables 1, 2)
[i.e. observations 1 through t AND actions 1 through (t − 1)]
and z (representing strategy statistics) to a probability distribution
over actions a_t for the current step. π_θ is implemented as a deep neural
network with the following structure:
* The policy function itself is a neural network.
The observations o_t are encoded into vector representations,
combined, and processed by a deep LSTM [9], which maintains
memory between steps.
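
To make the encode / combine / LSTM / action-distribution flow above concrete, here is a toy NumPy sketch. It is not AlphaStar's architecture: it uses a single LSTM cell instead of a deep LSTM, random untrained weights, and made-up sizes, and it does not feed previous actions back in. Every name and dimension in it (`policy_step`, `W_enc`, `OBS_DIM`, and so on) is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only to keep the example small.
OBS_DIM, EMBED_DIM, Z_DIM, HIDDEN_DIM, NUM_ACTIONS = 6, 8, 3, 16, 5

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())  # subtract the max for numerical stability
    return e / e.sum()

# Random, untrained weights standing in for the parameters theta.
W_enc = rng.normal(scale=0.1, size=(EMBED_DIM, OBS_DIM))
W_lstm = rng.normal(scale=0.1, size=(4 * HIDDEN_DIM, EMBED_DIM + Z_DIM + HIDDEN_DIM))
b_lstm = np.zeros(4 * HIDDEN_DIM)
W_out = rng.normal(scale=0.1, size=(NUM_ACTIONS, HIDDEN_DIM))

def policy_step(o_t, z, state):
    """One step: encode o_t, combine with z, update LSTM memory, emit action probs."""
    h, c = state
    x = np.concatenate([np.tanh(W_enc @ o_t), z])        # encode, then combine
    gates = W_lstm @ np.concatenate([x, h]) + b_lstm
    i, f, g, o = np.split(gates, 4)                       # input, forget, cell, output
    c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)          # memory carried between steps
    h = sigmoid(o) * np.tanh(c)
    probs = softmax(W_out @ h)                            # distribution over actions a_t
    return probs, (h, c)

state = (np.zeros(HIDDEN_DIM), np.zeros(HIDDEN_DIM))
z = rng.normal(size=Z_DIM)                                # stand-in for strategy statistics
for t in range(3):
    o_t = rng.normal(size=OBS_DIM)                        # stand-in observation
    probs, state = policy_step(o_t, z, state)
```

The dependence on all earlier observations o_{1:t} lives in `state`: each call only sees the current o_t, and the LSTM's (h, c) pair carries the history forward.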
/c/Users/Ryan/AppData/Local/Programs/Python/Python37-32/Lib/site-packages/s2clientprotocol