Input and output encoding (please be crazy!)

Peter Borrmann

Jan 7, 2019, 1:06:07 PM
to LCZero
The current encoding of the input and output planes (see below for reference) is more or less replicated from AlphaZero, which was a "quick" shot at covering Go, chess, and shogi. This obviously works quite well, but there is always room for improvement ...

Questions/potential drawbacks:
  • The policy head is quite broad, with 4672 outputs compared to an average of 26-27 legal moves
  • The history planes are heavily correlated, as only one (or two) pieces change position within a ply
  • Aggregate planes (e.g. squares dominated by each side) may help to reduce network size
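
To put a number on the first point, here is a small sketch of how sparse the policy target is. The split of the 4672 outputs into 8x8 source squares times 73 move types follows the AlphaZero paper's description and is not spelled out in this post; the constant names are only illustrative.

```python
# Policy head layout as described in the AlphaZero paper (assumption:
# LCZero replicates it, as stated in this post).
QUEEN_LIKE = 8 * 7        # 8 directions x up to 7 squares
KNIGHT = 8                # 8 knight jumps
UNDERPROMOTION = 3 * 3    # 3 pawn directions x 3 pieces (knight, bishop, rook)

move_types = QUEEN_LIKE + KNIGHT + UNDERPROMOTION   # planes per source square
policy_size = 8 * 8 * move_types                    # total policy outputs

AVG_LEGAL_MOVES = 27      # upper figure quoted in the post
legal_fraction = AVG_LEGAL_MOVES / policy_size

print(move_types, policy_size, f"{legal_fraction:.2%}")
```

So, under these assumptions, well under 1% of the policy outputs correspond to a legal move in a typical position.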
Initial thoughts on optimization (might not be compatible)
  • Encode pieces not by their position, but by their legal moves (including not moving)
    • Encoding needs to be 0,1,2 in this case (rook, knight, pawn) 
    • Castling planes can be removed 
  • Use a delta encoding for the history planes (0 = no change, 1 = piece moved to the position, -1 = piece moved from the position)
  • Reduce history planes 
  • Replace the value head by a fast evaluation of the most probable among all legal moves.
    • During training, one would need to replicate a position for all legal moves; the move chosen would be 1, all others 0
  • ...
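
The delta encoding from the list above could look roughly like this. It is a minimal sketch in plain Python, assuming one 0/1 occupancy plane per piece type; the helper names (`empty_plane`, `delta_plane`, `apply_delta`) and the row/column convention are made up for illustration and are not taken from the LCZero code.

```python
def empty_plane():
    """8x8 plane of zeros; row 0 = rank 1, column 0 = file a (assumed convention)."""
    return [[0] * 8 for _ in range(8)]

def delta_plane(before, after):
    """+1 where a piece arrived, -1 where a piece left, 0 where nothing changed."""
    return [[a - b for b, a in zip(row_b, row_a)]
            for row_b, row_a in zip(before, after)]

def apply_delta(before, delta):
    """Reconstruct the later plane from the earlier plane and the delta."""
    return [[b + d for b, d in zip(row_b, row_d)]
            for row_b, row_d in zip(before, delta)]

# White knight plane before and after Ng1-f3.
knights_before = empty_plane()
knights_before[0][1] = 1          # b1
knights_before[0][6] = 1          # g1
knights_after = [row[:] for row in knights_before]
knights_after[0][6] = 0           # knight leaves g1
knights_after[2][5] = 1           # knight arrives on f3

delta = delta_plane(knights_before, knights_after)
nonzero = [(r, c, v) for r, row in enumerate(delta)
           for c, v in enumerate(row) if v]
print(nonzero)  # [(0, 6, -1), (2, 5, 1)]
```

Only two of the 64 entries are non-zero here, which is the correlation the delta planes would exploit; the full plane can still be recovered with `apply_delta`.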
I hope you have crazy good ideas! It is just to stimulate discussion. 

(With a legal-moves encoding, a queen could be encoded as bishop plus rook, and a king as a queen with "limited mobility".)
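
A quick way to check that remark is to compare destination squares on an empty board. This is only a sketch: it ignores blocking pieces, checks, and castling, and `slide_targets` is a hypothetical helper, not part of any engine.

```python
ROOK_DIRS = [(1, 0), (-1, 0), (0, 1), (0, -1)]
BISHOP_DIRS = [(1, 1), (1, -1), (-1, 1), (-1, -1)]

def slide_targets(square, directions, max_steps=7):
    """Destination squares on an empty board when sliding along the given directions."""
    r0, c0 = square
    targets = set()
    for dr, dc in directions:
        for step in range(1, max_steps + 1):
            r, c = r0 + dr * step, c0 + dc * step
            if not (0 <= r < 8 and 0 <= c < 8):
                break  # ran off the board in this direction
            targets.add((r, c))
    return targets

d4 = (3, 3)  # row 3, column 3 with row 0 = rank 1, column 0 = file a
rook = slide_targets(d4, ROOK_DIRS)
bishop = slide_targets(d4, BISHOP_DIRS)
queen = slide_targets(d4, ROOK_DIRS + BISHOP_DIRS)
king = slide_targets(d4, ROOK_DIRS + BISHOP_DIRS, max_steps=1)

print(queen == rook | bishop)       # True
print(king <= queen, len(king))     # True 8
```

On an empty board the queen's move plane is exactly the union of the rook and bishop planes, and the king's is the queen's restricted to one step.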


----------------------------
AlphaZero encoding:

Input: 
  • The position is encoded with 8x8 planes (current + 7 history positions, 14 planes each: 14*8 = 112)
    • 6 planes: player 1 pieces: pawn, rook, knight, bishop, queen, king
    • 6 planes: player 2 pieces: pawn, rook, knight, bishop, queen, king
    • 2 planes: repetition
  • Additional planes are:
    • colour (1)
    • move count (1)
    • player 1 castling (2)
    • player 2 castling (2)
    • no-progress count (1)
Output: 
  • Value:  Game result (-1,0,1)
  • Policy:  Vector with 4672 possible moves 
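
As a sanity check on the plane counts listed above, the totals can be added up in a few lines. The variable names are illustrative only; the numbers are the ones from the list.

```python
PIECE_TYPES = 6          # pawn, rook, knight, bishop, queen, king
PLAYERS = 2
REPETITION_PLANES = 2
TIME_STEPS = 8           # current position + 7 history positions

planes_per_step = PIECE_TYPES * PLAYERS + REPETITION_PLANES
history_planes = planes_per_step * TIME_STEPS

extra_planes = {
    "colour": 1,
    "move count": 1,
    "player 1 castling": 2,
    "player 2 castling": 2,
    "no-progress count": 1,
}

total_planes = history_planes + sum(extra_planes.values())
input_values = total_planes * 8 * 8   # every plane is 8x8

print(planes_per_step, history_planes, total_planes, input_values)
# 14 112 119 7616
```

That gives 119 input planes in total, of which 112 are the piece and repetition history.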