
Endgame studies have long served as a tool for testing human creativity and intelligence. We find that they can serve as a tool for testing machine ability as well. Two of the leading chess engines, Stockfish and Leela Chess Zero (LCZero), employ significantly different methods during play. We use Plaskett’s Puzzle, a famous endgame study from the late 1970s, to compare the two engines. Our experiments show that Stockfish outperforms LCZero on the puzzle. We examine the algorithmic differences between the engines and use our observations as a basis for carefully interpreting the test results. Drawing inspiration from how humans solve chess problems, we ask whether machines can possess a form of imagination. On the theoretical side, we describe how Bellman’s equation may be applied to optimize the probability of winning. To conclude, we discuss the implications of our work for artificial intelligence (AI) and artificial general intelligence (AGI), suggesting possible avenues for future research.

It is possible to understand Leela’s selective search strategy by examining the distribution of positions searched. A surprising 92.4% of the 60 million searched positions follow from 1 ♘×e3, even though its win probability of 3.01% is lower than that of the top five moves. On average, the engine spends only 0.36% of its time, about 216,000 nodes (SD = 450), searching positions following 1 ♘f6+, seeing it as less promising. This is partially due to the prior probabilities determined by the policy head. The policy indicates a 15.75% probability that 1 ♘×e3 is the optimal move, compared to a 7.40% probability for 1 ♘f6+. This intuition seems reasonable, since Stockfish evaluates 1 ♘×e3 as the second-best move in the position, as shown in Table 3. Furthermore, once LCZero determines that all moves seem losing, it tries to focus on the one with the most promise of extending the game, which the moves-left head identifies as 1 ♘×e3. It stands to reason that a node with significantly more moves remaining would also require a deeper search: there is greater potential for the engine to stumble upon a line that reverses its evaluation deep in the tree.

However, the skewed search is also due to subtleties in the puzzle. The engine must see the entire checkmate before it can realize the benefits of a line, especially regarding the 3 ♗c2+ sacrifice, which is made in a materially losing position. The engine thus chooses to prioritize searching other lines instead. Since all puzzles have a guaranteed solution, it may be possible for a learner to explicitly predict likely checkmate positions and use them to condition the search. Such a process would augment the sequential search with a deeper, more speculative lookahead. In statistical terms, this may be thought of as “extending the conversation”: we extend our assessment of the win probability by conditioning on the event of reaching a certain checkmate position. It is important to note that, in order to verify a checkmate as a forced win, all relevant lines must still be examined; the opponent must truly have no alternative. Humans often optimistically bias their calculations based on their memory of similar positions. They may fall into the trap of “magical thinking”, which “involves our inclination to seek and interpret connections between the events around us together with our disinclination to revise belief after further observation”. When implementing a computational form of imagination, it is important to ensure machines do not make the same mistake.
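
The “extending the conversation” idea discussed above can be made concrete with the law of total probability. The sketch below is purely illustrative: the probabilities are assumed numbers, not measurements from our experiments, and the event C (reaching a candidate checkmate position) is hypothetical.

```python
# "Extending the conversation": assess the win probability by conditioning
# on the event C of reaching a candidate checkmate position.
# All numbers below are illustrative assumptions, not measured values.
p_reach = 0.6              # assumed probability that the line to C can be forced
p_win_given_reach = 1.0    # C has been verified as a forced checkmate
p_win_given_miss = 0.05    # assumed residual winning chances if C is not reached

# Law of total probability:
# P(win) = P(win | C) P(C) + P(win | not C) P(not C)
p_win = p_win_given_reach * p_reach + p_win_given_miss * (1 - p_reach)
```

Under these assumptions the conditioned estimate is 0.62, far above the fallback chances alone; this is why predicting a likely checkmate and conditioning on it could sharply revise the search’s assessment, provided the forcing line is then verified exhaustively.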

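The skew toward high-prior moves described above follows from how PUCT-style selection weights the policy prior. The following is a minimal sketch, not LCZero’s actual implementation: the two priors echo the reported policy outputs for 1 ♘×e3 and 1 ♘f6+, while the leaf values, the `c_puct` constant, and the visit budget are assumptions for illustration.

```python
import math

def puct_select(stats, priors, c_puct=1.5):
    """Return the move maximizing Q + U, where the exploration term U
    scales with the policy prior and decays with the visit count."""
    total_visits = sum(n for n, _ in stats.values())
    best_move, best_score = None, -math.inf
    for move, (n, w) in stats.items():
        q = w / n if n else 0.0                     # mean value so far
        u = c_puct * priors[move] * math.sqrt(total_visits + 1) / (1 + n)
        if q + u > best_score:
            best_move, best_score = move, q + u
    return best_move

# Hypothetical two-move position using the reported policy priors:
# 15.75% for 1 Nxe3 versus 7.40% for 1 Nf6+.
priors = {"Nxe3": 0.1575, "Nf6+": 0.0740}
values = {"Nxe3": 0.030, "Nf6+": 0.030}    # assumed identical leaf values

stats = {m: (0, 0.0) for m in priors}      # move -> (visits, total value)
for _ in range(10_000):
    m = puct_select(stats, priors)
    n, w = stats[m]
    stats[m] = (n + 1, w + values[m])      # stand-in for a network evaluation
```

Even with identical leaf values, the higher-prior move accumulates the majority of the visits, which illustrates how the policy head alone can concentrate the search on one line before any evaluation difference emerges.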