Once the players are ranked, the bottom third is eliminated and replaced with new players, each a combination of two players from the top two thirds. The first new player is a mix of players 1 and 2, the next is a mix of players 3 and 4, and so on. The new players are then mutated by adding small random values to their weights.
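The replacement step can be sketched in Python as follows. This is only an illustration under stated assumptions, not the Coach's actual code: each player is assumed to be a flat list of float weights, the "mix" of two parents is assumed to be the average of their weights, and the mutation is assumed to be Gaussian noise. The names `next_generation` and `mutation_scale` are hypothetical.

```python
import random

def next_generation(ranked, mutation_scale=0.1, rng=random):
    """Build the next population from players ranked best-first.

    `ranked` is a list of weight lists (assumed representation).
    The bottom third is dropped and replaced by mutated children
    bred from consecutive pairs of the surviving top two thirds.
    """
    n = len(ranked)
    n_children = n // 3                  # size of the eliminated bottom third
    survivors = ranked[:n - n_children]  # top two thirds are kept unchanged

    children = []
    for i in range(0, 2 * n_children, 2):
        a, b = survivors[i], survivors[i + 1]
        # Assumption: "mix" means averaging the two parents' weights.
        child = [(x + y) / 2 for x, y in zip(a, b)]
        # Mutation: add a small random value to every weight.
        child = [w + rng.gauss(0.0, mutation_scale) for w in child]
        children.append(child)

    return survivors + children
```

With six ranked players, for example, players 5 and 6 are dropped, and two children are bred from the pairs (1, 2) and (3, 4), so the population size stays at six.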
This process repeats for as many iterations as desired, with the new set of players re-sorted each time. Over time, successful players are preserved and their weights are used to generate new players.
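The repeated rank-and-replace cycle might look like the loop below. It is a sketch rather than the original implementation: `train`, `fitness`, and `breed` are hypothetical names, `fitness` stands in for however the players are actually scored, and `breed` stands in for the elimination, mixing, and mutation step described above.

```python
def train(players, fitness, breed, iterations):
    """Repeat the rank-and-replace cycle for a fixed number of iterations.

    `fitness(player)` returns a score (higher is better) and
    `breed(ranked)` returns the next population from a best-first list;
    both are supplied by the caller and are assumptions of this sketch.
    """
    for _ in range(iterations):
        # Rank the current population, best player first.
        ranked = sorted(players, key=fitness, reverse=True)
        # Replace the weakest players with new ones.
        players = breed(ranked)
    # Rank one last time so the caller gets the best player at index 0.
    return sorted(players, key=fitness, reverse=True)
```

Because the surviving players are carried over unchanged between iterations, a strong set of weights stays in the population until something better displaces it.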
We have used the Coach to train better players. Because the weights can be saved to a file, a trained player's weights can be loaded later, and a human can play against that player if desired.
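As an illustration of saving and reloading weights, here is a minimal sketch that assumes the weights are a plain list of floats and stores them as JSON. The file format the Coach really uses is not stated here, and `save_weights` and `load_weights` are hypothetical names.

```python
import json

def save_weights(weights, path):
    """Write a list of float weights to `path` as JSON (assumed format)."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(weights, f)

def load_weights(path):
    """Read back a list of weights previously written by save_weights."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

A player built from the loaded weights could then be used as the opponent in a game against a human.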