Discussion:
[Computer-go] Playouts vs playing strength (Detlef Schmicker)
Martin Mueller
2013-12-19 17:23:05 UTC
Thank you Detlef for doing these tests!
I want to get more people interested in this scaling, so I also ran
some scaling tests of Fuego against Pachi :)
It is not as bad as Oakfoam against Pachi, but Pachi still scales a lot
better than Fuego (attached file). To avoid additional complications I
set the number of playouts to the same value for both opponents. Elo is
again computed from the winning rate, as defined on CGOS.
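
(For reference, the conversion from a winning rate to an Elo difference
uses the usual logistic model; a minimal sketch in Python, the exact
CGOS bookkeeping may differ:)

import math

def elo_from_winrate(winrate):
    # Elo difference implied by a winning rate under the logistic model.
    # winrate is the fraction of games won, strictly between 0 and 1.
    return 400.0 * math.log10(winrate / (1.0 - winrate))

# Example: a 60% winning rate corresponds to roughly +70 Elo.
print(elo_from_winrate(0.60))
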
I assume this is on 19x19? Yes, it is also my experience that pachi scales better than Fuego on the big board.
I suspect that a big part of it is large patterns, which Fuego does not yet have. But it is also possible that something else contributes to better scaling, such as the UCT formula.

I did some testing of Fuego vs Pachi a few months ago. In the beginning, I did not know how to set up Pachi's pattern files correctly. That Pachi version without patterns did not scale nearly as well, but I aborted the experiments once I realized it was playing without patterns.

My two conjectures are that 1. using knowledge from large patterns decreases the effective branching factor in pachi, and/or 2. patterns allow it to focus on better moves, improving the quality of the tree.
I think part 2. is relatively clear. Part 1. is not clear to me.

Does oakfoam have large patterns? I am currently working on adding a large pattern system to Fuego, but I just started the implementation so it will be a while.

By the way, would it be possible to use the current svn Fuego instead of 1.1? It would be much more interesting for Fuego developers. Also, it is much stronger :)

https://sourceforge.net/p/fuego/code/HEAD/tree/trunk/

Martin
Pachi version 10.00 (Satsugen)
Fuego 1.1 (it does not show a more detailed version)
with the following configuration:
opponent_program2 = '/home/detlef/fuego-1.1/fuegomain/fuego'
opponent_settings2 = ('uct_param_player ignore_clock 1\n'
                      'uct_param_player max_games ' + str(playouts) + '\n'
                      'uct_param_player resign_min_games 5000\n'
                      'uct_param_search number_threads 8\n'
                      'uct_max_memory 8000000000\n'
                      'uct_param_player reuse_subtree 1')
opponent_program3 = ('/home/detlef/pachi/pachi -d 0 -t =' + str(playouts) +
                     ' -r chinese threads=8,max_tree_size=2048,pondering=0,pass_all_alive')
opponent_settings3 = ''
taken from a CLOP-like Python file.
For Oakfoam I tried to optimize a number of parameters which I thought
were relevant to scaling (progressive widening, ucb_c, weighting of
random moves in playouts), but none of them was as relevant as I thought :(
I hope I did not misunderstand the playout number parameters in Pachi
and Fuego.
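
(For context, ucb_c is the exploration constant in the UCT selection
formula. A rough sketch of where it enters; the node fields and the
constant value are illustrative, not Oakfoam's actual code:)

import math

def uct_score(child, parent_visits, ucb_c):
    # UCT score: average result plus ucb_c-weighted exploration bonus.
    # child is assumed to expose .wins and .visits (hypothetical fields).
    if child.visits == 0:
        return float('inf')  # always try unvisited children first
    mean = child.wins / child.visits
    exploration = ucb_c * math.sqrt(math.log(parent_visits) / child.visits)
    return mean + exploration

def select_child(node, ucb_c=0.7):
    # Descend to the child with the highest UCT score.
    return max(node.children, key=lambda c: uct_score(c, node.visits, ucb_c))
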
To me it seems there is a lot of potential in scaling, not only for
oakfoam...
I read the Fuego and Pachi mailing lists too; if this is not of much
interest here, we could move to one of those lists :)
Detlef
I did a comparison of playing strength vs. playouts.
This time I used 4 times the Oakfoam playouts for Pachi
(e.g. 1000 for Oakfoam, 4000 for Pachi).
The graph shows how bad we become (in comparison) with more playouts :(
From the games, the first impression is that the joseki becomes worse
with more playouts, e.g.
http://www.physik.de/playouts2.pdf
The plot is 1050 games fitted with a 5th-order polynomial. The borders
of the plot are not statistically significant!
Thanks for every hint :)
Detlef
make sure Pachi isn't doing any kind of pondering in the
background.
Indeed, Pachi will ponder by default. Turn pondering off by passing
pondering=0
on the commandline.
Thanks a lot for the hint!!! From the command-line documentation I
thought pondering was off by default, and I did not check it :(
pachi -d 0 -t =4000 -r chinese threads=1,max_tree_size=2048
Also, it may be worth passing pass_all_alive unless you are doing a
sophisticated scoring procedure, to make sure Pachi captures all dead
groups at the end of the game.
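
(Putting both hints together, the command above would become something
like the following; these are the same options used in the later
configuration, shown with the example playout count of 4000:)

pachi -d 0 -t =4000 -r chinese threads=1,max_tree_size=2048,pondering=0,pass_all_alive
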
P.S.: Do your results imply that at 4000 playouts/move, Oakfoam is now
quite a bit stronger than Pachi? I'd love to hear more. :) How does the
playout speed compare?
Yes, we play even with 1000 playouts against these settings. But I did
not take pondering into account, as I thought it was turned off, so I do
not know whether Pachi really played only 4000 playouts as I assumed.
We have a little less than 1000 playouts/core/second. My main aim is to
get the iPad version strong, so the strength at lower playout counts is
more important to me.
I did not optimize parameters against Pachi alone; I started running
CLOP with three opponents: GNU Go level 10, Pachi with this setting, and
/home/detlef/fuego-1.1/fuegomain/fuego
with the settings
uct_param_player ignore_clock 1
uct_param_player max_games 3000
uct_param_player resign_min_games 5000
uct_max_memory 300000000
All four programs have comparable strength then.
Always happy to share any ideas :)
Detlef
[Attachment: playouts_oakfoam_fuego_pachi.pdf (application/pdf, 18499 bytes):
http://dvandva.org/pipermail/computer-go/attachments/20131219/420ce2e5/attachment.pdf]
Detlef Schmicker
2013-12-19 17:54:05 UTC
Post by Martin Mueller
Thank you Detlef for doing these tests!
I want to get more people interested in this scaling, so I also ran
some scaling tests of Fuego against Pachi :)
It is not as bad as Oakfoam against Pachi, but Pachi still scales a lot
better than Fuego (attached file). To avoid additional complications I
set the number of playouts to the same value for both opponents. Elo is
again computed from the winning rate, as defined on CGOS.
I assume this is on 19x19? Yes, it is also my experience that pachi scales better than Fuego on the big board.
I suspect that a big part of it is large patterns, which Fuego does not yet have. But it is also possible that something else contributes to better scaling, such as the UCT formula.
Sorry, it is on 13x13; I forgot to mention that. I did not set up large
patterns, so Pachi is probably not using them. Oakfoam does use large
patterns, and my guess was the other way around: because of good move
selection and bad playouts we are better at small numbers of playouts :)
Post by Martin Mueller
I did some testing of Fuego vs Pachi a few months ago. In the beginning, I did not know how to set up Pachi's pattern files correctly. That Pachi version without patterns did not scale nearly as well, but I aborted the experiments once I realized it was playing without patterns.
I will have to check this. In my theory, pattern files would make
scaling worse, so we will see who is right :)
Post by Martin Mueller
My two conjectures are that 1. using knowledge from large patterns decreases the effective branching factor in pachi, and/or 2. patterns allow it to focus on better moves, improving the quality of the tree.
I think part 2. is relatively clear. Part 1. is not clear to me.
Does oakfoam have large patterns? I am currently working on adding a large pattern system to Fuego, but I just started the implementation so it will be a while.
By the way, would it be possible to use the current svn Fuego instead of 1.1? It would be much more interesting for Fuego developers. Also, it is much stronger :)
https://sourceforge.net/p/fuego/code/HEAD/tree/trunk/
I just downloaded and compiled it; I will do a run with this version :)

Oops, it seems to be a lot stronger; I will not be able to keep the
number of playouts the same with this Pachi configuration :)

Are my Fuego parameters OK?
opponent_settings2 = ('uct_param_player ignore_clock 1\n'
                      'uct_param_player max_games ' + str(playouts) + '\n'
                      'uct_param_player resign_min_games 5000\n'
                      'uct_param_search number_threads 8\n'
                      'uct_max_memory 8000000000\n'
                      'uct_param_player reuse_subtree 1')

Detlef
Petr Baudis
2013-12-20 11:59:44 UTC
Hi!
Post by Martin Mueller
I assume this is on 19x19? Yes, it is also my experience that pachi scales better than Fuego on the big board.
I suspect that a big part of it is large patterns, which Fuego does not yet have. But it is also possible that something else contributes to better scaling, such as the UCT formula.
In general, for scaling, the best tradeoff is to minimize bias at the
expense of slower convergence. How to accomplish this is of course a
difficult question, e.g. more randomness does not always mean less bias.

Two things may help: (i) in Pachi, I always tried to make all decisions
noisy; almost no rule in the playout is applied 100% of the time.
(ii) there is rather involved code for reducing bias in some common
tactical situations; I think many other programs don't have that much of
this (though at least Zen probably does, I'd expect; maybe CS as well).
As an interesting experiment, you might wish to test scaling with
modified Pachi that has 'return false;' at the top of
is_bad_selfatari_slow() body in tactics/selfatari.c.
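
(To illustrate the "noisy decisions" idea in (i), here is a toy sketch
of a playout policy in which each heuristic only fires with some
probability; purely illustrative, not Pachi's actual playout code:)

import random

def choose_playout_move(board, heuristics, random_legal_move):
    # Toy playout move selection: every heuristic is applied only with
    # probability p < 1, so no single rule deterministically dominates.
    # heuristics is a list of (rule, p) pairs, e.g. [(capture_move, 0.9), ...];
    # all names here are hypothetical.
    for rule, p in heuristics:
        if random.random() < p:
            move = rule(board)
            if move is not None:
                return move
    return random_legal_move(board)  # fall back to a uniformly random legal move
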

Our Pachi paper may also contain some ideas; it lists the Elo effect of
some of Pachi's heuristics at various time allowances; e.g. the strength
contribution of the playout-heuristic tree prior is mostly constant, but
the CFG prior is more important with more time.

But the biggest difference may be that Pachi had the enormous luxury
of having its parameters historically tuned under game conditions
similar to real games, rather than the usual low-performance blitz,
thanks to Jean-loup Gailly. You may find some of the parameter changes
in the history (e.g. commit 49249d, but not only that one), but I'm not
sure there are clear or universal lessons in them, and understanding the
parameter semantics might involve reading a lot of Pachi's code.
Post by Martin Mueller
My two conjectures are that 1. using knowledge from large patterns decreases the effective branching factor in pachi, and/or 2. patterns allow it to focus on better moves, improving the quality of the tree.
I think part 2. is relatively clear. Part 1. is not clear to me.
I'm not sure there is a simple answer. One thing is that in the low-end
scenario the optimum pattern prior is 10x to 20x the usual prior, while
in the high-end scenario the optimum pattern prior is just 4x the usual
prior. [1] So leaning heavily on patterns will make Pachi focus too much
on just nice shapes. But I'd be surprised if the optimum moved much
further down when going above the high-end scenario.

[1] But this is also confounded by the fact that the 20x pattern prior
was found when testing against GNU Go, while the 4x pattern prior was
against Fuego. Performance data is not always perfect. :-)
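
(As a rough illustration of what a prior multiplier means here, assuming
priors are seeded as virtual wins/visits in a new node; the names and
numbers are made up, not Pachi's actual scheme:)

def seed_node_priors(node, pattern_strength, base_prior_visits=10,
                     pattern_multiplier=4):
    # Seed a freshly expanded node with virtual results (hypothetical scheme).
    # pattern_strength: value in [0, 1] from a large-pattern model for this move.
    # pattern_multiplier: weight of the pattern prior relative to the base
    # prior (the discussion above is about 4x versus 10x-20x).
    node.visits += base_prior_visits
    node.wins += 0.5 * base_prior_visits            # weak, uniform optimism
    pattern_visits = pattern_multiplier * base_prior_visits
    node.visits += pattern_visits
    node.wins += pattern_strength * pattern_visits  # pattern-informed virtual wins
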
--
Petr "Pasky" Baudis
Sick and yet happy, in peril and yet happy, dying and yet happy,
in exile and happy, in disgrace and happy. -- Epictetus
Detlef Schmicker
2013-12-21 07:18:50 UTC
Thanks a lot for sharing so much information. I am checking all your
suggestions :)

I wonder whether the biggest difference between Fuego and Oakfoam on the
one hand and Pachi on the other might be the use of progressive widening?!

This might narrow the search very well at low numbers of playouts, but
become useless later?!
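
(For reference, a generic progressive-widening sketch, not Oakfoam's or
Fuego's actual implementation: the number of candidate children grows
slowly with the parent's visit count, so early on the search is narrowed
to the top-ranked moves.)

def allowed_children(parent_visits, c=1.0, alpha=0.5):
    # Progressive widening: how many children the search may consider,
    # growing roughly like c * N^alpha with the parent's visit count N.
    return max(1, int(c * parent_visits ** alpha))

def select_child(node, ranked_children, score):
    # Restrict selection to the currently allowed top-ranked moves;
    # ranked_children are assumed pre-sorted by a move-ordering heuristic.
    k = allowed_children(node.visits)
    return max(ranked_children[:k], key=score)
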

Detlef
Petr Baudis
2013-12-21 12:10:55 UTC
Hi!
Post by Detlef Schmicker
Thanks a lot for sharing so much information. I am checking all your
suggestions :)
I wonder whether the biggest difference between Fuego and Oakfoam on the
one hand and Pachi on the other might be the use of progressive widening?!
This might narrow the search very well at low numbers of playouts, but
become useless later?!
I admit that I forgot what oakfoam uses here. I think both Fuego and
Pachi are essentially Mogo clones in these regards, i.e. using Silver's
RAVE formula and node priors (almost like progressive bias).
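
(For readers unfamiliar with it, the RAVE blending referred to here, in
the form from Gelly and Silver's MC-RAVE work, looks roughly like this;
the engines parameterize it in their own ways, and the bias constant
below is only illustrative:)

def rave_value(q_mc, n_mc, q_rave, n_rave, b=0.05):
    # Blend the Monte-Carlo value of a move with its all-moves-as-first
    # (RAVE) value.
    # q_mc, n_mc:     mean result and visit count from regular tree visits.
    # q_rave, n_rave: mean result and count from playouts where the move
    #                 occurred anywhere later in the simulation.
    # b:              RAVE bias constant (illustrative, not any engine's default).
    if n_rave == 0:
        return q_mc
    beta = n_rave / (n_mc + n_rave + 4.0 * b * b * n_mc * n_rave)
    return (1.0 - beta) * q_mc + beta * q_rave
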

Petr "Pasky" Baudis
Detlef Schmicker
2013-12-21 13:43:56 UTC
Post by Petr Baudis
Hi!
Post by Detlef Schmicker
Thanks a lot for sharing so much information. I am checking all your
suggestions :)
I wonder whether the biggest difference between Fuego and Oakfoam on the
one hand and Pachi on the other might be the use of progressive widening?!
This might narrow the search very well at low numbers of playouts, but
become useless later?!
I admit that I forgot what oakfoam uses here. I think both Fuego and
Pachi are essentially Mogo clones in these regards, i.e. using Silver's
RAVE formula and node priors (almost like progressive bias).
For Fuego, I only noticed that it has a widening switch. I will check
whether it is turned on.