Science of Chess: A g-factor for chess? A psychometric scale for playing ability

ELO as the ground-truth measure: does that not need some explaining?

Also, in PCA, one does have some choice of how much needs explaining, no?
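To make the PCA point concrete, here is a minimal sketch in Python (NumPy). It is not taken from the ACT analysis: the data are synthetic, and the function names `explained_variance_ratio` and `components_needed` are made up for illustration. It only shows that the number of components one keeps depends on the share of variance one decides needs explaining.

```python
import numpy as np

def explained_variance_ratio(X):
    """Share of total variance carried by each principal component."""
    Xc = X - X.mean(axis=0)                  # center each column
    s = np.linalg.svd(Xc, compute_uv=False)  # singular values, descending
    var = s ** 2                             # proportional to component variances
    return var / var.sum()

def components_needed(ratios, threshold):
    """Smallest number of leading components whose cumulative share reaches threshold."""
    cum = np.cumsum(ratios)
    k = int(np.searchsorted(cum, threshold)) + 1
    return min(k, len(ratios))               # guard against float round-off at 1.0

# Synthetic stand-in for subscale scores (NOT real ACT data):
# one shared factor plus independent noise on four hypothetical subscales.
rng = np.random.default_rng(0)
factor = rng.normal(size=(200, 1))
scores = factor @ np.array([[1.0, 0.8, 0.6, 0.4]]) + 0.5 * rng.normal(size=(200, 4))

ratios = explained_variance_ratio(scores)
for threshold in (0.5, 0.8, 0.95):
    print(threshold, components_needed(ratios, threshold))
```

Raising the threshold can only keep the same number of components or add more, which is the "choice" in question: the cut-off is picked by the analyst, not supplied by the method.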

I also agree: that hypothesis is exactly what this work is looking for. So there is no need to prove the hypothesis before testing it. Arguing for the starting method.

But nothing is forbidden. And I don't understand (possibly I should read the blog, not just the comments, although they are interesting and stimulating) why ELO has priority, as there is so much chess information that ELO is blind to.

Once one gives up the blind (hidden assumption) pretense, one is free to be fully logical, and only has to divulge what is hypothesis in the reasoning and what consequences the data analysis tool yields under that assumption. If the loss of purpose in the matrix projections into smaller matrices (or other transformations to reduced forms) coming from possible artefacts, or loss of information.

This is not like seeking a correlation over ratios kind of data variation re-entry in the transformation of the input domain? I have no clue what level of raw data is used in ACT. My bad.
When you rate the effectiveness of a test based on how well it correlates to ELO, aren't you conceding that ELO is by default a better measure than whatever it's being compared with?
@RaisinBranCrunch said in #42:
> When you rate the effectiveness of a test based on how well it correlates to ELO, aren't you conceding that ELO is by default a better measure than whatever it's being compared with?

No, because they serve different purposes and require different practices in terms of measurement. For example, you don't get an ELO rating without playing a lot of opponents, which can take a lot of time if you're talking about OTB tournaments, but you can be assessed with the ACT in about an hour. Also, to the extent that one thinks that the subscales of the ACT are sufficiently reliable, etc., it provides multiple assessments of a player's ability, while ELO just gives you one number.

Another thing to think about: If you wanted to know if the ACT really did capture playing strength (a question about its validity), what else do you compare it to instead of ELO? I'm not asking that rhetorically - if you've got an idea in mind, it would be neat to hear.
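For anyone who wants to see what "how well it correlates to ELO" amounts to in practice, here is a rough sketch in Python (NumPy) of a Pearson correlation between test scores and ratings. The numbers are invented for illustration and are not ACT or lichess data, and `pearson_r` is just a hypothetical helper name.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()   # deviations from the mean
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

# Invented example values (NOT real data): one test score and one rating per player.
test_scores = [12, 18, 25, 31, 40, 44]
ratings = [1100, 1350, 1500, 1750, 2000, 2150]

print(round(pearson_r(test_scores, ratings), 3))
```

The same function could be pointed at any other criterion in place of ratings (for example, success on a set of test positions), so the calculation itself does not give ELO any special status; the choice of what to compare against does.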
@NDpatzer said in #43:
> Another thing to think about: If you wanted to know if the ACT really did capture playing strength (a question about its validity), what else do you compare it to instead of ELO? I'm not asking that rhetorically - if you've got an idea in mind, it would be neat to hear.

Other tests have been conducted (some involving humans, some involving engines which could be parameterized, and which humans could opt to play against): www.chessprogramming.org/Test-Positions
@Toadofsky said in #44:
> Other tests have been conducted (some involving humans, some involving engines which could be parameterized, and which humans could opt to play against): www.chessprogramming.org/Test-Positions

This is a really interesting site - thanks for sharing. I'm just starting to read about some of the different positions and in at least a number of cases, the question seems to be whether or not humans can solve the position at all. What would be neat for an assessment (and just in general) is having a database of what incorrect solutions are offered by players of different strength. Does lichess record that for their puzzles, I wonder?