Angband Forums

Angband Forums (http://angband.oook.cz/forum/index.php)
-   Vanilla (http://angband.oook.cz/forum/forumdisplay.php?f=3)
-   -   list of bugs and wishes (http://angband.oook.cz/forum/showthread.php?t=9046)

fph October 28, 2018 22:15

Quote:

Originally Posted by Nick (Post 133944)
This is a common complaint, usually refuted by an appeal to advanced mathematical statistics :)

As an expert in advanced mathematics :p, I can tell you that the binomial test for 26 failures out of 335 at a 4% failure rate gives a one-sided p-value of 0.00114. This means that, if the true failure rate really is 4%, you can expect to observe 26 or more failures in 335 casts only about 0.11% of the time.

Usually, in a single, non-repeated experiment, a result is considered statistically significant if the p-value is 0.05 or below. So yes, that looks like an unusually high failure rate in the statistical sense, and it is worth investigating. There is usually a reporting bias, in that only 'unusual' results get reported to the devs, so the best thing would be to collect completely new data to test it.
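
The one-sided p-value above can be reproduced without any statistics library. The following is a minimal sketch in Python using only the standard library; the function name `binom_tail` is made up for this illustration, and it assumes each cast is an independent trial with a fixed 4% failure chance.

```python
from math import comb

def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more failures in n casts."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# 26 or more failures in 335 casts at a nominal 4% failure rate
p_value = binom_tail(26, 335, 0.04)
print(f"one-sided p-value: {p_value:.5f}")
```

The sum runs over every outcome at least as extreme as the observed one, which is what "one-sided" means here.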

Pete Mack October 28, 2018 22:16

BTW: for the rest-one-turn keymap, I suggest 's', and making it rest 5 or 10 turns (or until disturbed). I used to use the old search command for this all the time: wait until a nearby monster moves or appears around a corner.

fph October 28, 2018 22:36

Quote:

Originally Posted by fph (Post 133945)
There is usually a reporting bias, in that only 'unusual' results get reported to the devs, so the best thing would be to collect completely new data to test it.

Ran some tests: created a human mage, advanced him with debug commands to level 21 and enough INT to get a 4% failure rate, and cast Detect Monsters 335 times. Got 14 failures, for a p-value of 0.47. Verdict: the RNG seems to work OK. Complaint "refuted by an appeal to advanced mathematical statistics". :)
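
A simulation gives a rough cross-check of both figures in this thread. The sketch below is not the game's code: it uses Python's `random` module as a stand-in RNG, and `count_failures`, `tail_fraction`, `RUNS`, and the seed are arbitrary choices for illustration. It replays the 335-cast experiment many times at a true 4% failure rate and reports how often a run produces at least 14 or at least 26 failures.

```python
import random

def count_failures(rng, casts=335, fail_chance=0.04):
    """Simulate one experiment and return how many casts failed."""
    return sum(rng.random() < fail_chance for _ in range(casts))

def tail_fraction(results, threshold):
    """Fraction of simulated experiments with at least `threshold` failures."""
    return sum(r >= threshold for r in results) / len(results)

RUNS = 5000                    # arbitrary; more runs give a tighter estimate
rng = random.Random(2018)      # fixed seed so the sketch is repeatable
results = [count_failures(rng) for _ in range(RUNS)]

print("share of runs with >= 14 failures:", tail_fraction(results, 14))
print("share of runs with >= 26 failures:", tail_fraction(results, 26))
```

With this few runs the estimate for 26 or more failures is noisy, so an exact binomial calculation remains the better number for the rare tail.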

Pete Mack October 28, 2018 22:51

For low-probability events, a Poisson distribution is generally appropriate. It always has a fat tail (because you can't have fewer than zero events). Yes, it will eventually converge to normal (as any distribution will). But even at expectation value E=14, while the curve looks close to normal, the tail is significantly fatter at 1.5*E. The estimate of 0.11% is too low by a significant amount.

fph October 28, 2018 23:57

Quote:

Originally Posted by Pete Mack (Post 133948)
For low-probability events, a Poisson distribution is generally appropriate. It always has a fat tail (because you can't have fewer than zero events). Yes, it will eventually converge to normal (as any distribution will). But even at expectation value E=14, while the curve looks close to normal, the tail is significantly fatter at 1.5*E. The estimate of 0.11% is too low by a significant amount.

I don't follow you. These are independent Bernoulli trials, so the number of failures is exactly binomially distributed. I'm not approximating anything with a normal. The 0.11% figure is what is returned by
Code:

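import scipy.stats
scipy.stats.binom_test(26, 335, 0.04, alternative='greater')  # SciPy >= 1.12: scipy.stats.binomtest(26, 335, 0.04, alternative='greater').pvalue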
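
For comparison, the exact binomial tail and the Poisson approximation raised earlier in the thread can be computed side by side. This is a sketch using only Python's standard library; `binom_tail` and `poisson_tail` are names invented here, and the Poisson mean is assumed to be n*p = 13.4.

```python
from math import comb, exp, factorial

def binom_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def poisson_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), via the complement of the lower sum."""
    return 1.0 - sum(exp(-lam) * lam**i / factorial(i) for i in range(k))

n, p, k = 335, 0.04, 26
print(f"binomial tail: {binom_tail(k, n, p):.5f}")
print(f"Poisson tail:  {poisson_tail(k, n * p):.5f}")
```

The Poisson figure comes out somewhat higher than the binomial one, but it is only an approximation in this setting; for independent casts with a fixed failure chance the binomial value is the exact one.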

Pete Mack October 29, 2018 03:58

Right. My bad.

PowerWyrm October 29, 2018 09:17

Note: I've already failed a spell with 1% failure rate five times in a row :)

Sky October 29, 2018 12:37

Quote:

Originally Posted by fph (Post 133947)
Ran some tests: created a human mage, advanced him with debug commands to level 21 and enough INT to get a 4% failure rate, and cast Detect Monsters 335 times. Got 14 failures, for a p-value of 0.47. Verdict: the RNG seems to work OK. Complaint "refuted by an appeal to advanced mathematical statistics". :)

All you've proven is that under controlled test circumstances the RNG doesn't fail.
I hope it's evident why that isn't enough.

Everyone has reported unusual RNG results. At some point you gotta accept that it's not an epidemic of bias and something in the code is doing something it shouldn't. And god knows what it is. What is the RNG based on anyway, system clock? That has to go through the OS, which could unwittingly manipulate the results.

Empirical observation does have its merits, y'know.

Oh btw congrats PowerWyrm on a 1:10,000,000,000 chance.
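
The 1 in 10,000,000,000 figure is the chance that one particular set of five casts all fail at a 1% failure rate (0.01^5). Over a long playing history, the chance of seeing such a streak somewhere is larger. A rough sketch in Python, where `prob_streak` is a hypothetical helper and 100,000 casts is an arbitrary illustrative total, not a measured one:

```python
def prob_streak(n, p, r):
    """Probability of at least one run of r consecutive failures in n casts.

    state[j] is the probability of being on a current streak of j failures
    without having completed a run of r so far.
    """
    state = [1.0] + [0.0] * (r - 1)
    hit = 0.0
    for _ in range(n):
        # A failure while already on a streak of r-1 completes the run.
        hit += state[r - 1] * p
        alive = sum(state)
        # A success resets the streak; a failure extends it by one.
        state = [alive * (1 - p)] + [s * p for s in state[:r - 1]]
    return hit

print("exactly five casts:", prob_streak(5, 0.01, 5))
print("somewhere in 100,000 casts:", prob_streak(100_000, 0.01, 5))
```

Under these assumptions the streak is still rare, but far less so than the single-block figure suggests, and the odds improve further once many players are counted.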

fph October 29, 2018 19:52

Quote:

Originally Posted by Sky (Post 133960)
All you've proven is that under controlled test circumstances the RNG doesn't fail.
I hope it's evident why that isn't enough.

Everyone has reported unusual RNG results. At some point you gotta accept that it's not an epidemic of bias and something in the code is doing something it shouldn't. And god knows what it is. What is the RNG based on anyway, system clock? That has to go through the OS, which could unwittingly manipulate the results.

Empirical observation does have its merits, y'know.

Oh btw congrats PowerWyrm on a 1:10,000,000,000 chance.

As far as I understand (I didn't write the code) the RNG does one call to time() in Rand_init() when the game is started, then it's all state-based. If I read the branches correctly it uses this RNG.
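
To illustrate what "one call to time(), then all state-based" means in practice, here is a small Python sketch. It uses Python's own `random.Random` as a stand-in and is not Angband's generator; `spell_fails` and the way the seed is taken are assumptions made for the example.

```python
import random
import time

def spell_fails(rng, fail_percent):
    """Return True if a cast fails, given a failure rate in percent."""
    return rng.randrange(100) < fail_percent

# The clock is consulted once, at start-up, to pick a seed.
seed = int(time.time())
rng = random.Random(seed)

# From here on, every result depends only on the generator's internal state.
first_run = [spell_fails(rng, 4) for _ in range(335)]

# Re-seeding with the same value replays exactly the same sequence,
# which shows the clock and the OS play no further part.
replay = random.Random(seed)
second_run = [spell_fails(replay, 4) for _ in range(335)]

print("failures:", sum(first_run), "identical replay:", first_run == second_run)
```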

Nick October 29, 2018 20:42

Quote:

Originally Posted by Sky (Post 133960)
Everyone has reported unusual RNG results. At some point you gotta accept that it's not an epidemic of bias and something in the code is doing something it shouldn't.

Actually, no. Humans are built to recognise patterns; if there is no actual pattern, we will still find one. The correct thing to do is to have a mathematically sound source of randomness, check the implementation from time to time, and otherwise accept that people will continue to tell you that the RNG is broken :)
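
One simple way to "check the implementation from time to time" is a frequency test on the generator's output. The sketch below is only an illustration in Python: it tests Python's `random.Random`, not Angband's RNG, and the helper name `chi_square_uniform`, the bucket count, and the sample size are arbitrary choices.

```python
import random

def chi_square_uniform(draws, buckets):
    """Pearson chi-square statistic for draws assumed uniform on range(buckets)."""
    counts = [0] * buckets
    for d in draws:
        counts[d] += 1
    expected = len(draws) / buckets
    return sum((c - expected) ** 2 / expected for c in counts)

rng = random.Random(1)
N, BUCKETS = 100_000, 10

fair = [rng.randrange(BUCKETS) for _ in range(N)]
# A deliberately broken source for contrast: the smaller of two draws favours low values.
skewed = [min(rng.randrange(BUCKETS), rng.randrange(BUCKETS)) for _ in range(N)]

# With 10 buckets (9 degrees of freedom), a statistic above roughly 27.9
# would occur by chance only about once in a thousand tests of a sound generator.
print("fair source:  ", round(chi_square_uniform(fair, BUCKETS), 2))
print("skewed source:", round(chi_square_uniform(skewed, BUCKETS), 2))
```

A frequency test like this only catches gross bias; it says nothing about correlations between successive draws.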


All times are GMT +1.