![]() |
#1 |
Angband Devteam member
|
Looking for a SQL guru
Is there anybody with some didactic talent who would be willing to help me get to grips with SQL? I have the stats from twenty-five million dungeons for 3.3.0 on my server (250,000 runs of levels 1-100; password available on request for those interested in browsing), but I lack the skill to make the queries I want. I could of course RTFM but I thought it might be more enjoyable to learn from a fellow Angbander, if anyone was interested in sharing their expertise by email on an irregular basis. (I mention didactic talent because I'd quite like to learn the concepts, rather than just get someone to write queries for me.)
For example, I don't even know if it's possible for queries to produce two-dimensional (tabular) output, instead of one-dimensional lists. So I know that I can ask it "list the numbers of base items which are found with +speed between +1 and +4 on dungeon levels 1 to 50", but can I ask it instead for a table, showing me the numbers of items with each of +1, +2, +3 and +4? Or do I just have to run four separate queries for that? Ho hum. The joys of stats await ...
__________________
"3.4 is much better than 3.1, 3.2 or 3.3. It still is easier than 3.0.9, but it is more convenient to play without being ridiculously easy, so it is my new favorite of the versions." - Timo Pietila |
![]() |
![]() |
![]() |
#2 |
Adept
Join Date: Nov 2009
Posts: 173
![]() |
Which database engine are you using?
Also it maybe interesting to see the table structure.
__________________
Proud candidate for the Angband Darwin Award! |
![]() |
![]() |
![]() |
#3 |
Administrator
|
Sure, feel free to fire your questions my way.
__________________
See the elves and everything! http://angband.oook.cz |
![]() |
![]() |
![]() |
#4 |
Veteran
Join Date: Jun 2007
Posts: 1,372
![]() |
IMO, SQL is not really all that great for this kind of "spelunking". You'd probably be better of just doing this using custom python/perl scripts or something.
... at least as long as you can either 1) fit data into memory, or 2) do whatever you need to do in a few passes. If you can't you're probably going to need a full database with good indexes anyway... that is likely to be at least as much effort as just writing a few scripts. Of course, if you just want to learn... have at it! ![]() EDIT: In other words: RDMBS'es aren't magic that you can sprinkle onto a "lots of data" problem and expect things to just work faster. |
![]() |
![]() |
![]() |
#5 |
Rookie
Join Date: Aug 2011
Location: Münster, Germany
Posts: 7
![]() |
What better way to have a first post than this?
I wouldn't call myself SQL guru, but I'm using it on a daily basis, it's what I do at work... What you are looking for here is one of many aggregation functions, namely COUNT(). In your query, you want to have two fields: the value of +speed and how often that value was found. So your query begins like this (keywords in upper case): SELECT plusspeed, COUNT(*) FROM yourtable The function COUNT() does what it says; it counts the rows in your result. Only COUNT(*) in your query would give you the total sum of results fitting your filters. Put some detail fields in your SELECT and you can group that sum (more on that later). Then you want to filter what is queried, so your query becomes: SELECT plusspeed, COUNT(*) FROM yourtable WHERE (dungeonlevel BETWEEN 1 AND 50) AND (plusspeed BETWEEN 1 AND 4) This won't work, it will give you an error message. This is because of that aggregration function COUNT(). You need to group every field in SELECT which is _not_ aggregated (in this example only plusspeed): SELECT plusspeed, COUNT(*) FROM yourtable WHERE (dungeonlevel BETWEEN 1 AND 50) AND (plusspeed BETWEEN 1 AND 4) GROUP BY plusspeed Last, you could order your result: SELECT plusspeed, COUNT(*) FROM yourtable WHERE (dungeonlevel BETWEEN 1 AND 50) AND (plusspeed BETWEEN 1 AND 4) GROUP BY plusspeed ORDER BY plusspeed If you want to go into detail, e. g. see at which dungeonlevel which speed enhancement was found, you can have two (or more) detail fields. Just remember to group these: SELECT dungeonlevel, plusspeed, COUNT(*) FROM yourtable WHERE (dungeonlevel BETWEEN 1 AND 50) AND (plusspeed BETWEEN 1 AND 4) GROUP BY dungeonlevel, plusspeed ORDER BY dungeonlevel, plusspeed The last two queries should work for your table, given you query existing fields, of course. Feel free to ask away, I find writing SQL to be quite fun. Also, if you happen to have more questions, I'd be happy to try to help you answering those. Knowing (and having access) to your database would help there... ;-P Hope you can understand this blubber of mine... ;-) Cheers, NinjaN |
![]() |
![]() |
![]() |
#6 |
Angband Devteam member
|
@NinjaN - thank you, that's awesome! Really easy to follow - I went to the database and re-phrased your query into this:
SELECT level, pval, COUNT(*) FROM `wearables_pval_flags` WHERE (level BETWEEN 1 AND 50) AND (pval BETWEEN 1 AND 4) GROUP BY level, pval ORDER BY level, pval and it worked, first time! There were 67 items with +4 speed dropped on dl1, out of a total of 734 items with +speed dropped on dl1 (that seems quite high - 0.3% chance in any given game of getting a speed item on dl1?!). Many thanks to you and pav for your willingness to help - I will come back with more queries on queries. @SaThaRiel - the table structure is set out in the source code in src/stats/ - myshkin designed it, and so far it has given me all the info I have asked for! @AnonymousHero - you may be right that perl/whatever would be better for running custom queries, but we chose SQL so that the raw data was transferable between people in a format that's widely used. Generating the data takes quite a long time (1.2s per run for me) and has a footprint of ~2.5GB, so we wanted to enable people to query data without having to run stats themselves.
__________________
"3.4 is much better than 3.1, 3.2 or 3.3. It still is easier than 3.0.9, but it is more convenient to play without being ridiculously easy, so it is my new favorite of the versions." - Timo Pietila |
![]() |
![]() |
![]() |
#7 | |
Prophet
Join Date: Apr 2007
Location: Climbing up from hole I just dug.
Posts: 4,096
![]() |
Quote:
|
|
![]() |
![]() |
![]() |
#8 | |
Angband Devteam member
|
Quote:
EDIT: Doh!! I just realised that my query looks at ALL pvals, not just speed. I need to add WHERE of_idx IS speed ... ... ok, done that - but now the numbers look too low. My query is: SELECT level, pval, COUNT(*) FROM `wearables_pval_flags` WHERE (level BETWEEN 1 AND 100) AND (pval BETWEEN 1 AND 50) AND (of_idx IS 13) GROUP BY level, pval ORDER BY level, pval and this shows me only 30 +speed items dropped at dl1 in 250,000 games. More worryingly, it doesn't show any kind of spike at dl5 (where =telep comes into depth) or dl15 (where =escaping comes into depth). So I am now wondering whether the wearables_pval_flags table is right - I'll need to cross-check against a query about those specific rings to be certain.
__________________
"3.4 is much better than 3.1, 3.2 or 3.3. It still is easier than 3.0.9, but it is more convenient to play without being ridiculously easy, so it is my new favorite of the versions." - Timo Pietila Last edited by Magnate; August 31, 2011 at 14:38. |
|
![]() |
![]() |
![]() |
#9 |
Adept
Join Date: Nov 2009
Posts: 173
![]() |
Magnate: Im not getting any smarter from the src/stats directory...looks more like some db driver capabilities.
Can you mail me the database-file? You should get my mail address from trac. You qurey looks really wrong to me. But i need to figure out whats wrong.
__________________
Proud candidate for the Angband Darwin Award! |
![]() |
![]() |
![]() |
#10 |
Angband Devteam member
|
It's about 500MB, so I don't think that'd be a good idea. I sent you the url and pw in a PM, but had some problems (Oook said something about a missing security token a few times). If you don't get the PM, let me know and I'll email you.
__________________
"3.4 is much better than 3.1, 3.2 or 3.3. It still is easier than 3.0.9, but it is more convenient to play without being ridiculously easy, so it is my new favorite of the versions." - Timo Pietila |
![]() |
![]() |
![]() |
Currently Active Users Viewing This Thread: 1 (0 members and 1 guests) | |
Thread Tools | |
Display Modes | |
|
|