View previous topic :: View next topic |
Author |
Message |
Disco
Joined: 06 Dec 2006 Posts: 152
|
Posted: Wed May 02, 2007 18:00 Post subject: MySQL (sub)query making server go bonkers |
|
|
Hello,
I want to get all the accounts associated with the cdkeys of a known account in a huge table with session data. This is the general data format:
Code: | cdkey account date
ABCABC Lulu ...
EFGEFG Wannabe ...
XYZXYZ Yadayada ...
KLMKLM Lulu ...
KLMKLM Serbo ... |
Say.. I have the account Lulu and I want all the accounts associated with it. That's Lulu (ABCABC & KLMKLM ), and Serbo ( KLMKLM ).
I started with this subquery:
Code: | SELECT account FROM table WHERE cdkey =
( SELECT cdkey FROM t1 WHERE account = 'Lulu' GROUP BY cdkey LIMIT 1)
GROUP BY account ORDER BY account |
This only works with one individual result from the subquery, hence the LIMIT. Some subqueries return multiple results, so I tried:
Code: | SELECT account FROM table WHERE cdkey IN
( SELECT cdkey FROM t1 WHERE account = 'Lulu' GROUP BY cdkey)
GROUP BY account ORDER BY account |
This made the server unreachable for 3 hours. Complete 100% load, couldn't remote login or anything. I also tried this at home, with the same table and a dual-core machine. One core at 100%. Indexing on cdkey+account doesn't help.
I think it's checking every possible combination in some horribly inefficient manner when using IN (or =ANY). Intuitively I'd expect it to run the same query as the = option, but n cdkey times. The = option returns inmediately, so I'd guess runnign that 3 or 4 times wouldn't give a noticeable delay. In the test case the subquery shoudl return 2 keys. We do have 800000+ records, though.
I already circumvented this behaviour by getting the cdkeys first, and then the accounts for each key (this is about PHP scripts, not NWN), but I am still curious why the IN statement is nuking MySQL. |
|
Back to top |
|
|
Grinning Fool
Joined: 12 Feb 2005 Posts: 264
|
Posted: Wed May 02, 2007 18:35 Post subject: |
|
|
I'm... honestly not entirely sure what that query would do, especially with the group by clause. Try this instead (I think I ahve it right, your table and column names are a bit hard to follow )
Quote: |
SELECT table.account FROM table, t1 WHERE table1cdkey = t1.cdkey AND t1.account ='Lulu' |
_________________ Khalidine, a NWN2 persistent world
Looking for volunteers. |
|
Back to top |
|
|
Gryphyn
Joined: 20 Jan 2005 Posts: 431
|
Posted: Thu May 03, 2007 0:00 Post subject: |
|
|
Code: | SELECT DISTINCT account, cdkey
FROM table
WHERE cdkey IN
(SELECT DISTINCT cdkey FROM table WHERE account = 'Lulu')
ORDER BY account
/*remove cdkey from the outer select if you only want account info*/
|
This should be all you need.
*Note an index on (account, cdkey) will greatly improve performance.
--the index MUST use this order of fields
As to why you get Nuked.
your scanning ALL 800K+ records and sorting them, then filtering the result (3 execution steps on all 800K records)
then your comparing this subset against all 800K records for EACH row in the subset. *well not you, the database.
Check your execution plan - you'll see several billion row accesses.
Even though you're collecting all this info, just how relevant is it historically?
I personally would extend the index to include the date and make a date-range part of the selection
-- date BETWEEN <today> - <n days> AND <today>
if n = 7 you'd only be getting info about the last week (30->month) etc
This also gives you the opportunity to archive off your very old data.
--not sure if MySQL optimizes BETWEEN on dates better the "date > ?date?", some databases do.
Cheers
Gryphyn |
|
Back to top |
|
|
Disco
Joined: 06 Dec 2006 Posts: 152
|
Posted: Thu May 03, 2007 8:07 Post subject: |
|
|
Well, we got a lot of players, and some have this pesky habit of switching accounts and cdkeys like I change socks. Which is quite often, let me assure you that. We store some goodies on their accounts, and sometimes they want them switched from account 1 to 2. I just want to make a dropdown list of all their associated accounts so it's a matter of selecting the proper one and clicking 'move', that's all.
I was indeed thinking about a table join first but couldn't think of a simple one. Thanks!
I now see what I did wrong: the = option generated a "too many result sets" error, so I put the LIMIT onnit. That kinda killed the whole subquery. |
|
Back to top |
|
|
Grinning Fool
Joined: 12 Feb 2005 Posts: 264
|
Posted: Thu May 03, 2007 17:11 Post subject: |
|
|
When it's possible possible, joining the table (even if to itself) will perform better than a subquery. _________________ Khalidine, a NWN2 persistent world
Looking for volunteers. |
|
Back to top |
|
|
xaltos
Joined: 03 Jun 2006 Posts: 31 Location: Germany
|
Posted: Thu May 03, 2007 19:49 Post subject: |
|
|
I would collect the IP address of the player as well.
Maybe not all , but a few, if they change a lot.
I have seen it several times that a player that got banned for some reason tries to sneak back into the server with a new Gamespy ID and/or Gamespy ID.
In this case is the IP very helpfull to connect the diffrent accounts.
On the other side are maybe several members of a family active on your server and share the same accounts or the same computers or even they use a router and all have the same IP.
" Oh , Papa isn't at home, I can use his computer, he has the better CPU"
So you can't never trust your data, if you don't know the background of the players. _________________ xaltos
NWN2_Audolo |
|
Back to top |
|
|
Disco
Joined: 06 Dec 2006 Posts: 152
|
Posted: Fri May 04, 2007 0:37 Post subject: |
|
|
We collect a lot of stuff, including IP. Privacy isn't something I respect when it comes to players that try to mess up our server or other players. This implementation is a service to them, though.
I see I made a bit of a mistake when quoting my queries, btw. This
Code: | SELECT account FROM table WHERE cdkey IN
( SELECT cdkey FROM t1 WHERE account = 'Lulu' GROUP BY cdkey)
GROUP BY account ORDER BY account |
should be this
Code: | SELECT account FROM t1 WHERE cdkey IN
( SELECT cdkey FROM t1 WHERE account = 'Lulu' GROUP BY cdkey)
GROUP BY account ORDER BY account
|
|
|
Back to top |
|
|
Disco
Joined: 06 Dec 2006 Posts: 152
|
Posted: Fri May 04, 2007 9:09 Post subject: |
|
|
Bit of an update:
Code: |
SELECT DISTINCT account
FROM table
WHERE cdkey IN
(SELECT DISTINCT cdkey FROM table WHERE account = 'Lulu')
ORDER BY account |
still nukes....
This one works beautifully, though:
Code: | SELECT DISTINCT t1.account FROM table as t1, table as t2 WHERE t1.cdkey = t2.cdkey AND t2.account ='Lulu' |
Can't say I really understand what I am doing, but... a positive result is a positive result.
Thanks for all the info, folks! |
|
Back to top |
|
|
|