View previous topic :: View next topic |
Author |
Message |
Skywing
Joined: 03 Jan 2008 Posts: 321
|
Posted: Fri Feb 29, 2008 22:29 Post subject: MySQL resultset handling |
|
|
edit by Papillon: split from this post.
While we're on the topic of xp_mysql...
Incidentally, re-reviewing this more, looks like there's a leak in xp_mysql if you call "NEXT" to fetch a rowset from a query that is only expected to have one rowset and do not continue iterating "NEXT" until an empty rowset is returned.
This is because the current result, if we've already fetched it, is not freed in general until we try to fetch another rowset and come back empty.
To be clear, however, this is not a regression (in my patch-set) and has been an always-existing bug in the mainline distribution.
There is also this code fragment in xp_mysql (MySQL::Execute) that is highly suspicious to me (also something that has also, always existed in mainline, and is not a new thing in my tree):
Code: | // throw away last resultset if a SELECT statement failed
if (_strnicmp(query, wxT("SELECT"), 6) == 0)
{
mysql_free_result(result);
result = NULL;
row = NULL;
num_fields = 0;
} |
This seems just completely bogus. It's operating on the current querystring and based on what that is, it's deciding whether to delete the last rowset returned by the previous querystring for the immediately previous query. It looks like this might have been an attempt to avoid the memory leak I discussed previously, but it doesn't really work unless you have a query that fails after every successful SELECT that means the conditions I mentioned above, and furthermore, the failing query is also a SELECT.
Does the original author or current maintainer of xp_mysql have any comments on that case? Otherwise I am inclined to fix this by unconditionally releasing the last rowset when we get around to doing another query, and removing the attempted parsing of the current querystring for "SELECT", which doesn't make sense to me at all. However, I'm not exactly in expert at the mysql client library, so if there's anything I'm missing before I make that change, I'd certainly like to know about it... |
|
Back to top |
|
|
Papillon x-man
Joined: 28 Dec 2004 Posts: 1060 Location: Germany
|
Posted: Sat Mar 01, 2008 10:30 Post subject: |
|
|
Skywing wrote: | While we're on the topic of xp_mysql...
Incidentally, re-reviewing this more, looks like there's a leak in xp_mysql if you call "NEXT" to fetch a rowset from a query that is only expected to have one rowset and do not continue iterating "NEXT" until an empty rowset is returned.
This is because the current result, if we've already fetched it, is not freed in general until we try to fetch another rowset and come back empty.
|
Let me try to make sure I understand this correctly. Do you think that the memory region pointed to by "result" is not freed by mysql_free_result in certain circumstances ? If so, please point out the exact situation where this can occur, because I do not see it. They way I see it, mysql_free_result is called in any case, not matter what the program flow is.
Skywing wrote: |
To be clear, however, this is not a regression (in my patch-set) and has been an always-existing bug in the mainline distribution.
|
Of course
Skywing wrote: |
There is also this code fragment in xp_mysql (MySQL::Execute) that is highly suspicious to me (also something that has also, always existed in mainline, and is not a new thing in my tree):
Code: | // throw away last resultset if a SELECT statement failed
if (_strnicmp(query, wxT("SELECT"), 6) == 0)
{
mysql_free_result(result);
result = NULL;
row = NULL;
num_fields = 0;
} |
This seems just completely bogus. It's operating on the current querystring and based on what that is, it's deciding whether to delete the last rowset returned by the previous querystring for the immediately previous query. It looks like this might have been an attempt to avoid the memory leak I discussed previously, but it doesn't really work unless you have a query that fails after every successful SELECT that means the conditions I mentioned above, and furthermore, the failing query is also a SELECT.
|
No, actually it's quite simple. Imagine a construct like this:
Code: |
SQLExecDirect("select * form my_table");
while (SQLFetch() == SQL_SUCCESS)
{
s = SQLGetData(1);
}
|
This will fail of course (invalid sql command). Now if you do not delete the previous resultset, the NWScript code will never notice because the plugin happily returns the rows from the old resultset of some previous query. On the other hand, if for example an UPDATE statement failed, the old resultset is still valid and thus not deleted here. _________________ Papillon |
|
Back to top |
|
|
Gryphyn
Joined: 20 Jan 2005 Posts: 431
|
Posted: Sat Mar 01, 2008 10:58 Post subject: |
|
|
Papillon wrote: |
No, actually it's quite simple. Imagine a construct like this:
Code: |
SQLExecDirect("select * form my_table");
while (SQLFetch() == SQL_SUCCESS)
{
s = SQLGetData(1);
}
|
This will fail of course (invalid sql command). Now if you do not delete the previous resultset, the NWScript code will never notice because the plugin happily returns the rows from the old resultset of some previous query. On the other hand, if for example an UPDATE statement failed, the old resultset is still valid and thus not deleted here. |
This is partly because SQLFetch() returns column count (of the result set) to determine success or failure, rather than the 'error code' provided by the API call. If you check NWNXJr or SQLServer they both implement SQLSuccess/SQLFailure based on the ODBC API, prior to assuming a valid result set (even if it is empty), MySQL should be doing something similar.
Cheers
Gryphyn |
|
Back to top |
|
|
Skywing
Joined: 03 Jan 2008 Posts: 321
|
Posted: Sat Mar 01, 2008 15:57 Post subject: |
|
|
Papillon wrote: | Let me try to make sure I understand this correctly. Do you think that the memory region pointed to by "result" is not freed by mysql_free_result in certain circumstances ? If so, please point out the exact situation where this can occur, because I do not see it. They way I see it, mysql_free_result is called in any case, not matter what the program flow is.
|
Yep, looks like you're right. Missed the part that freed the MySQL::result in the non-failure case, somehow. Definitely not a bug that I see, anymore.
Papillon wrote: | No, actually it's quite simple. Imagine a construct like this:
Code: |
SQLExecDirect("select * form my_table");
while (SQLFetch() == SQL_SUCCESS)
{
s = SQLGetData(1);
}
|
This will fail of course (invalid sql command). Now if you do not delete the previous resultset, the NWScript code will never notice because the plugin happily returns the rows from the old resultset of some previous query. On the other hand, if for example an UPDATE statement failed, the old resultset is still valid and thus not deleted here. |
That makes a bit more sense given what you pointed out earlier. However, I'm still not seeing why you're destroying all but the current resultset before initiating the query, and then only destroying the current resultset in either that specific failure case or when both a query succeeds and it returns a resultset of it's own. Is the intention then to allow queries that do not return data to be interspersed while parsing out the result of a data-returning query?
If this is the case, multi-statement support still seems to break in that any remaining statements which we haven't processed yet, i.e. any remaining resultsets, were already thrown away by this loop at the start of MySQL::Execute:
Code: | // eat any leftover resultsets so mysql does not get out of sync
while (mysql_more_results(&mysql))
{
mysql_next_result(&mysql);
newResult = mysql_store_result(&mysql);
mysql_free_result(newResult);
} |
IOW, it would seem that once you do an UPDATE or the like, only the result of the "current" statement is available for further processing.
I suppose fragment above is what's been throwing me for a loop (no pun intended), as it's inconsistent with the policy of how MySQL::result is released.
If the intention is to allow UPDATEs followed by further processing of rows returned by a SELECT or any other statement which returns rows, then I think a more sustainable approach would be to just introduce the concept of a query result object that encapsulates the MYSQL_RES object.
On the subject of that (allowing nested queries)...
I was looking through the MySQL manual, http://dev.mysql.com/doc/refman/5.1/en/c-api-multiple-queries.html - and things are a bit unfortunate there. Seems like things should work to encapsulate a MYSQL_RES object returned by mysql_store_result, with respect to being able to execute subsequent queries on that connection without destroying the data in that MYSQL_RES object. However, multiple statement support seems to throw that off a bit.
http://dev.mysql.com/doc/refman/5.1/en/mysql-next-result.html seems to claim that it's illegal to advance to the next resultset without destroying the current resultset, which would seem to indicate that you couldn't just suck out all of the available resultsets when you make the query and store them into your result object.
Specifically...
Quote: | Before calling mysql_next_result(), you must call mysql_free_result() for the preceding statement if it is a query that returned a result set. |
However, I should note that we already violate this rule with the loop I mentioned above, as that will iterate through and destroy any resultsets *after* the current resultset, and then later on destroy the current resultsets. So, either the mysql documentation is wrong and it is supported, which would be a good thing in that you would be able to support nested queries with multiple statement support (which MySQL seems to claim is required for stored procedures), or things are already a bit broken...
Anyways, as I said, my familiarity with the MySQL client libraries is at this point just limited to browsing their documentation (which is a bit on the scarce side on a number of points, unfortunately). Ideally, however, nested queries supported with a handle to a result blob object you can iterate through seems like it'd an architecturally cleaner approach, if that's what we're trying to achieve here with the keeping of the current resultset across a non-SELECT. (It also gets us out of having to try and parse querystrings, which is ugly.)
This would of course require a new set of nwscript APIs to allow the addition of an extra parameter (result object handle or the like) returned by the query call and passed to fetch/getdata calls. |
|
Back to top |
|
|
Papillon x-man
Joined: 28 Dec 2004 Posts: 1060 Location: Germany
|
Posted: Sat Mar 01, 2008 22:25 Post subject: |
|
|
Gryphyn wrote: |
This is partly because SQLFetch() returns column count (of the result set) to determine success or failure, rather than the 'error code' provided by the API call. If you check NWNXJr or SQLServer they both implement SQLSuccess/SQLFailure based on the ODBC API, prior to assuming a valid result set (even if it is empty), MySQL should be doing something similar.
|
Yes, but this is exactly what the MySQL docs have to say on this subject. If I remember correctly, there is no better way to find out if a query did succeed or not, at least according to the docs. _________________ Papillon |
|
Back to top |
|
|
Papillon x-man
Joined: 28 Dec 2004 Posts: 1060 Location: Germany
|
Posted: Sat Mar 01, 2008 22:57 Post subject: |
|
|
Skywing wrote: |
That makes a bit more sense given what you pointed out earlier. However, I'm still not seeing why you're destroying all but the current resultset before initiating the query, and then only destroying the current resultset in either that specific failure case or when both a query succeeds and it returns a resultset of it's own. Is the intention then to allow queries that do not return data to be interspersed while parsing out the result of a data-returning query?
|
Exactly. It was requested by some users back then and as it made sense, I implemented it that way. Btw, it is no wonder you were seeing potential bugs and found that part of the code strange: It is.
Skywing wrote: | If this is the case, multi-statement support still seems to break in that any remaining statements which we haven't processed yet, i.e. any remaining resultsets, were already thrown away by this loop at the start of MySQL::Execute:
|
Yes, you are right. This is the compromise which I accepted back then as "good enough" for practical use. The main idea was to allow for stored procedures to return result sets, and stack multiple statements, where maybe one or two return rows. It was not meant to cover every thinkable combination with dozens of result sets.
Skywing wrote: |
If the intention is to allow UPDATEs followed by further processing of rows returned by a SELECT or any other statement which returns rows, then I think a more sustainable approach would be to just introduce the concept of a query result object that encapsulates the MYSQL_RES object.
|
Hmm yes, probably. But is it really worth it ? What happens if some crazy programmer selects * from a really big table ? Do you really want to retrieve megabytes of data from the server in one swoop and cache it locally in the mysql plugin ? That was (and is) my reasoning for not handling the result sets outside of MySQL.
Skywing wrote: |
On the subject of that (allowing nested queries)...
I was looking through the MySQL manual, http://dev.mysql.com/doc/refman/5.1/en/c-api-multiple-queries.html - and things are a bit unfortunate there. Seems like things should work to encapsulate a MYSQL_RES object returned by mysql_store_result, with respect to being able to execute subsequent queries on that connection without destroying the data in that MYSQL_RES object. However, multiple statement support seems to throw that off a bit.
http://dev.mysql.com/doc/refman/5.1/en/mysql-next-result.html seems to claim that it's illegal to advance to the next resultset without destroying the current resultset, which would seem to indicate that you couldn't just suck out all of the available resultsets when you make the query and store them into your result object.
Specifically...
Quote: | Before calling mysql_next_result(), you must call mysql_free_result() for the preceding statement if it is a query that returned a result set. |
|
MySQL gets totally upset (out of sync) if you issue another statement that returns a result set, if you have no retrieved all result sets from the previous query. That is the raison d'être of the loop. But what you are suggesting here should work ok, technically, as long as you retrieve every row of every result set at once.
Skywing wrote: |
However, I should note that we already violate this rule with the loop I mentioned above, as that will iterate through and destroy any resultsets *after* the current resultset, and then later on destroy the current resultsets. So, either the mysql documentation is wrong and it is supported, which would be a good thing in that you would be able to support nested queries with multiple statement support (which MySQL seems to claim is required for stored procedures), or things are already a bit broken...
|
This might be a problem, yes. The code accounts for the situation where you cycle through one result set, but do not call NEXT and instead issue another statement. I have to look up my regression tests if the case you mentioned is tested and whether it is a real problem which needs to be coded around.
Skywing wrote: |
Anyways, as I said, my familiarity with the MySQL client libraries is at this point just limited to browsing their documentation (which is a bit on the scarce side on a number of points, unfortunately). Ideally, however, nested queries supported with a handle to a result blob object you can iterate through seems like it'd an architecturally cleaner approach, if that's what we're trying to achieve here with the keeping of the current resultset across a non-SELECT. (It also gets us out of having to try and parse querystrings, which is ugly.)
This would of course require a new set of nwscript APIs to allow the addition of an extra parameter (result object handle or the like) returned by the query call and passed to fetch/getdata calls. |
This is something I would like to avoid. I was playing around with the idea of implementing a way to identify multiple result sets (for example by number), but then found this to be way above what 99% of NWNX4 users will need, while still introducing a complex new concept which would not be backwards compatible.
No, just kidding. I was just lazy ! _________________ Papillon |
|
Back to top |
|
|
Skywing
Joined: 03 Jan 2008 Posts: 321
|
Posted: Sun Mar 02, 2008 0:23 Post subject: |
|
|
Papillon wrote: | Hmm yes, probably. But is it really worth it ? What happens if some crazy programmer selects * from a really big table ? Do you really want to retrieve megabytes of data from the server in one swoop and cache it locally in the mysql plugin ? That was (and is) my reasoning for not handling the result sets outside of MySQL. |
Well, we're already going to be caching megabytes of data anyway in that case, since we're using mysql_store_result instead of mysql_use_result. Regardless, if somebody wants to be foolish and shoot themselves in the foot, they're going to find a way to do it one way or another. I've in general found nested statements to not be an entirely uncommon thing myself, but that is just my experience (to be clear, not in NWN2, but in other projects - I don't do toolset work in NWN2 presently). Adding support for this doesn't make things worse at all unless you go out of your way to shoot yourself in the foot with it, e.g., do numerous levels of nested queries returning massive datasets. Plenty of other reasonable use cases besides that, though.
Papillon wrote: | This might be a problem, yes. The code accounts for the situation where you cycle through one result set, but do not call NEXT and instead issue another statement. I have to look up my regression tests if the case you mentioned is tested and whether it is a real problem which needs to be coded around. |
I'd tend to think that it's probably more just a case of the docs being wrong if I had to hazard a guess, but I'd definitely be interested to know. If you can't pull out multiple resultsets at the same time, then the local cursor idea kind of falls out of the water. Hooray for limited MySQL in that regard, if that comes to pass. (If that is indeed so, then this really limits the ability to do nested queries clientside, which seems fishy to me though, as lots of database abstraction layers (e.g. perl dbi) seem to at least claim to support that for MySQL, at least after a few minutes of quick research. Perhaps they just don't support multiple resultsets/stored procedures, although that also sounds a bit... fishy.)
Papillon wrote: | This is something I would like to avoid. I was playing around with the idea of implementing a way to identify multiple result sets (for example by number), but then found this to be way above what 99% of NWNX4 users will need, while still introducing a complex new concept which would not be backwards compatible.
No, just kidding. I was just lazy ! |
No reason that it would need to break backwards compatibility I think, the old APIs can still be supported by having the notion of an implicit current result-glob-object that is used for them. |
|
Back to top |
|
|
Gryphyn
Joined: 20 Jan 2005 Posts: 431
|
Posted: Sun Mar 02, 2008 1:22 Post subject: |
|
|
Just to bring the multi-result set thing into focus.
The most common appearance of this is when using StoredProc's.
Now not that many system procs are executed for NWN data, but they typically have multiple result sets, and are not always consistance in column count for each set.
This also highlights the TWO information channels (I'll call them that) that are used for returning data.
1. Feedback: these are the Messages/Codes that are returned (Info,Warn,Error etc)
2. ResultSet: this is the the actual data that is returned (or not if empty)
Send any feedback to the Log...very useful for debugging
Code: | //Diagnostics
bool SQLServer::LogDiagnostics(SQLSMALLINT HandleType, SQLHANDLE Handle)
{
if (sqlResult == SQL_SUCCESS) return true;
if (sqlResult == SQL_NO_DATA) return true;
wxString Symbol = "!";
if (sqlResult == SQL_SUCCESS_WITH_INFO) wxString Symbol = "*";
SQLSMALLINT RecNumber = 1;
while ((DiagResult = SQLGetDiagRec(HandleType, Handle, RecNumber, DiagIdentifier, &DiagInfoPtr, DiagMessage, sizeof(DiagMessage), &DiagLength)) != SQL_NO_DATA)
{
//wxLogMessage(wxT("* %s(%d) %s"), DiagIdentifier, DiagInfoPtr, DiagMessage);
wxLogMessage(wxT("%s [%s] %s"), Symbol, DiagIdentifier, DiagMessage);
RecNumber++;
}
if (sqlResult == SQL_SUCCESS_WITH_INFO) return true;
return false;
} |
Code: |
...routine to GET data.
<snip>
sqlResult = SQLGetData(hQuery[Context], nParam2, SQL_C_CHAR, Buffer, sizeof(Buffer), &isNull);
if (!LogDiagnostics(SQL_HANDLE_STMT, hQuery[Context])) break;
if (sqlResult == SQL_NO_DATA) break;
if (isNull == SQL_NULL_DATA) break;
</snip>
...process the ResultSet
|
Note the above is using ODBC, but the MySQL API can easily replace it.
*hQuery[Context] allows for side-by-side queries. (but it's just a query handle)
added to this...
Code: | int SQLServer::nwStatus()
{
switch (sqlResult)
{
case SQL_SUCCESS: return 1;
case SQL_SUCCESS_WITH_INFO: return 1;
case SQL_NO_DATA: return 2;
default: return 0;
}
} |
I distinguish between SUCCESS, NO_DATA (also SUCCESS) and an ERROR. (this gets back to NWN)
Cheers
Gryphyn |
|
Back to top |
|
|
|
|
You cannot post new topics in this forum You cannot reply to topics in this forum You cannot edit your posts in this forum You cannot delete your posts in this forum You cannot vote in polls in this forum
|
Powered by phpBB © 2001, 2005 phpBB Group
|