Differences between revisions 5 and 6
Revision 5 as of 2013-04-09 21:27:14
Size: 5217
Editor: 108-231-78-45
Comment:
Revision 6 as of 2013-04-10 18:41:31
Size: 5684
Editor: 32
Comment:
Deletions are marked like this. Additions are marked like this.
Line 1: Line 1:
This page describes some initial steps that a user can take to help track down the cause of performance problems that might occur between Derby releases. For example, if an application executes a query against Derby 10.1 and it finishes in some amount of time, but the same query in the same application against Derby 10.2 takes longer, then this document describes some steps a user can take to help developers determine the cause of the slow-down. This page describes some initial steps that a user can take to help track down the cause of performance problems that might occur between Derby releases. For example, if an application executes a query against Derby 10.5 and it finishes in some amount of time, but the same query in the same application against Derby 10.8 takes longer, then this document describes some steps a user can take to help developers determine the cause of the slow-down.
Line 7: Line 7:
If possible, try to separate the time taken for query compilation and the time taken for query execution. One way to do this is to use a Prepared``Statement for the query. Measure how long it takes for the "prepareStatement(...)" call to finish, and then measure how long it takes for the "execute" or "executeQuery" call to finish. The first measurement is the compilation time; the second is the execution time. If possible, try to separate the time taken for query compilation and the time taken for query execution. One way to do this is to use a Prepared``Statement for the query. Measure how long it takes for the "prepareStatement(...)" call to finish, and then measure how long it takes for the "execute" or "executeQuery" call to finish. The first measurement is the compilation time; the second is the execution time.   If you don't know which query is causing the problem, set the "derby.language.logStatementText" property to log the exact statement parameters, start and end times to help you track down the slow query. For details on how to do that, see [[https://builds.apache.org/job/Derby-docs/lastSuccessfulBuild/artifact/trunk/out/devguide/index.html|"Working with Derby properties"]] in the Derby Developer Guide and the [[https://builds.apache.org/job/Derby-docs/lastSuccessfulBuild/artifact/trunk/out/ref/rrefproper32213.html| Derby Property Reference]] in the Derby Reference Guide.
Line 11: Line 12:
Tell Derby to log the query plans for its queries by setting the "derby.language.logQueryPlan" system property to true. For details on how to do that, see [[http://db.apache.org/derby/docs/dev/tuning/ctunsetprop34818.html|"Working with Derby properties"]] in the Derby tuning guide.

Once that property is set, re-run the query. Then open derby.log and find the query plan that was chosen for the query. Copy that query plan and hold on to it.
Tell Derby to log the query plans for its queries by setting the "derby.language.logQueryPlan" property to true. Once that property is set, re-run the query. Then open derby.log and find the query plan that was chosen for the query. Copy that query plan and hold on to it.
Line 36: Line 35:
If that's as far as you were able to go (i.e. you were unable to do step 3), then send these numbers and the query plans to the derby-user mailing list and explain the performance problem that you are seeing. Hopefully the community can then use the info you've provided to help isolate and resolve the issue. If you have a database and/or DDL and can attach that to a Jira issue, so much the better. If that's as far as you were able to go (i.e. you were unable to do step 3), then send these numbers and the query plans to the [[http://www.apache.org/foundation/mailinglists.html | derby-user mailing list ]] and explain the performance problem that you are seeing. Hopefully the community can then use the info you've provided to help isolate and resolve the issue. If you have a database and/or DDL and can attach that to a Jira issue, so much the better.

This page describes some initial steps that a user can take to help track down the cause of performance problems that might occur between Derby releases. For example, if an application executes a query against Derby 10.5 and it finishes in some amount of time, but the same query in the same application against Derby 10.8 takes longer, then this document describes some steps a user can take to help developers determine the cause of the slow-down.

Some other general performance tips are available in ApacheCon presentations and in the Derby Tuning Guide of the Derby Documentation.

Step 1: Compilation or Execution?

If possible, try to separate the time taken for query compilation and the time taken for query execution. One way to do this is to use a PreparedStatement for the query. Measure how long it takes for the "prepareStatement(...)" call to finish, and then measure how long it takes for the "execute" or "executeQuery" call to finish. The first measurement is the compilation time; the second is the execution time. If you don't know which query is causing the problem, set the "derby.language.logStatementText" property to log the exact statement parameters, start and end times to help you track down the slow query. For details on how to do that, see "Working with Derby properties" in the Derby Developer Guide and the Derby Property Reference in the Derby Reference Guide.

Step 2: Query plans

Tell Derby to log the query plans for its queries by setting the "derby.language.logQueryPlan" property to true. Once that property is set, re-run the query. Then open derby.log and find the query plan that was chosen for the query. Copy that query plan and hold on to it.

Now, in addition to the logQueryPlan property, also set the "derby.optimizer.noTimeout" property to true. This tells the optimizer to try out ALL join orders until it finds what it thinks is truly the best one. NOTE: depending on the complexity of your query, it could potentially take the optimizer a long time to finish optimizing. If you don't have the time to wait, then feel free to skip this step; but this can be useful info so if you're able, this is a good thing to do.

Once the noTimeout property is set, re-run the query again, and again open derby.log to find out what query plan the optimizer chose.

Step 3: Numbers and query plans for older release

If it's possible for you to use or recreate the database and to access it with the earlier Derby release (the one showing better performance), then repeat steps 1 and 2 for that earlier release.

Step 4: Database and/or DDL

It is often much easier to figure out why the optimizer makes the choices it makes when the person investigating knows what database objects exist. If you have a database that shows the performance problem and that you can share with the community, this can help a LOT in tracking down the problem. If that's the case, mention this is in your emails and attach the database to a Jira issue for the problem.

Otherwise, if you do not want to (or are unable to) provide the full database, a secondary option is to generate the DDL for the database using Derby's dblook utility. The full DDL is easiest to generate and is preferred, but if that's too much information, you could also just try to generate the DDL for the specific tables that are referenced by the query in question.

The more information about the database you can provide, the better. Note even if you can provide the database you may get more immediate help from the community if you also provide the DDL in your original request for help to the community.

Step 5: Bring it to the Derby community

At this point you should have rough estimates for the amount of time taken to compile and execute the query against the later Derby release, and you should have two query plans for that release: one for when the query is run as "normal", and one for when it is run with "derby.optimizer.noTimeout=true".

If that's as far as you were able to go (i.e. you were unable to do step 3), then send these numbers and the query plans to the derby-user mailing list and explain the performance problem that you are seeing. Hopefully the community can then use the info you've provided to help isolate and resolve the issue. If you have a database and/or DDL and can attach that to a Jira issue, so much the better.

If you completed step 3, as well, then please send that info to derby-user, too. If you're so inclined you can take a look at the query plans to see if anything obvious has changed from one release to the other--for example, use of indexes or different join strategies--and describe your findings on the list. But if that's not your itch, feel free to just post the info to the Derby list and to ask the community for help in figuring out what the problem is.

This is, of course, just the first step in diagnosing the problem. But this data--executions times, query plans and DDL--can go a long ways toward successfully diagnosing performance issues that have come up from one Derby release to another. So start with these steps and then see what happens...

PerformanceDiagnosisTips (last edited 2013-04-10 18:41:31 by 32)