Published by Marco on 8. Jan 2016 14:52:40
Updated by Marco on 15. Jan 2016 10:34:29
------------------------------------------------------------------------

An oft-quoted bit of software-development sagacity is

"Premature optimization is the root of all evil."

As is so often the case with quotes -- especially those on the Internet [1] --
this one has a slightly different meaning in context. The snippet above invites
developers to overlook the word "premature" and interpret the received wisdom as
"you don't ever need to optimize."

Instead, Knuth's full quote actually tells you how much of your code is likely
to be affected by performance issues that matter -- the critical 3% mentioned
at the end of the passage below.

"Programmers waste enormous amounts of time thinking about, or worrying about,
the speed of noncritical parts of their programs, and these attempts at
efficiency actually have a strong negative impact when debugging and maintenance
are considered. We should forget about small efficiencies, say about 97% of the
time: premature optimization is the root of all evil. Yet we should not pass up
our opportunities in that critical 3%."

[An Optimization Opportunity in Quino [2]]

In other articles, I'd mentioned that we'd upgraded several solutions to Quino 2
in order to test that the API was solid enough for a more general release. One
of these products is quite large and has a test suite of almost 1500 tests.
The product involves a lot of data-import and manipulation and the tests include
several scenarios where Quino is used very intensively to load, process and save
data.

These tests started taking about 25% longer to run after the upgrade to
Quino 2.

[Measuring Execution Speed]

Before doing anything else -- making educated guesses as to what the problem
could be, for example -- we measure. At Encodo, we use JetBrains dotTrace to
collect performance profiles.
 
There is no hidden secret: the standard procedure is to take a measurement
before and after the change and to compare them. However, so much had changed
from Quino 1.13 to Quino 2 -- e.g. namespaces and type names had changed -- that
while dotTrace was able to show some matches, the comparisons were not as useful
as usual.

A comparison between codebases that hadn't changed so much is much easier, but I
didn't have that luxury.

[Tracking the Problem]

Even setting aside the less-than-optimal comparison, it was an odd profile.
Ordinarily, one or two issues stick out right away, but the slowness seemed to
suffuse the entire test run. Since the direct profiling comparison was
difficult, I downloaded test-speed measurements as CSV from TeamCity for the
product where we noticed the issue.

How much slower, you might ask? The test that I looked at most closely took
almost 4 minutes (236,187ms) in the stable version, but 5:41 (about 341,000ms)
in the latest build -- roughly 44% slower.

[image]

This test was definitely one of the largest and longest tests, so it was
particularly impacted. Most other tests that imported and manipulated data
ranged anywhere from 10% to 30% slower.

When I looked for hot-spots, the profile unsurprisingly showed me that database
access took up the most time. The issue was more subtle: while database access
still used the most time, it was using a smaller percentage of the total time.
Hot-spot analysis wasn't going to help this time. Sorting by absolute times and
using call counts in the tracing profiles yielded better clues.

The tests were slower when saving and also when loading data. But I knew that
the ORM code itself had barely changed at all. And, since the product was using
Quino so heavily, the stack traces ran quite deep. After a lot of digging, I
noticed that creating the ExpressionContext to hold an object while evaluating
expressions locally seemed to be taking longer than before. This was my first
real clue.

Once I was on the trail, I found that evaluating calls (getting objects) that
used local evaluation was also consistently slower.
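
To give a sense of the pattern involved -- the names below are illustrative and
make no claim to match Quino's actual API -- local evaluation wraps each object
in a fresh context before evaluating an expression against it:

    // Hypothetical sketch; type and member names are illustrative only.
    using System;
    using System.Collections.Generic;

    public sealed class ExpressionContext
    {
      public ExpressionContext(object target) { Target = target; }

      public object Target { get; }
    }

    public static class LocalEvaluation
    {
      // A new context is created for every object, so any overhead in the
      // ExpressionContext constructor is multiplied by the number of rows
      // processed during an import.
      public static void EvaluateAll(
        IEnumerable<object> rows,
        Func<ExpressionContext, object> expression)
      {
        foreach (var row in rows)
        {
          var context = new ExpressionContext(row);

          expression(context);
        }
      }
    }

With many thousands of rows and several expressions per row, even a small
constant cost in that constructor is repeated millions of times.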

[Don't Get Distracted]

[image]

Once you start looking for places where performance is not optimal,
you're likely to start seeing them everywhere. However, as noted above, 97% of
them are harmless.

To be clear, we're not optimizing because we feel that the framework is too slow
but because we've determined that the framework is now slower than it used to be
and we don't know why.

Even after we've finished restoring the previous performance (or maybe even
making it a little better), we might still be able to easily optimize further,
based on other information that we gleaned during our investigation.

But we want to make sure that we don't get distracted and start trying to FIX
ALL THE THINGS instead of just focusing on one task at a time. While it's
somewhat disturbing that we seem to be creating 20 million ExpressionContext
objects in a 4-minute test, that is also how we've always done it, and, until
now, no one has complained about the speed.

Sure, if we could reduce that number to only 2 million, we might be even faster
[3], but the point is that we used to be faster on the exact same number of
calls -- so fix that first.

[A Likely Culprit: Scope]

I found a likely candidate in the Scope class, which implements the IScope
interface. This type is used throughout Quino, but the two use-cases that affect
performance are:

   1. As a base for the ExpressionContext, which holds the named values and
      objects to be used when evaluating the value of an IExpression. These
      expressions are used everywhere in the data driver.
   2. As a base for the poor-man's IOC used in Stage 2 of application execution.
      [4]

The former usage has existed unchanged for years; its implementation is unlikely
to be the cause of the slowdown. The latter usage is new and I recall having
made a change to the semantics of which objects are returned by the Scope in
order to make it work there as well.
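
To make the discussion concrete, here is a minimal sketch of the kind of
list-backed, type-matched lookup described above. It is illustrative only --
the real Scope is more elaborate, and the actual change is examined in part II:

    // A minimal, hypothetical sketch -- not the actual IScope/Scope code.
    using System.Collections.Generic;
    using System.Linq;

    public sealed class Scope
    {
      private readonly List<object> _objects = new List<object>();

      public void Add(object instance) => _objects.Add(instance);

      // Returns the first registered object assignable to the requested
      // type: a linear scan over everything held by the scope.
      public TService Get<TService>()
        => _objects.OfType<TService>().FirstOrDefault();
    }

A lookup like this is perfectly serviceable for the handful of objects in the
startup IOC, but the expression-evaluation use case calls it constantly, so any
extra work per call -- added, say, to make the result more predictable -- is
magnified enormously.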

[How could this happen?]

You may already be thinking: smooth move, moron. You changed the behavior of a
class that is used everywhere for a tacked-on use case. That's definitely a
valid accusation to make.

In my defense, my instinct is to reuse code wherever possible. If I already have
a class that holds a list of objects and gives me back the object that matches a
requested type, then I will use that. If I discover that the object that I get
back isn't as predictable as I'd like, then I improve the predictability of the
API until I've got what I want. If the improvement comes at no extra cost, then
it's a win-win situation. However, this time I paid for the extra functionality
with degraded performance.

Where I really went wrong was that I'd made two assumptions:

   1. I assumed that all other usages were also interested in improved
      predictability.
   2. I assumed that all other usages were not performance-critical. When I
      wrote the code you'll see in part II, I distinctly remember thinking:
      it's not fast, but it'll do, and I'll make it faster if it becomes a
      problem. Little did I know how difficult it would be to find the problem.

[Preventing future slippage]

"Avoid changing a type shared by different systems without considering all
stakeholder requirements."

I think a few words on process here are important. Can we improve the
development process so that this doesn't happen again? One obvious answer would
be to avoid changing a type shared by different systems without considering all
stakeholder requirements. That's a pretty tall order, though. Including this in
the process will most likely lead to less refactoring and improvement out of
fear of breaking something.

We discussed above how completely reasonable assumptions and design decisions
led to the performance degradation. So we can't be sure it won't happen again.
What we would like, though, is to be notified quickly when there is performance
degradation, so that it appears as a test failure.

"Notify quickly when there is performance degradation"

Our requirements are captured by tests. If all of the tests pass, then the
requirements are satisfied. Performance is a non-functional requirement. Where
we could improve Quino is to include high-level performance tests that would
sound the alarm the next time something like this happens. [5]
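
A minimal sketch of what such a test could look like, assuming an NUnit-style
test suite and an illustrative time budget (the real tests are tracked in the
issue cited in the footnote):

    // Hypothetical example of a high-level performance test; the budget and
    // the scenario are placeholders.
    using System.Diagnostics;
    using NUnit.Framework;

    [TestFixture]
    public class PerformanceTests
    {
      // Derived from a known-good baseline plus some headroom; the goal is a
      // loud test failure when a critical path gets significantly slower.
      private const long ImportBudgetInMilliseconds = 5000;

      [Test]
      public void ImportStaysWithinTimeBudget()
      {
        var stopwatch = Stopwatch.StartNew();

        RunRepresentativeImport();

        stopwatch.Stop();
        Assert.That(
          stopwatch.ElapsedMilliseconds,
          Is.LessThan(ImportBudgetInMilliseconds),
          "Import exceeded its time budget; investigate before releasing.");
      }

      private static void RunRepresentativeImport()
      {
        // Placeholder for a representative load/process/save scenario.
      }
    }

Such budgets are admittedly blunt instruments -- they vary with build-agent
load -- but even a generous threshold would have caught a regression like the
one described here (10% to 44%, depending on the test).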

Enough theory: in "part II", we'll describe the problem in detail and take a
crack at improving the speed. See you there.

--------------------------------------------------------------------------------


[1] In fairness, the quote is at least properly attributed. It really was Donald
    Knuth who wrote it.

[2] By "opportunity", of course, I mean that I messed something up that made
    Quino slower in the new version.

[4] See the article "Quino 2: Starting up an application, in detail" for more
    information on this usage.

[5] I'm working on this right now, in issue "Add standard performance tests"
    for release 2.1.