Measuring Repository Performance

Repositories are just passive collections of files on disk, but their physical layout affects performance just as much as the hardware and software configuration does.

This page is about

  • how to produce realistic on-disk layouts
  • what to test
  • how to produce meaningful test results

The methodology applies to a wide range of setups and repository formats. FSFS is being used as an example but the basic rationales should apply to others as well.

How to produce realistic on-disk layouts

TL;DR

Either properly clone your data storage (see below) or use a script like this for FSFS repositories:

copy_repo.py SRC DST COUNT SEPARATOR_SIZE

SRC            Immediate parent folder of all the repositories to copy.
DST            Folder to copy into; current contents will be lost.
COUNT          Number of copies to create of each source repository.
SEPARATOR_SIZE Additional spacing, in kBytes, between revisions.

e.g. copy_repo.py /tmp/acme /test/acme 4 128

Problem

Non-packed FSFS repositories may consist of millions of files of varying sizes. Even for seemingly simple queries, data from thousands of them needs to be processed. If they happen to be stored physically close to each other on disk, the information can be retrieved with just a handful of I/Os. Should they be dispersed across the storage medium, it will take thousands of individual I/Os that are hard for the OS to predict.

Therefore, it is paramount that your test setup mimics the physical properties of the class of production environments you want to test for.

Note that this applies not only to HDDs (spinning disks) but to any medium with non-zero I/O latency, including SSDs. The issue is whether OS- or hardware-side prefetching will be effective such that most requests can be served from some level of cache.

Things are complicated by the fact that the OS, its file system code and the SAN software give the user very little control over where the data ends up physically. The best one may hope for is that the same sequence of operations produces similar - if not identical - layouts on previously empty disks.

Ideal solution

Clone the production environment. Everything else is just an approximation. Various methods to produce good approximations are discussed in later sections.

Cloning in this case, however, must produce the same (or reasonably similar) disk images. Merely copying files will create the same unrealistic setup as the Naive approach below. That also precludes differential file copies as a means to keep the test environment in sync with the production environment.

Naive approach

Simply copying an existing repository to the target storage will often store data densely and roughly in path order. For non-packed FSFS repositories, for instance, that means revision content is stored consecutively almost producing a packed repository. The same applies to its revision properties, making the enumeration of all revprops via 'svn log $repo/' very fast.

"Naturally grown" repositories store data in revision order, though. As a result, it may be cheaper to get from the revision file to the revprops (for time stamps etc.) than in a copied repo. OTOH, the revprops themselves may get much more spread out if the OS decides to intersperse them with revision data.

For servers hosting multiple active repositories, "naturally grown" repositories look even more different. Because many commits to other repositories tend to happen between any two commits to the same repository, the individual revisions of all these repositories may interleave on disk, creating large and random gaps between the individual revisions of a repository.

Copying packed FSFS repositories creates the same problems in theory but less so in practice: because the individual pack files are large and get written quickly in a single go, naturally grown packs look very much like copied packs. Their sheer size requires some I/O when going from one pack to the next, so the physical distance between packs becomes a secondary factor.

Note that the OS or SAN may still decide to place small files within the same directory close to each other on disk. That's fine and should produce good, *reproducible* performance figures. But we must try to not inadvertently aid the OS in doing this.

'svnadmin load'

This creates repository data in the correct order and if your server hosts only one major repository that may even be a fair approximation of the production environment.

There is one caveat, though. An OS may decide to delay the allocation of physical disk space for short periods of time and treat revision files created in rapid succession differently from those created minutes, hours or even days apart. This problem is not solved by any of the non-cloning approaches.

Scripted copies

Use a script to control exactly how data is being copied. The minimum you want to do is to copy repositories in revision order with a result similar to 'svnadmin load' above.

With this Python script, you also get random data files of up to the specified size written in between revisions. Those get deleted once the copy is complete. Depending on the OS / SAN heuristics, this may mimic the presence of other repositories growing as the copied one does. Moreover, multiple repositories get copied at once in a round-robin interleaved scheme, i.e. copy rev 0 of all repositories, then rev 1 of all repositories, etc.
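
The core of the scheme can be sketched in a few lines of Python. This is only an illustration, not the actual copy_repo.py: it assumes a sharded, non-packed FSFS layout (db/current, db/revs/<shard>/<rev> with the default shard size of 1000), copies nothing but the revision files (revprops and the rest of the repository structure are omitted), and all helper names are made up.

import os, random, shutil

def youngest_rev(repo):
    # FSFS keeps the youngest revision number in db/current
    with open(os.path.join(repo, "db", "current")) as f:
        return int(f.read().split()[0])

def rev_file(repo, rev):
    # default FSFS sharding: 1000 revisions per shard directory
    return os.path.join(repo, "db", "revs", str(rev // 1000), str(rev))

def copy_round_robin(src_repos, dst_root, spacing_kb):
    """Copy rev 0 of all repos, then rev 1 of all repos, and so on, writing a
       temporary spacer file of random size after every copied revision."""
    youngest = {repo: youngest_rev(repo) for repo in src_repos}
    spacers = []
    for rev in range(max(youngest.values()) + 1):
        for repo in src_repos:
            if rev > youngest[repo]:
                continue
            dst = rev_file(os.path.join(dst_root, os.path.basename(repo)), rev)
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            shutil.copyfile(rev_file(repo, rev), dst)
            # spacer: up to spacing_kb kB of random data, removed at the end
            spacer = os.path.join(dst_root,
                                  "spacer-%s-%d" % (os.path.basename(repo), rev))
            with open(spacer, "wb") as f:
                f.write(os.urandom(random.randint(1, spacing_kb) * 1024))
            spacers.append(spacer)
    for spacer in spacers:            # delete spacers once all copies are complete
        os.remove(spacer)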

The spacing parameter should be equal to the average amount of revision data written to other repos for each revision / packed shard copy round, IOW something like (total size of "missing" repos) / (number of revs in copied repos). As that may result in too much disk space being transiently allocated, you may use 128 here. That plays well with typical RAID stripe and OS prefetch sizes.
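
As a made-up example: if the repositories you leave out total roughly 10 GB and the repositories you do copy contain about 80,000 revisions combined, the spacing works out to roughly 10,000,000 kB / 80,000 = 125 kB per revision, i.e. close to the suggested default of 128.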

What to test

Before you conduct performance tests, you need to decide upon the scenarios as well as what kind of conclusions you want to be able to draw from the results. For the latter, it is often useful to identify the components and settings involved and to be able to isolate their contribution.

The layers, operations, scenarios and access methods can be combined freely.

Layers

For repository performance, the following layers should be investigated in isolation from the others:

  • physical storage layer, i.e. the actual disks. This will highlight differences in I/O demands.
  • SAN / OS disk caching and OS file handling. This will identify issues with using OS resources efficiently.
  • SVN operating on internal caches. This will show differences in internal processing overhead.

Operations

User-side read operations:

  • Checkout / export of a whole development line (e.g. project trunk). This traverses the repository along the "path dimension" and also has to collect time stamp and user information from the revision properties.
  • Plain log without further options, except --limit, on the repository root. This collects the revision properties without touching the rev contents.
  • Log on some path inside the repository. This traverses the repository sparsely along the "time dimension". When combined with the -v and / or -g options, more auxiliary data needs to be gathered.

Commits and merges are interesting but may depend too much on working copy performance.

Scenarios

There are three basic scenarios.

  • A single request. The server has no concurrent requests to handle. This is the most controlled setup and simulates the normal load in many small and mid-sized deployments.
  • Many concurrent requests to the same repo. Simulates the occasional peak caused e.g. by teams updating to the latest release.
  • Concurrent requests to many repositories. Simulates Monday morning peak loads and repository hosting services.

Access methods

Access methods, a.k.a. RA layers, differ in access pattern and in their support for various internal optimizations:

  • HTTP:// is the most popular access method. 1.8+ clients use a more randomized access pattern.
  • SVN:// has the lowest network overhead and a fixed access pattern. Its zero-copy code stresses the internal caches.
  • FILE:// has the lowest latency (no network) but does not use various internal optimizations and caching options.

Data to collect

Most data collected will be dynamic, such as execution times. As some operations may be CPU limited, it is important to measure not only elapsed time but also the time spent in user code and in the kernel.
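
On POSIX systems, a minimal sketch for collecting all three numbers from a driver script is shown below; it combines wall-clock timing with the getrusage() delta for child processes, discards the command's output, and uses a made-up command line. Tools such as /usr/bin/time report the same values.

import resource, subprocess, time

def measure(cmd):
    """Run cmd, discard its output, return (elapsed, user, kernel) in seconds."""
    # RUSAGE_CHILDREN is cumulative over all waited-for children, hence the delta
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.time()
    subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    elapsed = time.time() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    return (elapsed,
            after.ru_utime - before.ru_utime,    # time spent in user code
            after.ru_stime - before.ru_stime)    # time spent in the kernel

print(measure(["svn", "log", "-q", "--limit", "1000", "file:///repos/repo1"]))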

Other data may be interesting to identify bottlenecks that mask other performance aspects:

  • I/O count and data transferred between SAN/disk and OS
  • I/O count and data transferred between OS and SVN
  • I/O latencies and throughput
  • Network load between SVN client and server

Please note that taking those measurements may impact execution times.

How to test

Most investigations will involve repeated runs and various repository / test case combinations. So, script it.

Problem

Server deployments - like most other computers - are complex systems, i.e. their behavior can be neither fully controlled nor entirely predicted. As with any other experiment, we need to identify sources of systematic error and compensate for random error. Random error manifests as noise and can be reduced by repetition.

Systematic error in our context is everything that favors one configuration over the other in our test environment but not in the production environment. Notable sources for that are:

  • "Lucky" / "unlucky" placement of data on disk
  • Using shared, i.e. uncontrolled, resources like the SAN and network.
  • Having no way to completely empty disk caches at all levels.
  • Complex or hybrid LRU/LFU caching schemes may "pin" certain data in cache for some repositories but not for others.
  • Running the client on the same machine and disks as the server.

Solution

Data placement issues can be mitigated by the script mentioned above. It creates multiple copies of the same repository interleaved with copies of other repositories. As a result, "good" locations should either be beneficial to all repositories or be restricted to some but probably not all copies of a given repository.

An operation should be measured for all repositories and all copies before continuing with the next operation or configuration. The repository order should be repository major, as in the loop sketch after the list:

  • repo1, copy 1
  • ...
  • repoN, copy 1
  • repo1, copy 2
  • ...
  • repoN, copy 2
  • ...
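
A driver loop implementing this order could look like the following sketch; measure() is the helper from the earlier timing example, and the CSV columns as well as the sample operation table are made up for illustration.

import csv

def run_tests(operations, repos, copies):
    """operations: list of (name, fn) where fn(repo, copy) builds a command line."""
    with open("results.csv", "w", newline="") as out:
        log = csv.writer(out)
        log.writerow(["operation", "copy", "repo", "elapsed", "user", "kernel"])
        for op_name, make_cmd in operations:
            for copy in range(1, copies + 1):      # one full round per copy ...
                for repo in repos:                 # ... covering every repository
                    elapsed, user, kernel = measure(make_cmd(repo, copy))
                    log.writerow([op_name, copy, repo, elapsed, user, kernel])

run_tests([("log", lambda repo, copy:
            ["svn", "log", "-q", "file:///test/%s-%d" % (repo, copy)])],
          repos=["acme", "widgets"], copies=4)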

With that ordering, dispersed copies of the same repository also address the following issues:

  • Repetitions to reduce random noise are already implied by the scheme.
  • Peaks and lows in shared resource performance are much more likely to affect multiple repositories equally than only the copies of one repository. This makes for fair comparisons.
  • It is easy to create a data set >> cache sizes. After each round, the caches are likely to be cool again for copy 1 repos.

  • Items getting "stuck" or "pinned" in disk cache will either be restricted to individual copies or affect all copies of a given repository. In the latter case, it is likely that the same will happen in the production environment; it is therefore fair and no longer a systematic error.

Running client and server on the same machine may be a valid use-case, e.g. if a build server or check script needs local working copies etc. Due to the added noise and extra load on CPU, disk and network (in case of a SAN), the performance will be significantly different from what external clients would see.

The easiest solution here is to use svn-bench. It still creates CPU load but usually much less than the standard client.

Heating up caches

Cache implementations are rarely circular-buffer-style LRUs. Hence, it can be hard to know how hot a given cache actually is and where, or whether, it would settle after a while.

To get a rough indication of the impact of hot caches on performance, it is enough to repeat an operation once. If you want some indication of how much variation there is, repeat the operation twice. Proper heat-up has been observed to take about 5 repetitions for results to settle, and even then the system may settle in different states for different test runs.
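
A simple way to look at the heat-up behaviour is to run the same command several times in a row and watch the timings settle. The sketch below reuses the measure() helper from the earlier example; the repetition count is an arbitrary choice.

def heatup_curve(cmd, repetitions=6):
    """Run cmd repeatedly from a (roughly) cold start and report every run."""
    timings = []
    for i in range(repetitions):
        elapsed, user, kernel = measure(cmd)
        print("run %d: %5.2fs elapsed, %5.2fs user, %5.2fs kernel"
              % (i + 1, elapsed, user, kernel))
        timings.append(elapsed)
    return timings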

It is also useful to start the heating process from a defined state, e.g. "all caches cold". How close you can get to such a well-defined starting point depends on the system.

Windows specifics

Out of the box, Windows does not provide means to measure the CPU load of individual processes. However, there are Win32 API functions that return the desired information. This tool can be used to execute commands and to measure their execution time and CPU time while suppressing their stdout.

Controlling the file / disk cache is also next to impossible. However, for our purposes, it is already helpful to remove all cached data from RAM, e.g. using this tool. It should be built as a 64-bit application and will quickly free up RAM such that it can be used for caching new data. The OS may still decide to cache other data in the page file, but that will be dealt with using the multiple-copy approach above.

Analyzing results

First off, always include a full log of your measurement data. This not only prevents selection bias but also preserves a lot of information, such as temporal correlations, that you were initially not interested in.

There is a large body of literature about statistical analysis, so the following are merely a few practical suggestions:

  • If you run your tests with 3 repetitions, use the median, as it is probably the value least affected by cache and disk placement favoritism on one end of the spectrum and by transient drops in system performance on the other end.
  • Identify tests where results deviate by more than x%. These give you an indication of how reliable the median value may be. (A sketch automating both checks follows this list.)
  • Check for correlation between data sequences that follow the same change in configuration. E.g. if the system gets faster with larger caches independent of other parameters, data points counter to this may need further investigation. Conversely, if the influence of one parameter is supported by many data sequences, the confidence is high that this is not a mere measurement artifact.
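
The first two points can be automated along these lines; the CSV layout matches the logging sketch above, and the 10% threshold is an arbitrary example value.

import csv, statistics
from collections import defaultdict

THRESHOLD = 0.10                     # flag anything that spreads by more than 10%

samples = defaultdict(list)
with open("results.csv", newline="") as f:
    for row in csv.DictReader(f):
        samples[(row["operation"], row["repo"])].append(float(row["elapsed"]))

for (operation, repo), values in sorted(samples.items()):
    med = statistics.median(values)
    spread = (max(values) - min(values)) / med
    flag = "  <-- check" if spread > THRESHOLD else ""
    print("%-10s %-10s median %6.2fs  spread %3.0f%%%s"
          % (operation, repo, med, spread * 100, flag))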

A note on CPU load

The actual amount of work done being equal, the CPU time spent by Subversion, including the kernel execution times, slowly grows with the total elapsed time. IOW, if you reduce all waits (fast I/O, fast network, fast server), the total run time may be less than the initial CPU time in the "slow" setup.

So, unless the CPU load is close to 100%, CPU time is not the absolute lower limit to the best-case execution time.
