Configuration Spaces and Histories

Before we dive into versioned filesystem design and implementation details, let's first take a look at the concepts we're talking about. We'll begin by discussing what a configuration management system actually operates on.

N.B.: Although the following text uses mathematical concepts and notation, it's not intended to be a rigorous dissertation on configuration theory. We'll often make statements and draw conclusions without providing proof. Therefore, dear reader, you can either trust we're right, or do your own homework to prove (or disprove) our assertions. We will, however, try to conscientiously flagβš‘ all leaps of faith in the text.

Definition: A data configuration is a unique pattern of data that can be represented as exactly one distinct sequence of binary digits.

Definition: A data configuration space is a metric space (𝔹, 𝒹) where:

Definition: A configuration history is a sequence of elements of 𝔹; in other words, it is a curve in (𝔹, 𝒹).

Defining the Metric

We'll now show that we can indeed define a metric for 𝔹. One example is the Levenshtein distance,

applied to the binary-digit sequence representation of data configurations. It is easy to showβš‘ that 𝓁 has all the required properties of a metric. In addition, it has the following properties:

In the rest of this text, we'll specifically talk about (𝔹, 𝓁) β€” that is, the data configuration space with a Levenshtein distance metric; also referred to as edit distance. Even though the term "edit distance" is less precise in general, it fits the concept of edits (or edit transformations), which we'll discuss later on.

The History Curve

FS2/Theory (last edited 2012-12-01 05:30:22 by brane)