Differences between revisions 4 and 5
Revision 4 as of 2013-05-07 13:31:25
Size: 1992
Comment:
Revision 5 as of 2013-05-22 12:10:49
Size: 4588
Comment:
Logical Data Model

Notes:

  1. diagrams are yet to follow
  2. many aspects are not yet modelled correctly and will change as further discussion suggests

Entities

Repository. In this context, a mere container for revisions. There may be more than one repository and merge tracking shall work across repository boundaries. Attributes: ID (may or may not be its UUID)

Revision. A (possibly empty) set of atomic changes. They are unalterable and identified by their number within the repository. Per repository, revisions form a single, contiguous line of history. Attributes: Revnum.

Atomic Change. The smallest traceable entity of change, associated with exactly one revision. The list of atomic changes can be modified by the user. To guarantee referential integrity, there is no way to delete an atomic change. Attributes: ID (relative to revision), change type, path, node type, optional: copyfrom path + rev

Change types:

  • no-op (instead of deletion; also for merges that result in zero net change)
  • contents (no distinction between property and contents change)
  • delete
  • add
  • copy (add with copy-from info)
  • rename (implies deletion at copy-from info)

Changes are strictly ordered within a revision. This is necessary since removals, copies and renames may replace the same path with different nodes and that may happen multiple times per revision. Q: Normalize lists, e.g. tree mods first followed by a max 1 contents change?

Merged Change. A reference to an atomic change. The list of merged changes is a sub-element of an atomic change. That list may be empty if there have been no merges in this revision for that path. As a convention, we only record the changes immediately merged, i.e. those pertaining to the merge source instead of all changes along the merge hierarchy.
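
The entities above can be sketched in code to make their relationships concrete. This is only an illustration under assumptions, not a proposed implementation: all class, field and method names (`ChangeRef`, `AtomicChange`, `retire`, `add_change`) are hypothetical, the node type values "file" and "dir" are assumed, and turning a change into a no-op is just one reading of "no-op (instead of deletion)".

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Tuple


class ChangeType(Enum):
    NO_OP = "no-op"
    CONTENTS = "contents"
    DELETE = "delete"
    ADD = "add"
    COPY = "copy"
    RENAME = "rename"


@dataclass(frozen=True)
class ChangeRef:
    """A merged change: a reference to an atomic change (hypothetical layout)."""
    repo_id: str
    revnum: int
    change_id: int


@dataclass
class AtomicChange:
    change_id: int                       # ID relative to the revision
    change_type: ChangeType
    path: str
    node_type: str                       # assumed values: "file" or "dir"
    copyfrom: Optional[Tuple[str, int]] = None  # (path, rev), optional
    merged: List[ChangeRef] = field(default_factory=list)

    def retire(self) -> None:
        """Changes are never deleted; they become no-ops so references stay valid."""
        self.change_type = ChangeType.NO_OP
        self.copyfrom = None


@dataclass
class Revision:
    revnum: int
    changes: List[AtomicChange] = field(default_factory=list)  # strictly ordered

    def add_change(self, change_type: ChangeType, path: str, node_type: str,
                   copyfrom: Optional[Tuple[str, int]] = None) -> AtomicChange:
        # The position in the ordered list doubles as the change ID.
        change = AtomicChange(len(self.changes), change_type, path, node_type, copyfrom)
        self.changes.append(change)
        return change
```

Because a retired change keeps its slot and ID, any `ChangeRef` pointing at it from another revision or repository remains resolvable.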

Operations

Notes from Discussions

  • Merged changes within the same merged change list may cancel each other out. We still need to record all of them.
  • We need to represent undo and blocking merges
  • Operations should be simple
  • There will be different path matching strategies being applied in the following order: last merge, latest common ancestor, (relative) path name.
  • Merge history and "natural" history should be treated (somewhat) uniformly. Thus, we may want a common path mapping index for that. So, store the base path of every merge.
  • What are the semantics of "undo merges" from before the latest common ancestor?
  • Ability to mark a change as "irrelevant" for future merges.
  • The info about blocked merges is part of HEAD. We record the actual merge decisions when they happen. "Block" will only filter future merges and *might* be a basis to suggest undo merges.
  • Merges between "mismatched" sub-paths establish a new path mapping relationship by default (user may override). Rationale: the merge went through, probably with some conflicts left to resolve, and the user committed the result. It's no longer a mere accident in the majority of cases.
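
The ordered path matching strategies mentioned in the notes above (last merge, latest common ancestor, relative path name) could be chained roughly as follows. This is a minimal sketch under assumptions: the lookup tables `last_merge` and `ancestor_of`, the helper names, and the tie-break used when several paths share an ancestor are all hypothetical and stand in for whatever path mapping index is eventually chosen.

```python
from typing import Callable, Dict, List, Optional, Tuple

Strategy = Tuple[str, Callable[[str], Optional[str]]]


def match_path(source_path: str, strategies: List[Strategy]) -> Optional[Tuple[str, str]]:
    """Try each strategy in order; return (strategy name, target path) for the first hit."""
    for name, strategy in strategies:
        target = strategy(source_path)
        if target is not None:
            return name, target
    return None


def make_strategies(last_merge: Dict[str, str], ancestor_of: Dict[str, str],
                    source_root: str, target_root: str) -> List[Strategy]:
    def by_last_merge(path: str) -> Optional[str]:
        # Mapping established by the most recent merge of this path, if any.
        return last_merge.get(path)

    def by_common_ancestor(path: str) -> Optional[str]:
        # Find a path below the target root that derives from the same ancestor.
        ancestor = ancestor_of.get(path)
        if ancestor is None:
            return None
        candidates = sorted(p for p, a in ancestor_of.items()
                            if a == ancestor and p.startswith(target_root + "/"))
        return candidates[0] if candidates else None  # arbitrary tie-break (assumption)

    def by_relative_path(path: str) -> Optional[str]:
        # Same relative path name below the target root.
        if path.startswith(source_root + "/"):
            return target_root + path[len(source_root):]
        return None

    return [("last merge", by_last_merge),
            ("latest common ancestor", by_common_ancestor),
            ("relative path name", by_relative_path)]
```

Under this reading, a committed merge between "mismatched" sub-paths would simply add an entry to `last_merge`, so the new mapping takes precedence on the next merge unless the user overrides it.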

Internal Data Model

Notes from Discussions

  • Lists of merged changes may have significant overlap, e.g. when sync'ing multiple branches with /trunk. We should represent them as lists of shared sub-lists.
  • Having a copy-only sub-set of the history will minimize the sheer amount of data to be processed. Possibly, some extra indexing on that separate storage may add to its usefulness.
  • Indexing the merge data model: Since this can be written / modified after the fact, having a tight representation may be difficult. However, we could have some revision-related storage, such as a few extra files added to a pack file. Rationale: once the relevant range of history has been identified by path mapping, all further operations can be related to revisions or ranges of revisions.
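
The idea of representing merged change lists as lists of shared sub-lists can be illustrated as below. This is a sketch only: the names `SubList` and `MergedChangeList` are hypothetical, a change is reduced to a (revnum, change ID) pair, and nothing here reflects an actual on-disk format.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

ChangeRef = Tuple[int, int]  # (revnum, change ID within the revision)


class SubList:
    """An immutable run of merged changes that several lists may share."""

    def __init__(self, refs: Iterable[ChangeRef]):
        self.refs: Tuple[ChangeRef, ...] = tuple(refs)


class MergedChangeList:
    """A merged change list stored as references to shared sub-lists."""

    def __init__(self, sublists: Optional[List[SubList]] = None):
        self.sublists: List[SubList] = list(sublists or [])

    def __iter__(self) -> Iterator[ChangeRef]:
        # Flatten on demand; the shared sub-lists themselves are never copied.
        for sub in self.sublists:
            yield from sub.refs

    def __len__(self) -> int:
        return sum(len(sub.refs) for sub in self.sublists)


# Two branches sync'ed with /trunk share the same sub-list object.
trunk_sync = SubList([(10, 0), (11, 0), (11, 1)])
branch_a = MergedChangeList([trunk_sync])
branch_b = MergedChangeList([trunk_sync, SubList([(12, 0)])])
```

Here the overlap between `branch_a` and `branch_b` is stored once, and each list only adds the references that are specific to it.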

External Data Model

Notes from Discussions

  • How do undo merges work retroactively?
  • Diff'ing and combining change lists and merges is part of the external model (respectively its operations)
  • There is a significant difference between "revert" (undo the effects of a merge in the working copy) and "resolve" (take whatever the current w/c state is for the result of the merge). That's a UI / user awareness issue.
  • Splits and joins are not modelled explicitly. For many typical cases, however, conflict resolution strategies may be provided that cover them nicely without adding complexity to the data model.

MergeDev/DataModel (last edited 2013-05-22 12:10:49 by StefanFuhrmann)