Ideas/DerivingCommuteRules
Note: This page documents a possible idea for developing new patch types and tweaking existing ones. It doesn’t currently form the basis for any implementation.
As we add new patch types, the number of commute rules we will have to implement could grow quadratically, although in some cases we might just say that patches never commute - e.g. syntax tree edits and hunk changes. We need a way of figuring out these rules. This proposal has one idea for doing that.
Let’s start with an example of hunk patches within a single file. These are currently represented as a line number plus a sequence of existing lines to remove and a sequence of lines to put in their place. For simplicity lets ignore the surrounding file system and just think about the file itself.
We can model the hunk patch as a Haskell type (playing fast and loose with the precise syntax):
Hunk (n :: Int) (old :: [String]) (new :: [String])
In fact, we don’t really care whether the contents are Strings or not. Let’s abstract that to the type variable line
:
Hunk (n :: Int) (old :: [line]) (new :: [line])
The underlying assumption here is that a file is modelled as a list of lines; we can index into the file at position n
, chop out old
and insert new
.
Let’s call the initial contents of the file oldcontents
:
oldcontents :: [line]
If the hunk patch applies to the file, then we actually know that oldcontents must contain old
at position n
:
oldcontents = prefix ++ old ++ suffix
length prefix = n
From this we can get the new contents:
newcontents = prefix ++ new ++ suffix
How does this help us? Let’s now consider two patches in sequence:
patch1 = Hunk n1 old1 new1
patch2 = Hunk n2 old2 new2
The overall goal is to figure out how we can commute the sequence patch1; patch2
into an alternate sequence patch2'; patch1'
The file contents then goes through three states - initial
, intermediate
(after patch1
) and final
(after patch2
):
initial = prefix1 ++ old1 ++ suffix1
length prefix1 = n1
intermediate = prefix1 ++ new1 ++ suffix1
intermediate = prefix2 ++ old2 ++ suffix2
length prefix2 = n2
final = prefix2 ++ new2 ++ suffix2
We have two equations for intermediate
. Consider how the two possible breakdowns might line up, and in particular where old2
is found within prefix1
, new1
and suffix1
. I’ll break it down into three cases:
old2
is entirely contained withinprefix1
, i.e.length prefix2 + length old2 <= length prefix1
old2
is entirely contained withinsuffix1
, i.e.length prefix2 >= length prefix1 + length new1
- otherwise
old2
overlaps withnew1
Appealing to intuition, we declare that case 3 implies that patch1
and patch2
don’t commute.
I’ll consider only case 1 for now.
Given condition 2, we can say that
prefix1 = prefix2 ++ old2 ++ gap
and so our set of equations becomes
initial = prefix2 ++ old2 ++ gap ++ old1 ++ suffix1
length prefix2 + length old2 + length gap = n1
intermediate = prefix2 ++ old2 ++ gap ++ new1 ++ suffix1
intermediate = prefix2 ++ old2 ++ suffix2
length prefix2 = n2
final = prefix2 ++ new2 ++ suffix2
from this we can derive
suffix2 = gap ++ new1 ++ suffix1
and so our set of equations becomes
initial = prefix2 ++ old2 ++ gap ++ old1 ++ suffix1
length prefix2 + length old2 + length gap = n1
intermediate = prefix2 ++ old2 ++ gap ++ new1 ++ suffix1
length prefix2 = n2
final = prefix2 ++ new2 ++ gap ++ new1 ++ suffix1
Again appealing to intuition, after a commute, the intermediate state between patch2'
and patch1'
should be
intermediate' = prefix2 ++ new2 ++ gap ++ old1 ++ suffix1
from this, we can derive
patch2' = Hunk (length prefix2) old2 new2
patch1' = Hunk (length prefix2 + length new2 + length gap) old1 new1
Using what we know about the lengths, we get
patch2' = Hunk n2 old2 new2
patch1' = Hunk (n1 + length new2 - length old2) old1 new1
and this is subject to the condition that
n2 + length old2 <= n1
Using a similar argument for case 2 gives us that
patch2' = Hunk (n2 - length new1 + length old1) old2 new2
patch1' = Hunk n1 old1 new1
n2 >= n1 + length new1
(TODO: check this, I didn’t actually go through the derivation :-)
We have so far got the following condition for commuting hunks:
Hunk n1 old1 new1; Hunk n2 old2 new2
commutes to
Hunk n2 old2 new2 ; Hunk (n1 + length new2 - length old2) old1 new1
*if* n2 + length old2 <= n1
and to
Hunk (n2 - length new1 + length old1) old2 new2 ; Hunk n1 old1 new1
*if* n2 >= n1 + length new1
What if both conditions are true? This can only happen if
length old2 = 0
length new1 = 0
n1 = n2 (let's call this n)
If that is the case, the two different commutes are
Hunk n old1 []; Hunk n [] new2
commutes to
Hunk n [] new2 ; Hunk (n + length new2) old1 []
and to
Hunk (n + length old1) [] new2 ; Hunk n old1 []
These are the same if
length old1 = 0
length new2 = 0
in which case we have the degenerate case
Hunk n [] [] ; Hunk n [] []
commutes to
Hunk n [] [] ; Hunk n [] []
Let’s rule this out by saying that for any hunk patch, either the old content or the new content must be non-empty. We could allow these patches, but it makes reasoning more complicated and they have no real value.
So we need to avoid both results happening. Appealing to symmetry, the best thing is to rule out this case, i.e. commuting fails if
length old2 = 0 && length new1 = 0 && n1 = n2