Drawing a directed graph with ASCII-characters in Emacs (Part 2)

In part 1 I started researching on how to implement an algorithm for drawing directed graphs in Emasc.

The ranking algorithm

(defun rank ()
  (feasible-tree)
  (let (e)
    (while (setq e (leave-edge))
      (exchange e (enter-edge e))))
  (normalize)
  (balance))

The ranking algorithm stats with finding a feasible tree

feasible-tree

(defun feasible-tree ()
  (init-rank)
  (while (< (tight-tree) (absolute-value V))
    (let ((e (get-non-tree-node-incident-on-tree-with-min-slack))
          (delta (slack e)))
      (when (eq incident-node (e-head e))
        (setq delta (- delta)))
      (mapc
       (lambda (v) (setf (v-rank v) (+ (v-rank v) delta)))
       tree)))
  (init-cutvalues))

Remember v is a node, e is an edge. Rank is the the level of a node in the graph. A tight edge has the slack 0. So if the minimal length of an edge is 1, then the slack is calculated by (- (length e) minimal-length) so an edge with length 1 would have a slack of zero, thus being tight.

init-rank

The paper won't go into detail for brevity (sigh). They use a queue. You go through all nodes and place a node in the queue when they have no unscanned in-edges. So the first nodes would be the ones without any in-edges.

a -- b -- d
  \-----/

In the above graph a has no unscanned in-edges. All edges so far are unscanned. So we assign a rank that satisfies the in-edges. Since there are no edges, we use 0. Then we mark the outedges as scanned:

a0 == b -- d
 \========/

So an equal sing declares an edge as marked. a has been assigned rank 0. We check our remaining nodes for candidates for the queue. d has an unscanned in-edge (b--d). But b is good. So we assign b a rank that satisfies the in edges. The edge connects to a wich already has a rank 0. So to satisfy it b gets 1. Then we mark out-edges as scanned.

a0 == b1 == d
 \=========/

Now d has no unscanned in-edges anymore so we put it in the queue. On in edge goes to a which has rank 0 and one goes to b with rank 1. So to satisfy both we need the maximum plus 1. So d gets rank 2. The final result of init-rank is:

a0 == b2 == d3
 \=========/

tight-tree

So according to the paper tight-tree finds a maximal tree of tight edges containing some fixed node and returns the number of nodes in the tree. So for this we disregard directions and use an undirected graph. We choose a random node, e.g. a, and traverse the tree in all directions using tight edges only.

a -- b -- c -- d -- f -- g
 \        h --/         /
  \----------- l ------/
i -- j -- k -/

The paper explains that the tight-tree is a spanning-tree. So if we take the graph above and use node a as a start, the max spanning tree would look like this:

a -- b -- c -- d -- f -- g
          h --/

i -- j -- k -- l

We have 2 tight trees. One contains a and the other one contains i. Note that the original direction of the edges don't matter, so h is just another leaf node of the spanning tree. The tree with a has the maximum amount of nodes so we start with that one. Every node in the trees is connected by at least one thight edge.

Our while loop will run until the tight tree contains all nodes of our graph. So we have to figure out how to make the tree with i part of our maximum tight tree.

while body

(let ((e (get-non-tree-node-incident-on-tree-with-min-slack))
      (delta (slack e)))
  (when (eq incident-node (e-head e))
    (setq delta (- delta)))
  (mapc
   (lambda (v) (setf (v-rank v) (+ (v-rank v) delta)))
   tree))

We take a non tree edge incident on the tree with minimal slack. So we loop through the nodes of our tree and check for non tight edges. Then we have to check if the edge connects to a node that is not part of our current max tree.

a -- b -- c -- d -- f -- g
 \        h --/         /
  \----------- l ------/
i -- j -- k --/

a -- l is a non-tight edge. l also happens to be part of the other tree. The slack is 2 (because the node is 2 units longer than the minimum length 1). We take a note of that and continue our search. l -- g is also a non-tight edge and l is still not part of the maximal tight tree. The slack in this case is just 1. We choose l -- g because the slack is minimal. Our delta is 1. The incident node is the head of the edge so the delta is negated. Now it's time to loop through the nodes of the tree and add delta to the rank. The result looks like this:

a -- b -- c -- d -- f -- g
 \        h --/         /
  \---------------- l -/
     i -- j -- k --/

We add both tight trees together and now the number of nodes is the same as the number of nodes in the graph. The while loop terminates here.

init-cutvalues

This took a while until I fully understood what the paper was trying to say. Also it's gonna get complicated calculating this without a very dumb algorithm. The paper has some suggestions but they seem more complex than the other stuff.

So this computes the cut values of tree edges. We iterate over the edges and mark nodes belonging to the head or tail component. It's important that we keep in mind, which edges are tree edges. Consider a graph like this

a === b === c === d === f === g
 \          h ===/           /
  \-------------------- k ==/
   \------- i === j ===/

How do we calculate the cutvalues? We cut the graph at an edge. Let's take l--g. This results in a head and a tail component of tight trees:

Head:
a === b === c === d === f === g
            h ===/
Tail:
                        k
            i === j ===/

Now we perform

the sum of the signed weights of all edges whose head and tail are in different components, the sign being negative for those edges goin from the head to the tail component.

Alright! Let's get all of those edges that connect the trees. They are markes with a +

a === b === c === d === f === g
 \          h ===/           /
  \++++++++++++++++++++ k ++/
   \+++++++ i === j ===/

So we get a--k, a--i and k--g. Because a is in the head, the sign of a--k and a--i is negative. (-1) + (-1) + 1 gives us a cut value of -1.

a === b === c === d === f ==== g
 \          h ===/            /
  \-------------------- k =-1/
   \------- i === j ===/

k--g has been marked with cut value -1. We do the same for the other edges:

a =3= b =3= c =3= d =3= f =3== g
 \          h =1=/            /
  \-------------------- k =-1/
   \------- i =0= j =0=/

The feasible-tree algorithm ends here but the network simplex just started.

(defun rank ()
  (feasible-tree)
  (let (e)
    (while (setq e (leave-edge))
      (exchange e (enter-edge e))))
  (normalize)
  (balance))

It will replace the edges with negative cut values with non-tree edges to get only non negative cut values. The leave-edge described in the paper would be k--g because of the negative cut value. The enter-edge is one of the non-tree edges going from head to tail component with minimal slack.

Head:
a === b === c === d === f === g
            h ===/
Tail:
                        k
            i === j ===/

Non-Tree edges from head to tail:
a
 \--------------------- k
  \-------- i === j ===/

So a--i and a--k are candidates for enter-edges. a--i has the smaller slack. So we exchange a--i and k--g. This will adjust the ranks of our tail tree:

a === b === c === d === f === g
 \          h ===/           /
  \-------------- k --------/
   \= i === j ===/

The new cut values are:

a =2= b =2= c =2= d =2= f =2= g
 \          h =1=/           /
  \-------------- k --------/
   \1 i =1= j =1=/

Because we only have positive cut values the all that is left is normalize and balancing.

Optimization by using only local edges.

The following part was a little bit confusing because it only showed the term for one example. It did not give any rules, how the term was constructed. The point the paper is trying to make is, that if you have a node, where all cut values of tree edges are known, except for 1, you can calculate the cutvalue with only looking at the edges connecting with that node. The rules are the following:

start with the weight of the edge with unknown cut value (should be 1 in our case).
all non-tree egdes going out increment it by their weight (a.k.a. 1).
all non-tree edges going in decrement it by their weight.
for tree edges add their cut value but subtract their weight.

This works only when doing it incrementally. We take the graph from above:

a === b == c == d == h
 \--- f == g -------/
  \== e ==/

If we start with h, we could calculate the trivial solution and get 2. Next we calculate another leaf f and get 0.

a === b === c == d =2= h
 \--- f =0= g --------/
  \== e ===/

Now our incremental algorithm takes over. We need a node with only 1 unkown tree edge. d and g are good candidates. Let's take d and calculate the unknown c--d cut value. It has one tree edge with cut value 2 and weight 1. Also we need the weight of c--d which is 1.

cut(c,d) = w(c,d) + cut(d,h) - w(d,h)
         = 1      + 2        - 1
         = 2

Alright. Result is now this:

a === b === c =2= d =2= h
 \--- f =0= g ---------/
  \== e ===/

Next we do the same with c and then b getting to this:

a =2= b =2= c =2= d =2= h
 \--- f =0= g ---------/
  \== e ===/

Not time for g. The unknown edge is e--g with weight 1. We have 1 outgoing non-tree edge and another tree edge.

cut(e,g) = w(e,g) + w(g,h) + cut(f,g) - w(f,g)
         = 1      + 1      + 0        - 1
         = 1

Notice how outgoing non-tree edges are positive. Weights of tree egdes are always negative while their cut value is positive.

a =2= b =2= c =2= d =2= h
 \--- f =0= g ---------/
  \== e =1=/

Last but not least we look at e and the unknown edge a--e.

cut(a,e) = w(a,e) + cut(e,g) - w(e,g)
         = 1      + 1        - 1
         = 1

Final initial cut values are:

a ==2= b =2= c =2= d =2= h
 \---- f =0= g ---------/
  \=1= e =1=/

I hope this is correct. At least it's consistent with the paper. Not sure if it makes any sense, but I feel like I understood it finally. Sadly we are not done with optimizing.

Optimization by labeling with postorder traversal

It provides an inexpensive way to test, which nodes are in the head or in the tail. It's important for knowing if a nontree edge crosses between the two components. We do a postorder traversal starting from a fixed root node. Each node gets labeld with low(v) and lim(v). We start with both being 1. In a tree we always take the left branch first, labeling the node we pass with the current low and lim values. The low will always stay set on the node. Once we cannot go deeper, we have to go back inceasing current low and lim by one. The next node we encounter get's set it's lim but not low because it's already set. We continue doing that:

               (1,1)
             _/     \_
           _/         \_
         _/             \_
        /                 \
     ( , )            ____( , )____
    /     \         _/      |      \_
   /       \       /        |        \
( , )     ( , ) ( , )     ( , )     ( , )
                  |
                  |
                ( , )

               (1,1)
             _/     \_
           _/         \_
         _/             \_
        /                 \
     (1,1)            ____( , )____
    /     \         _/      |      \_
   /       \       /        |        \
(1,1)     ( , ) ( , )     ( , )     ( , )
                  |
                  |
                ( , )

Now we reached a dead end, so we increase low and lim.

               (1,1)
             _/     \_
           _/         \_
         _/             \_
        /                 \
     (1,2)            ____( , )____
    /     \         _/      |      \_
   /       \       /        |        \
(1,1)     ( , ) ( , )     ( , )     ( , )
                  |
                  |
                ( , )

               (1,1)
             _/     \_
           _/         \_
         _/             \_
        /                 \
     (1,2)            ____( , )____
    /     \         _/      |      \_
   /       \       /        |        \
(1,1)     (2,2) ( , )     ( , )     ( , )
                  |
                  |
                ( , )

Note that low never changed because it keeps the lowest value. Another dead end so again increase low and lim.

               (1,4)
             _/     \_
           _/         \_
         _/             \_
        /                 \
     (1,3)            ____( , )____
    /     \         _/      |      \_
   /       \       /        |        \
(1,1)     (2,2) ( , )     ( , )     ( , )
                  |
                  |
                ( , )

We had to go up twice so lim(root) is now 4.

               (1,4)
             _/     \_
           _/         \_
         _/             \_
        /                 \
     (1,3)            ____(4,4)____
    /     \         _/      |      \_
   /       \       /        |        \
(1,1)     (2,2) (4,4)     ( , )     ( , )
                  |
                  |
                (4,4)

               (1,4)
             _/     \_
           _/         \_
         _/             \_
        /                 \
     (1,3)            ____(4,6)____
    /     \         _/      |      \_
   /       \       /        |        \
(1,1)     (2,2) (4,5)     ( , )     ( , )
                  |
                  |
                (4,4)

               (1,4)
             _/     \_
           _/         \_
         _/             \_
        /                 \
     (1,3)            ____(4,6)____
    /     \         _/      |      \_
   /       \       /        |        \
(1,1)     (2,2) (4,5)     (6,6)     ( , )
                  |
                  |
                (4,4)

               (1,4)
             _/     \_
           _/         \_
         _/             \_
        /                 \
     (1,3)            ____(4,7)____
    /     \         _/      |      \_
   /       \       /        |        \
(1,1)     (2,2) (4,5)     (6,6)     (7,7)
                  |
                  |
                (4,4)

               (1,9)
             _/     \_
           _/         \_
         _/             \_
        /                 \
     (1,3)            ____(4,8)____
    /     \         _/      |      \_
   /       \       /        |        \
(1,1)     (2,2) (4,5)     (6,6)     (7,7)
                  |
                  |
                (4,4)

So how do we determine head or tail components. Let's take the edge with the node (1,3) to (1,1). Let's call them u and v. Any node is in the tail component if and only if

low(u) less or equal lim(w) less or equal lim(u)

We use that knowledge to update the tree efficiently during the network simplex iterations.

a === b === c === d ==== h
 \--------- f === g ===/
  \-------- e ===/

If we decide that our entering edge is a--e, then we have to walk the parent edges until we find the least common ancestor l. All edges we traversed have to be updated. The least common ancestor l is the node that satisfies:

low(l) less or equal lim(a) and lim(e) less equal lim(l).

In our case this will h. So we have to update all edges but f--g.

The paper suggests that when choosing an edge with negative-cut value, we shouldn't always take the first but search cyclically through. Not sure what they exactly mean, but we could try just choosing one at random.

Puh! I have a feeling without a nice graph framework, it's gonna be a pain implementing this. We need stuff like a data structure, postorder traversal and depth-first search. Those are still quite simple but since I'm starting from scratch it's still a lot of effort. In Part 3 we continue with some basic graph implementations.

Storax