This post will bounce between math/programming and the “real world” as I attempt to tease apart how efficiency can change when a problem is decomposed.

# Decomposition

To `decompose` some problem `P`, we want to split it apart so that the `size` of each component part is less than the whole. We define (in a circular fashion) an inverse operator `compose` that given two sub-problems, combines them into a single problem. The easiest way to interpret `compose` at this point is taking it to just be like a data constructor for list or something; ie compose doesn’t do anything, just holds onto sub-problems `p0` and `p1` along with some `ϵ` that captures anything lost in the decomposition process.

``````decompose P = compose p0 p1 ϵ
size P ≥ (size p0) + (size p1)
``````

# Solving Problems

To `solve` some problem `P`, we assume we have some function `f` that solves it! And to solve the solve sub-problems, we have to combine the solutions to the sub-problems. Here I’ve used a primed `'` versions of `solve` and `compose` to work on the sub-problem/solution level (and that `solve{N}'` may be different for each sub-problem). Note that we have lost the solution to `ϵ` when combining our sub-problems.

``````solve P = f P
solve (decompose P) = compose' (solve0' p0) (solve1' p1)
``````

# Wins

We don’t only have to think of decomposing a problem instance, we can also think about decomposing the space of all problems and special-casing each. For example, if your problem is “sort a list of numbers”, you could have a case for first inspecting whether the list is already sorted in which case you’re done. The efficiency gain from this check is a win for those cases where the list is already sorted, but we incur a penalty on every list (and in the worst case, we have to check up until the last letter; the asymptotic worst case doesn’t change, but in reality, it would be a penalty).

So to be more precise, even if we gain efficiency in some sub-problems because of some information we’ve gained through the decomposition process, we must account for the efficiency of the decomposition, the solutions to each sub-problem, and then finally combining the sub-solutions. This math is a bit hand-wavy because like in the above example, the decomposition is mutually exclusive, ie we are walking down a decision tree to figure out which `solve{N}'` to use and so `compose'` is free. In other problems, `compose'` might actually combine results, like in a divide and conquer algorithm.

``````# E for efficiency
# F for frequency
efficiency solve (decompose P) = E decompose + E compose' + (E solve0' * F p0) + (E solve1' * F p1)
F p0 + F p1 = 1

# In most cases we'd expect
efficiency solve P ≤ efficiency solve0' p0
# but maybe the efficiency is greater in some sub-problems, and less in others; still a win so long as they are favorable overall
# also need to account for the frequency in which sub-problems are encountered, as that weights the overall benefit you get
``````

# Losses

We can imagine having many `decompose` steps for large problems, splitting a sub-problem `p0` into `compose p00 p01`. For the time being, we’ll assume we always split into a full binary tree, ending up with `2^k` sub-problems (1 split gives 2 sub-problems, 2 splits gives 4 sub-problems etc). This gives us `2^k - 1` internal nodes, which means we have `2^k - 1` decompose steps, and `2^k - 1` compose steps. Notice that this grows linear with the number of sub-problems. You can see that it is possible to end up with less efficient solutions if either a) compose/decompose are costly wrt the cost of of `solve{N}'` b) the efficiency of `solve{N}'` grows slower than the cost of the compose/decompose. And remember, all this has to consider the frequency in which you encounter each sub-problem to know when you actually win. Even if `solve0'` gives you a super-win, if it happens 1 in 100 and `solve1'` is slightly slower, then you’ll have lost in the average case.

# Communication

One nice part is that for `2^k` sub-problems, there are only `k` hops to get from the root to the sub-problem (ie logarithmic growth). This tells us that if `solve0'` and `solve1'` can happen independently, we can “hear back from” a sub-problem in only `k` steps. This is most relevant in something like company structures, where the latency of communication can be very high (answering emails!). A company structure decomposes the problem the company is trying to solve by introducing managers, departments, etc. For `2^k` worker drones, we have `2^k - 1` managers, and the CEO should be able to get an answer from a worker drone in a mere `2k` email-response-times (the email has to go down the tree and back up).

# Context

Often times in programming, special-cases are a big win because of assumed properties which means we can ignore the cost of `decompose` and likely even `compose'`. This can happen because of reasoning that the output of some part of the program ensures a property, or perhaps the decision tree used in `decompose` can actually be shared between many problem instances, so we only have to pay it once, but get to reap the benefit across many `solve{N}'` instances.

# Cost of Being Wrong

In many cases, the decision trees that `decompose` a problem are not perfect and so they incorrectly classify an instance of `P` to be a `p0` when it’s actually a `p1`. We can think of adding a penalty each time we use `solve0' p1` or `solve1' p0`. As with decision trees in ML theory, we can see that as you increase the depth of your tree (ie. make more decisions), you have more chances to make the wrong classification, and start accumulating more errors. On top of that, I’ll claim that the penalty for `solve010101' p001100` increases with depth also. These two penalty’s combine to really push for short trees, even if you theoretically have awesome solutions to very small subsets of your population. The optimal structure then comes at the frontier of having better sub-problem solvers and having too large of a tree to give worse results.

# Real World Impact

I think the world tends to overvalue the wins of solving sub-problems and undervalues the loss of growing a tree too big. We do know this at some level because people like “cutting out the middleman”, but it’s not as pervasive as the popularity of that phrase would suggest. Also worth keeping in mind that often times cutting out the middleman just means someone is hiding the middleman from you and other times the middleman is actually important and shouldn’t be cut out.

The types of problems I think this is most relevant to are large-scale government level issues like welfare and healthcare. I find things like UBI and single-payer as viable solutions because they are very shallow trees.