Wikipedia:Reference desk/Archives/Mathematics/2014 January 13

= January 13 =

Optimal Order of List
I am not a mathematician. But here's a real-world applied maths problem which I have.

I have a list of ten properties, and two pieces of information about each: firstly its value, secondly the taxable gain which would arise if it were sold.

My customer has a specified tax loss available to set against any taxable gain.

What my customer wants to do is to sell the maximum value of properties (i.e. realising the largest possible amount of cash) without the taxable gain exceeding the tax loss which is available to set against it.

I have found quite a good answer, by sorting my list in order of gain from lowest to highest, which shows that he can sell seven of the properties. But that might not be the optimal answer (to see that this is the case, you just need to imagine a hypothetical property with a gain which would be eighth in my sequence but a multi-million dollar value).

So my question is: is there a way to ensure that I get an optimal answer (or just to prove the answer I have is it)? Obviously, manually testing every possible order in which I could list the properties isn't tenable. AndyJones (talk) 17:07, 13 January 2014 (UTC)


 * This is an example of the 0-1 knapsack problem. It's NP-complete, meaning finding the optimal solution is in general quite hard.  The specifics of property sales might pose some restraints that make it easier to solve in this case, however.--71.175.63.37 (talk) 18:15, 13 January 2014 (UTC)


 * In case the above doesn't make sense to the OP: unless there's more to it that you're not telling us, there probably isn't any better way to do it than manually testing every possible order. This wouldn't be too hard to do with a computer: there are only 10! = 3628800 possible orderings of 10 items, which isn't all that many. If you increase your number of properties beyond 10 this will get intractable pretty quick even for a computer, but if it stays around 10 you should be OK. Staecker (talk) 22:25, 13 January 2014 (UTC)


 * Actually, I think they can do a lot better than that. The purpose of the list was just to determine which of the 10 properties to sell and which to keep.  If we bypass the list, and only consider the binary case of each property being sold or kept, we only get 210 possible answers, or 1024.  A computer could spit that out in a fraction of a second. (And we could do even better than that, since there's no need to continue down any chain which has already exceeded the allowed value.)


 * AndyJones, if you want to list the data here (maybe substituting "Property 1" - "Property 10" for the actual names), I would be glad to use my PC to find the answer for you. StuRat (talk) 23:25, 13 January 2014 (UTC)

Brilliant. Let's try that. Here's the list:

Values($)   Gains($) Property 1 500,000.00   249,922.12 Property 2 300,000.00   162,868.62 Property 3 375,000.00   202,579.40 Property 4 325,000.00   172,500.00 Property 5 100,000.00    39,000.00 Property 6 100,000.00    39,000.00 Property 7 125,000.00    78,529.95 Property 8 175,000.00    45,000.00 Property 9 240,000.00    37,500.00 Property 10 125,000.00    3,000.00 Thanks for everyone's help with this! AndyJones (talk) 10:49, 14 January 2014 (UTC)
 * Also, the loss figure is -$572,439.39. AndyJones (talk) 10:56, 14 January 2014 (UTC)


 * To maximise value, I think the optimal set of disposals is (1, 4, 5, 8, 9, 10) with a total gain of $546,922.12 and a total value of $1,465,000. For comparison, the greedy algorithm (take largest possible values first) would give (1, 3, 5, 7) with a total gain of $570,031.47 and a total value of $1,100,000. Or if you take the properties with the lowest possible values first, you get (2, 5, 6, 7, 8, 9, 10) with a total gain of $404,898.57 and a total value of $1,165,000.Gandalf61 (talk) 11:53, 14 January 2014 (UTC)
 * (Since Properties 5 and 6 have identical values and gains, you can substitute 6 for 5 in the above to get an alternative, but essentially identical, solution) Gandalf61 (talk) 12:03, 14 January 2014 (UTC)


 * Did you use a computer program to get these results ? At any rate, I wrote one, and got the same results as your first (optimal) solution.  That is, two equally good solutions, with properties (1, 4, 5, 8, 9, 10) and (1, 4, 6, 8, 9, 10).  StuRat (talk) 18:45, 14 January 2014 (UTC)


 * Yes, I used Python to do a brute force search across all 210 subsets of the set of 10 properties. You can reduce the size of the search space by a small amount if you notice that a set that meets the total gain constraint has to contain between 3 and 7 properties. Brute force search is practical up to about 25 to 30 properties, but after that the search space becomes too large to search exhuastively. There are more efficient search approaches such as hill climbing or simulated annealing which will prdouce a pretty good solution, but will not always find the global optimum. Gandalf61 (talk) 10:47, 15 January 2014 (UTC)


 * I think you could go a bit further than that and still guarantee the optimal solution, just by pruning any a path which has already exceeding the limit (or better yet, which would exceed the limit if any of the remaining properties were added). So, if we sorted the list by gains, in ascending order, we'd have:

Values($)   Gains($) Property 10 125,000.00    3,000.00 Property 9 240,000.00    37,500.00 Property 5 100,000.00    39,000.00 Property 6 100,000.00    39,000.00 Property 8 175,000.00    45,000.00 Property 7 125,000.00    78,529.95 Property 2 300,000.00   162,868.62 Property 4 325,000.00   172,500.00 Property 3 375,000.00   202,579.40 Property 1 500,000.00   249,922.12


 * Then, if we are considering the path with properties 10, 9, 5, 6, 8, 7, and 2 already selected, we can see that adding anything further will exceed the limit, so the 7 possibilities with each combo of the remaining 3 properties added need not be considered. Likewise, if we consider the case where properties 10, 9, 5, 6, 8, 7, and 4 are already selected, we can eliminate the remaining 3 combos we'd get by adding the final two properties.  This pruning method would be even more effective with a lower relative limit. StuRat (talk) 20:26, 15 January 2014 (UTC)


 * Thank you. Did I mention my standard 5% commission ? ;) Gandalf61 (talk) 10:47, 15 January 2014 (UTC)


 * You're quite welcome. I'll split my commission with Gandalf. :-)


 * BTW, if you had a solution which gave you an extra million dollars in sales, yet only exceeded the loss figure by 1 cent, would you want to suggest that to your client ? If so, we can re-run the numbers, looking for a near-miss like that.  StuRat (talk) 20:26, 15 January 2014 (UTC)


 * It occurred to me that you want a computer program to perform the searching for you and this problem is simple enough that it can easily be programmed in C language which is probably the fastest language to run the search. And it may take up so little room that the entire code could be running on the on chip cache of the CPU. 202.177.218.59 (talk) 01:52, 16 January 2014 (UTC)


 * How funny. About a year ago, a friend of mine -- an estate lawyer -- came to me with exactly the same problem; a list of 40 or so properties, and the desire to find the subset which could be sold to maximize income and minimize tax. I did it by simulation - random subsets keeping track of the best solutions. It was not a theoretically perfect solution, but it was sufficient (especially after a few million iterations.) --jpgordon:==( o ) 16:56, 16 January 2014 (UTC)


 * So there were too many to do a brute force approach to find the optimal solution ? I bet with my pruning method described above you could handle 40.  StuRat (talk) 17:02, 16 January 2014 (UTC)
 * I didn't want to have to think that hard, and there were other variables - for example, some of the properties were jointly owned, some were individually owned, and there were actually two columns to be maximized against two different tax liabilities. Besides, I was in the mood to do a little simulation. --jpgordon:==( o ) 18:45, 16 January 2014 (UTC)


 * SAT solvers and SMT solvers can be effective with larger problems of this sort, without having to think too hard. They use exponential-time algorithms combined with heuristics that find quicker solutions in specific instances, which for many practical problems turn out to be a lot of such instances.  Setting them up and using them is rather technical though.  For the immediate problem with 10 properties (or even a modestly higher number like 20), brute force search is just fine. 50.0.121.102 (talk) 22:46, 17 January 2014 (UTC)