Proportionately distribute (prorate) a value across a set of values ~ Discussion of Coding

Proportionately distribute (prorate) a value across a set of values

I have a need to write code that will prorate a value across a list, based on the relative weights of "basis" values in the list. Simply dividing the "basis" values by the sum of the "basis" values and then multiplying the factor by the original value to prorate works to a certain degree:

proratedValue = (basis / basisTotal) * prorationAmount;

However, the result of this calculation must then be rounded to integer values. The effect of the rounding means that the the sum of proratedValue for all items in the list may differ from the original prorationAmount.

Can anyone explain how to apply a "lossless" proration algorithm that proportionately distributes a value across a list as accurately as possible, without suffering from rounding errors?

Answer by Amber for Proportionately distribute (prorate) a value across a set of values

Simple algorithm sketch here...

Have a running total which starts at zero.
Do your standard "divide basis by total basis, then multiply by proration amount" for the first item.
Store the original value of the running total elsewhere, then add the amount you just calculated in #2.
Round both the old value and the new value of the running total to integers (don't modify the existing values, round them into separate variables), and take the difference.
The number calculated in step 4 is the value assigned to the current basis.
Repeat steps #2-5 for each basis.

This is guaranteed to have the total amount prorated equal to the input prorate amount, because you never actually modify the running total itself (you only take rounded values of it for other calculations, you don't write them back). What would have been an issue with integer rounding before is now dealt with, since the rounding error will add up over time in the running total and eventually push a value across the rounding threshold in the other direction.

Basic example:

Input basis: [0.2, 0.3, 0.3, 0.2]  Total prorate: 47    ----    R used to indicate running total here:    R = 0    First basis:    oldR = R [0]    R += (0.2 / 1.0 * 47) [= 9.4]    results[0] = int(R) - int(oldR) [= 9]    Second basis:    oldR = R [9.4]    R += (0.3 / 1.0 * 47) [+ 14.1, = 23.5 total]    results[1] = int(R) - int(oldR) [23-9, = 14]    Third basis:    oldR = R [23.5]    R += (0.3 / 1.0 * 47) [+ 14.1, = 37.6 total]    results[1] = int(R) - int(oldR) [38-23, = 15]    Fourth basis:    oldR = R [37.6]    R += (0.2 / 1.0 * 47) [+ 9.4, = 47 total]    results[1] = int(R) - int(oldR) [47-38, = 9]    9+14+15+9 = 47

Answer by Mathias for Proportionately distribute (prorate) a value across a set of values

The problem you have is to define what an "acceptable" rounding policy is, or in other words, what it is you are trying to minimize. Consider first this situation: you have only 2 identical items in your list, and are trying to allocate 3 units. Ideally, you would want to allocate the same amount to each item (1.5), but that is clearly not going to happen. The "best" you could do is likely to allocate 1 and 2, or 2 and 1. So

there might be multiple solutions to each allocation
identical items may not receive an identical allocation

Then, I chose 1 and 2 over 0 and 3 because I assume that what you want is to minimize the difference between the perfect allocation, and the integer allocation. This might not be what you consider "a good allocation", and this is a question you need to think about: what would make an allocation better than another one?
One possible value function could be to minimize the "total error", i.e. the sum of the absolute values of the differences between your allocation and the "perfect", unconstrained allocation.
It sounds to me that something inspired by Branch and Bound could work, but it's non trivial.
Assuming that Dav solution always produces an allocation that satisfies the constraint (which I'll trust is the case), I assume that it is not guaranteed to give you the "best" solution, "best" defined by whatever distance/fit metric you end up adopting. My reason for this is that this is a greedy algorithm, which in integer programming problems can lead you to solutions which are really off the optimal solution. But if you can live with a "somewhat correct" allocation, then I say, go for it! Doing it "optimally" doesn't sound trivial.
Best of luck!

Answer by Jay Stevens for Proportionately distribute (prorate) a value across a set of values

Ok. I'm pretty certain that the original algorithm (as written) and the code posted (as written) doesn't quite answer the mail for the test case outlined by @Mathias.

My intended use of this algorithm is a slightly more specific application. Rather than calculating the % using (@amt / @SumAmt) as shown in the original question. I have a fixed $ amount that needs to be split or spread across multiple items based on a % split defined for each of those items. The split % sums to 100%, however, straight multiplication often results in decimals that (when forced to round to whole $) don't add up to the total amount that I'm splitting apart. This is the core of the problem.

I'm fairly certain that the original answer from @Dav doesn't work in cases where (as @Mathias described) the rounded values are equal across multiple slices. This problem with the original algorithm and code can be summed up with one test case:

Take $100 and split it 3 ways using 33.333333% as your percentage.

Using the code posted by @jtw (assuming this is an accurate implementation of the original algorithm), yields you the incorrect answer of allocating $33 to each item (resulting in an overall sum of $99), so it fails the test.

I think a more accurate algorithm might be:

Have a running total which starts at 0
For each item in the group:
Calculate the un-rounded allocation amount as ( [Amount to be Split] * [% to Split] )
Calculate the cumulative Remainder as [Remainder] + ( [UnRounded Amount] - [Rounded Amount] )
If Round( [Remainder], 0 ) > 1 OR the current item is the LAST ITEM in the list, then set the item's allocation = [Rounded Amount] + Round( [Remainder], 0 )
else set item's allocation = [Rounded Amount]
Repeat for next item

Implemented in T-SQL, it looks like this:

-- Start of Code --  Drop Table #SplitList  Create Table #SplitList ( idno int , pctsplit decimal(5, 4), amt int , roundedAmt int )    -- Test Case #1  --Insert Into #SplitList Values (1, 0.3333, 100, 0)  --Insert Into #SplitList Values (2, 0.3333, 100, 0)  --Insert Into #SplitList Values (3, 0.3333, 100, 0)    -- Test Case #2  --Insert Into #SplitList Values (1, 0.20, 57, 0)  --Insert Into #SplitList Values (2, 0.20, 57, 0)  --Insert Into #SplitList Values (3, 0.20, 57, 0)  --Insert Into #SplitList Values (4, 0.20, 57, 0)  --Insert Into #SplitList Values (5, 0.20, 57, 0)    -- Test Case #3  --Insert Into #SplitList Values (1, 0.43, 10, 0)  --Insert Into #SplitList Values (2, 0.22, 10, 0)  --Insert Into #SplitList Values (3, 0.11, 10, 0)  --Insert Into #SplitList Values (4, 0.24, 10, 0)    -- Test Case #4  Insert Into #SplitList Values (1, 0.50, 75, 0)  Insert Into #SplitList Values (2, 0.50, 75, 0)    Declare @R Float  Declare @Results Float  Declare @unroundedAmt Float  Declare @idno Int  Declare @roundedAmt Int  Declare @amt Float  Declare @pctsplit Float  declare @rowCnt int    Select @R = 0  select @rowCnt = 0    -- Define the cursor   Declare SplitList Cursor For   Select idno, pctsplit, amt, roundedAmt From #SplitList Order By amt Desc  -- Open the cursor  Open SplitList    -- Assign the values of the first record  Fetch Next From SplitList Into @idno, @pctsplit, @amt, @roundedAmt  -- Loop through the records  While @@FETCH_STATUS = 0    Begin      -- Get derived Amounts from cursor      select @unroundedAmt = ( @amt * @pctsplit )      select @roundedAmt = Round( @unroundedAmt, 0 )        -- Remainder      Select @R = @R + @unroundedAmt - @roundedAmt      select @rowCnt = @rowCnt + 1        -- Magic Happens!  (aka Secret Sauce)      if ( round(@R, 0 ) >= 1 ) or ( @@CURSOR_ROWS = @rowCnt ) Begin          select @Results = @roundedAmt + round( @R, 0 )          select @R = @R - round( @R, 0 )      End      else Begin          Select @Results = @roundedAmt      End        If Round(@Results, 0) <> 0      Begin          Update #SplitList Set roundedAmt = @Results Where idno = @idno      End        -- Assign the values of the next record      Fetch Next From SplitList Into @idno, @pctsplit, @amt, @roundedAmt  End    -- Close the cursor  Close SplitList  Deallocate SplitList    -- Now do the check  Select * From #SplitList  Select Sum(roundedAmt), max( amt ),   case when max(amt) <> sum(roundedamt) then 'ERROR' else 'OK' end as Test   From #SplitList    -- End of Code --

Which yields a final result set for the test case of:

idno   pctsplit   amt     roundedAmt  1      0.3333    100     33  2      0.3333    100     34  3      0.3333    100     33

As near as I can tell (and I've got several test cases in the code), this handles all of these situations pretty gracefully.

Answer by Charles for Proportionately distribute (prorate) a value across a set of values

This is an apportionment problem, for which there are many known methods. All have certain pathologies: the Alabama paradox, the population paradox, or a failure of the quota rule. (Balinski and Young proved that no method can avoid all three.) You'll probably want one that follows the quote rule and avoids the Alabama paradox; the population paradox isn't as much of a concern since there's no much difference in the number of days per month between different years.

Answer by Joseph Kingry for Proportionately distribute (prorate) a value across a set of values

TL;DR algorithm with best (+20%) possible accuracy, 70% slower.

Evaulated algorithms presented in accepted answer here as well as answer to python question of similar nature.

Distribute 1 - based on Amber's algorithm
Distribute 2 - based on John Machin's algorithm
Distribute 3 - see below
Distribute 4 - optimized version of Distribute 3 (eg. removed LINQ, used arrays)

Testing results (10,000 iterations)

Algorithm    | Avg Abs Diff (x lowest) | Time (x lowest)       ------------------------------------------------------------------  Distribute 1 | 0.5282 (1.1992)         | 00:00:00.0906921 (1.0000)  Distribute 2 | 0.4526 (1.0275)         | 00:00:00.0963136 (1.0620)  Distribute 3 | 0.4405 (1.0000)         | 00:00:01.1689239 (12.8889)  Distribute 4 | 0.4405 (1.0000)         | 00:00:00.1548484 (1.7074)

Method 3 present has 19.9% better accuracy, for 70.7% slower execution time as expected.

Distribute 3

Makes best effort to be as accurate as possible in distributing amount.

Distribute weights as normal
Increment weights with highest error until actual distributed amount equals expected amount

Sacrifices speed for accuracy by making more then one pass through the loop.

public static IEnumerable Distribute3(IEnumerable weights, int amount)  {      var totalWeight = weights.Sum();      var query = from w in weights                  let fraction = amount * (w / totalWeight)                  let integral = (int)Math.Floor(fraction)                  select Tuple.Create(integral, fraction);        var result = query.ToList();      var added = result.Sum(x => x.Item1);        while (added < amount)      {          var maxError = result.Max(x => x.Item2 - x.Item1);          var index = result.FindIndex(x => (x.Item2 - x.Item1) == maxError);          result[index] = Tuple.Create(result[index].Item1 + 1, result[index].Item2);          added += 1;      }        return result.Select(x => x.Item1);  }

Distribute 4

public static IEnumerable Distribute4(IEnumerable weights, int amount)  {      var totalWeight = weights.Sum();      var length = weights.Count();        var actual = new double[length];      var error = new double[length];      var rounded = new int[length];        var added = 0;        var i = 0;      foreach (var w in weights)      {          actual[i] = amount * (w / totalWeight);          rounded[i] = (int)Math.Floor(actual[i]);          error[i] = actual[i] - rounded[i];          added += rounded[i];          i += 1;      }        while (added < amount)      {          var maxError = 0.0;          var maxErrorIndex = -1;          for(var e = 0; e  < length; ++e)          {              if (error[e] > maxError)              {                  maxError = error[e];                  maxErrorIndex = e;              }          }            rounded[maxErrorIndex] += 1;          error[maxErrorIndex] -= 1;            added += 1;      }        return rounded;  }

Test Harness

static void Main(string[] args)  {      Random r = new Random();        Stopwatch[] time = new[] { new Stopwatch(), new Stopwatch(), new Stopwatch(), new Stopwatch() };        double[][] results = new[] { new double[Iterations], new double[Iterations], new double[Iterations], new double[Iterations] };        for (var i = 0; i < Iterations; ++i)      {          double[] weights = new double[r.Next(MinimumWeights, MaximumWeights)];          for (var w = 0; w < weights.Length; ++w)          {              weights[w] = (r.NextDouble() * (MaximumWeight - MinimumWeight)) + MinimumWeight;          }          var amount = r.Next(MinimumAmount, MaximumAmount);            var totalWeight = weights.Sum();          var expected = weights.Select(w => (w / totalWeight) * amount).ToArray();            Action runTest = (resultIndex, func) =>              {                  time[resultIndex].Start();                  var result = func(weights, amount).ToArray();                  time[resultIndex].Stop();                    var total = result.Sum();                    if (total != amount)                      throw new Exception("Invalid total");                    var diff = expected.Zip(result, (e, a) => Math.Abs(e - a)).Sum() / amount;                    results[resultIndex][i] = diff;              };            runTest(0, Distribute1);          runTest(1, Distribute2);          runTest(2, Distribute3);          runTest(3, Distribute4);      }  }

Answer by mmmmmm for Proportionately distribute (prorate) a value across a set of values

I think proportional distributions is the answer: http://www.sangakoo.com/en/unit/proportional-distributions-direct-and-inverse

Fatal error: Call to a member function getElementsByTagName() on a non-object in D:\XAMPP INSTALLASTION\xampp\htdocs\endunpratama9i\www-stackoverflow-info-proses.php on line 72

Discussion of Coding

Blog coding and discussion of coding about JavaScript, PHP, CGI, general web building etc.

Wednesday, June 22, 2016