Proportionately distribute (prorate) a value across a set of values
I need to write code that prorates a value across a list, based on the relative weights of "basis" values in the list. Simply dividing each "basis" value by the sum of the "basis" values and then multiplying that factor by the amount to prorate works to a certain degree:

    proratedValue = (basis / basisTotal) * prorationAmount;

However, the result of this calculation must then be rounded to an integer. Because of the rounding, the sum of proratedValue over all items in the list may differ from the original prorationAmount.
Can anyone explain how to apply a "lossless" proration algorithm that proportionately distributes a value across a list as accurately as possible, without suffering from rounding errors?
Answer by Amber for Proportionately distribute (prorate) a value across a set of values
Simple algorithm sketch here...
1. Have a running total which starts at zero.
2. Do your standard "divide basis by total basis, then multiply by proration amount" for the first item.
3. Store the original value of the running total elsewhere, then add the amount you just calculated in #2.
4. Round both the old value and the new value of the running total to integers (don't modify the existing values, round them into separate variables), and take the difference.
5. The number calculated in step 4 is the value assigned to the current basis.
6. Repeat steps #2-5 for each basis.
This is guaranteed to have the total amount prorated equal to the input prorate amount, because you never actually modify the running total itself (you only take rounded values of it for other calculations, you don't write them back). What would have been an issue with integer rounding before is now dealt with, since the rounding error will add up over time in the running total and eventually push a value across the rounding threshold in the other direction.
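The steps above can be sketched in Python roughly as follows. This is a minimal illustration, not the answerer's own code: the names `prorate_running_total` and `round_half_up` are made up here, exact `Fraction` arithmetic is used to keep floating-point noise out of the picture, and ties are assumed to round up (so a running total of exactly 23.5 rounds to 24).

```python
from fractions import Fraction
from math import floor


def round_half_up(x):
    """Round to the nearest integer, ties going up (an assumed policy)."""
    return floor(x + Fraction(1, 2))


def prorate_running_total(bases, amount):
    """Prorate an integer amount across bases using a running total.

    Each share is the difference between the rounded running total
    after and before adding that item's exact share.
    """
    total_basis = sum(Fraction(b) for b in bases)
    running = Fraction(0)  # exact running total; never overwritten by a rounded value
    shares = []
    for basis in bases:
        old = running
        running += Fraction(basis) * amount / total_basis
        shares.append(round_half_up(running) - round_half_up(old))
    return shares


print(prorate_running_total([2, 3, 3, 2], 47))  # [9, 15, 14, 9]
```

Because the shares telescope to `round(final total) - round(0)`, they always sum to the prorated amount when that amount is an integer. A different tie-breaking rule would move a unit between neighbouring items but would not change the sum.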
Basic example:
    Input basis: [0.2, 0.3, 0.3, 0.2]
    Total prorate: 47

    ----

    R used to indicate running total here (round() rounds ties up):

    R = 0

    First basis:
      oldR = R [0]
      R += (0.2 / 1.0 * 47) [= 9.4]
      results[0] = round(R) - round(oldR) [9-0, = 9]

    Second basis:
      oldR = R [9.4]
      R += (0.3 / 1.0 * 47) [+ 14.1, = 23.5 total]
      results[1] = round(R) - round(oldR) [24-9, = 15]

    Third basis:
      oldR = R [23.5]
      R += (0.3 / 1.0 * 47) [+ 14.1, = 37.6 total]
      results[2] = round(R) - round(oldR) [38-24, = 14]

    Fourth basis:
      oldR = R [37.6]
      R += (0.2 / 1.0 * 47) [+ 9.4, = 47 total]
      results[3] = round(R) - round(oldR) [47-38, = 9]

    9+15+14+9 = 47

Answer by Mathias for Proportionately distribute (prorate) a value across a set of values
The problem you have is to define what an "acceptable" rounding policy is, or in other words, what it is you are trying to minimize. Consider first this situation: you have only 2 identical items in your list, and are trying to allocate 3 units. Ideally, you would want to allocate the same amount to each item (1.5), but that is clearly not going to happen. The "best" you could do is likely to allocate 1 and 2, or 2 and 1. So
- there might be multiple solutions to each allocation
- identical items may not receive an identical allocation
Then, I chose 1 and 2 over 0 and 3 because I assume that what you want is to minimize the difference between the perfect allocation, and the integer allocation. This might not be what you consider "a good allocation", and this is a question you need to think about: what would make an allocation better than another one?
One possible value function could be to minimize the "total error", i.e. the sum of the absolute values of the differences between your allocation and the "perfect", unconstrained allocation.
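As a rough illustration of that "total error" metric, here is a small Python sketch. The function name `total_error` is made up for this example, and exact fractions are used so the comparison is not blurred by floating-point noise.

```python
from fractions import Fraction


def total_error(weights, amount, allocation):
    """Sum of |allocated - ideal| over all items, computed exactly."""
    total_weight = sum(Fraction(w) for w in weights)
    return sum(
        abs(a - Fraction(w) * amount / total_weight)
        for w, a in zip(weights, allocation)
    )


# Two identical items sharing 3 units: the ideal share is 1.5 each.
print(total_error([1, 1], 3, [1, 2]))  # 1
print(total_error([1, 1], 3, [0, 3]))  # 3
```

Under this metric the allocations (1, 2) and (2, 1) tie, and both beat (0, 3), which matches the intuition described above.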
It sounds to me like something inspired by Branch and Bound could work, but it's nontrivial.
Assuming that Dav's solution always produces an allocation that satisfies the constraint (which I'll trust is the case), I assume that it is not guaranteed to give you the "best" solution, with "best" defined by whatever distance/fit metric you end up adopting. My reason is that it is a greedy algorithm, and in integer programming problems greedy algorithms can lead to solutions that are far from optimal. But if you can live with a "somewhat correct" allocation, then I say, go for it! Doing it "optimally" doesn't sound trivial.
Best of luck!
Answer by Jay Stevens for Proportionately distribute (prorate) a value across a set of values
Ok. I'm pretty certain that the original algorithm (as written) and the code posted (as written) don't quite answer the mail for the test case outlined by @Mathias.
My intended use of this algorithm is a slightly more specific application. Rather than calculating the % using (@amt / @SumAmt) as shown in the original question, I have a fixed $ amount that needs to be split or spread across multiple items based on a % split defined for each of those items. The split % sums to 100%; however, straight multiplication often results in decimals that (when forced to round to whole $) don't add up to the total amount that I'm splitting apart. This is the core of the problem.
I'm fairly certain that the original answer from @Dav doesn't work in cases where (as @Mathias described) the rounded values are equal across multiple slices. This problem with the original algorithm and code can be summed up with one test case:
Take $100 and split it 3 ways using 33.333333% as your percentage.
The code posted by @jtw (assuming it is an accurate implementation of the original algorithm) yields the incorrect answer of allocating $33 to each item (an overall sum of $99), so it fails the test.
I think a more accurate algorithm might be:
- Have a running total (the cumulative Remainder) which starts at 0
- For each item in the group:
  - Calculate the un-rounded allocation amount as ( [Amount to be Split] * [% to Split] )
  - Calculate the cumulative Remainder as [Remainder] + ( [UnRounded Amount] - [Rounded Amount] )
  - If Round( [Remainder], 0 ) >= 1 OR the current item is the LAST ITEM in the list, then set the item's allocation = [Rounded Amount] + Round( [Remainder], 0 ) and reduce [Remainder] by Round( [Remainder], 0 )
  - else set item's allocation = [Rounded Amount]
- Repeat for next item
Implemented in T-SQL, it looks like this:
    -- Start of Code --
    Drop Table #SplitList
    Create Table #SplitList ( idno int , pctsplit decimal(5, 4), amt int , roundedAmt int )

    -- Test Case #1
    --Insert Into #SplitList Values (1, 0.3333, 100, 0)
    --Insert Into #SplitList Values (2, 0.3333, 100, 0)
    --Insert Into #SplitList Values (3, 0.3333, 100, 0)

    -- Test Case #2
    --Insert Into #SplitList Values (1, 0.20, 57, 0)
    --Insert Into #SplitList Values (2, 0.20, 57, 0)
    --Insert Into #SplitList Values (3, 0.20, 57, 0)
    --Insert Into #SplitList Values (4, 0.20, 57, 0)
    --Insert Into #SplitList Values (5, 0.20, 57, 0)

    -- Test Case #3
    --Insert Into #SplitList Values (1, 0.43, 10, 0)
    --Insert Into #SplitList Values (2, 0.22, 10, 0)
    --Insert Into #SplitList Values (3, 0.11, 10, 0)
    --Insert Into #SplitList Values (4, 0.24, 10, 0)

    -- Test Case #4
    Insert Into #SplitList Values (1, 0.50, 75, 0)
    Insert Into #SplitList Values (2, 0.50, 75, 0)

    Declare @R Float
    Declare @Results Float
    Declare @unroundedAmt Float
    Declare @idno Int
    Declare @roundedAmt Int
    Declare @amt Float
    Declare @pctsplit Float
    declare @rowCnt int

    Select @R = 0
    select @rowCnt = 0

    -- Define the cursor
    Declare SplitList Cursor For
        Select idno, pctsplit, amt, roundedAmt
        From #SplitList
        Order By amt Desc

    -- Open the cursor
    Open SplitList

    -- Assign the values of the first record
    Fetch Next From SplitList Into @idno, @pctsplit, @amt, @roundedAmt

    -- Loop through the records
    While @@FETCH_STATUS = 0
    Begin
        -- Get derived Amounts from cursor
        select @unroundedAmt = ( @amt * @pctsplit )
        select @roundedAmt = Round( @unroundedAmt, 0 )

        -- Remainder
        Select @R = @R + @unroundedAmt - @roundedAmt
        select @rowCnt = @rowCnt + 1

        -- Magic Happens! (aka Secret Sauce)
        if ( round(@R, 0 ) >= 1 ) or ( @@CURSOR_ROWS = @rowCnt )
        Begin
            select @Results = @roundedAmt + round( @R, 0 )
            select @R = @R - round( @R, 0 )
        End
        else
        Begin
            Select @Results = @roundedAmt
        End

        If Round(@Results, 0) <> 0
        Begin
            Update #SplitList Set roundedAmt = @Results Where idno = @idno
        End

        -- Assign the values of the next record
        Fetch Next From SplitList Into @idno, @pctsplit, @amt, @roundedAmt
    End

    -- Close the cursor
    Close SplitList
    Deallocate SplitList

    -- Now do the check
    Select * From #SplitList

    Select Sum(roundedAmt), max( amt ),
        case when max(amt) <> sum(roundedamt) then 'ERROR' else 'OK' end as Test
    From #SplitList
    -- End of Code --

Which yields a final result set for the test case of:

    idno  pctsplit  amt  roundedAmt
    1     0.3333    100  33
    2     0.3333    100  34
    3     0.3333    100  33

As near as I can tell (and I've got several test cases in the code), this handles all of these situations pretty gracefully.
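For readers who don't work in T-SQL, the same remainder-carrying idea might look like this in Python. This is only a sketch under stated assumptions: `split_with_remainder` and `round_half_away` are names invented here, percentages are passed as decimal strings so they convert to exact fractions, and ties are assumed to round away from zero, which is how I understand T-SQL's ROUND to treat a .5.

```python
from fractions import Fraction
from math import floor


def round_half_away(x):
    """Round to the nearest integer, ties away from zero (assumed policy)."""
    half = Fraction(1, 2)
    return floor(x + half) if x >= 0 else -floor(-x + half)


def split_with_remainder(amount, pct_splits):
    """Split amount by percentages, carrying the rounding remainder forward."""
    remainder = Fraction(0)
    allocations = []
    last = len(pct_splits) - 1
    for i, pct in enumerate(pct_splits):
        unrounded = amount * Fraction(pct)
        rounded = round_half_away(unrounded)
        remainder += unrounded - rounded
        carry = round_half_away(remainder)
        # Hand out the carried unit(s) once they add up to at least 1,
        # or settle whatever is left on the last item.
        if carry >= 1 or i == last:
            rounded += carry
            remainder -= carry
        allocations.append(rounded)
    return allocations


print(split_with_remainder(100, ["0.3333", "0.3333", "0.3333"]))  # [33, 34, 33]
print(split_with_remainder(75, ["0.50", "0.50"]))                 # [38, 37]
```

Note that, like the T-SQL, this sketch only corrects downward on the last item, and it relies on the percentages summing to (very nearly) 100%.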
Answer by Charles for Proportionately distribute (prorate) a value across a set of values
This is an apportionment problem, for which there are many known methods. All have certain pathologies: the Alabama paradox, the population paradox, or a failure of the quota rule. (Balinski and Young proved that no method can avoid all three.) You'll probably want one that follows the quota rule and avoids the Alabama paradox; the population paradox isn't as much of a concern since there's not much difference in the number of days per month between different years.
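To make the quota rule concrete: it requires every item's allocation to be its exact proportional share rounded either down or up, never further away. A small Python check might look like the sketch below (the name `satisfies_quota` is made up for illustration, and exact fractions are used for the shares).

```python
from fractions import Fraction
from math import ceil, floor


def satisfies_quota(weights, amount, allocation):
    """True if each allocation is the floor or ceiling of its exact share."""
    total = sum(Fraction(w) for w in weights)
    for w, a in zip(weights, allocation):
        quota = Fraction(w) * amount / total  # exact proportional share
        if a not in (floor(quota), ceil(quota)):
            return False
    return True


print(satisfies_quota([1, 1], 3, [1, 2]))  # True
print(satisfies_quota([1, 1], 3, [0, 3]))  # False
```

A check like this can be run over the output of any candidate proration routine to see whether it ever strays more than one rounding step from the ideal share.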
Answer by Joseph Kingry for Proportionately distribute (prorate) a value across a set of values
TL;DR: an algorithm with the best possible accuracy (+20%), at 70% slower.
Evaluated the algorithms presented in the accepted answer here, as well as an answer to a Python question of a similar nature.
- Distribute 1 - based on Amber's algorithm
- Distribute 2 - based on John Machin's algorithm
- Distribute 3 - see below
- Distribute 4 - optimized version of Distribute 3 (e.g. removed LINQ, used arrays)
Testing results (10,000 iterations)
Algorithm    | Avg Abs Diff (x lowest) | Time (x lowest)
------------------------------------------------------------------
Distribute 1 | 0.5282 (1.1992)         | 00:00:00.0906921 (1.0000)
Distribute 2 | 0.4526 (1.0275)         | 00:00:00.0963136 (1.0620)
Distribute 3 | 0.4405 (1.0000)         | 00:00:01.1689239 (12.8889)
Distribute 4 | 0.4405 (1.0000)         | 00:00:00.1548484 (1.7074)

Distribute 3 is 19.9% more accurate than Distribute 1; its optimized form, Distribute 4, runs 70.7% slower than the fastest method, as expected.
Distribute 3
Makes best effort to be as accurate as possible in distributing amount.
- Distribute weights as normal
- Increment weights with highest error until actual distributed amount equals expected amount
Sacrifices speed for accuracy by making more than one pass through the loop.
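To see the two steps on concrete numbers before the C# implementations, here is a short Python sketch of the same idea. The name `distribute_largest_error` is made up for this illustration, and exact fractions replace the doubles used in the C# so that ties are resolved predictably (the first item with the largest error wins).

```python
from fractions import Fraction
from math import floor


def distribute_largest_error(weights, amount):
    """Floor every exact share, then hand out leftover units by largest error."""
    total = sum(Fraction(w) for w in weights)
    exact = [Fraction(w) * amount / total for w in weights]
    shares = [floor(x) for x in exact]
    while sum(shares) < amount:
        # max() returns the first index among ties.
        i = max(range(len(shares)), key=lambda k: exact[k] - shares[k])
        shares[i] += 1
    return shares


print(distribute_largest_error([2, 3, 3, 2], 47))  # [10, 14, 14, 9]
```

Here the floors are 9, 14, 14, 9 (46 in total), and the single leftover unit goes to the first of the two items that are each 0.4 short.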
    public static IEnumerable<int> Distribute3(IEnumerable<double> weights, int amount)
    {
        var totalWeight = weights.Sum();

        // Pair each floored share with its exact (fractional) share.
        var query = from w in weights
                    let fraction = amount * (w / totalWeight)
                    let integral = (int)Math.Floor(fraction)
                    select Tuple.Create(integral, fraction);

        var result = query.ToList();
        var added = result.Sum(x => x.Item1);

        // Hand out the leftover units one at a time to the item with the largest error.
        while (added < amount)
        {
            var maxError = result.Max(x => x.Item2 - x.Item1);
            var index = result.FindIndex(x => (x.Item2 - x.Item1) == maxError);
            result[index] = Tuple.Create(result[index].Item1 + 1, result[index].Item2);
            added += 1;
        }

        return result.Select(x => x.Item1);
    }

Distribute 4
    public static IEnumerable<int> Distribute4(IEnumerable<double> weights, int amount)
    {
        var totalWeight = weights.Sum();
        var length = weights.Count();

        var actual = new double[length];
        var error = new double[length];
        var rounded = new int[length];

        var added = 0;

        // First pass: floor every share and record how far below the exact value it is.
        var i = 0;
        foreach (var w in weights)
        {
            actual[i] = amount * (w / totalWeight);
            rounded[i] = (int)Math.Floor(actual[i]);
            error[i] = actual[i] - rounded[i];
            added += rounded[i];
            i += 1;
        }

        // Hand out the leftover units one at a time to the item with the largest error.
        while (added < amount)
        {
            // Start below any possible error so an index is always selected.
            var maxError = double.NegativeInfinity;
            var maxErrorIndex = -1;
            for (var e = 0; e < length; ++e)
            {
                if (error[e] > maxError)
                {
                    maxError = error[e];
                    maxErrorIndex = e;
                }
            }

            rounded[maxErrorIndex] += 1;
            error[maxErrorIndex] -= 1;
            added += 1;
        }

        return rounded;
    }

Test Harness
    // Iterations, MinimumWeights, MaximumWeights, MinimumWeight, MaximumWeight,
    // MinimumAmount, MaximumAmount, Distribute1 and Distribute2 are not shown here.
    static void Main(string[] args)
    {
        Random r = new Random();

        Stopwatch[] time = new[]
        {
            new Stopwatch(), new Stopwatch(), new Stopwatch(), new Stopwatch()
        };

        double[][] results = new[]
        {
            new double[Iterations], new double[Iterations],
            new double[Iterations], new double[Iterations]
        };

        for (var i = 0; i < Iterations; ++i)
        {
            // Random weights and a random amount for this iteration.
            double[] weights = new double[r.Next(MinimumWeights, MaximumWeights)];
            for (var w = 0; w < weights.Length; ++w)
            {
                weights[w] = (r.NextDouble() * (MaximumWeight - MinimumWeight)) + MinimumWeight;
            }
            var amount = r.Next(MinimumAmount, MaximumAmount);

            var totalWeight = weights.Sum();
            var expected = weights.Select(w => (w / totalWeight) * amount).ToArray();

            // Times one algorithm and records its average absolute difference from the exact shares.
            Action<int, Func<IEnumerable<double>, int, IEnumerable<int>>> runTest = (resultIndex, func) =>
            {
                time[resultIndex].Start();
                var result = func(weights, amount).ToArray();
                time[resultIndex].Stop();

                var total = result.Sum();
                if (total != amount)
                    throw new Exception("Invalid total");

                var diff = expected.Zip(result, (e, a) => Math.Abs(e - a)).Sum() / amount;
                results[resultIndex][i] = diff;
            };

            runTest(0, Distribute1);
            runTest(1, Distribute2);
            runTest(2, Distribute3);
            runTest(3, Distribute4);
        }
    }

Answer by mmmmmm for Proportionately distribute (prorate) a value across a set of values
I think proportional distributions is the answer: http://www.sangakoo.com/en/unit/proportional-distributions-direct-and-inverse