Wednesday, January 6, 2016

Better way for concatenating two sorted list of integers

Better way for concatenating two sorted list of integers


Lets assume I have one list and another tuple both of them are already sorted:

A = [10, 20, 30, 40]  B = (20, 60, 81, 90)  

What I would need is to add all the elements from B into A in such a way that A remains sorted.

Solution I could come with was:

for item in B:      for i in range(0, len(A)):          if item > A[i]:              i += 1          else:               A.insert(i, item)  

assuming A of size m, and B of size n; this solution would take O(mxn) in worst case, how can I make it perform better ?

Answer by vaultah for Better way for concatenating two sorted list of integers


bisect module "provides support for maintaining a list in sorted order without having to sort the list after each insertion":

import bisect  for b in B:      bisect.insort(A, b)  

This solution does not create a new list.


Please note that bisect.insort(A, b) is equivalent to

A.insert(bisect.bisect_right(A, b), b)  

Even though the search is fast (O(log n)), the insertion is slow (O(n)).

Answer by Arman for Better way for concatenating two sorted list of integers


edited

l1 = [10,20,30,40]  l2 = (10,20,30,40)  l2 = list(l2)  l1 = sorted(l1+l2)  

Answer by Jayanth Koushik for Better way for concatenating two sorted list of integers


You need to perform a merge. But the "traditional" merge generates a new list; so you need some modifications in order to expand one list.

ia = ib = 0  while ib < len(B):      if ia < len(A) and A[ia] < B[ib]:          if ia < len(A):              ia += 1      else:          A.insert(ia + 1, B[ib])          ib += 1  

Answer by Padraic Cunningham for Better way for concatenating two sorted list of integers


A simple way would be heapq.merge:

A = [10, 20, 30, 40]    B = (20, 60, 81, 90)    from heapq import merge    for ele in merge(A,B):      print(ele)  

Output:

10  20  20  30  40  60  81  90  

Some timings using the other O(n) solution:

In [53]: A = list(range(10000))    In [54]: B = list(range(1,20000,10))    In [55]: timeit list(merge(A,B))  100 loops, best of 3: 2.52 ms per loop    In [56]: %%timeit  C = []  i = j = 0  while i < len(A) and j < len(B):      if A[i] < B[j]:          C.append(A[i])          i += 1      else:          C.append(B[j])          j += 1  C += A[i:] + B[j:]     ....:   100 loops, best of 3: 4.29 ms per loop  In [58]: m =list(merge(A,B))  In [59]: m == C  Out[59]: True  

If you wanted to roll your own this is a bit faster than merge:

def merger_try(a, b):      if not a or not b:          yield chain(a, b)      iter_a, iter_b = iter(a), iter(b)      prev_a, prev_b = next(iter_a), next(iter_b)      while True:          if prev_a >= prev_b:              yield prev_b              try:                  prev_b = next(iter_b)              except StopIteration:                  yield prev_a                  break          else:              yield prev_a              try:                  prev_a = next(iter_a)              except StopIteration:                  yield prev_b                  break      for ele in chain(iter_b, iter_a):          yield ele  

Some timings:

In [128]: timeit list(merge(A,B))  1 loops, best of 3: 771 ms per loop    In [129]: timeit list(merger_try(A,B))  1 loops, best of 3: 581 ms per loop    In [130]: list(merger_try(A,B))  == list(merge(A,B))  Out[130]: True    In [131]: %%timeit                                   C = []  i = j = 0  while i < len(A) and j < len(B):      if A[i] < B[j]:          C.append(A[i])          i += 1      else:          C.append(B[j])          j += 1  C += A[i:] + B[j:]     .....:   1 loops, best of 3: 919 ms per loop  

Answer by eskaev for Better way for concatenating two sorted list of integers


Here is a solution in O(n):

A = [10, 20, 30, 40]  B = [20, 60, 81, 90]  C = []  i = j = 0  while i < len(A) and j < len(B):      if A[i] < B[j]:          C.append(A[i])          i += 1      else:          C.append(B[j])          j += 1  C += A[i:] + B[j:]  

Answer by Jayanth Koushik for Better way for concatenating two sorted list of integers


Lots of good discussion in this post! Arguing about timing is hard, so I wrote some timing script. It's quite rudimentary but I think it will do for now. I've attached the results too.

import timeit  import math  import matplotlib.pyplot as plt  from collections import defaultdict      setup = """  import bisect  import heapq  from random import randint      A = sorted((randint(1, 10000) for _ in range({})))  B = sorted((randint(1, 10000) for _ in range({})))      def bisect_sol(A, B):      for b in B:          bisect.insort(A, b)      def merge_sol(A, B):      ia = ib = 0      while ib < len(B):          if ia < len(A) and A[ia] < B[ib]:              if ia < len(A):                  ia += 1          else:              A.insert(ia + 1, B[ib])              ib += 1      def heap_sol(A, B):      return heapq.merge(A, B)      def sorted_sol(A, B):      return sorted(A + B)  """      sols = ["bisect", "merge", "heap", "sorted"]  times = defaultdict(list)  iters = [100, 1000, 2000, 5000, 10000, 20000, 50000, 100000]    for n in iters:      for sol in sols:          t = min(timeit.repeat(stmt="{}_sol(A, B)".format(sol), setup=setup.format(n, n), number=1, repeat=5))          print("({}, {}) done".format(n, sol))          times[sol].append(math.log(t))    for sol in sols:      plt.plot(iters, times[sol])  plt.xlabel("iterations")  plt.ylabel("log time")  plt.legend(sols)  plt.show()  

This is the result:

Iterations vs. Time

It's clear that modifying the list is the major bottleneck, so creating a new list is the way to go.


Fatal error: Call to a member function getElementsByTagName() on a non-object in D:\XAMPP INSTALLASTION\xampp\htdocs\endunpratama9i\www-stackoverflow-info-proses.php on line 72

No comments:

Post a Comment