Blog coding and discussion of coding about JavaScript, PHP, CGI, general web building etc.

Monday, August 1, 2016

Python - how to extract the last occurrence meeting a certain condition from a list


For example, I have the following data as a list:

l = [['A', 'aa', '1', '300'],
     ['A', 'ab', '2', '30'],
     ['A', 'ac', '3', '60'],
     ['B', 'ba', '5', '50'],
     ['B', 'bb', '4', '10'],
     ['C', 'ca', '6', '50']]

Now for 'A', 'B', and 'C', I wanted to get their last occurrences, i.e.:

[['A', 'ac', '3', '60'],
 ['B', 'bb', '4', '10'],
 ['C', 'ca', '6', '50']]

or further, the third column in these occurrences, i.e.:

['3', '4', '6']  

Currently, the way I deal with this is:

import pandas as pd

df = pd.DataFrame(l, columns=['u', 'w', 'y', 'z'])
df.set_index('u', inplace=True)
ll = []
for letter in df.index.unique():
    # Note: DataFrame.ix was deprecated in pandas 0.20 and removed in 1.0
    ll.append(df.ix[letter, 'y'][-1])

Timing this with %timeit shows:

>> The slowest run took 27.86 times longer than the fastest.
>> This could mean that an intermediate result is being cached.
>> 1000000 loops, best of 3: 887 ns per loop

Just wondering if there is a way to do this using less time than my code? Thanks!

Answer by Nils Gudat for Python - how to extract the last occurrence meeting a certain condition from a list


Even though I'm not sure I understood your question, here's what you could do:

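# First column of every row: ['A', 'A', 'A', 'B', 'B', 'C']
li = [l[i][0] for i in range(len(l))]
# rfind on the joined string gives the index of the last row for each key
# (this only works while every key is a single character)
[l[j][2] for j in [''.join(li).rfind(i) for i in set(li)]]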

Note that the output is ['3', '4', '6'] (possibly in a different order, since set iteration order is arbitrary), as the last occurrence of A is the third row, not the second.
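The rfind trick above depends on every key being one character long. As a hedged sketch of the same "find the last index" idea for arbitrary keys, the helper below searches the reversed key list instead of a joined string. The name `last_index` and the sample `rows` are illustrative assumptions, not part of the original answer.

```python
def last_index(items, target):
    """Index of the last occurrence of target in items (ValueError if absent)."""
    # index() on the reversed copy finds the last occurrence first.
    return len(items) - 1 - items[::-1].index(target)

# Hypothetical data with keys of different lengths, where joining would fail.
rows = [['AA', 'x', '1'],
        ['A', 'y', '2'],
        ['AA', 'z', '3']]

keys = [row[0] for row in rows]
# sorted() makes the output order deterministic, unlike iterating a bare set.
result = [rows[last_index(keys, k)][2] for k in sorted(set(keys))]
print(result)
```

Each lookup is still a linear scan, so this is only meant to show the generalization, not to be the fastest option.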

Edit: since you seem very concerned about performance (although you don't say what you've tried or what qualifies as "good"), here are some timings:

%timeit li = [l[i][0] for i in range(len(l))]
%timeit [l[j][2] for j in [''.join(li).rfind(i) for i in set(li)]]
>> 1000000 loops, best of 3: 1.19 µs per loop
>> 100000 loops, best of 3: 2.57 µs per loop

%timeit [list(group)[-1][2] for key, group in itertools.groupby(l, lambda x: x[0])]
>> 100000 loops, best of 3: 5.11 µs per loop

So it seems the list comprehension is marginally faster than itertools (although I'm not an expert on benchmarks and there might be a better way to run the itertools one).
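For anyone who wants to rerun the comparison outside IPython, here is a minimal sketch using the standard library `timeit` module. The wrapper names `with_rfind`, `with_groupby` and `best_time` are made up for this example, and the numbers it prints will differ by machine, so treat it as a way to compare, not as a result.

```python
import itertools
import timeit

rows = [['A', 'aa', '1', '300'], ['A', 'ab', '2', '30'], ['A', 'ac', '3', '60'],
        ['B', 'ba', '5', '50'], ['B', 'bb', '4', '10'], ['C', 'ca', '6', '50']]

def with_rfind():
    # Builds the key list inside the function so both steps are timed together.
    keys = [row[0] for row in rows]
    joined = ''.join(keys)
    return [rows[joined.rfind(k)][2] for k in set(keys)]

def with_groupby():
    return [list(group)[-1][2]
            for _, group in itertools.groupby(rows, lambda row: row[0])]

def best_time(func, number=2000, repeat=3):
    # Best of `repeat` runs, reported as seconds per call.
    return min(timeit.repeat(func, number=number, repeat=repeat)) / number

for func in (with_rfind, with_groupby):
    print(func.__name__, best_time(func))
```

Timing each approach as one function keeps the comparison like for like, since the rfind version needs the key list before it can run.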

Answer by Thom for Python - how to extract the last occurrence meeting a certain condition from a list


A not-very-pythonic approach: (note that Nils' solution is the most pythonic - using list comprehension)

def get_last_row(xs, q):
    # Scan from the end so the first match is the last occurrence of q
    for i in range(len(xs) - 1, -1, -1):
        if xs[i][0] == q:
            return xs[i][2]

def get_third_cols(xs):
    third_cols = []
    for q in ["A", "B", "C"]:
        third_cols.append(get_last_row(xs, q))
    return third_cols

print(get_third_cols(l))  # l is the list from the question

This prints ['3', '4', '6'] if that's what you meant by last occurrence.
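The version above hard-codes the keys and rescans the list once per key. A possible variation, sketched here under the assumption that the keys should be discovered from the data, walks the list backwards a single time and keeps the first row it meets for each key. The function name `last_values` and its parameters are hypothetical.

```python
def last_values(rows, key_index=0, value_index=2):
    """Collect the value from the last row of each key in one reverse pass."""
    seen = set()
    found = []
    for row in reversed(rows):
        key = row[key_index]
        if key not in seen:
            # First time we meet this key from the end: its last occurrence.
            seen.add(key)
            found.append(row[value_index])
    found.reverse()  # restore front-to-back order of the last occurrences
    return found

rows = [['A', 'aa', '1', '300'], ['A', 'ab', '2', '30'], ['A', 'ac', '3', '60'],
        ['B', 'ba', '5', '50'], ['B', 'bb', '4', '10'], ['C', 'ca', '6', '50']]
print(last_values(rows))
```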

Answer by acushner for Python - how to extract the last occurrence meeting a certain condition from a list


{l[0]: l[2] for l in vals} will get you a mapping of 'A', 'B', and 'C' to their last values
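To make that one-liner concrete, here is a small sketch run against the data from the question. The variable names `rows` and `last_by_key` are chosen for this example. Later rows overwrite earlier ones with the same key, which is why only the last value survives.

```python
rows = [['A', 'aa', '1', '300'], ['A', 'ab', '2', '30'], ['A', 'ac', '3', '60'],
        ['B', 'ba', '5', '50'], ['B', 'bb', '4', '10'], ['C', 'ca', '6', '50']]

# Each assignment to an existing key replaces the previous value.
last_by_key = {row[0]: row[2] for row in rows}
print(last_by_key)

# On Python 3.7+ dicts keep insertion order, so the values come out
# in the order each key was first seen.
print(list(last_by_key.values()))
```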

Answer by TessellatingHeckler for Python - how to extract the last occurrence meeting a certain condition from a list


l = [['A', 'aa', '1', '300'],
     ['A', 'ab', '2', '30'],
     ['A', 'ac', '3', '60'],
     ['B', 'ba', '5', '50'],
     ['B', 'bb', '4', '10'],
     ['C', 'ca', '6', '50']]

import itertools

# groupby only merges adjacent rows with the same key
for key, group in itertools.groupby(l, lambda x: x[0]):
    print(key, list(group)[-1])

I make no comment on "efficiency" because you haven't explained your conditions at all. This assumes the list is already sorted by the first element of each sublist.
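To illustrate why the sorted assumption matters, the sketch below feeds groupby some hypothetical unsorted rows and then sorts them first. `last_rows` is an illustrative helper, not part of the answer. Python's `sorted` is stable, so rows sharing a key keep their relative order and "last" still means last in the original list.

```python
import itertools
from operator import itemgetter

def last_rows(rows, key_index=0):
    """Last row per key, sorting first so groupby sees each key only once."""
    get_key = itemgetter(key_index)
    ordered = sorted(rows, key=get_key)  # stable sort keeps within-key order
    return [list(group)[-1] for _, group in itertools.groupby(ordered, key=get_key)]

unsorted_rows = [['B', 'ba', '5', '50'],
                 ['A', 'aa', '1', '300'],
                 ['B', 'bb', '4', '10'],
                 ['A', 'ac', '3', '60']]

# Without sorting, groupby starts a new group every time the key changes.
naive_keys = [key for key, _ in itertools.groupby(unsorted_rows, key=itemgetter(0))]
print(naive_keys)
print(last_rows(unsorted_rows))
```

The sort adds O(n log n) work, so it is only worth it when the input order cannot be relied on.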

If the list is sorted, one run through should be enough:

def tidy(l):
    tmp = []
    prev_row = l[0]

    for row in l:
        if row[0] != prev_row[0]:
            tmp.append(prev_row)
        prev_row = row
    tmp.append(prev_row)
    return tmp

and this is ~5x faster than itertools.groupby in a timeit test. Demonstration: https://repl.it/C5Af/0

[Edit: OP has updated their question to say they're already using Pandas to groupby, which is possibly way faster already]

Answer by michael_j_ward for Python - how to extract the last occurrence meeting a certain condition from a list


This will generalize to any key / value location. Note that the output will be in the order in which each key was first observed. It wouldn't be hard to adjust this so that the output follows the order in which each output value was observed.

import operator

l = [['A', 'aa', '1', '300'],
     ['A', 'ab', '2', '30'],
     ['A', 'ac', '3', '60'],
     ['B', 'ba', '5', '50'],
     ['B', 'bb', '4', '10'],
     ['C', 'ca', '6', '50']]

def getLast(data, key, value):
    f = operator.itemgetter(key, value)
    store = dict()
    keys = []
    for row in data:
        key, value = f(row)
        if key not in store:
            keys.append(key)
        store[key] = value
    return [store[k] for k in keys]

Now timing it,

%timeit getLast(l,0,2)  

Gives:

The slowest run took 9.44 times longer than the fastest. This could mean that an intermediate result is being cached
100000 loops, best of 3: 2.85 µs per loop

And the function outputs:

['3', '4', '6']  
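The ordering adjustment mentioned at the top of this answer could look something like the sketch below. It assumes Python 3.7+, where dicts preserve insertion order, and the name `get_last_by_last_seen` is invented for this example. Removing a key before re-inserting it moves that key to the end, so the result follows the order in which each final value was observed.

```python
def get_last_by_last_seen(data, key, value):
    """Last value per key, ordered by where each key last appeared."""
    store = {}
    for row in data:
        k = row[key]
        # Drop any earlier entry so the re-insert lands at the end.
        store.pop(k, None)
        store[k] = row[value]
    return list(store.values())

# Hypothetical interleaved data: 'A' appears again after 'B'.
interleaved = [['A', 'x', '1'],
               ['B', 'y', '2'],
               ['A', 'z', '3']]
print(get_last_by_last_seen(interleaved, 0, 2))
```

On the sorted list from the question both orderings coincide, so the difference only shows up when keys are interleaved.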

