Blog coding and discussion of coding about JavaScript, PHP, CGI, general web building etc.

Saturday, July 23, 2016

Ruby: How to find item in array which has the most occurrences?

Ruby: How to find item in array which has the most occurrences?


[1, 1, 1, 2, 3].mode  => 1    ['cat', 'dog', 'snake', 'dog'].mode  => dog  

Answer by Ben Alpert for Ruby: How to find item in array which has the most occurrences?


First build a hash mapping each value in the array to its frequency?

arr = [1, 1, 1, 2, 3]    freq = arr.inject(Hash.new(0)) { |h,v| h[v] += 1; h }  #=> {1=>3, 2=>1, 3=>1}  

? then use the frequency table to find the element with the highest frequency:

arr.max_by { |v| freq[v] }  #=> 1  

Answer by Derek P. for Ruby: How to find item in array which has the most occurrences?


idx = {}  [2,2,1,3,1].each { |i| idx.include?(i) ? idx[i] += 1 : idx[i] = 1}  

This is just a simple indexer. You could replace the [2,2,1..] array with any sort of symbol/string based identifier, this wouldn't work with objects, you'd need to introduce a bit more complexity, but this is simple enough.

rereading your questions, this solution is a bit over-engineered since its going to return you an index of all occurrences, not just the one with the most.

Answer by Mike Woodhouse for Ruby: How to find item in array which has the most occurrences?


While I adore the grep solution for its elegance and for reminding (or teaching) me about a method in Enumerable that I'd forgotten (or overlooked completely), it's slow, slow, slow. I agree 100% that creating the Array#mode method is a good idea, however - this is Ruby, we don't need a library of functions that act on arrays, we can create a mixin that adds the necessary functions into the Array class itself.

But the inject(Hash) alternative uses a sort, which we also don't really need: we just want the value with the highest occurrence.

Neither of the solutions address the possibility that more than one value may be the mode. Maybe that's not an issue in the problem as stated (can't tell). I think I'd want to know if there was a tie, though, and anyway, I think we can improve a little on the performance.

require 'benchmark'    class Array    def mode1      sort_by {|i| grep(i).length }.last    end    def mode2      freq = inject(Hash.new(0)) { |h,v| h[v] += 1; h }      sort_by { |v| freq[v] }.last        end    def mode3      freq = inject(Hash.new(0)) { |h,v| h[v] += 1; h }      max = freq.values.max                   # we're only interested in the key(s) with the highest frequency      freq.select { |k, f| f == max }         # extract the keys that have the max frequency    end  end    arr = Array.new(1_000) { |i| rand(100) }    # something to test with    Benchmark.bm(30) do |r|    res = {}    (1..3).each do |i|      m = "mode#{i}"      r.report(m) do        100.times do          res[m] = arr.send(m).inspect        end      end    end    res.each { |k, v| puts "%10s = %s" % [k, v] }  end  

And here's output from a sample run.

                                user     system      total        real  mode1                          34.375000   0.000000  34.375000 ( 34.393000)  mode2                           0.359000   0.000000   0.359000 (  0.359000)  mode3                           0.219000   0.000000   0.219000 (  0.219000)       mode1 = 41       mode2 = 41       mode3 = [[41, 17], [80, 17], [72, 17]]  

The "optimised" mode3 took 60% of the time of the previous record-holder. Note also the multiple highest-frequency entries.

EDIT

A few months down the line, I noticed Nilesh's answer, which offered this:

def mode4    group_by{|i| i}.max{|x,y| x[1].length <=> y[1].length}[0]  end  

It doesn't work with 1.8.6 out of the box, because that version doesn't have Array#group_by. ActiveSupport has it, for the Rails developers, although it seems about 2-3% slower than mode3 above. Using the (excellent) backports gem, though, produces a 10-12% gain, as well as delivering a whole pile of 1.8.7 and 1.9 extras.

The above applies to 1.8.6 only - and mainly only if installed on Windows. Since I have it installed, here's what you get from IronRuby 1.0 (on .NET 4.0):

==========================   IronRuby   =====================================  (iterations bumped to **1000**)    user     system      total        real  mode1 (I didn't bother :-))  mode2                           4.265625   0.046875   4.312500 (  4.203151)  mode3                           0.828125   0.000000   0.828125 (  0.781255)  mode4                           1.203125   0.000000   1.203125 (  1.062507)  

So in the event that performance is super-critical, benchmark the options on your Ruby version & OS. YMMV.

Answer by Nilesh C for Ruby: How to find item in array which has the most occurrences?


Mike: I found a faster method. Try this:

  class Array      def mode4        group_by{|i| i}.max{|x,y| x[1].length <=> y[1].length}[0]      end    end  

The Benchmark output:

                                    user     system      total        real  mode1                          24.340000   0.070000  24.410000 ( 24.526991)  mode2                           0.200000   0.000000   0.200000 (  0.195348)  mode3                           0.120000   0.000000   0.120000 (  0.118200)  mode4                           0.050000   0.010000   0.060000 (  0.056315)       mode1 = 76       mode2 = 76       mode3 = [[76, 18]]       mode4 = 76  

Answer by Brandon for Ruby: How to find item in array which has the most occurrences?


This is a duplicate of this question: Ruby - Unique elements in Array

Here is that question's solution:

group_by { |n| n }.values.max_by(&:size).first  

That version seems to be even faster than Nilesh C's answer. Here is the code I used to benchmark it (OS X 10.6 Core 2 2.4GHz MB).

Kudos to Mike Woodhouse for the (original) benchmarking code:

class Array     def mode1       group_by { |n| n }.values.max_by(&:size).first     end     def mode2       freq = inject(Hash.new(0)) { |h,v| h[v] += 1; h }       max = freq.values.max                   # we're only interested in the key(s) with the highest frequency       freq.select { |k, f| f == max }         # extract the keys that have the max frequency     end  end    arr = Array.new(1_0000) { |i| rand(100000) }    # something to test with    Benchmark.bm(30) do |r|      (1..2).each do |i| r.report("mode#{i}") { 100.times do arr.send("mode#{i}").inspect; end }; end  end  

And here are the results of the benchmark:

                                user     system      total        real  mode1                           1.830000   0.010000   1.840000 (  1.876642)  mode2                           2.280000   0.010000   2.290000 (  2.382117)   mode1 = 70099   mode2 = [[70099, 3], [70102, 3], [51694, 3], [49685, 3], [38410, 3], [90815, 3], [30551, 3], [34720, 3], [58373, 3]]  

As you can see, this version is about 20% faster with the caveat of ignoring ties. I also like the succinctness, I personally use it as-is without monkey patching all over the place. :)

Answer by glenn mcdonald for Ruby: How to find item in array which has the most occurrences?


Here's another version that does give you the ties as a mode should:

def mode    group_by {|x| x}.group_by {|k,v| v.size}.sort.last.last.map(&:first)  end  

In other words, group the values, then group those kv pairs by the number of values, then sort those kv pairs, take the last (highest) size-group, and then unwind its values. I like group_by.

Answer by jj_ for Ruby: How to find item in array which has the most occurrences?


if you are trying to avoid learning #inject (which you should not do...)

words = ['cat', 'dog', 'snake', 'dog']  count = Hash.new(0)    words.each {|word| count[word] += 1}  count.sort_by { |k,v| v }.last  

but if I read this answer before, now I would know nothing about #inject and man, you need to know about #inject.

Answer by Hemchandra for Ruby: How to find item in array which has the most occurrences?


def mode(array)        count = []  # Number of times element is repeated in array      output = []       array.compact!      unique = array.uniq      j=0        unique.each do |i|          count[j] = array.count(i)          j+=1      end      k=0      count.each do |i|          output[k] = unique[k] if i == count.max          k+=1      end          return output.compact.inspect  end    p mode([3,3,4,5]) #=> [3]    p mode([1,2,3]) #=> [1,2,3]    p mode([0,0,0,0,0,1,2,3,3,3,3,3]) #=> [0,3]    p mode([-1,-1,nil,nil,nil,0]) #=> [-1]    p mode([-2,-2,3,4,5,6,7,8,9,10,1000]) #=> [-2]  

Answer by etringer for Ruby: How to find item in array which has the most occurrences?


arr = [ 1, 3, 44, 3 ]  most_frequent_item = arr.uniq.max_by{ |i| arr.count( i ) }  puts most_frequent_item  #=> 3  

No need to even think about frequency mappings.

Answer by Nathan for Ruby: How to find item in array which has the most occurrences?


array.max_by { |i| array.count(i) }  


Fatal error: Call to a member function getElementsByTagName() on a non-object in D:\XAMPP INSTALLASTION\xampp\htdocs\endunpratama9i\www-stackoverflow-info-proses.php on line 72

0 comments:

Post a Comment

Popular Posts

Powered by Blogger.