
Saturday, April 9, 2016

Why is 1000000000000000 in range(1000000000000001) so fast in Python 3?



It is my understanding that the range() function, which is actually an object type in Python 3, generates its contents on the fly, similar to a generator.

This being the case, I would have expected the following line to take an inordinate amount of time, because in order to determine whether 1 quadrillion is in the range, a quadrillion values would have to be generated:

1000000000000000 in range(1000000000000001)  

Furthermore: it seems that no matter how many zeroes I add on, the calculation more or less takes the same amount of time (basically instantaneous).

I have also tried things like this, but the calculation is still almost instant:

1000000000000000000000 in range(0,1000000000000000000001,10) # count by tens  
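For reference, here is a rough timing sketch of what I mean (exact numbers obviously depend on the machine, but they barely change as the range grows):

from timeit import timeit

# the cost of the membership test stays flat even as the range gets astronomically larger
for exp in (6, 12, 18):
    n = 10 ** exp
    print(exp, timeit("n - 1 in range(n + 1)", globals={"n": n}, number=1000))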

If I try to implement my own range function, the result is not so nice!!

def my_crappy_range(N):
    i = 0
    while i < N:
        yield i
        i += 1
    return

What is the range() object doing under the hood that makes it so fast?


EDIT: This has turned out to be a much more nuanced topic than I anticipated - there seems to be a bit of history behind the optimization of range().

Martijn Pieters' answer was chosen for its completeness, but also see abarnert's first answer for a good discussion of what it means for range to be a full-fledged sequence in Python 3, and some information/warning regarding potential inconsistency for __contains__ function optimization across Python implementations. abarnert's other answer goes into some more detail and provides links for those interested in the history behind the optimization in Python 3 (and lack of optimization of xrange in Python 2). Answers by poke and by wim provide the relevant C source code and explanations for those who are interested.

Answer by Martijn Pieters for Why is 1000000000000000 in range(1000000000000001) so fast in Python 3?


The Python 3 range() object doesn't produce numbers immediately; it is a smart sequence object that produces numbers on demand. All it contains are your start, stop and step values; as you iterate over the object, the next integer is calculated on each iteration.

The object also implements the object.__contains__ hook and calculates whether your number is part of its range. That calculation is an O(1) constant-time operation; there is never a need to scan through all possible integers in the range.
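As a rough sketch of that math (not the actual implementation), the containment test boils down to a bounds check plus a single modulo:

# illustrative only: the kind of arithmetic the __contains__ hook relies on
start, stop, step = 0, 1000000000000001, 1
num = 1000000000000000
print(start <= num < stop and (num - start) % step == 0)  # True, computed without iterating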

From the range() object documentation:

The advantage of the range type over a regular list or tuple is that a range object will always take the same (small) amount of memory, no matter the size of the range it represents (as it only stores the start, stop and step values, calculating individual items and subranges as needed).
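You can observe that fixed footprint directly; the exact byte count below is from a 64-bit CPython build and may differ elsewhere:

import sys

# both ranges report the same small, constant size (48 bytes here),
# no matter how many values they describe
print(sys.getsizeof(range(10)))       # 48
print(sys.getsizeof(range(10**18)))   # 48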

So at a minimum, your range() object would do:

class my_range(object):
    def __init__(self, start, stop=None, step=1):
        if stop is None:
            start, stop = 0, start
        self.start, self.stop, self.step = start, stop, step
        if step < 0:
            lo, hi = stop, start
        else:
            lo, hi = start, stop
        # guard against negative lengths for empty ranges
        self.length = 0 if lo > hi else ((hi - lo - 1) // abs(step)) + 1

    def __iter__(self):
        current = self.start
        if self.step < 0:
            while current > self.stop:
                yield current
                current += self.step
        else:
            while current < self.stop:
                yield current
                current += self.step

    def __len__(self):
        return self.length

    def __getitem__(self, i):
        if 0 <= i < self.length:
            return self.start + i * self.step
        raise IndexError('Index out of range: {}'.format(i))

    def __contains__(self, num):
        if self.step < 0:
            if not (self.stop < num <= self.start):
                return False
        else:
            if not (self.start <= num < self.stop):
                return False
        return (num - self.start) % self.step == 0

This is still missing several things that a real range() supports (such as the .index() or .count() methods, hashing, equality testing, or slicing), but should give you an idea.

I also simplified the __contains__ implementation to only focus on integer tests; if you give a real range() object a non-integer value (including subclasses of int), a slow scan is initiated to see if there is a match, just as if you use a containment test against a list of all the contained values. This was done to continue to support other numeric types that just happen to support equality testing with integers but are not expected to support integer arithmetic as well. See the original Python issue that implemented the containment test.
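A quick way to see that fallback in action is to compare an exact int against an equal float; the timings below are only indicative:

from timeit import timeit

r = range(10**7)
# exact ints take the O(1) math path; an equal float is not an exact int,
# so the containment test falls back to a linear scan of the range
print(timeit("10**7 - 1 in r", globals={"r": r}, number=1))         # microseconds
print(timeit("float(10**7 - 1) in r", globals={"r": r}, number=1))  # orders of magnitude slower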

Answer by poke for Why is 1000000000000000 in range(1000000000000001) so fast in Python 3?


To add to Martijn's answer, this is the relevant part of the source (in C, as the range object is written in native code):

static int
range_contains(rangeobject *r, PyObject *ob)
{
    if (PyLong_CheckExact(ob) || PyBool_Check(ob))
        return range_contains_long(r, ob);

    return (int)_PySequence_IterSearch((PyObject*)r, ob,
                                       PY_ITERSEARCH_CONTAINS);
}

So for PyLong objects (which is int in Python 3), it will use the range_contains_long function to determine the result. And that function essentially checks if ob is in the specified range (although it looks a bit more complex in C).

If it's not an int object, it falls back to iterating until it finds the value (or not).

The whole logic could be translated to pseudo-Python like this:

def range_contains(rangeObj, obj):
    if isinstance(obj, int):
        return range_contains_long(rangeObj, obj)

    # default logic by iterating
    return any(obj == x for x in rangeObj)

def range_contains_long(r, num):
    if r.step > 0:
        # positive step: r.start <= num < r.stop
        cmp2 = r.start <= num
        cmp3 = num < r.stop
    else:
        # negative step: r.start >= num > r.stop
        cmp2 = num <= r.start
        cmp3 = r.stop < num

    # outside of the range boundaries
    if not cmp2 or not cmp3:
        return False

    # num must be on a valid step inside the boundaries
    return (num - r.start) % r.step == 0
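Since real range objects expose start, stop and step attributes, you can exercise the pseudo-code above directly (purely as an illustration):

print(range_contains(range(0, 20, 3), 9))    # True:  0 <= 9 < 20 and (9 - 0) % 3 == 0
print(range_contains(range(0, 20, 3), 10))   # False: 10 is not on a step
print(range_contains(range(20, 0, -3), 14))  # True:  takes the negative-step branch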

Answer by wim for Why is 1000000000000000 in range(1000000000000001) so fast in Python 3?


Use the source, Luke!

In CPython, range(...).__contains__ (a method wrapper) will eventually delegate to this calculation in C code - which checks if the value can possibly be in the range:

static int
range_contains_long(rangeobject *r, PyObject *ob)
{
    int cmp1, cmp2, cmp3;
    PyObject *tmp1 = NULL;
    PyObject *tmp2 = NULL;
    PyObject *zero = NULL;
    int result = -1;

    zero = PyLong_FromLong(0);
    if (zero == NULL) /* MemoryError in int(0) */
        goto end;

    /* Check if the value can possibly be in the range. */

    cmp1 = PyObject_RichCompareBool(r->step, zero, Py_GT);
    if (cmp1 == -1)
        goto end;
    if (cmp1 == 1) { /* positive steps: start <= ob < stop */
        cmp2 = PyObject_RichCompareBool(r->start, ob, Py_LE);
        cmp3 = PyObject_RichCompareBool(ob, r->stop, Py_LT);
    }
    else { /* negative steps: stop < ob <= start */
        cmp2 = PyObject_RichCompareBool(ob, r->start, Py_LE);
        cmp3 = PyObject_RichCompareBool(r->stop, ob, Py_LT);
    }

    if (cmp2 == -1 || cmp3 == -1) /* TypeError */
        goto end;
    if (cmp2 == 0 || cmp3 == 0) { /* ob outside of range */
        result = 0;
        goto end;
    }

    /* Check that the stride does not invalidate ob's membership. */
    tmp1 = PyNumber_Subtract(ob, r->start);
    if (tmp1 == NULL)
        goto end;
    tmp2 = PyNumber_Remainder(tmp1, r->step);
    if (tmp2 == NULL)
        goto end;
    /* result = (int(ob) - start % step) == 0 */
    result = PyObject_RichCompareBool(tmp2, zero, Py_EQ);

end:
    Py_XDECREF(tmp1);
    Py_XDECREF(tmp2);
    Py_XDECREF(zero);
    return result;
}

static int
range_contains(rangeobject *r, PyObject *ob)
{
    if (PyLong_CheckExact(ob) || PyBool_Check(ob))
        return range_contains_long(r, ob);

    return (int)_PySequence_IterSearch((PyObject*)r, ob,
                                       PY_ITERSEARCH_CONTAINS);
}

For those not familiar with C code, first there is a check that the number is between start and stop. Then there is a check that the stride value doesn't "step over" our number.

The "meat" of the idea is mentioned in the line:

result = (int(ob) - start % step) == 0   

For example, 16 is in range(4, 100, 2) because:

  • 4 <= 16 < 100, and
  • (16 - 4) % 2 == 0.

Note: It's just as well that this line is only a comment, because as written it would be a bug: start % step would be evaluated before the subtraction in C! However, you can see that the actual implementation performs these operations in the correct order, on separate lines.
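The same two checks can be written out in plain Python, in the correct order (a hand-rolled illustration, not the C code):

num, start, stop, step = 16, 4, 100, 2
print(start <= num < stop)               # True: within the boundaries
print((num - start) % step == 0)         # True: lands exactly on a step
print(num in range(start, stop, step))   # True, agreeing with the manual checks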

Answer by abarnert for Why is 1000000000000000 in range(1000000000000001) so fast in Python 3?


The fundamental misunderstanding here is in thinking that range is a generator. It's not. In fact, it's not any kind of iterator.

You can tell this pretty easily:

>>> a = range(5)
>>> print(list(a))
[0, 1, 2, 3, 4]
>>> print(list(a))
[0, 1, 2, 3, 4]

If it were a generator, iterating it once would exhaust it:

>>> b = my_crappy_range(5)
>>> print(list(b))
[0, 1, 2, 3, 4]
>>> print(list(b))
[]

What range actually is, is a sequence, just like a list. You can even test this:

>>> import collections.abc
>>> isinstance(a, collections.abc.Sequence)
True

This means it has to follow all the rules of being a sequence:

>>> a[3]
3
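It also supports the other sequence operations you would expect (a quick illustration):

>>> a[-1]
4
>>> a[1:3]
range(1, 3)
>>> a.index(3)
3
>>> a.count(3)
1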

The difference between a range and a list is that a range is a lazy or dynamic sequence; it doesn't remember all of its values, it just remembers its start, stop, and step, and creates the values on demand on __getitem__.
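For example, indexing a huge range never materializes its values; each item is computed on demand as start + i * step:

>>> big = range(0, 10**15, 7)
>>> len(big)
142857142857143
>>> big[10**9]
7000000000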

(As a side note, if you print(iter(a)), you'll notice that range uses the same listiterator type as list. How does that work? A listiterator doesn't use anything special about list except for the fact that it provides a C implementation of __getitem__, so it works fine for range too.)


Now, there's nothing that says that Sequence.__contains__ has to be constant time; in fact, for obvious examples of sequences like list, it isn't. But there's nothing that says it can't be. And it's easier to implement range.__contains__ to just check it mathematically ((val - start) % step, but with some extra complexity to deal with negative steps) than to actually generate and test all the values, so why shouldn't it do it the better way?
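For instance, the same math handles a negative step (a quick illustration):

>>> r = range(10, 0, -2)     # 10, 8, 6, 4, 2
>>> 6 in r
True
>>> 5 in r
False
>>> 0 in r                   # the stop value is excluded
False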

But there doesn't seem to be anything in the language that guarantees this will happen. As Ashwini Chaudhary points out, if you give it a non-integral value, instead of converting to integer and doing the mathematical test, it will fall back to iterating all the values and comparing them one by one. And just because CPython 3.2+ and PyPy 3.x versions happen to contain this optimization, and it's an obvious good idea and easy to do, there's no reason that IronPython or NewKickAssPython 3.x couldn't leave it out. (And in fact CPython 3.0-3.1 didn't include it.)


If range actually were a generator, like my_crappy_range, then it wouldn't make sense to test __contains__ this way, or at least the way it makes sense wouldn't be obvious. If you'd already iterated the first 3 values, is 1 still in the generator? Should testing for 1 cause it to iterate and consume all the values up to 1 (or up to the first value >= 1)?
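You can see that ambiguity with an actual generator, where a membership test consumes values as it searches (a small illustration):

>>> g = (i for i in range(5))
>>> 2 in g          # iterates and consumes 0, 1, 2
True
>>> list(g)         # only the values after 2 are left
[3, 4]
>>> 1 in g          # 1 was already consumed
False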

Answer by Stefan Pochmann for Why is 1000000000000000 in range(1000000000000001) so fast in Python 3?


The other answers explained it well already, but I'd like to offer another experiment illustrating the nature of range objects:

>>> r = range(5)
>>> for i in r:
...     print(i, 2 in r, list(r))
...
0 True [0, 1, 2, 3, 4]
1 True [0, 1, 2, 3, 4]
2 True [0, 1, 2, 3, 4]
3 True [0, 1, 2, 3, 4]
4 True [0, 1, 2, 3, 4]

As you can see, a range object is an object that remembers its range and can be used many times (even while iterating over it), not just a one-time generator.

Answer by abarnert for Why is 1000000000000000 in range(1000000000000001) so fast in Python 3?


If you're wondering why this optimization was added to range.__contains__, and why it wasn't added to xrange.__contains__ in 2.7:

First, as Ashwini Chaudhary discovered, issue 1766304 was opened explicitly to optimize [x]range.__contains__. A patch for this was accepted and checked in for 3.2, but not backported to 2.7 because "xrange has behaved like this for such a long time that I don't see what it buys us to commit the patch this late." (2.7 was nearly out at that point.)

Meanwhile:

Originally, xrange was a not-quite-sequence object. As the 3.1 docs say:

Range objects have very little behavior: they only support indexing, iteration, and the len function.

This wasn't quite true; an xrange object actually supported a few other things that come automatically with indexing and len,* including __contains__ (via linear search). But nobody thought it was worth making them full sequences at the time.

Then, as part of implementing the Abstract Base Classes PEP, it was important to figure out which builtin types should be marked as implementing which ABCs, and xrange/range claimed to implement collections.Sequence, even though it still only handled the same "very little behavior". Nobody noticed that problem until issue 9213. The patch for that issue not only added index and count to 3.2's range, it also re-worked the optimized __contains__ (which shares the same math with index, and is directly used by count).** This change went in for 3.2 as well, and was not backported to 2.x, because "it's a bugfix that adds new methods". (At this point, 2.7 was already past rc status.)

So, there were two chances to get this optimization backported to 2.7, but they were both rejected.


* In fact, you even get iteration for free with len and indexing, but in 2.3 xrange objects got a custom iterator. Which they then lost in 3.x, which uses the same listiterator type as list.

** The first version actually reimplemented it, and got the details wrong; e.g., it would give you MyIntSubclass(2) in range(5) == False. But Daniel Stutzbach's updated version of the patch restored most of the previous code, including the fallback to the generic, slow _PySequence_IterSearch that pre-3.2 range.__contains__ was implicitly using when the optimization doesn't apply.

