


Iterator protocol within numpy

















Is there a way to work with iterators instead of (for example) numpy.ndarray in numpy?



For example, imagine I have a 2D array and I want to know if there is a row that contains only even numbers:



import numpy as np

x = np.array([[1, 2], [2, 4], [3, 6]])
np.any(np.all(x % 2 == 0, axis=1))


Is there a way to do this kind of thing without instantiating the intermediate objects in memory? (Or maybe that is already the case and I just don't know it.) In this example, that would mean having an iterator over [False True False] instead of an array. In other words, can we do something that would be equivalent to:



has_an_even_row = False
for row in x:
    if np.all(row % 2 == 0):
        has_an_even_row = True
        break


My question doesn't only concern all and any, but all functions and methods in numpy. If it isn't possible, I wonder if there is a practical reason for not having this in numpy. (Maybe everyone thinks it's useless; that would be a good reason.)





























  • Sure you can iterate over the rows. Use the usual Python for loop. But beware: the action will usually, but not always, be slower.

    – hpaulj
    Mar 24 at 18:03

  • I just updated my question, I'm looking for a solution internal to numpy.

    – cglacet
    Mar 24 at 18:04

  • What exactly are you envisioning? If the creation of intermediate objects is problematic, look into numexpr. But, as hpaulj is saying, if you want an iterator, use a for-loop.

    – juanpa.arrivillaga
    Mar 24 at 18:05

  • You can also look at numba, which is a JIT compiler that will just-in-time-compile functions that use simple loops over numpy data structures into native code. In my experience, it is quite effective.

    – juanpa.arrivillaga
    Mar 24 at 18:17

  • numpy is like a Lego set. It is fast and easy to use when you stick with the given building blocks. It does not include a custom block molding machine - you have to get that from some other source.

    – hpaulj
    Mar 24 at 20:08
























python numpy iterator














asked Mar 24 at 17:53 by cglacet, edited Mar 24 at 18:11



















2 Answers














The numpy library doesn't give you very many tools to use some of the conventional Python protocols because it is focused on performance within a narrow domain (numeric computation). The whole purpose of numpy is to do numeric operations that are slow in pure Python much more quickly (close to your hardware's maximum speed, like code written in a lower-level language such as C) without losing all of the benefits of Python (like garbage collection and easy-to-read syntax).



The downside to focusing on a narrow domain is that you lose some benefits of more general code. So your for loop code can do less work than numpy does, because it can short-circuit, breaking out of the iteration as soon as the result is known. It doesn't need to do the modulus for every row if it found the result it needs already.
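The short-circuiting described here can also be written with Python's built-in any() and a generator expression, which evaluates one row at a time and stops at the first match. This is only a minimal sketch reusing the array from the question; the helper name has_even_row is made up for illustration:

```python
import numpy as np

def has_even_row(x):
    """Return True as soon as a row containing only even numbers is found."""
    # The generator yields one boolean per row, lazily. The built-in any()
    # stops consuming it at the first True, so later rows are never tested.
    return any(bool(np.all(row % 2 == 0)) for row in x)

x = np.array([[1, 2], [2, 4], [3, 6]])
print(has_even_row(x))  # True
```

Each per-row test still allocates small temporaries (row % 2 and the comparison), but nothing the size of the whole array.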



But I suspect that if you test it, your numpy code may still be faster a lot of the time (test on real data, not trivial stuff like in your example)! Even though it computes a whole bunch of intermediate results up front, the low-level operations are so much faster than the equivalent in pure Python that it doesn't matter that it has to iterate over the whole array.
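One way to check this is to time both versions on a larger array. The sketch below is only an illustration: the array size, the random data, and the function names are assumptions, and the numbers will differ from machine to machine, so no timings are claimed here.

```python
import timeit
import numpy as np

def vectorized(x):
    # Whole-array version from the question: builds full-size temporaries.
    return bool(np.any(np.all(x % 2 == 0, axis=1)))

def row_loop(x):
    # Row-by-row version: can stop early, but the loop runs in Python.
    for row in x:
        if np.all(row % 2 == 0):
            return True
    return False

rng = np.random.default_rng(0)
# Hypothetical test data: 20,000 rows of 10 odd integers each...
x = 2 * rng.integers(0, 50, size=(20_000, 10)) + 1
# ...with a single all-even row at the very end, so the loop cannot exit early.
x[-1] = 2

assert vectorized(x) == row_loop(x)
for fn in (vectorized, row_loop):
    seconds = timeit.timeit(lambda: fn(x), number=3)
    print(f"{fn.__name__}: {seconds / 3:.5f} s per call")
```

Moving the all-even row to the front of the array shows the opposite extreme, where the loop returns after a single row.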





























  • I'll surely try to compare time performances and come back here :). But that wouldn't really be a sufficient reason for not having a way to have iterators, there probably is a memory-speed tradeoff here. Unless I'm missing something.

    – cglacet
    Mar 26 at 14:14











  • Well, I guess I just don't understand exactly what you're expecting. Numpy arrays are iterable, so you can write normal Python code to operate on them (though it may not be as convenient or even as fast as using normal Python data structures). Many numpy functions only work on arrays, rather than iterables, and the reason for that is that their performance benefits are only available for arrays, not for arbitrary objects.

    – Blckknght
    Mar 27 at 0:23











  • "and the reason for that is that their performance benefits are only available for arrays" that's the part I really don't understand. From what I understand, numpy takes advantage of static typing (together with type-homogeneous structures) to speed things up and save memory. What I fail to understand is why this can't be used to build some other (statically typed) set of functions that, instead of having arrays as both input and output, would have arrays as input and iterators as output (a custom kind of iterator, since it wouldn't iterate over arbitrary objects, but instead over a given type).

    – cglacet
    Mar 27 at 7:33











  • Testing this is a bit hard as it requires re-writing some parts of numpy, but I'll try to in the near future if nobody tells me it's just not possible because of some reason I fail to see for now (maybe because I have an over-simplified vision of how numpy works).

    – cglacet
    Mar 27 at 7:37











  • The iterator protocol isn't that specific. You can't really have a function that only accepts one kind of iterator and say that's using the iterator protocol. You either call next on the arbitrary iterator object you've been given (which is slow, since it does a Python function call and might run arbitrary Python code), or you don't.

    – Blckknght
    Mar 27 at 7:40
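To make the points in these comments concrete: an ndarray already supports the iterator protocol through iter() and next(), and a plain Python generator gives the "iterator over [False True False]" from the question. The sketch below assumes nothing beyond NumPy itself; the generator name even_row_flags is hypothetical. Every step goes through a Python-level call, which is the cost mentioned above.

```python
import numpy as np

x = np.array([[1, 2], [2, 4], [3, 6]])

rows = iter(x)       # ndarray implements __iter__
first = next(rows)   # each step is a Python-level call
print(first)                        # [1 2]
print(np.shares_memory(first, x))   # True: the row is a view, not a copy

def even_row_flags(arr):
    """Lazily yield one boolean per row (an ordinary Python generator)."""
    for row in arr:
        yield bool(np.all(row % 2 == 0))

flags = even_row_flags(x)
print(next(flags), next(flags))     # False True
```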


























The number of temporary arrays may be more than you realize:



In [224]: x = np.array([[1, 2], [2, 4], [3, 6]])
In [225]: x % 2
Out[225]:
array([[1, 0],
       [0, 0],
       [1, 0]])
In [226]: _ == 0
Out[226]:
array([[False,  True],
       [ True,  True],
       [False,  True]])
In [227]: np.all(_, axis=1)
Out[227]: array([False,  True, False])
In [228]: np.any(_)
Out[228]: True


In this case, working row by row would save on calculating the last row's values.



The last any step might short-circuit, stopping when it hits the True - that's an implementation detail.
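If the temporaries themselves are the concern, ufuncs accept an out= argument, so the intermediate results can be written into preallocated buffers and reused across calls. The following is a sketch rather than a recommendation, and the buffer names are illustrative. It reduces allocations, but it still computes every element:

```python
import numpy as np

x = np.array([[1, 2], [2, 4], [3, 6]])

# Preallocate the two intermediates once; they can be reused for any
# later array of the same shape and dtype.
remainders = np.empty_like(x)
is_even = np.empty(x.shape, dtype=bool)

np.remainder(x, 2, out=remainders)     # same values as x % 2, written in place
np.equal(remainders, 0, out=is_even)   # same values as remainders == 0

print(is_even.all(axis=1).any())  # True
```

The final all(axis=1) step still creates one small boolean array with an entry per row.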



A thoroughly iterative, no excess calculations method would be something like:



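In [231]: val = False
     ...: for row in x:
     ...:     for col in row:
     ...:         if col % 2 != 0:
     ...:             break          # odd value: give up on this row
     ...:     else:
     ...:         val = (row, col)   # inner loop finished: every value was even
     ...:         break              # stop at the first all-even row

In [232]: val
Out[232]: (array([2, 4]), 4)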


This approach would make sense if I were writing in C or a Lisp-like language, where testing, memory management, and calculations all occur at the same code level. But it wouldn't be very modular or reusable.



The idea underlying numpy is to provide a comprehensive set of compiled building blocks. Those blocks won't be optimal for all tasks, but on the whole they are fast and easy to use.



It's generally recommended to use the given building blocks for quick development. Once it's working, then worry about improving the speed of time-critical steps.
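One middle ground that stays within the compiled building blocks is to apply them to blocks of rows: the temporaries are bounded by the block size, and the loop can stop after the first block that contains a match. This is a sketch under that assumption; the function name and the default chunk_size are arbitrary choices, not a numpy API:

```python
import numpy as np

def any_even_row(x, chunk_size=1024):
    """Check for an all-even row, one block of rows at a time."""
    for start in range(0, x.shape[0], chunk_size):
        block = x[start:start + chunk_size]  # basic slicing gives a view, not a copy
        # The temporaries created here are at most chunk_size rows tall.
        if np.any(np.all(block % 2 == 0, axis=1)):
            return True  # stop early; later blocks are never evaluated
    return False

x = np.array([[1, 2], [2, 4], [3, 6]])
print(any_even_row(x, chunk_size=2))  # True
```

A larger chunk_size moves the behavior toward the fully vectorized version, and a chunk_size of 1 toward the row-by-row loop.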















answered Mar 24 at 21:12 by hpaulj, edited Mar 24 at 21:31












  • "But it wouldn't be very modular or reusable" I agree, that's why I was wondering if it existed inside numpy. "It's generally recommended to use the given building blocks for quick development. Once it's working then worry about improving the speed of time critical steps." Basically, if I had a use case where memory is the limiting factor, would you advise rewriting the code in Cython or another lower-level language?

    – cglacet
    Mar 26 at 14:20

















  • "But it wouldn't be very modular or reusable" I agree, that's why I was wondering if it existed inside numpy. "It's generally recommended to use the given building blocks for quick development. Once it's working then worry about improving the speed of time critical steps.", basically if I had a use case where memory is limiting you would advice to rewriting the code in Cython or any lower level language?

    – cglacet
    Mar 26 at 14:20
















"But it wouldn't be very modular or reusable" I agree, that's why I was wondering if it existed inside numpy. "It's generally recommended to use the given building blocks for quick development. Once it's working then worry about improving the speed of time critical steps.", basically if I had a use case where memory is limiting you would advice to rewriting the code in Cython or any lower level language?

– cglacet
Mar 26 at 14:20





"But it wouldn't be very modular or reusable" I agree, that's why I was wondering if it existed inside numpy. "It's generally recommended to use the given building blocks for quick development. Once it's working then worry about improving the speed of time critical steps.", basically if I had a use case where memory is limiting you would advice to rewriting the code in Cython or any lower level language?

– cglacet
Mar 26 at 14:20













1














The numpy library doesn't give you very many tools to use some of the conventional Python protocols because it is focused on performance within a narrow domain (numeric computation). The whole purpose of numpy is to do numeric operations that are slow in pure Python much more quickly (close to your hardware's maximum speed, like code written in a lower level language like C) without loosing all of the benefits of Python (like garbage collection and easy to read syntax).



The downside to focusing on a narrow domain is that you lose some benefits of more general code. So your for loop code can do less work than numpy does, because it can short-circuit, breaking out of the iteration as soon as the result is known. It doesn't need to do the modulus for every row if it found the result it needs already.



But I suspect if you test it, your numpy code may still going to be faster a lot of the time (test on real data, not trivial stuff like in your example)! Even though it computes a whole bunch of intermediate results up front, the low level operations are so much faster than the equivalent in pure Python that it doesn't matter that it has to iterate over the whole array.






share|improve this answer























  • I'll surely try to compare time performances and come back here :). But that wouldn't really be a sufficient reason for not having a way to have iterators, there probably is a memory-speed tradeoff here. Unless I'm missing something.

    – cglacet
    Mar 26 at 14:14











  • Well, I guess I just don't understand exactly what you're expecting. Numpy arrays are iterable, so you can write normal Python code to operate on them (though it may not be as convenient or even as fast as using normal Python data structures). Many numpy functions only work on arrays, rather than iterables, and the reason for that is that their performance benefits are only available for arrays, not for arbitrary objects.

    – Blckknght
    Mar 27 at 0:23











  • "and the reason for that is that their performance benefits are only available for arrays" that's the part I really don't understand, from what I understand numpy takes advantage of static typing (together with type homogeneous structures) to speedup things and save memory. What I fail to understand is why this can't be used to build some other (statically typed) set of functions that instead of having arrays as both input and output would have arrays as input and iterators as output (a custom kind of iterator since it wouldn't iterate over arbitrary object, but instead over a given type).

    – cglacet
    Mar 27 at 7:33











  • Testing this is a bit hard as it requires re-writing some parts of numpy, but I'll try to in the near future if nobody tells me it's just not possible because of some reason I fail to see for now (maybe because I have an over-simplified vision of how numpy works).

    – cglacet
    Mar 27 at 7:37











  • The iterator protocol isn't that specific. You can't really have a function that only accepts one kind of iterator and say that's using the iterator protocol. You either call next on the arbitrary iterator object you've been given (which is slow, since it does a Python function call and might run arbitrary Python code), or you don't.

    – Blckknght
    Mar 27 at 7:40
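A minimal sketch of what "calling next on an arbitrary iterator" means in practice. The `Countdown` class and the `total` function are hypothetical names made up for illustration; they are not part of numpy or of the question's code. The point is only that a consumer written against the iterator protocol has to go through `next()` for every element, and each call may run arbitrary Python code.

```python
class Countdown:
    """Hypothetical iterator: anything with __iter__ and __next__ qualifies."""

    def __init__(self, start):
        self.current = start
        self.calls = 0  # counts how often the consumer asked for a value

    def __iter__(self):
        return self

    def __next__(self):
        self.calls += 1  # arbitrary Python code runs on every single step
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


def total(iterable):
    """Sum the items using nothing but the iterator protocol."""
    iterator = iter(iterable)
    result = 0
    while True:
        try:
            result += next(iterator)  # one Python-level call per element
        except StopIteration:
            return result


countdown = Countdown(3)
print(total(countdown), countdown.calls)
```

A function like `total` cannot assume anything about the memory layout or element type of what it receives, which is why it cannot hand the whole job to a tight compiled loop.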





























The numpy library doesn't give you very many tools for using some of the conventional Python protocols, because it is focused on performance within a narrow domain (numeric computation). The whole purpose of numpy is to perform numeric operations that are slow in pure Python much more quickly (close to your hardware's maximum speed, like code written in a lower-level language such as C) without losing all of the benefits of Python (like garbage collection and easy-to-read syntax).



The downside to focusing on a narrow domain is that you lose some benefits of more general code. Your for loop code can do less work than numpy does, because it can short-circuit, breaking out of the iteration as soon as the result is known. It doesn't need to compute the modulus for every row if it has already found the result it needs.
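To make that trade-off concrete, here is a minimal sketch. The example from the question is not reproduced here, so the data, the divisor and both function names are assumptions chosen purely for illustration. Each function also reports how many rows it had to examine.

```python
import numpy as np


def any_multiple_loop(rows, divisor):
    """Pure-Python scan: stops at the first row that matches."""
    checked = 0
    for value in rows:
        checked += 1
        if value % divisor == 0:
            return True, checked  # short-circuit: later rows are never examined
    return False, checked


def any_multiple_numpy(rows, divisor):
    """Vectorized scan: takes the modulus of every row, then reduces."""
    remainders = rows % divisor  # full intermediate array is built first
    return bool((remainders == 0).any()), remainders.size


data = np.arange(1, 1_000_001)  # hypothetical data: 1, 2, ..., 1_000_000
print(any_multiple_loop(data, 7))   # stops after examining 7 rows
print(any_multiple_numpy(data, 7))  # examines all 1_000_000 rows
```

The loop does far fewer operations here, but each of them goes through the Python interpreter, which is the cost the next paragraph is about.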



But I suspect that if you test it, your numpy code is still going to be faster a lot of the time (test on real data, not trivial stuff like in your example)! Even though it computes a whole bunch of intermediate results up front, the low-level operations are so much faster than their pure-Python equivalents that it doesn't matter that it has to iterate over the whole array.
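One way to check this on your own machine is a small `timeit` comparison. This is only a sketch under stated assumptions: the random data, the array size and the divisor are made up, and the divisor is deliberately larger than every value so that no row matches and both versions have to scan the whole array. Results will differ with your data, especially when a match occurs early.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: 200_000 integers in the range [1, 999_999].
data = rng.integers(1, 1_000_000, size=200_000)
divisor = 1_000_003  # larger than every value, so no row is a multiple


def loop_version(rows, divisor):
    """Pure-Python loop with short-circuiting."""
    for value in rows:
        if value % divisor == 0:
            return True
    return False


def numpy_version(rows, divisor):
    """Vectorized version: whole-array modulus, then a reduction."""
    return bool((rows % divisor == 0).any())


loop_time = timeit.timeit(lambda: loop_version(data, divisor), number=1)
numpy_time = timeit.timeit(lambda: numpy_version(data, divisor), number=1)
print(f"loop:  {loop_time:.4f} s")
print(f"numpy: {numpy_time:.4f} s")
```

To see where short-circuiting starts to win, try moving a matching value toward the front of `data` and timing both versions again.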
















answered Mar 24 at 22:07









Blckknght





























