lejar 5 years ago

Nice overview! One thing I think you should add, which I find immensely useful, is reordering arrays using indexing.

Take for example:

    In [2]: numpy.array([1, 2, 3])[[0, 2, 1]]
    Out[2]: array([1, 3, 2])

You index using a list and it gives you a view of the array with the new order (the underlying array is not changed and there is no copy being done).
  • quietbritishjim 5 years ago

    Using "fancy" indices like this does result in a copy, because the result can't be represented as a simple slice of the original array. A good explanation is here (it's from 2008 but still true):

    https://scipy-cookbook.readthedocs.io/items/ViewsVsCopies.ht...

    You can verify there's a copy by changing the new array after putting the result in a new variable (see above link for why this makes a difference) and verifying the old one is unchanged:

        >>> import numpy as np
        >>> x = np.array([1, 2, 3])
        >>> y = x[[0, 2, 1]]
        >>> y[0] = 3
        >>> y
        array([3, 3, 2])
        >>> x
        array([1, 2, 3])
    
    
    Edit:

    But a view can be based on a slice that includes a skip parameter, and in fact you can even slice in multiple dimensions and it will still be a view. That is worth discussing in the article:

        >>> x = np.array([np.arange(7), np.arange(7)+1]*3)
        >>> y = x[4:1:-2, 1:5:2]
        >>> y
        array([[1, 3],
               [1, 3]])
        >>> y[0,0] = 99
        >>> x
        array([[ 0,  1,  2,  3,  4,  5,  6],
               [ 1,  2,  3,  4,  5,  6,  7],
               [ 0,  1,  2,  3,  4,  5,  6],
               [ 1,  2,  3,  4,  5,  6,  7],
               [ 0, 99,  2,  3,  4,  5,  6],
               [ 1,  2,  3,  4,  5,  6,  7]])
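
For anyone wondering why a stepped, multi-dimensional slice can still be a view: as far as I understand it, the view points at the same buffer and only carries different step sizes. A minimal sketch of how to see that, reusing the arrays above (the helper variable names are mine):

```python
import numpy as np

x = np.array([np.arange(7), np.arange(7) + 1] * 3)  # shape (6, 7)
y = x[4:1:-2, 1:5:2]                                # rows 4, 2 and columns 1, 3

# A basic slice keeps a reference to the array it was cut from.
is_view = y.base is x

# The view only changes the step sizes over x's buffer:
# two rows backwards per step, two columns forwards per step.
expected_strides = (-2 * x.strides[0], 2 * x.strides[1])
strides_match = y.strides == expected_strides

print(is_view, strides_match)  # True True
```

A list index has no such regular step pattern in general, which is why it has to produce a new array.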
    • improbable22 5 years ago

      A related fun fact, when slicing several dimensions:

          >>> a = np.arange(9).reshape(3,3) # a matrix
          >>> a[0:3,0:3]          # ranges are treated independently
          array([[0, 1, 2],
                 [3, 4, 5],
                 [6, 7, 8]])
          >>> a[[0,1,2],[0,1,2]]  # but arrays are treated at once
          array([0, 4, 8])
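
If you want index lists to behave like the independent ranges, `np.ix_` is one way to do it. A small sketch on the same matrix (variable names are just for illustration):

```python
import numpy as np

a = np.arange(9).reshape(3, 3)

# Index arrays are broadcast together and paired element-wise:
# this picks a[0, 0], a[1, 1], a[2, 2].
paired = a[[0, 1, 2], [0, 1, 2]]

# np.ix_ reshapes the lists to (3, 1) and (1, 3), so they broadcast
# into every row/column combination, like the range-based slice.
block = a[np.ix_([0, 1, 2], [0, 1, 2])]

same_as_slice = np.array_equal(block, a[0:3, 0:3])
print(paired, same_as_slice)  # [0 4 8] True
```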
    • ovi256 5 years ago

      A copy-on-write mechanism triggered by `y[0] = 3` would look the same and pass the test you devised, so you can't eliminate the possibility that it exists.

      A better way would be to track memory use. A copy being created by either `y = x[[0, 2, 1]]` or `y[0] = 3` would show as a memory increase.
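
A more direct check than watching memory use is to ask NumPy whether the two arrays overlap, before writing to either of them. A minimal sketch using `np.shares_memory`; since nothing is written, the answer doesn't depend on what a write would trigger:

```python
import numpy as np

x = np.array([1, 2, 3])

fancy = x[[0, 2, 1]]   # list ("fancy") index
sliced = x[::-1]       # basic slice with a step

# Checked right after indexing, with no assignment in between.
fancy_shares = np.shares_memory(x, fancy)
sliced_shares = np.shares_memory(x, sliced)

print(fancy_shares, sliced_shares)  # False True
```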

  • jcims 5 years ago

    As an aside, one of my major challenges grokking numpy and pandas is the semantically dense syntax like the above. I know that the layers of bracing have an impact but it's difficult for me to tell where it is applied and/or described.

grenoire 5 years ago

Pretty, but not particularly in-depth.

Also, a nitpick I can't hold in: why isn't the MSE np.mean(np.square(predictions - labels))? That's even breezier!
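
A quick sketch showing the two spellings agree. I'm assuming the article's version is the sum divided by n, and the sample values are made up:

```python
import numpy as np

predictions = np.array([1.0, 1.0, 1.0])  # made-up values
labels = np.array([1.0, 2.0, 3.0])
n = predictions.size

# Sum-based form, mirroring the textbook formula (assumed to be the article's).
mse_sum = (1 / n) * np.sum(np.square(predictions - labels))

# Mean-based form from the nitpick above.
mse_mean = np.mean(np.square(predictions - labels))

print(np.isclose(mse_sum, mse_mean))  # True
```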

  • manojlds 5 years ago

    I think it's generally done this way because of the way the formula is represented mathematically.

milliams 5 years ago

I like this. One change I would make is in the aggregation and indexing section: show single values (as opposed to single-element arrays) without a coloured box. It's important that the result of these operations is a different type.
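
To make the type difference concrete, here is a small sketch (the array and variable names are just for illustration):

```python
import numpy as np

data = np.array([1, 2, 3])

total = data.sum()       # aggregation gives a NumPy scalar
first = data[0]          # integer index gives a NumPy scalar
first_kept = data[0:1]   # length-1 slice is still an ndarray

total_is_scalar = isinstance(total, np.generic) and np.ndim(total) == 0
first_is_scalar = isinstance(first, np.generic) and np.ndim(first) == 0
slice_is_array = isinstance(first_kept, np.ndarray) and first_kept.shape == (1,)

print(total_is_scalar, first_is_scalar, slice_is_array)  # True True True
```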

pard68 5 years ago

NumPy was a huge boon in college. I had mostly gotten my homework process down to editing a LaTeX file alongside the CSV files for my datasets. When I compiled, it would first crunch the numbers with NumPy, export the results as TeX, and then build a PDF.
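
Not the original files, but a rough sketch of what the number-crunching step of such a pipeline could look like. The CSV contents, column names, and table layout here are all hypothetical:

```python
import io
import numpy as np

# Stand-in for a dataset file on disk (hypothetical contents).
csv_text = "x,y\n1.0,2.0\n2.0,4.0\n3.0,6.0\n"
data = np.loadtxt(io.StringIO(csv_text), delimiter=",", skiprows=1)

means = data.mean(axis=0)
stds = data.std(axis=0, ddof=1)  # sample standard deviation

# Build a small LaTeX table that a .tex file could \input.
rows = [
    r"\begin{tabular}{lrr}",
    r" & mean & std \\",
]
for name, m, s in zip(["x", "y"], means, stds):
    rows.append(rf"{name} & {m:.2f} & {s:.2f} \\")
rows.append(r"\end{tabular}")
tex = "\n".join(rows)

print(tex)
```

In a real setup the string would be written to a file and the LaTeX build would run afterwards.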

  • alanbernstein 5 years ago

    Care to share an example?

    • pard68 5 years ago

      I might still have something. I didn't version control it, but it might be on Dropbox still.

1-6 5 years ago

Wow, this is so timely! I love the visual references. I'm still a little confused about the section on Matrix Indexing. Overall, great work!

Vaslo 5 years ago

Good stuff! I'll definitely look for more from you!

tjpaudio 5 years ago

Nice page, but unless you have never used software for math before, I am not sure it's very useful.