Array Slicing using NumPy

When I started working on a new classification project, I had to slice a data array to get the result I wanted. That’s when I realized I had not fully understood array slicing. That prompted me to get my hands dirty with it, and I’ll share what I learned in this post.

We need the NumPy library to work with arrays, so I’ll import it first.

import numpy as np

Next, I’ll create a 2-d array with 4 rows and 5 columns, filled with random integers from 0 to 9.

arr2d = np.random.randint(10, size=(4, 5))

This is what the array looks like (the values are random, so yours will differ):

array([[3, 7, 3, 2, 0],
      [8, 1, 6, 1, 9],
      [3, 3, 3, 8, 2],
      [5, 9, 5, 1, 3]])
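Because np.random.randint draws new values on every run, a fresh array will almost certainly not match the one shown above. A minimal sketch for following along with the same numbers is to build the array explicitly with np.array, copying the values from the output above:

```python
import numpy as np

# Rebuild the exact array shown in this post so the later outputs match.
arr2d = np.array([[3, 7, 3, 2, 0],
                  [8, 1, 6, 1, 9],
                  [3, 3, 3, 8, 2],
                  [5, 9, 5, 1, 3]])

print(arr2d.shape)  # (4, 5): 4 rows, 5 columns
```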

Essentially, a slice of this array has two parts.

arr2d[rowFrom:rowTo, columnFrom:columnTo]

The first part tells NumPy which rows we are interested in, and the second part tells NumPy which columns. In each part the start index is included and the stop index is excluded, so the rows returned run from rowFrom to rowTo-1 and the columns from columnFrom to columnTo-1.
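A quick way to check how the start and stop values behave is to slice a small array whose values make the positions obvious. The sketch below uses a hypothetical helper array called demo (it is not part of the example in this post); the start index is included and the stop index is excluded:

```python
import numpy as np

# Hypothetical demo array: each value encodes its position as row*10 + column.
demo = np.arange(4)[:, None] * 10 + np.arange(5)

# Rows 1 and 2 (stop index 3 is excluded), columns 0 and 1 (stop index 2 is excluded).
sub = demo[1:3, 0:2]
print(sub)
# [[10 11]
#  [20 21]]

# The number of rows or columns returned is stop - start.
print(sub.shape)  # (2, 2)
```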

If I leave both parts empty, like this:

arr2d[:,:]

I get the complete array.

array([[3, 7, 3, 2, 0],
      [8, 1, 6, 1, 9],
      [3, 3, 3, 8, 2],
      [5, 9, 5, 1, 3]])
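One detail that is easy to miss: a basic slice such as arr2d[:,:] is a view of the original data, not a copy, so writing through the slice also changes arr2d. The sketch below illustrates this with np.shares_memory and an explicit .copy(); the variable names full and independent are only for illustration:

```python
import numpy as np

arr2d = np.array([[3, 7, 3, 2, 0],
                  [8, 1, 6, 1, 9],
                  [3, 3, 3, 8, 2],
                  [5, 9, 5, 1, 3]])

full = arr2d[:, :]                    # basic slice: a view on the same data
print(np.shares_memory(arr2d, full))  # True

full[0, 0] = 99                       # writing through the view...
print(arr2d[0, 0])                    # 99, the original changed too

independent = arr2d[:, :].copy()      # an explicit copy is independent
independent[0, 0] = -1
print(arr2d[0, 0])                    # still 99
```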

Let’s experiment with the rows first.

arr2d[:2,:]

This returns the first 2 rows. Note that when rowFrom is left blank, the slice starts from the beginning (index 0).

array([[3, 7, 3, 2, 0],
      [8, 1, 6, 1, 9]])

Now I’ll specify rowFrom as well as rowTo.

arr2d[2:4,:]

Note that since I specified rowTo as 4, this returns the rows from index 2 to index 4-1, which is 3.

array([[3, 3, 3, 8, 2],
      [5, 9, 5, 1, 3]])

Another example.

arr2d[3:4,:]

This gets me the rows from index 3 to index 4-1, which is 3 itself, so I’ll get only one row.

array([[5, 9, 5, 1, 3]])
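Note the double brackets in that output: a slice keeps the result 2-d even when it selects a single row. A short sketch comparing the slice 3:4 with the plain integer index 3, using the array values from this post:

```python
import numpy as np

arr2d = np.array([[3, 7, 3, 2, 0],
                  [8, 1, 6, 1, 9],
                  [3, 3, 3, 8, 2],
                  [5, 9, 5, 1, 3]])

row_slice = arr2d[3:4, :]  # slice: the result stays 2-d
row_index = arr2d[3, :]    # integer index: the row dimension is dropped

print(row_slice.shape)  # (1, 5)
print(row_index.shape)  # (5,)
```

Which form is preferable depends on what the next step expects; some functions want a 2-d input even for a single row.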

Similarly, we can specify the columns of the array.

So, arr2d[:,:] is the same as arr2d[:,0:5], which gets me all the columns.

array([[3, 7, 3, 2, 0],
      [8, 1, 6, 1, 9],
      [3, 3, 3, 8, 2],
      [5, 9, 5, 1, 3]])

As another example, this command gets me the columns from index 0 to index 4-1, that is, index 3.

arr2d[:,0:4]

array([[3, 7, 3, 2],
      [8, 1, 6, 1],
      [3, 3, 3, 8],
      [5, 9, 5, 1]])

This command gets me the columns from index 2 to index 4-1, that is, index 3.

arr2d[:,2:4]

array([[3, 2],
      [6, 1],
      [3, 8],
      [5, 1]])
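Row and column slices can also be combined in one expression. The sketch below, using the array values from this post, picks a block from the middle of the array and then the last two columns with a negative start index (negative indices count from the end; they are not used elsewhere in this post):

```python
import numpy as np

arr2d = np.array([[3, 7, 3, 2, 0],
                  [8, 1, 6, 1, 9],
                  [3, 3, 3, 8, 2],
                  [5, 9, 5, 1, 3]])

# Rows at index 1 and 2, columns at index 2 and 3.
block = arr2d[1:3, 2:4]
print(block)
# [[6 1]
#  [3 8]]

# All rows, last two columns.
last_two = arr2d[:, -2:]
print(last_two)
# [[2 0]
#  [1 9]
#  [8 2]
#  [1 3]]
```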

What if I want to leave out a specific column? There are two ways to do it (as far as I know at this point).

Method 1: Specify the indices of the columns that are required.

arr2d[:,[0,2,3,4]]

The output does not contain the column at index 1.

array([[3, 3, 2, 0],
      [8, 6, 1, 9],
      [3, 3, 8, 2],
      [5, 5, 1, 3]])

Even though the above method works as expected, it may not be practical to list every column just to remove one, especially when dealing with a large array.
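One way to avoid typing out every index, offered here as a sketch rather than something from the exercises in this post, is a boolean mask with one entry per column. The name keep is only for illustration:

```python
import numpy as np

arr2d = np.array([[3, 7, 3, 2, 0],
                  [8, 1, 6, 1, 9],
                  [3, 3, 3, 8, 2],
                  [5, 9, 5, 1, 3]])

# One boolean per column; True means "keep this column".
keep = np.ones(arr2d.shape[1], dtype=bool)
keep[1] = False  # drop the column at index 1

result = arr2d[:, keep]
print(result)
# [[3 3 2 0]
#  [8 6 1 9]
#  [3 3 8 2]
#  [5 5 1 3]]
```

The mask has the same length however many columns the array has, so only the columns to drop need to be named.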

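Method 2: NumPy also provides a dedicated function for this, np.delete.

np.delete(arr2d, [1], 1)

With np.delete, we can remove a specific column from the array. The second argument lists the indices to remove, and the third argument is the axis: 0 refers to rows and 1 refers to columns, so we specify 1 here. np.delete returns a new array and leaves arr2d unchanged. The result looks like this: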

array([[3, 3, 2, 0],
      [8, 6, 1, 9],
      [3, 3, 8, 2],
      [5, 5, 1, 3]])
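To see what the axis argument does, here is a small sketch that removes index 1 along each axis in turn, using the array values from this post. It also shows that np.delete leaves the original array untouched:

```python
import numpy as np

arr2d = np.array([[3, 7, 3, 2, 0],
                  [8, 1, 6, 1, 9],
                  [3, 3, 3, 8, 2],
                  [5, 9, 5, 1, 3]])

no_col = np.delete(arr2d, 1, axis=1)  # remove the column at index 1
no_row = np.delete(arr2d, 1, axis=0)  # remove the row at index 1

print(no_col.shape)  # (4, 4)
print(no_row.shape)  # (3, 5)
print(arr2d.shape)   # (4, 5): np.delete returned new arrays
```

Passing the axis by keyword (axis=1) rather than by position makes the call easier to read.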

This set of exercises helped me understand how array slicing can be done. With this understanding, I’ll go ahead and work on my next project.
