OrdinalEncoder, OneHotEncoder and LabelBinarizer in Python
Most data sets we work with contain both numerical data and categorical (non-numeric) data. Many algorithms cannot work on categorical data directly; they are designed to operate on numbers, which keeps them efficient. So how do we train a model with categorical data? This is where we apply a technique called encoding, which transforms categorical data into numeric data. The scikit-learn library provides several encoders. In this post, I'll share what I learned about three of them.

Ordinal Encoder

This performs ordinal (integer) encoding of categorical data. Let's look at an example. I have a 2-dimensional array of categorical data.

    import numpy as np

    multiArray = np.array([['Karnataka', 'KA'], ['Maharastra', 'MH'], ['Gujarat', 'GJ']])

Then I use OrdinalEncoder to transform this data.

    from sklearn.preprocessing import OrdinalEncoder

    ordinalEncoder = OrdinalEncoder()
    ordinalEncoderArray = ordinalEncoder.fit_transform(mult