Data Preprocessing Module
This tutorial demonstrates the functions in the data preprocessing module. The module offers data normalization functions that are sometimes needed before data mining.
Import the Data Preprocessing Module
In [1]:
from mygeopackage import Geo
import mygeopackage.pproc
To use the functions in the data preprocessing module, we first need to import mygeopackage.pproc.
Standard Normalization
In [2]:
geojson = Geo(r'https://github.com/yungming0119/mygeopackage/blob/main/docs/notebooks/data/sample_points.geojson?raw=true')
In [3]:
geojson.data[:,15]
Out[3]:
array(['34.260194', '34.262226', '34.273318', ..., '18.354782', '18.336658', '18.31823'], dtype='<U60')
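First, we create a Geo class object from the sample GeoJSON file and inspect column 15, the column we are going to standardize. Note that its values are stored as strings.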
In [4]:
mygeopackage.pproc.standardNormalization(geojson,15)
The standardNormalization function accepts two arguments. The first is the Geo class object that stores the data. The second is the index of the column that you want to standardize.
In [5]:
geojson.data[:,15]
Out[5]:
array(['-0.6106098792046125', '-0.6102592429638816', '-0.6083452384450874', ..., '-3.3552033196364177', '-3.3583307464213625', '-3.361510630596653'], dtype='<U60')
As the output shows, the original column has been replaced by the standardized column.
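For readers who want to see what this kind of in-place standardization looks like, here is a minimal NumPy sketch. It is not the mygeopackage implementation: the helper name `standardize_column` is hypothetical, and the sketch assumes a z-score transform, (x - mean) / std, using the population standard deviation, applied to a 2-D array of strings like `geojson.data`.

```python
import numpy as np

def standardize_column(data, index):
    """Hypothetical helper: z-score one column of a 2-D string array in place."""
    # The column is stored as strings, so convert it to floats first.
    values = data[:, index].astype(float)
    std = values.std()  # population standard deviation (ddof=0), an assumption
    if std == 0:
        raise ValueError("column has zero variance and cannot be standardized")
    # Write the standardized values back as strings, replacing the original column.
    data[:, index] = ((values - values.mean()) / std).astype(str)
    return data

# Small made-up array standing in for geojson.data
sample = np.array([["a", "1.0"], ["b", "2.0"], ["c", "3.0"]], dtype="<U60")
standardize_column(sample, 1)
standardized = sample[:, 1].astype(float)
print(standardized)
```

After the call, the standardized column has a mean of approximately 0 and a standard deviation of approximately 1, and the original values are no longer available in the array. Keep a copy of the column beforehand if you still need the raw values.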
Last update: 2021-05-03