Unsupervised Machine Learning¶
This tutorial demonstrates the clustering functions in mygeopackage. Clustering is a form of unsupervised machine learning.
Import module¶
from mygeopackage import Geo
import mygeopackage.unsupervised
First, we import the Geo class and the mygeopackage.unsupervised module.
K-Means Clustering¶
geojson = Geo(r'https://github.com/yungming0119/mygeopackage/blob/main/docs/notebooks/data/sample_points.geojson?raw=true')
cluster_results = mygeopackage.unsupervised.Cluster(geojson.data[0:100])
The unsupervised module's core class is Cluster, which stores the results of a cluster analysis. To instantiate Cluster, pass it a NumPy array containing your spatial and attribute data. Here, we use the first 100 rows of the sample dataset.
mygeopackage.unsupervised.k_means(10,[0,1],cluster_results,2)
k_means() requires four arguments. The first is n, the desired number of clusters. The second is a list of the fields (column indices) to cluster on. In Geo.data, the spatial coordinates are in columns 0 and 1, so passing [0,1] clusters on the spatial data. The third is the Cluster object where the results will be stored. The last is the index of the identifier column in your data.
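To make the role of the fields argument concrete, here is a minimal NumPy sketch of K-Means (Lloyd's algorithm) run on columns 0 and 1 of a string array. It is not the implementation used by mygeopackage; the function lloyd_k_means and the small data array are hypothetical stand-ins for illustration.

```python
import numpy as np

def lloyd_k_means(points, n_clusters, n_iter=50, seed=0):
    """Minimal Lloyd's algorithm (illustrative only): returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    # Start from n_clusters distinct rows chosen at random.
    centers = points[rng.choice(len(points), size=n_clusters, replace=False)]
    for _ in range(n_iter):
        # Distance from every point to every center, shape (n_points, n_clusters).
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its members; leave empty clusters in place.
        new_centers = np.array([
            points[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
            for k in range(n_clusters)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels

# Hypothetical stand-in for Geo.data: every value is a string, with x and y
# in columns 0 and 1 and an identifier in column 2.
data = np.array([
    ["-86.20", "34.26", "1"],
    ["-86.21", "34.27", "2"],
    ["-86.22", "34.25", "3"],
    ["-88.22", "30.79", "4"],
    ["-88.17", "30.62", "5"],
    ["-88.20", "30.70", "6"],
])
fields = [0, 1]
points = data[:, fields].astype(float)  # select the clustering fields and cast to float
centers, labels = lloyd_k_means(points, n_clusters=2)
print(centers)
print(labels)
```

The cluster numbers themselves are arbitrary: which group ends up labeled 0 depends on the random starting centers, so only the grouping is meaningful.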
Cluster object attributes¶
cluster_results.cluster_centers
array([[-86.26039022, 34.31085656], [-86.05297041, 32.40150848], [-86.88929961, 33.35613407], [-88.00821942, 30.83185201], [-86.80296795, 34.71117208], [-85.80784885, 32.85064648], [-86.60439067, 33.55683823], [-87.01090287, 32.27727852], [-85.90201452, 31.42272588], [-86.0393075 , 33.51200282]])
Cluster has a cluster_centers attribute, which records the center of every cluster. It is only available for the K-Means algorithm.
cluster_results.labels
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 6, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 9, 9, 9, 9, 4, 6, 6, 6, 6, 0, 0, 0, 0, 0, 6, 6, 6, 6, 6, 7, 1, 1, 1, 7, 4, 2, 2, 2, 2, 4, 4, 0, 1, 2, 5, 5, 5, 5, 5, 6, 6, 6, 3, 9, 1, 5, 5, 3, 8, 3, 9], dtype=int32)
The labels attribute is an array that records the cluster label assigned to each data point.
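Because the labels are a plain NumPy array, they can be summarized with standard NumPy calls. The sketch below counts the points in each cluster and looks up the identifiers of one cluster's members. The labels and ids arrays are short hypothetical examples, not the full output shown above.

```python
import numpy as np

# Hypothetical labels array, shaped like cluster_results.labels.
labels = np.array([0, 0, 2, 6, 2, 4], dtype=np.int32)
# Hypothetical identifiers for the same six rows.
ids = np.array(["1", "2", "22", "23", "24", "39"])

# Count how many points fall in each cluster.
cluster_ids, counts = np.unique(labels, return_counts=True)
sizes = {int(c): int(n) for c, n in zip(cluster_ids, counts)}
print(sizes)

# A boolean mask selects the identifiers that belong to cluster 2.
members_of_2 = ids[labels == 2].tolist()
print(members_of_2)
```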
cluster_results.data
array([['-86.20617875299996', '34.260200473000054', '1', ..., '01026', '01009', '2018-2019'], ['-86.20488875199999', '34.26223247300004', '2', ..., '01026', '01009', '2018-2019'], ['-86.22014875799994', '34.27332447500004', '3', ..., '01026', '01009', '2018-2019'], ..., ['-85.90201452099996', '31.422725875000026', '98', ..., '01091', '01031', '2018-2019'], ['-87.61753298699995', '31.07607574000008', '99', ..., '01064', '01022', '2018-2019'], ['-86.08673267299997', '33.43270630300003', '100', ..., '01035', '01011', '2018-2019']], dtype='<U60')
The data attribute holds the original NumPy array that you passed in.
cluster_results.identifier
2
The identifier attribute holds the index of the identifier column in the data, similar to FID or ObjectID in a shapefile.
DBSCAN¶
DBSCAN is a common density-based clustering method that is also supported in mygeopackage.
dbscan_results = mygeopackage.unsupervised.Cluster(geojson.data[0:100])
mygeopackage.unsupervised.dbscan(0.5,5,[0,1],dbscan_results,2)
dbscan() requires five arguments. The first is eps, the maximum distance between two samples for one to be considered in the neighborhood of the other. The second is min_samples, the number of samples in a neighborhood for a point to be considered a core point. The third is the list of fields used for clustering. The fourth is the Cluster object for storing results. The last is the index of the field that serves as the identifier in the dataset.
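The roles of eps and min_samples can be illustrated with a small NumPy sketch of the DBSCAN idea. This is not the code used by mygeopackage. The function simple_dbscan and the sample points are hypothetical, and two conventions are assumed here: a point counts as its own neighbor, and noise points are labeled -1. Note also that if coordinates are clustered as-is, eps is expressed in the same units as those coordinates.

```python
import numpy as np

def simple_dbscan(points, eps, min_samples):
    """Illustrative DBSCAN: returns a cluster index per point, or -1 for noise."""
    n = len(points)
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    # Neighborhood of each point (assumption: the point itself is included).
    neighbors = [np.flatnonzero(dists[i] <= eps) for i in range(n)]
    is_core = np.array([len(nb) >= min_samples for nb in neighbors])
    labels = np.full(n, -1)
    cluster = 0
    for i in range(n):
        if labels[i] != -1 or not is_core[i]:
            continue
        # Grow a new cluster outward from this unvisited core point.
        labels[i] = cluster
        stack = [i]
        while stack:
            j = stack.pop()
            if not is_core[j]:
                continue  # border points join a cluster but do not expand it
            for k in neighbors[j]:
                if labels[k] == -1:
                    labels[k] = cluster
                    stack.append(k)
        cluster += 1
    return labels

# Two tight groups of three points and one isolated point.
points = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],
    [10.0, 0.0],
])
db_labels = simple_dbscan(points, eps=0.5, min_samples=3)
print(db_labels.tolist())  # [0, 0, 0, 1, 1, 1, -1]
```

Unlike K-Means, the number of clusters is not given in advance: it follows from how many dense regions the two parameters find.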
Cluster class methods¶
cluster_results.show()
The Cluster object also provides a show() method, which plots your cluster results on a map with Folium. Each cluster is given a random color. Click a dot to identify the cluster that point belongs to.
dbscan_results.show()
geoj = cluster_results.toGeoJson()
geoj
'{"type": "FeatureCollection", "name": "K-Means Results", "features": [{"type": "Feature", "properties": {"ID": "1", "Class": 0}, "geometry": {"type": "Point", "coordinates": ["-86.20617875299996", "34.260200473000054"]}}, {"type": "Feature", "properties": {"ID": "2", "Class": 0}, "geometry": {"type": "Point", "coordinates": ["-86.20488875199999", "34.26223247300004"]}}, {"type": "Feature", "properties": {"ID": "3", "Class": 0}, "geometry": {"type": "Point", "coordinates": ["-86.22014875799994", "34.27332447500004"]}}, {"type": "Feature", "properties": {"ID": "4", "Class": 0}, "geometry": {"type": "Point", "coordinates": ["-86.22181075699996", "34.25270647100007"]}}, {"type": "Feature", "properties": {"ID": "5", "Class": 0}, "geometry": {"type": "Point", "coordinates": ["-86.19329375099994", "34.28985548000003"]}}, {"type": "Feature", "properties": {"ID": "6", "Class": 0}, "geometry": {"type": "Point", "coordinates": ["-86.22177475699993", "34.25328347100003"]}}, {"type": "Feature", "properties": {"ID": "7", "Class": 0}, "geometry": {"type": "Point", "coordinates": ["-86.25409278199999", "34.53372752700005"]}}, {"type": "Feature", "properties": {"ID": "8", "Class": 0}, "geometry": {"type": "Point", "coordinates": ["-86.14202374099995", "34.36255549700007"]}}, {"type": "Feature", "properties": {"ID": "9", "Class": 0}, "geometry": {"type": "Point", "coordinates": ["-86.27036577999996", "34.40689250100007"]}}, {"type": "Feature", "properties": {"ID": "10", "Class": 0}, "geometry": {"type": "Point", "coordinates": ["-86.32126378099997", "34.17624045100007"]}}, {"type": "Feature", "properties": {"ID": "11", "Class": 0}, "geometry": {"type": "Point", "coordinates": ["-86.32132578099998", "34.176242450000075"]}}, {"type": "Feature", "properties": {"ID": "12", "Class": 0}, "geometry": {"type": "Point", "coordinates": ["-86.44208282499994", "34.34452248100007"]}}, {"type": "Feature", "properties": {"ID": "13", "Class": 0}, "geometry": {"type": "Point", "coordinates": 
["-86.25409178199999", "34.53434752800007"]}}, {"type": "Feature", "properties": {"ID": "14", "Class": 0}, "geometry": {"type": "Point", "coordinates": ["-86.44681682999999", "34.399972492000074"]}}, {"type": "Feature", "properties": {"ID": "15", "Class": 0}, "geometry": {"type": "Point", "coordinates": ["-86.28585877799998", "34.30545447900005"]}}, {"type": "Feature", "properties": {"ID": "16", "Class": 0}, "geometry": {"type": "Point", "coordinates": ["-86.28585877799998", "34.30545447900005"]}}, {"type": "Feature", "properties": {"ID": "17", "Class": 0}, "geometry": {"type": "Point", "coordinates": ["-86.32251178099995", "34.17678045100007"]}}, {"type": "Feature", "properties": {"ID": "18", "Class": 0}, "geometry": {"type": "Point", "coordinates": ["-86.42188082099995", "34.37640648800004"]}}, {"type": "Feature", "properties": {"ID": "19", "Class": 0}, "geometry": {"type": "Point", "coordinates": ["-86.25415678299998", "34.534345528000074"]}}, {"type": "Feature", "properties": {"ID": "20", "Class": 0}, "geometry": {"type": "Point", "coordinates": ["-86.32089978", "34.17754245100008"]}}, {"type": "Feature", "properties": {"ID": "21", "Class": 0}, "geometry": {"type": "Point", "coordinates": ["-86.14142673999999", "34.36274049600007"]}}, {"type": "Feature", "properties": {"ID": "22", "Class": 2}, "geometry": {"type": "Point", "coordinates": ["-86.84473788299994", "33.34089225500003"]}}, {"type": "Feature", "properties": {"ID": "23", "Class": 6}, "geometry": {"type": "Point", "coordinates": ["-86.65855183399998", "33.413053277000074"]}}, {"type": "Feature", "properties": {"ID": "24", "Class": 2}, "geometry": {"type": "Point", "coordinates": ["-86.83768788099997", "33.34437625600003"]}}, {"type": "Feature", "properties": {"ID": "25", "Class": 2}, "geometry": {"type": "Point", "coordinates": ["-86.73218485399997", "33.39565427100007"]}}, {"type": "Feature", "properties": {"ID": "26", "Class": 2}, "geometry": {"type": "Point", "coordinates": ["-86.87839489199996", 
"33.33753325400005"]}}, {"type": "Feature", "properties": {"ID": "27", "Class": 2}, "geometry": {"type": "Point", "coordinates": ["-86.85480088499997", "33.33092925200003"]}}, {"type": "Feature", "properties": {"ID": "28", "Class": 2}, "geometry": {"type": "Point", "coordinates": ["-86.72878985299997", "33.39271627100004"]}}, {"type": "Feature", "properties": {"ID": "29", "Class": 2}, "geometry": {"type": "Point", "coordinates": ["-86.86736889099996", "33.36390225900004"]}}, {"type": "Feature", "properties": {"ID": "30", "Class": 2}, "geometry": {"type": "Point", "coordinates": ["-86.85375988999994", "33.40958526900005"]}}, {"type": "Feature", "properties": {"ID": "31", "Class": 2}, "geometry": {"type": "Point", "coordinates": ["-86.80882687599996", "33.39571826800005"]}}, {"type": "Feature", "properties": {"ID": "32", "Class": 2}, "geometry": {"type": "Point", "coordinates": ["-86.83123788099994", "33.388157266000064"]}}, {"type": "Feature", "properties": {"ID": "33", "Class": 2}, "geometry": {"type": "Point", "coordinates": ["-86.83367688199996", "33.38790126600003"]}}, {"type": "Feature", "properties": {"ID": "34", "Class": 2}, "geometry": {"type": "Point", "coordinates": ["-86.76681086399998", "33.40915027200003"]}}, {"type": "Feature", "properties": {"ID": "35", "Class": 2}, "geometry": {"type": "Point", "coordinates": ["-86.82441588199998", "33.426807274000055"]}}, {"type": "Feature", "properties": {"ID": "36", "Class": 2}, "geometry": {"type": "Point", "coordinates": ["-86.81286987499999", "33.359597260000044"]}}, {"type": "Feature", "properties": {"ID": "37", "Class": 2}, "geometry": {"type": "Point", "coordinates": ["-86.88919489599994", "33.344128254000054"]}}, {"type": "Feature", "properties": {"ID": "38", "Class": 2}, "geometry": {"type": "Point", "coordinates": ["-86.84093688199994", "33.33761025500007"]}}, {"type": "Feature", "properties": {"ID": "39", "Class": 4}, "geometry": {"type": "Point", "coordinates": ["-86.71371392399999", 
"34.705823545000044"]}}, {"type": "Feature", "properties": {"ID": "40", "Class": 4}, "geometry": {"type": "Point", "coordinates": ["-86.74270193299998", "34.740820550000024"]}}, {"type": "Feature", "properties": {"ID": "41", "Class": 4}, "geometry": {"type": "Point", "coordinates": ["-86.74254993199997", "34.71916454700005"]}}, {"type": "Feature", "properties": {"ID": "42", "Class": 4}, "geometry": {"type": "Point", "coordinates": ["-86.75046893299998", "34.697460542000044"]}}, {"type": "Feature", "properties": {"ID": "43", "Class": 4}, "geometry": {"type": "Point", "coordinates": ["-86.75115993499998", "34.721200546000034"]}}, {"type": "Feature", "properties": {"ID": "44", "Class": 4}, "geometry": {"type": "Point", "coordinates": ["-86.78323794399995", "34.72108154500006"]}}, {"type": "Feature", "properties": {"ID": "45", "Class": 4}, "geometry": {"type": "Point", "coordinates": ["-86.73306193199994", "34.75122955300003"]}}, {"type": "Feature", "properties": {"ID": "46", "Class": 4}, "geometry": {"type": "Point", "coordinates": ["-86.77920194399996", "34.73367654700007"]}}, {"type": "Feature", "properties": {"ID": "47", "Class": 4}, "geometry": {"type": "Point", "coordinates": ["-86.76605093999996", "34.734321548000025"]}}, {"type": "Feature", "properties": {"ID": "48", "Class": 4}, "geometry": {"type": "Point", "coordinates": ["-86.77201793999996", "34.70119254100007"]}}, {"type": "Feature", "properties": {"ID": "49", "Class": 4}, "geometry": {"type": "Point", "coordinates": ["-86.79180694499996", "34.705892541000026"]}}, {"type": "Feature", "properties": {"ID": "50", "Class": 9}, "geometry": {"type": "Point", "coordinates": ["-86.09853867699996", "33.431966302000035"]}}, {"type": "Feature", "properties": {"ID": "51", "Class": 9}, "geometry": {"type": "Point", "coordinates": ["-86.08969367399999", "33.43358030400003"]}}, {"type": "Feature", "properties": {"ID": "52", "Class": 9}, "geometry": {"type": "Point", "coordinates": ["-86.11748168099996", 
"33.42728530100004"]}}, {"type": "Feature", "properties": {"ID": "53", "Class": 9}, "geometry": {"type": "Point", "coordinates": ["-86.11771068199994", "33.42544630100008"]}}, {"type": "Feature", "properties": {"ID": "54", "Class": 4}, "geometry": {"type": "Point", "coordinates": ["-86.59045988899999", "34.722438553000075"]}}, {"type": "Feature", "properties": {"ID": "55", "Class": 6}, "geometry": {"type": "Point", "coordinates": ["-86.54809880999994", "33.54516231000002"]}}, {"type": "Feature", "properties": {"ID": "56", "Class": 6}, "geometry": {"type": "Point", "coordinates": ["-86.53689880699994", "33.54936631100003"]}}, {"type": "Feature", "properties": {"ID": "57", "Class": 6}, "geometry": {"type": "Point", "coordinates": ["-86.57418681699994", "33.54758030900007"]}}, {"type": "Feature", "properties": {"ID": "58", "Class": 6}, "geometry": {"type": "Point", "coordinates": ["-86.56431581499999", "33.542876308000075"]}}, {"type": "Feature", "properties": {"ID": "59", "Class": 0}, "geometry": {"type": "Point", "coordinates": ["-86.15793873699994", "34.212232465000056"]}}, {"type": "Feature", "properties": {"ID": "60", "Class": 0}, "geometry": {"type": "Point", "coordinates": ["-86.16392373799994", "34.21179246500003"]}}, {"type": "Feature", "properties": {"ID": "61", "Class": 0}, "geometry": {"type": "Point", "coordinates": ["-86.17547774099995", "34.20484546300003"]}}, {"type": "Feature", "properties": {"ID": "62", "Class": 0}, "geometry": {"type": "Point", "coordinates": ["-86.16134973799996", "34.21353146400003"]}}, {"type": "Feature", "properties": {"ID": "63", "Class": 0}, "geometry": {"type": "Point", "coordinates": ["-86.17809174099995", "34.20647646300006"]}}, {"type": "Feature", "properties": {"ID": "64", "Class": 6}, "geometry": {"type": "Point", "coordinates": ["-86.59365582899994", "33.65583533100005"]}}, {"type": "Feature", "properties": {"ID": "65", "Class": 6}, "geometry": {"type": "Point", "coordinates": ["-86.56761182199995", 
"33.64932933000006"]}}, {"type": "Feature", "properties": {"ID": "66", "Class": 6}, "geometry": {"type": "Point", "coordinates": ["-86.59076982899995", "33.66569033300004"]}}, {"type": "Feature", "properties": {"ID": "67", "Class": 6}, "geometry": {"type": "Point", "coordinates": ["-86.62363083399998", "33.59955031800007"]}}, {"type": "Feature", "properties": {"ID": "68", "Class": 6}, "geometry": {"type": "Point", "coordinates": ["-86.60395983099994", "33.629264325000065"]}}, {"type": "Feature", "properties": {"ID": "69", "Class": 7}, "geometry": {"type": "Point", "coordinates": ["-87.07761288499995", "32.11162598400006"]}}, {"type": "Feature", "properties": {"ID": "70", "Class": 1}, "geometry": {"type": "Point", "coordinates": ["-86.30444067999997", "32.37480006800007"]}}, {"type": "Feature", "properties": {"ID": "71", "Class": 1}, "geometry": {"type": "Point", "coordinates": ["-86.20185665099996", "32.36150206900004"]}}, {"type": "Feature", "properties": {"ID": "72", "Class": 1}, "geometry": {"type": "Point", "coordinates": ["-86.20185665099996", "32.36150206900004"]}}, {"type": "Feature", "properties": {"ID": "73", "Class": 7}, "geometry": {"type": "Point", "coordinates": ["-86.94419286399994", "32.442931061000024"]}}, {"type": "Feature", "properties": {"ID": "74", "Class": 4}, "geometry": {"type": "Point", "coordinates": ["-87.03808301", "34.61919551400007"]}}, {"type": "Feature", "properties": {"ID": "75", "Class": 2}, "geometry": {"type": "Point", "coordinates": ["-86.80389588", "33.504861291000054"]}}, {"type": "Feature", "properties": {"ID": "76", "Class": 2}, "geometry": {"type": "Point", "coordinates": ["-86.87386289599993", "33.43213127300004"]}}, {"type": "Feature", "properties": {"ID": "77", "Class": 2}, "geometry": {"type": "Point", "coordinates": ["-87.12222195499999", "33.22599622000007"]}}, {"type": "Feature", "properties": {"ID": "78", "Class": 2}, "geometry": {"type": "Point", "coordinates": ["-87.48194205399994", "33.187760199000024"]}}, 
{"type": "Feature", "properties": {"ID": "79", "Class": 4}, "geometry": {"type": "Point", "coordinates": ["-87.31212709099998", "34.670793513000035"]}}, {"type": "Feature", "properties": {"ID": "80", "Class": 4}, "geometry": {"type": "Point", "coordinates": ["-86.77787694299997", "34.72328954500006"]}}, {"type": "Feature", "properties": {"ID": "81", "Class": 0}, "geometry": {"type": "Point", "coordinates": ["-86.44100183799998", "34.57942052900006"]}}, {"type": "Feature", "properties": {"ID": "82", "Class": 1}, "geometry": {"type": "Point", "coordinates": ["-85.66181849799995", "32.37484009100007"]}}, {"type": "Feature", "properties": {"ID": "83", "Class": 2}, "geometry": {"type": "Point", "coordinates": ["-87.18767396999993", "33.163407204000066"]}}, {"type": "Feature", "properties": {"ID": "84", "Class": 5}, "geometry": {"type": "Point", "coordinates": ["-85.94879860799995", "32.945081205000065"]}}, {"type": "Feature", "properties": {"ID": "85", "Class": 5}, "geometry": {"type": "Point", "coordinates": ["-85.95077160799997", "32.937754202000065"]}}, {"type": "Feature", "properties": {"ID": "86", "Class": 5}, "geometry": {"type": "Point", "coordinates": ["-85.93069860199995", "32.93116820200004"]}}, {"type": "Feature", "properties": {"ID": "87", "Class": 5}, "geometry": {"type": "Point", "coordinates": ["-85.94903660899996", "32.955527207000046"]}}, {"type": "Feature", "properties": {"ID": "88", "Class": 5}, "geometry": {"type": "Point", "coordinates": ["-85.97262361399999", "32.92946320100003"]}}, {"type": "Feature", "properties": {"ID": "89", "Class": 6}, "geometry": {"type": "Point", "coordinates": ["-86.66513283999996", "33.48039629200008"]}}, {"type": "Feature", "properties": {"ID": "90", "Class": 6}, "geometry": {"type": "Point", "coordinates": ["-86.66513283999996", "33.48039629200008"]}}, {"type": "Feature", "properties": {"ID": "91", "Class": 6}, "geometry": {"type": "Point", "coordinates": ["-86.66513283999996", "33.48039629200008"]}}, {"type": 
"Feature", "properties": {"ID": "92", "Class": 3}, "geometry": {"type": "Point", "coordinates": ["-88.22931614399994", "30.794651658000078"]}}, {"type": "Feature", "properties": {"ID": "93", "Class": 9}, "geometry": {"type": "Point", "coordinates": ["-85.72568759699999", "33.921032420000074"]}}, {"type": "Feature", "properties": {"ID": "94", "Class": 1}, "geometry": {"type": "Point", "coordinates": ["-85.89487957199998", "32.53489811800006"]}}, {"type": "Feature", "properties": {"ID": "95", "Class": 5}, "geometry": {"type": "Point", "coordinates": ["-85.47482445599996", "32.61338415000006"]}}, {"type": "Feature", "properties": {"ID": "96", "Class": 5}, "geometry": {"type": "Point", "coordinates": ["-85.42818844499999", "32.64214715800006"]}}, {"type": "Feature", "properties": {"ID": "97", "Class": 3}, "geometry": {"type": "Point", "coordinates": ["-88.17780912199999", "30.62482862300004"]}}, {"type": "Feature", "properties": {"ID": "98", "Class": 8}, "geometry": {"type": "Point", "coordinates": ["-85.90201452099996", "31.422725875000026"]}}, {"type": "Feature", "properties": {"ID": "99", "Class": 3}, "geometry": {"type": "Point", "coordinates": ["-87.61753298699995", "31.07607574000008"]}}, {"type": "Feature", "properties": {"ID": "100", "Class": 9}, "geometry": {"type": "Point", "coordinates": ["-86.08673267299997", "33.43270630300003"]}}]}'
Finally, toGeoJson() converts your results to a JSON string in GeoJSON format, which you can save for later use.
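One way to save the string is with Python's standard json and pathlib modules, as sketched below. The geoj string is a hypothetical stand-in trimmed to two features in the same shape as the output above, and the file name is an assumption. Since the coordinates in that output are strings while the GeoJSON specification expects numbers, the sketch also converts them to floats before writing, which some GeoJSON readers may require.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical stand-in for the string returned by toGeoJson(), trimmed to two features.
geoj = (
    '{"type": "FeatureCollection", "name": "K-Means Results", "features": ['
    '{"type": "Feature", "properties": {"ID": "1", "Class": 0}, '
    '"geometry": {"type": "Point", "coordinates": ["-86.20617875299996", "34.260200473000054"]}}, '
    '{"type": "Feature", "properties": {"ID": "22", "Class": 2}, '
    '"geometry": {"type": "Point", "coordinates": ["-86.84473788299994", "33.34089225500003"]}}]}'
)

# Parse the string and convert string coordinates to numbers.
collection = json.loads(geoj)
for feature in collection["features"]:
    geometry = feature["geometry"]
    geometry["coordinates"] = [float(c) for c in geometry["coordinates"]]

# Write to a file (assumed name) in the system temporary directory.
out_path = Path(tempfile.gettempdir()) / "kmeans_results.geojson"
out_path.write_text(json.dumps(collection), encoding="utf-8")

# Read the file back and map each identifier to its cluster.
parsed = json.loads(out_path.read_text(encoding="utf-8"))
classes = {f["properties"]["ID"]: f["properties"]["Class"] for f in parsed["features"]}
print(classes)
```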