How to use Manhattan distance for SpectralClustering in sklearn

I am trying to use the Manhattan distance with SpectralClustering() in sklearn. I set the affinity parameter to 'manhattan', but I get the following error:

ValueError: Unknown kernel 'manhattan'

What is the proper kernel name to use for it? Can anyone help? Basically, I want to use SpectralClustering to get k-means-style clustering with the Manhattan distance metric.

Here is the line of code that sets up SpectralClustering():

clustering = SpectralClustering(n_clusters=10, affinity='manhattan', assign_labels="kmeans")
clustering.fit(X)


Solution 1:[1]

Manhattan distance is not supported by sklearn.metrics.pairwise_kernels, which is the reason for the ValueError.

From the documentation:

Valid values for metric are:
['rbf', 'sigmoid', 'polynomial', 'poly', 'linear', 'cosine']
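
The exact list depends on the installed scikit-learn version, so it may be worth checking it directly rather than relying on a quoted copy. A minimal sketch, assuming the helper functions kernel_metrics() and distance_metrics() from sklearn.metrics.pairwise are available in your version (the variable names are only illustrative):

```python
from sklearn.metrics.pairwise import distance_metrics, kernel_metrics

# Names accepted by pairwise_kernels, and therefore as a string `affinity`.
kernel_names = sorted(kernel_metrics())
# Names accepted by pairwise_distances.
distance_names = sorted(distance_metrics())

print("kernels:  ", kernel_names)
print("distances:", distance_names)

# 'manhattan' is registered as a distance, not as a kernel,
# which is consistent with the "Unknown kernel" error above.
print("'manhattan' is a kernel:  ", "manhattan" in kernel_names)
print("'manhattan' is a distance:", "manhattan" in distance_names)
```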

The linear kernel and the Manhattan distance are different things, as the following example shows:

>>> import numpy as np
>>> from sklearn.metrics import pairwise_distances
>>> from sklearn.metrics.pairwise import pairwise_kernels
>>> X = np.array([[2, 3], [3, 5], [5, 8]])
>>> Y = np.array([[1, 0], [2, 1]])
>>> pairwise_distances(X, Y, metric='manhattan')
array([[ 4.,  2.],
       [ 7.,  5.],
       [12., 10.]])
>>> pairwise_kernels(X, Y, metric='linear')
array([[ 2.,  7.],
       [ 3., 11.],
       [ 5., 18.]])

The Manhattan distance function is available through sklearn.metrics.pairwise_distances.

Now, a simple way to use the Manhattan distance with spectral clustering would be:

>>> from sklearn.cluster import SpectralClustering
>>> from sklearn.metrics import pairwise_distances
>>> import numpy as np
>>> X = np.array([[1, 1], [2, 1], [1, 0],
...               [4, 7], [3, 5], [3, 6]])

>>> X_precomputed = pairwise_distances(X, metric='manhattan')
>>> clustering = SpectralClustering(n_clusters=2, affinity='precomputed',
...                                 assign_labels='discretize', random_state=0)
>>> clustering.fit(X_precomputed)
>>> clustering.labels_

Solution 2:[2]

The official documentation on Spectral Clustering tells you that you can use anything supported by sklearn.metrics.pairwise_kernels. Unfortunately there is no pairwise kernel for the Manhattan distance yet.

If something similar suffices, you could use the linear kernel like this:

clustering = SpectralClustering(n_clusters=10, affinity='linear', assign_labels="kmeans")
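
Depending on your scikit-learn version, pairwise_kernels may also offer a 'laplacian' kernel, defined as exp(-gamma * L1 distance), which is built on the Manhattan distance and so may be closer to what the question asks for than 'linear'. The sketch below assumes that kernel is available in your install. It reuses the small array from Solution 1, and the gamma value is only illustrative, not tuned.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.metrics import pairwise_distances

X = np.array([[1, 1], [2, 1], [1, 0],
              [4, 7], [3, 5], [3, 6]])

gamma = 1.0  # illustrative bandwidth, an assumption rather than a recommendation

laplacian_clustering = SpectralClustering(
    n_clusters=2,
    affinity="laplacian",  # assumes this kernel exists in your sklearn version
    gamma=gamma,
    assign_labels="kmeans",
    random_state=0,
).fit(X)

# Sanity check: the affinity used internally should equal
# exp(-gamma * Manhattan distance).
expected_affinity = np.exp(-gamma * pairwise_distances(X, metric="manhattan"))
matches = np.allclose(laplacian_clustering.affinity_matrix_, expected_affinity)

print("affinity matches exp(-gamma * L1):", matches)
print("labels:", laplacian_clustering.labels_)
```

Which integer each cluster receives is arbitrary, so compare groupings rather than the raw label values.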

Solution 3:[3]

The elements of the precomputed matrix should be similarities rather than distances. You can use a Gaussian kernel to do this transformation.
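
A minimal sketch of that transformation, assuming a Gaussian of the form exp(-d^2 / (2 * sigma^2)) applied elementwise to the Manhattan distance matrix. The bandwidth sigma is a hypothetical value picked for this toy data and would need tuning on real data.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.metrics import pairwise_distances

X = np.array([[1, 1], [2, 1], [1, 0],
              [4, 7], [3, 5], [3, 6]])

# Pairwise Manhattan (L1) distances: small value = similar points.
distances = pairwise_distances(X, metric="manhattan")

# Turn distances into similarities: large value = similar points.
sigma = 2.0  # hypothetical bandwidth, chosen only for this example
affinity = np.exp(-distances ** 2 / (2.0 * sigma ** 2))

clustering = SpectralClustering(
    n_clusters=2,
    affinity="precomputed",  # the matrix is used as-is as the affinity
    assign_labels="discretize",
    random_state=0,
)
labels = clustering.fit_predict(affinity)
print(labels)
```

The resulting matrix is symmetric, has ones on the diagonal, and decays toward zero for distant points, which is what affinity='precomputed' expects.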

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Venkatachalam
Solution 2
Solution 3 neo