DBSCAN clustering algorithm


amira ali

May 20, 2012, 5:55:07 AM
Hello,

I need code for the DBSCAN clustering algorithm. The code below does not return cluster centers the way k-means does. I want a clustering algorithm that removes outliers and noise, which k-means does not do.





% -------------------------------------------------------------------------
% Function: [class,type]=dbscan(x,k,Eps)
% -------------------------------------------------------------------------
% Aim:
% Clustering the data with Density-Based Spatial Clustering of Applications with Noise (DBSCAN)
% -------------------------------------------------------------------------
% Input:
% x - data set (m,n); m-objects, n-variables
% k - number of objects in a neighborhood of an object
% (minimal number of objects considered as a cluster)
% Eps - neighborhood radius; if not known, omit this parameter or pass []
% -------------------------------------------------------------------------
% Output:
% class - vector specifying assignment of the i-th object to certain
% cluster (1,m)
% type - vector specifying type of the i-th object
% (core: 1, border: 0, outlier: -1)
% -------------------------------------------------------------------------
% Example of use:
% x=[randn(30,2)*.4;randn(40,2)*.5+ones(40,1)*[4 4]];
% [class,type]=dbscan(x,5,[])
% clusteringfigs('Dbscan',x,[1 2],class,type)
% -------------------------------------------------------------------------
% References:
% [1] M. Ester, H. Kriegel, J. Sander, X. Xu, A density-based algorithm for
% discovering clusters in large spatial databases with noise, proc.
% 2nd Int. Conf. on Knowledge Discovery and Data Mining, Portland, OR, 1996,
% p. 226, available from:
% www.dbs.informatik.uni-muenchen.de/cgi-bin/papers?query=--CO
% [2] M. Daszykowski, B. Walczak, D. L. Massart, Looking for
% Natural Patterns in Data. Part 1: Density Based Approach,
% Chemom. Intell. Lab. Syst. 56 (2001) 83-92
% -------------------------------------------------------------------------
% Written by Michal Daszykowski
% Department of Chemometrics, Institute of Chemistry,
% The University of Silesia
% December 2004
% http://www.chemometria.us.edu.pl

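function [class,type]=dbscan(x,k,Eps)

[m,n]=size(x);

% Estimate the radius when it is missing or empty
if nargin<3 || isempty(Eps)
    Eps=epsilon(x,k);
end

x=[(1:m)' x];          % prepend the object index as column 1
[m,n]=size(x);
type=zeros(1,m);
class=zeros(1,m);      % preallocate; 0 = not yet assigned
no=1;                  % current cluster number
touched=zeros(m,1);

for i=1:m
    if touched(i)==0
        ob=x(i,:);
        D=dist(ob(2:n),x(:,2:n));
        ind=find(D<=Eps);   % neighbors of object i (includes i itself)

        % Too few neighbors for a core point: provisional border/noise
        if length(ind)>1 && length(ind)<k+1
            type(i)=0;
            class(i)=0;
        end
        % No neighbor besides itself: outlier
        if length(ind)==1
            type(i)=-1;
            class(i)=-1;
            touched(i)=1;
        end

        % Core point: start a new cluster and grow it
        if length(ind)>=k+1
            type(i)=1;
            class(ind)=no;

            while ~isempty(ind)
                ob=x(ind(1),:);
                touched(ind(1))=1;
                ind(1)=[];
                D=dist(ob(2:n),x(:,2:n));
                i1=find(D<=Eps);

                % Note: neighbors are labeled and queued even when ob
                % is not a core point
                if length(i1)>1
                    class(i1)=no;
                    if length(i1)>=k+1
                        type(ob(1))=1;
                    else
                        type(ob(1))=0;
                    end

                    % Queue unvisited neighbors (j avoids reusing the outer i)
                    for j=1:length(i1)
                        if touched(i1(j))==0
                            touched(i1(j))=1;
                            ind=[ind i1(j)];
                            class(i1(j))=no;
                        end
                    end
                end
            end
            no=no+1;
        end
    end
end

% Objects never absorbed by a cluster are noise
i1=find(class==0);
class(i1)=-1;
type(i1)=-1;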


%...........................................
function [Eps]=epsilon(x,k)

% Function: [Eps]=epsilon(x,k)
%
% Aim:
% Analytical way of estimating neighborhood radius for DBSCAN
%
% Input:
% x - data matrix (m,n); m-objects, n-variables
% k - number of objects in a neighborhood of an object
% (minimal number of objects considered as a cluster)



[m,n]=size(x);

Eps=((prod(max(x)-min(x))*k*gamma(.5*n+1))/(m*sqrt(pi.^n))).^(1/n);


%............................................
function [D]=dist(i,x)

% function: [D]=dist(i,x)
%
% Aim:
% Calculates the Euclidean distances between the i-th object and all objects in x
%
% Input:
% i - an object (1,n)
% x - data matrix (m,n); m-objects, n-variables
%
% Output:
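% D - Euclidean distance (1,m)

m=size(x,1);
% Sum squared differences along each row; the explicit dimension argument
% also covers the one-variable case, so no special branch is needed.
D=sqrt(sum((ones(m,1)*i-x).^2,2))';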
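
DBSCAN does not define cluster centers, because its clusters can have arbitrary shape. If a k-means-style center is still wanted, one option is to average the members of each cluster after the fact. The following is a minimal sketch, assuming the labels use positive integers for clusters and -1 for noise, as in the function above. The names `labels`, `ids`, and `centers` are hypothetical, and a hand-written label vector stands in for the output of `dbscan` so the sketch runs on its own.

```matlab
% Toy data: two tight groups plus one far-away point.
x = [0 0; 0 1; 1 0; 10 10; 10 11; 50 50];
% Labels in the format reported by dbscan above: cluster ids > 0, noise = -1.
% (Hand-written here; in practice use [labels, types] = dbscan(x, k, Eps).)
labels = [1 1 1 2 2 -1];

ids = unique(labels(labels > 0));          % cluster ids, noise excluded
centers = zeros(numel(ids), size(x, 2));   % one row per cluster
for c = 1:numel(ids)
    members = x(labels == ids(c), :);      % rows belonging to cluster ids(c)
    centers(c, :) = mean(members, 1);      % centroid = per-column mean
end
disp(centers)
```

For a non-convex cluster the centroid can fall outside the cluster, so it is best treated as a summary rather than as a representative point.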

amira ali

May 24, 2012, 6:51:16 PM
"amira ali" wrote in message <jpaf1r$n0i$1...@newscl01ah.mathworks.com>...
I ran the code and it removed the noise, so I now get clusters without noise. But how can I use it on test data?
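
DBSCAN has no built-in prediction step. One common workaround is to give a new point the label of its nearest core point when that core point lies within Eps, and to call it noise otherwise. The following is a minimal sketch under that assumption. The variable names and the value of `Eps` are hypothetical, and the training labels are hand-written in the format returned by `dbscan` above so the sketch runs on its own.

```matlab
% Training data with labels and types in the format returned by dbscan above.
xtrain = [0 0; 0 1; 1 0; 10 10; 10 11; 11 10];
labels = [1 1 1 2 2 2];      % cluster id per training point
types  = [1 1 1 1 1 1];      % 1 = core point
Eps    = 2;                  % radius used for clustering (assumed value)

xtest = [0.5 0.5; 10.5 10.5; 30 30];
core       = xtrain(types == 1, :);      % keep core points only
coreLabels = labels(types == 1);

testLabels = -ones(1, size(xtest, 1));   % default: noise
for t = 1:size(xtest, 1)
    diffs = core - repmat(xtest(t, :), size(core, 1), 1);
    d = sqrt(sum(diffs.^2, 2));          % Euclidean distance to each core point
    [dmin, nearest] = min(d);
    if dmin <= Eps
        testLabels(t) = coreLabels(nearest);
    end
end
disp(testLabels)
```

Eps has to be the radius that was used for clustering. The function above does not return it, so when Eps was estimated automatically it needs to be computed separately and passed in explicitly.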

verma...@gmail.com

Feb 13, 2016, 12:57:57 PM
How can I apply DBSCAN to a dataset of packet information from different computers connected through a router? My aim is to cluster the computers based on their MAC addresses. The main problem is that all the information is in string form, and I do not know how to define a distance measure between the strings. Can you please help me?
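
DBSCAN only needs pairwise distances, so string data can be used once a distance between two strings is defined. The following is a minimal sketch of one possible choice: parse each MAC address into its six octets and count the octets that differ (a Hamming distance). The addresses and variable names are made up for illustration, and whether octet-level similarity is meaningful for the goal is an assumption to check.

```matlab
% Example addresses (made up for illustration).
macs = {'00:1A:2B:3C:4D:5E', '00:1A:2B:3C:4D:5F', ...
        '00:1A:2B:00:00:01', 'F4:5C:89:AA:BB:CC'};
m = numel(macs);

% Parse each address into six numeric octets (0..255).
bytes = zeros(m, 6);
for r = 1:m
    bytes(r, :) = sscanf(macs{r}, '%2x:%2x:%2x:%2x:%2x:%2x')';
end

% Pairwise Hamming distance: number of octets that differ.
D = zeros(m);
for a = 1:m
    for b = 1:m
        D(a, b) = sum(bytes(a, :) ~= bytes(b, :));
    end
end
disp(D)
```

The `dbscan` function posted above computes Euclidean distances from coordinates internally, so using a matrix like `D` would require changing its neighbor search to look up rows of `D` instead of calling `dist`.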

verma...@gmail.com

Feb 13, 2016, 1:05:49 PM
On Sunday, May 20, 2012 at 3:25:07 PM UTC+5:30, amira ali wrote: