'Double lexicographic sorting of binary (0-1) matrices
I want to sort a binary matrix so that its columns and rows are both in lexicographical order by switching rows and columns. In other words, I need a double-lexical ordering of a matrix consisting of zeros and ones, while the entries of all rows and columns remain in their respective row/column. This effectively fills the bottom right side of the resulting matrix with as many 1
s as possible without breaking the columns/rows. (In my case I need the top left side filled, so I simply flipped the result.)
Can someone provide me with a link to code doing that? A very appreciated bonus would be if the algorithm works with tensors of any dimension.
I first tried to sort all rows lexicographically and then all columns, repeating this until the obtained matrix did not change anymore. I compared this with first sorting the columns and used the better result. However with this the result seemed to get stuck in some sort of local optimum and did not produce the best possible result. So I looked for existing algorithms but did not find anything implemented and quickly usable.
I found an old paper on sciencedirect.com [1] that does exactly that in O(mn) but have trouble implementing it. Maybe someone with access can help me with the code and especially with
- the
pointers
which are meant to keep track of the column blocks that were not yet split in a call todlrefine
. - And with the
bucket
filling where it can fail some times.
[1]: Spinrad, Jeremy P. "Doubly lexical ordering of dense 0–1 matrices." Information Processing Letters 45.5 (1993): 229-235.
Here is the erroneous Matlab code from my implementation attempt:
function [A,I]=dlpartition(M)
Rout={};
RP={1:size(M,1)};
Rlast=1;
CP={1:size(M,2)};
pointers=ones(1,size(M,1));
while Rlast>0
R=RP{Rlast};
C=[];
if any(pointers(R)>0)
Clast=pointers(R);
Clast=min(Clast(Clast>0));
C=CP{Clast}; %the last column block unexamined by R
end
if isempty(C)
Rlast=Rlast-1;
Rout=[R Rout];
else
[R,C,pointers]=dlrefine(M,R,C,pointers-Clast+1);
pointers=pointers+Clast-1;
RP=[RP{1:Rlast-1} R RP{Rlast+1:end}];
Rlast=Rlast-1+length(R);
CP=[CP{1:Clast-1} C CP{Clast+1:end}];
end
end
%prepare output to match my needs
I=reshape(1:numel(M),size(M));
I=I(flip([Rout{:}]),:);
I=I(:,flip([CP{:}]));
A=M(I);
end
function [R,C, pointers]=dlrefine(M,R,C,pointers)
%REFINE Spinrad 1992
%M Binary matrix to sort
%R Array of rows in current block
%C Array of columns in current block
%pointers Array of pointers from all rows to the
% column block that wasn't split yet by it
% with offset, such that 1 is the current
% block C
C={C};
for r=R
c=length(C);
C2=[];
if pointers(r) >= c %if Cc has not been split by r
while isempty(C2) && c > 0
C1=intersect(C{c},find(M(r,:)));
C2=setdiff(C{c},C1);
if ~isempty(C1) && ~isempty(C2)
C={C{1:c-1} C2 C1 C{c+1:end}};
pointers=pointers+(pointers>=c);
%pointers(r)=c-1; %TESTING (normally uncommented)
end
%if isempty(C2) %TESTING (normally uncommented)
c=c-1;
%end %TESTING (normally uncommented)
end
pointers(r)=c; %TESTING (normally commented out)
end
end
bucket=cell(1,length(C));
failed=[];
for r=R
success=false;
for sub=length(C):-1:1
if any(ismember(C{sub},find(1-M(r,:))))
bucket{sub}=[bucket{sub} r];
success=true;
break;
end
end
if ~success
failed=[failed r];
end
end
R=bucket(end:-1:1);
R=[R(~cellfun('isempty',R)) failed];
end
For convenience, you can run the dlpartition
function with the same matrix as used in the paper:
M=[0 1 0 1 1 0;
1 0 1 1 1 0;
0 1 1 1 1 0;
1 1 0 0 0 1;
0 1 0 1 0 1;
1 0 0 1 1 1];
[A I]=dlpartition(M)
Edit:
The code works for the small example given in the paper and produces:
A = 1 1 1 1 0 0
1 1 1 0 0 0
1 1 0 1 1 0
1 1 0 0 1 1
1 0 1 0 0 1
0 0 1 0 1 1
But it fails to generate the global optimum for arbitrary matrices. For example:
M=[1 0 1 0 0
1 1 0 0 1
1 0 0 0 1
0 0 1 1 1
0 0 1 0 1];
[A I]=dlpartition(M)
%this gives but should give
%A = 1 1 0 0 0 1 1 1 0 0
% 1 0 1 1 0 1 1 0 0 0
% 1 0 1 0 0 1 0 0 1 0
% 0 1 1 0 1 0 1 0 1 1
% 0 1 1 0 0 0 1 0 1 0
Solution 1:[1]
I note on that mse page mentioned by @dunkelkoon that the ranking reaches a minimum with the iterative sort, but does not distinguish between forms like this
0 0 0 0 0 0
0 0 1 and 0 1 1
1 1 0 1 0 0
Perhaps the algorithm by Anna Lubiw referenced in this SO question would be of interest.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | smichr |