How do I delete duplicate rows in an array?
Given a NumPy array:
[[1 0 1 0 0 1]
 [1 1 1 0 0 1]
 [1 0 1 0 0 1]
 [1 0 1 0 1 1]
 [1 0 1 0 0 1]]
import numpy as np

a = [[1, 0, 1, 0, 0, 1],
     [1, 1, 1, 0, 0, 1],
     [1, 0, 1, 0, 0, 1],
     [1, 0, 1, 0, 1, 1],
     [1, 0, 1, 0, 0, 1]]
b = np.array(a)
How can I delete duplicate rows? That is, the output should be:
[[1 0 1 0 0 1]
 [1 1 1 0 0 1]
 [1 0 1 0 1 1]]
1 answer
In [23]: np.unique(b, axis=0)
Out[23]:
array([[1, 0, 1, 0, 0, 1],
       [1, 0, 1, 0, 1, 1],
       [1, 1, 1, 0, 0, 1]])
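Note that np.unique returns the unique rows in sorted order, not in the order they first appear. If you need to keep the original order of the rows: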
u, idx = np.unique(b, axis=0, return_index=True)
res = u[idx.argsort()]
Result:
In [33]: res
Out[33]:
array([[1, 0, 1, 0, 0, 1],
       [1, 1, 1, 0, 0, 1],
       [1, 0, 1, 0, 1, 1]])
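For reuse, the order-preserving approach could be wrapped in a small helper. This is only a sketch: the function name unique_rows_keep_order is hypothetical, and it assumes a 2-D array and a NumPy version where np.unique accepts the axis argument. It indexes the original array with the sorted first-occurrence indices, which should give the same rows as u[idx.argsort()].

```python
import numpy as np

def unique_rows_keep_order(arr):
    """Return the unique rows of a 2-D array in order of first appearance.

    Hypothetical helper for illustration only.
    """
    # return_index=True yields, for each unique row, the index of its
    # first occurrence in the original array.
    _, first_idx = np.unique(arr, axis=0, return_index=True)
    # Sorting those indices restores the original row order.
    return arr[np.sort(first_idx)]

b = np.array([[1, 0, 1, 0, 0, 1],
              [1, 1, 1, 0, 0, 1],
              [1, 0, 1, 0, 0, 1],
              [1, 0, 1, 0, 1, 1],
              [1, 0, 1, 0, 0, 1]])

res = unique_rows_keep_order(b)
print(res)
```

Indexing the original array directly avoids the intermediate sorted array u, but either form works for the example in the question.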
Author: MaxU, 2020-03-27 15:51:10