'Finding union of 2 sorted arrays (with duplicates)
I'm trying to find the union of 2 sorted arrays (with duplicates) but I feel I'm not coming up with the most elegant code (what I have works btw, I just feel I can reduce some lines of code). Lets say I have 2 vectors a = {1,3,3,4,4,4,5,7} and b = {1,3,3,3,5,5,5,6,8,9} and I want to store their union in a vector called unionVector (which will be 1,3,4,5,6,7,8,9)
Here's my code:
#include <iostream>
#include <vector>
using namespace std;
// Prints the contents of a vector
void printVector(vector<int> a){
if(a.size() == 0)
return;
else{
for(int i = 0; i < a.size(); i++)
cout << a[i] << '\t';
}
cout << endl;
}
// Print the union of 2 sorted arrays with duplicates
void printUnion(int *a, int aSize, int *b, int bSize){
if(aSize == 0 && bSize == 0)
return;
else{
vector<int> unionVector;
int i = 0;
int j = 0;
int last = 0;
// insert the smaller of first element regardless
if(a[i] < b[j]){
unionVector.push_back(a[i]);
i++;
}
else if (b[j] < a[i]){
unionVector.push_back(b[j]);
j++;
}
else{// both are equal numbers
unionVector.push_back(a[i]);
i++;
j++;
}
// now traverse both the loops one increment at a time
while(i < aSize && j < bSize){
last = unionVector[unionVector.size() - 1];
if(a[i] < b[j]){
if(last != a[i])
unionVector.push_back(a[i]);
i++; // increment i in either case
}
else if(b[j] < a[i]){
if(last != b[j])
unionVector.push_back(b[j]);
j++;
}
else{
// both of the numbers are equal
if(last != a[i])
unionVector.push_back(a[i]);
i++;
j++;
}
}
// lets say if 1 array wasn't complete
while(i < aSize){
last = unionVector[unionVector.size() - 1];
if(last != a[i])
unionVector.push_back(a[i]);
i++;
}
while(j < bSize){
last = unionVector[unionVector.size() - 1];
if(last != b[i])
unionVector.push_back(b[j]);
j++;
}
printVector(unionVector);
}
}
int main(){
int a[] = {1,3,3,4,4,4,5,7};
int b[] = {1,3,3,3,5,5,5,6,7,7,8,9};
printUnion(a,8,b,12);
return 0;
}
Thing is since there can be duplicates I check the element that is to be inserted with the last element inserted in unionVector. I need to make sure that I don't try to get the 'last' element when the unionVector is empty, thats why I'm inserting 1 element in the unionVector anyways. I would really appreciate if anyone can suggest a way that I can do that check without needing to insert 1 element first (I was thinking of having a flag variable that checks if the unionVector is empty or not, but I feel that will be too messy)
Edit 1:
- This isn't a homework problem. This is something I was practising for my interviews
Edit 2:
- I also can't use any built in functions
Edit 3:
- Some people were getting confused if this was for a C++ position. You can use any language you want
Solution 1:[1]
If both arrays are sorted, it's just a matter of skipping ahead one iterator or the other or both if there's a match.
So something like:
void printUnion(int* a, int aSize, int* b, int bSize)
{
int *aEnd = a + aSize, *bEnd = b + bSize;
std::vector<int> unionVec;
for (; a != aEnd; ) {
if (b == bEnd) {
// copy all of a
while (a != aEnd) {
unionVec.push_back(*a);
a = std::upper_bound(a + 1, aEnd, *a);
}
break;
}
if (*b < *a) {
unionVec.push_back(*b);
b = std::upper_bound(b + 1, bEnd, *b);
}
else {
unionVec.push_back(*a);
if (*b == *a) {
b = std::upper_bound(b + 1, bEnd, *b);
}
a = std::upper_bound(a + 1, aEnd, *a);
}
}
// copy all of b
while (b != bEnd) {
unionVec.push_back(*b);
b = std::upper_bound(b + 1, bEnd, *b);
}
printVector(unionVec);
}
If you can't use upper_bound
directly, just implement that function yourself. Copying the implementation from this reference:
template<class ForwardIt, class T>
int* upper_bound(int* first, int* last, const int value)
{
int* it;
int count = last - first;
int step;
while (count > 0) {
it = first;
step = count / 2;
it += step;
if (value >= *it) {
first = ++it;
count -= step + 1;
}
else {
count = step;
}
}
return first;
}
Or a non-binary-search version:
int* upper_bound(int* first, int* last, const int value) {
for (; first < last && *first == value; ++first) {
;
}
return first;
}
Now that's obviously pretty verbose, and that's why the standard actually provides an algorithm directly for you set_union:
void printUnion(int* a, int aSize, int* b, int bSize)
{
std::vector<int> unionVec;
// get the union
std::set_union(a, a + aSize, b, b + bSize, std::back_inserter(unionVec));
// remove the dupes
unionVec.erase(std::unique(unionVec.begin(), unionVec.end()), unionVec.end());
printVector(unionVec);
}
Solution 2:[2]
Here is one way. Elegance may vary!
void printUnion(int* a, int aSize, int* b, int bSize)
{
std::multiset<int> x;
x.insert(a, a + aSize);
x.insert(b, b + bSize);
for (auto y : x )
cout << y << ",";
cout << endl;
}
NB. consider having printUnion
take iterator pairs. Use std::set
to ignore duplicates, or std::multiset
to retain duplicates.
Solution 3:[3]
def solve(a, b):
c = []
for i in range(len(a)):
if a[i] not in c:
c.append(a[i])
for i in range(len(b)):
if b[i] not in c:
c.append(b[i])
return c
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | |
Solution 3 | Tyler2P |