'How to merge multiple cases into one in SPSS?
I want to fill in missing values for a case with values from cases in a different file. The corresponding cases have the same refrence number, variable REF. In the end, there should only be be one case per reference number, with no missing values in any variable. I already tried: Data-> Merge files-> Add variable-> many to one, but I still end up with multiple cases per reference number or no change at all in the table. I can't figure out how this works.
My two data sets:
REF p1 p2 p3
1 5 NA NA
2 3 NA NA
3 4 NA NA
REF p1 p2 p3
1 NA 3 NA
1 NA NA 1
2 NA 2 NA
2 NA NA 4
3 NA 1 NA
3 NA NA 1
Desired output:
REF p1 p2 p3
1 5 3 1
2 3 2 4
3 4 1 1
What I tried, but did not work:
Solution 1:[1]
I suggest you first stack the two files, so that all the data is in one table, then use aggregation to get all the data for each case into one line. I suggest aggregation using the max
function under the assumption that for every REF
only one value exists in each column, so the aggregation will select this value and leave out the other "competing" missing values.
EDITED to leave only one line per "REF":
add files /file = dataset1 /file = dataset2.
exe.
dataset name gen.
aggregate /outfile=* /break=REF /P1 P2 P3=max(P1 P2 P3).
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |