Why do we need to merge two datasets?
Statistical Package for the Social Sciences (SPSS) is a powerful software package for statistical data analysis. Today we are going to discuss a useful feature of SPSS known as data merging, or data matching. Researchers often need to merge two datasets into one. Suppose you have two datasets for the same set of samples, one containing information on one set of variables and the other containing information on another set of variables, and you want to combine them into a single dataset. Please note that both datasets must share a common variable, called the key variable/ID, whose value is unique for each case. If the key variable/ID is missing from either dataset, SPSS will not be able to merge them.
Preparing to merge two datasets
First of all, open both of your datasets. Make sure the key variable/ID has the same properties (for example, the same name and type) in both datasets. Remove any duplicate cases from the datasets. Sort both datasets by the key variable/ID in ascending order. Now you are ready to merge the datasets into one file.
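The preparation checks can be illustrated outside SPSS with a minimal Python sketch. It is not SPSS syntax and is not how SPSS works internally. The function name `prepare`, the key name `id`, and the sample values are all hypothetical. The sketch only reports duplicate key values rather than removing them, because which duplicate case to keep is a decision for the researcher.

```python
from collections import Counter


def prepare(rows, key="id"):
    """Check that key values are unique, then return the rows sorted ascending by key."""
    counts = Counter(row[key] for row in rows)
    duplicates = sorted(k for k, n in counts.items() if n > 1)
    if duplicates:
        # Report the duplicates instead of silently dropping cases.
        raise ValueError(f"duplicate key values: {duplicates}")
    return sorted(rows, key=lambda row: row[key])


# Hypothetical sample data: three cases in unsorted order.
first = [{"id": 3, "age": 41}, {"id": 1, "age": 25}, {"id": 2, "age": 33}]
first_sorted = prepare(first)
```

After this call, `first_sorted` holds the same three cases ordered by `id`, which mirrors the sorted state SPSS expects before a keyed match.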
The process to merge two datasets
Now open the dataset in which you want the merged data to appear. Then follow these steps:
1. Click Data, then select Merge Files.
2. Select Add Variables. You’ll see a dialogue box with the message “Add variable to…….”
3. Select An external SPSS Statistics data file.
4. Browse to your second file and click Continue.
5. Click the key variable/ID in the Excluded Variables list.
6. Click Match cases on key variables in sorted files.
7. Move the key variable/ID into the Key Variables box by clicking the adjacent right arrow.
8. Click Active dataset is keyed table.
9. Click OK. SPSS will show the warning “Keyed match will fail if data are not sorted in ascending order of Key Variables.”
10. Finally, click OK to confirm.
If there is no error, SPSS will take you to the merged file. Save the file in a location of your choice. In the merged file, the variables from the first dataset appear before the key variable/ID and the variables from the second dataset appear after it. You have now merged both datasets based on the key variable/ID.
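The idea behind merging on a key variable can be sketched in plain Python. This is a conceptual illustration only, not SPSS syntax and not a description of what SPSS does internally. The function name `merge_on_key`, the key name `id`, and the sample values are hypothetical. As a simplifying assumption, the sketch keeps every case of the first dataset and fills in `None` where the second dataset has no matching case.

```python
def merge_on_key(first, second, key="id"):
    """Add the variables of `second` to the cases of `first`, matched on `key`."""
    # Index the second dataset by key value for direct lookup.
    lookup = {row[key]: row for row in second}
    # Collect the variable names of the second dataset, excluding the key.
    extra_vars = list(dict.fromkeys(
        name for row in second for name in row if name != key
    ))
    merged = []
    for row in sorted(first, key=lambda r: r[key]):
        match = lookup.get(row[key], {})
        combined = dict(row)
        for name in extra_vars:
            # None stands in for a missing value when no case matches.
            combined[name] = match.get(name)
        merged.append(combined)
    return merged


# Hypothetical sample data: the second dataset has no case with id 2.
first = [{"id": 1, "age": 25}, {"id": 2, "age": 33}, {"id": 3, "age": 41}]
second = [{"id": 1, "income": 52000}, {"id": 3, "income": 61000}]
merged = merge_on_key(first, second)
```

Each merged case carries the variables of both datasets. The case with `id` 2 gets `None` for `income` because the second dataset has no matching key value.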