Why is data merging so important?
Statistical Package for the Social Sciences (SPSS) is a powerful software package for analyzing data. Today we are going to discuss a useful feature of SPSS known as data merging, or data matching. Data merging lets researchers combine two sets of data collected from the same sample. Suppose you have two data sets covering the same cases, one containing information on one set of variables and the other containing information on another set of variables, and you want to merge them into one. Please note that both data sets must share a common variable, called the key variable/ID, whose value is unique for each case. If the two data sets do not share a key variable, SPSS will not be able to match cases across them.
How to start?
First of all, open both of your data sets. Give the key variable/ID the same properties (name, type, and width) in both data sets. Remove any duplicate cases from each data set. Sort both data sets by the key variable/ID in ascending order. Now you are ready to merge the data sets into one file.
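The preparation steps above (remove duplicate cases, then sort by the key in ascending order) can be illustrated outside SPSS. The following is only a minimal Python sketch of that logic, not SPSS syntax; the records, the variable name "id", and the helper prepare are hypothetical, and keeping the first case for each duplicated key is an assumption.

```python
# Hypothetical records; "id" stands in for the key variable/ID.
survey = [
    {"id": 3, "age": 41},
    {"id": 1, "age": 29},
    {"id": 3, "age": 41},  # duplicate case
    {"id": 2, "age": 35},
]


def prepare(cases, key="id"):
    """Drop cases with a repeated key (keeping the first seen), then sort ascending."""
    seen = set()
    unique = []
    for case in cases:
        if case[key] not in seen:
            seen.add(case[key])
            unique.append(case)
    return sorted(unique, key=lambda case: case[key])


prepared = prepare(survey)
print([case["id"] for case in prepared])  # prints [1, 2, 3]
```

Both data sets would need to go through the same preparation before they are matched.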
How to merge?
Now open the data set in which you want to see the complete data after merging. Click Data and then select Merge Files. Select Add Variables. You’ll see a dialog box with the message “Add variable to…….” Select An external SPSS Statistics data file. Browse to your second file and click Continue. Select the key variable/ID in the Excluded Variables list. Tick Match cases on key variables in sorted files. Move the key variable/ID into the Key Variables box by clicking the adjacent right arrow. Select Active data set is keyed table. Click OK. SPSS will show you the warning “Keyed match will fail if data are not sorted in ascending order of Key Variables.”
Finally, click OK on the warning. If there is no error, SPSS will take you to the merged file. Save the file in your chosen location. You’ll see the variables from the first data set before the key variable/ID and the variables from the second data set after it. You have now merged both data sets based on the key variable/ID.
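To make clear what a key-based merge produces, here is a minimal Python sketch of the idea. It is an illustration only, not what SPSS runs internally; the data, the variable names, and the helper merge_on_key are hypothetical. It assumes that every case of the first data set is kept and that variables with no matching case in the second data set are left empty (None); in SPSS, which cases survive depends on which file is designated as the keyed table.

```python
# Hypothetical data sets sharing the key variable "id" (names are illustrative).
first = [
    {"id": 1, "age": 29},
    {"id": 2, "age": 35},
    {"id": 3, "age": 41},
]
second = [
    {"id": 1, "income": 52000},
    {"id": 3, "income": 61000},
]


def merge_on_key(left, right, key="id"):
    """Add the variables of `right` to each case of `left`, matching on `key`."""
    lookup = {}
    for case in right:
        if case[key] in lookup:
            raise ValueError(f"duplicate key in second data set: {case[key]!r}")
        lookup[case[key]] = case

    # Every variable in `right` other than the key, in first-seen order.
    extra_vars = []
    for case in right:
        for name in case:
            if name != key and name not in extra_vars:
                extra_vars.append(name)

    merged = []
    for case in left:
        match = lookup.get(case[key], {})
        row = dict(case)
        for name in extra_vars:
            row[name] = match.get(name)  # None when no matching case exists
        merged.append(row)
    return merged


merged = merge_on_key(first, second)
for row in merged:
    print(row)
```

The duplicate check reflects why duplicate cases are removed beforehand: if a key appears twice, it is ambiguous which case should supply the added variables.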