1. Download DataMerge here.

2. Open the file in Microsoft Excel.

3. See the example below and the template in the excel file. In this example, I try to demonstrate how to merge the data of SET2 to that of SET1.

Strength of DataMerge

BEFORE YOUR USE, please delete the demonstration identifiers and data in SET1 and SET2.


To reduce the file size, I only constructed the functions (the cells are highlighted in grey colour) until row 15 (so it can only process 13 participants at the moment). I am sure your sample size is more than that, so you can simply select the grey colour cells in row 15, and slide down until you have enough space for all of your participants. The file size will dramatically increase to 60MB for 1000 participants.


The subject identifiers have to be in the same columns in each sheet. For instance, if you input participants' name in column ID1 in SET1, always keep using ID1 for the input of names in SET2, SET3, SET4, and SET5.


SET1 (the first sheet of the excel file), indicating missing data in the follow-up trial.


DataMerge uses ID1 as the priority identifier for the search, and if it doesn't find any matched identifier in ID1 of the other dataset, it will perform the search using ID2 and so on (i.e., ID3, ID4, and ID5) until it finds the participant. However, if it still doesn't find the participant, it will empty the cells of that particular dataset in 


Be aware of duplicated identifiers (e.g., names) in the ID. DataMerge only recognises the first match, so any presence of duplicated identifiers may affect the reliability of the match. However, this problem could be solved using two or more IDs in conjunction by modifying the code.




How to use DataMerge
Important to Note

Research using longitudinal or prospective design, or sometime experimental studies with test-retest procedures may collect several sets of data from the same group of participants. Before we can effectively process the the data of the whole study, we have to merge these datasets up. Indeed, we could not simply copy-and-paste the follow-up/ longitudinal responses into the new columns of the baseline dataset or using combined dataset in SPSS. It is because the the order of participants might not perfectly match between the datasets. To have a perfect merging up for longitudinal datasets, we have to line up the responses in various trials or follow-up surveys from the same participant into the same row. If you have a large sample size, this is a very time-consuming process. The process could take even longer when participants only fill in some of the identification information or they may dropout from some of the trials.


Research assistants or student helpers will certainly be helpful, but now DataMerge may do the job for you in seconds! DataMerge can merge 5 datasets using up-to 5 subject identifiers into a single excel spread sheet. All you have to do is to copy the data of the study variables and the subject identifiers into separate sheets in the excel file. DataMerge can then use your first dataset as the basis, and copy the responses of the same participant in the subsequent trials in the same row using either of the subject identifiers.


  • High tolerance with missing data as it uses 5 subject identifiers (or more) to search.

  • Allows 120 variables (or more) in each dataset.

  • Accepts multiple languages, such as Chinese.

  • Can process as many participants as you want.

  • Open source. You may slightly modify the code to increase the number of identifiers, variables, or datasets you want to process.

  • It is an excel template, so you may open it using Microsoft Excel.


Derwin Chan Research
​    Introduction​

The citation looks funny because it's not a software. However, I would be appreciated if you let me know you are using it.


Chan, D. K. C. (2014). DataMerge [Computer software]. Available from www.derwinchan.com.

How to cite?

This is the bottom of the excel window where you can select the sheets of data.

This is the first page of DataMerge. To demonstrate how to use DataMerge, I have already inserted some subject identifiers for ID1 to ID4. As I don't have identifier for ID5, I just leave it blank. Column F and onwards (until Q120) is where the data for the rest of the variables in SET1 is copied to.

This is the second page of DataMerge. Similar to SET1, I copied the data to column F and onwards, but the row order of the participants is not consistent with that of SET1. Also, many of the subject identifiers are missing.

Now turn back to SET1, and scroll to the right until you see column DV. You will now see the data from SET2 has been re-arranged according to the row order of SET1.

DataMerge