Here’s a function that generates a specified number of unique random IDs, of a certain length, in a reproducible way.
There are many reasons you might want a vector of unique random IDs. In this case, I embed my unique IDs in SurveyMonkey links that I send via mail merge. This way I can control the emailing process, rather than having messages come from SurveyMonkey, but I can still identify the respondents. If you are doing this for the same purpose, note that you first need to enable a custom variable in SurveyMonkey! I call mine "a" for simplicity.
The function
create_unique_ids <- function(n, seed_no = 1, char_len = 5){
  set.seed(seed_no) # fix the seed so the same call returns the same IDs
  pool <- c(letters, LETTERS, 0:9) # 62 alphanumeric characters
  res <- character(n) # pre-allocating vector is much faster than growing it
  for(i in seq_len(n)){ # seq_len() handles n = 0, where seq(n) would give c(1, 0)
    this_res <- paste0(sample(pool, char_len, replace = TRUE), collapse = "")
    while(this_res %in% res){ # if there was a duplicate, redo
      this_res <- paste0(sample(pool, char_len, replace = TRUE), collapse = "")
    }
    res[i] <- this_res
  }
  res
}
Here’s what you get:
> create_unique_ids(10)
[1] "qxJ4m" "36ONd" "mkQxV" "ES9xW" "5nOhq" "xax1v" "DLElZ" "PXgSz" "YOWIG" "WbDTQ"
This function could get stuck in the while-loop if n exceeds the number of distinct alphanumeric strings of length char_len. There are length(pool) ^ char_len such strings available. Under the default value of char_len = 5, that’s 62^5, or 916,132,832. This should not be a problem for most users.
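If you do work with short IDs or very large rosters, a capacity check before generating anything avoids the endless loop. The following is a minimal sketch, not part of the original function: check_id_capacity and its pool_size argument are hypothetical names, and the default of 62 assumes the same letters-plus-digits pool used above.

```r
# Hypothetical helper: stop early if n IDs cannot all be distinct.
# pool_size = 62 assumes the pool c(letters, LETTERS, 0:9).
check_id_capacity <- function(n, char_len = 5, pool_size = 62) {
  capacity <- pool_size ^ char_len # number of distinct strings available
  if (n > capacity) {
    stop("Requested ", n, " IDs, but only ", capacity,
         " distinct strings of length ", char_len, " exist.")
  }
  invisible(capacity)
}

# Passes silently: 10 is far below 62^5.
check_id_capacity(10)

# Fails: 62^2 is 3844, so 4000 two-character IDs cannot all be unique.
too_many <- try(check_id_capacity(4000, char_len = 2), silent = TRUE)
inherits(too_many, "try-error")
```

Calling this at the top of create_unique_ids would turn a hang into an immediate error. It does not help with the slowdown that sets in as n gets close to the limit, since more and more draws will be duplicates.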
On reproducible randomization
The ability to set the randomization seed is to aid in reproducing the ID vector. If you’re careful, and using version control, you should be able to retrace what you did even without setting a seed. There are downsides to setting the same seed each time, too: for instance, if your input list gets shuffled, you will assign already-used codes to different users.
No matter how you use this function, think carefully about how to record and reuse values such that IDs stay consistent over time.
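One way to keep IDs stable is to store the IDs you have already issued and only generate new ones for people who don't have one yet. The sketch below illustrates that idea under some assumptions: assign_ids is a hypothetical helper, the roster is assumed to have an email column that identifies each person, and issued is assumed to be a saved data frame of earlier email and id pairs. The values are placeholders.

```r
# Hypothetical helper: reuse stored IDs, generate IDs only for new people.
# Assumes `email` uniquely identifies a person in both data frames.
assign_ids <- function(roster, issued, char_len = 5) {
  pool <- c(letters, LETTERS, 0:9)
  # Look up existing IDs by email, so row order in the roster is irrelevant.
  roster$id <- issued$id[match(roster$email, issued$email)]
  for (i in which(is.na(roster$id))) {
    repeat {
      candidate <- paste0(sample(pool, char_len, replace = TRUE), collapse = "")
      # Accept only IDs that were never issued and are not already in use.
      if (!candidate %in% c(issued$id, roster$id)) break
    }
    roster$id[i] <- candidate
  }
  roster
}

# Placeholder data: two people already have IDs, one is new.
issued <- data.frame(
  email = c("a@example.org", "b@example.org"),
  id = c("qxJ4m", "36ONd"),
  stringsAsFactors = FALSE
)
roster <- data.frame(
  email = c("b@example.org", "c@example.org", "a@example.org"),
  stringsAsFactors = FALSE
)
roster <- assign_ids(roster, issued)
```

Because the lookup is by email, shuffling the roster does not change who gets which code. You would still need to save the updated email and id pairs after each run so the next run can reuse them.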
Exporting results for mail merging
Here’s what this might look like in practice if you want to generate these IDs, then merge them into SurveyMonkey links and export for sending in a mail merge. In the example below, I generate both English- and Spanish-language links.
roster$id <- create_unique_ids(nrow(roster), seed_no = 23)
roster$link_en <- paste0("https://www.research.net/r/YourSurveyName?a=", roster$id, "&lang=en")
roster$link_es <- paste0("https://www.research.net/r/YourSurveyName?a=", roster$id, "&lang=es")
readr::write_csv(roster, "data/clean/roster_to_mail.csv", na = "")
Note that I have created the custom variable "a" in SurveyMonkey, which is why I can write a= in the URL.
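Before sending anything, it can be worth confirming that every row has an ID, that no ID repeats, and that each link carries its own row's ID. Here is a small sketch of such a check, using a toy roster with placeholder names and IDs in place of the real one; ids_ok and links_ok are names introduced only for this example.

```r
# Toy roster standing in for the real one (placeholder values).
roster <- data.frame(
  name = c("Ana", "Ben", "Cy"),
  id = c("qxJ4m", "36ONd", "mkQxV"),
  stringsAsFactors = FALSE
)
roster$link_en <- paste0("https://www.research.net/r/YourSurveyName?a=", roster$id, "&lang=en")

# Every ID is present, unique, and purely alphanumeric (safe in a URL).
ids_ok <- !anyNA(roster$id) &&
  anyDuplicated(roster$id) == 0 &&
  all(grepl("^[A-Za-z0-9]+$", roster$id))

# Every link ends with the ID from its own row.
links_ok <- all(endsWith(roster$link_en, paste0("?a=", roster$id, "&lang=en")))

stopifnot(ids_ok, links_ok) # halts here if either check fails
```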