The solution was a simple concept but harder than I thought to implement. I learned some lessons today:
- Read the instructions carefully. I built the balancing function
balance_banks
to return how many distributions it made of the maximum value, but removed this code when I realized that the full distribution of a bank of N values counted as one iteration, not N iterations. - Related: Up until now, Git has felt unnecessary for these challenges, and I’ve been too lazy to use it (though it would do wonders for my GitHub contribution graph). Today I got nervous deleting and reworking code for Part 1 … would I need it for the not-yet-known Part 2?
- I built my
balance_banks
function inside of mygo_til_repeat
function (because I thought it would help with passing counter information, which I did not need). When editing and then testing, I kept failing to load both into memory; I’d update one, load the other into memory, and call the first one. I don’t work much with nested functions of any complexity; TIL to avoid loading just one while programming and testing them.
Today was the first day that run-time performance became an issue. My outputs for each part were correct on the first try, but each took a while (3 minutes?) to run. Suspicious that my code was stuck in a loop, I never let it finish the first call I made: I broke the loop manually and added the counter to print every 1,000 rebalances.
But it was just slow. To check completion, I switched to the use of double duplicated
you see below instead of janitor::get_dupes()
, which is easier on the brain but a relatively expensive operation.
Lastly, when I thought I had to return multiple outputs from balance_banks
I had it return them in a list. I removed the increment counter code and the function now returns just one result (the balanced vector), but I left the list syntax in place since it worked. That’s why you see the $output
in last_result <- balance_banks(last_result$output)
for instance.
Adding Part 2 onto Part 1 was trivial. I’d seen Jenny Bryan use diff(range())
in a solution to an earlier puzzle and was stoked to reuse it in Part 2.
I resorted to rbind
instead of my beloved dplyr::bind_rows
because my data.frames don’t have column names. I remain tickled by situations where I work with data.frames in R while ignoring column names.
Parts 1 and 2
library(testthat) # Get the index of the largest bank bank_to_split_loc <- function(x) { which(x == max(x))[1] # subset to break ties } expect_equal(bank_to_split_loc(dat), 3) # Takes a vector and balances the banks by distributing the largest one over the rest balance_banks <- function(dat){ loop_amount <- dat[bank_to_split_loc(dat)] current_bank <- bank_to_split_loc(dat) + 1 dat[bank_to_split_loc(dat)] <- 0 for(i in seq_len(loop_amount)){ if(current_bank > length(dat)) { current_bank <- 1 } # wrap around the boundary dat[current_bank] <- dat[current_bank] + 1 current_bank <- current_bank + 1 } list(output = dat) # this is unnecessary, but when I misunderstood the problem I was also passing a counter out of this loop } expect_equal(balance_banks(c(0, 2, 7, 0))$output, c(2, 4, 1, 2)) # Takes vector of banks, keeps rebalancing them and checking the resulting vectors until a match is found # I later added the part 1 or 2 argument to return go_til_repeat <- function(input, part = 1){ # Initialize results data.frame results <- data.frame(matrix(ncol = length(input), nrow = 0), stringsAsFactors = FALSE) results <- rbind(results, input) last_result <- balance_banks(input) results <- rbind(results, last_result$output) # Print every 1000 rebalances, to confirm making progress # I thought the max bank value would decrease over time; it doesn't really, but still works as a progress indicator print_counter <- 0 while(sum(duplicated(results)) == 0){ if(print_counter %% 1000 == 0 & print_counter > 0){ print(max(last_result$output)) } print_counter <- print_counter + 1 last_result <- balance_banks(last_result$output) results <- rbind(results, last_result$output) } if(part == 1){ # how many total cycles nrow(results)-1 } else if (part == 2){ # how many cycles between duplicated rows # courtesy of StackOverflow: https://stackoverflow.com/questions/12495345/find-indices-of-duplicated-rows diff(range(which(duplicated(results) | duplicated(results, fromLast = TRUE)))) } } # These commands will take several minutes to run day_05_input <- c(5, 1, 10, 0, 1, 7, 13, 14, 3, 12, 8, 10, 7, 12, 0, 6) go_til_repeat(day_05_input, 1) go_til_repeat(day_05_input, 2)
One reply on “Advent of Code 2017 in #rstats: Day 6”
Looking at others’ solutions after completing mine, I tried out Jennifer Bryan’s: https://github.com/jennybc/2017_advent-of-code/blob/master/day06.R
I’d estimate it’s 1/10th the characters and 1/1000th the runtime. Humbling & inspiring.