Advent of Code 2017 in #rstats: Day 6

The solution was a simple concept but harder than I thought to implement. I learned some lessons today:

Read the instructions carefully. I built the balancing function balance_banks to return how many distributions it made of the maximum value, but removed this code when I realized that the full distribution of a bank of N values counted as one iteration, not N iterations.
Related: Up until now, Git has felt unnecessary for these challenges, and I’ve been too lazy to use it (though it would do wonders for my GitHub contribution graph). Today I got nervous deleting and reworking code for Part 1 … would I need it for the not-yet-known Part 2?
I built my balance_banks function inside of my go_til_repeat function (because I thought it would help with passing counter information, which I did not need). When editing and then testing, I kept failing to load both into memory; I’d update one, load the other into memory, and call the first one. I don’t work much with nested functions of any complexity; TIL to avoid loading just one while programming and testing them.

Today was the first day that run-time performance became an issue. My outputs for each part were correct on the first try, but each took a while (3 minutes?) to run. Suspicious that my code was stuck in a loop, I never let it finish the first call I made: I broke the loop manually and added the counter to print every 1,000 rebalances.

But it was just slow. To check completion, I switched to the use of double duplicated you see below instead of janitor::get_dupes(), which is easier on the brain but a relatively expensive operation.

Lastly, when I thought I had to return multiple outputs from balance_banks I had it return them in a list. I removed the increment counter code and the function now returns just one result (the balanced vector), but I left the list syntax in place since it worked. That’s why you see the $output in last_result <- balance_banks(last_result$output)for instance.

Adding Part 2 onto Part 1 was trivial. I’d seen Jenny Bryan use diff(range()) in a solution to an earlier puzzle and was stoked to reuse it in Part 2.

I resorted to rbind instead of my beloved dplyr::bind_rows because my data.frames don’t have column names. I remain tickled by situations where I work with data.frames in R while ignoring column names.

Parts 1 and 2

library(testthat)

# Get the index of the largest bank
bank_to_split_loc <- function(x) {
  which(x == max(x))[1] # subset to break ties
}
expect_equal(bank_to_split_loc(dat), 3)

# Takes a vector and balances the banks by distributing the largest one over the rest
balance_banks <- function(dat){
  loop_amount <- dat[bank_to_split_loc(dat)]
  current_bank <- bank_to_split_loc(dat) + 1
  dat[bank_to_split_loc(dat)] <- 0
  for(i in seq_len(loop_amount)){
    if(current_bank > length(dat)) { current_bank <- 1 } # wrap around the boundary
    dat[current_bank] <- dat[current_bank] + 1
    current_bank <- current_bank + 1
  }
  list(output = dat) # this is unnecessary, but when I misunderstood the problem I was also passing a counter out of this loop
}
expect_equal(balance_banks(c(0, 2, 7, 0))$output, c(2, 4, 1, 2))

# Takes vector of banks, keeps rebalancing them and checking the resulting vectors until a match is found
# I later added the part 1 or 2 argument to return 
go_til_repeat <- function(input, part = 1){
 
  # Initialize results data.frame
  results <- data.frame(matrix(ncol = length(input), nrow = 0), stringsAsFactors = FALSE)
  results <- rbind(results, input)
  
  last_result <- balance_banks(input)
  results <- rbind(results, last_result$output)
  
  # Print every 1000 rebalances, to confirm making progress
  # I thought the max bank value would decrease over time; it doesn't really, but still works as a progress indicator
  print_counter <- 0
  while(sum(duplicated(results)) == 0){
    if(print_counter %% 1000 == 0 & print_counter > 0){ print(max(last_result$output)) }
    print_counter <- print_counter + 1
    last_result <- balance_banks(last_result$output)
    results <- rbind(results, last_result$output)
  }

  if(part == 1){ # how many total cycles
    nrow(results)-1 
  } else if (part == 2){ # how many cycles between duplicated rows
    # courtesy of StackOverflow: https://stackoverflow.com/questions/12495345/find-indices-of-duplicated-rows
    diff(range(which(duplicated(results) | duplicated(results, fromLast = TRUE))))
  }

}

# These commands will take several minutes to run
day_05_input <- c(5, 1, 10, 0, 1, 7, 13, 14, 3, 12, 8, 10, 7, 12, 0, 6)
go_til_repeat(day_05_input, 1)
go_til_repeat(day_05_input, 2)

One reply on “Advent of Code 2017 in #rstats: Day 6”

Looking at others’ solutions after completing mine, I tried out Jennifer Bryan’s: https://github.com/jennybc/2017_advent-of-code/blob/master/day06.R

I’d estimate it’s 1/10th the characters and 1/1000th the runtime. Humbling & inspiring.

Parts 1 and 2

One reply on “Advent of Code 2017 in #rstats: Day 6”

Leave a Reply to Sam Cancel reply