North and South move one step up or down. The four diagonal directions move a half-step up or down and a full column laterally. The shortest solution path will be diagonal moves to reach the desired column, then vertical moves to the right row.

It took only a minute or two to modify my part 1 function for part 2, so I present both together.

library(dplyr); library(testthat) steps_needed <- function(dat){ lat <- 0 lon <- 0 max_dist <- 0 current_dist <- 0 for(i in seq_along(dat)){ if(dat[i] == "n"){lat <- lat + 1} if(dat[i] == "s"){lat <- lat - 1} if(dat[i] == "ne"){lat <- lat + 0.5; lon <- lon + 1} if(dat[i] == "se"){lat <- lat - 0.5; lon <- lon + 1} if(dat[i] == "nw"){lat <- lat + 0.5; lon <- lon - 1} if(dat[i] == "sw"){lat <- lat - 0.5; lon <- lon - 1} current_distance <- abs(lon) + # diagonal steps to move horizontally abs(lat) - 0.5 * abs(lon) # vertical steps, adjusted for prev diagonal moves max_dist <- max(c(max_dist), current_distance) current_dist <- current_distance } structure(c(current_dist, max_dist), names = c("Current Distance", "Maximum Distance")) } # Tests expect_equal(steps_needed(c("se","sw","se","sw","sw")), 3) expect_equal(steps_needed(c("ne", "ne", "ne")), 3) expect_equal(steps_needed(c("ne", "ne", "sw", "sw")), 0) expect_equal(steps_needed(c("ne", "ne", "s", "s")), 2) # Execute dat <- unlist(str_split(scan("11_1_dat.txt", "character"), ",")) steps_needed(dat) # Current Distance Maximum Distance # 722 1551

]]>

`(`

, subtract one for every `)`

.
That strategy formed the basis for my solution today. And I was excited to do Part 2 with arithmetic, not writing new code.

Strategy: maintain counter of open braces, decreasing for closed braces.

So `{{<a>},{<a>},{<a>},{<a>}}`

= 1 2 2 2 2. Sum = 9.

library(pacman) p_load(stringr, testthat) # cancel the character after ! cancel_chars <- function(x){ gsub("!.", "", x) } group_score <- function(dat){ clean <- gsub("<.*?>", "" , dat) # non-greedy regex to remove garbage lvl <- 1 # depth of braces nesting counts <- rep(0, nchar(clean)) # default value for non-{ characters for(i in seq.int(nchar(counts))){ if(str_sub(clean, i, i) == "{"){ # log the nested depth and increment counter counts[i] <- counts[i] + lvl lvl <- lvl + 1 } if(str_sub(clean, i, i) == "}"){ lvl <- lvl - 1 } } sum(counts) } # Test expect_equal(group_score("{{{}}}"), 6) expect_equal(group_score("{{},{}}"), 5) expect_equal(group_score("{{{},{},{{}}}}"), 16) dat <- scan("09_1_dat.txt", "character") group_score(cancel_chars(dat)) # 15922

The answer is the total number of characters, minus:

- The characters canceled by
`!`

- The bracketing
`<>`

for each garbage string - The valid characters remaining after removing the garbage in Part 1

To wit:

cleaned <- cancel_chars(dat) nchar(cleaned) - 2*str_count(string = cleaned, pattern = fixed(">")) - nchar(gsub("<.*?>", "" , cleaned)) # 7314

]]>

The first part was satisfying. Data cleaning was quick and I thought joining the data.frame to itself to populate parent and child for each entry was a nice touch.

test <- read.delim("07_1_test_dat.txt", header = FALSE, stringsAsFactors = FALSE) library(pacman) p_load(tidyr, splitstackshape, readr, dplyr, stringr) # Tidy the data tidy_puzzle_input <- function(x){ result <- x %>% separate(V1, into = c("name", "child"), sep = "->") %>% cSplit(., "child", ",", "long", type.convert = FALSE) %>% mutate(weight = parse_number(name), name = str_extract(name, "[A-z]+")) # Join to itself to attach parent name left_join( result, result %>% select(-weight) %>% rename(name = child, parent = name) ) }

My tidy data looked like:

name child weight parent 1 jlbcwrl <NA> 93 tzrppo 2 fzqsahw lybovx 256 peuppj 3 fzqsahw pdmhva 256 peuppj 4 rxivjo mewof 206 ikdsvc 5 rxivjo hrncqs 206 ikdsvc

Then the solution was easy:

# the bottom node will be the only one that doesn't have a parent test %>% tidy_puzzle_input %>% filter(is.na(parent)) read.delim("07_1_dat.txt", header = FALSE, stringsAsFactors = FALSE) %>% tidy_puzzle_input %>% filter(is.na(parent))

This was tough going. I got a function working for the sample input, which worked by stepping backward from the most distant children and adding their weight to their parents, then removing them and calling itself again.

But while the sample input had symmetric towers, in the bigger test data a tower could be balanced if it had two children of `4`

and `2 -> 1 + 1`

. In that scenario, you can’t peel off the 4 in the same step that you peel off the 1s. (For my own satisfaction, that false solution appears far below).

I’m proud of my eventual solution. And besides finally getting my brain to think recursively, I learned a few things: creating a named vector with `structure()`

and using `purrr::map_dbl`

.

get_faulty_set <- function(dat){ get_node_children <- function(node_name){ dat$child[dat$name == node_name] } # recursively get the cumulative weight of a node get_node_weight <- function(node_name){ if(is.na(get_node_children(node_name))[1]){ dat$weight[dat$name == node_name][1] } else{ dat$weight[dat$name == node_name][1] + sum(sapply(get_node_children(node_name), get_node_weight)) } } # Grab parents whose children have multiple weights - the problem cascades faulty_parents <- dat %>% rowwise %>% mutate(total_weight = get_node_weight(name)) %>% group_by(parent) %>% summarise(num_wts = n_distinct(total_weight)) %>% filter(num_wts > 1) %>% pull(parent) # find the lowest parent where the problem occurs lowest_bad_parent <- dat %>% filter(name %in% faulty_parents) %>% filter(! name %in% .$parent) %>% slice(1) %>% pull(name) # Get the weights & names of that parent's children bad_set <- structure( purrr::map_dbl(get_node_children(lowest_bad_parent), get_node_weight), names = get_node_children(lowest_bad_parent) ) bad_set # From here it's common sense to spot the odd node out, see how much it needs to change; # Then fetch its weight using get_node_weight and compute the sum }

I finished with a cop-out: once my function returned the weights of the subtower nodes where the problem existed and their names, it wasn’t worth programming the last bit to do the weight subtraction and get my answer. With this output:

dkvzre awufxne osbbt ycbgx wdjzjlk 2255 2255 2255 2260 2255

I just calculated it myself using `get_node_weight("ycbqx")`

and subtracting 5.

Because I feel salty that this didn’t work.

find_unbalanced_tower <- function(x){ lowest_children <- x %>% filter(is.na(child)) %>% distinct(.keep_all = TRUE) res <- lowest_children %>% group_by(parent) %>% summarise(uniques = n_distinct(weight), total_wt = sum(weight)) # check for completion and return result if done if(!all(res$uniques == 1)){ problem_parent <- res$parent[res$uniques == 2] print(paste("Problem parent is: ", problem_parent)) lowest_children %>% filter(parent == problem_parent) %>% group_by(name) %>% slice(1) } else { # Then add their weight to parent weight, remove them, update parents to have no children x <- x %>% mutate(child = ifelse(child %in% lowest_children$name, NA, child)) %>% filter(!name %in% lowest_children$name) %>% rowwise() %>% mutate(weight = weight + sum(res$total_wt[res$parent == name])) %>% ungroup() %>% distinct(.keep_all = TRUE) find_unbalanced_tower(x) } } my_input %>% tidy_puzzle_input %>% find_unbalanced_tower()

]]>

- Read the instructions carefully. I built the balancing function
`balance_banks`

to return how many distributions it made of the maximum value, but removed this code when I realized that the full distribution of a bank of N values counted as one iteration, not N iterations. - Related: Up until now, Git has felt unnecessary for these challenges, and I’ve been too lazy to use it (though it would do wonders for my GitHub contribution graph). Today I got nervous deleting and reworking code for Part 1 … would I need it for the not-yet-known Part 2?
- I built my
`balance_banks`

function inside of my`go_til_repeat`

function (because I thought it would help with passing counter information, which I did not need). When editing and then testing, I kept failing to load*both*into memory; I’d update one, load the other into memory, and call the first one. I don’t work much with nested functions of any complexity; TIL to avoid loading just one while programming and testing them.

Today was the first day that run-time performance became an issue. My outputs for each part were correct on the first try, but each took a while (3 minutes?) to run. Suspicious that my code was stuck in a loop, I never let it finish the first call I made: I broke the loop manually and added the counter to print every 1,000 rebalances.

But it was just slow. To check completion, I switched to the use of double `duplicated`

you see below instead of `janitor::get_dupes()`

, which is easier on the brain but a relatively expensive operation.

Lastly, when I thought I had to return multiple outputs from `balance_banks`

I had it return them in a list. I removed the increment counter code and the function now returns just one result (the balanced vector), but I left the list syntax in place since it worked. That’s why you see the `$output`

in `last_result <- balance_banks(last_result$output)`

for instance.

Adding Part 2 onto Part 1 was trivial. I’d seen Jenny Bryan use `diff(range())`

in a solution to an earlier puzzle and was stoked to reuse it in Part 2.

I resorted to `rbind`

instead of my beloved `dplyr::bind_rows`

because my data.frames don’t have column names. I remain tickled by situations where I work with data.frames in R while ignoring column names.

library(testthat) # Get the index of the largest bank bank_to_split_loc <- function(x) { which(x == max(x))[1] # subset to break ties } expect_equal(bank_to_split_loc(dat), 3) # Takes a vector and balances the banks by distributing the largest one over the rest balance_banks <- function(dat){ loop_amount <- dat[bank_to_split_loc(dat)] current_bank <- bank_to_split_loc(dat) + 1 dat[bank_to_split_loc(dat)] <- 0 for(i in seq_len(loop_amount)){ if(current_bank > length(dat)) { current_bank <- 1 } # wrap around the boundary dat[current_bank] <- dat[current_bank] + 1 current_bank <- current_bank + 1 } list(output = dat) # this is unnecessary, but when I misunderstood the problem I was also passing a counter out of this loop } expect_equal(balance_banks(c(0, 2, 7, 0))$output, c(2, 4, 1, 2)) # Takes vector of banks, keeps rebalancing them and checking the resulting vectors until a match is found # I later added the part 1 or 2 argument to return go_til_repeat <- function(input, part = 1){ # Initialize results data.frame results <- data.frame(matrix(ncol = length(input), nrow = 0), stringsAsFactors = FALSE) results <- rbind(results, input) last_result <- balance_banks(input) results <- rbind(results, last_result$output) # Print every 1000 rebalances, to confirm making progress # I thought the max bank value would decrease over time; it doesn't really, but still works as a progress indicator print_counter <- 0 while(sum(duplicated(results)) == 0){ if(print_counter %% 1000 == 0 & print_counter > 0){ print(max(last_result$output)) } print_counter <- print_counter + 1 last_result <- balance_banks(last_result$output) results <- rbind(results, last_result$output) } if(part == 1){ # how many total cycles nrow(results)-1 } else if (part == 2){ # how many cycles between duplicated rows # courtesy of StackOverflow: https://stackoverflow.com/questions/12495345/find-indices-of-duplicated-rows diff(range(which(duplicated(results) | duplicated(results, fromLast = TRUE)))) } } # These commands will take several minutes to run day_05_input <- c(5, 1, 10, 0, 1, 7, 13, 14, 3, 12, 8, 10, 7, 12, 0, 6) go_til_repeat(day_05_input, 1) go_til_repeat(day_05_input, 2)

]]>

`testthat`

for checking against the test input).
This was a textbook case for a `while`

loop, which I rarely use in R.

steps_to_leave <- function(vec, allow_decreasing_offsets = FALSE){ index <- 1 steps <- 0 while(!is.na(vec[index])){ steps <- steps + 1 starting_position <- index index <- starting_position + vec[index] # increase or decrease original offset, updated for part 2 if(!allow_decreasing_offsets | vec[starting_position] < 3){ vec[starting_position] <- vec[starting_position] + 1 } else { vec[starting_position] <- vec[starting_position] - 1 } } steps } # Part 1 test_vec <- c(0, 3, 0, 1, -3) testthat::expect_equal(steps_to_leave(test_vec, FALSE), 5) dat <- scan("05_1_dat.txt", numeric(), quote = "") steps_to_leave(dat) # 364539 # Part 2 testthat::expect_equal(steps_to_leave(test_vec, TRUE), 10) steps_to_leave(dat, TRUE) # 27477714

]]>

I wrote a function that compares the length of the input tokens to length of `unique()`

input tokens. Then for part 2, I added an argument as to whether anagrams should be permitted; if not, the tokens are each sorted alphabetically first into alphagrams (a word borrowed from competitive Scrabble).

I borrowed the string sorting algorithm from StackOverflow – I cited it below and upvoted as well

library(pacman) p_load(dplyr, stringr, testthat) check_valid <- function(string, anagrams_okay){ subs <- unlist(str_split(string, " ")) if(!anagrams_okay){ subs <- unlist(lapply(subs, make_alphagram)) # for part 2 } length(unique(subs)) == length(subs) } # thanks StackOverflow! https://stackoverflow.com/questions/5904797/how-to-sort-letters-in-a-string make_alphagram <- function(x){ paste(sort(unlist(strsplit(x, ""))), collapse = "") }

Now it’s just a matter of testing & running:

# Tests expect_equal(check_valid("aa bb cc dd ee aa", anagrams_okay = TRUE), FALSE) expect_equal(check_valid("ba bb cc dd ee ab", anagrams_okay = TRUE), TRUE) expect_equal(check_valid("ba bb cc dd ee ab", anagrams_okay = FALSE), FALSE) # Part 1 raw <- read.delim("04_1_dat.txt", header = FALSE, stringsAsFactors = FALSE)[[1]] # maybe an inelegant way to read in as a vector... lapply(raw, check_valid) %>% unlist %>% sum # Part 2 lapply(raw, check_valid, anagrams_okay = FALSE) %>% unlist %>% sum]]>

- Excited: R is good at rectangular numeric data, and I work with a lot of spreadsheets
- Concerned: Will I be starting with a single long string? How do I get that into rectangular form?

After considering trying to chop up one long string, I copy-pasted my input to a text file and `read.delim()`

worked on the first try.

Part 1 was simple, Part 2 was tricky. The idea of row-wise iteration got in my head and I spent a lot of time with `apply`

and `lapply`

… I still hold a grudge against them from battles as a new R user, but having (mostly) bent them to my will I went to them here. Maybe I should spend more time with the `purrr`

package.

Straightforward:

library(dplyr) library(testthat) ex <- data.frame( a = c(5, 7, 2), b = c(1,5,4), c = c(9,3,6), d = c(5, NA, 8) ) checksum <- function(x){ row_max <- apply(x, 1, max, na.rm = TRUE) row_min <- apply(x, 1, min, na.rm = TRUE) sum(row_max - row_min) } expect_equal(checksum(ex), 18) # works! my_input <- read.delim("02_1_dat.txt", header = FALSE) checksum(my_input)

Got the idea quickly, but executing took a bit longer. I decided right away that I would:

- Work on a function that tackled a row (as a vector) at a time
- The function would divide every value in the vector by every other value in the vector
- It would then pick out the integer that wasn’t 1, as well as its reciprocal, and divide them
- Then run this function row-wise on the input

Actually building up that function took ~20 minutes. I felt confident the approach would work so didn’t stop to consider anything else… I’m excited to see other solutions after I publish this, as maybe there’s something elegant I missed.

Along the way I learned: `is.integer`

doesn’t do what I thought it would. Come on, base R.

I don’t work with matrices much (not a mathematician, don’t have big data) and was a somewhat vexed by the matrix -> data.frame conversion.

# Will apply something row-wise through the data.frame # Need to get the dividend and divisor # example data for part 2: ex2 <- data.frame( a = c(5, 9, 3), b = c(9,4,8), c = c(2,7,6), d = c(8, 3, 5) ) # start with the top row to play with x <- ex2[1, ] # I'll create an x by x matrix with every number in a vector divided by itself # I want the numbers corresponding to matrix entries containing the sole non-1 integer and its reciprocal. # Annoying, is.integer() doesn't do what I thought it would, so need to make my own function: https://stackoverflow.com/q/3476782/4470365 is_integer <- function(x) { (x %% 1) == 0 } is_recip_integer <- function(x){ ((1/x) %% 1) == 0 } get_quotient <- function(x){ # create x by x matrix dividing each entry by itself divided <- sapply(x, function(a) a/x) %>% unlist %>% matrix(nrow = sqrt(length(.))) %>% as.data.frame() divided[divided == 1] <- NA # don't want the diagonal values # Get the position where the vector contains an integer bigger_index <- which( lapply(divided, is_integer) %>% lapply(sum, na.rm = TRUE) %>% unlist(.) == 1) # I drafted this function as nested parentheses, # converted to pipes for blogging... it looks weird. # Get the position where the vector contains the reciprocal of an integer smaller_index <- which( lapply(divided, is_recip_integer) %>% lapply(sum, na.rm = TRUE) %>% unlist(.) == 1) # Compute quotient of values at those indices x[bigger_index] / x[smaller_index] } # Test expect_equal( # apply the function rowwise and sum apply(ex2, 1, get_quotient) %>% sum, 9 ) # Get part 2 answer: apply(my_input, 1, get_quotient) %>% sum

]]>

**Part 1:**

Split the input into a vector, add the first value to the end, and use dplyr’s lead() function to compare each value to the following value:

library(stringr) library(dplyr) target <- "112345567891" # should yield 1 + 5 + 1 = 7 target <- str_split(target, "") %>% unlist %>% as.numeric() # split input string into a vector: target <- c(target, target[1]) # add the first value to the end, to account for wrap-around sum(target[target == lead(target)], na.rm = TRUE) # sum every number that matches the number following it

**Part 2:**

This challenge is simpler, in my opinion. You can halve the vector, then compare the first half to the second half, summing the matches (don’t forget to double that result).

target <- "123425" # should match both 2s for a total of 4 target <- str_split(target, "") %>% unlist %>% as.numeric() first_half <- target[1:(length(target)/2)] second_half <- target[-c(1:(length(target)/2))] sum(first_half[first_half == second_half], na.rm = TRUE) * 2]]>

Since then we’ve rotated beers through it. In order of fill date:

- Imperial Stout
- Scotch Ale (
*this beer naturally soured)* - Oud Bruin (
*we pitched 8 packs of Blackman Flemish Sour Mix)* - Tart of Darkness Stout (
*here we introduced Brett C., actually a strain of Brett Anomalous)* - Dark Saison
- Belgian Golden Strong
- IPA

We empty + refill every six months or so. The brewers are a rotating cast, with people dropping in and out. We typically aim to bring 11 shares of 5 gallons each, filling the barrel to the top and leaving some extra to top up the angel’s share.

This barrel has produced consistently good beers, and at this point the sour character is locked in. Beers from this barrel have won silver and gold medals in the American Wild Ale category at the 2016 & 2017 Michigan Beer Cup. Brewers often perform a tertiary fermentation on fruit – tart cherries are a favorite, this being Michigan – and sometimes blend with young or clean beers to cut sourness to taste.

]]>If you add sugar to hard cider and don’t want that addition to restart fermentation (which would increase alcohol and leave the cider even drier), you’ll need to stabilize it. The most common method in home cidermaking is to add both potassium sorbate and potassium sulfite. This guidance from BYO magazine on backsweetening provides background on the approach.

For each gallon of cider, use 1/2 tsp of potassium sorbate and 1/2 tsp of 10% sulfite solution (an extra step, but worth making – the solution is easier to work with than dry potassium sulfite).

**Winemakers say…**

Winemakers talk more about sorbic acid, the relevant chemical; potassium sorbate is 74% sorbic acid. There are legal limits of 0.2 g/L (Europe) and 0.3 g/L (America); the sensory level for perceiving this chemical’s flavor is reported at 0.135 g/L.

Adjusting these from sorbic acid -> potassium sorbate (what a homebrewer weighs) gives legal limits of 0.26 g/L and 0.4 g/L, respectively, with a taste threshold of 0.18 g/L.

**Homebrewers say…**

The container from the homebrew store says “1/2 tsp per gallon.” My high-resolution scale says a typical 1/2 teaspoon of sorbate weighs 1.1g (*this container has been opened many times, and perhaps has absorbed moisture*). Thus the container recommendation translates to 0.29 g/L.

BYO recommends a sorbate dosage rate of “0.5 to 1.0 g/L”. This is much higher than all other recommendations, with no justification provided.

**Other factors**

This winemaking guide, quoting Peynaud (1984), notes that sorbate is more effective – and thus less is needed – at lower pH and higher alcohol %. Unfortunately for us, cider is generally much less alcoholic than wine (*pH is similar … I think?*).

**My approach: **just go with the recommendation on the label: 1/2 tsp per gallon, aiming for 0.3 g/L – the limit allowed in American winemaking.

More complicated process, but more coherent estimates. Sulfites are discussed in terms of ppm of sulfur dioxide (SO_{2}), as this is what matters and there are several ways to add sulfur dioxide to wine or cider. Professionals have procedures for estimating free SO_{2}, as they add sulfites at various stages to maintain a desired SO_{2} level. If you haven’t yet added sulfites, let’s assume you have no free SO_{2} in your cider.

**Homebrewers say…**

BYO recommends a sulfite addition of 30 mg/L.

**Winemakers say…**

Try Winemaker Mag’s sulfite calculator, treating your cider as a white wine – you can enter the actual pH and ABV, so this doesn’t seem like a misapplication. When I punch in BYO’s recommended goal of 30 mg/L of free SO_{2}, the calculator suggests I target 36-42 mg/L instead, based on pH.

To achieve this, the calculator suggests adding about 2.6 mL of 10% sulfite solution per gallon of cider – which is just slightly more than 1/2 tsp per gallon. A 10% solution is the easiest way to add sulfites, as the math is simpler and you don’t have to worry about dissolving your sulfites each time. Here’s how to make a 10% solution. I keep a small bottle of it.

**Other factors**

The US legal limit for sulfites in cider is 300 mg/L, counting all additions. You shouldn’t be hitting this as a home cidermaker, but if you use sulfites up front (e.g., Campden tablets) to control wild yeasts and microbes, do the math to be sure. Commercial winemakers add sulfites at various stages (including when racking) to reduce oxygen pickup; I don’t know any home cider makers who do this.

**My approach: **I go with about 1/2 tsp of 10% solution per gallon of cider.