Categories
#rstats

Advent of Code 2017 in #rstats: Day 7

Today was hard, but a good challenge.  I haven’t written a recursive function since college CS – is that typical for data science / analysis work?  I don’t see much about recursion on #rstats Twitter.  Recursion feels like a separate way of thinking, and I had forgotten how.

Part 1

The first part was satisfying.  Data cleaning was quick and I thought joining the data.frame to itself to populate parent and child for each entry was a nice touch.

test <- read.delim("07_1_test_dat.txt", header = FALSE, stringsAsFactors = FALSE)

library(pacman)
p_load(tidyr, splitstackshape, readr, dplyr, stringr)

# Tidy the data
tidy_puzzle_input <- function(x){
  result <- x %>%
    separate(V1, into = c("name", "child"), sep = "->") %>%
    cSplit(., "child", ",", "long", type.convert = FALSE) %>%
    mutate(weight = parse_number(name),
           name = str_extract(name, "[A-z]+"))
  
  # Join to itself to attach parent name
  left_join(
    result,
    result %>%
      select(-weight) %>%
      rename(name = child, parent = name)
  )
}

My tidy data looked like:

        name   child weight  parent
1    jlbcwrl    <NA>     93  tzrppo
2    fzqsahw  lybovx    256  peuppj
3    fzqsahw  pdmhva    256  peuppj
4     rxivjo   mewof    206  ikdsvc
5     rxivjo  hrncqs    206  ikdsvc

Then the solution was easy:

# the bottom node will be the only one that doesn't have a parent 

test %>%
  tidy_puzzle_input %>%
  filter(is.na(parent))

read.delim("07_1_dat.txt", header = FALSE, stringsAsFactors = FALSE) %>%
  tidy_puzzle_input %>%
  filter(is.na(parent))

 

Part 2

This was tough going.  I got a function working for the sample input, which worked by stepping backward from the most distant children and adding their weight to their parents, then removing them and calling itself again.

But while the sample input had symmetric towers, in the bigger test data a tower could be balanced if it had two children of `4` and `2 -> 1 + 1`.  In that scenario, you can’t peel off the 4 in the same step that you peel off the 1s.  (For my own satisfaction, that false solution appears far below).

I’m proud of my eventual solution.  And besides finally getting my brain to think recursively, I learned a few things: creating a named vector with `structure()` and using `purrr::map_dbl`.

get_faulty_set <- function(dat){
  
  get_node_children <- function(node_name){
    dat$child[dat$name == node_name]
  }
  
  # recursively get the cumulative weight of a node
  get_node_weight <- function(node_name){
    if(is.na(get_node_children(node_name))[1]){
      dat$weight[dat$name == node_name][1]
    } else{
      dat$weight[dat$name == node_name][1] +
        sum(sapply(get_node_children(node_name), get_node_weight))
    }
  }
  
  # Grab parents whose children have multiple weights - the problem cascades
  faulty_parents <- dat %>%
    rowwise %>%
    mutate(total_weight = get_node_weight(name)) %>%
    group_by(parent) %>%
    summarise(num_wts = n_distinct(total_weight)) %>%
    filter(num_wts > 1) %>%
    pull(parent)
  
  # find the lowest parent where the problem occurs
  lowest_bad_parent <- dat %>%
    filter(name %in% faulty_parents) %>%
    filter(! name %in% .$parent) %>%
    slice(1) %>%
    pull(name)
  
  # Get the weights & names of that parent's children
  bad_set <- structure(
    purrr::map_dbl(get_node_children(lowest_bad_parent), get_node_weight),
    names = get_node_children(lowest_bad_parent)
  )
  bad_set
  # From here it's common sense to spot the odd node out, see how much it needs to change;
  # Then fetch its weight using get_node_weight and compute the sum
}

I finished with a cop-out: once my function returned the weights of the subtower nodes where the problem existed and their names, it wasn’t worth programming the last bit to do the weight subtraction and get my answer.  With this output:

   dkvzre awufxne   osbbt   ycbgx wdjzjlk 
   2255    2255    2255    2260    2255

I just calculated it myself using `get_node_weight(“ycbqx”)` and subtracting 5.

Appendix: Solution with Symmetric Towers

Because I feel salty that this didn’t work.

find_unbalanced_tower <- function(x){
  lowest_children <- x %>%
    filter(is.na(child)) %>%
    distinct(.keep_all = TRUE)
  
  res <- lowest_children %>%
    group_by(parent) %>%
    summarise(uniques = n_distinct(weight),
              total_wt = sum(weight))
  
  # check for completion and return result if done
  if(!all(res$uniques == 1)){
    problem_parent <- res$parent[res$uniques == 2]
    print(paste("Problem parent is: ", problem_parent))
    lowest_children %>%
      filter(parent == problem_parent) %>%
      group_by(name) %>%
      slice(1)
    
  } else {
    # Then add their weight to parent weight, remove them, update parents to have no children
    x <- x %>%
      mutate(child = ifelse(child %in% lowest_children$name, NA, child)) %>%
      filter(!name %in% lowest_children$name) %>%
      rowwise() %>%
      mutate(weight = weight + sum(res$total_wt[res$parent == name])) %>%
      ungroup() %>%
      distinct(.keep_all = TRUE)
    
    find_unbalanced_tower(x)
  }
}

my_input %>%
  tidy_puzzle_input %>%
  find_unbalanced_tower()

 

Leave a Reply

Your email address will not be published. Required fields are marked *