Advent of Code 2017 in #rstats: Day 12

(Day 12 puzzle). This was my favorite day so far.  I’d never had a graph problem of my own to solve, and this was a great excuse to try out the igraph package.

Big shout out to Gábor Csárdi and anyone else on the igraph team who wrote the docs.  And I mean wrote the docs!  When I google an R question, 99% of the time I land on StackOverflow.  The searches I made for Day 12 all* took me to the igraph documentation website, which answered my questions.  I don’t know of another R package or topic like that.

Their example of creating a graph was clear and was easy to adapt to the toy example on Day 12.  From there, some searching found the two functions I’d need for Day 12: neighborhood() and clusters().  Look how short my part 2 is!

Part 0: Playing with igraph

Here’s the documentation example for creating an igraph graph.  I played with it to confirm it would work for my needs:

library(pacman)
# stringr supplies str_trim(), which Part 1 uses
p_load(igraph, tidyr, dplyr, stringr)

# Toy example
relations <- data.frame(from=c("Bob", "Cecil", "Cecil", "David",
                               "David", "Esmeralda"),
                        to=c("Alice", "Bob", "Alice", "Alice", "Bob", "Alice"),
                        same.dept=c(FALSE,FALSE,TRUE,FALSE,FALSE,TRUE),
                        friendship=c(4,5,5,2,1,1), advice=c(4,5,5,4,2,3))
g <- graph_from_data_frame(relations, directed=FALSE)
neighborhood(g, 1, "Esmeralda") # 2 vertices: Esmeralda and Alice
neighborhood(g, 2, "Esmeralda") # 5 vertices: the whole graph

Part 1

This was mostly wrangling the data into an igraph graph.  It didn’t seem to like integer names for vertices, so I prepended “a”.

create_graph_from_input <- function(filename){
  filename %>%
    read.delim(header = FALSE) %>%
    separate(V1, into = c("v1", "v2"), sep = "<->") %>%
    separate_rows(v2, sep = ",") %>%
    mutate(v1 = paste0("a", str_trim(v1)),
           v2 = paste0("a", str_trim(v2))) %>%
    graph_from_data_frame(directed = FALSE)
}

get_group_size <- function(filename, grp_size, node_name){
  create_graph_from_input(filename) %>%
    neighborhood(grp_size, paste0("a", node_name)) %>%
    unlist() %>%
    length()
}
testthat::expect_equal(get_group_size("12_1_test_dat.txt", 30, "0"), 6)
get_group_size("12_1_dat.txt", 30, "0") # 239

I increased the `grp_size` parameter until my result stopped increasing.  That happened at about 30 degrees of separation (it was still changing at 15).  A more robust solution would loop until the count stops changing rather than hard-coding 30.
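One way to avoid guessing the number of steps might be igraph's subcomponent(), which returns every vertex reachable from a starting vertex.  Here's a minimal sketch, not what I actually ran: the `edges` data frame is a hand-written stand-in for the test file (an assumption), using the same "a" prefix as above.

```r
library(igraph)

# Hypothetical toy edge list standing in for the puzzle's test input
edges <- data.frame(
  from = c("a0", "a1", "a2", "a2", "a3", "a4", "a5"),
  to   = c("a2", "a1", "a3", "a4", "a4", "a6", "a6")
)
g <- graph_from_data_frame(edges, directed = FALSE)

# All vertices reachable from "a0", however many steps away they are
group <- subcomponent(g, "a0", mode = "all")
group_size <- length(group)
group_size
```

Because this walks the whole connected component, there's no order parameter to tune.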

Part 2

All you need is igraph::clusters():

"12_1_dat.txt" %>%
  create_graph_from_input %>%
  clusters() %>%
  .$no #215

One.  Function.
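For the curious, that result holds more than the group count.  Here's a small sketch of what's inside it, using components(), the name current igraph gives this function (clusters() is the older name).  The `edges` data frame is a made-up toy input, not my puzzle data.

```r
library(igraph)

# Hypothetical toy edge list: two separate groups, one of them a lone self-loop
edges <- data.frame(
  from = c("a0", "a1", "a2", "a2", "a3", "a4", "a5"),
  to   = c("a2", "a1", "a3", "a4", "a4", "a6", "a6")
)
g <- graph_from_data_frame(edges, directed = FALSE)

comp <- components(g)
n_groups <- comp$no        # number of connected groups
group_sizes <- comp$csize  # size of each group, indexed by group id

# membership maps each vertex name to its group id,
# so the size of the group containing "a0" is:
size_of_a0_group <- comp$csize[comp$membership[["a0"]]]
```

So the same call could, in principle, answer Part 1 as well: look up the starting vertex's group id in `membership` and read off its size from `csize`.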

Conclusion: graphs are neat, igraph is the way to analyze them.

* okay, one search took me to StackOverflow and gave me what I needed: the `clusters()` function.  Everything else came from igraph.org.
