Co-author Network Analysis with networkD3


Michael Luu


January 23, 2024


This post was largely inspired by the Advent of Code 2023 day 25 puzzle, which centered on network analysis. I thought it would be fun to visualize the network of interconnected co-authors across the publications I have been a part of.

We start by using the RefManageR package to query PubMed for all publications with my name and Cedars-Sinai as an affiliation. We then use the tidyverse to wrangle the data into a format that networkD3 can use.

library(tidyverse)

q <- '(Michael Luu) AND (Cedars-Sinai[Affiliation])'

data <- RefManageR::ReadPubMed(q, retmax = 999)

data <- data |> as_tibble()

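data <- data |>
  select(author) |>
  mutate(i = row_number(), .before = author) |> 
  # split the ' and '-delimited author string into one column per author
  separate_wider_delim(
    author,
    delim = ' and ',
    names = paste0('author', 1:30),
    too_few = 'align_start'
  ) |> 
  pivot_longer(2:31) |> 
  # drop the NA padding left for publications with fewer than 30 authors
  filter(!is.na(value))

data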

# A tibble: 761 × 3
       i name     value               
   <int> <chr>    <chr>               
 1     1 author1  Vinicius F Calsavara
 2     1 author2  Norah L Henry       
 3     1 author3  Ron D Hays          
 4     1 author4  Sungjin Kim         
 5     1 author5  Michael Luu         
 6     1 author6  Márcio A Diniz      
 7     1 author7  Gillian Gresham     
 8     1 author8  Reena S Cecchini    
 9     1 author9  Greg Yothers        
10     1 author10 Patricia A Ganz     
# ℹ 751 more rows
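To make the reshaping step easier to follow, here is a minimal sketch on made-up data. The `toy` tibble and its author names are hypothetical, and dropping the `NA` padding with `filter()` is an assumption about how the long table ends up with only real author rows.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy input: one row per publication, authors joined by ' and '
toy <- tibble::tibble(
  author = c(
    'Ada Lovelace and Alan Turing and Grace Hopper',
    'Alan Turing and Claude Shannon'
  )
)

toy_long <- toy |>
  mutate(i = row_number(), .before = author) |>
  separate_wider_delim(
    author,
    delim = ' and ',
    names = paste0('author', 1:3),
    too_few = 'align_start'  # pad shorter author lists with NA on the right
  ) |>
  pivot_longer(2:4) |>       # columns author1..author3 become name/value pairs
  filter(!is.na(value))      # assumption: drop the NA padding

toy_long
```

Each publication keeps its index `i`, and every author lands on a separate row, which is the shape the next step expects.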

Now that we have the data in a tidy format, let’s construct a tibble in which every co-author is connected to every other co-author on a given publication. Because the full list of interconnected co-authors is so large, for simplicity’s sake we randomly sample 5 publications and visualize the network of co-authors for those publications only.

nd <- data |> 
  group_nest(i) |> 
  deframe() |> 
  map(\(x) {
    authors <- x |> pull(value)
    # every ordered pair of co-authors on this publication
    tidyr::crossing(
      from = authors,
      to = authors
    ) |> 
      # drop self-pairs
      filter(from != to)
  }) |> 
  bind_rows(.id = 'id')

selected_publications <- nd |> pull(id) |> unique() |> sample(5)

nd <- nd |> filter(id %in% selected_publications) |> select(-id)
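As a small illustration of what the pairing step produces, the sketch below applies the same `crossing()` plus `filter()` idea to three made-up authors. The names are hypothetical, and the final `from < to` filter is an optional variation that is not part of the pipeline above; it is one way to keep each pair only once if a single link per pair is preferred.

```r
library(dplyr)
library(tidyr)

# Hypothetical author list for a single publication
authors <- c('Ada Lovelace', 'Alan Turing', 'Grace Hopper')

# All ordered pairs, minus self-pairs
edges <- crossing(from = authors, to = authors) |>
  filter(from != to)

nrow(edges)  # 3 authors give 3 * 2 = 6 directed pairs

# Optional variation (an assumption, not the approach used above):
# keep each pair once by retaining only one ordering of the two names
undirected <- edges |>
  filter(from < to)

nrow(undirected)  # 3 unordered pairs
```

Because `crossing()` returns both orderings of every pair, each co-author relationship appears as two links in the network unless one ordering is filtered out.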

The following is the interactive network of co-authors for the selected publications!

ig <- igraph::graph_from_data_frame(nd)

n <- networkD3::igraph_to_networkD3(ig)

networkD3::forceNetwork(
  Links = n$links,
  Nodes = n$nodes,
  Source = 'source',
  Target = 'target',
  NodeID = 'name',
  Group = 'name',
  zoom = TRUE,
  opacity = .9,
  charge = -100,
  height = 800,
  fontSize = 20
)
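To see what the conversion step hands to the plotting function, here is a minimal sketch on a made-up edge list. The `toy_edges` data frame is hypothetical. As far as I understand `igraph_to_networkD3()`, it returns a list with a `nodes` data frame and a `links` data frame whose `source` and `target` columns are zero-based positions into `nodes`.

```r
library(igraph)
library(networkD3)

# Hypothetical edge list: three authors, each connected to the others
toy_edges <- data.frame(
  from = c('Ada Lovelace', 'Ada Lovelace', 'Alan Turing'),
  to   = c('Alan Turing', 'Grace Hopper', 'Grace Hopper')
)

toy_graph <- graph_from_data_frame(toy_edges)
toy_net <- igraph_to_networkD3(toy_graph)

toy_net$nodes  # one row per author, in a `name` column
toy_net$links  # `source` and `target` index into nodes, starting at 0
```

This is why `Source = 'source'`, `Target = 'target'`, and `NodeID = 'name'` are passed as column names: they point at the columns of the two converted data frames.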



