In-class_Ex05 (vast challenge Data)

Author

NeoYX

Published

May 13, 2023

Modified

May 13, 2023

Note

Edge data should be organised as such: (can use dplyr methods)

First column: Source id (FK to Node second column) - compulsory

Second column: Target id (FK to Node second column) - compulsory

Node data

First column: ID - compulsory

Second column: Label (contains all the distinct values of source and target in Edge data) (only need if Id are all integers) (what is present in edge data must exists in Labels of node data, must not be missing in node data)

Warning

Try not to use R built-in NA/NULL function. Manually type “unknown’ / ‘missing’ as a value instead.

In today’s in class exercise,

Import libraries

The new libraries used today are :

Show the code
pacman::p_load(jsonlite, igraph, tidygraph, ggraph,
               visNetwork, lubridate, clock,
               tidyverse, graphlayouts,knitr)
Show the code
MC1 <- jsonlite::fromJSON("C:/yixin-neo/ISSS608-VAA/Project/data/MC1.json")
Note

Problem with dataset of links:

Source and Data columns are at the back instead of the first 2 columns

Pull out the nodes and edge data and save them as tibble data frames.

Show the code
MC1_nodes <- as_tibble(MC1$nodes) %>% 
  select(id,type,country)
Show the code
MC1_edges <- as_tibble(MC1$links) %>% 
  select(source,target,type,weight,key)  
# can exclude dataste column as they all contain the same values.

Back to GAStech dataset

Show the code
GAStech_nodes <- read_csv("C:/yixin-neo/ISSS608-VAA/Hands-on_Ex/Hands-on_Ex08/data/GAStech_email_node.csv")
GAStech_edges <- read_csv('C:/yixin-neo/ISSS608-VAA/Hands-on_Ex/Hands-on_Ex08/data/GAStech_email_edge-v2.csv')