In-class_Ex05 (VAST Challenge data)
Edge data should be organised as follows (dplyr methods can be used to rearrange it):
First column: Source id (FK to Node second column) - compulsory
Second column: Target id (FK to Node second column) - compulsory
Node data
First column: ID - compulsory
Second column: Label (only needed if the IDs are all integers). It contains all the distinct values of source and target in the edge data. Every value present in the edge data must exist in the labels of the node data; none may be missing from the node data.
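The rule that every source and target value must appear in the node data can be checked before building a graph. The sketch below is only an illustration: the toy tibbles and the column names `source`, `target`, `id` and `label` are assumptions made up for this example, not the actual exercise data.

```r
library(dplyr)
library(tibble)

# Hypothetical toy data; names and values are illustrative only
edges <- tibble(
  source = c("A", "B", "C"),
  target = c("B", "C", "D"),
  weight = c(1, 2, 1)
)
nodes <- tibble(
  id    = 1:3,
  label = c("A", "B", "C")
)

# Distinct values that appear as source or target in the edge data
edge_labels <- tibble(label = unique(c(edges$source, edges$target)))

# Values used in the edge data but missing from the node data
missing_nodes <- anti_join(edge_labels, nodes, by = "label")
missing_nodes
```

Here "D" is used as a target but has no row in the node data, so it is the only value left in `missing_nodes`. An empty result would mean the node data covers the edge data.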
Try not to use R's built-in NA/NULL values. Manually type "unknown" or "missing" as a value instead.
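One possible way to apply this is to replace NA with an explicit text value using dplyr. This is a minimal sketch on a made-up node table; the column names `id` and `type` are assumptions for illustration.

```r
library(dplyr)
library(tibble)

# Hypothetical node table with a missing value in `type`
nodes <- tibble(
  id   = c("n1", "n2", "n3"),
  type = c("person", NA, "company")
)

# coalesce() keeps the existing value and fills NA with "unknown"
nodes_clean <- nodes %>%
  mutate(type = coalesce(type, "unknown"))
nodes_clean
```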
In today's in-class exercise:
Import libraries
The new libraries used today are:
jsonlite, to import JSON files
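As a rough sketch of the import step, `fromJSON()` from jsonlite reads JSON from a file path, a URL, or a JSON string. The JSON text below is a made-up stand-in so the example runs without the actual data file; its field names are assumptions, not the real dataset schema.

```r
library(jsonlite)

# Hypothetical JSON text standing in for the contents of a data file
json_text <- '{
  "nodes": [{"id": "A", "type": "person"}, {"id": "B", "type": "company"}],
  "links": [{"type": "ownership", "weight": 0.9, "source": "A", "target": "B"}]
}'

# With a real file, pass its path instead of the JSON string
graph_data <- fromJSON(json_text)

# The result is a list; arrays of records are simplified to data frames
names(graph_data)
```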
Problem with the links dataset:
The Source and Target columns are at the back instead of being the first 2 columns.
Pull out the nodes and edge data and save them as tibble data frames.
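A minimal sketch of this step, assuming the imported list has elements named `nodes` and `links` with `source` and `target` columns (the inline JSON is made up for illustration). It saves both tables as tibbles and moves `source` and `target` to the front of the edge data.

```r
library(jsonlite)
library(dplyr)
library(tibble)

# Hypothetical JSON text; in practice this would come from the data file
json_text <- '{
  "nodes": [{"id": "A", "type": "person"}, {"id": "B", "type": "company"}],
  "links": [{"type": "ownership", "weight": 0.9, "source": "A", "target": "B"}]
}'
graph_data <- fromJSON(json_text)

# Pull out the node data as a tibble
nodes <- as_tibble(graph_data$nodes)

# Pull out the edge data and put source and target in the first 2 columns
edges <- as_tibble(graph_data$links) %>%
  select(source, target, everything())

names(edges)
```

Using `everything()` keeps all remaining columns in their original order after the two that were moved.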
Back to GAStech dataset