This thesis summarizes the work I completed during my master's studies at UCLA. We ranked 38th among all participants in the KDD Cup 2021 challenge on large-scale graph machine learning. We built a two-stage model in PyTorch, making the most of the UniMP and Correct and Smooth (C\&S) architectures. We studied a social network graph with 121 million nodes and 153 categories, achieving a node classification accuracy of 65\%.
The second part of the thesis describes a mini-batch, attention-based graph machine learning model that we developed. We first learn a dense self-attention matrix from the graph node features and overlay it on the original adjacency matrix. The model achieves a test accuracy of $69.00 \pm 0.28\%$ on the Arxiv dataset, roughly matching Cluster-GCN, but it has the potential to outperform it, especially when the graph node features are rich and informative. A deeper GCN may yield further interesting results.
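The attention-overlay idea can be sketched as follows. This is a minimal illustrative sketch, not the thesis's actual implementation: the variable names (`Wq`, `Wk`, `alpha`), the single-head attention, the toy ring adjacency, and the convex-combination overlay are all assumptions made for the example.

```python
# Hypothetical sketch: learn a dense self-attention matrix from node
# features, then overlay it on the (row-normalized) adjacency matrix.
import torch
import torch.nn.functional as F

torch.manual_seed(0)

num_nodes, feature_dim = 5, 8
X = torch.randn(num_nodes, feature_dim)  # toy node feature matrix

# Toy adjacency: a directed ring plus self-loops (assumption for the demo)
A = torch.eye(num_nodes) + torch.roll(torch.eye(num_nodes), 1, dims=1)

# Dense self-attention computed from node features (single head, no mask)
Wq = torch.nn.Linear(feature_dim, feature_dim, bias=False)
Wk = torch.nn.Linear(feature_dim, feature_dim, bias=False)
scores = Wq(X) @ Wk(X).T / feature_dim ** 0.5
attn = F.softmax(scores, dim=-1)  # each row sums to 1

# Overlay: convex combination of learned attention and normalized adjacency
alpha = 0.5  # mixing weight, an assumed hyperparameter
A_norm = A / A.sum(dim=-1, keepdim=True)
mixed = alpha * attn + (1 - alpha) * A_norm  # rows still sum to 1

H = mixed @ X  # one feature-propagation step with the mixed operator
print(mixed.shape, H.shape)
```

Because both the attention matrix and the normalized adjacency are row-stochastic, their convex combination is too, so the mixed operator behaves like a standard message-passing step that can also route information along feature-similarity edges absent from the original graph.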