Extracting Protein-Protein Interactions (PPIs) from Biomedical Literature using Attention-based Relational Context Information

Because protein-protein interactions (PPIs) are crucial to understanding living systems, harvesting these data is essential to probe disease development and discern gene/protein functions and biological processes. Some curated datasets contain PPI data derived from the literature and other sources (e.g., IntAct, BioGrid, DIP, and HPRD). However, they are far from exhaustive, and their maintenance is a labor-intensive process. On the other hand, machine learning methods to automate PPI knowledge extraction from the scientific literature have been limited by a shortage of appropriate annotated data. This work presents a unified, multi-source PPI corpus with vetted interaction definitions augmented by binary interaction type labels and a Transformer-based deep learning method that exploits entities’ relational context information for relation representation to improve relation classification performance. The model’s performance is evaluated on four widely studied biomedical relation extraction datasets, as well as this work’s target PPI datasets, to observe the effectiveness of the representation for relation extraction tasks across varied data. Results show the model outperforms prior state-of-the-art models. The code and data are available at: https://github.com/BNLNLP/PPI-Relation-Extraction


I. INTRODUCTION
Much effort in modern molecular biology either involves or is entirely focused on learning and understanding the functions and interactions of the millions of proteins that compose the basic building blocks of life. In particular, the prediction of protein structure and function has been recognized as a paramount step in major life science problems, such as developing therapeutic approaches for diseases, where it can improve healthcare by accelerating drug discovery and development. The functions of most proteins are currently unknown, with only a small fraction definitively established after extensive and labor-intensive lab work. These gold-standard protein function assignments have been extended computationally via DNA and amino acid sequence homology throughout the ever-expanding collection of protein sequences determined from genome sequencing. However, inference from homology is often inaccurate. Helpfully, clues about function can come from other sources, including interactions with proteins for which the function is known. While experiments that definitively determine interactions can be labor-intensive, several relatively high-throughput methods are in use, such as two-hybrid screening [1] and affinity purification followed by mass spectrometry [2]. Numerous databases, such as IntAct, STRING, DIP, BioGrid, HPRD, and MINT, are now dedicated to collecting and curating protein-protein interaction (PPI) results obtained using various techniques and from the scientific literature. Unfortunately, mining the literature requires manual effort and is slow. To remedy this, we aim to develop a machine learning (ML) model that effectively identifies statements of PPIs in scientific text.
Efforts to fully automate text knowledge extraction are widespread and ongoing, with supervised learning approaches currently the most favored. A key challenge in applying these methods to PPI extraction is a shortage of training data specifically annotated for this purpose. Several publicly available PPI training datasets suffer from biases of restricted biological focus (i.e., human-, medical-, or microbial-only) and also differ in what they define as an interaction. For this work, we combine these publicly available training sets, vet them for uniformity in interaction definition, and add interaction type labels. We also propose Transformer architecture-based models [3], which leverage entities' relational context information to build a relation representation that improves relation classification performance.
As detailed in this paper, our contribution is twofold: 1) We augment public PPI corpora with labels for interaction types (enzyme and structural), which further delineate the functional role of proteins and consequently afford a helpful protein classification for the biology community. We also provide the interaction-typed PPI corpora for the community. 2) We present a Transformer-based relation prediction method that exploits entities' relational context information to build an improved relation representation. Our study shows the effectiveness of the proposed approach not only on the PPI datasets, but also on four biomedical relation extraction datasets.

II. RELATED WORK
There have been ongoing efforts to consolidate biological knowledge pertinent to PPIs from literature by creating machine-processable data and designing protein relation extraction methods.

A. PPI corpora
BioCreative VI [4] proposed a PPI relation extraction challenge task related to genetic mutations to foster the development of methods for mining PPI information from biomedical literature. Bunescu et al. [5] annotated 1000 titles and abstracts from the MEDLINE repository that discuss human genes/proteins, the so-called AIMed corpus, which includes roughly 5000 protein names and 1000 protein interactions. Pyysalo et al. [6] created BioInfer (Bio Information Extraction Resource), containing 1100 sentences with named entities and their relationships tagged from abstracts of biomedical research articles. Fundel, Küffner, and Zimmer [7] tagged the sentences of 50 abstracts referenced by the Human Protein Reference Database (HPRD) with direct physical interactions, regulatory relations, and modifications between genes/proteins. The IEPA (Interaction Extraction Performance Assessment) corpus [8] was created to conduct a comparative study on the merits of different text processing units for interactions between biochemical entities. The Learning Language in Logic Workshop (LLL05) [9] designed the genic interaction extraction challenge task, which aims to promote the extraction of protein/gene interaction information from biology abstracts in the MEDLINE bibliography database. The LLL challenge focused on gene interactions in Bacillus subtilis, a model bacterium for which many papers have been published about direct gene interactions involved in sporulation.
Although the number of corpora and methods for PPI information extraction from biomedical text has increased as the interest in automatic mining systems has grown, the lack of consensus with respect to PPI annotation has hindered consolidation of heterogeneous datasets, thereby making it difficult for researchers to properly evaluate their methods on a standardized dataset for PPI extraction. Pyysalo et al. [10] conducted a comparative analysis of the five PPI datasets (AIMed, BioInfer, HPRD50, IEPA, and LLL) and unified the PPI annotations to share with the community for clear and comparative method evaluation. To merge these diverse datasets, Pyysalo et al. [10] found common categories across the five corpora and generated a unified PPI corpus composed of sentences tagged with undirected and untyped binary interactions (i.e., positive and negative). These unified versions of PPI datasets, hereafter called the five benchmark PPI corpora, have been widely used to evaluate various approaches on PPI extraction tasks [11]-[13]. In the biological literature, single sentences often discuss more than two proteins, and such statements are not all declarations of interactions between the proteins mentioned. These datasets include all identified protein/gene entity names found within each training sentence, as well as a pairwise evaluation of positive/negative interactions between each possible pairing.
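The pairwise scheme described above can be illustrated with a minimal sketch. It assumes that each sentence arrives with a list of entity names and a list of annotated positive pairs; the function name, the data layout, and the placeholder names P1 to P3 are hypothetical and are not taken from the corpora's actual file format.

```python
from itertools import combinations


def candidate_pairs(entities, positive_pairs):
    """Label every unordered entity pair of one sentence as positive or negative."""
    # Interactions are undirected, so each annotated pair is stored as a frozenset.
    positives = {frozenset(pair) for pair in positive_pairs}
    labeled = []
    for a, b in combinations(entities, 2):
        label = "positive" if frozenset((a, b)) in positives else "negative"
        labeled.append((a, b, label))
    return labeled


# Three placeholder proteins in one sentence, one annotated interaction.
example = candidate_pairs(["P1", "P2", "P3"], [("P2", "P1")])
```

A sentence with n entities yields n(n-1)/2 candidate pairs, which is why sentences mentioning many proteins contribute mostly negative examples. Self-interactions (a protein with itself) are not enumerated by this sketch.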
However, some issues remain regarding the content and annotations in these benchmark PPI datasets (detailed in Section III-A). In this paper, we present an augmented, refined version of the five benchmark PPI corpora along with the BioCreative VI corpus that further divides positive interactions into two types: enzyme and structural. These interaction types are desirable for constructing protein interaction networks.

B. PPI extraction methods
In the early stages of adopting ML approaches for the PPI extraction task, feature- and kernel-based approaches were commonly used [12], [14]. In an attempt to capture syntactic and semantic information of sentences, Murugesan, Abdulkadhar, and Natarajan [15] developed a Distributed Smoothed Tree kernel (DSTK) composed of distributed lexical parse trees and semantic feature vectors and demonstrated, with model evaluation on the five benchmark PPI corpora, that shallow linguistic information helps enhance PPI extraction capability.
With the recent success of deep learning in a number of applications, deep neural network models have emerged to tackle the PPI extraction task. Peng and Lu [16] demonstrated, on the PPI task using the AIMed and BioInfer corpora, that their multichannel dependency-based convolutional neural network model (McDepCNN) effectively captures syntactic features of sentences by adding a separate channel for the dependency information of the sentence syntactic structure. Attention mechanisms in natural language processing (NLP) have shed some light on solving long dependency issues between tokens in sequential data. The self-attention-based Transformer architecture [3] has proven to preserve long-term dependencies well and establish effective contextual representations. NLP models built upon the Transformer architecture, such as BERT [17], have achieved state-of-the-art (SOTA) results in various NLP tasks, including in biology domains [18]. Warikoo, Chang, and Hsu [13] proposed a Lexically aware BERT model (LBERT) that generates representations emphasizing syntactic context for sentence-level bio-entity relation extraction tasks, taking n-gram parts-of-speech frames as an additional input embedding to deliver latent lexical properties, and the model outperformed the prior models on a PPI task with the five benchmark PPI corpora. Recently, Tang et al. [19] built a PPI extraction model based on a domain-specifically pre-trained BERT and adversarial training, which showed significant improvement on the classification of the five benchmark PPI corpora.

III. ADDITIONAL PPI CURATION
This section details the further curation and enhancement of the aforementioned datasets.

A. Problems discovered during curation
In vetting the five benchmark PPI training corpora, we identified the following problems:
1) Bias due to restricted biological focus for each set: In particular, the AIMed and IEPA corpora are focused on human medical biochemistry and phenomena, including viral pathogens, whereas the LLL set is limited to a single bacterial species, Bacillus subtilis. These differences manifest in the skew and distribution of protein/gene name frequency counts between the five sets, as well as in other domain-specific terminology. In fact, the most frequently occurring protein in IEPA, insulin, accounts for 14% of the protein mentions in all of the IEPA positives, yet it does not occur in the AIMed positives set, where the most common protein, p53, accounts for only 1.75% of the protein names. These sets each sampled markedly different populations of the literature. Combining all sets together helps to counter this bias, but, in the future, we plan to collect more training data to better address this issue.
2) Differences in the definition of an interaction: The five sets largely restrict PPI-positive cases to clear statements of direct interaction between the two subjects. LLL further restricts positive PPI declarations to cases where a protein binds to DNA and causes or inhibits the transcription of the gene of another protein, or to statements of gene regulation, a markedly particular type of interaction.
We intentionally broaden our acceptance of a positive PPI indication. Our goal is to provide biologists with a tool to identify possible interactive connections between proteins directly from the scientific literature text. Because of the likelihood that claims of direct PPI will end up in future databases (if not there already), a less restrictive interpretation will allow a text mining system to report results of value that will not necessarily be found in a PPI database.
Along these lines, we did not distinguish between gene and protein for this work. In addition to direct binding between two proteins or a protein and itself (i.e., dimers and multimers), we also consider as interacting those cases where two proteins are bound to a larger complex of other proteins without necessarily contacting each other directly.
The following details an example (from the BioCreative corpus) where a direct connection between proteins PVA12 and ORP3a is made but is not declared an actual interaction.
The targeting of the oxysterol-binding protein ORP3a to the endoplasmic reticulum relies on the plant VAP33 homolog PVA12.
On the other hand, we are mindful of the possibility of being too broad, which would result in too many PPI calls to be meaningful.
3) Confusion over PPI-negative annotations: This expanded threshold for PPI-positive impacts the public negative annotations. The following are two example cases (from AIMed corpus) where we disagree with the given negative labels.
In addition to this unique pathway, FGFR3 also links to GRB2.
A negative interaction between proteins FGFR3 and GRB2 was declared in the public set.
After a brief historical incursion regarding renal artery stenosis (RAS) of renal origin, we present the main extrarenal angiotensin-forming enzymes, starting with isorenin, tonin, and D and G cathepsin and ending with the conversion enzyme and chymase.
In this case, negative interactions are annotated between angiotensin and each of isorenin, tonin, G cathepsin, and chymase, respectively, even though they are declared as forming angiotensin.
The following shows an example of a negative PPI sentence where we agree with the given label and which we have included in our curated set (from AIMed corpus).
The molar ratio of serum retinol-binding protein (RBP) to transthyretin (TTR) is not useful to assess vitamin A status during infection in hospitalized children.
To reduce confusion in our initial models regarding updated positive and negative relabels, we consider only those negatively labeled sentences in which no positive pairs were declared. Then, we manually examine each case to make sure we agree, disregarding (for now) those where we differ. For the same reason, in this work, we also disregard negative pair cases in sentences with both positive and negative annotations.
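The selection rule for negative examples can be sketched as follows. This is a minimal illustration under an assumed data layout (a list of sentence dictionaries, each with an `id` and a list of labeled `pairs`); the field names and the `needs_review` flag are hypothetical, and the manual examination step itself is not modeled.

```python
def curate_pairs(sentences):
    """Keep positive pairs, and negative pairs only from all-negative sentences."""
    curated = []
    for sentence in sentences:
        pairs = sentence["pairs"]
        has_positive = any(pair["label"] == "positive" for pair in pairs)
        for pair in pairs:
            if pair["label"] == "positive":
                curated.append({**pair, "sentence_id": sentence["id"], "needs_review": False})
            elif not has_positive:
                # Negatives from all-negative sentences are candidates that still
                # require manual examination before they enter the curated set.
                curated.append({**pair, "sentence_id": sentence["id"], "needs_review": True})
            # Negative pairs in sentences that also contain a positive are dropped.
    return curated


sentences = [
    {"id": "s1", "pairs": [{"e1": "A", "e2": "B", "label": "positive"},
                           {"e1": "A", "e2": "C", "label": "negative"}]},
    {"id": "s2", "pairs": [{"e1": "D", "e2": "E", "label": "negative"}]},
]
curated = curate_pairs(sentences)
```

In this toy input, the negative pair of sentence s1 is discarded because s1 also declares a positive pair, while the negative pair of s2 is kept and flagged for manual review.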

B. Interaction Type Annotation
Knowledge of PPIs aids biological engineering. Notably, knowledge of structure and protein subunit complexes is critical to protein engineering, and knowledge of transient interactions (e.g., chaperone to client protein) is needed for engineering at a broader scale. To make the public PPI corpora more useful for this purpose, we have added interaction type labels for the positively defined pairs in the unified datasets and the BioCreative set. In determining the interaction type labels, we first considered top-level protein function categories from IntAct's molecular interaction ontology but discovered we lacked enough training examples to provide sufficient statistics in each of the 28 categories to properly train a model (not all interaction types occur with equal frequency). We then tried to reduce the number of categories by making them coarser, first lowering to roughly 10 and then to three types. However, we found that making assignments in this manner proved too complicated and of only questionable scientific value.
We finally decided on a simple binary classification with interactions being declared either enzyme or structural for our first pass because enzyme or structural accurately delineates the functional role of almost all proteins and consequently provides a concise but meaningful protein classification. The structural label is applied to protein assemblages of large, permanent cellular components, such as cell walls, histones, Golgi apparatus, microtubules, membranes, and inter-cellular structures. All other interactions are classified as enzyme. Type is determined by examining the given function for each protein/gene, where it can be obtained from any of several online protein databases, such as Uniprot, NCBI, and GeneCards, and from the sentence context itself. For the five sentence-based datasets, interaction type labels are applied for positively identified protein pairs. An example of a structural interaction label for the proteins alpha-syntrophin and utrophin (from BioInfer corpus) follows: Absence of alpha-syntrophin leads to structurally aberrant neuromuscular synapses deficient in utrophin.
The remaining non-structural interactions are considered enzymatic, a label applied to nominal enzyme activity (proteins that catalyze chemical reactions of metabolites in reaction pathways) and proteins that activate other proteins (kinases). In this work, we also applied this label to all proteins that activate, inhibit, signal, or form temporary complexes with other proteins, as well as those that bind to DNA to regulate gene expression, chaperones that help proteins fold, and those that destroy proteins (proteases). The following is an example of an enzyme-labeled PPI between JAK2 and Raf-1 (from AIMed corpus): The cytokine-activated tyrosine kinase JAK2 activates Raf-1 in a p21ras-dependent manner.
This process of adding type labels proved to be the most difficult and labor-intensive aspect of the training data curation, with thousands of gene names and symbols that required external lookups in addition to an equally large host of specialized biological jargon and acronyms (chemical names, cell lines, experimental conditions, etc.) that required research to differentiate from proteins and establish the context necessary for understanding each sentence. Importantly, because this annotation effort is informed by resources and knowledge external to the text in question, it encodes specialized domain knowledge that makes the PPI type classification task more challenging, increasing pressure on ML models to capture context informative enough to make a class determination.
Appendix A shows the annotation process. Two domain experts performed the PPI annotation and reached a high inter-annotator agreement, as seen in Appendix B. The definition of an interaction and the annotation rules were carefully determined ahead of time, according to domain expertise. Some of the rules are shown in Appendix C, and the complete rules can be found in our GitHub repository.

IV. METHODOLOGY
We have adopted a Transformer-based approach for the PPI classification task. In particular, we improve the relation representation by exploiting the relational context information of an entity pair.

A. Relation Representation augmented with Attention-based Context Information
In a relation classification task, the [CLS] token, a special classification token in BERT employed to capture the overall information of an input sequence, is frequently used as the relation representation. Another popular method is the entity mention pooling approach, which concatenates a pair of max-pooled entity embeddings from the last hidden state of BERT. To explicitly indicate target tokens for a relation, entity markers can be used in the input; these are additional special input tokens indicating which tokens need focus for relation learning. Soares, Fitzgerald, Ling, and Kwiatkowski [20] conducted a comparative study between marker-free and marker-embedded representations, showing that the marker-embedded approach outperforms marker-free representations on several supervised relation extraction tasks. Specifically, the concatenation of the entity start markers achieves the best performance.
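To make the marker-embedded input concrete, the following minimal sketch wraps two entity spans of a token list in marker tokens. The start markers [E1] and [E2] are the ones named in this paper; the closing markers [/E1] and [/E2], the function name, and the span convention (start inclusive, end exclusive, non-overlapping) are assumptions for illustration only.

```python
def add_entity_markers(tokens, span1, span2):
    """Wrap two non-overlapping token spans (start, end-exclusive) in entity markers."""
    opens = {span1[0]: "[E1]", span2[0]: "[E2]"}
    closes = {span1[1]: "[/E1]", span2[1]: "[/E2]"}
    marked = []
    # Walk one position past the last token so a span ending there is closed.
    for i in range(len(tokens) + 1):
        if i in closes:
            marked.append(closes[i])
        if i in opens:
            marked.append(opens[i])
        if i < len(tokens):
            marked.append(tokens[i])
    return marked


marked = add_entity_markers(["JAK2", "activates", "Raf-1"], (0, 1), (2, 3))
# Positions of the entity start markers, whose embeddings can represent the pair.
start_positions = (marked.index("[E1]"), marked.index("[E2]"))
```

With the entity start marker representation, the embeddings at `start_positions` stand in for the two entities; with mention pooling, the embeddings of the tokens between each marker pair would be max-pooled instead.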
We additionally improve the relation representation built upon a pair of entities or entity start markers by adding relational context information of entities. The rationale is that additional tokens for relational context can serve a crucial role in determining the relation of the entities. For instance, the word activates in "A activates B" and Interaction in "Interaction between A and B" are important clues for the effector-effectee relation. To find the most relevant tokens for relation information, we leverage entity tokens' attention probabilities generated in the last hidden layer in BERT. We sum two entities' attention probabilities and retrieve additional tokens by the probability scores. The retrieved tokens are max-pooled then added to the final relation representation:
$$rc = \mathrm{top}_N\left(\sum_{h=1}^{H}\left(P^{h}_{attn}(e_1) + P^{h}_{attn}(e_2)\right)\right), \qquad x_r = e_{r_1} \oplus \mathrm{maxpool}(rc) \oplus e_{r_2},$$
where $P_{attn}$ denotes attention probabilities of a token, and $H$ is the number of heads in the model. $rc$ stands for relation context, and $N$ is the number of tokens to be attentive. $N$ also is a hyper-parameter and is set prior to model training. In this study, $N$ is set as 20% of an input length, which was empirically determined using validation sets of biomedical relation extraction benchmarks (see Appendix D). $x_r$ is the final relation representation for a classifier, which is the linking of entity embeddings ($e_{r_1}$, $e_{r_2}$) (mention pooling or entity start marker) and a max-pooled relation context embedding. When selecting tokens for relation context, we only account for alphanumerical tokens and exclude entity tokens and special tokens (besides entity markers). If a token is a part of a word (tokens with "##"), the entire word is included. Figure 1 illustrates the construction of a relation representation for a sentence with entity start markers, and the mention pooling approach is depicted in Appendix E.
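The selection and pooling steps can be sketched in plain Python on toy inputs. This is an illustrative sketch, not the released implementation: the function names and the nested-list layout of the attention probabilities are hypothetical, N is computed from the full input length, and the combination of the three embeddings is taken to be concatenation, which is an assumption (the figure captions describe ⊕ as element-wise addition).

```python
def select_relation_context(attn, e1, e2, candidates, ratio=0.2):
    """Return the candidate positions the two entities attend to most.

    attn[h][i][j] is the attention probability from token i to token j in
    head h of the last layer; e1 and e2 are the entity (or start marker)
    positions; candidates are the positions eligible as context.
    """
    n_tokens = len(attn[0])
    top_n = max(1, int(ratio * n_tokens))  # N as a share of the input length
    # Sum both entities' attention to each candidate over all heads.
    scores = {j: sum(head[e1][j] + head[e2][j] for head in attn) for j in candidates}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return sorted(ranked[:top_n])


def max_pool(vectors):
    """Element-wise maximum over a non-empty list of equal-length vectors."""
    return [max(column) for column in zip(*vectors)]


def relation_representation(hidden, e1, e2, context_positions):
    """Combine e1, the pooled context, and e2 (concatenation is assumed here)."""
    context = max_pool([hidden[j] for j in context_positions])
    return hidden[e1] + context + hidden[e2]


# Toy input with made-up numbers: one head, five tokens, 2-dimensional embeddings.
attn = [[
    [0.1, 0.2, 0.6, 0.1, 0.0],
    [0.2, 0.2, 0.2, 0.2, 0.2],
    [0.2, 0.2, 0.2, 0.2, 0.2],
    [0.2, 0.2, 0.2, 0.2, 0.2],
    [0.0, 0.1, 0.5, 0.3, 0.1],
]]
hidden = [[1.0, 0.0], [0.0, 2.0], [3.0, -1.0], [0.5, 0.5], [0.0, 1.0]]
context = select_relation_context(attn, e1=0, e2=4, candidates=[1, 2, 3])
x_r = relation_representation(hidden, 0, 4, context)
```

In the toy input, both entities (positions 0 and 4) attend most strongly to position 2, so that token alone forms the relation context when 20% of a five-token input is kept.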

B. Model Architecture
Our Transformer-based relation extraction model performs a sequence classification task using a logistic regression with softmax to determine the probability of relation class (e.g., c ∈ {enzyme, structural, negative}) as follows:
$$p(c \mid X) = \mathrm{softmax}(W x_r + b)_c,$$
where $X$ and $x_r$ denote examples and relation representations, respectively, and $W$ and $b$ are the weights and bias of the classification layer. The model parameters are optimized using a categorical cross entropy:
$$\mathcal{L} = -\sum_{X} \sum_{c} \delta(X, c) \log p(c \mid X),$$
where δ(X, c) indicates whether c is the correct class of X (δ(X, c) = 1) or not (= 0). Algorithm 1 illustrates the model training procedure.
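As a numerical illustration of the classification step, the sketch below applies a linear layer followed by softmax to a relation representation and evaluates the cross-entropy loss for one example. The weight matrix, the bias vector, and the function names are hypothetical placeholders; only the use of softmax and categorical cross entropy comes from the text.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def class_probabilities(x_r, weights, bias):
    """Linear layer followed by softmax; weights has one row per class."""
    logits = [sum(w * x for w, x in zip(row, x_r)) + b for row, b in zip(weights, bias)]
    return softmax(logits)


def cross_entropy(probs, gold_index):
    """Categorical cross entropy for one example with a single gold class."""
    return -math.log(probs[gold_index])


# Three classes (e.g., enzyme, structural, negative) and made-up parameters.
x_r = [1.0, 2.0]
weights = [[0.5, 0.5], [0.0, 0.0], [-0.5, 0.0]]
bias = [0.0, 0.0, 0.0]
probs = class_probabilities(x_r, weights, bias)
loss = cross_entropy(probs, gold_index=0)
```

Because only the gold class has a nonzero indicator, the loss for one example reduces to the negative log probability assigned to that class; averaging or summing over a mini-batch gives the training objective.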

V. EXPERIMENTAL SETUP
We first demonstrate the effectiveness of the proposed approach on four well-known relation extraction benchmark datasets in the biomedical domain. Then, the method is evaluated on the five PPI benchmark corpora and our PPI corpus with interaction types by comparing the performance with SOTA models.

A. Datasets
In this study, we use four biomedical relation extraction (RE) datasets: ChemProt [21], DDI [22], GAD [23], and EU-ADR [24]. There are various versions of the ChemProt, DDI, and GAD datasets. Here, we adopt the recent and widely used benchmark versions from the Biomedical Language Understanding and Reasoning Benchmark (BLURB) provided by [25]. We also use the EU-ADR data used in BioBERT [26]. The ChemProt, DDI, and GAD datasets each consist of train/validation/test sets, while EU-ADR contains 10-fold sets for cross validation. In all of the data, target entities are anonymized with predefined tags, including @GENE$, @CHEMICAL$, @DRUG$, and @DISEASE$. In ChemProt and DDI, additional tags, @CHEM-GENE$ and @DRUG-DRUG$, are used for overlapping entities. When entity markers are used, @CHEM-GENE$ and @DRUG-DRUG$ are surrounded by the [E1-E2] tag. Descriptions of each dataset follow, and Table I displays the number of data samples.
3) GAD (The Genetic Association Database corpus) contains a set of gene-disease binary associations, which was semi-automatically collected from PubMed abstracts.
4) EU-ADR features a list of binary associations between drugs, diseases, genes, and proteins annotated on Medline abstracts.
The five PPI benchmark corpora include AIMed [5], BioInfer [6], HPRD50 [7], IEPA [8], and LLL [9]. We adopt the unified version of the PPI benchmark datasets provided by [10] that has been used in the SOTA models. In the datasets, the PPI relations are tagged as either positive or negative. The corpus statistics are described in Table II.
Our PPI annotations with interaction types (enzyme, structural, or negative) are the expanded version of the five benchmark corpora and the BioCreative VI protein interaction dataset [4]. Table III displays the corpora statistics. The annotation work in all corpora has been carried out within sentence boundaries, as in the five PPI benchmark corpora.

A. Evaluation on biomedical RE datasets
We use the BioBERT large-cased model for ChemProt, the PubMedBERT-uncased-fulltext model for DDI and GAD, and the BioBERT base-cased model for EU-ADR. We compare our model's performance with the SOTA results, including KeBioLM [27] for ChemProt and GAD, PubMedBERT [25] for DDI, and BioBERT [26] (the PyTorch version, footnote 8, as our model was built on PyTorch) for EU-ADR. KeBioLM and PubMedBERT use the combinations of entity mentions, and BioBERT uses the [CLS] token for relation classification. We measure the performance by the same metrics used in the SOTA systems. The results demonstrate that our proposed representation of the entity mention augmented with the relation context achieved SOTA results for ChemProt, GAD, and EU-ADR, while the combination of entity start markers with the relation context produced comparable performance for DDI (shown in Table IV). The relation context improves the predictions in all cases. Notably, its significance is clearly shown in EU-ADR, where we replicated the result obtained in the SOTA model ([CLS]: 85.1 F1 score) using the same model, input (without markers), and representation, and then added the relation context to the mention pooling, which produced a superior result over the [CLS] token.

B. Evaluation on PPI datasets
We adopt BioBERT for the evaluation on the PPI data because it achieved greater performance improvements in recent PPI extraction works [13], [19]. To compare the performance of the proposed approach with SOTA works, we evaluate our model using 10-fold cross-validation (CV) and a micro F1 performance metric, as adopted in the SOTA models. Table V displays the evaluation results on the five benchmark PPI corpora, showing our models produce the best performances and outperform the SOTA models on the overall classification, as reflected in the average F1 scores. Unlike the entity anonymized inputs, the inputs with entity markers perform better than the original inputs across all data, while using the [CLS] token in the original input performs the worst. This finding also has been observed in earlier works [20], [25], implying the significance of explicit indication for target entities, such as markers or entity anonymization, with its type. The relation context consistently improves the performances, although a slight degradation occurred for the combination with entity mention in the LLL data, and the representation of entity start markers augmented with relation context achieves the best predictions. (Footnote 8: https://github.com/dmis-lab/biobert-pytorch)
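The evaluation protocol (10-fold CV scored with micro F1) can be sketched as below. Several details are assumptions made for illustration: folds are formed by a simple interleaved split, predictions are pooled over all held-out folds before scoring, and `train_and_predict` is a hypothetical callback standing in for model training and inference. How the compared systems aggregate scores across folds is not specified here.

```python
def k_fold_indices(n_samples, k=10):
    """Split sample indices into k interleaved folds."""
    return [list(range(i, n_samples, k)) for i in range(k)]


def micro_f1(gold, pred, target_labels):
    """Micro-averaged F1 over the target labels, from pooled counts."""
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        if g == p:
            if g in target_labels:
                tp += 1
        else:
            if p in target_labels:
                fp += 1
            if g in target_labels:
                fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def cross_validated_micro_f1(samples, labels, train_and_predict, target_labels, k=10):
    """Train on k-1 folds, predict the held-out fold, and score pooled predictions."""
    gold_all, pred_all = [], []
    for held_out in k_fold_indices(len(samples), k):
        held = set(held_out)
        train_idx = [i for i in range(len(samples)) if i not in held]
        predictions = train_and_predict(
            [samples[i] for i in train_idx],
            [labels[i] for i in train_idx],
            [samples[i] for i in held_out],
        )
        gold_all.extend(labels[i] for i in held_out)
        pred_all.extend(predictions)
    return micro_f1(gold_all, pred_all, target_labels)
```

For the binary PPI task, passing only the positive label as `target_labels` scores the positive class; for the typed corpora, the label set of interest can be widened accordingly.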
In addition, we examine the model's ability on our PPI corpora with interaction types. In this experiment, we combine the six corpora, some of which contain only a single class or highly skewed samples, so the model can be trained on more balanced data. The model evaluation is also carried out in a 10-fold CV manner.

VII. CONCLUSION
In this work, we have augmented existing PPI corpora annotated with interaction types, which is expected to be beneficial for extracting more PPI information from scientific publications. We also have presented a Transformer architecture-based model for relation extraction. Specifically, we have improved a relation representation by adding relational context information based on entities' attention probabilities. Our models outperform SOTA models and offer proof about the effectiveness of additional relational context embedding on the biomedical relation extraction benchmarks and PPI corpora.
We will continue to improve our PPI annotations by resolving identified problems, including debiasing the training data. More examples are needed from across biological subject areas (plants, environmental, microbiomes, etc.). Our goal is to provide a tool that works across all subfields of biology. Granularity in type classifications also needs to be increased, which will require more training data and manual annotation. Finally, statements of interaction that span two (or more) sentences also will require added attention in the future.

APPENDIX A ANNOTATION PROCESS DIAGRAM

APPENDIX B INTER-ANNOTATOR AGREEMENT
We measured the inter-annotator agreement scores to observe the discrepancy between the annotators in the PPI relation types. The annotated data statistics can be found in Table III. As seen in Table VII, the two annotators achieved a high inter-annotator agreement.

APPENDIX C
2) Histones and nucleosomes are not considered structural because their "structure" is mutable and controls regulation.
3) Proteins/Genes ending in -ase are pre-identified as enzymes.
4) Proteins/Genes containing inhibitor, activator, transcription factor, repressor, enhancer, or regulator are pre-identified as enzymes.
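The name-based pre-identification rules lend themselves to a small lookup function, sketched below under stated assumptions. It covers the -ase and keyword rules for enzymes and the rule that names ending in -in or -ins are pre-identified as structural, with toxin, beta-catenin, and calreticulin as exceptions. The function name, the order in which the rules are tried, and the choice to return None for exceptions and unmatched names (leaving them to the annotator) are illustrative choices, not part of the published rules.

```python
ENZYME_KEYWORDS = (
    "inhibitor", "activator", "transcription factor",
    "repressor", "enhancer", "regulator",
)
STRUCTURAL_EXCEPTIONS = ("toxin", "beta-catenin", "calreticulin")


def preidentify(name):
    """Suggest 'enzyme', 'structural', or None (undecided) from a protein/gene name."""
    lowered = name.lower().strip()
    if lowered.endswith("ase"):
        return "enzyme"
    if any(keyword in lowered for keyword in ENZYME_KEYWORDS):
        return "enzyme"
    if lowered.endswith(("in", "ins")):
        # Listed exceptions are left to the annotator rather than auto-labeled.
        if any(lowered.endswith(exception) for exception in STRUCTURAL_EXCEPTIONS):
            return None
        return "structural"
    return None


suggestions = {name: preidentify(name) for name in
               ["tyrosine kinase", "actin", "Beta-catenin", "trypsin inhibitor", "JAK2"]}
```

Such suggestions are only a starting point: the final type is still determined from database descriptions of each protein's function and from the sentence context.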

APPENDIX D EVALUATION ON DIFFERENT RELATION CONTEXT SIZES
To find an appropriate size of attentive context of target entities, we evaluated different sizes of relation context using the biomedical relation extraction benchmark datasets: ChemProt, DDI, GAD, and EU-ADR. We used 10%, 20%, and 30% of a sequence length as the number of attentive tokens of target entities and compared them on the respective validation set of the datasets. When selecting tokens for relation context, we only account for the alphanumerical tokens and exclude entity tokens and special tokens (e.g., [CLS], [SEP]), besides entity markers. Because the EU-ADR is a 10-fold cross validation set, we split the training set in each fold in a 9:1 ratio, i.e., 90% of the data are used for training the model, while 10% are used for validating the model. Without using a test set, the average scores of cross validations on train/validation sets were measured. Table VIII shows the F1 scores of different sizes of relation context, and 20% of an input length (excluding tokens to be ignored) showed the best performances on both entity mention use and entity start marker use in representation.
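The token bookkeeping behind the context size can be sketched as follows. The helper names are hypothetical, the example tokenization is made up, and the treatment of entity markers is left out of the sketch; it only shows the alphanumeric filter, the exclusion of entity and special tokens, the expansion of word pieces (tokens starting with "##") to whole words, and the computation of N as a share of a length.

```python
SPECIAL_TOKENS = {"[CLS]", "[SEP]", "[PAD]"}


def context_size(n_tokens, ratio=0.2):
    """Number of context tokens N as a share of a length (at least one)."""
    return max(1, int(n_tokens * ratio))


def context_candidates(tokens, entity_positions):
    """Positions of alphanumeric tokens that are neither entity nor special tokens."""
    candidates = []
    for i, token in enumerate(tokens):
        if i in entity_positions or token in SPECIAL_TOKENS:
            continue
        piece = token[2:] if token.startswith("##") else token
        if piece.isalnum():
            candidates.append(i)
    return candidates


def expand_to_words(tokens, positions):
    """Extend each selected position to cover every piece of its word."""
    expanded = set()
    for i in positions:
        start = i
        while start > 0 and tokens[start].startswith("##"):
            start -= 1
        end = start + 1
        while end < len(tokens) and tokens[end].startswith("##"):
            end += 1
        expanded.update(range(start, end))
    return sorted(expanded)


# Made-up word-piece tokenization of "JAK2 activates Raf-1."
tokens = ["[CLS]", "JAK2", "act", "##ivates", "Raf", "-", "1", ".", "[SEP]"]
candidates = context_candidates(tokens, entity_positions={1, 4, 5, 6})
whole_word = expand_to_words(tokens, [3])
```

Comparing the 10%, 20%, and 30% settings then amounts to calling `context_size` with the corresponding ratio and scoring each setting on the validation data.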

APPENDIX E RELATION REPRESENTATION USING MENTION POOLING
Figure 2 illustrates the construction of a relation representation for a sentence using mention pooling. As in the entity start marker method, input sentences are tagged with entity markers. The rectangles and ovals represent the tokens' embeddings and attention probabilities, respectively.

Algorithm 1 Training a PPI model
Initialize: Load a pre-trained BERT model and set the max epoch and mini-batch size.
Output: Refined BERT model for PPI classification task using an attention-based relation representation.
1: Given relation extraction samples, define entity spans and add entity tags when using markers.
2: for s in S_relation do
3:   D ← define_entity_span_and_add_marker(s)
4: end for
5: while epoch to epoch_max do
6:   // b is a mini-batch.
7:   for b in D do
8:     for each (e1: entity 1, e2: entity 2) ∈ b do
9:       Generate attention-based relation representations.
10:      R ← e1_emb ⊕ relation_context ⊕ e2_emb
15:    L = CrossEntropyLoss(logits, labels)
16:    Compute gradient and update parameters.
17:  end for
19: end while

Fig. 1. The relation representation consists of entity start markers and the max-pooled relational context, which is a series of tokens chosen by attention probability of the entities. The relation representation based on mention pooling is depicted in Appendix E. ⊕ denotes element-wise addition. The example sentence is Absence of alpha-syntrophin leads to structurally aberrant neuromuscular synapses deficient in utrophin. (Source: BioInfer corpus).

Fig. 2. The relation representation consists of the two max-pooled contextualized entity embeddings and the max-pooled relational context, which is a series of tokens chosen by attention probability of the entities. ⊕ denotes element-wise addition. The example sentence is Absence of alpha-syntrophin leads to structurally aberrant neuromuscular synapses deficient in utrophin. (Source: BioInfer corpus).

TABLE I STATISTICS OF BIOMEDICAL RELATION EXTRACTION DATASETS. EU-ADR CONSISTS OF 10-FOLD SETS FOR CROSS VALIDATION.

TABLE II FIVE PPI BENCHMARK FOR positive AND negative CLASSES.

TABLE III INTERACTION TYPED PPI CORPORA FOR enzyme, structural, AND negative CLASSES. † ANNOTATIONS USING THE PPI DATA FROM BIOCREATIVE VI TRACK 4: MINING PROTEIN INTERACTIONS AND MUTATIONS FOR PRECISION MEDICINE (PM). THE SIGNIFICANT REDUCTION FROM THE ORIGINAL DATA IN NEGATIVE SAMPLES IS EXPLAINED IN SECTION III-A3.

TABLE IV F1 SCORES ON THE TEST SETS FOR CHEMPROT, DDI, GAD, AND 10-FOLD CV FOR EU-ADR. IN THE DATASETS, TARGET ENTITIES ARE ANONYMIZED WITH PRE-DEFINED TAGS (E.G., @GENE$, @CHEMICAL$, @DRUG$). Mention IS A CONCATENATION OF THE CONTEXTUAL EMBEDDINGS OF THE ENTITY MENTIONS. Entity Start (markers) ARE [E1] AND [E2]. (BOLD: BEST SCORE IN OUR METHOD; UNDERLINE: BEST SCORE IN SOTA)

Table VI reflects the micro F1 scores of each representation on the typed PPI corpora. The results demonstrate that the models yield consistent predictions, with a best F1 score of 87.8, compared to the previous experiments, and the representations augmented with relation context consistently generate satisfactory outcomes. From the enhanced results observed on various relation extraction tasks, we can conclude that the contextual representations to which target entities are attentive effectively provide additional information to determine the relations of entity pairs.

TABLE V F1 SCORES VIA 10-FOLD CV ON THE PPI CLASSIFICATION WITH THE FIVE BENCHMARK PPI CORPORA. Mention IS A CONCATENATION OF THE CONTEXTUAL EMBEDDINGS OF THE ENTITY MENTIONS. Entity Start (markers) ARE [E1] AND [E2]. OUR METHODS USE THE BIOBERT BASE-CASED MODEL. (BOLD: BEST SCORE IN OUR METHOD; UNDERLINE: BEST SCORE IN SOTA)

TABLE VI F1 SCORES VIA 10-FOLD CV ON THE TYPED PPI CORPORA. THE BIOBERT BASE-CASED MODEL IS USED.

TABLE VII INTER-ANNOTATOR AGREEMENT STATISTICS BETWEEN THE TWO ANNOTATORS FOR THE THREE PPI TYPES.

1) Genes ending in -in or -ins are pre-identified as structural (actin, catenin, ...). Exceptions include:
a) Toxin
b) Beta-catenin (can be gene regulator OR structural as it is a dual-function gene)
c) Calreticulin (multifunction; mostly enzyme).