A large language model for electronic health records | npj Digital Medicine - Nature.com


Abstract

There is an increasing interest in developing artificial intelligence (AI) systems to process and interpret electronic health records (EHRs). Natural language processing (NLP) powered by pretrained language models is the key technology for medical AI systems utilizing clinical narratives. However, there are few clinical language models, the largest of which trained in the clinical domain is comparatively small at 110 million parameters (compared with billions of parameters in the general domain). It is not clear how large clinical language models with billions of parameters can help medical AI systems utilize unstructured EHRs. In this study, we develop from scratch a large clinical language model—GatorTron—using >90 billion words of text (including >82 billion words of de-identified clinical text) and systematically evaluate it on five clinical NLP tasks including clinical concept extraction, medical relation extraction, semantic textual similarity, natural language inference (NLI), and medical question answering (MQA). We examine how (1) scaling up the number of parameters and (2) scaling up the size of the training data could benefit these NLP tasks. GatorTron models scale up the clinical language model from 110 million to 8.9 billion parameters and improve five clinical NLP tasks (e.g., 9.6% and 9.5% improvement in accuracy for NLI and MQA), which can be applied to medical AI systems to improve healthcare delivery. The GatorTron models are publicly available at: https://catalog.ngc.nvidia.com/orgs/nvidia/teams/clara/models/gatortron_og.

Introduction

There is an increasing interest in developing artificial intelligence (AI) systems to improve healthcare delivery and health outcomes using electronic health records (EHRs). A critical step is to extract and capture patients’ characteristics from longitudinal EHRs. The more information we have about the patients, the better the medical AI systems that we can develop. In recent decades, hospitals and medical practices in the United States (US) have rapidly adopted EHR systems1,2, resulting in massive stores of electronic patient data, including structured (e.g., disease codes, medication codes) and unstructured (i.e., clinical narratives such as progress notes). Even though using discrete data fields in clinical documentation has many potential advantages and structured data entry fields are increasingly added into EHR systems, having clinicians use them remains a barrier, due to the added documentation burden3. Physicians and other healthcare providers widely use clinical narratives as a more convenient way to document patient information ranging from family medical histories to social determinants of health4. There is an increasing number of medical AI systems exploring the rich, more fine-grained patient information captured in clinical narratives to improve diagnostic and prognostic models5,6. Nevertheless, free-text narratives cannot be easily used in computational models that usually require structured data. Researchers have increasingly turned to natural language processing (NLP) as the key technology to enable medical AI systems to understand the clinical language used in healthcare7.

Today, most NLP solutions are based on deep learning models8 implemented using neural network architectures—a fast-developing sub-domain of machine learning. Convolutional neural networks9 (CNN) and recurrent neural networks10 (RNN) were applied to NLP in the early stage of deep learning. More recently, transformer architectures11 (e.g., Bidirectional Encoder Representations from Transformers [BERT]) implemented with a self-attention mechanism12 have become state-of-the-art, achieving the best performance on many NLP benchmarks13,14,15,16. In the general domain, transformer-based NLP models have achieved state-of-the-art performance for named entity recognition17,18,19, relation extraction20,21,22,23,24, sentence similarity25,26,27, natural language inference27,28,29,30, and question answering27,28,31,32. Typically, transformers are trained in two stages: language model pretraining (i.e., learning using a self-supervised training objective on a large corpus of unlabeled text) and fine-tuning (i.e., applying the learned language models to solve specific tasks with labeled training data). One pretrained language model can be applied to solve many NLP tasks through fine-tuning, which is known as transfer learning—a strategy to learn knowledge from one task and apply it to another task33. Human language has a very large sample space—the possible combinations of words, sentences, and their meaning and syntax are innumerable. Recent studies show that large transformer models trained using massive text data are remarkably better than previous NLP models in terms of emergence and homogenization33.

The promise of transformer models has led to further interest in exploring large-size (e.g., >billions of parameters) transformer models. The Generative Pretrained Transformer 3 (GPT-3) model34, which has 175 billion parameters and was trained using >400 billion words of text, demonstrated superior performance. In the biomedical domain, researchers developed the BioBERT11 (with 110 million parameters) and PubMedBERT35 (110 million parameters) transformer models using biomedical literature from PubMed. NVIDIA developed BioMegatron models in the biomedical domain with different sizes from 345 million to 1.2 billion parameters36 using a more expansive set of PubMed-derived free text. However, few studies have explored scaling transformer models in the clinical domain, due to the sensitive nature of clinical narratives that contain Protected Health Information (PHI) and the significant computing power required to increase the size of these models. To date, the largest transformer model using clinical narratives is ClinicalBERT37. ClinicalBERT has 110 million parameters and was trained using 0.5 billion words from the publicly available Medical Information Mart for Intensive Care III38 (MIMIC-III) dataset. By developing not only larger models, but models that use clinical narratives, NLP may perform better and improve healthcare delivery and patient outcomes.

In this study, we develop a large clinical language model, GatorTron, using >90 billion words of text from the de-identified clinical notes of University of Florida (UF) Health, PubMed articles, and Wikipedia. We train GatorTron from scratch and empirically evaluate how scaling up the number of parameters benefits the performance of downstream NLP tasks. More specifically, we examine GatorTron models with varying numbers of parameters, including (1) a base model with 345 million parameters, (2) a medium model with 3.9 billion parameters, and (3) a large model with 8.9 billion parameters. We also examine how scaling up the data size benefits downstream tasks by comparing the GatorTron-base model trained on the full corpus with another GatorTron-base model trained on a random sample of 1/4 of the corpus. We compare GatorTron with existing transformer models trained using biomedical literature and clinical narratives on five clinical NLP tasks including clinical concept extraction (or named entity recognition [NER]), medical relation extraction (MRE), semantic textual similarity (STS), natural language inference (NLI), and medical question answering (MQA). GatorTron models outperform previous transformer models from the biomedical and clinical domains on all five clinical NLP tasks. This study scales up transformer models in the clinical domain from 110 million to 8.9 billion parameters and demonstrates the benefit of large transformer models.

Results

A total of 290,482,002 clinical notes from 2,476,628 patients were extracted from the UF Health Integrated Data Repository (IDR), the enterprise data warehouse of the UF Health system. These notes were created from 2011–2021 from over 126 clinical departments and ~50 million encounters covering healthcare settings including but not limited to inpatient, outpatient, and emergency department visits. After preprocessing and de-identification, the corpus included >82 billion medical words. Figure 1 summarizes the distribution of patients by age, gender, race, and ethnicity, as well as the distribution of notes by clinical department (top 5) and note type (top 5). The detailed number of patients in each category, a full list of clinical departments and the corresponding proportion of notes, and a full list of note types are provided in Supplementary Table 1, Supplementary Table 2, and Supplementary Table 3.

Fig. 1: Patient distribution by age, gender, race, ethnicity; clinical note distribution by note type and clinical department.

Ages were calculated as of September 2022.

Training the GatorTron-large model required ~6 days on 992 A100 80 GB GPUs from 124 NVIDIA DGX nodes using the NVIDIA SuperPOD reference cluster architecture. Figure 2 shows the training and validation loss for all three sizes of GatorTron models. The GatorTron-base model converged in 10 epochs, whereas the medium and large models converged in 7 epochs, which is consistent with prior observations on the faster per-sample convergence of larger transformer models.

Fig. 2: Training loss and validation loss for the GatorTron-base (345 million), medium (3.9 billion), and large (8.9 billion) models.

a Training loss. b Validation loss. MLM masked language modeling.

Table 1 and Table 2 compare GatorTron models with two existing biomedical transformer models (BioBERT and BioMegatron) and one clinical transformer model (ClinicalBERT) on five clinical NLP tasks.

Table 1 Comparison of GatorTron with existing biomedical and clinical transformer models for clinical concept extraction and medical relation extraction.
Table 2 Comparison of GatorTron with existing biomedical and clinical transformer models for semantic textual similarity, natural language inference, and question answering.

Scale up the size of training data and the number of parameters

Compared with GatorTron-base trained using a random sample of 1/4 of the corpus, the GatorTron-base model trained using the full corpus achieved improved performance on four tasks, except for a sub-task in MQA (the F1 score on medication-related questions). By scaling up the number of parameters from 345 million to 8.9 billion, GatorTron-large demonstrated remarkable improvements for all five tasks, suggesting that GatorTron models scale for canonical clinical downstream tasks and that we are not yet at the limit.

Recognize clinical concepts and medical relations

Clinical concept extraction is to identify concepts with important clinical meanings and classify their semantic categories (e.g., diseases, medications). As shown in Table 1, all three GatorTron models outperformed existing biomedical and clinical transformer models in recognizing various types of clinical concepts on the three benchmark datasets (i.e., 2010 i2b239 and 2012 i2b240: problems, treatments, laboratory tests; 2018 n2c241: drugs, adverse events, and drug-related attributes). The GatorTron-large model outperformed the two smaller GatorTron models and achieved the best F1 scores of 0.8996, 0.8091, and 0.9000, respectively. For medical relation extraction—a task to identify medical relations between two clinical concepts—the GatorTron-large model also achieved the best F1 score of 0.9627 for identifying drug-cause-adverse-event relations, outperforming existing biomedical and clinical transformers and the two smaller GatorTron models. We consistently observed performance improvement when scaling up the size of the GatorTron model.

Assess semantic textual similarity

The task of measuring semantic similarity is to determine the degree to which two sentences are similar in terms of semantic meaning. As shown in Table 2, all GatorTron models outperformed existing biomedical and clinical transformer models. Among the three GatorTron models, the GatorTron-medium model achieved the best Pearson correlation score of 0.8903, outperforming both GatorTron-base and GatorTron-large. Although we did not observe consistent improvement by scaling up the size of the GatorTron model, the GatorTron-large model outperformed GatorTron-base, and its performance is very close to that of the GatorTron-medium model (0.8896 vs. 0.8903).

Natural language inference

The task of NLI is to determine whether a conclusion can be inferred from a given sentence—a sentence-level NLP task. As shown in Table 2, all GatorTron models outperformed existing biomedical and clinical transformers, and the GatorTron-large model achieved the best accuracy of 0.9020, outperforming BioBERT and ClinicalBERT by 9.6% and 7.5%, respectively. We observed a monotonic performance improvement by scaling up the size of the GatorTron model.

Medical question answering

MQA is a complex clinical NLP task that requires understanding information from the entire document. As shown in Table 2, all GatorTron models outperformed existing biomedical and clinical transformer models in answering medication- and relation-related questions (e.g., “What lab results does the patient have that are pertinent to diabetes diagnosis?”). For medication-related questions, the GatorTron-large model achieved the best exact match score of 0.3155, outperforming BioBERT and ClinicalBERT by 6.8% and 7.5%, respectively. For relation-related questions, GatorTron-large also achieved the best exact match score of 0.9301, outperforming BioBERT and ClinicalBERT by 9.5% and 7.77%, respectively. We also observed a monotonic performance improvement by scaling up the size of the GatorTron model.

Discussion

In this study, we developed a large clinical transformer model, GatorTron, using a corpus of >90 billion words from UF Health (>82 billion), PubMed (6 billion), Wikipedia (2.5 billion), and MIMIC-III (0.5 billion). We trained GatorTron with different numbers of parameters, including 345 million, 3.9 billion, and 8.9 billion, and evaluated its performance on five clinical NLP tasks at different linguistic levels (phrase level, sentence level, and document level) using six publicly available benchmark datasets. The experimental results show that GatorTron models outperformed existing biomedical and clinical transformers on all five clinical NLP tasks evaluated using six different benchmark datasets. We observed monotonic improvements by scaling up the model size of GatorTron for four of the five tasks, excluding the semantic textual similarity task. Our GatorTron model also outperformed BioMegatron36, a transformer model of similar size developed in our previous study using >8.5 billion words from PubMed and Wikipedia (a small proportion of the >90 billion words of corpus used to develop GatorTron). This study scaled up clinical transformer models from 110 million parameters (ClinicalBERT) to 8.9 billion parameters in the clinical domain and demonstrated remarkable performance improvements. To the best of our knowledge, GatorTron-large is the largest transformer model in the clinical domain. Among the five tasks, GatorTron achieved remarkable improvements for complex NLP tasks such as natural language inference and medical question answering, but moderate improvements for easier tasks such as clinical concept extraction and medical relation extraction, indicating that large transformer models are more helpful for complex NLP tasks. These results are consistent with observations in the literature on the saturation of simpler benchmarks with large BERT architectures18,32.

GatorTron was pretrained using the self-supervised masked language modeling (MLM) objective. We monitored the training loss and calculated the validation loss using a subset of the clinical text (5%) to determine the appropriate stopping time. From the plots of training and validation losses in Fig. 2, we observed that the larger GatorTron models converged faster than the smaller model.

GatorTron models perform better in extracting and interpreting patient information documented in clinical narratives, which can be integrated into medical AI systems to improve healthcare delivery and patient outcomes. The rich, fine-grained patient information captured in clinical narratives is a critical resource powering medical AI systems. With better performance in information extraction (e.g., clinical concept extraction and medical relation extraction), GatorTron models can provide more accurate patient information to identify patient cohorts for research using computable phenotypes, support physicians making data-informed decisions through clinical decision support systems, and identify adverse events associated with drug exposures for pharmacovigilance. The observed improvements in semantic textual similarity, natural language inference, and medical question answering can be applied to deduplication of clinical text, mining medical knowledge, and developing next-generation medical AI systems that can interact with patients using human language.

We conducted an error analysis and compared GatorTron with ClinicalBERT to investigate the observed performance improvements. We found that the larger, domain-specific pretrained models (e.g., GatorTron) are better at modeling longer phrases and determining semantic categories. For example, GatorTron successfully identified “a mildly dilated ascending aorta” as a problem, where ClinicalBERT identified only “mildly dilated”; GatorTron successfully categorized “kidney protective effects” as a “TREATMENT”, which was misclassified as a “PROBLEM” by ClinicalBERT. For complex NLP tasks such as NLI and MQA, even large language models such as GatorTron still have difficulty identifying the key pieces of information in longer paragraphs. Our future work will improve GatorTron in handling long pieces of text for complex NLP tasks.

This study demonstrates the advantages of large pretrained transformer models in the medical domain. GatorTron models can be applied to many other NLP tasks through fine-tuning. We believe that GatorTron will improve the use of clinical narratives in developing various medical AI systems for better healthcare delivery and health outcomes.

Methods

Data source

The primary data source for this study is the clinical narratives from the UF Health IDR, a research data warehouse of UF Health. This study was approved by the UF Institutional Review Board (IRB202100049). We collected clinical notes from 2011–2021 from over 126 departments, ~2 million patients, and 50 million encounters from inpatient, outpatient, and emergency settings. Then, we merged the UF Health clinical corpus with three additional corpora: the MIMIC-III corpus38 in the clinical domain with 0.5 billion words, a PubMed collection (combining PubMed abstracts and the full-text commercial collection)36 in the biomedical domain with 6 billion words, and a Wikipedia articles dump36 in the general domain with 2.5 billion words, to generate a corpus with >90 billion words.

Preprocessing and de-identification of text

We performed minimal preprocessing including (1) removing empty and duplicated clinical notes, unifying all text into UTF-8 encoding, and removing illegal UTF-8 strings; (2) normalizing special characters (e.g., converting ‘&amp;’ to ‘&’ and ‘\xa0’ to a space); and (3) tokenization and sentence boundary detection. For clinical text from UF Health, we further applied a de-identification system42 (approved under IRB202100049) to remove protected health information (PHI) from the clinical text. We adopted the safe-harbor method to identify the 18 PHI categories defined in the Health Insurance Portability and Accountability Act (HIPAA) and replaced them with dummy strings (e.g., replacing people’s names with [**NAME**]).
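The safe-harbor replacement step can be sketched with regular expressions. This is only a minimal illustration of surrogate substitution, not the study's actual de-identification system (ref. 42), which is far more sophisticated; the patterns, category names, and sample note below are invented for the example.

```python
import re

# Illustrative PHI patterns only; a real de-identification system covers all
# 18 HIPAA safe-harbor categories with much more robust detection.
PHI_PATTERNS = {
    "[**NAME**]": re.compile(r"\b(?:Dr|Mr|Mrs|Ms)\.\s+[A-Z][a-z]+\b"),
    "[**DATE**]": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
    "[**PHONE**]": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}

def mask_phi(text: str) -> str:
    """Replace matched PHI spans with dummy surrogate strings."""
    for surrogate, pattern in PHI_PATTERNS.items():
        text = pattern.sub(surrogate, text)
    return text

note = "Seen by Dr. Smith on 03/14/2021; call 352-555-0100 with questions."
print(mask_phi(note))
# → Seen by [**NAME**] on [**DATE**]; call [**PHONE**] with questions.
```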

Study design

Figure 3 shows an overview of the study design. We sought to train a large clinical transformer model, GatorTron, using >90 billion words and to examine how and whether scaling up the model size improves performance on five clinical NLP tasks. We first pretrained GatorTron on the >90 billion words by optimizing a masked language model (MLM) and then applied GatorTron to five different clinical NLP tasks using supervised fine-tuning. We adopted the BERT architecture (Fig. 4) implemented in Megatron-LM and explored three different settings: a base model with 345 million parameters (i.e., GatorTron-base), a medium model with 3.9 billion parameters (i.e., GatorTron-medium), and a large model with 8.9 billion parameters (i.e., GatorTron-large). Then we compared the three GatorTron models to an existing transformer model from the clinical domain, ClinicalBERT (with 110 million parameters), and two transformer models from the biomedical domain, BioBERT (345 million parameters) and BioMegatron (1.2 billion parameters). We compared the models on five clinical NLP tasks, including clinical concept extraction, relation extraction, semantic textual similarity, natural language inference, and medical question answering, using six public benchmark datasets in the clinical domain.

Fig. 3: An overview of pretraining and fine-tuning of GatorTron models.

We loaded the base model and the medium model into one GPU for distributed training. We sliced the GatorTron-large model into four pieces and loaded the model pieces onto four GPUs for distributed training (i.e., model parallelism). TrM transformer unit.

Fig. 4: Pretraining the GatorTron-large model with 8.9 billion parameters using model parallelism.

Emb embedding, Tok token from the input sentence, Trm transformer unit. [SEP]: a token defined in BERT to indicate sentence boundaries. [CLS]: a token defined in BERT for sentence-level representation.

Training environment

We used a total of 992 NVIDIA DGX A100 GPUs from 124 SuperPOD nodes at UF’s HiPerGator-AI cluster to train GatorTron models by leveraging both data-level and model-level parallelism implemented in the Megatron-LM package43. We monitored training progress via the training loss and validation loss and stopped training when there was no further improvement (i.e., the loss curve became flat).

GatorTron model configuration

We developed GatorTron models with three configurations and determined the number of layers, hidden sizes, and number of attention heads according to the guidelines for optimal depth-to-width parameter allocation proposed by Levin et al.44, as well as our previous experience in developing BioMegatron. Table 3 provides detailed information for the three settings. The GatorTron-base model has 24 layers of transformer blocks, which is similar to the architecture of the BERT-large model. For each layer, we set the number of hidden units to 1024 and the number of attention heads to 16. The GatorTron-medium model scaled up to 3.9 billion parameters (~10 times the base setting), and the GatorTron-large model scaled up to 8.9 billion parameters, which is similar to Megatron43 (with 8.3 billion parameters).

Table 3 Technical details of GatorTron models.
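As a rough sanity check on such configurations, the parameter count of a BERT-style encoder can be approximated from its depth and width alone. The sketch below uses the common ~12·H² per-layer estimate plus embedding terms; the 50,000-token vocabulary and 512-position limit are assumptions for illustration, not GatorTron's actual values.

```python
def transformer_params(n_layers: int, hidden: int, vocab: int = 50_000,
                       max_pos: int = 512) -> int:
    """Rough parameter count for a BERT-style encoder:
    ~12*H^2 per layer (QKV/output projections + 4H feed-forward)
    plus token and position embeddings."""
    per_layer = 12 * hidden * hidden
    embeddings = (vocab + max_pos) * hidden
    return n_layers * per_layer + embeddings

# A 24-layer, 1024-hidden configuration lands near the quoted
# 345 million parameters of a base-size model.
print(f"{transformer_params(24, 1024):,}")
```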

Train GatorTron models from scratch

We pretrained a vocabulary from scratch using the >90 billion words of corpus, following the byte-pair-encoding algorithm45. We inherited the BERT-style architecture and trained GatorTron models from scratch using two self-supervised tasks: masked language modeling (MLM) and sentence-order prediction (SOP). We followed a strategy similar to the BERT model46 and randomly masked 15% of the input tokens with a special token (i.e., [MASK]) in the MLM. The SOP was formulated as a task to predict the order of two consecutive segments of text28. The input for SOP consists of two consecutive sentences from the training corpus in random order, and the training objective is to determine whether the two input sentences are in the correct order. The GatorTron-large model with 8.9 billion parameters is too large to fit in one GPU; therefore, we sliced it into four pieces for distributed training using model parallelism. We pretrained the GatorTron-base and medium models without model slicing. The default loss function defined in the BERT model46 was used. Figure 4 shows the distributed training of the GatorTron-large model using model parallelism. (See https://github.com/NVIDIA/Megatron-LM for more details.)
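The MLM masking step can be sketched as follows. This is a simplified illustration that only substitutes [MASK] tokens; BERT's full recipe also replaces a fraction of the selected positions with random or unchanged tokens, which is omitted here. The tokens and helper name are invented for the example.

```python
import random

def mask_tokens(tokens, mask_rate=0.15, mask_token="[MASK]", seed=0):
    """Randomly replace ~15% of input tokens with [MASK] for the MLM
    objective; returns the masked sequence and the prediction targets
    (position -> original token) the model must recover."""
    rng = random.Random(seed)
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_rate:
            masked.append(mask_token)
            targets[i] = tok
        else:
            masked.append(tok)
    return masked, targets

tokens = "the patient denies chest pain and shortness of breath".split()
masked, targets = mask_tokens(tokens)
```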

Existing transformer models for comparison

BioBERT11: The BioBERT model was developed by further training the original BERT-large model (345 million parameters, 24 layers, 1024 hidden units, and 16 attention heads) using biomedical literature from PubMed abstracts (4.5 billion words) and PMC full-text articles (13.5 billion words). In this study, we used version 1.1.

ClinicalBERT37: The ClinicalBERT model was developed by further training BioBERT (base version; 110 million parameters with 12 layers, 768 hidden units, and 12 attention heads) using clinical text from the MIMIC-III38 corpus.

BioMegatron36: The BioMegatron models adopted the BERT architecture with different numbers of parameters, from 345 million to 1.2 billion. Different from BioBERT and ClinicalBERT, BioMegatron was trained from scratch without leveraging the original BERT model.

Fine-tune GatorTron for five clinical NLP tasks, evaluation metrics, and benchmark datasets

We fine-tuned the pretrained GatorTron models for five different clinical NLP tasks using experts’ annotations from six public benchmark datasets. Specifically, we first generated distributed representations from the inputs of a specific task, then added additional output layers (classification or regression) to generate the target outputs. We used the cross-entropy (CE) loss for classification tasks and the mean square error loss for regression tasks. For a classification task with N categories, let Ci be the score generated by a transformer model for category i; the probability Pi of a given sample being classified into category i was calculated as:

$$P_i = \frac{{e^{C_i}}}{{\mathop {\sum }\nolimits_{j = 1}^N e^{C_j}}}$$

(1)

Let ti be the ground-truth category; the cross-entropy loss LCE is defined as:

$$L_{CE} = - \mathop {\sum}\limits_{i = 1}^N {t_i{{{\mathrm{log}}}}(P_i)}$$

(2)
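Eqs. (1) and (2) can be checked numerically. The sketch below implements the softmax and one-hot cross-entropy exactly as written; the logit values are made up for illustration.

```python
import math

def softmax(scores):
    """Eq. (1): P_i = exp(C_i) / sum_j exp(C_j)."""
    exps = [math.exp(c) for c in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, truth):
    """Eq. (2): L_CE = -sum_i t_i * log(P_i), with one-hot truth t."""
    return -sum(t * math.log(p) for t, p in zip(truth, probs))

scores = [2.0, 1.0, 0.1]                 # logits C_i from the model
probs = softmax(scores)                  # probabilities summing to 1
loss = cross_entropy(probs, [1, 0, 0])   # ground-truth category is class 0
```

Because the truth vector is one-hot, the loss reduces to the negative log-probability of the correct class, so a confident correct prediction drives the loss toward zero.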

Fine-tune GatorTron for clinical concept extraction

This is a task to recognize phrases with important clinical meanings (e.g., medications, treatments, adverse drug events). The task is to determine the boundaries of a concept and classify it into predefined semantic categories. Early systems for clinical concept extraction were often rule-based; yet, most recent systems are based on machine learning models such as conditional random fields (CRFs)47,48, convolutional neural networks (CNN)9,49, and recurrent neural networks (RNN) implemented with the long short-term memory strategy (LSTM)10,50. Current state-of-the-art models are based on transformers such as ClinicalBERT. We approached clinical concept extraction as a sequence labeling problem and adopted the ‘BIO’ labeling scheme, where ‘B-’ and ‘I-’ are prefixes indicating words at the beginning and inside of a concept, and ‘O’ stands for words located outside of any concepts of interest. Using this definition, we approached the task as a classification problem—for each word in a sentence, predict a label in [‘B’, ‘I’, ‘O’]. When there are multiple categories of concepts, a suffix is attached to ‘BIO’ for distinction (e.g., ‘B-drug’, ‘I-drug’). Based on the representation generated by the pretrained GatorTron models, we added a classification layer (a linear layer with softmax activation) to calculate a probability score for each ‘BIO’ category. The cross-entropy loss was used for fine-tuning. We trained a unified classifier to extract all concepts for datasets without overlapping concepts. For datasets with overlapping concepts, we trained individual models to recognize each category of concept separately, following our previous strategy51.
We used three benchmark datasets developed by the 2010 i2b2 challenge39, the 2012 i2b2 challenge40, and the 2018 n2c2 challenge41 to evaluate the GatorTron models, focusing on identifying important medical concepts (e.g., medications, adverse drug events, treatments) in clinical text. We used precision, recall, and the F1 score for evaluation.
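The ‘BIO’ scheme with category suffixes can be illustrated with a small example. The helper name, tokens, and spans below are invented for the illustration; real annotations come from the benchmark datasets.

```python
def spans_to_bio(tokens, spans):
    """Convert annotated concept spans to per-token BIO labels.
    spans: list of (start_idx, end_idx_exclusive, category)."""
    labels = ["O"] * len(tokens)
    for start, end, cat in spans:
        labels[start] = f"B-{cat}"          # first token of the concept
        for i in range(start + 1, end):
            labels[i] = f"I-{cat}"          # tokens inside the concept
    return labels

tokens = "patient started lisinopril for high blood pressure".split()
labels = spans_to_bio(tokens, [(2, 3, "drug"), (4, 7, "problem")])
print(labels)
# → ['O', 'O', 'B-drug', 'O', 'B-problem', 'I-problem', 'I-problem']
```

Fine-tuning then reduces to predicting one of these labels per token with a softmax classification layer on top of the encoder.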

Fine-tune GatorTron for medical relation extraction

MRE is to establish medical relations (e.g., the induce relation) among clinical concepts (e.g., drugs, adverse events). MRE is usually approached as a classification problem—identify pairs of concepts with valid relations and classify the relation type. Various machine learning-based classifiers such as support vector machines (SVMs), random forests (RF), and gradient boosting trees (GBT)41 have been applied. With the rise of deep learning models, researchers have explored the long short-term memory (LSTM) architecture for RE in both the general and clinical domains52,53. Most recently, several studies adopted the BERT architecture and demonstrated superior performance for MRE on various datasets54,55,56,57,58,59. We approached MRE as a classification task. First, candidate concept pairs were generated using heuristic rules developed in our previous study41. Then, we identified the two sentences where the two concepts of a pair were located. We introduced two sets of entity markers (i.e., [S1], [E1] and [S2], [E2]) to indicate the two concepts. If the two concepts were in the same sentence, the two input sentences would be the same but labeled with different markers (e.g., [S1] and [E1] were used in the first sentence; [S2] and [E2] were used in the second sentence). To determine the relation type, we concatenated the representations of the special [CLS] token and all four entity markers and added a classification layer (a linear layer with softmax activation) for classification. Similarly, the cross-entropy loss was used to fine-tune GatorTron. We used the dataset developed by the 2018 n2c2 challenge41 with a focus on relations between medications and adverse drug events. Precision, recall, and the F1 score were used for evaluation.
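The entity-marker encoding above can be sketched as follows, assuming concept spans are given as token indices. The marker names [S1]/[E1]/[S2]/[E2] follow the text; the sentence and spans are invented for the example, and the single-sentence case is shown for brevity.

```python
def add_entity_markers(tokens, span1, span2):
    """Wrap two concept spans with [S1]/[E1] and [S2]/[E2] markers,
    a common input encoding for relation classification.
    span1/span2: (start_idx, end_idx_exclusive), assumed non-overlapping."""
    out = []
    for i, tok in enumerate(tokens):
        if i == span1[0]:
            out.append("[S1]")      # open first concept
        if i == span2[0]:
            out.append("[S2]")      # open second concept
        out.append(tok)
        if i == span1[1] - 1:
            out.append("[E1]")      # close first concept
        if i == span2[1] - 1:
            out.append("[E2]")      # close second concept
    return out

tokens = "rash developed after starting penicillin".split()
marked = add_entity_markers(tokens, (0, 1), (4, 5))
print(" ".join(marked))
# → [S1] rash [E1] developed after starting [S2] penicillin [E2]
```

The encoder's output vectors at the four marker positions, together with [CLS], are then concatenated and fed to the relation classifier.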

Fine-tune GatorTron for semantic textual similarity

The STS task is to quantitatively assess the semantic similarity between two text snippets (e.g., sentences), and is usually approached as a regression task where a real-valued score is used to quantify the similarity between two text snippets. In the general domain, the STS benchmark (STS-B) dataset curated by the Semantic Evaluation (SemEval) challenges between 2012 and 201760 is widely used for evaluating STS systems13. Various machine learning methods have been examined61,62,63, but transformer-based systems such as RoBERTa25, T527, and ALBERT28 are leading the state-of-the-art models for STS. In the clinical domain, the MedSTS dataset64, which consists of over 1000 annotated sentence pairs from clinical notes at Mayo Clinic, was widely used as the benchmark. MedSTS was used as the gold standard in two clinical NLP open challenges, including the 2018 BioCreative/Open Health NLP (OHNLP) challenge65 and the 2019 n2c2/OHNLP ClinicalSTS shared task66. Similar to the general domain, pretrained transformer-based models using clinical text and biomedical literature, including ClinicalBERT and BioBERT67, achieved state-of-the-art performance. In this study, we formulated STS as a regression problem. We applied pretrained GatorTron models to learn the sentence-level representations of the two pieces of text and adopted a linear regression layer to calculate the similarity score. Different from classification models, we used MSE as the loss function. We used the dataset developed by the 2019 n2c2/OHNLP challenge66 on clinical semantic textual similarity. The Pearson correlation score was used for evaluation.
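As a sketch of this setup, the MSE fine-tuning loss and the Pearson correlation evaluation metric can be computed in a few lines; the similarity scores below are illustrative, not MedSTS annotations:

```python
import math

def mse(pred, gold):
    """Mean squared error, the loss used to fine-tune the STS regression head."""
    return sum((p - g) ** 2 for p, g in zip(pred, gold)) / len(gold)

def pearson(x, y):
    """Pearson correlation, the official ClinicalSTS evaluation metric."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

gold = [0.0, 1.5, 3.0, 4.5]   # illustrative similarity scores on the 0-5 scale
pred = [0.5, 1.0, 3.5, 4.0]   # illustrative model predictions
# mse(pred, gold) = 0.25; pearson(pred, gold) is close to 1 for well-ranked predictions
```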

Fine-tune GatorTron for natural language inference

NLI is also known as recognizing textual entailment (RTE)—a directional relation between text fragments (e.g., sentences)68. The goal of NLI is to determine whether a given hypothesis can be inferred from a given premise. In the general domain, two benchmark datasets—the MultiNLI69 and the Stanford NLI70—are widely used. On both datasets, pretrained transformer models achieved state-of-the-art performances27,29. There are limited resources for NLI in the clinical domain. Until recently, the MedNLI—a dataset annotated by doctors based on the medical history of patients71—was developed as a benchmark dataset in the clinical domain. A previous study37 showed that a pretrained clinical BERT model achieved the state-of-the-art performance and outperformed the baseline (InferSent72) by ~9% accuracy. In this study, we approached NLI as a classification problem. We concatenated the hypothesis and premise as the input, separated using a special token [SEP], and applied pretrained GatorTron models to generate distributed representations, which were fed into a classification layer (a linear layer with softmax activation) to calculate a probability for each of the three categories of entailment, contradiction, and neutral. The cross-entropy loss was used for fine-tuning. We evaluated the GatorTron models on NLI using the MedNLI dataset71 and used accuracy for comparison.
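The three-way classification head amounts to a softmax over entailment/contradiction/neutral logits trained with cross-entropy; a minimal sketch, with illustrative logit values standing in for the output of the linear layer:

```python
import math

LABELS = ["entailment", "contradiction", "neutral"]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]  # shift by max for numerical stability
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(logits, gold_idx):
    """Negative log-likelihood of the gold class, the fine-tuning loss."""
    return -math.log(softmax(logits)[gold_idx])

# Illustrative logits from the classification layer over the encoder output
logits = [2.1, -0.3, 0.4]
probs = softmax(logits)
pred = LABELS[probs.index(max(probs))]  # highest logit wins: "entailment"
```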

Fine-tune GatorTron for medical question answering

The MQA task is to build NLP systems that automatically answer medical questions in a natural language, which is the most complex challenge among the five tasks. Unlike other tasks focusing on phrases and sentences, MQA is a document-level task that requires information from the whole document to generate answers according to questions. In the general domain, the Stanford Question Answering Datasets (SQuAD 1.1 and 2.0)73,74 have been widely used as benchmarks. Transformer-based models are state-of-the-art for both SQuAD 1.118 and SQuAD 2.031. Several MQA datasets have been developed in the past few years, such as the MESHQA75, MedQuAD76, and emrQA77. In this study, we approached MQA using a machine reading comprehension (MRC) method, where the goal is to extract the most relevant responses (i.e., short text snippets or entities) from the given context according to questions. We applied a span classification algorithm to identify the start and end offsets of the answer from the context. More specifically, we packed the question and the context into a single sequence as input for GatorTron and applied two linear layers to predict the start and end positions of the answer, respectively. As GatorTron models were developed using a maximum token length of 512, we limited the maximum length of questions to 64 tokens, and the remaining 446 tokens (including special tokens such as [CLS] and [SEP]) were used for the context. We truncated questions with more than 64 tokens. For contexts that had more than 446 tokens, we adopted a sliding window strategy to scan the whole document using a window size of 446 tokens and a stride size of 396 tokens, so that two consecutive windows had the same 50 tokens overlapped. We also limited the answers to a maximum length of 32 tokens.
We used the emrQA dataset77, which is widely used as a benchmark dataset for MQA. We particularly focused on medication- and relation-related questions, as Yue et al.78 found that these two subsets are more consistent. We used both F1 score and exact match score for evaluation.
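The sliding-window strategy described above (window size 446, stride 396, hence a 50-token overlap between consecutive windows) can be sketched as:

```python
def windows(context_len, window=446, stride=396):
    """Return (start, end) token offsets covering the whole context.
    Consecutive windows overlap by window - stride = 50 tokens, so an
    answer span near a window boundary appears intact in one window."""
    spans, start = [], 0
    while True:
        end = min(start + window, context_len)
        spans.append((start, end))
        if end == context_len:
            break
        start += stride
    return spans

spans = windows(1000)  # a 1000-token context needs three windows
# spans == [(0, 446), (396, 842), (792, 1000)]
```

A context shorter than the window yields a single span, and each window is scored independently before the best-scoring answer span is selected.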

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability

The benchmark datasets that support the findings of this study are available from the official websites of natural language processing challenges with Data Use Agreements. More specifically: (1) i2b2 2010, 2012 datasets and n2c2 2018, 2019 datasets: https://portal.dbmi.hms.harvard.edu/projects/n2c2-nlp/. (2) MedNLI dataset: https://physionet.org/content/mednli/1.0.0/. (3) emrQA dataset: https://github.com/panushri25/emrQA#download-dataset. (4) MIMIC III dataset: https://physionet.org/content/mimiciii/1.4/. (5) PubMed dataset: https://www.ncbi.nlm.nih.gov/pmc/tools/openftlist/. (6) Wikipedia dataset: https://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz2. (7) UF Health IDR clinical notes are not open to the public due to patient privacy protection. The GatorTron models pretrained using >90 billion words of text are publicly available at: https://catalog.ngc.nvidia.com/orgs/nvidia/teams/clara/models/gatortron_og.

References

  1. Adoption of Electronic Health Record Systems among U.S. Non-Federal Acute Care Hospitals: 2008–2015. ONC Data Brief. https://www.healthit.gov/sites/default/files/briefs/2015_hospital_adoption_db_v17.pdf (2016).

  2. Adler-Milstein, J. et al. Electronic health record adoption in US hospitals: the emergence of a digital ‘advanced use’ divide. J. Am. Med. Inform. Assoc. 24, 1142–1148 (2017).

  3. Bush, R. A., Kuelbs, C. L., Ryu, J., Jian, W. & Chiang, G. J. Structured data entry in the electronic medical record: perspectives of pediatric specialty physicians and surgeons. J. Med. Syst. 41, 1–8 (2017).

  4. Meystre, S. M., Savova, G. K., Kipper-Schuler, K. C. & Hurdle, J. F. Extracting information from textual documents in the electronic health record: a review of recent research. Yearb. Med. Inform. 17, 128–144 (2008).

  5. Liang, H. et al. Evaluation and accurate diagnoses of pediatric diseases using artificial intelligence. Nat. Med. 25, 433–438 (2019).

  6. Yang, J. et al. Assessing the prognostic significance of tumor-infiltrating lymphocytes in patients with melanoma using pathologic features identified by natural language processing. JAMA Netw. Open 4, e2126337 (2021).

  7. Nadkarni, P. M., Ohno-Machado, L. & Chapman, W. W. Natural language processing: an introduction. J. Am. Med. Inform. Assoc. 18, 544–551 (2011).

  8. LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).

  9. Collobert, R. et al. Natural language processing (almost) from scratch. J. Mach. Learn Res. 12, 2493–2537 (2011).

  10. Lample, G., Ballesteros, M., Subramanian, S., Kawakami, K. & Dyer, C. Neural architectures for named entity recognition. Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 260–270 (2016).

  11. Lee, J. et al. BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics 36, 1234–1240 (2020).

  12. Vaswani, A. et al. Attention is all you need. Advances in Neural Information Processing Systems 30 (2017).

  13. Wang, A. et al. GLUE: a multi-task benchmark and analysis platform for natural language understanding. Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP. 353–355 (2018).

  14. Wang, A. et al. SuperGLUE: a stickier benchmark for general-purpose language understanding systems. Advances in Neural Information Processing Systems 32 (2019).

  15. Qiu, X. et al. Pre-trained models for natural language processing: a survey. Science China Technological Sciences 63, 1872–1897 (2020).

  16. Tay, Y., Dehghani, M., Bahri, D. & Metzler, D. Efficient transformers: a survey. ACM Computing Surveys 55, 1–28 (2020).

  17. Yu, J., Bohnet, B. & Poesio, M. Named entity recognition as dependency parsing. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 6470–6476 (2020).

  18. Yamada, I., Asai, A., Shindo, H., Takeda, H. & Matsumoto, Y. LUKE: deep contextualized entity representations with entity-aware self-attention. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). 6442–6454 (2020).

  19. Li, X. et al. Dice loss for data-imbalanced NLP tasks. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 465–476 (2020).

  20. Xu, B., Wang, Q., Lyu, Y., Zhu, Y. & Mao, Z. Entity structure within and throughout: modeling mention dependencies for document-level relation extraction. Proceedings of the AAAI Conference on Artificial Intelligence 35, 14149–14157 (2021).

  21. Ye, D., Lin, Y. & Sun, M. Pack together: entity and relation extraction with levitated marker. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics 1, 4904–4917 (2021).

  22. Cohen, A. D., Rosenman, S. & Goldberg, Y. Relation classification as two-way span-prediction. ArXiv arXiv:2010.04829 (2021).

  23. Lyu, S. & Chen, H. Relation classification with entity type restriction. Findings of the Association for Computational Linguistics: ACL-IJCNLP. 390–395 (2021).

  24. Wang, J. & Lu, W. Two are better than one: joint entity and relation extraction with table-sequence encoders. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). 1706–1721 (2020).

  25. Jiang, H. et al. SMART: robust and efficient fine-tuning for pre-trained natural language models through principled regularized optimization. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 2177–2190 (2020).

  26. Yang, Z. et al. XLNet: generalized autoregressive pretraining for language understanding. Proceedings of the 33rd International Conference on Neural Information Processing Systems. 5753–5763 (2019).

  27. Raffel, C. et al. Exploring the limits of transfer learning with a unified text-to-text transformer. J. Mach. Learn. Res. 21, 1–67 (2019).

  28. Lan, Z.-Z. et al. ALBERT: a lite BERT for self-supervised learning of language representations. ArXiv arXiv:1909.11942 (2019).

  29. Wang, S., Fang, H., Khabsa, M., Mao, H. & Ma, H. Entailment as few-shot learner. ArXiv arXiv:2104.14690 (2021).

  30. Zhang, Z. et al. Semantics-aware BERT for language understanding. Proceedings of the AAAI Conference on Artificial Intelligence 34, 9628–9635 (2020).

  31. Zhang, Z., Yang, J. & Zhao, H. Retrospective reader for machine reading comprehension. Proceedings of the AAAI Conference on Artificial Intelligence 35, 14506–14514 (2021).

  32. Garg, S., Vu, T. & Moschitti, A. TANDA: transfer and adapt pre-trained transformer models for answer sentence selection. Proceedings of the AAAI Conference on Artificial Intelligence 34, 7780–7788 (2020).

  33. Bommasani, R. et al. On the opportunities and risks of foundation models. ArXiv arXiv:2108.07258 (2021).

  34. Floridi, L. & Chiriatti, M. GPT-3: its nature, scope, limits, and consequences. Minds Mach. 30, 681–694 (2020).

  35. Gu, Y. et al. Domain-specific language model pretraining for biomedical natural language processing. ACM Trans. Comput. Healthc. 3, 1–23 (2022).

  36. Shin, H.-C. et al. BioMegatron: larger biomedical domain language model. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). 4700–4706 (2020).

  37. Alsentzer, E. et al. Publicly available clinical BERT embeddings. In Proc. 2nd Clinical Natural Language Processing Workshop 72–78 (2019).

  38. Johnson, A. E. W. et al. MIMIC-III, a freely accessible critical care database. Sci. Data 3, 160035 (2016).

  39. Uzuner, Ö., South, B. R., Shen, S. & DuVall, S. L. 2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text. J. Am. Med. Inform. Assoc. 18, 552–556 (2011).

  40. Sun, W., Rumshisky, A. & Uzuner, O. Evaluating temporal relations in clinical text: 2012 i2b2 Challenge. J. Am. Med. Inform. Assoc. 20, 806–813 (2013).

  41. Yang, X. et al. Identifying relations of medications with adverse drug events using recurrent convolutional neural networks and gradient boosting. J. Am. Med. Inform. Assoc. 27, 65–72 (2020).

  42. Yang, X. et al. A study of deep learning methods for de-identification of clinical notes in cross-institute settings. BMC Med. Inform. Decis. Mak. 19, 232 (2019).

  43. Shoeybi, M. et al. Megatron-LM: training multi-billion parameter language models using model parallelism. ArXiv arXiv:1909.08053 (2020).

  44. Levine, Y., Wies, N., Sharir, O., Bata, H. & Shashua, A. Limits to depth efficiencies of self-attention. Advances in Neural Information Processing Systems 33, 22640–22651 (2020).

  45. Sennrich, R., Haddow, B. & Birch, A. Neural machine translation of rare words with subword units. In Proc. 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 1715–1725 (Association for Computational Linguistics, 2016).

  46. Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: pre-training of deep bidirectional transformers for language understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 4171–4186 (2019).

  47. Wu, Y., Xu, J., Jiang, M., Zhang, Y. & Xu, H. A study of neural word embeddings for named entity recognition in clinical text. AMIA Annu. Symp. Proc. 2015, 1326–1333 (2015).

  48. Soysal, E. et al. CLAMP—a toolkit for efficiently building customized clinical natural language processing pipelines. J. Am. Med. Inform. Assoc. 25, 331–336 (2018).

  49. Wu, Y., Jiang, M., Lei, J. & Xu, H. Named entity recognition in Chinese clinical text using deep neural network. Stud. Health Technol. Inform. 216, 624–628 (2015).

  50. Wu, Y. et al. Combine factual medical knowledge and distributed word representation to improve clinical named entity recognition. In AMIA Annual Symposium Proceedings vol. 2018, 1110 (American Medical Informatics Association, 2018).

  51. Yang, X. et al. Identifying relations of medications with adverse drug events using recurrent convolutional neural networks and gradient boosting. J. Am. Med. Inform. Assoc. 27, 65–72 (2020).

  52. Kumar, S. A survey of deep learning methods for relation extraction. ArXiv arXiv:1705.03645 (2017).

  53. Lv, X., Guan, Y., Yang, J. & Wu, J. Clinical relation extraction with deep learning. Int. J. Hybrid. Inf. Technol. 9, 237–248 (2016).

  54. Wei, Q. et al. Relation extraction from clinical narratives using pre-trained language models. AMIA Annu. Symp. Proc. 2019, 1236–1245 (2020).

  55. Guan, H. & Devarakonda, M. Leveraging contextual information in extracting long distance relations from clinical notes. AMIA Annu. Symp. Proc. 2019, 1051–1060 (2020).

  56. Alimova, I. & Tutubalina, E. Multiple features for clinical relation extraction: a machine learning approach. J. Biomed. Inform. 103, 103382 (2020).

  57. Mahendran, D. & McInnes, B. T. Extracting adverse drug events from clinical notes. AMIA Summits on Translational Science Proceedings. 420–429 (2021).

  58. Yang, X., Zhang, H., He, X., Bian, J. & Wu, Y. Extracting family history of patients from clinical narratives: exploring an end-to-end solution with deep learning models. JMIR Med. Inform. 8, e22982 (2020).

  59. Yang, X., Yu, Z., Guo, Y., Bian, J. & Wu, Y. Clinical relation extraction using transformer-based models. ArXiv arXiv:2107.08957 (2021).

  60. Cer, D., Diab, M., Agirre, E., Lopez-Gazpio, I. & Specia, L. SemEval-2017 task 1: semantic textual similarity multilingual and cross-lingual focused evaluation. Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017). 1–14 (2017).

  61. Farouk, M. Measuring sentences similarity: a survey. ArXiv arXiv:1910.03940 (2019).

  62. Ramaprabha, J., Das, S. & Mukerjee, P. Survey on sentence similarity evaluation using deep learning. J. Phys. Conf. Ser. 1000, 012070 (2018).

  63. Gomaa, W. H. & Fahmy, A. A survey of text similarity approaches. International Journal of Computer Applications 68, 13–18 (2013).

  64. Wang, Y. et al. MedSTS: a resource for clinical semantic textual similarity. Lang. Resour. Eval. 54, 57–72 (2020).

  65. Rastegar-Mojarad, M. et al. BioCreative/OHNLP Challenge 2018. In Proc. 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics 575 (ACM, 2018).

  66. Wang, Y. et al. Overview of the 2019 n2c2/OHNLP track on clinical semantic textual similarity. JMIR Med. Inform. 8, e23375 (2020).

  67. Mahajan, D. et al. Identification of semantically similar sentences in clinical notes: iterative intermediate training using multi-task learning. JMIR Med. Inform. 8, e22508 (2020).

  68. Dagan, I., Glickman, O. & Magnini, B. In Machine Learning Challenges. Evaluating Predictive Uncertainty, Visual Object Classification, and Recognising Tectual Entailment (eds. Quiñonero-Candela, J., Dagan, I., Magnini, B. & d’Alché-Buc, F.) 177–190 (Springer Berlin Heidelberg, 2006).

  69. Williams, A., Nangia, N. & Bowman, S. R. A broad-coverage challenge corpus for sentence understanding through inference. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 1, 1112–1122 (2018).

  70. Bowman, S. R., Angeli, G., Potts, C. & Manning, C. D. A large annotated corpus for learning natural language inference. Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. 632–642 (2015).

  71. Shivade, C. MedNLI—a natural language inference dataset for the clinical domain. PhysioNet https://doi.org/10.13026/C2RS98 (2017).

  72. Conneau, A., Kiela, D., Schwenk, H., Barrault, L. & Bordes, A. Supervised learning of universal sentence representations from natural language inference data. Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. 670–680 (2017).

  73. Rajpurkar, P., Zhang, J., Lopyrev, K. & Liang, P. SQuAD: 100,000+ questions for machine comprehension of text. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. 2383–2392 (2016).

  74. Rajpurkar, P., Jia, R. & Liang, P. Know what you don’t know: unanswerable questions for SQuAD. Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics 2, 784–789 (2018).

  75. Zhu, M., Ahuja, A., Juan, D.-C., Wei, W. & Reddy, C. K. Question answering with long multiple-span answers. In Findings of the Association for Computational Linguistics: EMNLP 2020 3840–3849 (Association for Computational Linguistics, 2020).

  76. Ben Abacha, A. & Demner-Fushman, D. A question-entailment approach to question answering. BMC Bioinforma. 20, 511 (2019).

  77. Pampari, A., Raghavan, P., Liang, J. & Peng, J. emrQA: a large corpus for question answering on electronic medical records. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 2357–2368 (2018).

  78. Yue, X., Gutierrez, B. J. & Sun, H. Clinical reading comprehension: a thorough analysis of the emrQA dataset. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 4474–4486 (2020).


Acknowledgements

This study was partially supported by a Patient-Centered Outcomes Research Institute® (PCORI®) Award (ME-2018C3-14754), a grant from the National Cancer Institute, 1R01CA246418 R01, grants from the National Institute on Aging, NIA R56AG069880 and R21AG062884, and the Cancer Informatics and eHealth core jointly supported by the UF Health Cancer Center and the UF Clinical and Translational Science Institute. The content is solely the responsibility of the authors and does not necessarily represent the official views of the funding institutions. We would like to thank the UF Research Computing team, led by Dr. Erik Deumens, for providing computing power through the UF HiPerGator-AI cluster.

Author information

Authors and Affiliations

  1. Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA

    Xi Yang, Aokun Chen, Christopher A. Harle, William R. Hogan, Elizabeth A. Shenkman, Jiang Bian & Yonghui Wu

  2. Cancer Informatics and eHealth core, University of Florida Health Cancer Center, Gainesville, FL, USA

    Xi Yang, Aokun Chen, Jiang Bian & Yonghui Wu

  3. NVIDIA, Santa Clara, CA, USA

    Nima PourNejatian, Hoo Chang Shin, Kaleb E. Smith, Christopher Parisien, Colin Compas, Cheryl Martin, Anthony B. Costa & Mona G. Flores

  4. Research Computing, University of Florida, Gainesville, FL, USA

    Ying Zhang

  5. Integrated Data Repository Research Services, University of Florida, Gainesville, FL, USA

    Tanja Magoc, Christopher A. Harle & Gloria Lipori

  6. Lillian S. Wells Department of Neurosurgery, UF Clinical and Translational Science Institute, University of Florida, Gainesville, FL, USA

    Gloria Lipori & Duane A. Mitchell

Contributions

Y.W., J.B., M.G.F., N.P., and X.Y. were responsible for the overall design, development, and evaluation of this study. X.Y. and A.C. had full access to all the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis. Y.W., X.Y., J.B., and W.H. did the bulk of the writing; E.A.S., D.A.M., T.M., C.A.H., A.B.C., and G.L. also contributed to the writing and editing of this manuscript. All authors reviewed the manuscript critically for scientific content, and all authors gave final approval of the manuscript for publication.

Corresponding author

Correspondence to Yonghui Wu.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

About this article


Cite this article

Yang, X., Chen, A., PourNejatian, N. et al. A large language model for electronic health records. npj Digit. Med. 5, 194 (2022). https://doi.org/10.1038/s41746-022-00742-2


  • Received: 21 June 2022

  • Accepted: 13 December 2022

  • Published: 26 December 2022

  • DOI: https://doi.org/10.1038/s41746-022-00742-2
