Article
Peer Reviewed

Applying MetaMap to Medline for identifying novel associations in a large clinical dataset: a feasibility analysis.

UC Irvine Previously Published Works (2014)

OBJECTIVE: We describe experiments designed to determine the feasibility of distinguishing known from novel associations in a clinical dataset comprising International Classification of Diseases, Ninth Revision (ICD-9) codes from 1.6 million patients, by comparing them to associations of ICD-9 codes derived from 20.5 million Medline citations processed using MetaMap. Associations that appear in the clinical dataset but not in Medline citations are potentially novel. METHODS: Pairwise associations of ICD-9 codes were independently identified in both the clinical and Medline datasets, which were then compared to quantify their degree of overlap. We also manually reviewed a subset of the associations to validate how well MetaMap identified the diagnoses mentioned in Medline citations that formed the basis of the Medline associations. RESULTS: The overlap of associations based on ICD-9 codes in the clinical and Medline datasets was low: only 6.6% of the 3.1 million associations found in the clinical dataset were also present in the Medline dataset. Further, a manual review of a subset of the associations that appeared in both datasets revealed that co-occurring diagnoses from Medline citations do not always represent clinically meaningful associations. DISCUSSION: Identifying novel associations derived from large clinical datasets remains challenging. Medline as a sole data source for existing knowledge may not be adequate to filter out widely known associations. CONCLUSIONS: In this study, novel associations were not readily identified. Further improvements in accuracy and relevance for tools such as MetaMap are needed to realize their expected utility.
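The overlap comparison described under METHODS can be illustrated with a small sketch. The abstract does not say how associations were identified, so this example assumes the simplest possible definition (two codes co-occurring in the same record) and uses made-up ICD-9 code lists; the function names `pairwise_associations` and `overlap_fraction` are hypothetical and not taken from the study.

```python
from itertools import combinations


def pairwise_associations(records):
    """Return the set of unordered code pairs that co-occur in at least one record.

    Assumption: an "association" is plain co-occurrence; the study may have
    used a statistical criterion instead.
    """
    pairs = set()
    for codes in records:
        # Sorting the de-duplicated codes makes (a, b) and (b, a) the same pair.
        pairs.update(combinations(sorted(set(codes)), 2))
    return pairs


def overlap_fraction(clinical_pairs, medline_pairs):
    """Fraction of clinical associations that also appear in the Medline set."""
    if not clinical_pairs:
        return 0.0
    return len(clinical_pairs & medline_pairs) / len(clinical_pairs)


# Toy data for illustration only (not from the study).
clinical = [["250.00", "401.9"], ["401.9", "272.4", "250.00"]]
medline = [["250.00", "401.9"], ["493.90", "477.9"]]

clinical_pairs = pairwise_associations(clinical)
medline_pairs = pairwise_associations(medline)

fraction = overlap_fraction(clinical_pairs, medline_pairs)
# Pairs seen only in the clinical data are the "potentially novel" candidates.
novel = clinical_pairs - medline_pairs
print(f"overlap: {fraction:.1%}, potentially novel pairs: {sorted(novel)}")
```

Under this definition, the 6.6% figure reported in RESULTS corresponds to `overlap_fraction` evaluated on the full clinical and Medline association sets.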

Article
Peer Reviewed

Results from the second year of a collaborative effort to forecast influenza seasons in the United States

UC Santa Cruz Previously Published Works (2018)

Accurate forecasts could enable more informed public health decisions. Since 2013, CDC has worked with external researchers to improve influenza forecasts by coordinating seasonal challenges for the United States and the 10 Health and Human Services (HHS) regions. Forecasted targets for the 2014-15 challenge were the onset week, peak week, and peak intensity of the season and the weekly percent of outpatient visits due to influenza-like illness (ILI) 1-4 weeks in advance. We used a logarithmic scoring rule to score the weekly forecasts, averaged the scores over an evaluation period, and then exponentiated the resulting logarithmic score. Poor forecasts had a score near 0, and perfect forecasts a score of 1. Five teams submitted forecasts from seven different models. At the national level, the team scores for onset week ranged from <0.01 to 0.41, scores for peak week ranged from 0.08 to 0.49, and scores for peak intensity ranged from <0.01 to 0.17. The scores for predictions of ILI 1-4 weeks in advance ranged from 0.02 to 0.38 and were highest 1 week ahead. Forecast skill varied by HHS region. Forecasts can predict epidemic characteristics that inform public health actions. CDC, state and local health officials, and researchers are working together to improve forecasts.
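The scoring procedure described above (log score per weekly forecast, averaged over the evaluation period, then exponentiated) can be sketched as follows. This is a minimal illustration, not the challenge's actual scoring code: the function name `forecast_skill` and the `floor` parameter, which keeps the logarithm finite when a forecast assigns zero probability to the observed outcome, are assumptions introduced here.

```python
import math


def forecast_skill(assigned_probs, floor=1e-10):
    """Exponentiated mean log score for one forecast target.

    assigned_probs: for each week in the evaluation period, the probability
    the forecast gave to the outcome that was eventually observed.
    floor: hypothetical lower bound so that log(0) is never evaluated.
    """
    if not assigned_probs:
        raise ValueError("need at least one weekly forecast")
    log_scores = [math.log(max(p, floor)) for p in assigned_probs]
    mean_log_score = sum(log_scores) / len(log_scores)
    return math.exp(mean_log_score)


# Toy probabilities for illustration only (not challenge data).
skill = forecast_skill([0.5, 0.25, 0.125])
perfect = forecast_skill([1.0, 1.0, 1.0])
print(f"skill: {skill:.2f}, perfect forecast: {perfect:.2f}")
```

Because the mean is taken on the log scale, the result is the geometric mean of the assigned probabilities, which is why a perfect forecast scores 1 and a forecast that repeatedly gives the observed outcome little probability scores near 0.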

Article
Peer Reviewed

Collaborative efforts to forecast seasonal influenza in the United States, 2015–2016

UC Santa Cruz Previously Published Works (2019)

Since 2013, the Centers for Disease Control and Prevention (CDC) has hosted an annual influenza season forecasting challenge. The 2015-2016 challenge consisted of weekly probabilistic forecasts of multiple targets, with fourteen models submitted by eleven teams. Forecast skill was evaluated using a modified logarithmic score. We averaged submitted forecasts into a mean ensemble model and compared them against predictions based on historical trends. Forecast skill was highest for seasonal peak intensity and short-term forecasts, while forecast skill for timing of season onset and peak week was generally low. Higher forecast skill was associated with team participation in previous influenza forecasting challenges and use of ensemble forecasting techniques. The mean ensemble consistently performed well and outperformed historical trend predictions. CDC and contributing teams will continue to advance influenza forecasting and work to improve the accuracy and reliability of forecasts so that they can be incorporated more fully into public health response efforts.
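A mean ensemble of probabilistic forecasts can be sketched as a bin-by-bin average. The abstract does not give the forecast format or the weighting, so this example assumes each model reports probabilities over the same ordered set of outcome bins and that all models are weighted equally; the function name `mean_ensemble` and the numbers are hypothetical.

```python
def mean_ensemble(forecasts):
    """Average several probabilistic forecasts bin by bin.

    forecasts: list of probability lists, one per model, all defined over
    the same ordered outcome bins (an assumption of this sketch).
    """
    if not forecasts:
        raise ValueError("need at least one forecast")
    n_bins = len(forecasts[0])
    if any(len(f) != n_bins for f in forecasts):
        raise ValueError("all forecasts must use the same bins")
    n_models = len(forecasts)
    return [sum(f[i] for f in forecasts) / n_models for i in range(n_bins)]


# Toy forecasts from two hypothetical models over three outcome bins.
model_a = [0.2, 0.5, 0.3]
model_b = [0.4, 0.4, 0.2]
ensemble = mean_ensemble([model_a, model_b])
print([round(p, 3) for p in ensemble])
```

If every input forecast sums to 1, the equally weighted average also sums to 1, so the ensemble remains a valid probability distribution and can be scored in the same way as the individual models.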
