TY - JOUR
T1 - Automated identification of molecular effects of drugs (AIMED)
AU - Fathiamini, Safa
AU - Johnson, Amber M.
AU - Zeng, Jia
AU - Araya, Alejandro
AU - Holla, Vijaykumar
AU - Bailey, Ann M.
AU - Litzenburger, Beate C.
AU - Sanchez, Nora S.
AU - Khotskaya, Yekaterina
AU - Xu, Hua
AU - Meric-Bernstam, Funda
AU - Bernstam, Elmer V.
AU - Cohen, Trevor
PY - 2016/7
Y1 - 2016/7
N2 - Introduction: Genomic profiling information is frequently available to oncologists, enabling targeted cancer therapy. Because clinically relevant information is rapidly emerging in the literature and elsewhere, there is a need for informatics technologies to support targeted therapies. To this end, we have developed a system for Automated Identification of Molecular Effects of Drugs, to help biomedical scientists curate this literature to facilitate decision support. Objectives: To create an automated system to identify assertions in the literature concerning drugs targeting genes with therapeutic implications, and to characterize the challenges inherent in automating this process in rapidly evolving domains. Methods: We used subject-predicate-object triples (semantic predications) and co-occurrence relations generated by applying the SemRep Natural Language Processing system to MEDLINE abstracts and ClinicalTrials.gov descriptions. We applied customized semantic queries to find drugs targeting genes of interest. The results were manually reviewed by a team of experts. Results: Compared to a manually curated set of relationships, recall, precision, and F2 were 0.39, 0.21, and 0.33, respectively, which represents a 3- to 4-fold improvement over a publicly available set of predications (SemMedDB) alone. Upon review of ostensibly false positive results, 26% were considered relevant additions to the reference set, and an additional 61% were considered to be relevant for review. Adding co-occurrence data improved results for drugs in early development, but not their better-established counterparts. Conclusions: Precision medicine poses unique challenges for biomedical informatics systems that help domain experts find answers to their research questions. Further research is required to improve the performance of such systems, particularly for drugs in development.
KW - Biomedical question answering
KW - Molecular
KW - Pharmacogenomics
KW - Precision oncology
KW - SemRep
KW - Targeted therapy
UR - http://www.scopus.com/inward/record.url?scp=84981225429&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84981225429&partnerID=8YFLogxK
U2 - 10.1093/jamia/ocw030
DO - 10.1093/jamia/ocw030
M3 - Article
C2 - 27107438
AN - SCOPUS:84981225429
SN - 1067-5027
VL - 23
SP - 758
EP - 765
JO - Journal of the American Medical Informatics Association
JF - Journal of the American Medical Informatics Association
IS - 4
ER -