MEDLINE
Revision as of 09:42, 1 March 2011
According to the U.S. National Library of Medicine, "MEDLINE® (Medical Literature Analysis and Retrieval System Online) is the U.S. National Library of Medicine's® (NLM) premier bibliographic database that contains over 16 million references to journal articles in life sciences with a concentration on biomedicine. A distinctive feature of MEDLINE is that the records are indexed with NLM's Medical Subject Headings (MeSH®)."[1]
PubMed is the National Library of Medicine's free online search system for MEDLINE.
Structure
MEDLINE® (Medical Literature Analysis and Retrieval System Online) is a database of predominantly biomedical bibliographic citations maintained by the U.S. National Library of Medicine (NLM).[2] The process for selecting journals is described by the NLM.[3] Each citation includes bibliographic data, an abstract if available, links to the full text of the article, and keywords. The keywords are indexed with the NLM's Medical Subject Headings (MeSH®)[4] and subheadings[5].
The important MeSH terms “Randomized Controlled Trial” and “Controlled Clinical Trial” were introduced in 1991 and 1995, respectively.[6] The Cochrane Collaboration helps MEDLINE correctly retag articles with these terms.[6]
The National Library of Medicine's Indexing Initiative is investigating whether the assignment of MeSH terms can be either fully or semi-automated.[7]
Methods to improve searching MEDLINE
There is much ongoing research into improving MEDLINE search results.
Citation tracking
Citation tracking may help identify relevant studies in MEDLINE.[8][9]
Clustering
Clustering search results may help.[10]
Filters (hedges)
MEDLINE filters, also called hedges, are Boolean combinations of search terms, both text words and MeSH terms, that have been optimized to retrieve articles of a given type. Many filters have been made by the Hedges Team and are available as Clinical Queries at PubMed. Filters have been criticized because they can omit relevant studies.[11]
Filters for article types
Purpose category | Strategy with high sensitivity | Strategy with high specificity |
---|---|---|
1994[12] | | |
Treatment | randomized controlled trial[Publication Type] OR drug therapy[MeSH Subheading] OR therapeutic use[MeSH Subheading] OR random*[Title/Abstract] | placebo*[Title/Abstract] OR (double[Title/Abstract] AND blind*[Title/Abstract]) |
Diagnosis | | |
2005[13] | | |
Treatment | (clinical[Title/Abstract] AND trial[Title/Abstract]) OR clinical trials[MeSH Terms] OR clinical trial[Publication Type] OR random*[Title/Abstract] OR random allocation[MeSH Terms] OR therapeutic use[MeSH Subheading] | randomized controlled trial[Publication Type] OR (randomized[Title/Abstract] AND controlled[Title/Abstract] AND trial[Title/Abstract]) |
Diagnosis | sensitiv*[Title/Abstract] OR sensitivity and specificity[MeSH Terms] OR diagnos*[Title/Abstract] OR diagnosis[MeSH:noexp] OR diagnostic * [MeSH:noexp] OR diagnosis,differential[MeSH:noexp] OR diagnosis[Subheading:noexp] | specificity[Title/Abstract] |
Many MEDLINE filters have been developed by the Hedges Team[13], supported by a grant from the National Library of Medicine.[14] The filters were initially published in 1994[12] and then revised and published in 2005[15].
Examples include filters for randomized controlled trials[16] and systematic reviews[17].
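In practice a filter is combined with a topic search using AND. The following is a minimal sketch of that composition, using the 2005 treatment strategies from the table above. The function name `apply_filter` and the example topic are illustrative assumptions and are not part of PubMed or of the Hedges Team's materials.

```python
# Filter strings copied from the 2005 "Treatment" row of the table above.
TREATMENT_2005 = {
    "sensitive": (
        "(clinical[Title/Abstract] AND trial[Title/Abstract]) "
        "OR clinical trials[MeSH Terms] OR clinical trial[Publication Type] "
        "OR random*[Title/Abstract] OR random allocation[MeSH Terms] "
        "OR therapeutic use[MeSH Subheading]"
    ),
    "specific": (
        "randomized controlled trial[Publication Type] OR "
        "(randomized[Title/Abstract] AND controlled[Title/Abstract] "
        "AND trial[Title/Abstract])"
    ),
}


def apply_filter(topic: str, strategy: str = "specific") -> str:
    """Combine a topic query with a methodological filter using AND.

    Both sides are parenthesized so that the OR terms inside the filter
    cannot bind to the topic terms. (Hypothetical helper for illustration.)
    """
    return f"({topic}) AND ({TREATMENT_2005[strategy]})"


# Example topic chosen for illustration only.
query = apply_filter("heart failure[MeSH Terms]")
print(query)
```

Choosing the "sensitive" strategy instead trades a longer result list for fewer missed trials.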
Filters for subject types
Filters have been developed for articles about kidney disease[18], dentistry[19], and specific age ranges[20].
Relevancy ranking
Although MEDLINE is usually searched for exact matches using Boolean terms, relevancy ranking has been studied. In an early comparison, relevancy ranking performed well; however, the Boolean version of MEDLINE did not fully use MeSH terms.[21][22]
eTBLAST uses text mining to search for similar publications.[23][24]
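The general idea of ranking by text similarity can be sketched with a bag-of-words cosine similarity. This is only an illustration of relevancy ranking under simple assumptions; it is not eTBLAST's actual algorithm, and the abstracts and identifiers below are made up.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Lower-case the text and count its alphabetic word tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_by_similarity(query_text: str, abstracts: dict) -> list:
    """Return (id, score) pairs sorted from most to least similar."""
    q = tokenize(query_text)
    scores = {key: cosine(q, tokenize(text)) for key, text in abstracts.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Made-up abstracts keyed by made-up identifiers, for illustration only.
abstracts = {
    "A": "randomized trial of beta blockers in heart failure",
    "B": "gene expression in yeast under heat stress",
    "C": "beta blockers reduce mortality in heart failure patients",
}
ranking = rank_by_similarity("beta blockers for heart failure", abstracts)
print(ranking)
```

Unlike a Boolean search, every document receives a score, so the user sees an ordered list rather than a set of exact matches.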
Citation analysis or PageRank
There are conflicting results over the role of ranking results based on citation counts or PageRank. A study using Google's own PageRank found PubMed's clinical queries to be better.[25] However, a comparative study found better results for a metric analogous to PageRank for biomedical journals based on:[26][27]
Machine learning
Machine learning methods, in which the search engine seeks articles that most closely resemble a set of included articles, may be more accurate than Boolean methods (see EBMSearch below).[28] However, the study by Aphinyanaphongs compared machine learning to the 1994 Boolean filters.[28]
Machine learning may be improved by an ensemble learning method, stacked generalization (or stacking), that emphasizes the role of UMLS concepts and title words.[29]
Research methods for comparative studies
In comparing the information retrieval of search strategies, there are two experimental methods.
- If a complete test collection of articles is available that is already divided into articles meeting inclusion criteria and articles not meeting criteria, then each strategy is compared for its ability to identify the articles meeting criteria (sensitivity) and to exclude the articles not meeting criteria (specificity). Sensitivity is also called "recall".[30]
- If a partial test collection is available that consists only of articles meeting inclusion criteria (for example, articles meeting inclusion criteria for ACP Journal Club[28], articles included in a systematic review of a clinical topic, or articles in an annotated bibliography[27]), then the sensitivity is again the proportion of relevant articles identified by the strategy. However, the specificity is not computable. Instead, one of several related measures is calculated. These measures are all based on the positive predictive value (PPV) of the strategy. As with the PPV in diagnostic testing, the PPV rises and falls with the prevalence of relevant articles in the collection and thus is not stable across prevalences.[31]
- Precision is "the proportion of retrieved articles that meet criteria" and thus is the same as the PPV.[32][33]
- Hit curve "is the number of important articles among the first n results."[34][26]
- Number Needed to Read (NNR) is "how many papers in a journal have to be read to find one of adequate clinical quality and relevance."[35][36][31][25] Of note, the NNR has been proposed as a metric to help libraries to decide which journals to subscribe to.[35]
- 11-point precision recall graph is similar to a receiver operating characteristic curve[28]
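The measures above can be sketched in a few lines. The counts are invented for illustration, and the number needed to read is computed here as the reciprocal of precision, which is an assumption about how the quoted definition is put into practice.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Proportion of relevant articles that the strategy retrieves (recall)."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Proportion of irrelevant articles that the strategy excludes."""
    return tn / (tn + fp)


def precision(tp: int, fp: int) -> float:
    """Proportion of retrieved articles that are relevant (same as PPV)."""
    return tp / (tp + fp)


def number_needed_to_read(tp: int, fp: int) -> float:
    """Assumed here to be 1 / precision."""
    return 1 / precision(tp, fp)


def hit_curve(ranked_relevance: list) -> list:
    """Number of relevant articles among the first n results, for each n."""
    hits, total = [], 0
    for is_relevant in ranked_relevance:
        total += int(is_relevant)
        hits.append(total)
    return hits


def ppv_from_prevalence(sens: float, spec: float, prevalence: float) -> float:
    """PPV implied by a fixed sensitivity and specificity at a given prevalence."""
    true_pos = sens * prevalence
    false_pos = (1 - spec) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Invented example: 1000 articles, 50 relevant; the strategy retrieves
# 40 relevant and 60 irrelevant articles.
tp, fn, fp, tn = 40, 10, 60, 890
print(sensitivity(tp, fn), specificity(tn, fp), precision(tp, fp))
print(number_needed_to_read(tp, fp))
print(hit_curve([True, False, True, True, False]))

# Same sensitivity and specificity, lower prevalence of relevant articles.
print(ppv_from_prevalence(0.8, 890 / 950, 0.05))
print(ppv_from_prevalence(0.8, 890 / 950, 0.01))
```

The last two lines illustrate why precision-based measures are not stable: with sensitivity and specificity held fixed, the PPV falls as relevant articles become rarer in the collection.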
Methods to access MEDLINE
There are many third party interfaces to search MEDLINE such as OVID[37]. The National Library of Medicine's own search interface is PubMed (http://pubmed.gov).
PubMed
PubMed (http://pubmed.gov) is the National Library of Medicine's own free Internet access to MEDLINE. PubMed has been freely available since 1997.
EBM Search
EBM Search (http://www.ahsl.arizona.edu/ebmsearch/) is a federated medical search engine.[38]
EBMSearch
EBMSearch (http://ebmsearch.org/) maintains its own copy of MEDLINE and uses machine learning to rank articles.[28]
eTBLAST
eTBLAST uses text mining to search for similar publications.[23][24]
GoPubMed
GoPubMed (http://www.GoPubMed.org/) applies social networking to MEDLINE.[39]
HubMed
HubMed (http://www.hubmed.org/) does not maintain its own copy of MEDLINE, but rather uses PubMed's EUtils web service to retrieve MEDLINE records stored at PubMed.[40]
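As a rough illustration of how a third-party interface might call the Entrez Programming Utilities, the sketch below builds request URLs for the `esearch` and `efetch` endpoints. It is not HubMed's code; the helper names are hypothetical, the base URL is the current public E-utilities address, and no network request is made. The two PMIDs are taken from this article's reference list.

```python
from urllib.parse import urlencode

# Public base address of the NCBI E-utilities.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(term: str, retmax: int = 20) -> str:
    """Build an esearch URL that would return PMIDs matching a query."""
    params = {"db": "pubmed", "term": term, "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def efetch_url(pmids: list) -> str:
    """Build an efetch URL that would return the records for given PMIDs."""
    params = {"db": "pubmed", "id": ",".join(pmids), "retmode": "xml"}
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


search = esearch_url("heart failure[MeSH Terms]")
fetch = efetch_url(["15561789", "18952929"])
print(search)
print(fetch)
```

A client built this way always reads the records held at PubMed, which is why it does not need a local copy of MEDLINE.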
Ovid
SUMSearch
SUMSearch (http://sumsearch.uthscsa.edu/) is a federated medical search engine. It does not maintain its own copy of MEDLINE, but rather queries PubMed and revises searches if too few or too many citations are retrieved. At the same time, SUMSearch queries the National Guideline Clearinghouse, DARE, Wikipedia, and other resources.
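One way such revision could work is to try progressively narrower strategies until the number of citations falls in an acceptable range. The sketch below is purely hypothetical: it is not SUMSearch's actual logic, and the thresholds, strategy names, and `count_hits` callback are assumptions made for illustration.

```python
from typing import Callable, Optional, Sequence


def revise_search(
    strategies: Sequence[str],
    count_hits: Callable[[str], int],
    min_hits: int = 5,
    max_hits: int = 50,
) -> str:
    """Pick a strategy from a list ordered from broadest to narrowest.

    Narrow the search while it returns more than max_hits citations.
    If narrowing overshoots and returns fewer than min_hits, fall back
    to the previous (broader) strategy.
    """
    previous: Optional[str] = None
    for query in strategies:
        n = count_hits(query)
        if n > max_hits:
            previous = query  # too many citations: try a narrower strategy
            continue
        if n < min_hits and previous is not None:
            return previous  # too few citations: step back one level
        return query
    return strategies[-1]  # every strategy returned too many citations


# Invented hit counts standing in for real searches.
fake_counts = {"broad": 400, "medium": 30, "narrow": 2}
chosen = revise_search(["broad", "medium", "narrow"], fake_counts.__getitem__)
print(chosen)
```

In a real system `count_hits` would issue a query to PubMed; here it is a dictionary lookup so that the example runs without network access.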
References
- ↑ MEDLINE Fact Sheet. National Library of Medicine. Retrieved on 2008-01-22.
- ↑ National Library of Medicine. MEDLINE Fact Sheet. Retrieved on 2007-11-09.
- ↑ Anonymous (2007). MEDLINE® Journal Selection Fact Sheet. National Library of Medicine. Retrieved on 2010-04-04.
- ↑ National Library of Medicine. Medical Subject Headings (MESH®) Fact Sheet. Retrieved on 2007-11-09.
- ↑ Anonymous (2008). Qualifiers - 2008. National Library of Medicine. Retrieved on 2008-03-19.
- ↑ 6.0 6.1 Glanville JM, Lefebvre C, Miles JN, Camosso-Stefinovic J (2006). "How to identify randomized controlled trials in MEDLINE: ten years on". J Med Libr Assoc 94 (2): 130-6. PMID 16636704. PMC 1435857.
- ↑ National Library of Medicine. Indexing Initiative. Retrieved on 2007-11-25.
- ↑ Bakkalbasi N, Bauer K, Glover J, Wang L (2006). "Three options for citation tracking: Google Scholar, Scopus and Web of Science". Biomed Digit Libr 3: 7. DOI:10.1186/1742-5581-3-7. PMID 16805916.
- ↑ Kuper H, Nicholson A, Hemingway H (2006). "Searching for observational studies: what does citation tracking add to PubMed? A case study in depression and coronary heart disease". BMC Med Res Methodol 6: 4. DOI:10.1186/1471-2288-6-4. PMID 16483366.
- ↑ Lin Y, Li W, Chen K, Liu Y (2007). "A document clustering and ranking system for exploring MEDLINE citations". J Am Med Inform Assoc 14 (5): 651–61. DOI:10.1197/jamia.M2215. PMID 17600104.
- ↑ Leeflang MM, Scholten RJ, Rutjes AW, Reitsma JB, Bossuyt PM (2006). "Use of methodological search filters to identify diagnostic accuracy studies can lead to the omission of relevant studies". J Clin Epidemiol 59 (3): 234-40. DOI:10.1016/j.jclinepi.2005.07.014. PMID 16488353.
- ↑ 12.0 12.1 Haynes RB, Wilczynski N, McKibbon KA, Walker CJ, Sinclair JC (1994). "Developing optimal search strategies for detecting clinically sound studies in MEDLINE". J Am Med Inform Assoc 1 (6): 447-58. PMID 7850570. PMC 116228.
- ↑ 13.0 13.1 Hedges Team. Search Strategies. Retrieved on 2011-03-015.
- ↑ Project Information - NIH RePORTER – NIH Research Portfolio Online Reporting Tool Expenditures and Results. Retrieved on 2007-11-25.
- ↑ Haynes RB, McKibbon KA, Wilczynski NL, Walter SD, Werre SR, Hedges Team (2005). "Optimal search strategies for retrieving scientifically strong studies of treatment from Medline: analytical survey". BMJ 330 (7501): 1179. DOI:10.1136/bmj.38446.498542.8F. PMID 15894554. PMC 558012.
- ↑ McKibbon KA, Wilczynski NL, Haynes RB (2009). "Retrieving randomized controlled trials from MEDLINE: a comparison of 38 published search filters". Health Info Libr J 26 (3): 187-202. DOI:10.1111/j.1471-1842.2008.00827.x. PMID 19712211.
- ↑ Wilczynski NL, Haynes RB (2009). "Consistency and accuracy of indexing systematic review articles and meta-analyses in MEDLINE". Health Info Libr J 26 (3): 203-10. DOI:10.1111/j.1471-1842.2008.00823.x. PMID 19712212.
- ↑ Garg AX, Iansavichus AV, Wilczynski NL, Kastner M, Baier LA, Shariff SZ et al. (2009). "Filtering Medline for a clinical discipline: diagnostic test assessment framework". BMJ 339: b3435. DOI:10.1136/bmj.b3435. PMID 19767336.
- ↑ Niederman R, Chen L, Murzyn L, Conway S. Benchmarking the dental randomised controlled literature on MEDLINE. Evidence-Based Dentistry. 2002;3:5-9 DOI:10.1038/sj/ebd/4600095
- ↑ Kastner M, Wilczynski NL, Walker-Dilks C, McKibbon KA, Haynes B (2006). "Age-specific search strategies for Medline". J Med Internet Res 8 (4): e25. DOI:10.2196/jmir.8.4.e25. PMID 17213044. PMC 1794003.
- ↑ Hersh WR, Hickam DH (1992). "A comparison of retrieval effectiveness for three methods of indexing medical literature". Am. J. Med. Sci. 303 (5): 292–300. PMID 1580316.
- ↑ Hersh WR, Hickam DH, Haynes RB, McKibbon KA (1994). "A performance and failure analysis of SAPHIRE with a MEDLINE test collection". J Am Med Inform Assoc 1 (1): 51–60. PMID 7719787.
- ↑ 23.0 23.1 Errami M, Wren JD, Hicks JM, Garner HR (2007). "eTBLAST: a web server to identify expert reviewers, appropriate journals and similar publications". Nucleic Acids Res 35 (Web Server issue): W12-5. DOI:10.1093/nar/gkm221. PMID 17452348. PMC 1933238.
- ↑ 24.0 24.1 Lewis J, Ossowski S, Hicks J, Errami M, Garner HR (2006). "Text similarity: an alternative way to search MEDLINE". Bioinformatics 22 (18): 2298-304. DOI:10.1093/bioinformatics/btl388. PMID 16926219.
- ↑ 25.0 25.1 Haase A, Follmann M, Skipka G, Kirchner H (2007). "Developing search strategies for clinical practice guidelines in SUMSearch and Google Scholar and assessing their retrieval performance". BMC Med Res Methodol 7: 28. DOI:10.1186/1471-2288-7-28. PMID 17603909.
- ↑ 26.0 26.1 Bernstam EV, Herskovic JR, Aphinyanaphongs Y, Aliferis CF, Sriram MG, Hersh WR (2006). "Using citation data to improve retrieval from MEDLINE". J Am Med Inform Assoc 13 (1): 96–105. DOI:10.1197/jamia.M1909. PMID 16221938.
- ↑ 27.0 27.1 Herskovic JR, Bernstam EV (2005). "Using incomplete citation data for MEDLINE results ranking". AMIA Annu Symp Proc: 316–20. PMID 16779053.
- ↑ 28.0 28.1 28.2 28.3 28.4 Aphinyanaphongs Y, Tsamardinos I, Statnikov A, Hardin D, Aliferis CF (2005). "Text categorization models for high-quality article retrieval in internal medicine". J Am Med Inform Assoc 12 (2): 207–16. DOI:10.1197/jamia.M1641. PMID 15561789.
- ↑ Kilicoglu H, Demner-Fushman D, Rindflesch TC, Wilczynski NL, Haynes RB (2009). "Towards automatic recognition of scientifically rigorous clinical research evidence". J Am Med Inform Assoc 16 (1): 25–31. DOI:10.1197/jamia.M2996. PMID 18952929. PMC 2605595.
- ↑ Hersh, William R. (2008). Information Retrieval: A Health and Biomedical Perspective (Health Informatics). Berlin: Springer. ISBN 0-387-78702-X.
- ↑ 31.0 31.1 Bachmann LM, Coray R, Estermann P, Ter Riet G (2002). "Identifying diagnostic studies in MEDLINE: reducing the number needed to read". J Am Med Inform Assoc 9 (6): 653–8. PMID 12386115.
- ↑ Haynes RB, Wilczynski NL (2004). "Optimal search strategies for retrieving scientifically strong studies of diagnosis from Medline: analytical survey". BMJ 328 (7447): 1040. DOI:10.1136/bmj.38068.557998.EE. PMID 15073027.
- ↑ Zhang L, Ajiferuke I, Sampson M (2006). "Optimizing search strategies to identify randomized controlled trials in MEDLINE". BMC Med Res Methodol 6: 23. DOI:10.1186/1471-2288-6-23. PMID 16684359. PMC 1488863.
- ↑ Herskovic JR, Iyengar MS, Bernstam EV (2007). "Using hit curves to compare search algorithm performance". J Biomed Inform 40 (2): 93–9. DOI:10.1016/j.jbi.2005.12.007. PMID 16469545.
- ↑ 35.0 35.1 Toth B, Gray JA, Brice A (2005). "The number needed to read-a new measure of journal value". Health Info Libr J 22 (2): 81–2. DOI:10.1111/j.1471-1842.2005.00568.x. PMID 15910578.
- ↑ McKibbon KA, Wilczynski NL, Haynes RB (2004). "What do evidence-based secondary journals tell us about the publication of clinically important articles in primary healthcare journals?". BMC Med 2: 33. DOI:10.1186/1741-7015-2-33. PMID 15350200.
- ↑ Anonymous. MEDLINE® - Ovid's MEDLINE. Retrieved on 2007-11-09.
- ↑ Bracke PJ, Howse DK, Keim SM (April 2008). "Evidence-based Medicine Search: a customizable federated search engine". J Med Libr Assoc 96 (2): 108–13. DOI:10.3163/1536-5050.96.2.108. PMID 18379665. PMC 2268222.
- ↑ Doms A, Schroeder M (July 2005). "GoPubMed: exploring PubMed with the Gene Ontology". Nucleic acids research 33 (Web Server issue): W783–6. DOI:10.1093/nar/gki470. PMID 15980585. PMC 1160231.
- ↑ Eaton AD (July 2006). "HubMed: a web-based biomedical literature search interface". Nucleic acids research 34 (Web Server issue): W745–7. DOI:10.1093/nar/gkl037. PMID 16845111. PMC 1538859.
External links
- PubMed
- PubMed for Handhelds
- PubMed usage statistics
- Entrez Programming Utilities
- Déjà vu: a Database of Duplicate Citations in the Scientific Literature (See Déjà vu--a study of duplicate citations in Medline PMID 18056062)