The following is a non-exhaustive list of corpora that may be useful in Linguistics and that are either freely accessible or available through a Library subscription.
The Library also has a dedicated LibGuide on text and data mining.
Data source | About | Data access | Further information |
---|---|---|---|
Adam Matthew digital | Primary source collections from the social sciences and humanities | API, via Library request (select your Faculty liaison librarian) | Data mining/text mining statement |
American Medical Association | JAMA: The Journal of the American Medical Association is a peer-reviewed medical journal published 48 times a year by the AMA. It publishes original research, reviews, and editorials covering all aspects of the biomedical sciences. | Register for an account with JAMA. Data is packaged in subscription-level sets (e.g., JAMA Internal Medicine 1998-current) and is downloadable in JSON format. | JAMA Network text and data mining services |
Brill online | Primary sources, books, and reference resources in the Humanities and Social Sciences, International Law and selected areas in the Sciences | Content may be downloaded for TDM, via Library request (select your Faculty liaison librarian) | |
Emerald insight | Scholarly academic journals, case studies, and books in the fields of management, business, education, library studies, health care, and engineering | Via CrossRef’s TDM service | CrossRef community forum |
Gale Primary Sources | Historical primary source archives incorporating monographs, manuscripts, newspapers, maps, and photographs | Access through Gale Digital Scholar Lab. | Data Mining, Textual Analytics, the Digital Humanities, and Gale |
HathiTrust | A collection of millions of titles digitized from libraries around the world. | A suite of tools and services through the HathiTrust Research Center | |
JSTOR | Selected content from JSTOR, a digital library of academic journals, books, and primary sources | JSTOR Data for Research | JSTOR Data for Research Dataset Services |
Oxford University Press | Access points for reference resources from Oxford University Press: Oxford Art Online; Oxford Music Online; Oxford Scholarship Online | Via consultation: Data.Mining@oup.com. Please copy lib-eresources-l@monash.edu into request for local assistance. | Oxford Art FAQ; Oxford Music FAQ |
ProQuest TDM Studio | Visualizations interface: newspapers (depending on Monash subscriptions) and ProQuest Dissertations and Theses. Workbench interface: includes most journals, newspapers, dissertations, theses, and primary sources available through Monash subscriptions. | Through record in Search: TDM Studio Visualizations (no coding skills required) or TDM Studio Workbench (R or Python needed) | |
ScienceDirect | Scholarly eJournals and selected eBooks published by Elsevier | API | Elsevier ScienceDirect APIs |
Scopus | Abstracts and citation data from a large multidisciplinary corpus covering published material in STEM and the Humanities | API | Elsevier Scopus APIs |
SpringerLink | Multidisciplinary collection of online resources covering life, health and physical sciences, social sciences, and the humanities | API or CrossRef's TDM service | |
Taylor & Francis online | Scholarly journals, ebooks, and reference works in the Humanities, Social Sciences, Behavioural Sciences, Science, Technology and Medicine sectors | Via Library request (select your Faculty liaison librarian) | |
Web of Science Journals API | Supports rich searching across the Web of Science to retrieve full item-level metadata, including times-cited counts, contributor addresses/affiliations, and funding data. The API is performance-limited based on the API plan chosen by the institution. See further information for details on current API plans. | Sign up and register an application for access to the API subscription. | Web of Science Journals API page |
Wiley Online Library | Multidisciplinary collection of online resources covering life, health and physical sciences, social sciences, and the humanities | API or CrossRef's TDM service | Text and data mining agreement |