Draft:Scholarly profiling

Definition
A "scholarly profiling tool," a term coined by Lane Rasberry, a Wikimedian at the University of Virginia, aims to provide better and more helpful ways for people around the globe to access academic articles. Such tools are generally used by researchers, from students to professionals in industry, to conduct the literature reviews that contextualize academic publications. The goal of a scholarly profiling tool is to identify relevant records given a query. Ideally, the tool is structured so that the query is transparent and reproducible and the results returned have good coverage within the relevant domain(s). Literature reviews, and consequently the scholarly profiling tools used to complete them, are essential because the findings reported in publications (which can often determine the direction of future work) are evaluated based on how they fill gaps in the current understanding of the field.

It is important to distinguish, however, between a scholarly profiling tool and the data it operates on. More data is constantly being added to the databases of search systems, but the quality of the results returned to the researcher still depends on the proficiency of the tool or search system. In the literature, this kind of tool is also known as an academic search system, academic search engine, bibliographic database, evidence synthesis technology, or web-based literature search system.

The benefits of using a scholarly profiling tool are that most return results almost instantaneously, support searching by a variety of factors (topic, date, author, etc.), display how many times a result has been cited by others, and often offer a way to save articles to a "bookshelf" or "library" so they can be found again easily.

Dataset versus tool
Scholarly profiling works when a user queries a tool to present part of a dataset. The quality of the profile depends on the effectiveness of the tool and the completeness of the dataset.

Features
In general, Gusenbauer and Haddaway describe a quality search system as one with good coverage that answers the user's query with high precision (the percentage of returned results that are relevant) and high recall (the percentage of relevant results that are returned), and that allows reviewers to reproduce the results of a transparently documented search process.
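
The two measures can be illustrated with a minimal sketch. The function name and the record identifiers below are invented for illustration and are not taken from Gusenbauer and Haddaway; the sketch assumes that the full set of relevant records is known, which in practice it rarely is.

```python
def precision_recall(returned, relevant):
    """Return (precision, recall) for a single query.

    returned: identifiers of the records the search system returned
    relevant: identifiers of all records that are actually relevant
    """
    returned, relevant = set(returned), set(relevant)
    hits = returned & relevant  # relevant records that were returned
    precision = len(hits) / len(returned) if returned else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Invented identifiers, for illustration only.
returned_ids = ["r1", "r2", "r3", "r4"]
relevant_ids = ["r2", "r4", "r5"]

precision, recall = precision_recall(returned_ids, relevant_ids)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

Here two of the four returned records are relevant (precision 0.50), and two of the three relevant records were returned (recall of about 0.67).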

As such, these authors used 27 diverse criteria to evaluate whether a particular academic search system is suitable for systematic reviews or meta-analyses. Although conducting a systematic or meta-analytic review is only a small part of research, their criteria nevertheless provide useful insight into the features that an academic search engine may have. The features are listed here:


 * 1) Subject
 * 2) Size
 * 3) Record Type (Selectable separately)
 * 4) Retrospective Coverage (Oldest Entries)
 * 5) Open Access Content?
 * 6) Controlled Vocabulary?
 * 7) Field codes/Limiters?
 * 8) Full Text Search Option?
 * 9) Search String Length
 * 10) Server Resp/ Time/Records: Max. Word Comb.
 * 11) Language
 * 12) Boolean Functional? OR
 * 13) Boolean Functional? AND
 * 14) Boolean Functional? NOT
 * 15) Comparative Test
 * 16) Query Interpretation/Query Expansion
 * 17) Truncation/ Wildcards Available?
 * 18) Exact Phrase Functional?
 * 19) Parenthesis Functional?
 * 20) Post-query Results Refinement
 * 21) Citation Search
 * 22) Advanced Search String Field?
 * 23) Search Help?
 * 24) No. of Accessible Hits
 * 25) Bulk Download?
 * 26) Repeatable? Time
 * 27) Location-independent?
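
Several of the criteria above (12 to 14 and 19) concern whether Boolean operators and parentheses behave as expected. As a rough sketch of what "functional" means here, the operators can be modeled as set operations over a toy index that maps each search term to the identifiers of the records containing it. The terms and identifiers are invented, and real search systems implement query parsing very differently.

```python
# Toy inverted index: search term -> identifiers of records containing it.
# All terms and identifiers are invented for illustration.
index = {
    "malaria": {1, 2, 5},
    "vaccine": {2, 3, 5},
    "mouse": {5, 6},
}

# OR is set union, AND is set intersection, NOT is set difference.
malaria_or_vaccine = index["malaria"] | index["vaccine"]
malaria_and_vaccine = index["malaria"] & index["vaccine"]
malaria_not_mouse = index["malaria"] - index["mouse"]

# Parentheses change the result:
# (malaria OR vaccine) AND mouse
grouped_left = (index["malaria"] | index["vaccine"]) & index["mouse"]
# malaria OR (vaccine AND mouse)
grouped_right = index["malaria"] | (index["vaccine"] & index["mouse"])

print(sorted(malaria_or_vaccine))
print(sorted(malaria_and_vaccine))
print(sorted(malaria_not_mouse))
print(sorted(grouped_left), sorted(grouped_right))
```

A system in which the two grouped queries return the same records would not be treating parentheses as functional in the sense of criterion 19.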

The properties of academic search engines that Gusenbauer and Haddaway discuss provide insight into what a primary, reliable academic search engine should offer. Beyond these features, others might also be helpful for researchers. As described by Rasberry, a "scholarly profiling" tool should be able to discern:


 * 1) Academic articles from a particular region/country
 * 2) Database created by a particular author/institution
 * 3) All articles from an academic institution (e.g. a university or government agency)
 * 4) Whether a paper is funded by a particular organization
 * 5) Specific publishers

Examples of academic search engines
Various types of academic search engines are available on the Internet. Examples include Google Scholar, Scopus, ResearchGate, and Web of Science. These search engines are often the starting point for researchers seeking inspiration for research and conducting the initial stage of a literature review.

Gusenbauer and Haddaway conducted a study of 28 academic search systems and classified them into principal and supplementary search systems. The study offers insight for people who are interested in finding alternative search methods or in finding articles within a more specific subject field. The 28 systems, selected with reference to the top 63 papers (the top 0.1% of their field) published within two years of September/October 2018, are listed here:


 * 1) ACM Digital Library
 * 2) AMiner
 * 3) arXiv
 * 4) Bielefeld Academic Search Engine (BASE)
 * 5) CiteSeerX
 * 6) ClinicalTrials.gov
 * 7) Cochrane Library
 * 8) Digital Bibliography & Library Project
 * 9) Directory of Open Access Journals
 * 10) EbscoHost
 * 11) Education Resources Information Center
 * 12) Google Scholar
 * 13) IEEE Xplore Digital Library
 * 14) JSTOR
 * 15) Microsoft Academic
 * 16) OVID
 * 17) ProQuest
 * 18) PubMed
 * 19) ScienceDirect
 * 20) Scopus
 * 21) Semantic Scholar
 * 22) SpringerLink
 * 23) Transport Research International Documentation
 * 24) Virtual Health Library
 * 25) Web of Science
 * 26) Wiley Online Library
 * 27) WorldCat
 * 28) WorldWideScience

Other examples of academic search engines
Here we introduce a few other examples of "non-conventional" academic search engines that provide particularly useful features. We envision that our scholarly profiling tool would offer functions similar to the ones demonstrated here:

(1) MathSciNet: MathSciNet is a search engine dedicated to academic literature in mathematics and related sciences. Although users must create an account in order to search, it nevertheless offers many useful functions that other academic search engines could adopt.

MathSciNet also illustrates a point worth knowing for researchers: various academic search engines are dedicated to a specific field of research. While most people tend to use Google Scholar for primary research, it is practical for users to at least know that field-specific academic search engines exist. These "subgroup search engines" may offer articles and research oriented more closely toward the specific topic under investigation.

Some unique features of MathSciNet include:
 * 1) Search by author
 * 2) Search by Review Text
 * 3) Search by specific journal
 * 4) Search by series
 * 5) Search by MSC Primary/Secondary
 * 6) Search by MR number

(2) Lens.org: Lens indexes more than 200 million scholarly works, making it one of the largest indexes currently available, and its combination of scholarly and patent information provides a powerful means of investigating linkages between research and innovation. Some notable features of Lens include:
 * 1) Has Flags:
 * 2) Open Access
 * 3) Has Abstract
 * 4) Has Full Text
 * 5) Has Chemical
 * 6) Has Funding
 * 7) Has Clinical Trial
 * 8) Has Affiliation
 * 9) Has ROR Id
 * 10) Has ORCID iD
 * 11) Cited By Patent
 * 12) Cited by Scholarly Work
 * 13) Search by Institution
 * 14) Search by country/region
 * 15) Search by funding agency (with a filter to select different agencies)
 * 16) Search by conference name

Academic search engine optimization
Academic search engine optimization (ASEO) is the process of optimizing scholarly literature for academic search engines in general; an example is optimizing articles so that they are retrieved by Google Scholar. Research on ASEO indicates that Google Scholar takes the following factors into consideration when a user searches for an academic article: relevance, citation counts, author and publication name, year of publication, and the sources indexed by Google Scholar.

Misconceptions
In the context of academic search systems, many believe that more coverage is always better. However, while greater coverage increases the recall of a query, it simultaneously decreases the precision, all else being held constant. Thus, while some supplementary search systems may have great coverage, it may prove more fruitful for a researcher to use a principal search system specific to the query's domain.
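
The trade-off can be made concrete with a toy calculation. The result sets below are invented: the hypothetical broad-coverage system finds one more relevant record than the domain-specific one, but also returns several irrelevant records that match the same query.

```python
def precision_recall(returned, relevant):
    """Return (precision, recall) for non-empty sets of record identifiers."""
    hits = returned & relevant
    return len(hits) / len(returned), len(hits) / len(relevant)


# Ground truth for one query (invented): the records that are actually relevant.
relevant = {"a", "b", "c"}

# A domain-specific (principal) system misses "c" but returns little noise.
principal_results = {"a", "b", "x"}

# A broad-coverage (supplementary) system finds "c" as well, but also
# returns more irrelevant records from other domains.
supplementary_results = {"a", "b", "c", "x", "y", "z", "w"}

p_principal, r_principal = precision_recall(principal_results, relevant)
p_supp, r_supp = precision_recall(supplementary_results, relevant)

print(f"principal:     precision={p_principal:.2f}, recall={r_principal:.2f}")
print(f"supplementary: precision={p_supp:.2f}, recall={r_supp:.2f}")
```

In this example recall rises from about 0.67 to 1.00 with broader coverage, while precision falls from about 0.67 to about 0.43, so the researcher must screen more irrelevant records to find each relevant one.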