Monday, October 14, 2013

Webometrics and Research Impact Analysis

The field of webometrics grew out of citation analysis, bibliometrics, and scientometrics (also referred to as cybermetrics and informetrics). Webometrics is “the study of web based content with primarily quantitative methods for social science research goals using techniques that are not specific to one field of study” (Thelwall 2009, 6). Drawing on Garfield’s (1972) early citation analysis work for journal evaluation, Almind and Ingwersen (1997) are credited with coining the term “webometrics” and illustrated how scholarly web artifacts can be assessed in terms of their visibility and relationships to each other, much as in traditional citation analysis (Thelwall 2009; Björneborn and Ingwersen 2001; Björneborn and Ingwersen 2004). This literature includes extensive discussion of web impact analysis, search engine optimization, link analysis, and tools such as SocSciBot, web crawlers, LexiURL Searcher (now Thelwall’s Webometric Analyst), web traffic rankings, page ranking, and citation networks. The web has become a global publishing platform with very sophisticated indexing and citation analysis capabilities (Jalal, Biswas, and Mukhopadhyay 2009; Kousha and Thelwall 2009).

The interconnectedness of web information is especially suited to scholarly communications, where web mentions (i.e., citations), types of links between web pages (relationships), and the resulting network dynamics produce quantifiable metrics for scholarly impact, usage, and lineage (Kousha 2005; Thelwall 2009; Bollen, Rodriguez, and Van de Sompel 2007). Webometrics can be used to analyze material posted to the web and the network structure of references to academic work, not only by determining the frequency of citation but also by ranking or scoring these mentions according to the weight or popularity of the referring hyperlinks. The resulting metrics (linkages, citations, mentions, usage, etc.) are analogous to reputation systems derived from traditional citation analysis procedures. This approach has since been operationalized for citation analysis with web-based tools such as Harzing’s “Publish or Perish” and Indiana University’s “Scholarometer.” These tools leverage the power and accessibility of Google Scholar (GS) to go beyond what proprietary indices such as ISI Web of Science and Scopus offer (see Hoang, Kaur, and Menczer 2010; Harzing and van der Wal 2009; Moed 2009; Falagas, Pitsouni, Malietzis, and Pappas 2008; Neuhaus and Daniel 2008; and MacRoberts and MacRoberts 2010).
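To make the idea of weighting mentions by the popularity of the referring page more concrete, here is a minimal sketch. It is not the method used by any of the tools named above; it assumes a simplified PageRank-style power iteration, a damping factor of 0.85, and a tiny made-up link graph whose page names (`hub`, `blog1`, `paper_a`, and so on) are purely hypothetical.

```python
def rank_pages(links, damping=0.85, iterations=50):
    """Simplified PageRank-style scores for a small link graph.

    links maps each page to the list of pages it links to.
    The damping value and iteration count are illustrative assumptions.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's weight among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outlinks spreads its weight evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


def mention_score(artifact, links, rank):
    """Return (raw mention count, rank-weighted mention score)."""
    referrers = [page for page, targets in links.items() if artifact in targets]
    return len(referrers), sum(rank[page] for page in referrers)


# Hypothetical graph: both papers are mentioned once, but paper_a is
# mentioned by a page that is itself linked to by three other pages.
example_links = {
    "hub": ["paper_a"],
    "blog1": ["hub"],
    "blog2": ["hub"],
    "blog3": ["hub"],
    "lonely": ["paper_b"],
}
example_rank = rank_pages(example_links)
print(mention_score("paper_a", example_links, example_rank))
print(mention_score("paper_b", example_links, example_rank))
```

In this toy graph the two papers have the same raw mention count, yet the weighted score favors the paper whose referring page is itself well linked, which is the distinction between counting citations and ranking them.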

Nearly all of the literature on webometrics related to scholarly evaluation replicates traditional citation analysis. Much of this research explores whether open-access indices, especially GS, can produce citation metrics similar to those of the ISI and Scopus citation indices (see Harzing and van der Wal 2009; Kousha and Thelwall 2009; Meho and Sugimoto 2009). When GS was launched in 2004, it did not have the coverage of ISI or Scopus. That has since changed, and GS, the most “democratic” of the three, has been shown to produce results comparable to the other two for many disciplines (Harzing 2010).

Some of the initial applications of webometrics focused on assessing hyperlinks to estimate “web impact factors” for scientific research web sites as well as for universities as a whole (Mukhopadhyay 2004). Analyzing both outlinks and inlinks (i.e., backlinks and co-linking) reveals the volume, reach, and hierarchy of web sites. The structure of the Domain Name System (DNS), for instance the top-level domains, sub-level domains, and host (or site) level domains, can be used to determine the country, organization type, and page context of these links (see Thelwall 2004 for further discussion). This information can be extracted to derive the network relationships among web sites, much like a social network. This network approach to web site relationships can also be applied to scholarly artifacts that appear or are referenced on the web, with indices or search engines navigating databases of link structures much like those Garfield (1955) proposed for citation indexing (Neuhaus and Daniel 2008).
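A rough sketch of how link data might be broken down by domain level is shown below. It assumes a list of (source URL, target URL) pairs has already been collected by a crawler, and it uses the commonly cited definition of a web impact factor as the number of external pages linking to a site divided by the number of pages in the site. All URLs and the page count are hypothetical, and the naive "last label" split does not handle multi-part suffixes such as ac.uk.

```python
from collections import Counter
from urllib.parse import urlsplit


def domain_levels(url):
    """Split a URL's host into rough DNS levels (naive label split)."""
    host = urlsplit(url).hostname or ""
    labels = host.split(".")
    return {
        "host": host,                    # e.g. lab.target.edu
        "domain": ".".join(labels[-2:]),  # e.g. target.edu
        "tld": labels[-1],               # e.g. edu
    }


def inlinks_by_tld(link_pairs, target_domain):
    """Count external inlinks to target_domain, grouped by source TLD."""
    counts = Counter()
    for source_url, target_url in link_pairs:
        source = domain_levels(source_url)
        target = domain_levels(target_url)
        if target["domain"] != target_domain:
            continue  # link points somewhere else
        if source["domain"] == target_domain:
            continue  # self-link inside the same site
        counts[source["tld"]] += 1
    return counts


def web_impact_factor(external_inlinks, site_pages):
    """External inlinking pages divided by the number of pages in the site."""
    return external_inlinks / site_pages


# Hypothetical crawl output: (source URL, target URL) pairs.
example_pairs = [
    ("http://www.alpha.edu/a.html", "http://lab.target.edu/paper.pdf"),
    ("http://news.beta.org/post", "http://www.target.edu/"),
    ("http://www.target.edu/index.html", "http://lab.target.edu/paper.pdf"),
    ("http://gamma.edu/x", "http://other.com/"),
]
example_counts = inlinks_by_tld(example_pairs, "target.edu")
print(dict(example_counts))
# The site page count of 4 is an assumed value for illustration.
print(web_impact_factor(sum(example_counts.values()), site_pages=4))
```

Treating each pair as one linking page is a simplification; a fuller analysis would deduplicate source pages and use a maintained public-suffix list to identify domain boundaries.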

Thelwall, Klitkou, Verbeek, Stuart, and Vincent (2010) point out the challenge of including gray literature, stating, “A big disadvantage of link analysis webometrics, in contrast to citation analysis, is that web publishing is heterogeneous, varying from spam to post-prints.  As a result, the quality of the indicators produced is typically not high unless irrelevant content is manually filtered out, and the results also tend to reflect a range of phenomena rather than just research impact” (p. 2). Because there are no standardized citation-like databases, for gray literature in particular, there is not the same level of control over how artifacts are cited on the web. This issue will be solved as webometrics becomes more fully utilized.

Unlike citation analysis and bibliometrics, which focus on references to books, chapters, and journal articles by a small audience of academics, this approach encompasses a greater portion of the scholarly footprint by including some of the gray literature and non-refereed output of faculty members. The metrics that will be discussed delineate four dimensions that are implicit in the spirit of citation analyses: productivity, visibility, reputation, and impact. Each of these has been discussed either directly or indirectly in the scholarly communications and citation analysis literature, but not explicitly in terms of faculty evaluation criteria. This is primarily because the application of webometrics departs from the control and domains of academic publishing companies and academics themselves as sources of reputational measures. My next blog post will briefly discuss each of these.


Almind, T. C, and P. Ingwersen. 1997. “Informetric analyses on the World Wide Web: methodological approaches to ‘Webometrics’.” Journal of Documentation 53 (4): 404-426.

Bollen, J., M. A Rodriguez, H. Van de Sompel, L. L Balakireva, and A. Hagberg. 2007. The largest scholarly semantic network... ever. In Proceedings of the 16th international conference on World Wide Web, 1247-1248. ACM.

Björneborn, L., and P. Ingwersen. 2001. “Perspective of webometrics.” Scientometrics 50 (1): 65-82.

Björneborn, L., and P. Ingwersen. 2004. “Toward a basic framework for webometrics.” Journal of the American Society for Information Science and Technology 55 (14): 1216-1227.

Falagas, M. E, E. I Pitsouni, G. A Malietzis, and G. Pappas. 2008. “Comparison of PubMed, Scopus, Web of Science, and Google Scholar: strengths and weaknesses.” The FASEB Journal 22 (2): 338-342.

Garfield, E. 1955. “Citation indexes to science: a new dimension in documentation through association of ideas.” Science 122: 108-111.

Garfield, E. 1972. “Citation analysis as a tool in journal evaluation.” Science 178 (4060): 471-479.

Harzing, A. W. 2010. The publish or perish book. Tarma Software Research.

Harzing, A. W, and R. van der Wal. 2009. “A Google Scholar h‐index for journals: An alternative metric to measure journal impact in economics and business.” Journal of the American Society for Information Science and Technology 60 (1): 41-46.

Hoang, D. T, J. Kaur, and F. Menczer. 2010. “Crowdsourcing scholarly data.”

Jalal, S. K, S. C Biswas, and P. Mukhopadhyay. 2009. “Bibliometrics to webometrics.” Information Studies 15 (1): 3-20.

Kousha, K. 2005. “Webometrics and Scholarly Communication: An Overview.” Quarterly Journal of the National Library of Iran [online] 14 (4).

Kousha, K., and M. Thelwall. 2009. “Google Book Search: Citation analysis for social science and the humanities.” Journal of the American Society for Information Science and Technology 60 (8): 1537-1549. 

MacRoberts, M. H., and B. R. MacRoberts. 2010. “Problems of citation analysis: A study of uncited and seldom‐cited influences.” Journal of the American Society for Information Science and Technology 61 (1): 1-12.

Meho, L. I, and C. R Sugimoto. 2009. “Assessing the scholarly impact of information studies: A tale of two citation databases—Scopus and Web of Science.” Journal of the American Society for Information Science and Technology 60 (12): 2499-2508.

Moed, H. F. 2009. “New developments in the use of citation analysis in research evaluation.” Archivum immunologiae et therapiae experimentalis 57 (1): 13-18. 

Mukhopadhyay, P. 2004. “Measuring Web Impact Factors: A Webometric Study based on the Analysis of Hyperlinks.” Library and Information Science: 1-12.

Neuhaus, C., and H. D Daniel. 2008. “Data sources for performing citation analysis: an overview.” Journal of Documentation 64 (2): 193-210.

Thelwall, M. 2004. Link analysis: An information science approach. Academic Press.

Thelwall, M. 2009. “Introduction to webometrics: Quantitative web research for the social sciences.” Synthesis Lectures on Information Concepts, Retrieval, and Services 1 (1): 1-116.

Thelwall, M., A. Klitkou, A. Verbeek, D. Stuart, and C. Vincent. 2010. “Policy‐relevant Webometrics for individual scientific fields.” Journal of the American Society for Information Science and Technology 61 (7): 1464-1475.

