As we have seen, YouTube is more than cat videos and tutorials; it’s becoming a channel for science. That is why I’ve been reading some papers to prepare my next study. In this case, I selected a publication in Scientometrics named “YouTube and science: models for research impact”. The researchers examine how citations of research articles in YouTube video descriptions can signal, or even predict, their scholarly and societal impact. Next, I’ll do my best to explain what I understood from the study =)
Shaikh, A.R., Alhoori, H. & Sun, M. YouTube and science: models for research impact. Scientometrics 128, 933–955 (2023).
https://doi.org/10.1007/s11192-022-04574-5
Can YouTube help understand the impact of scientific research?
YouTube has evolved far beyond an entertainment platform. It has become a powerful space for sharing and amplifying scientific knowledge. Academic talks, conferences, presentations, and peer-reviewed research have claimed a spot on the platform as videos reaching millions of viewers worldwide. Traditionally, the impact of scientific research has been measured through citation counts in scholarly literature. But in the digital age, knowledge circulates in more fluid ways, including social media platforms, blogs (like this one 😉), and especially videos.
This article aims to:
- Examine the relationship between research articles and YouTube videos that cite them.
- Identify which scientific subjects and video categories are most active in this ecosystem.
- Build predictive models for both scholarly and societal impact based on altmetric signals.
The Method:
To explore the questions, the authors built and combined several datasets:
- Altmetric.com: Altmetric ID, Title, Publication Date, Mendeley Readers, Scopus Subjects, News, Twitter, Facebook, Policy, GooglePlus, Reddit, Blogs, Patent, Wikipedia, Video Citations, YouTube Links, YouTube Citation, Scholarly Citation.
- YouTube: Link, Cited ID (Altmetric ID), Title, Views, Likes, Dislikes, Subscriber Number, Publication Date, Description, Video Category, Comments, Video Views.
The datasets created were:
- A1: 500,000 random research articles from Altmetric.com
- A2: 150,624 articles from Altmetric.com with at least one video mention.
- B: 93,931 YouTube videos citing research articles in their descriptions.
- C1: [A2 + B] -> A2 with video data from B added to each article.
- C2: [A2 + B] -> B with article data from A2 added to each video.
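To make the difference between C1 and C2 more concrete, here is a minimal sketch of the two joins in plain Python. The field names, the toy rows, and the choice to aggregate videos per article (count and total views) are my own assumptions for illustration; the paper's actual schema and aggregation may differ.

```python
from collections import defaultdict

# Hypothetical toy rows; field names are illustrative, not the paper's schema.
articles_a2 = [
    {"altmetric_id": 101, "title": "Article X", "mendeley_readers": 40},
    {"altmetric_id": 102, "title": "Article Y", "mendeley_readers": 5},
]
videos_b = [
    {"link": "v1", "cited_id": 101, "views": 1000, "category": "Education"},
    {"link": "v2", "cited_id": 101, "views": 250, "category": "Science & Technology"},
    {"link": "v3", "cited_id": 102, "views": 80, "category": "Education"},
]


def build_c1(articles, videos):
    """C1-style table: one row per article, with aggregated video data attached."""
    videos_by_article = defaultdict(list)
    for video in videos:
        videos_by_article[video["cited_id"]].append(video)

    rows = []
    for article in articles:
        citing = videos_by_article.get(article["altmetric_id"], [])
        rows.append({
            **article,
            "video_count": len(citing),                       # assumed aggregate
            "total_views": sum(v["views"] for v in citing),   # assumed aggregate
        })
    return rows


def build_c2(articles, videos):
    """C2-style table: one row per video, with the cited article's data attached."""
    article_by_id = {a["altmetric_id"]: a for a in articles}
    rows = []
    for video in videos:
        article = article_by_id.get(video["cited_id"])
        if article is None:
            continue  # skip videos whose cited article is not in A2
        rows.append({
            **video,
            "article_title": article["title"],
            "mendeley_readers": article["mendeley_readers"],
        })
    return rows


c1 = build_c1(articles_a2, videos_b)
c2 = build_c2(articles_a2, videos_b)
```

The point of keeping both tables is the unit of analysis: C1 has one row per article (useful for article-level predictions), while C2 has one row per video (useful for video-level predictions).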
They applied statistical analysis and four classification algorithms (Bernoulli Naive Bayes, Decision Tree, KNN, and Random Forest) to predict three key target variables:
- Whether a research article will be cited in a YouTube video.
- Whether an article will reach a high level of scholarly citations.
- Whether a video citing a research article will become popular (based on view count).
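Each of these targets is a yes/no question, so the raw counts have to be turned into binary labels before a classifier can be trained. The sketch below shows one way to do that. Using the median as the cut-off for "high" citations and "popular" videos is my assumption, not necessarily the threshold used in the paper, and the numbers are made up.

```python
from statistics import median


def label_by_threshold(values, threshold=None):
    """Return 1 for values strictly above the threshold, else 0.

    If no threshold is given, the median of the values is used
    (an illustrative choice, not the paper's documented cut-off).
    """
    if threshold is None:
        threshold = median(values)
    return [1 if value > threshold else 0 for value in values]


# Made-up example data.
video_views = [120, 5000, 40, 980, 15000, 300]      # one entry per video
scholarly_citations = [0, 12, 3, 45, 7, 1]          # one entry per article
video_citation_counts = [0, 2, 0, 1, 0, 0]          # one entry per article

popular_video = label_by_threshold(video_views)
highly_cited = label_by_threshold(scholarly_citations)
# "Cited in a video" only needs a fixed threshold of zero.
cited_in_video = label_by_threshold(video_citation_counts, threshold=0)
```

Labels like these would then be paired with the altmetric features (Twitter mentions, Mendeley readers, subscriber count, and so on) as input to the four classifiers.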
Key Findings:
- Most research is not cited in videos: only about 16% of the sampled articles were cited in at least one YouTube video.
- Most cited disciplines: Medicine and Biochemistry dominate citations. Biological Sciences rank second. Engineering and Physical Sciences are 5th and 6th, while Computer Science is 9th.
- Most active video categories: Education, Science & Technology, People & Blogs.
- Interestingly, although Entertainment videos were fewer in number, they generated a high volume of views, suggesting strong potential for public reach.
The machine learning models revealed that:
- Twitter mentions and news coverage are the strongest predictors of whether an article will be cited in a YouTube video.
- Mendeley readership and video views are key predictors of scholarly impact (citations).
- Comment count was not highly correlated with video views.
- Subscriber count is the most important factor in predicting video popularity.
- Across all models, Random Forest delivered the best performance, with F1 scores ranging from 0.80 to 0.94.
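For readers who have not met the F1 score before: it is the harmonic mean of precision and recall, so a model only scores well if it does well on both. Below is a small sketch of the binary (positive-class) version with made-up labels. I am not sure which averaging variant the paper reports, so treat this as an illustration of the metric rather than a reproduction of their numbers.

```python
def f1_score_binary(y_true, y_pred):
    """F1 for the positive class (label 1): harmonic mean of precision and recall."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    false_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    false_neg = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    if true_pos == 0:
        return 0.0  # no correct positive predictions, so F1 is defined as 0 here

    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Made-up example: 4 truly "popular" videos, 4 not popular.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

f1 = f1_score_binary(y_true, y_pred)
```

In this toy case the model finds 3 of the 4 positives and raises 1 false alarm, so precision and recall are both 0.75 and so is the F1 score.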
Key Takeaways:
This study shows that:
- YouTube plays an increasingly important role in the dissemination of scientific research.
- Mentions of scholarly articles in videos are not just a marker of social visibility; they are also associated with greater academic impact over time.
- Biomedical fields dominate the video-citation landscape.
- Social media signals can serve as early indicators of impact.
- Visibility factors (subscribers, likes, mentions) can be as relevant as traditional bibliometrics.
Limitations & What’s left to explore:
The authors also noted some limitations to the analysis:
- Temporal dynamics were not incorporated.
- The analysis is based on metadata only, not on the content of the videos.
- It relies on a single altmetric source.
Conclusions:
The findings demonstrate that YouTube is not just a space for entertainment; it’s an emerging channel for scientific communication. One of the most important points is that mentions in videos correlate with increased visibility and may boost academic impact. Social media signals don’t just follow science; they may also help drive it.
For researchers and science communicators, this highlights the importance of understanding how research circulates through audiovisual media and how early altmetric signals can complement traditional impact measures.
See you in the next paper =)