[SWJ] Review received, #3791-5005

www.semantic-web-journal.net

<contact@semantic-web-journal.net>

Sep 22, 2025, 9:22:08 AM

to text2kg-swj@googlegroups.com, krzysztof.janowicz@univie.ac.at, eva.blomqvist@liu.se, cog-academics@coganshimizu.com, saideepthi.dondolu@newgen.co, mccain.32@wright.edu

A review has been provided for a paper which was assigned to you.

Authors: Amna Binte Kamran, Nigar Azhar Butt, Amna Basharat
Title: Semantic Enrichment of Hadith Corpus - Knowledge Graph Generation from
Islamic Text
Submission Type: 'Full Paper'
URL:
https://www.semantic-web-journal.net/content/semantic-enrichment-hadith-corpus-knowledge-graph-generation-islamic-text-0
Tracking number: 3791-5005
Assigned editor: Guest Editors KG Gen from Text 2023
(text2...@googlegroups.com)
Review submitted by: Raza
Suggested decision: Minor Revision

Review Comments:
Knowing that this work is an extension of the first study, I began by
reviewing the original version to understand its methodology, ontology
design, and competency questions.

The competency questions (CQs) presented in the second paper are essentially the
same as those in the first paper, with only minor surface-level variations in
wording. For example, the first paper includes questions such as “What is
the lineage of a particular narrator?” or “Find hadith narrated by
Narrator A,” whereas the second paper rephrases these into “Which
narrators have not narrated any sacred hadith?” or “Find hadith
‘discussesTopic’ Topic X.” Despite slight differences in terminology
(e.g., shifting focus from narrators to topics or events), the underlying
structure, intent, and scope of the CQs remain unchanged.

This repetition indicates that the second paper does not demonstrate new
competency questions that reflect the distinct methodological contribution it
claims (i.e., text-driven knowledge graph generation with NLP and similarity
computation). Instead, the CQs are largely recycled from the first work,
which suggests that the extension is not fully articulated at the level of
use cases and evaluation.

Page 15, line 24: As mentioned in Section 5, the authors state that the scope
of the ontology is defined through a set of competency questions (CQs).
However, they do not acknowledge that these CQs are essentially the same as
those used in their earlier work. This omission is problematic because it
suggests a lack of transparency in how the evaluation requirements were
established. If the paper is positioned as an extension, the authors should
have either (i) clarified that the CQs are intentionally reused and explained
why they remain appropriate, or (ii) introduced new or refined CQs aligned
with the novel contributions of the work. By failing to do so, the paper
leaves uncertainty about the standards used to validate the extended ontology
and weakens the claim of substantive advancement over the previous study.

As previously mentioned by a reviewer: “1. Generalization Issues: The
section mentions that existing studies in hadith primarily focus on specific
domains, like prophetic medicine and the chain of narrators, too broadly and
do not acknowledge other studies in the domain.”

Although the authors have added detailed explanations about leveraging NLP
techniques, preprocessing steps (pages 2, lines 18–49), and the integration
of external knowledge graphs to enhance the SemanticHadith ontology, they do
not address the previous concern regarding generalization across the hadith
domain. The discussion focuses on technical methodology, linguistic
processing, and practical applications, but it does not acknowledge other
studies beyond the specific domains of prophetic medicine or chains of
narrators, nor does it provide evidence that the framework generalizes to the
full breadth of hadith literature. Without such references or broader
validation, the previous reviewer comment regarding generalization remains
unaddressed, even though there is an enhanced discussion of NLP techniques in
sections 2.3 and 2.4.

Finally, the SemanticHadith knowledge graph is claimed to be freely
accessible at http://www.semantichadith.com [1]. However, despite numerous
attempts, I was unable to access the knowledge graph. Therefore, I am unable
to judge its applicability, SPARQL functionality, or other practical
capabilities, which prevents a proper evaluation of the extended ontology in
practice.

To strengthen the paper and address these shortcomings, the authors may
consider referencing additional relevant work, including:

=Quran Knowledge Graphs=

Elsayed, E., & Fathy, D. R. (2019). Evaluation of Quran recitation via OWL
ontology-based system. Proceedings of the International Conference on
Computer and Communication Engineering (ICCCE 2019).

Iqbal, R., Azmi Murad, M. A., & Ashraf, A. (2020). Quantitative assessment of
concept maps for conceptualizing domain ontologies: A case of Quran.
Proceedings of the International Conference on Computer and Communication
Engineering (ICCCE 2020).

Jiang, S., & Mosa, M. A. (2022). Reliable semantic communication system
enabled by knowledge graph. Entropy, 24(12), 1704.
https://doi.org/10.3390/e24121704 [2]

The authors of SemanticHadith 2.0 do not sufficiently discuss prior closely
related works, particularly Mosa (2025) and Shafie (2021-2023). Both of these
studies already apply hybrid AI and knowledge graph approaches to address
critical tasks in Hadith analysis—narrator disambiguation in Mosa and
retrieval plus semantic-similarity classification in KASHAF. By failing to
acknowledge these contributions, the manuscript overlooks important context
for positioning its methodology and novelty. A proper comparative discussion
would clarify how SemanticHadith 2.0 extends, complements, or differentiates
itself from these approaches, especially in terms of multi-collection
coverage, expert-labeled similarity pairs, and explainable reasoning. Without
this, readers may misinterpret the claimed contributions as more original or
distinct than they are.

Mosa, M. A. (2025). Synergizing structure and semantics: a knowledge
graph-transformer framework for narrator disambiguation in hadith networks.
Digital Scholarship in the Humanities, fqaf088.

Shafie, Omar Abdulfattah. "KASHAF: A Knowledge-Graphs Approach Search-Engine
for Hadith Analysis & Flow-Visualization." Master's thesis, Hamad Bin Khalifa
University (Qatar), 2021.

Shafie, O., Darwish, K., & Jansen, B. J. (2023, July). Robust Hadith IR using
Knowledge-Graphs and Semantic-Similarity Classification. In CS & IT
Conference Proceedings (Vol. 13, No. 12). CS & IT Conference Proceedings.

===Method and results===

1. Section 6.3 Expert Validation and Insights:

The authors state that 100 hadith pairs were “randomly selected from the top
similarity bins” for expert validation, but they do not clarify the method
used for random selection. Without specifying the procedure—whether it was
simple random sampling, stratified sampling across collections, or another
approach—it is unclear how representative these pairs are of the broader
corpus. This is particularly important given the hierarchical and uneven
distribution of hadith across collections, and the potential for systematic
biases in similarity scores. In religious domains, unlike industrial or
scientific datasets, interpretations can vary, and without a transparent
selection process, it is difficult to assess whether the expert validation
accurately reflects the reliability of similarity computations across the
full corpus.
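
To illustrate the distinction drawn above between simple random sampling and
stratified sampling across collections, the following is a minimal sketch of
proportional stratified sampling. It is not the authors' procedure, which the
manuscript does not specify; the function name, the `collection` field, and
the example counts are hypothetical.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(pairs, key, total, seed=0):
    """Draw roughly `total` items, giving each stratum a share
    proportional to its size (at least one item per stratum)."""
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced

    # Group the candidate pairs by stratum, e.g. by hadith collection.
    strata = defaultdict(list)
    for pair in pairs:
        strata[key(pair)].append(pair)

    sample = []
    for name in sorted(strata):  # sorted for a deterministic order
        members = strata[name]
        share = round(total * len(members) / len(pairs))
        k = min(len(members), max(1, share))
        sample.extend(rng.sample(members, k))  # sampling without replacement
    return sample


if __name__ == "__main__":
    # Hypothetical candidate pool: counts per collection are made up.
    pool = (
        [{"collection": "A", "id": i} for i in range(600)]
        + [{"collection": "B", "id": i} for i in range(300)]
        + [{"collection": "C", "id": i} for i in range(100)]
    )
    chosen = stratified_sample(pool, lambda p: p["collection"], total=100)
    print(Counter(p["collection"] for p in chosen))
```

Because each share is rounded, the sample size can differ slightly from
`total` when the strata do not divide evenly. Reporting the seed and the
stratum variable alongside the results would let readers reproduce the
selection.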

2. Section 6.4 Integration into knowledge graph

The section appears intended to justify why only expert-validated hadith
pairs were included in the knowledge graph, while also highlighting a
refinement (the “strongly similar” property) for the top similarity bin.
However, it is confusing because it suggests a broader application to other
collections (Sahih Muslim, Ibn Maja, Sunan Abi Dawood, and Nisai) without
explaining whether this has actually been implemented or is purely
prospective. The paragraph blurs the line between what has been done in the
current study and what is planned for future work, making it unclear to the
reader whether the knowledge graph currently contains only Sahih Bukhari
pairs or has been extended to other collections. It is not immediately clear
what assumptions or decisions led the authors to limit the graph to
expert-labeled pairs, and the mention of future expansion could be
misinterpreted as an existing contribution.

Mirarab, A. (2024). Explainable large language model for Islamic and
humanities studies. STIM Journal of Islamic Studies and Technology.
https://stim.qom.ac.ir/article_3085.html [3]

3. Section 6.5 Challenges and Insights

The authors discuss the potential use of LLMs in the future work section as a
way to improve similarity computations and capture semantic and contextual
nuances. However, the manuscript does not emphasize in the literature review
that such approaches already exist and have been applied in related domains,
such as Quranic verse similarity, religious question answering, and semantic
knowledge graphs. By not situating LLM-based methods within the existing body
of work, the paper misses an opportunity to acknowledge prior approaches,
contrast them with their current methodology, and justify the novelty or
limitations of their proposed framework. This omission weakens the
contextualization of the proposed future directions.

================================================================================================

1. Originality: The work is original within the context of Islamic hadith
collections, but not with respect to general knowledge graph development. The
extension largely recycles competency questions from the previous work.

2. Significance of the results: Without access to the knowledge graph and
ontology, the practical impact and applicability of the results remain
uncertain.

3. Quality of writing: The manuscript is well-written. However, the
organization of tables and figures could be improved for clarity—figures
should appear close to the discussion about them, as currently, some figures
(e.g., Figure 4) are referenced on one page but appear much later (page 18),
disrupting the flow for the reader.

4. Data availability

a. Data organization and README: The repository on GitHub is reasonably
organized and contains a README file, which provides basic guidance for
understanding the provided files. This makes it easier to navigate the
resources.

b. Completeness for replication: While the repository contains ontology and
knowledge graph files, it lacks a functional SPARQL endpoint or fully
accessible knowledge graph. As a result, replication of experiments or direct
verification of similarity computation and interlinking results is not
possible.

c. Repository suitability: The repository is hosted on GitHub, which is a
recognized platform for long-term accessibility and basic discoverability.
However, the main resources (knowledge graph and SPARQL endpoint) are not
operational, limiting the practical utility of the repository.

d. Completeness of data artifacts: The provided data artifacts are incomplete
for full replication or evaluation. Without a working knowledge graph or
query endpoint, critical aspects of the study, such as interlinking and
similarity computations, cannot be independently verified.

Comments for editor:
While the paper presents an extended version of the SemanticHadith ontology
and knowledge graph, the work exhibits several critical shortcomings. The
competency questions are largely recycled from the original study, offering
little evidence of new or refined evaluation criteria aligned with the
claimed methodological contributions. Generalization across the broader
hadith domain remains unaddressed, despite detailed descriptions of NLP
techniques and integration with external knowledge graphs. Moreover, the
inaccessibility of the knowledge graph prevents verification of its practical
functionality, SPARQL support, and real-world applicability. Without clearer
articulation of novel contributions, validation across diverse hadith texts,
and accessible resources for reproducibility, the paper’s claim of
substantive advancement over previous work remains unconvincing. Inclusion of
additional references from related Quranic and Islamic knowledge graph
research could help situate the work in the broader landscape and strengthen
its contextual foundation.

You can access the paper by logging in as an editorial board member on
http://www.semantic-web-journal.net/

# SWJ in a Nutshell #

The Semantic Web journal is an open and transparent journal. The full
manuscripts, metadata, names of the reviewers (if they do not opt out), their
reviews, names of the assigned editors, and manuscript decisions are public
and will be made accessible within a Linked Data-based knowledge graph as
well as secondary data products such as document embeddings via machine
learning techniques. Rejected manuscripts can be depublished on request from
the journal's webpage. Nonetheless, they may have already been indexed and
copied by third parties such as search engines outside of our control.
Volunteered community reviews are welcome. Different paper categories have
explicitly stated review criteria that have to be addressed by authors and
reviewers. According to our 2-strike rule, a paper has to receive at least a
minor revision decision after the second round of reviews, otherwise it will
be rejected.

Author information: http://www.semantic-web-journal.net/authors
Reviewer information: http://www.semantic-web-journal.net/reviewers

[1] http://www.semantichadith.com
[2] https://doi.org/10.3390/e24121704
[3] https://stim.qom.ac.ir/article_3085.html

www.semantic-web-journal.net

<contact@semantic-web-journal.net>

Sep 28, 2025, 6:32:53 PM

to text2kg-swj@googlegroups.com, krzysztof.janowicz@univie.ac.at, eva.blomqvist@liu.se, cog-academics@coganshimizu.com, saideepthi.dondolu@newgen.co, mccain.32@wright.edu

Review submitted by: Abid Ali Fareedi
Suggested decision: Accept

Review Comments:
Review Report

The revised version of the manuscript demonstrates significant improvements
and shows that the authors have taken great care in addressing the concerns
raised in the earlier submission. Overall, the manuscript is well-structured,
coherent, and clearly contributes to the field.

Summary

The revised manuscript presents an updated methodology for constructing a KG
from the Hadith Corpus, building upon the SemanticHadith ontological data
model (a domain-centric ontology). The authors have refined their data
processing, NLP-based entity extraction, and semantic modelling approaches to
improve interoperability and accessibility of Islamic knowledge resources.
The revised version has comprehensively addressed the shortcomings in the
first version, which strengthens the manuscript considerably, and it now
offers methodological rigour and practical relevance.

Strengths of the Revised Version

• Responsiveness to Feedback: The authors have diligently addressed the
comments from the previous review round. They have improved the manuscript by
mitigating the issues raised by reviewers concerning generalization, detail
on NLP techniques, clarity, methodology transparency, assumptions and
limitations, comparative analysis, and reference consistency.
• Clarity and Structure: The manuscript now flows more smoothly, with more
precise articulation of the problem statement, background, methodology, and
outcomes.
• Methodological Detail: The revisions add greater depth to the NLP
techniques, data curation processes, and expert validation steps. This makes
the work more transparent and reproducible.
• Ontology Design and Results: The extended SemanticHadith ontology is now
more comprehensively described, with improved explanations of modelling
decisions and interoperability considerations.
• Overall Contribution: The work demonstrates both scholarly and practical
significance. It meaningfully contributes to the growing research on semantic
technologies in underrepresented domains such as Islamic studies.

Issues Previously Raised – Now Addressed

• Lack of detail on NLP methodologies: the authors addressed this with more
precise explanations and examples (see Section 2.4, Section 4.1.1, 4.1.2 with
concrete examples).
• Terminology inconsistencies and formatting: the authors corrected these
throughout the manuscript.
• Limited discussion of assumptions and limitations: the authors improved
this discussion, offering a more balanced perspective with concrete examples
throughout the manuscript.
• Missing clarity on expert validation processes: now elaborated with
examples (see Section 3.6).
• Comparative analysis: The authors describe the training setup for a
customized NER model using the spaCy NLP library (see Section 4.2.1) and
justify the performance metrics, including precision, recall and F1-score
(see Section 4.2.2). They also improved the observations and results section
with a micro-averaging strategy (see Table 2), illustrated the entity
extraction process with a concrete example, and addressed entity variations.
• Integration challenges: the authors discuss the challenges of integrating
SemanticHadith with external ontologies, including reconciling structural and
semantic differences.
• The authors also highlight and discuss future approaches that isolate the
matan by preprocessing the text to exclude the sanad before embedding, and
that compare embeddings using Euclidean or Manhattan distance. They also
mention the use of LLMs fine-tuned for Islamic texts to better capture
semantic and contextual nuances.
• Reference inconsistencies: updated and properly cited.
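
For readers less familiar with two of the computations named in the list
above, the following is a minimal sketch of micro-averaged precision, recall
and F1, and of Euclidean and Manhattan distance between two embedding vectors.
It is an illustration only and is not taken from the manuscript; the function
names, entity labels, and counts are hypothetical.

```python
import math


def micro_prf(counts):
    """Micro-average: pool true positives, false positives and false
    negatives over all entity types, then compute the metrics once."""
    tp = sum(c["tp"] for c in counts.values())
    fp = sum(c["fp"] for c in counts.values())
    fn = sum(c["fn"] for c in counts.values())
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


def euclidean(u, v):
    """Straight-line (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def manhattan(u, v):
    """Sum of absolute coordinate differences (L1 distance)."""
    return sum(abs(a - b) for a, b in zip(u, v))


if __name__ == "__main__":
    # Hypothetical per-entity-type counts from an NER evaluation.
    per_type = {
        "PERSON": {"tp": 8, "fp": 2, "fn": 2},
        "PLACE": {"tp": 2, "fp": 0, "fn": 2},
    }
    print(micro_prf(per_type))
    print(euclidean([0.0, 0.0], [3.0, 4.0]), manhattan([0.0, 0.0], [3.0, 4.0]))
```

Micro-averaging weights every entity mention equally, so frequent entity
types dominate the score; this is worth keeping in mind when reading a table
of micro-averaged results.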

Comments for editor:
Recommendation

The revised manuscript is substantially improved and demonstrates that the
authors have carefully integrated prior feedback. It is now clear,
methodologically sound, and makes a valuable contribution to the field.

Recommendation: Accept in its current form.
