Published on: April 2026
VISUAL RAG SYSTEM FOR WEB PAGE ANALYSIS
Md. Khais E. Chamana Sree M. Eshwar M. Sai Teja
P. Niharika
Hyderabad Telangana India
Article Status
Available Documents
Abstract
The system allows users to input a webpage URL, from which content is extracted, processed, and divided into smaller chunks. These chunks are converted into vector embeddings using a local embedding model, enabling efficient similarity-based retrieval. When a user submits a query, the system retrieves the most relevant content and feeds it into a powerful language model via the Groq API to generate precise and contextually accurate responses.
Additionally, the project includes features such as quick summary generation, chat history, and multimodal support for images and text, making it more interactive and user-friendly. The system is implemented using Streamlit for the interface, Sentence Transformers for embeddings, and LLMs for response generation.
This approach improves reliability, reduces hallucination, and ensures that responses are grounded in real-time data, making it highly useful for applications like research assistance, study tools, and intelligent document analysis.
How to Cite this Paper
Khais, M., Sree, E. C., Eshwar, M. & Teja, M. S. (2026). Visual Rag System for Web Page Analysis. International Journal of Creative and Open Research in Engineering and Management, <i>02</i>(04). https://doi.org/10.55041/ijcope.v2i4.249
Khais, Md., et al.. "Visual Rag System for Web Page Analysis." International Journal of Creative and Open Research in Engineering and Management, vol. 02, no. 04, 2026, pp. . doi:https://doi.org/10.55041/ijcope.v2i4.249.
Khais, Md.,E. Sree,M. Eshwar, and M. Teja. "Visual Rag System for Web Page Analysis." International Journal of Creative and Open Research in Engineering and Management 02, no. 04 (2026). https://doi.org/https://doi.org/10.55041/ijcope.v2i4.249.
References
- P. Lewis, et al., Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks, 2020.
DOI: https://doi.org/10.48550/arXiv.2005.11401 - O. Khattab and M. Zaharia, ColBERT: Efficient Passage Search via Contextualized Late Interaction, 2020.
DOI: https://doi.org/10.1145/3397271.3401075 - Y. Xu, et al., LayoutLM: Pre-training of Text and Layout for Document Image Understanding, 2020.
DOI: https://doi.org/10.1145/3394486.3403172 - N. Reimers and I. Gurevych, Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks, 2019.
DOI: https://doi.org/10.48550/arXiv.1908.10084 - A. Radford, et al., Learning Transferable Visual Models From Natural Language Supervision, 2021.
DOI: https://doi.org/10.48550/arXiv.2103.00020
Ethical Compliance & Review Process
- •All submissions are screened under plagiarism detection.
- •Review follows editorial policy.
- •Authors retain copyright.
- •Peer Review Type: Double-Blind Peer Review
- •Published on: Apr 11 2026
This article is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. You are free to share and adapt this work for non-commercial purposes with proper attribution.

