Free Local RAG Scraper for Custom GPTs and Assistants

Dec 10, 2025 - 00:39

TL;DR: Scrape Squarespace/Shopify websites for use with AI.

This web scraper runs entirely in your browser and is well suited to preparing website content as knowledge for AI models. It works by reading the website's sitemap.xml file, which makes it a particularly good fit for platforms like Squarespace and Shopify that generate sitemaps automatically.
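To illustrate the sitemap step, here is a minimal Python sketch of pulling page URLs out of a sitemap.xml document. It is not the tool's own code (the tool runs in the browser); the function name `extract_page_urls` and the sample sitemap are assumptions made for the example, and it handles only a plain `<urlset>`, not a sitemap index.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_page_urls(sitemap_xml: str) -> list[str]:
    """Return every <loc> URL listed in a <urlset> sitemap (hypothetical helper)."""
    root = ET.fromstring(sitemap_xml)
    urls = []
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        if loc.text and loc.text.strip():
            urls.append(loc.text.strip())
    return urls


# Made-up sample; a real sitemap would be downloaded from the site itself.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""

print(extract_page_urls(SAMPLE_SITEMAP))
```

Each URL returned this way is one page for the scraper to visit.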

The scraper preserves the structure of your content, including headings, paragraphs, lists, and tables, while removing unnecessary elements like navigation menus and footers. It also captures metadata, images, and PDF documents.
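As a rough illustration of that idea, the sketch below converts headings, paragraphs, and list items to markdown while skipping anything inside navigation or footer elements. It is an assumption about how such filtering could work, not the tool's actual implementation; the class and function names are hypothetical, and tables, images, and metadata are left out to keep it short.

```python
from html.parser import HTMLParser

# Containers treated as boilerplate in this sketch (an assumption).
SKIPPED_TAGS = {"nav", "footer", "script", "style"}
HEADING_PREFIX = {"h1": "# ", "h2": "## ", "h3": "### "}
BLOCK_TAGS = set(HEADING_PREFIX) | {"p", "li"}


class MarkdownExtractor(HTMLParser):
    """Collects headings, paragraphs, and list items as markdown lines."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0       # > 0 while inside a skipped container
        self.current_tag = None   # block tag whose text is being collected
        self.buffer = []
        self.lines = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self.skip_depth += 1
        elif self.skip_depth == 0 and tag in BLOCK_TAGS:
            self.current_tag = tag
            self.buffer = []

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0 and tag == self.current_tag:
            # Collapse runs of whitespace left over from the HTML source.
            text = " ".join("".join(self.buffer).split())
            if text:
                prefix = HEADING_PREFIX.get(tag, "- " if tag == "li" else "")
                self.lines.append(prefix + text)
            self.current_tag = None

    def handle_data(self, data):
        if self.skip_depth == 0 and self.current_tag is not None:
            self.buffer.append(data)


def html_to_markdown(html: str) -> str:
    parser = MarkdownExtractor()
    parser.feed(html)
    parser.close()
    return "\n\n".join(parser.lines)


sample = (
    "<nav><p>Menu</p></nav><h1>Title</h1><p>Hello <b>world</b>.</p>"
    "<ul><li>One</li><li>Two</li></ul><footer><p>Copyright</p></footer>"
)
print(html_to_markdown(sample))
```

In the sample, the "Menu" and "Copyright" text is dropped because it sits inside `nav` and `footer`, while the heading, paragraph, and list items survive with their structure.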

  • This scraper uses a CORS proxy to access websites. No additional setup is required.

    The scraper will:

    • Read the website's sitemap.xml to find all pages. Everything runs in your browser, and none of your information is stored on a server.

    • Process each page while preserving content structure.

    • Generate a markdown file with all content. You can add this file as knowledge to a custom GPT or attach it to an AI assistant.

    • Allow you to preview each page's content before saving.
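The proxy and file-generation steps above can be sketched as follows. This is a hedged example only: the article does not name the proxy it uses, so `PROXY_PREFIX` is a placeholder, and the layout of the combined file (a source comment per page, separated by horizontal rules) is an assumption rather than the tool's real output format.

```python
from urllib.parse import quote

# Hypothetical proxy endpoint; the real proxy used by the tool is not stated.
PROXY_PREFIX = "https://cors-proxy.example/?url="


def proxied_url(page_url: str) -> str:
    """Wrap a page URL so the request goes through the assumed CORS proxy."""
    return PROXY_PREFIX + quote(page_url, safe="")


def build_knowledge_file(pages: list[tuple[str, str]]) -> str:
    """Join (url, markdown) pairs into a single markdown document."""
    sections = []
    for url, markdown in pages:
        # Record where each section came from so answers can be traced back.
        sections.append(f"<!-- Source: {url} -->\n\n{markdown.strip()}")
    return "\n\n---\n\n".join(sections) + "\n"


pages = [
    ("https://example.com/", "# Home"),
    ("https://example.com/about", "# About\n"),
]
print(proxied_url("https://example.com/about"))
print(build_knowledge_file(pages))
```

Keeping the source URL next to each section is a design choice for the example: it lets an assistant cite the page a passage came from.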


Cursor AI did most of the heavy lifting in developing this app.

If you have improvement suggestions, please share them in the comments.

Suraj Manikpuri Hi, I’m Suraj Manikpuri, an Engineer with over 15 years of industrial experience and a deep passion for technology and artificial intelligence. My professional journey has allowed me to work across diverse industries, where I’ve gained hands-on expertise in problem-solving, system optimization, and applying innovative tech solutions to real-world challenges. For the past 15 years, I’ve dedicated myself to learning and experimenting with technology — not just from books or tutorials, but through real practical exposure. My curiosity about how emerging tools work led me to explore and personally test numerous AI tools and platforms. By experimenting first-hand, I’ve been able to understand how artificial intelligence is transforming industries, creativity, and the way we live and work. Through FutureTrendHub.com, I share insights drawn from my personal experience, technical knowledge, and continuous learning in the fields of AI, automation, and modern technology trends. My goal is to make complex topics simple, engaging, and useful for readers who want to stay informed and future-ready. I believe in learning by doing, and my approach to content creation reflects that philosophy. Each article I write is backed by real-world experience, research, and an engineer’s perspective — to ensure it’s accurate, practical, and valuable for both tech enthusiasts and professionals. Technology is evolving faster than ever, and I’m here to help others understand and harness its power. Let’s explore the future together.