Monday, April 28, 2025

How to Find and Extract All URLs from a Website Using Olostep Maps API and Streamlit

Introduction

When building web crawlers, competitive analysis, SEO audits, or AI agents, one of the first critical tasks is finding all the URLs on a website.

While traditional methods like Google search tricks, sitemap exploration, and SEO tools work, there’s a faster, modern way: using Olostep Maps API.

In this guide, we’ll:

  • Introduce the challenge of URL discovery
  • Show how to build a live Streamlit app to scrape all URLs
  • Compare it with traditional techniques (like sitemap.xml and robots.txt)
  • Provide complete runnable Python code

Target audience: developers, growth engineers, data scientists, SEO specialists, and founders who need structured, scalable scraping.

Why Extract All URLs?

Finding every page on a website can help you:

  • Analyze site structure (for SEO)
  • Scrape website content efficiently
  • Find hidden gems like orphan pages
  • Monitor website changes
  • Prepare data for AI agents and automation

Traditional Methods (Before Olostep)

1. Sitemaps (XML Files)

Webmasters often create XML sitemaps to help Google index their sites. Here’s an example:

<urlset>
  <url>
    <loc>https://example.com</loc>
  </url>
  <url>
    <loc>https://example.com/about</loc>
  </url>
</urlset>
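
Once you have a sitemap URL, extracting its page URLs takes only a short script. Here's a minimal sketch using requests and Python's standard-library XML parser; it assumes the site actually publishes a sitemap at the address you pass in:

import requests
import xml.etree.ElementTree as ET

def sitemap_urls(sitemap_url):
    # Fetch the sitemap and pull the text of every <loc> element.
    resp = requests.get(sitemap_url, timeout=10)
    resp.raise_for_status()
    root = ET.fromstring(resp.content)
    # Match <loc> whether or not the sitemap declares the usual XML namespace.
    return [el.text.strip() for el in root.iter() if el.tag.endswith("loc") and el.text]

print(sitemap_urls("https://example.com/sitemap.xml"))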

To find a site's sitemap, start with the default location: https://example.com/sitemap.xml.

Other possible sitemap locations (a quick probe script follows this list):

  • /sitemap.xml.gz
  • /sitemap_index.xml
  • /sitemap.php
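
If you'd rather not check these paths by hand, a short script can probe them. This is a minimal sketch; it uses HEAD requests, which some servers reject, so treat a miss as inconclusive rather than definitive:

import requests

# Common sitemap paths, per the list above.
CANDIDATES = ["/sitemap.xml", "/sitemap.xml.gz", "/sitemap_index.xml", "/sitemap.php"]

def find_sitemaps(base_url):
    # HEAD each candidate path and keep the ones that answer 200 OK.
    found = []
    for path in CANDIDATES:
        try:
            resp = requests.head(base_url.rstrip("/") + path, timeout=5, allow_redirects=True)
            if resp.status_code == 200:
                found.append(resp.url)
        except requests.RequestException:
            pass  # unreachable host, TLS error, etc.
    return found

print(find_sitemaps("https://example.com"))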

You can also Google:

site:example.com filetype:xml

Problems:

  • Some websites don’t maintain updated sitemaps.
  • Not all pages may be listed.
  • JavaScript-heavy, dynamic sites often omit many pages from their sitemaps.

2. Robots.txt

Example:

User-agent: *
Sitemap: https://example.com/sitemap.xml
Disallow: /admin

robots.txt is useful for finding disallowed paths and sitemap links, but again it's not comprehensive.
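
Extracting those sitemap links programmatically takes only a few lines. A minimal sketch, assuming the site serves a robots.txt at the standard path:

import requests

def sitemaps_from_robots(base_url):
    # robots.txt lists sitemaps on lines of the form "Sitemap: <url>".
    resp = requests.get(base_url.rstrip("/") + "/robots.txt", timeout=10)
    resp.raise_for_status()
    return [line.split(":", 1)[1].strip()
            for line in resp.text.splitlines()
            if line.lower().startswith("sitemap:")]

print(sitemaps_from_robots("https://example.com"))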

The Modern Solution: Olostep Maps API

✅ Find up to 100,000 URLs in seconds.

✅ No need to manually find sitemap or robots.txt.

✅ Simple API call.

✅ No server maintenance or IP bans.
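
Before wiring it into an app, here is the API call on its own. The endpoint, payload, and response shape below mirror the Streamlit app later in this post (a JSON object carrying a urls list); check Olostep's docs for the authoritative reference:

import requests

API_KEY = "YOUR_OLOSTEP_API_KEY"  # replace with your own key

resp = requests.post(
    "https://api.olostep.com/v1/maps",
    headers={"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"},
    json={"url": "https://docs.olostep.com"},
)
resp.raise_for_status()
print(resp.json().get("urls", []))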

👉 Full code Gist

Let’s build a full Streamlit app to demo this!

🛠️ Full Project: Website URL Extractor with Olostep Maps API + Streamlit

1. Install Requirements

pip install streamlit requests

2. Python Code

import streamlit as st
import requests

def fetch_urls(target_url, api_key):
    # Call the Olostep Maps endpoint and return the parsed JSON response.
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }
    payload = {"url": target_url}
    response = requests.post("https://api.olostep.com/v1/maps", headers=headers, json=payload)
    if response.status_code == 200:
        return response.json()
    st.error(f"Failed to fetch URLs. Status code: {response.status_code}")
    return None

st.title("🔎 Website URL Scraper")

st.markdown("Use Olostep Maps API to instantly extract all discovered URLs from any website. Great for SEO, scraping, site analysis, and more!")

api_key = st.text_input("Enter your Olostep API Key", type="password")
url_to_scrape = st.text_input("Enter Website URL (e.g., https://example.com)")

if st.button("Find URLs"):
    if api_key and url_to_scrape:
        with st.spinner("Fetching URLs..."):
            data = fetch_urls(url_to_scrape, api_key)
        if data:
            urls = data.get("urls", [])
            st.success(f"✅ Found {len(urls)} URLs!")
            # Render each discovered URL as a numbered, clickable link.
            for idx, u in enumerate(urls, start=1):
                st.markdown(f"{idx}. [{u}]({u})")

            # Offer the full list as a plain-text download.
            st.download_button(
                "📄 Download URLs as Text File",
                data="\n".join(urls),
                file_name="discovered_urls.txt",
                mime="text/plain"
            )
    else:
        st.warning("Please enter both an API key and a website URL.")
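Save the script as app.py (any filename works) and launch it with Streamlit:

streamlit run app.py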

📸 Example Output

✅ Found 35 URLs from https://docs.olostep.com

📥 Saved as discovered_urls.txt

⚡ Why Olostep Maps API Beats Traditional Methods

| Feature                 | Sitemap/Robots.txt | SEO Spider   | Olostep Maps |
|-------------------------|--------------------|--------------|--------------|
| Instant Response        | ❌                 | ❌           | ✅           |
| Handles JS-heavy Sites  | ❌                 | ⚠️ (Partial) | ✅           |
| Handles Big Sites       | ❌                 | ❌ (Limit)   | ✅           |
| No Setup Needed         | ✅                 | ❌           | ✅           |
| Easy Pagination         | ❌                 | ❌           | ✅           |

📈 Conclusion

Using Olostep Maps API + a few lines of Streamlit code, you can build powerful website discovery tools in minutes.

No more worrying about sitemaps, robots.txt, or getting blocked by firewalls.

✅ Super fast

✅ Reliable

✅ Perfect for Growth Engineering, SEO, Scraping, and Automation.

🚀 Ready to try?

Register at 👉 Olostep.com and start building your own data pipelines today!


Written by:

Mohammad Ehsan Ansari

Growth Engineer @ Olostep

Happy scraping! 🚀
