GitHub Repo Scanning: Why URL Scans Aren't Enough

Quick answer: URL scans only see your deployed frontend. Repository scanning analyzes the source: backend code, config files, dependencies, and every commit in git history, where deleted secrets still live. SimplyScan finds that 30% of high-severity issues in AI-built apps are missed by URL scans alone. Scan your repo to secure the full stack.

By Gabriel CA · Kraftwire Software

2026-02-11 · 10 min read

Why URL Scans Are Only the Tip of the Iceberg

A URL scan is a "black box" test that analyzes your deployed frontend: the HTML, JavaScript, CSS, and network requests visible to any visitor. It is essential for catching exposed API keys in client-side bundles, missing security headers, and cross-site scripting (XSS) weaknesses. However, it cannot see your backend source code, git history, or server-side configurations.

GitHub repo scanning is a "white box" analysis that inspects the source: backend logic, .env templates, database migrations, and every historical commit where deleted secrets still live. To secure a "vibe-coded" application built with tools like Lovable or Bolt.new, you must scan the repository to find what the browser never sees. SimplyScan's research into AI-generated applications shows that while URL scans catch immediate frontend leaks, they miss approximately 30% of high-severity vulnerabilities that exist only in the source code or configuration layers.

What a URL Scan Misses (and Why It Matters)

While a URL scan is a fast way to check if a website is secure, it has significant blind spots. If you are building with tools like Lovable, Bolt.new, or Cursor, your app likely relies on server-side logic that a URL scanner cannot reach.

1. Backend Source Code and Edge Functions

Modern apps use Edge functions (Supabase Edge Functions, Vercel Functions) to handle sensitive logic. A URL scan can see the *result* of an API call, but it cannot see the code *inside* the function. Repo scanning detects insecure patterns like eval() usage or unsanitized inputs that lead to code injection. For example, an AI might generate a function that concatenates strings into a database query, creating a SQL injection risk that is invisible from the frontend.
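To make the string-concatenation risk concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table, the function names, and the payload are hypothetical and chosen only for illustration; a real edge function would use its own database driver, but the pattern a repo scanner looks for is the same.

```python
import sqlite3


def find_user_unsafe(conn, email):
    # Vulnerable: user input is concatenated straight into the SQL text.
    query = "SELECT id, email FROM users WHERE email = '" + email + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, email):
    # Safer: the driver binds the value, so it is never parsed as SQL.
    return conn.execute(
        "SELECT id, email FROM users WHERE email = ?", (email,)
    ).fetchall()


def make_demo_db():
    # In-memory database with two hypothetical users.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
    conn.executemany(
        "INSERT INTO users (email) VALUES (?)",
        [("alice@example.com",), ("bob@example.com",)],
    )
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    payload = "' OR '1'='1"
    print(len(find_user_unsafe(conn, payload)))  # the injected condition matches every row
    print(len(find_user_safe(conn, payload)))    # the payload is treated as a literal email
```

From the outside, both functions answer the same API call. Only the source shows which one was deployed.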

2. Configuration and Infrastructure Files

Files like docker-compose.yml, config.toml, and CI/CD pipeline definitions (.github/workflows) often contain misconfigurations. For example, a Dockerfile might run a container as root, or a workflow file might leak a GitHub token. These files are never served to the browser, making them invisible to URL-based vulnerability scanners.
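As a rough illustration of a configuration check, the sketch below flags a Dockerfile whose final build stage never switches to a non-root user. The function name is hypothetical and the heuristic is deliberately simple: it ignores line continuations, and it cannot know whether the base image already sets a non-root user.

```python
def runs_as_root(dockerfile_text):
    """Return True if the final build stage has no USER instruction or sets USER to root."""
    last_user = None
    for raw_line in dockerfile_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        keyword = parts[0].upper()
        if keyword == "FROM":
            # A new build stage starts; any earlier USER no longer applies.
            last_user = None
        elif keyword == "USER" and len(parts) == 2:
            # USER may be written as user:group; keep only the user part.
            last_user = parts[1].split(":")[0].strip()
    return last_user is None or last_user in ("root", "0")
```

A scanner would run a check like this against every Dockerfile in the repository and report the ones that return True for manual review.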

3. Database Schemas and Migrations

AI tools often generate database migration files that include seed data. If that seed data contains real user emails or default passwords, it becomes a permanent part of your repository. A URL scan only sees the data the API chooses to expose; a repo scan sees the underlying structure and any hardcoded credentials used to initialize the database.

4. Hidden Secrets in Git History

This is the most critical gap. A URL scan only sees the *current* version of your app. A GitHub repo scan looks at every commit ever made. If you committed an OpenAI key on Monday and "deleted" it on Tuesday, the key is still in your .git folder, recoverable from Monday's commit. Attackers use automated bots to scrape these historical commits. According to the GitGuardian State of Secrets Sprawl 2024 report, over 12.8 million secrets were exposed on GitHub in 2023 alone.
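The sketch below shows one way a history scan can surface a "deleted" key. It assumes you pass it the text produced by git log -p, and it looks only at removed lines. The single pattern (the well-known AKIA prefix of AWS access key IDs) and the function name are illustrative; real scanners use far larger rule sets.

```python
import re

# Illustrative pattern only: AWS access key IDs start with AKIA
# followed by 16 uppercase letters or digits.
SECRET_PATTERN = re.compile(r"\bAKIA[0-9A-Z]{16}\b")


def secrets_in_removed_lines(patch_text):
    """Return (commit, secret) pairs for secrets found on lines that a commit deleted."""
    findings = []
    commit = None
    for line in patch_text.splitlines():
        if line.startswith("commit "):
            # Header line of the form "commit <sha>" in git log output.
            commit = line.split()[1]
        elif line.startswith("-") and not line.startswith("---"):
            # A removed line; "---" is the old-file header, not content.
            for match in SECRET_PATTERN.findall(line):
                findings.append((commit, match))
    return findings
```

To try it on a real repository, you could feed it the output of subprocess.run(["git", "log", "-p", "--all"], capture_output=True, text=True).stdout. One known limitation of this sketch: a removed line whose own content begins with two dashes is skipped.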

The Danger of "Vibe-Coded" Repositories

"Vibe coding" · using AI to generate entire applications from prompts · introduces unique risks. AI models are trained on vast amounts of public code, including code that contains bad security practices or outdated patterns.

Hardcoded Credentials in Generated Code

When an AI generates a docker-compose.yml or a server.js file, it often uses placeholders that look like real credentials. Sometimes, if the prompt is specific enough, the AI might even suggest using a "test" key that it found in its training data. Without a secret scanner, these hardcoded strings can easily slip into production.

Insecure Dependency Trees

Your package.json or requirements.txt lists your dependencies, but those dependencies have their own dependencies (transitive dependencies). A repo scan analyzes your lock files to find known CVEs. A URL scan cannot do this because the deployed app does not expose its full dependency manifest. AI tools frequently suggest older versions of libraries that may have unpatched vulnerabilities.
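A minimal sketch of the lock-file step follows, assuming an npm package-lock.json with lockfileVersion 2 or 3, where every installed package (direct or transitive) appears under the "packages" key. The advisory data is a hypothetical in-memory mapping; a real scanner would query a vulnerability database and compare version ranges rather than exact versions.

```python
import json


def installed_packages(lockfile_text):
    """Return sorted (name, version) pairs from an npm lock file (lockfileVersion 2 or 3)."""
    lock = json.loads(lockfile_text)
    found = set()
    for path, meta in lock.get("packages", {}).items():
        if not path:
            continue  # the empty key describes the project itself
        # Nested paths look like node_modules/a/node_modules/b; keep the last name.
        name = path.split("node_modules/")[-1]
        version = meta.get("version")
        if version:
            found.add((name, version))
    return sorted(found)


def flag_known_vulnerable(packages, advisories):
    """advisories maps a package name to a set of versions known to be affected (hypothetical data)."""
    return [(name, version) for name, version in packages
            if version in advisories.get(name, set())]
```

Because the lock file lists nested packages explicitly, this kind of check reaches transitive dependencies that never appear in package.json.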

AI-Generated Logic Flaws

AI often misses the nuance of broken access control. It might generate a "Delete User" endpoint but forget to check if the requester is an admin. A repo scan can flag these missing authorization checks by analyzing the server-side route handlers, whereas a URL scan would only find this if it actively attempted to exploit every endpoint.
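The "Delete User" example can be sketched as two handlers that differ only in the authorization check. All names here (User, Forbidden, and both functions) are hypothetical, and the in-memory dictionary stands in for a real database.

```python
from dataclasses import dataclass


@dataclass
class User:
    id: int
    is_admin: bool = False


class Forbidden(Exception):
    """Raised when the requester lacks the required role."""


def delete_user_unchecked(users, requester, target_id):
    # Flaw: the requester is never inspected, so anyone can delete any account.
    users.pop(target_id, None)


def delete_user_checked(users, requester, target_id):
    # Authorization happens before the destructive action.
    if not requester.is_admin:
        raise Forbidden("admin role required")
    users.pop(target_id, None)
```

Both versions respond the same way to a legitimate admin request, which is why the missing check tends to go unnoticed until someone reads the handler itself.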

How GitHub Repo Scanning Works

Repository scanning combines Static Application Security Testing (SAST) with secret detection and dependency auditing. It involves several layers of inspection:

Secret Detection: Searching for patterns that match API keys (Stripe, AWS, OpenAI), private keys, and database strings.
Static Analysis: Looking for "sources" (user input) that flow into "sinks" (dangerous functions like innerHTML or db.query) without sanitization.
Dependency Auditing: Comparing your npm or pip packages against databases of known vulnerabilities like the GitHub Advisory Database.
History Scanning: Walking back through the Git tree to find secrets in deleted lines of code.

Because a secret deleted in a later commit remains recoverable from earlier ones, scanning only the current state of the code is insufficient; you must scan the history to ensure no legacy credentials remain active.
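The static analysis layer listed above can be illustrated with a small sketch built on Python's standard ast module. It only locates direct calls to eval and exec in Python source, which is the "find the sink" half of the job; tracing whether user input actually reaches the sink is considerably more involved. The function name and the set of flagged calls are assumptions made for this example.

```python
import ast

# Builtins treated as dangerous sinks in this sketch.
DANGEROUS_CALLS = {"eval", "exec"}


def find_dangerous_calls(source):
    """Return (line_number, function_name) for each direct call to a flagged builtin."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in DANGEROUS_CALLS
        ):
            findings.append((node.lineno, node.func.id))
    return sorted(findings)
```

Working on the syntax tree rather than raw text means the string 'eval' inside a comment or a string literal is not reported, which keeps false positives down compared with a plain text search.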

Comparing URL Scanning and Repo Scanning

The two methods are complementary rather than redundant. A URL scan, such as the free scan provided by SimplyScan, is the only way to verify that your Content Security Policy (CSP) is actually working in the browser and that your security headers are correctly served by the production environment. It also checks for AEO (answer engine optimization) visibility and accessibility (WCAG) compliance.

Conversely, a repo scan is the only way to ensure your environment variables aren't leaked in a config file and that your Supabase RLS policies are correctly defined in your migration files. While a URL scan checks the "outside-in" perspective, the repo scan provides the "inside-out" view.

URL Scanning Strengths: Security headers, SSL/TLS health, frontend API key leaks, SEO, speed, and AI crawler accessibility.
Repo Scanning Strengths: Git history, backend logic, dependency CVEs, infrastructure-as-code (Docker/K8s), and server-side secrets.

When to Use Each Approach

Use URL Scanning for Immediate Feedback

You should run a free security scan every time you deploy. It takes ~30 seconds and checks for "low hanging fruit" that attackers exploit first:

Missing security headers.
Exposed API keys in the frontend.
Performance bottlenecks and AEO visibility.
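The first check in that list can be sketched as a comparison between the headers a response actually carries and a set you expect. The selection in EXPECTED_HEADERS is an assumption made for this example, not the list any particular scanner uses.

```python
# Header names a hardened response commonly carries (illustrative selection).
EXPECTED_HEADERS = (
    "content-security-policy",
    "strict-transport-security",
    "x-content-type-options",
    "x-frame-options",
    "referrer-policy",
)


def missing_security_headers(headers):
    """Return expected header names absent from a response, compared case-insensitively."""
    present = {name.lower() for name in headers}
    return [name for name in EXPECTED_HEADERS if name not in present]
```

To run it against a live site, you could pass dict(response.headers.items()) from a urllib.request.urlopen(url) response. A check like this can only be made against the deployed site, since the headers are set by the production environment rather than by the source alone.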

Use Repo Scanning for Deep Audits

Repo scanning is mandatory during these phases:

Before making a private repo public: Ensure no secrets are hidden in the history.
When adding new team members: Audit the code they contribute for insecure patterns.
Before a major launch: Verify that Supabase RLS or Firebase rules are correctly defined.
Monthly Maintenance: To catch new CVEs discovered in your existing dependencies.

How SimplyScan Bridges the Gap

SimplyScan is designed specifically for the modern "vibe-coding" workflow. We understand that developers using Windsurf or Replit move fast and need security that keeps up.

One Pass, 8 Dimensions: Our scanner doesn't just look at security; it grades speed, SEO, AEO, and accessibility (WCAG) in one go.
AI-Specific Risks: We detect risks unique to AI-built apps, such as prompt injection vulnerabilities and exposed AI provider keys.
Pro Monitoring: Our $24/month plan includes uptime monitoring and scheduled rescans with Slack/GitHub/Linear integrations.

Step-by-Step: Securing Your GitHub Repo

Scan the URL first: Run a SimplyScan. If it finds keys in your JavaScript bundles, you have an immediate leak that needs rotation.
Connect GitHub: Use a repository scanner to link your repository. This typically uses GitHub OAuth, and you can revoke access at any time.
Audit History: Look for the "Secrets in History" report. If a key is found, remove it from git history using tools like BFG Repo Cleaner or git-filter-repo.
Check Dependencies: Update any packages flagged with high-severity CVEs in your package-lock.json.
Verify RLS: If you use a backend-as-a-service, ensure your Row Level Security (RLS) policies are committed in your migrations and aren't overly permissive, for example a policy whose only condition is USING (true).
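The RLS step above can be approximated with a text-level audit of a migration file. The sketch below assumes PostgreSQL-style SQL as used by Supabase; the function names and the report format are hypothetical. It is a regex heuristic and not a SQL parser, so it will miss unusual formatting and treats schema-qualified and unqualified names as different tables.

```python
import re

CREATE_TABLE = re.compile(
    r'create\s+table\s+(?:if\s+not\s+exists\s+)?([\w."]+)', re.IGNORECASE
)
ENABLE_RLS = re.compile(
    r'alter\s+table\s+(?:if\s+exists\s+)?([\w."]+)\s+enable\s+row\s+level\s+security',
    re.IGNORECASE,
)
ALLOW_ALL = re.compile(r"using\s*\(\s*true\s*\)", re.IGNORECASE)


def normalize(name):
    # Drop identifier quotes and case so "Notes" and notes compare equal.
    return name.replace('"', "").lower()


def audit_migration(sql):
    """Report tables created without RLS enabled and count policies that allow every row."""
    created = {normalize(n) for n in CREATE_TABLE.findall(sql)}
    protected = {normalize(n) for n in ENABLE_RLS.findall(sql)}
    return {
        "tables_without_rls": sorted(created - protected),
        "allow_all_policies": len(ALLOW_ALL.findall(sql)),
    }
```

A USING (true) policy is sometimes intentional, for instance on a table of public content, so treat that count as a prompt for review instead of an automatic failure.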

By combining URL and repo scanning, you protect your application from the browser to the database. Start with a free scan today to see your site's health across security, speed, and AI visibility.

FAQ

Are secrets in a private GitHub repo still a risk?

Yes. Private repositories are often shared with third-party integrations, CI/CD tools, and multiple collaborators. If a single collaborator's account is compromised, your entire history of secrets is exposed. Furthermore, many developers eventually flip private repos to public, forgetting that the entire git history, including every secret ever committed, becomes public instantly. Always treat committed secrets as compromised.

How do I remove a secret from git history?

Simply deleting the secret in a new commit is not enough; it remains in the git history. You must use a tool like git-filter-repo or the BFG Repo Cleaner to rewrite the repository's history and purge the string. After cleaning, you must force-push the changes. Most importantly, you must rotate the leaked key immediately: rewriting history does not recall copies already taken by external scrapers, and the old commits can persist in forks and existing clones.

How many secrets get leaked on GitHub?

According to the GitGuardian State of Secrets Sprawl 2024 report, over 12.8 million secrets were exposed on GitHub in 2023. This represents a significant increase year-over-year. Automated bots scan every public commit within seconds of it being pushed, meaning that if you push a live AWS or OpenAI key, it is likely compromised before you can even hit "delete."

Can a repo scan find vulnerable npm packages?

Yes. A repository scan analyzes your package-lock.json or yarn.lock files to build a full dependency tree. It then cross-references every package version against the GitHub Advisory Database and other CVE sources. This is more effective than a URL scan, which can only guess at frontend libraries and cannot see server-side dependencies at all.

Do I need repo scanning if my URL scan came back clean?

Absolutely. A clean URL scan only means your deployed frontend is currently safe. It does not account for SQL injection vulnerabilities in your backend, hardcoded passwords in your database migrations, or vulnerable packages in your server-side environment. SimplyScan typically finds that 30% of high-severity issues in AI-built apps are only detectable via source code analysis.

Is it safe to give a scanner access to my GitHub repository?

When using a reputable tool, you grant access via GitHub OAuth. This allows you to select specific repositories rather than granting blanket access to your entire account. You can revoke this access at any time through your GitHub settings. The scanner only reads the code to identify patterns and does not store your proprietary logic permanently.
