GitHub Repo Scanning: Why URL Scans Aren't Enough
URL scans check your frontend. Repo scans check everything else - backend files, server configs, edge functions, and secrets that never reach the browser.
By Gabriel CA · Kraftwire Software · 9 min read
What URL Scanning Cannot See
A URL scan analyzes your deployed frontend - the HTML, JavaScript, CSS, and network requests that any visitor can see. It is effective for catching exposed API keys, client-side vulnerabilities, and frontend misconfigurations.
But it cannot see:
**Backend source code** like edge functions, API routes, and server-side logic
**Configuration files** like `.env` templates, `config.toml`, and deployment configs
**Database schemas** including migration files, seed data, and schema definitions
**Server-side secrets** that never reach the browser but might be committed to the repository
**Infrastructure configuration** like Docker files, CI/CD pipelines, and deployment scripts
**Test files** that might contain hardcoded credentials, test API keys, or database connection strings
This is why GitHub repo scanning exists. It analyzes the actual source code in your repository, not just the deployed output.
Why Repository Scanning Matters
Your deployed app is the tip of the iceberg. The repository contains everything that goes into building and deploying that app, and many security issues live in files that never make it to the browser.
Secrets Committed to Git History
The most common reason to scan repositories is finding secrets that were committed at some point in the project history. Even if you later removed a key from the code, it still exists in every previous commit. Anyone who clones your repository can scroll through the git log and find every secret that was ever committed.
**How this happens in practice:** You start building your app and put your OpenAI key directly in the code to test it. It works great. A week later, you realize the key should be in environment variables, so you move it. But the commit from a week ago still contains the key in plain text. If your repo is public (or becomes public), that key is exposed.
**What attackers actually do:** Automated bots continuously scan public GitHub repositories for API keys, passwords, and tokens. GitGuardian has reported detecting more than 10 million secrets in public GitHub commits in a single year. These bots can find and exploit exposed keys within minutes of a push.
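To make the idea concrete, here is a minimal Python sketch of how a history scan might work, assuming the input is the text produced by `git log -p --all`. The three patterns and the function name `find_secrets_in_patch` are illustrative assumptions, not the rule set of any particular scanner; real tools ship far more rules and add entropy checks.

```python
import re

# Illustrative patterns only (assumption); real scanners use much larger rule sets.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "stripe_live_secret_key": re.compile(r"\bsk_live_[0-9a-zA-Z]{10,}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def find_secrets_in_patch(patch_text):
    """Return (commit, rule_name) pairs for secret-like strings on added lines."""
    findings = []
    commit = None
    for line in patch_text.splitlines():
        if line.startswith("commit "):
            # Header line of a new commit in `git log` output.
            commit = line.split()[1]
        elif line.startswith("+") and not line.startswith("+++"):
            # Added line in a diff; "+++" is the file header, not content.
            for name, pattern in SECRET_PATTERNS.items():
                if pattern.search(line):
                    findings.append((commit, name))
    return findings
```

Because it inspects every added line in every commit, a key that was later moved to an environment variable is still reported against the commit that introduced it. One way to feed it is `subprocess.run(["git", "log", "-p", "--all"], capture_output=True, text=True).stdout`.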
Hardcoded Credentials in Config Files
AI-generated projects frequently include configuration files with placeholder credentials that look real:
    # docker-compose.yml generated by AI
    services:
      db:
        environment:
          POSTGRES_PASSWORD: supersecretpassword123
These files sit in your repository and often go unnoticed during code review because they look like they are supposed to be there.
Insecure Dependencies
Your `package.json`, `requirements.txt`, or `Cargo.toml` lists every dependency your app uses. A repo scan can cross-reference these against vulnerability databases to find packages with known security issues. URL scanning cannot do this because the deployed app does not expose its dependency list.
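The cross-referencing step can be sketched in a few lines of Python. This is a simplified illustration that only handles exact `name==version` pins in a `requirements.txt`; the advisory table `KNOWN_VULNERABLE` and the package `examplelib` are made-up placeholders standing in for a real vulnerability database.

```python
# Hypothetical advisory data for illustration only; not real advisories.
KNOWN_VULNERABLE = {
    ("examplelib", "1.2.0"): "EXAMPLE-ADVISORY-0001",
}


def parse_requirements(text):
    """Extract (name, version) pairs from exactly pinned lines (name==version)."""
    pins = []
    for raw in text.splitlines():
        # Drop trailing comments and environment markers.
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        if "==" not in line:
            continue  # unpinned or unsupported syntax; skipped in this sketch
        name, version = line.split("==", 1)
        pins.append((name.strip().lower(), version.strip()))
    return pins


def find_vulnerable(text, advisories=KNOWN_VULNERABLE):
    """Return (name, version, advisory_id) for every pin listed in the advisories."""
    return [
        (name, version, advisories[(name, version)])
        for name, version in parse_requirements(text)
        if (name, version) in advisories
    ]
```

A production scanner would resolve version ranges and transitive dependencies from lock files rather than matching exact pins.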
Server-Side Code Patterns
Backend code contains the logic that handles authentication, authorization, database queries, and business rules. A repo scan can detect:
SQL queries built with string concatenation instead of parameterized queries
Missing authentication middleware on sensitive routes
Overly permissive file upload handlers
Hardcoded admin credentials or bypass logic
Insecure random number generation for tokens or session IDs
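The first item on that list is easy to demonstrate. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the `users` table and both function names are assumptions made for the example.

```python
import sqlite3


def find_user_unsafe(conn, username):
    # Vulnerable: user input is spliced directly into the SQL text.
    query = "SELECT id FROM users WHERE name = '" + username + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn, username):
    # Parameterized: the value is passed separately from the SQL text.
    return conn.execute(
        "SELECT id FROM users WHERE name = ?", (username,)
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

# A classic injection payload: closes the string and adds an always-true clause.
payload = "nobody' OR '1'='1"
unsafe_rows = find_user_unsafe(conn, payload)  # matches every row
safe_rows = find_user_safe(conn, payload)      # matches nothing
```

With the concatenated query, the payload turns the `WHERE` clause into a condition that is always true, so every user is returned. The parameterized version treats the same payload as a literal name and finds no match.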
What GitHub Repo Scanning Catches
1. Leaked Secrets in Git History
Commits persist until history is deliberately rewritten, so even deleted files live on in git history. A thorough repo scan checks every commit, not just the latest code.
**Common secrets found in git history:**
API keys for OpenAI, Stripe, Twilio, SendGrid, and AWS
Database connection strings with embedded passwords
OAuth client secrets and refresh tokens
SSH private keys and the private keys for TLS/SSL certificates
Service account credentials and JWT signing keys
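**How to fix it:** Rotate every key that was ever committed first, because anything that has been pushed should be treated as compromised. Then use a tool like `git-filter-repo` or BFG Repo-Cleaner to remove the secrets from history, and force-push the rewritten branches. Simply deleting the file in a new commit does not remove it from history, and existing clones and forks keep the old commits.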
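As a rough sketch of the cleanup step, the Python helper below builds an expressions file for `git filter-repo --replace-text`, where each line has the form `text==>replacement`. The helper name `build_replace_text_rules` is an assumption for this example; check the git-filter-repo documentation before running anything, since the tool rewrites history.

```python
def build_replace_text_rules(secrets, replacement="***REMOVED***"):
    """Build expressions-file contents for `git filter-repo --replace-text`."""
    lines = []
    for secret in secrets:
        # Reject values that cannot be written as a single literal rule.
        if (
            "\n" in secret
            or "==>" in secret
            or secret.startswith(("regex:", "glob:", "literal:"))
        ):
            raise ValueError("secret cannot be expressed as a one-line literal rule")
        lines.append(f"{secret}==>{replacement}")
    return "\n".join(lines) + "\n"
```

You would write the returned text to a file (for example `rules.txt`, kept outside the repository) and run `git filter-repo --replace-text rules.txt` in a fresh clone. The rules file contains the secrets themselves, so delete it afterwards.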
2. Environment File Patterns
Many AI projects generate `.env.example` or `.env.local` files with real values instead of placeholders:
    # .env.example - should have placeholders, but AI put real values
    STRIPE_SECRET_KEY=sk_live_51ABC...
    DATABASE_URL=postgresql://admin:realpassword@db.example.com/prod
A repo scan flags these files and checks whether the values look like real credentials rather than obvious placeholders.
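One way such a check could work is sketched below in Python. The placeholder word list, the pattern, and the function names are assumptions chosen for illustration; an actual scanner would likely combine provider-specific key formats with entropy scoring.

```python
import re
from urllib.parse import urlsplit

# Assumed lists of "obvious placeholder" values; tune for your own projects.
PLACEHOLDER_WORDS = {"", "changeme", "password", "secret", "placeholder", "example"}
PLACEHOLDER_PATTERN = re.compile(r"^(your[_-].*|<.*>|x{3,}|\.{3})$", re.IGNORECASE)


def looks_like_placeholder(value):
    v = value.strip().strip("'\"")
    return v.lower() in PLACEHOLDER_WORDS or bool(PLACEHOLDER_PATTERN.match(v))


def suspicious_env_entries(text):
    """Return variable names whose values do not look like obvious placeholders."""
    flagged = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        parts = urlsplit(value.strip())
        if parts.scheme and parts.netloc:
            # For connection URLs, judge only the embedded password.
            if parts.password is None or looks_like_placeholder(parts.password):
                continue
        elif looks_like_placeholder(value):
            continue
        flagged.append(key.strip())
    return flagged
```

Run against the `.env.example` above, this sketch flags both `STRIPE_SECRET_KEY` and `DATABASE_URL`, while a value such as `your_openai_key_here` or a URL without an embedded password passes.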
3. Insecure Code Patterns
Beyond secrets, repo scanning detects code-level vulnerabilities:
**eval() usage** with user input, which allows arbitrary code execution
**SQL injection** through string concatenation in database queries
**Command injection** through unsanitized input passed to shell commands
**Path traversal** vulnerabilities in file serving logic
**Insecure deserialization** of user-provided data
**Missing input validation** on API endpoints
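Static detection of patterns like these usually works on the parsed syntax tree rather than raw text. The following Python sketch uses the standard `ast` module to flag `eval()`/`exec()` calls and `shell=True` arguments in Python source. It is deliberately simple: it does not track whether user input actually reaches the call, so it over-reports, and the function name `find_risky_calls` is an assumption for this example.

```python
import ast


def find_risky_calls(source):
    """Return (line_number, description) for eval/exec calls and shell=True arguments."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        # Direct calls to eval(...) or exec(...).
        if isinstance(node.func, ast.Name) and node.func.id in {"eval", "exec"}:
            findings.append((node.lineno, f"call to {node.func.id}()"))
        # Any call passing the literal keyword argument shell=True.
        for kw in node.keywords:
            if (
                kw.arg == "shell"
                and isinstance(kw.value, ast.Constant)
                and kw.value.value is True
            ):
                findings.append((node.lineno, "shell=True in call"))
    return sorted(findings)
```

Each finding carries a line number, which is what lets a scan report point at the exact location to review.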
4. Dependency Vulnerabilities
Your project dependencies are a significant attack surface. A repo scan analyzes your lock files and manifest files to find:
Packages with known CVEs (Common Vulnerabilities and Exposures)
Outdated packages that are missing security patches
Dependencies that have been deprecated or abandoned
Packages with suspicious characteristics (typosquatting, new maintainers on popular packages)
5. Infrastructure Configuration Issues
If your repository contains deployment configuration, a scan can detect:
Docker containers running as root
CI/CD pipelines with overly permissive access
Cloud infrastructure templates with public access defaults
Missing network security group rules
Unencrypted storage configurations
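The first check in that list can be approximated with a short Python function. This sketch assumes the base image defaults to root (the common case, but not guaranteed) and ignores line continuations and build arguments; `runs_as_root` is a hypothetical name.

```python
def runs_as_root(dockerfile_text):
    """Return True if the final build stage never switches to a non-root USER."""
    user = None
    for raw in dockerfile_text.splitlines():
        parts = raw.strip().split(None, 1)
        if not parts or parts[0].startswith("#"):
            continue  # blank line or comment
        instruction = parts[0].upper()
        if instruction == "FROM":
            # A new stage starts; assume it begins as the base image's default user.
            user = None
        elif instruction == "USER" and len(parts) == 2:
            user = parts[1].strip()
    # USER may be "name" or "uid:gid"; only the user part matters here.
    return user is None or user.split(":")[0] in {"root", "0"}
```

Resetting on every `FROM` matters for multi-stage builds: a `USER` instruction in an earlier build stage does not carry over to the final image.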
URL Scanning vs Repo Scanning
| What It Checks | URL Scan | Repo Scan |
|---------------|----------|-----------|
| Frontend exposed secrets | Yes | Yes |
| Backend source code | No | Yes |
| Git history | No | Yes |
| Config files | No | Yes |
| Dependencies | No | Yes |
| Server-side logic | No | Yes |
| Deployed headers | Yes | No |
| Runtime behavior | Yes | No |
| Network requests | Yes | No |
| Client-side performance | Yes | No |
The key insight is that these are complementary, not competing approaches. URL scanning tells you what is visible to attackers right now. Repo scanning tells you what is vulnerable in the code that powers your application.
When to Use Each Approach
Use URL Scanning When:
You want a quick security check of your live application
You need to verify that no secrets are exposed in the deployed frontend
You want to check security headers and CORS configuration
You are auditing a third-party application you do not have source code access to
You want to monitor your production deployment for regressions
Use Repo Scanning When:
You want to check for secrets in your entire git history
You need to audit server-side code for vulnerabilities
You want to identify dependency vulnerabilities before deployment
You are doing a thorough security review before launch
You need to comply with security standards that require code-level auditing
Use Both When:
You are preparing for a production launch
You are handling sensitive user data (financial, health, personal information)
You are going through a security audit or compliance process
You want comprehensive coverage of your attack surface
How SimplyScan Handles Repo Scanning
SimplyScan offers both URL scanning and GitHub repository scanning:
**Free URL scan** checks your deployed app for exposed secrets, frontend vulnerabilities, and security header issues across 3 core categories
**Pro scan** includes all 14 categories with 51+ checks, covering both URL and repository analysis
**Repo scanning** analyzes your connected GitHub repository for secrets in history, insecure code patterns, dependency vulnerabilities, and configuration issues
The combination gives you complete visibility into your application security, from the code in your repository to the deployed application your users interact with.
Setting Up Repository Scanning
Step 1: Connect Your Repository
Link your GitHub repository to SimplyScan. The connection uses GitHub's OAuth system, so you control exactly which repositories SimplyScan can access.
Step 2: Run the Scan
SimplyScan scans your repository for secrets, code patterns, dependency issues, and configuration problems. The scan checks your current code and your git history.
Step 3: Review and Fix
The scan results show you exactly where each issue is located, with file paths, line numbers, and specific remediation steps. Fix the critical issues first (exposed secrets, known vulnerable dependencies) and work through the rest.
Step 4: Rescan After Fixes
After applying fixes, run another scan to verify that the issues are resolved. This catches cases where a fix was incomplete or introduced a new issue.
Key Takeaways
URL scanning and repo scanning serve different purposes. URL scanning catches what is visible to attackers in your deployed app. Repo scanning catches vulnerabilities in your source code, configuration, dependencies, and git history.
For AI-generated and vibe-coded apps, repo scanning is especially important because AI tools frequently commit secrets, generate insecure code patterns, and use outdated dependencies. A URL scan alone misses the majority of these issues.
Use both approaches for comprehensive security coverage. Start with a [free SimplyScan URL scan](/) and add repo scanning for complete visibility into your application security.
Related Guides
[How to Fix Exposed API Keys in 5 Minutes](/blog/fix-exposed-api-keys)
[Environment Variables Security](/blog/environment-variables-security)
[Security Audit Checklist](/blog/security-audit-checklist)
[Is Vibe Coding Safe?](/blog/is-vibe-coding-safe)