Name: AgentsOrg
Author: AgentsOrg

Your AI governance program probably has a blind spot.

You’ve documented your AI vendors. You’ve reviewed your ML models in production. You’ve audited your data pipelines. But there’s an entire category of AI that your governance framework almost certainly doesn’t cover: the Custom GPTs and Projects your employees built themselves, last quarter, in an afternoon, without telling anyone.

This is shadow AI: and in most enterprises running ChatGPT Enterprise, it’s not a fringe phenomenon. It’s the majority of your AI footprint.

What Is Shadow AI?

Shadow AI is any AI asset created or used by employees outside of formal IT approval or visibility. In the context of ChatGPT Enterprise, it manifests primarily as:

Custom GPTs built by individual contributors with no review process
System prompts that define agent behavior without governance oversight
External integrations connected to GPTs without security review
Data flows where employees feed sensitive documents into AI systems with no logging

Unlike traditional shadow IT: where employees install unauthorized software: shadow AI creates behavior. A Custom GPT built by a sales rep that’s connected to your CRM API and generates customer proposals represents an autonomous process operating on company data, at scale, without review.

Why Shadow AI Happens

Shadow AI isn’t malicious. It emerges from the gap between what employees need and what IT can deliver.

When ChatGPT Enterprise launches in an organization, the message is: “You can build your own AI assistants.” The tools are intuitive. The barriers are low. An employee with a repetitive workflow can spin up a GPT in 20 minutes that saves them hours per week.

IT can’t keep up. The GPT creation velocity in a 5,000-person enterprise routinely runs at dozens per week. No governance team can review each one in real time. So most GPTs are never reviewed at all.

The result: within 12-18 months of a ChatGPT Enterprise deployment, the average enterprise has:

60-70% of Custom GPTs and Projects that were built without IT awareness
15-25% of GPTs and Projects with access to external APIs or file systems that no one audited
10-20% of GPTs and Projects that are duplicates of each other or of officially approved tools
Unknown percentage that handle regulated data (PII, financial records, health information)

The Four Shadow AI Risk Categories

Not all shadow AI is equally risky. When you start a discovery program, you need a way to triage what you find.

Category 1: Data Exposure Risk

GPTs with access to sensitive data through file uploads, Knowledge bases, or API integrations. These are your highest-priority items because they represent active data flows that may violate your data governance policies.

Signs to look for: GPTs with file upload enabled, custom actions pointing to internal APIs, Knowledge bases containing proprietary documents.

Category 2: Compliance Risk

GPTs operating in regulated domains: legal contract review, HR decisions, financial analysis, patient-facing healthcare workflows: without the safeguards required by your compliance framework.

Signs to look for: GPT names or descriptions referencing HR, legal, compliance, medical, or financial processes; system prompts that instruct the GPT to make recommendations or decisions.

Category 3: Reputational Risk

GPTs that interact with customers or external partners, or that generate content for public use, without brand and legal review.

Signs to look for: GPTs described as “client-facing,” “customer communication,” “proposal generation,” or “marketing content.”

Category 4: Operational Risk

GPTs that have become load-bearing in a business process without formal documentation or backup. If a key employee leaves and takes their GPT knowledge with them, a workflow may break.

Signs to look for: GPTs with high usage by multiple team members, indicating organizational dependency.

How to Run a Shadow AI Discovery Audit

A shadow AI audit has four phases:

Phase 1: Inventory

The first challenge is seeing everything. OpenAI’s admin console lists GPTs but provides minimal metadata. For a real inventory, you need to extract:

GPT name, description, creation date, creator
System prompt (the behavioral specification)
Configured tools (web browsing, code interpreter, file uploads, custom actions)
Knowledge base contents (if accessible)
Usage metrics (if available)

This is the foundation. Without it, you’re governing blind.

Phase 2: Semantic Classification

A list of GPT names tells you very little. “Sales Helper,” “Legal Bot,” and “My Assistant” could mean anything. You need to understand what each GPT actually does by analyzing its system prompt and configuration semantically.

This means using LLMs to classify each GPT on dimensions like:

What business process does this automate?
What data types does it handle?
What risk factors are present in the instructions?
Who is the intended audience?

Manual review doesn’t scale. Automating this classification is the only way to audit hundreds of GPTs efficiently.

Phase 3: Risk Scoring

Once you have semantic classifications, apply risk scoring. A reasonable starting framework:

Risk Factor	Scoring Weight
External API connections	High
PII or regulated data in Knowledge base	High
Customer-facing or external-use description	High
HR, legal, financial, or medical domain	Medium-High
Instructions to make decisions (not just assist)	Medium
No system prompt (unconfigured)	Low-Medium
Duplicate of existing approved GPT	Low

GPTs that score high on multiple factors are your immediate governance priorities.

Phase 4: Remediation

For each high-risk GPT, you have four options:

Approve with controls: Review, document, and formally sanction the GPT with appropriate access controls
Remediate and approve: Work with the builder to modify the GPT to meet governance standards, then approve
Consolidate: If it duplicates an approved GPT, redirect users and retire the duplicate
Retire: Disable GPTs that represent unacceptable risk or have no active users

The goal is not to eliminate all shadow AI: it’s to convert ungoverned AI into governed AI.

Building Ongoing Shadow AI Governance

A one-time audit decays quickly. New GPTs appear every week. You need a continuous governance motion:

Weekly automated scans that detect new GPTs, run semantic classification, and flag new high-risk items for review.

Builder engagement, not just enforcement. Employees build shadow AI because the official process is too slow. Make governance fast enough that builders want to participate.

A graduated approval process: Low-risk GPTs get auto-approved with standard notices. Medium-risk get a lightweight review. High-risk go through formal security review. Don’t apply the same friction to a personal productivity GPT as to a customer-facing AI.

Visibility into the governed portfolio so employees can see what already exists before building something new. Duplicate detection and a searchable GPT directory reduce redundancy without requiring mandates.

Shadow AI as a Signal, Not Just a Risk

Here’s the counterintuitive insight from organizations that have done this well: shadow AI is also a signal of innovation.

The employees building ungoverned GPTs are often your most capable AI adopters. They identified real problems, built real solutions, and were willing to invest personal time in tools that help them work better. A governance program that only shuts things down loses that energy.

The best programs do the opposite: they find these builders, recognize their work, bring their GPTs through a governance process, and elevate them as internal AI champions. Shadow AI becomes the discovery mechanism for your AI talent.

Getting Started

If you’re running ChatGPT Enterprise or Team and you don’t have a current inventory of your Custom GPTs and Projects with semantic classification and risk scores, you have a shadow AI problem.

AgentsOrg automates the entire discovery and classification pipeline: connecting to OpenAI’s Compliance API, extracting full GPT and Project metadata, running semantic enrichment across 9 dimensions, and surfacing risk flags in a governance dashboard.

git clone https://github.com/ori129/agentsorg.git
cd agentsorg && make up

Enable Demo mode to see what a fully classified GPT portfolio looks like before connecting your real workspace.

Shadow AI governance is an evolving practice. Share what you’re seeing in your organization on the GitHub Discussions.

Shadow AI: How to Discover What's Actually Running in Your ChatGPT Workspace