Methodology — PentestPlanner scoping logic

1. Core concepts

Every estimate is computed deterministically from the wizard inputs. AI plays no part in the math; it is used only to convert free-text notes into structured signals before they reach the engine. The output is always a triple of man-days (MD): min, recommended, and max, plus a per-scope breakdown.

Pipeline:

Wizard input → normalized ScopeInput (one per scope).
Per-scope rule → baseline + modifiers + discovery + approach + reporting.
Engine merge → cross-scope overlap adjustments, shared PM/workshop overhead, confidence score.
Final triple → min/recommended/max with a transparent line-item breakdown.

2. Confidence and ranges

Each line item carries a kind (baseline, modifier, discovery, compliance, reporting) and an MD value. The min and max values are derived from baseline tables per scope type, and the gap between them narrows as the confidence score increases (more answered questions, fewer "unknown" signals).

Missing critical info (e.g. unknown endpoint count, unknown AD presence) is recorded under missingInfo[] and visibly widens the range.
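
The relationship between confidence and range width can be sketched as follows. This is a minimal illustration, not the engine's actual formula: the function name `range_for`, the 0-to-1 confidence scale, and the linear widening factor are all assumptions. The text above only states that min and max come from baseline tables and that the range narrows as confidence rises.

```python
from typing import Tuple


def range_for(recommended: float, table_min: float, table_max: float,
              confidence: float) -> Tuple[float, float, float]:
    """Return (min, recommended, max) man-days.

    Assumption: confidence lies in [0, 1]. At 1.0 the table values are used
    as-is; at 0.0 the distance from the recommended value is doubled.
    """
    confidence = min(1.0, max(0.0, confidence))  # clamp to the assumed scale
    widen = 2.0 - confidence                     # hypothetical linear factor
    low = recommended - (recommended - table_min) * widen
    high = recommended + (table_max - recommended) * widen
    return (max(0.0, low), recommended, high)
```

With table values of 8 and 13 MD around a recommended 10 MD, full confidence returns the table values unchanged, while a confidence of 0.5 stretches the range to 7 and 14.5 MD.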

3. Web application (deterministic)

Web is the most opinionated rule because the wizard collects enough structural data to avoid AI guesses.

3.1 Black-box gating

Black-box engagements pass through a sequential gate before any baseline is applied:

No public entry vector (no login, no registration reachable) → 0.5 MD perimeter sanity check only. No further modifiers.
Public login present, no self-registration, no auth-break allowed → 2.5 MD "auth surface only" (login, registration, password reset, forgot-password, brute-force throttling, account enumeration). Tested as a single anonymous role.
Auth break allowed → standard sizing path + a fixed +5 MD post-breach internal add-on.
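
The gate above can be read as a short decision function. The sketch below is illustrative only: `GateResult`, `black_box_gate`, and the parameter names are hypothetical. The text does not say what happens when self-registration is available but auth break is not allowed, so the final fall-through (standard sizing with no add-on) is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GateResult:
    path: str                  # "perimeter", "auth_surface", or "standard"
    fixed_md: Optional[float]  # fixed effort when the gate short-circuits
    addon_md: float            # add-on applied on top of standard sizing


def black_box_gate(public_login: bool, self_registration: bool,
                   auth_break_allowed: bool) -> GateResult:
    # Step 1: nothing reachable, so only a perimeter sanity check applies.
    if not public_login and not self_registration:
        return GateResult("perimeter", 0.5, 0.0)
    # Step 2: login only, no way past it, so test the auth surface alone.
    if public_login and not self_registration and not auth_break_allowed:
        return GateResult("auth_surface", 2.5, 0.0)
    # Step 3: auth break allowed, so standard sizing plus the fixed add-on.
    if auth_break_allowed:
        return GateResult("standard", None, 5.0)
    # Assumption: remaining cases use standard sizing without the add-on.
    return GateResult("standard", None, 0.0)
```

Because the checks run in order, a scope that hits step 1 or step 2 never reaches the baseline tables or modifiers.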

3.2 Sizing units (grey/white-box & auth-break path)

The user picks a sizing unit that matches how they think about the app:

Unit	Typical use	Baseline curve
`pages`	Marketing / CMS / portal	≤5 pages + no auth = static (2 MD); otherwise scaled
`modules`	SaaS / business app	~7 modules ≈ 10 MD baseline
`endpoints`	Heavy SPA / API-driven UI	per-endpoint coverage, capped

Each unit has its own (min, recommended, max) lookup curve so the baseline never falls below realistic floor effort for a given surface area.

3.3 Role coverage multiplier

Role count is applied as a tiered percentage on top of the baseline (with a minimum absolute add so it still matters for small baselines):

Roles	Multiplier	Min add
1	—	—
2	+10%	+1.0 MD
3–4	+15%	+1.5 MD
5+	+20%	+2.0 MD
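
A minimal sketch of the table above, assuming the "min add" acts as a floor on the percentage add (the larger of the two values wins). The function name `role_coverage_add` is hypothetical.

```python
def role_coverage_add(baseline_md: float, roles: int) -> float:
    """Extra man-days for role coverage, per the tier table above."""
    if roles <= 1:
        return 0.0                 # a single role adds nothing
    if roles == 2:
        pct, floor = 0.10, 1.0
    elif roles <= 4:
        pct, floor = 0.15, 1.5
    else:
        pct, floor = 0.20, 2.0
    # Assumption: the minimum add is a floor on the percentage add.
    return max(baseline_md * pct, floor)
```

For a 4 MD baseline with 3 roles, the percentage add would be only 0.6 MD, so the 1.5 MD floor applies. For a 20 MD baseline with 5 roles, the 20% add of 4 MD exceeds the floor and is used instead.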

3.4 Additional applications

Each app beyond the first adds ~60% of the baseline (assumes moderate code/auth overlap between apps under the same engagement).
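
As a worked illustration of the rule above, here is a small sketch. The function name `multi_app_baseline` is hypothetical, and 0.6 stands in for the approximate 60% figure; the engine's exact factor may differ.

```python
ADDITIONAL_APP_FACTOR = 0.6  # approximate share of the baseline per extra app


def multi_app_baseline(baseline_md: float, app_count: int) -> float:
    """Baseline for the first app plus ~60% of it for each further app."""
    extra_apps = max(0, app_count - 1)
    return baseline_md + baseline_md * ADDITIONAL_APP_FACTOR * extra_apps
```

Three apps with a 10 MD baseline come to roughly 22 MD: 10 for the first app and about 6 for each of the other two.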

3.5 Feature modifiers

Complex RBAC — beyond a standard role matrix
Multi-tenant isolation — cross-tenant data leakage tests
Payment / financial flows — race conditions, transaction integrity
Approval workflows, sensitive admin interface
File upload / processing
Federated SSO / SAML / OAuth
Complex MFA flows
Sensitive business workflows

3.6 Discovery and missing-info penalties

No test accounts → "Account self-provisioning" discovery overhead
No documentation → surface-from-scratch overhead
Unknown unit count, unknown app count, unknown public-login → recorded as missing info and widen the range

4. API

Baseline driven by endpoint count and OpenAPI/spec availability:

Small / Medium / Large baseline based on endpoint count and module density
+ GraphQL → schema/depth/batching tests
+ SOAP/XML → WSDL, XXE, SOAPAction abuse
+ Complex object-level authorization (BOLA / scopes)
+ Financial / transaction workflows
Discovery overhead if spec is partial or missing

5. Mobile

First-platform baseline by size. Second platform adds:

Native second: ~70% of first
Shared codebase (React Native, Flutter): ~35% of first
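
The second-platform rule can be sketched as a lookup. The names `SECOND_PLATFORM_FACTOR` and `mobile_baseline` are hypothetical, and the factors are the approximate values quoted above.

```python
from typing import Optional

# Approximate share of the first-platform baseline added by a second platform.
SECOND_PLATFORM_FACTOR = {
    "native": 0.70,   # separate native codebase
    "shared": 0.35,   # shared codebase such as React Native or Flutter
}


def mobile_baseline(first_platform_md: float,
                    second_platform: Optional[str] = None) -> float:
    """First-platform baseline plus the second-platform add, if any."""
    if second_platform is None:
        return first_platform_md
    factor = SECOND_PLATFORM_FACTOR[second_platform]  # KeyError on unknown kind
    return first_platform_md + first_platform_md * factor
```

With a 10 MD first-platform baseline, a native second platform brings the total to about 17 MD, and a shared-codebase second platform to about 13.5 MD.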

Modifiers: SSL pinning bypass, offline/sync surface, sensitive local storage, biometrics, root/jailbreak detection bypass, financial flows, MASVS L1/L2 mapping. Backend API is only covered as "basic support" — a real API assessment requires a dedicated API scope.

6. External infrastructure

Baseline by IP count. Modifiers: number of distinct exposed services, many domains/subdomains (recon overhead), cloud perimeter complexity, restricted testing windows, VPN/access setup. If exploitation is not permitted, confidence is lowered (the engine cannot validate true exploitability).

7. Internal infrastructure

Baseline by asset count. Modifiers: Active Directory in scope (adds Kerberoasting, ACL abuse, GPO review), multiple AD forests, segmentation testing, assumed-breach scenario, lateral movement, many subnets/VLANs, production restrictions, EDR/AV evasion overhead.

8. Cloud

Baseline by size. Modifiers: Kubernetes (cluster, RBAC, workloads, network policies), IAM complexity, multi-account/subscription paths, CI/CD security, IaC review. Without read-only access, discovery is external-enumeration only and confidence drops.

9. Retest

baseline = max(0.5, originalMd × pct), where pct scales with finding count and severity (more high/critical findings → higher pct). Modifiers: fix discovery overhead (when fix documentation is missing) and report update overhead.
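
A minimal sketch of the retest formula. The function name `retest_baseline` is hypothetical, and `pct` is passed in directly because the text does not give the table that maps finding count and severity to a percentage.

```python
def retest_baseline(original_md: float, pct: float) -> float:
    """Retest baseline in man-days, never below the 0.5 MD floor."""
    return max(0.5, original_md * pct)
```

For a 10 MD original engagement and a pct of 0.25, the baseline is 2.5 MD. For a 1 MD original engagement with the same pct, the product is 0.25 MD, so the 0.5 MD floor applies.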

10. Social engineering (phishing / vishing / OSINT)

Phishing baseline scales with user count and number of scenarios. Modifiers: custom landing page, credential capture, malicious attachment simulation, awareness material.

Vishing: script complexity, scenarios, languages. OSINT: breached creds, social media, exposed assets, executive analysis, large brand footprint.

11. Cross-scope overlap

When multiple scopes share context, the engine deducts overlap to avoid double-charging:

Web + API on the same app → shared discovery & consolidated report
External + Internal infra → consolidated infrastructure report

12. Shared overhead (PM, workshop, reporting)

On top of the per-scope MDs the engine adds:

Per-scope reporting — ~15% of technical effort, minimum 1 MD per scope (executive summary + remediation guidance)
Workshop / kickoff — fixed overhead for scoping alignment
Project management — fraction of total technical effort

For single-scope estimates these rows are rendered inline in the scope card. For multi-scope estimates they appear in a collapsible "Shared overhead & adjustments" panel, and the totals always reconcile: Σ per-scope MD + sharedOverhead + overlap adjustments = totalMdRecommended.
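
The reporting floor and the reconciliation identity can be sketched together. The function names and the tolerance are hypothetical, 0.15 stands in for the approximate 15% figure, and overlap adjustments are assumed to be passed as a negative number since they are deductions.

```python
from typing import Iterable


def reporting_md(technical_md: float) -> float:
    """Per-scope reporting: ~15% of technical effort, at least 1 MD."""
    return max(1.0, 0.15 * technical_md)


def reconciles(per_scope_md: Iterable[float], shared_overhead: float,
               overlap_adjustments: float, total_md_recommended: float,
               tol: float = 1e-6) -> bool:
    """Check that the line items add up to the recommended total."""
    computed = sum(per_scope_md) + shared_overhead + overlap_adjustments
    return abs(computed - total_md_recommended) <= tol
```

A scope with 4 MD of technical effort gets the 1 MD reporting minimum, since 15% would be only 0.6 MD. As a reconciliation example, two scopes of 12 and 8 MD with 3.5 MD of shared overhead and a 1.5 MD overlap deduction should total 22 MD.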

13. Training feedback loop

When real-world MDs are entered in the Training module, they are split into Technical / Discovery / Compliance / Reporting categories so the system can pinpoint which part of the estimation deviated, not just the total.
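
A per-category comparison along these lines might look like the sketch below. The function name, the dictionary shape, and the lowercase category keys are assumptions made for illustration; only the four categories come from the text.

```python
from typing import Dict

CATEGORIES = ("technical", "discovery", "compliance", "reporting")


def deviation_by_category(estimated: Dict[str, float],
                          actual: Dict[str, float]) -> Dict[str, float]:
    """Actual minus estimated man-days for each category.

    Positive values mean the estimate was too low for that category.
    Categories missing from either input are treated as 0 MD.
    """
    return {c: actual.get(c, 0.0) - estimated.get(c, 0.0) for c in CATEGORIES}


def largest_deviation(deviations: Dict[str, float]) -> str:
    """Name of the category with the largest absolute deviation."""
    return max(deviations, key=lambda c: abs(deviations[c]))
```

If an engagement was estimated at 10 / 2 / 0 / 2 MD and actually took 10 / 4.5 / 0 / 2 MD, the total is off by 2.5 MD, and the breakdown shows that all of it came from discovery.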

14. What is NOT included

Travel, on-site presence, hardware shipping
Customer-side fix implementation
Re-scoping after the contract is signed
Legal/contractual review
