Senior Application Security Tester, AI Red Team Subject Matter Expert 3

US
May 16, 2026

Job Description

Company: remoterocketship

Location: US

The Senior Application Security Tester & AI Red Team Subject Matter Expert is a senior-level offensive security role for a tester who has mastered modern web and API security and is now defining how Evolve Security tests AI-enabled applications, large language models, and agentic systems.
This role wears two hats: hands-on senior application penetration tester for our most complex client engagements, and firm-wide subject matter expert who builds, scales, and represents Evolve Security’s AI red team practice.
The senior tester executes assessments with full autonomy, owns the technical relationship with client security and engineering leadership, mentors mid-level engineers and OSOC analysts, and is the recognized internal authority on offensive AI/ML testing methodology, tooling, and threat modeling.

Requirements:

• Typical Experience:

• 5-8+ years of offensive security experience with a deep concentration in web application and API penetration testing, plus demonstrable hands-on work testing AI/ML systems: LLM-backed applications, RAG pipelines, fine-tuned models, multi-agent systems, or production ML inference.
A track record of dozens of completed assessments, published research, conference talks, CVEs, or open-source contributions is expected.
• Domain Expertise:

• Mastery of web application and API security beyond the OWASP Top 10: business logic abuse, complex authentication and authorization flows (OAuth 2.0 / OIDC, SAML, JWT, mTLS), SSRF chains, deserialization, request smuggling, prototype pollution, and modern SPA / GraphQL attack surface.
Equally fluent in the OWASP Top 10 for LLM Applications and OWASP ML Top 10: prompt injection (direct, indirect, multi-modal), jailbreaks and safety bypasses, insecure output handling, training data poisoning and extraction, model denial of service, supply chain vulnerabilities in model and plugin ecosystems, excessive agency in agentic systems, sensitive data leakage from system prompts and embeddings, and vector store / RAG poisoning.
• Technical Skills:

• Expert with the modern offensive toolchain (Burp Suite Pro with custom extensions, OWASP ZAP, Nuclei, Postman, Nmap, Metasploit, BloodHound) and able to build bespoke tooling when the off-the-shelf option falls short.
Comfortable with AI red-teaming tooling such as Garak, PyRIT, Promptfoo, Giskard, and adversarial ML libraries, and confident designing custom evaluation harnesses against client-specific LLM and agent stacks.
Strong scripting and small-tool development in Python, with working knowledge of JavaScript / TypeScript, Bash, and PowerShell.
Familiar with the components of modern AI applications: vector databases (Pinecone, Weaviate, pgvector), embedding models, retrieval pipelines, agent frameworks (LangChain, LlamaIndex, CrewAI), and tool-use protocols including MCP.
• Soft Skills:

• Excellent written and verbal communication: produces publication-quality reports with no editorial rework, leads CISO and engineering-leader briefings, and de-escalates contested findings with technical rigor.
Mentors mid-level engineers and OSOC analysts through code review, paired testing, and methodology coaching.
Comfortable representing Evolve Security externally through webinars, podcasts, conference CFPs, and client thought-leadership content.
• Certifications (Preferred, not required):

• OSWE, OSCP, OSEP, GWAPT, GXPN, Burp Suite Certified Practitioner; AI/ML-adjacent credentials and contributions such as AI Red Team certifications, published prompt injection research, MITRE ATLAS contributions, or SANS SEC545/SEC595.

Expertise that aligns to our approach
• Lead end-to-end web application and API penetration tests as the senior technical owner, scoping the engagement, executing the assessment, and presenting findings to client security and engineering leadership.
• Apply structured testing techniques aligned to OWASP WSTG and OWASP API Security Top 10 to assess authentication, session management, access control (vertical and horizontal privilege escalation), input validation, error handling, and business logic flaws.
• Design and execute AI red team engagements against LLM-backed applications, RAG systems, and agentic workflows, covering prompt injection (direct, indirect, multi-modal), jailbreak resilience, system prompt and tool-use exfiltration, training data and embedding leakage, insecure output handling, and excessive agency in tool-using agents.
• Map AI findings to the OWASP Top 10 for LLM Applications, OWASP ML Top 10, MITRE ATLAS, and the NIST AI Risk Management Framework so client stakeholders can defend severity and remediation calls internally.
• Test the full AI application surface: model endpoints, prompt and response pipelines, retrieval augmentation, vector stores, fine-tuning pipelines, plugin / tool integrations (including MCP servers), guardrail and safety layers, and supporting cloud infrastructure.
• Demonstrate proficiency in manual exploit development for both classical web vulnerabilities (XSS, SQLi, SSRF, IDOR, CSRF, deserialization) and LLM-specific attacks (jailbreak chains, indirect prompt injection via RAG content, agent hijacking via crafted tool outputs).
• Validate authentication mechanisms (OAuth, OIDC, SAML, MFA implementations, and JWT) and how they extend into AI-specific surfaces such as agent identity, per-user tool scoping, and prompt-level authorization.
• Assess session management, secrets handling, and data-flow controls in AI applications, including how user data ends up in prompts, logs, vector stores, and model fine-tunes.
• Execute client-side testing using browser dev tools and proxy-based inspection, evaluating DOM-based vulnerabilities, insecure local storage, and AI-driven client behaviors (e.g., embedded copilots and in-page agents).
• Test REST and GraphQL APIs using a combination of dynamic, manual, and automated methods; extend the same rigor.

Source: Jobilize