Shostack + Friends Blog


Appsec Roundup - June 2025

Lots of fascinating threat model-related advances, new risk management tools, games, and more!

My June roundup, delivered on Canada Day, kicks off with regulation, because the White House has amended Executive Orders 13694 and 14144 on cybersecurity, and ... it’s complicated enough that I’m rounding up posts just about that change.

US Executive Orders roundup

  • First: The fact sheet and full order. The changes are most complicated for those in government, but will also have impacts on the private sector.
  • Eric Geller has a broad overview, titled Trump scraps Biden software security, AI, post-quantum encryption efforts in new executive order, and Dan Goodin has another, Cybersecurity takes a big hit in new Trump executive order.
  • Ryan Hurst has a long post, From Mandate to Maybe: The Quiet Unwinding of Federal Cybersecurity Policy, with interesting comments about the possibilities of “Rules as Code: Promise, Paradox, and Perfect Timing.”
  • Chris Wysopal’s LinkedIn post lists these software assurance provisions that have been struck:
    • Mandatory, machine-readable attestations from every federal software supplier that they follow NIST’s Secure Software Development Framework (SSDF).
    • A CISA-run Repository for Software Attestations & Artifacts (RSAA) plus a program that randomly validates those filings and publicly names vendors that fail.
    • New FAR clauses forcing every agency to buy only from suppliers that file acceptable attestations.
    • Escalation path to DOJ for vendors that lie in an attestation.
    • The centralized requirement to hand over an SBOM (or any validating artifact) for every piece of software the government buys has been removed. However, SBOMs still exist in federal policy, and any individual agency can continue to demand them under EO 14028 and existing OMB or DoD guidance.
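For readers who haven’t handled one, an SBOM is just a machine-readable inventory of a product’s components. A minimal sketch in the CycloneDX JSON format (the component name and version here are invented for illustration, not taken from any real filing):

```json
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.5",
  "version": 1,
  "components": [
    {
      "type": "library",
      "name": "example-crypto-lib",
      "version": "3.0.13",
      "purl": "pkg:generic/example-crypto-lib@3.0.13"
    }
  ]
}
```

SPDX is the other widely used format; agencies that continue to demand SBOMs under EO 14028 generally accept either.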

Other regulation

  • The FDA has issued new pre-market guidance; view it here.
  • Four former managers at Volkswagen have been convicted of fraud over “Dieselgate,” and two have been sentenced to prison, according to this story. Your employer can’t go to jail for you, and with the EU’s Cyber Resilience Act (CRA) rolling along, the possibilities for fraud will grow.

Threat Modeling

  • Spanning the Volkswagen item and threat modeling, NPR reports that Meta plans to replace humans with AI to assess privacy and societal risks. This is fascinating, because the alternative to humans assessing those threats is computers doing so in a way that might be ok, and might be glue-on-pizza. And maybe, just maybe, the average LLM review of a feature is better than the average human review. That might be because it’s higher technical quality, and it might be because it can happen immediately, and fast has its own value. Some interesting quotes:
    • “A slide describing the new process says product teams will now in most cases receive an ‘instant decision’ after completing a questionnaire about the project.” (This is not how I’d design such a tool.)
    • “This frees up capacity for our reviewers allowing them to prioritize their expertise on content that's more likely to violate...”
    • “...that scrutiny regularly finds issues the company should have taken more seriously...”
  • Iain Mulholland writes about Google Cloud Security from the perspective of their CISO's team.
  • An academic paper, ACSE-Eval: Can LLMs threat model real-world cloud infrastructure? asks the question and says:
    Our evaluations on ACSE-Eval demonstrate that GPT-4.1 and Gemini 2.5 Pro excel at threat identification, with Gemini 2.5 Pro performing optimally in 0-shot scenarios and GPT-4.1 showing superior results in few-shot settings. While GPT-4.1 maintains a slight overall performance advantage, Claude 3.7 Sonnet generates the most semantically sophisticated threat models but struggles with threat categorization and generalization. To promote reproducibility and advance research in automated cybersecurity threat analysis, we open-source our dataset, evaluation metrics, and methodologies.
    This is the sort of work that’s needed to understand how LLMs can and can’t help us threat model, and it likely represents a person-year or more of effort.
  • Erik Hollnagel’s Safety-I and Safety-II: The Past and Future of Safety Management came across my feeds. It’s an older deck, but it checks out, and has stood the test of time. One standout comment: “Efficiency-Thoroughness Trade-Offs are made by all professions and can be found on all levels of an organisation – from top management to daily operations.” Understanding and respecting these is a crucial part of designing effective threat modeling processes.

Appsec

  • Apple has new(?) documentation about using extensions to sandbox risky functionality in Creating extensions with enhanced security.
  • Baxbench is a new benchmark, focused on “evaluating LLMs on secure and correct code generation, showing that even flagship LLMs are not ready for coding automation, frequently generating insecure or incorrect code.”

Books received

Attacking and Exploiting Modern Web Applications and A Threat Centric Approach to Vulnerabilities Leveraging LLM and Predictions

Shostack + Associates updates

Image by Midjourney: “a photograph of a robot, sitting at a desk in a canadian library, working on a jigsaw puzzle. The robot is spotlighted by light streaming in through a small window, through which you can see people on the lawn enjoying summer. on the wall behind the robot is a canadian flag” What can I say, Midjourney has real trouble with Canadian flags. Sorry!