Shostack + Friends Blog

 

AI will be the high interest credit card of 2023

I haven’t done a lot of work in Python, and I’ve never used it to produce graphs. But an hour of pair programming, followed by time with ChatGPT and GitHub Copilot, got me quite far in writing a set of Jupyter Notebooks and dramatically shrank the effort to use and debug a new tool. I wanted to record some thoughts on the experience, and what it means for programming and for application security.

The world’s most popular programming language has long been ... Excel. So-called “no code/low code” platforms like IFTTT, Zapier and more are hugely popular, and “software robots” that allow anyone to add layers of automation on top of extant software have brought a billion (with a B) dollars of revenue to UiPath. People want to take control of the systems that are in front of them, and fighting the syntax of a programming language is a barrier.

The strong temptation to distinguish these things from “real programming” misleads us. The reality is that even professional programmers routinely encounter scenarios beyond their expertise. Also, a new study by GitHub finds 92% of programmers are using AI tools, and 70% of them think it’s helping them code.

My own experience is that, if not for the LLM help, I’d likely have given up along the way. The faster cycles (compared to combing through Stack Exchange/Reddit threads) are meaningful, and frankly, those are going to be complex waters for appsec folks to navigate. I’m not, by any stretch, proud of the code, and much of what’s ok about it (parameterization) the tools didn’t really help with. But it runs. ChatGPT was fine as a ‘rubber duck debugging’ tool, even if it didn’t tell me that the problem was always indentation. (Who designs a programming language like that?!?)
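To make the indentation complaint concrete, here is a minimal sketch of the kind of bug that is easy to miss in Python. It is not code from the notebooks; the function names are hypothetical, chosen only for illustration. The two functions differ by a single level of indentation on one `return`, and both run without any error.

```python
def total_return_inside_loop(values):
    """Intended to sum the values, but the return is indented into the loop."""
    total = 0
    for v in values:
        total += v
        return total  # indented one level too far: exits on the first item
    return total  # only reached when values is empty


def total_return_after_loop(values):
    """Sums the values; the return sits at the function's indentation level."""
    total = 0
    for v in values:
        total += v
    return total


print(total_return_inside_loop([1, 2, 3]))  # 1
print(total_return_after_loop([1, 2, 3]))   # 6
```

Because the interpreter accepts both versions, this sort of mistake shows up only as a wrong answer, which is part of why it is hard to spot by reading alone.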

These new ways of writing code require new structures to help us program and engineer systems. Much like compilers took developers away from writing machine code and let us focus on algorithms, AI assistance will take us further from the machine and what it’s doing. That will result in code that’s more bloated and slower, and in many cases, that’ll be ok. The added abstraction will help us do more.

In many ways, it’ll be like how NPM pulls in modules we’ve never heard of, with functionality we don’t understand and vulns we can’t stay on top of. But developers will get to first functionality faster, they’ll ship faster, and that will lead to technical debt accumulating ever faster. Like credit card debt, you can get shiny new things, and then find yourself unable to pay the cost of owning or maintaining them.

One engineering challenge is how to exploit the new tools without paying that cost. The survey mentions people checking in code they don’t understand(!?) and getting caught in code review. The issue of vulns in the code is well documented; when I search on Pandas (one of the libraries I’m using) I get things like Lesson 3: Data analysis with Pandas rather than something about how to use Pandas safely.

All in all, these models are not just coming fast, they are already here. Understanding what that means involves both thinking about things systematically and drawing on experimentation and experience. If you haven’t tried the tools yet, you’re missing out.

The title is an intentional reference to Machine Learning: The High Interest Credit Card of Technical Debt; the image is by Midjourney, with a prompt “:a human programmer, sitting in front of many monitors filled with software code, with a swarm of fairies pointing things out and making changes. --ar 8:3” (v5 was a big shift in this one).