Working with Systems Smarter Than You
Some opinions on what the future might look like for jobs, AI, and software engineering.
As of early 2025, my typical workday involves talking more with Large Language Models (LLMs, e.g. Sonnet 3.7 and o1) than with people, and typing more prompts than code. Spending so much time with these models and building products with them, I’m under no illusion about just how “stupid” they can be, and yet I feel pretty confident that they will eventually supersede nearly everything I do today behind a keyboard.
Despite the crypto-scam-level hype and marketing behind many AI products, I genuinely believe these models will continue to improve rapidly — so it’s crucial to consider what it might mean to collaborate closely with systems potentially smarter than you or me.
We often see comparisons between this 2020s boom of AI technologies and past revolutionary shifts such as electricity, the industrial revolution, calculators, computers, and the internet. However, this current wave feels distinctly different and considerably more unpredictable. The framework of “person did role X, X is automated so now they do role Y” doesn’t really work for AI. For quite a few of the intuitive (X, Y) pairs, AI might be better at both. A lot of folks also miss that X isn’t defined by how the work is done, but by its outcomes. AI won’t attend your sprint meetings, collaboratively whiteboard, or click through an IDE — it will simply build the product.
In this post, I want to talk through some thoughts on what it might mean to work in a world of “superintelligent” systems. It’s a bit of a speculative part two to my more practical guide on AI-powered Software Engineering. It’s not necessarily what I want to happen, but what I think will happen. I focus on Software Engineering (SWE), but feel free to copy-paste this post into ChatGPT to rewrite an analogous version for a different field.
Hear me out — AI coding is here to stay.
There’s an increasingly large polarization between engineers who think AI makes everyone a SWE3 overnight and those who think it’s mostly hype that pollutes codebases and cripples junior engineers.
No matter how high the SWE-bench scores climb, just looking at the top-voted “AI”-related posts on r/programming and r/webdev, you’d get a strong impression that it’s the latter: “most” engineers still don’t see any value here, or would even call it a heavily negative development. I won’t post a full manifesto, but since nearly half of my recent followers came from my all-LLM-code-is-dangerous post, I’ll briefly rehash some of my thoughts and some common misconceptions.
If your prompts result in bad or useless code, it’s often a skill issue.
A common misconception is that because these models are trained on a lot of bad code, they will write bad code. A better mental model is that they can produce code at every level of engineering skill and convention (good and bad), and for now it’s up to your instructions to set the right defaults.2 (A minimal sketch of what that looks like follows this list.)
You might not be using Sonnet 3.x, which is still comfortably in a league of its own.3
You might be a victim of the IKEA effect when it comes to hand-written code.
Anecdote: I write a lot of code, with my recent projects4 being mostly AI-generated. There’s no going back.
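To make “set the right defaults” concrete, here’s a minimal sketch of steering a model’s coding conventions through a system prompt. It assumes the Anthropic Python SDK and an API key in the environment; the model alias and the specific conventions are placeholder assumptions, not recommendations.

```python
# Minimal sketch: steer the model's "defaults" with explicit conventions.
# Assumes the Anthropic Python SDK and ANTHROPIC_API_KEY in the environment;
# the model alias and conventions below are illustrative placeholders.
import anthropic

CONVENTIONS = """
You are a senior engineer on this codebase. When writing code:
- Python 3.11+, fully type-hinted, no bare `except`.
- Prefer small pure functions; avoid global state.
- Validate all external input; never interpolate user data into SQL.
- Include a short docstring and a usage example for anything non-trivial.
"""

client = anthropic.Anthropic()

def generate_code(task: str) -> str:
    """Ask the model for code with our conventions baked in as the system prompt."""
    response = client.messages.create(
        model="claude-3-7-sonnet-latest",  # placeholder model alias
        max_tokens=2000,
        system=CONVENTIONS,
        messages=[{"role": "user", "content": task}],
    )
    return response.content[0].text

print(generate_code("Write a helper that parses a CSV of user signups into dataclasses."))
```

The same idea applies whether the conventions live in a system prompt, a repo-level rules file, or an agent’s scaffolding; the point is that the defaults are yours to set.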
Insecure code will be a problem, but with its own solutions.
A large percentage of the code I see posted on social media has security issues, and this is exacerbated by a false confidence in using AI as a complete abstraction for software development — this is a known issue that we’ll have to work around.
In the near future, models will be measured on their ability to write secure code, driving providers to invest in secure-by-default outputs. I expect to see security benchmarks that OpenAI, Anthropic, and others compete on.
For sensitive applications, we might put the security burden on the LLM itself by having other models act as reviewers, forcing them to follow the rules of safety-critical code, and/or running automated theorem proving (a rough sketch of the reviewer loop follows this list).
Backdoor issues like BadSeek will be handled through a mix of cautious trust and ensembling.
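Here’s a rough sketch of what “other models acting as reviewers” could look like in practice. The `call_model` helper, the prompts, and the retry budget are all hypothetical assumptions, not a vetted security process; ensembling several independent reviewers (the cautious-trust answer to BadSeek-style backdoors) is the same loop with multiple reviewer models and a vote.

```python
# Sketch of an LLM-as-security-reviewer loop. `call_model` is a hypothetical
# helper standing in for whatever LLM API you use; the prompts and retry
# budget are illustrative, not a vetted security process.
from typing import Callable

def generate_with_security_review(
    task: str,
    call_model: Callable[[str], str],
    max_attempts: int = 3,
) -> str:
    code = call_model(f"Write the code for this task:\n{task}")
    for _ in range(max_attempts):
        review = call_model(
            "You are a security reviewer. List any injection, auth, or "
            "secrets-handling issues in this code. Reply 'OK' if none.\n\n" + code
        )
        if review.strip().upper().startswith("OK"):
            return code  # the reviewer found nothing to flag
        # Feed the findings back and regenerate.
        code = call_model(
            "Fix these security issues and return the full corrected code:\n"
            f"{review}\n\nOriginal code:\n{code}"
        )
    raise RuntimeError("Code still failed security review after retries.")
```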
We haven’t yet figured out the ideal UX for AI-assisted coding.
It’s fairly common for folks to conflate limitations of the model (the underlying LLM) with limitations of the tool (e.g. GitHub Copilot). In many ways these wrappers are still catching up, and I expect that even if we froze the model for a few years, these AI IDEs would continue to improve.
LLM intelligence doesn’t fully align with human intelligence. They will probably still make token counting mistakes while they simultaneously become “superintelligent”. This is also why it’s hard to ever say they will “replace a job” because fundamentally they will be doing a slightly different job optimized for where they are reliable and what they can output.
The current chat-on-an-IDE interface (Copilot, Cursor, v0, etc.) is clunky and an artifact of our transition from dumb LLMs to smart LLMs. It gives the impression that the AI will write your code for you, while in practice dumping a large amount of code that you must rely on your own expertise to review. I expect that as these models evolve and codebases adapt, these AI IDEs will no longer look like IDEs.5
The models will continue to get better.
“Scaling laws” are sort of a thing. We have several dimensions (pre-training, alignment training, test-time, etc.) at which to keep throwing more data, rewards, and compute to upgrade the models. At the end of the day, even if one part hits a wall, brute-forcing a problem by re-running a model N=1000 times in parallel or in some contrived scaffolding will likely yield intelligent-looking results, even if it’s just an ensemble of dumb LLMs under the hood (see the toy sketch after this list).
I think there’s also simply too much investment at this point for this to fail — especially in the case of AI for engineering. Several billion dollars invested to build a reliable AI SWE is worth it, and the model trainers and the top ML researchers who work for them know it.
If you think we’ve hit a wall and it’s just hype, you should bet against me on it.6
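As a toy illustration of the brute-force point, here’s a sketch of best-of-N sampling with a cheap verifier (e.g. running the test suite against each candidate). The helper callables are hypothetical stand-ins for whatever model API and test runner you use.

```python
# Toy sketch of "re-run the model N times and keep what passes": sample many
# candidate solutions in parallel and return the first one a cheap verifier
# accepts. `sample` and `passes_tests` are hypothetical stand-ins.
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Optional

def best_of_n(
    task: str,
    sample: Callable[[str], str],         # one (possibly dumb) LLM call
    passes_tests: Callable[[str], bool],  # cheap verifier, e.g. run the test suite
    n: int = 1000,
) -> Optional[str]:
    """Sample n candidates in parallel; return the first that passes the verifier."""
    with ThreadPoolExecutor(max_workers=32) as pool:
        futures = [pool.submit(sample, task) for _ in range(n)]
        for future in as_completed(futures):
            candidate = future.result()
            if passes_tests(candidate):
                return candidate  # looks "intelligent", even if each sample isn't
    return None
```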
A “Software Engineer” in 2035
The world moves much more slowly than your hyper-growth tech startup, and I expect that even after we achieve LLMs capable of superintelligent software engineering, it will take time for:
1. AI tools to fully integrate the new model (from the model to the wrapper layer)
2. Companies to realize AI-driven development is a competitive advantage
3. Companies and their engineering teams to restructure and adopt it
4. Companies to learn how to effectively use and scale with it
Startups that have embraced AI might already be at stage 3.5, but for large non-SaaS companies this could take years (bottlenecked by legacy, human-centric processes). The predictions below assume a world where most organizations have reached stage 4, which may take 10–30 years.7
Post-AI Jobs
It’s difficult to predict what jobs will look like in fields like software engineering. You might initially think “oh, since AI is writing the code, humans will shift to reviewing,” but AI will probably be better at reviewing too. “Oh, then architecting the system” → AI can do that better as well. “Understanding the market and what products to build” → AI might just be better at that too.

Playing this out, I'm thinking there are four types of jobs left:
Owners
AI can’t “own” things; in the foreseeable future I see the root of companies still being human-managed. Owners seed the personality and vision of an organization while bearing legal and ethical accountability.
This is your former CEO, CTO, etc., much of whose decision-making and delegation will be done via AI-based systems. They instead take a more board-like role, aligning the automated decision-making with their personal bets and strategy.
Mandated Positions
Many positions will exist simply because they are required to.
This includes union-mandated personnel, legal representatives, compliance officers, ethical auditors, and human-in-the-loop evaluators. I expect this list to grow as AI wrappers begin to go after non-software and customer-support roles.
High Stakes Adapters (Critical Roles)
Most roles fall into an AI-augmented limbo: while the model and wrapper applications may be capable of most things, there’s a notable gap between the outcomes of AI alone vs. AI plus a human adapter. Adapter roles consistently shrink over time, at varying rates depending on the requirements of a given role and the size of its market.
High-stakes adapters are required to maintain full competence in the role that AI is replacing and be able to perform the full task “offline”. This is the airline pilot of post-AI roles — automated systems can mostly fly the plane, but if something goes wrong a highly trained human operator needs to be there to take over. Existing highly skilled SWEs might transition into these roles, where the ability to code is still highly valued.
The semantic difference between a high-stakes adapter and a mandated position is that these positions are rationally required for the measurable safety or efficacy of the system.
Low-Stakes Adapters (Flexible Roles)
For less safety-critical roles, and ones where there’s substantial incentive to trade away full human oversight in favor of AI-powered scalability and faster iteration, you’ll have low-stakes adapters. These roles adapt rapidly to fill the gaps between what AI can do on its own and the outcomes desired for a specific role. Often two or more different low-stakes adapter roles merge into the responsibilities of a single individual.
Most software engineers (excluding e.g. those on high-stakes critical systems/libraries) fall into this category. Over time there will be less and less incentive and need to maintain full competence in the underlying task (i.e. they’ll get worse at writing and reviewing code and better at other, traditionally non-engineering things).
Software Engineers vs “SaaS Adapters”
Traditional Software Engineer (SWE)
Primarily writes code in an IDE
Deep expertise in specific programming languages and frameworks (React, Python, Java, etc.)
Task-focused workflow (e.g., creates and resolves engineering tickets, attends sprints, primarily collaborates with human teammates)
SaaS Adapter (~AI-Augmented Engineer)
Primarily acts as an AI communicator and orchestrates AI-driven development
Operates at a higher abstraction level; coding is continuously pushed further from daily responsibilities
Increasingly merges or expands responsibilities into adjacent roles (Product Management, Customer Success, UX, etc.)
Core Skills in Common
Primarily measured by outcomes rather than implementation specifics
Highly values critical thinking, creativity, and problem-solving
Regularly handles and debugs unexpected issues, leveraging increasing AI assistance
While the ability to write code is likely to become a less valuable skill, this is very different from saying anyone can be a software engineer or successful in the adapter version of the role. There’s a ton of skill and differentiation that will still exist between the average person and the successful engineer turned SaaS adapter — it just won’t be measured by LeetCode score, and there will be cases where a person worse at coding is a better fit for these roles.
I expect that while the number of jobs titled “Software Engineer” will steadily decrease over time from here on out, the total number of individuals employed in software will increase (as an application of Jevons paradox).8
Rewarded (But Uncomfortable) Skills
The next big question is how you prepare for this, and while I don’t know (and no one else really does), I have some ideas about which skills will do well. I explicitly titled these “uncomfortable” because the mainstream advice of “learn to use ChatGPT more in your daily workflows” only has so much alpha, and the notes below may better distinguish the types of individuals organizations will value during this transition period.
Using AI as not just an assistant but as a mentor and a manager.
Shifting from a mindset of “help me with this guided menial sub-task” to “here’s the outcome I want, what do you suggest?”.
This was probably one of my most unnerving realizations when I began to use o1 and the various deep research tools for ad hoc life decision-making. There are often times when it’s able to convince me that how I planned to solve a problem was suboptimal compared to its other suggestions.
There’s an incredibly large set of ethical and safety considerations when using AI to make important decisions — developing the skill to interrogate these AI decisions, judge overconfidence, and verify “hallucinations” is and will continue to be critical.
Thriving in a world of breadth and rapid change.
Most roles fall into what I deemed an “adapter” role, meaning the day-to-day will increasingly change to fill the gaps in AI capabilities. An engineer will be expected to do work that was traditionally not expected of an engineer, and the skill bar and breadth of any given role will continuously increase.
Success will mean letting go of a specific role identity (“I am an X, this is what I’m good at and the only thing I want/can do”) and working without a “career script”.
Becoming comfortable automating (parts of) your own role.
The low-stakes adapter roles are fundamentally in a consistently vulnerable position, often doing work that will eventually be filled by AI-driven solutions. The dilemma is that, in a profit-seeking organization, the short-term incentives (compensation, prestige, etc.) will go to those who aid this transition most effectively.
Conclusions & Advice
The next few decades of software engineering will likely be shaped by rapidly advancing AI — though exactly how remains uncertain. The many speculative predictions I made are predicated on three beliefs, and it’s totally possible any of these (or an implicit belief) could be completely wrong.
AI models will continue rapidly improving (but no singularity).
We'll maintain broadly the same economic model as today.
Society, overall, will prefer the perceived benefits of widespread AI integration despite its drawbacks.
If I’m at least directionally correct, here’s my advice for new grads who are likely experiencing the most near-term uncertainty:
There’s a fine but important line between leaning on AI-augmentation and obliterating your critical thinking skills. While the challenging part of your day-to-day may not be coding when you have Cursor, there should be at least something you do regularly that challenges you to think critically.
Learning to code strictly without AI tools (i.e. not learning to use them together) will reduce your chances of finding a job. It’s not a crutch nor is it cheating, but it will become an expectation.
Think of your career more laterally, placing greater value on skill diversity and domain breadth than what the career playbooks have suggested traditionally.
You still have plenty of time, and your CS degree is still valuable. The degree (hopefully) taught you not just coding but a way of solving problems. It’ll also, in the near term, retain its value as a symbolic badge for companies filtering for qualified applicants. I expect that recruiting teams will also take time to figure out how to adapt what they hire for, and in the near term will still look for traditional SWE skills.
See the “rewarded skills” section — get good at squeezing value from these assistants while knowing their limits. This comes from spending a lot of time with these models.
1. Thought it would be fun to include an AI-generated podcast for this post using NotebookLM. Note that there are some small inaccuracies that don’t completely align with the content of the post (I never explicitly mention a T-shaped professional, learning to code first without any AI aid, etc.), but overall not bad!
2. Chollet’s “How I think about LLM prompt engineering” is a pretty good way to think about prompt engineering from the theory side. All possible programs-as-prompts exist; your instructions are a query into this search space.
3. I will say, o3-mini is growing on me, but the fact that Sonnet has significantly lower latency (as a non-reasoning model) is a huge plus.
4. The irony of that LLM Backdoor post is that Sonnet itself wrote much of the PyTorch code to implement it. Lots of back and forth, but it still felt like I was mainly the idea guy.
5. I would go so far as to say that even viewing the code directly might eventually be considered a design anti-pattern for prompt-to-software tools.
6. The linked market is symbolic and not actually a great example of a bet on this topic. The underlying score threshold and benchmark might not align well with whether AI will be good enough to radically change the industry.
7. I already know folks are going to make fun of this for being too pessimistic, too optimistic, or too wide an interval. Feel free to comment what you believe is the right timeline for this (:
8. These roles may increasingly be global hires, creating an incentive for US companies to employ non-US low-stakes adapters who taught themselves into the new role and will work for a lower salary. This is already a thing in the tech job market, but I think it will be exaggerated by these changes.