Ask HN: How to explain to execs why gen AI hasn't 10x'd feature dev
Our senior eng team has found that gen AI tools like GPT and Claude haven't significantly reduced the time required to develop features in our system. We're having open, good-faith conversations with product and exec teams about the reasons behind this, and we'd love insights on how to effectively explain the limitations and nuances of using gen AI in engineering.
Just be clear about the current technology, in language a 5-year-old can follow. Use short sentences and connect the dots.
1. "AI" here is business jargon for large language models (LLMs).
2. LLMs are predictive models that operate on plain text.
3. LLMs are not more accurate or creative than the material they are fed.
4. LLMs can provide excellent documentation for previously encountered problems, but they cannot provide original solutions to new problems.
5. A more effective means of cost reduction is to reduce or eliminate regressions. This provides the same benefit as LLMs without sacrificing creativity.
6. This is an opportunity for executives to reduce expenses and simultaneously increase product quality dramatically. Lead the developers to increase test-automation coverage and execution speed (a minimal sketch follows this list).
7. The alternative for the developers is replacement by LLMs. LLMs cannot replace people, but they cost so much less that they make up the difference if product quality remains marginal or maintenance costs remain high.
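To make point 6 concrete, "increase test-automation coverage" at its most basic means regression tests that pin down past failures. A minimal pytest-style sketch; the billing module and apply_discount function are hypothetical stand-ins for your own code:

    # Minimal regression test: pin down behavior that once broke
    # so it can never silently break again.
    from billing import apply_discount  # hypothetical module under test

    def test_discount_never_goes_negative():
        # Regression: a 110% discount once produced a negative total.
        assert apply_discount(total_cents=1000, percent=110) == 0

    def test_zero_discount_is_identity():
        assert apply_discount(total_cents=1000, percent=0) == 1000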
That is how you do it: short sentences, numbers to explain the finances, and playing devil's advocate.
Why is the onus on you? Why don't the execs explain why it _would_ 10x the efficiency of their feature factory, if they're so smart?
The obvious answer is that generating code is not the hard part of building products.
This is a classic example of an org-politics power play and a blame game. Programmers are not skilled in this kind of internal corporate warfare.
I would look at the claim that it would 10x feature dev. Where did that claim come from? Afaik, OpenAI and Anthropic don't make that claim.
Asking where it came from will at least provide an assessment of the actual openness and good faith of the discussion.
Because willingness to question one’s assumptions is good faith’s hallmark.
Which also means the OP should not be dogmatic that AI cannot speed features up 10x.
In any case changes to process don’t come for free. Faster features of the same quality may mean faster testing, faster communication with customers, faster analysis, faster marketing updates, etc.
Increasing the speed at which you throw shit at a wall does not change the fact that you are throwing shit at a wall.
It can 10x writing code and debugging, but neither of those is the bulk of a software engineer's day-to-day.
You still need to do code reviews, especially on the AI's output. Its code is fairly bug-free, arguably more so than human code, but you still need to explain the product. v0 doesn't quite replace a good product designer.
It's not yet at the level where it can do architectural work, and it doesn't understand the scope or goals of products. It doesn't understand roadmaps. It can't plan the code around where it needs to be in a year. You need a proper architect to do this better.
The average code it's trained on is from 2019 or so. Newer models have people writing fresh data for them, but most of that data is not production data. So you're likely getting an old design too, and it tends to recommend those until encouraged otherwise.
Also, if you're not using the newer tools like Cursor, Aider, or Windsurf, a lot of AI's contribution is better test coverage. The value of the "agent" tools is that they will write and edit code across multiple files, and they save you the trouble of explaining context, since you can just share the source code.
The biggest cause, in my experience, is ineffective reasoning over a large context: ChatGPT breaks down when it needs to consider more than ~350 lines, and its performance is sloppy before that.
To get solid performance out of it, I essentially need to specify the important areas and changes as well as the desired approach.
That being said, I've found it has cut my development time for most features by at least half, even in a large codebase, so I'd be curious to know how you're approaching it currently.
A few examples where AI has been extremely helpful for work-related things:
- writing complex SQL queries for throw-away code, i.e. needing to create a few charts from BigQuery
- writing internal front-end tooling
- polishing any internal messages
- helping design docs with clarity
- helping unblock highly technical areas where I haven't had any experience
- helping learn the SDK from new vendors
- helping read API documentation, i.e. does the Instagram API allow for ABC
Areas where it's been mildly helpful:
- helping to write verbose methods that I'm too lazy to write, e.g. a custom date parser that takes a datetime and outputs some custom text according to a spec (see the sketch after this list)
Areas where it hasn't been that helpful:
- building features e2e. As you mentioned, the context-window issue is really the crux here.
- building simple code like a CRUD API. Our system, like all others, has its own idioms, and spinning up a new CRUD API is easy enough. Getting an LLM to write it is doable, but by the time I've edited things to make it consistent with the rest of our system, I've lost the benefit.
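Here's a minimal sketch of the kind of verbose-but-mechanical method I mean; the spec, names, and formats are all hypothetical:

    from datetime import datetime, timezone
    from typing import Optional

    def humanize_timestamp(dt: datetime, now: Optional[datetime] = None) -> str:
        """Format a datetime per a (hypothetical) display spec:
        'just now' under a minute, 'N min ago' under an hour,
        'today at HH:MM' for the same day, otherwise 'Mon D, YYYY'."""
        # assumes dt and now are both timezone-aware
        now = now or datetime.now(timezone.utc)
        seconds = (now - dt).total_seconds()
        if seconds < 60:
            return "just now"
        if seconds < 3600:
            return f"{int(seconds // 60)} min ago"
        if dt.date() == now.date():
            return dt.strftime("today at %H:%M")
        return f"{dt.strftime('%b')} {dt.day}, {dt.year}"

Tedious to write by hand, trivial to review: exactly the shape of task where the LLM earns its keep.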
It might be time to retire those idioms? A realistic outcome of AI-driven development is that code is no longer a long-term asset. Patterns like DRY are beneficial because humans have cognitive limits, and many humans over long timeframes need to maintain complex software systems. Maybe patterns like locality of reference are better than DRY for short-context-window AIs, and maybe we need to start looking at how we would manage many smaller bits of code, with lots of duplication, just as we attend to developer experience today with code analysis and declarative build and deploy pipelines.
If you HAD to deal with 1,000 individual functions - let’s say due to hardware and organizational limitations - how would you manage the obvious risks? Would that be net-net cheaper than a fleet of microservices and development teams?
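To make the tradeoff concrete, a rough sketch of "locality over DRY" (everything here is hypothetical): each handler duplicates the few lines of validation it needs, so a single file is a complete context for a short-window model.

    # DRY would move these checks into a shared helper, but then
    # understanding either handler requires pulling that helper
    # into the model's context too. Here each handler is self-contained.

    def create_invoice(payload: dict) -> dict:
        # validation duplicated on purpose: this function alone is the context
        if not payload.get("customer_id"):
            raise ValueError("customer_id is required")
        if payload.get("amount_cents", 0) <= 0:
            raise ValueError("amount_cents must be positive")
        return {"status": "created", **payload}

    def refund_invoice(payload: dict) -> dict:
        # same checks, duplicated rather than imported
        if not payload.get("customer_id"):
            raise ValueError("customer_id is required")
        if payload.get("amount_cents", 0) <= 0:
            raise ValueError("amount_cents must be positive")
        return {"status": "refunded", **payload}

The duplication is the point; the risk is the copies drifting apart, which is where the code analysis and automated checks mentioned above would have to pick up the slack.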
When you say ChatGPT, do you mean 4o, o1, o1 pro, or…?
I would look at why you're outsourcing this thinking to HN instead of putting together data and an understanding of your own environment.
Or just ask the LLMs to write it for you.
I tried Copilot and found the answer it gave unsatisfying. The best paragraph I can lift is: "While gen AI can assist with generating code snippets, documentation, and automating certain tasks, it may not be as effective in solving complex, domain-specific problems that require deep understanding and critical thinking." All accurate, but not really compelling to somebody who is under the spell of magic beans.
This famous essay
https://worrydream.com/refs/Brooks_1986_-_No_Silver_Bullet.p...
points out why 10x is a pipe dream with any technology. The core of the argument is that software development involves a number of steps: say, requirements gathering, design, coding, testing, debugging, integration, deployment, documentation, and so on.
It's not too crazy to suggest that these all require a similar amount of work, so each one is approximately 10% of the total [1]. If you had some breakthrough that reduced the coding time to zero, you've 1.1x'd your productivity, not 10x'd it. To 10x the whole thing, you really have to 10x every one of those steps!
[1] The argument wouldn't be too different if one of these was 30% of the work, and one certainly isn't.
ECEs have something called Amdahl's Law[1], which states something very similar.
[1]: https://en.wikipedia.org/wiki/Amdahl%27s_law
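The arithmetic is easy to check. A quick sketch in Python, assuming coding is 10% of total effort:

    # Amdahl's-law-style arithmetic: overall speedup is capped by
    # the fraction of the work you actually accelerate.
    def overall_speedup(fraction: float, factor: float) -> float:
        """Speedup when `fraction` of the work gets `factor`x faster."""
        return 1 / ((1 - fraction) + fraction / factor)

    # Coding is ~10% of the job and an AI makes it infinitely fast:
    print(overall_speedup(0.10, float("inf")))  # ~1.11, nowhere near 10
    # Even a 10x speedup on half the job:
    print(overall_speedup(0.50, 10))            # ~1.82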
Free answer from HN: Don't believe the hype.
I had a similar question that I posed to an ally in senior leadership. The answer, obviously I guess, was metrics. Assuming your teams didn’t just FAFO through it, you could reconstruct what the expected outcomes were vs actuals. And follow that with sound recommendations about where it could be used along with some thoughts about the future state (when the tools are faster or cheaper or whatever).
Another commenter had a great note: you should take this opportunity to advance your career, given you are in an excellent position to do so. It made me giggle, but then think that you really need to make sure your PoC was executed thoughtfully and seriously, that you understand what the SoTA is w.r.t. using assistants and their various modalities (chat, agent, copilot, etc.), and that despite your expertise it was a no-go. Because if you don't, somebody else who is taking it dead seriously is going to take that commenter's advice and demonstrate the value that you didn't, and that may reflect poorly on you. Execs are getting hype from their back channels and vendors like I have never seen in my career, and you are going to go against that. ($bigco perspective here.)
I just watched the latest Y Combinator AI promo piece on YouTube. Amongst a bunch of other claims, they say founders in the latest batch won't hire any engineers who don't use AI, because of the "force multiplier".
Then I come to check out Ask HN and see this as top post.
Because it can't. It can beat Stack Overflow, but that's because Stack Overflow sucks and hasn't had a serious competitor.
If somebody else seems to be getting a 10x speed-up, they got lucky with a simple problem, are lying (want to make it big as an AI influencer), or are delusional.
Could some product come out next year that's better? Maybe. Right now it's not productive to hunt for some "nuance" that will get you to 10x.
Why ask HN? Have you actually tried using the tech? Then you should have the answers. There are lots of different types of software development and in some areas gen AI will be extremely useful, but a net negative in others. Only you can answer the questions about your environment.
> Why ask HN?
Because HN is a community of smart people. I'm sure these discussions are happening at most software companies, so I was just curious how others are dealing with it.
> Have you actually tried using the tech? Then you should have the answers. There are lots of different types of software development and in some areas gen AI will be extremely useful, but a net negative in others. Only you can answer the questions about your environment.
yes, daily. :) here's my rough answer to my own question:
- As another commenter mentioned, context windows aren't large enough to capture every detail of our system.
- Coding is actually the easy part of the job. Generally speaking, once we know the details of the task (i.e. build table X with N fields, build a service that does XYZ, etc.), the coding portion doesn't take a lot of time. So outsourcing that to Claude/GPT/etc. saves something, but it's not a game changer.
- Team alignment is hard, and figuring out what to build is hard.
- We lose eng time on the unknowns that gen AI doesn't really help with: debugging an issue with a vendor's SDK, identifying the cause of a race condition, figuring out how to mitigate a bot issue, etc.
Do you really need to explain it? Why are other teams picking your tools?
"It hasn't for the same reason it hasn't made you a 10x exec".
Why not just let them fuck around and find out?
AI is like hiring junior developers. Do they have any evidence that has worked before? Why would they expect an equivalent to work?
How did they come to this belief in the 10x AI developer? Get them to question their base assumptions; ask them to justify their expectations.
you would be wise to in fact do the exact opposite. claim it does more than 10x. ask to spearhead the AI transformation of your company. promise huge cost savings. make your bag. and who knows, maybe you'll actually deliver on those promises as a side effect. and what if not? remember, the senior eng team is there as a convenient scapegoat ...
@iExploder has been a consultant in a previous life - this is a more realistic plan than it sounds
Because writing prompts is hard.
Most AI coding tools can generate a Todo app from a small prompt. This is because that problem is well understood.
When you try to use AI coding tools on your own projects you need to start writing a prompt that teaches the AI about your current architecture and decisions.
So the initial prompt is large.
Often the task needs knowledge of other files in your project. You can add them by hand or some AI tools will search the code base.
The prompt is now huge.
When you run that prompt you may or may not get what you expected.
So now the issue is how much time do you spend getting the prompt correct vs just writing the code yourself.
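As a rough illustration of why (paths, names, and the task are all hypothetical), here's what that context assembly looks like when you script it; the prompt balloons long before the model sees the actual ask:

    from pathlib import Path

    # Files the task touches. In a real repo, an agent tool would
    # discover these by searching the codebase; here they're hardcoded.
    CONTEXT_FILES = [
        "src/models/order.py",
        "src/services/billing.py",
        "src/api/handlers.py",
    ]

    TASK = "Add a discount_code field to orders and apply it at checkout."

    def build_prompt(repo_root: str) -> str:
        parts = [
            "You are working in our codebase. Architecture notes:",
            "- services communicate over an internal event bus",
            "- all money amounts are integer cents",
            "",
        ]
        for rel in CONTEXT_FILES:
            source = Path(repo_root, rel).read_text()
            parts += [f"--- {rel} ---", source, ""]
        parts += ["Task:", TASK]
        return "\n".join(parts)

    prompt = build_prompt(".")
    # A few medium-sized files can exceed many models' usable context.
    print(f"prompt is {len(prompt):,} chars (~{len(prompt) // 4:,} tokens)")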
This area is brand new and there are very few resources on how to use AI coding tools effectively.
I have yet to see one demonstration of effective AI coding tool use on a project of reasonable complexity.