Why most AI implementation projects fail.
Most AI projects fail not because the technology doesn't work, but because they chase features instead of solving actual problems.
I've watched dozens of AI implementation projects over the last two years. Most fail quietly.
They fail not because the technology doesn't work. It does. But they fail because they're built on the wrong assumption: that the goal is features, not change.
Here's the pattern I see:
A company decides to “implement AI.” They hire a consultant or vendor. There's a discovery phase, promises are made, a budget is set. Someone builds a system — an AI agent that processes invoices, or a chatbot that handles support emails, or a dashboard that synthesizes reports. The system works technically. It does what it's supposed to do.
But the team doesn't use it. Or they use it for three weeks and stop. Or they use it incorrectly, defeating the purpose. The real workflow doesn't change.
Six months later, it's abandoned. The company writes it off as “AI wasn't ready yet” or “we didn't have the data” or “our business is too complex.” But that's not what happened. What happened is they built a feature instead of solving a problem.
The difference between features and problems
A feature is something you build and hand over. A problem is something you solve by changing how people work.
When someone asks me “can you build an AI system that processes our intake forms,” I first ask a different question: “How are you processing them now, how long does it take, and what's the actual cost of doing it?”
Usually they say “it takes Sarah four hours a day, and she's probably spending half her time on it.”
That's the problem. Not “we need an AI intake processor.” The problem is “we're paying someone to do something repetitive instead of having them focus on what matters.”
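To make that concrete, it helps to put a rough price on the problem before talking about any system. A minimal back-of-envelope sketch, with hypothetical numbers standing in for Sarah's hours and her fully loaded hourly cost:

```python
# Rough annual cost of a repetitive manual task.
# All numbers are hypothetical placeholders, not figures from a real engagement.
hours_per_day = 4            # time spent on intake forms each day
working_days_per_year = 230  # rough working year
hourly_cost = 30             # fully loaded cost per hour, in GBP

annual_hours = hours_per_day * working_days_per_year
annual_cost = annual_hours * hourly_cost

print(f"{annual_hours} hours a year, roughly £{annual_cost:,}")
# -> 920 hours a year, roughly £27,600
```

The point isn't precision. The point is that the problem has a price, and that price, not the feature list, is what the implementation has to beat.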
The feature might be elegant. The problem-solving might be messy.
A feature might be a sleek dashboard that scores leads. Problem-solving might be: “Your sales team is wasting time qualifying bad leads because they don't have context. Let's build a scoring agent that uses your actual conversion history to flag the ones worth calling.”
Same technology. Different framing. One is “we built this thing.” The other is “your team's time freed up.”
Only the second one actually sticks.
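To show how thin the technology can be when the framing is right, here's a minimal sketch of that scoring agent. Everything in it is a stand-in: the segment-level conversion rates, the “worth calling” threshold, and the lead fields are hypothetical placeholders for whatever your actual CRM and conversion history contain.

```python
# Minimal lead-scoring sketch: flag leads whose segment has historically converted.
# Conversion rates, threshold, and lead fields are hypothetical placeholders.
from dataclasses import dataclass

# Historical conversion rate by (industry, size) segment, derived from past deals.
# In practice this comes from your own CRM export, not a hard-coded dict.
CONVERSION_HISTORY = {
    ("manufacturing", "small"): 0.22,
    ("manufacturing", "mid"): 0.31,
    ("retail", "small"): 0.08,
    ("retail", "mid"): 0.12,
}

WORTH_CALLING_THRESHOLD = 0.20  # hypothetical cut-off


@dataclass
class Lead:
    name: str
    industry: str
    size: str


def score(lead: Lead) -> float:
    """Score a lead by its segment's historical conversion rate (0 if unseen)."""
    return CONVERSION_HISTORY.get((lead.industry, lead.size), 0.0)


def worth_calling(leads: list[Lead]) -> list[tuple[Lead, float]]:
    """Return only the leads above the threshold, best first."""
    scored = [(lead, score(lead)) for lead in leads]
    flagged = [(lead, s) for lead, s in scored if s >= WORTH_CALLING_THRESHOLD]
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    inbox = [
        Lead("Acme Tooling", "manufacturing", "mid"),
        Lead("Corner Shop Ltd", "retail", "small"),
    ]
    for lead, s in worth_calling(inbox):
        print(f"Call {lead.name}: segment converts at {s:.0%}")
```

The scoring logic is deliberately trivial. The value isn't the model; it's that the score is anchored in your own conversion history and that the flag lands wherever the sales team already works, not behind a new dashboard.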
Why it matters
The difference shows up in three places:
In setup: If you're solving a problem, you start by understanding the actual workflow — not the org chart, not what people say they do, but what they actually do. How many minutes per day? Which systems does Sarah have to log into? What's the worst-case scenario? What's the edge case that trips everyone up?
This is messier than feature-building. It means interviews and observation and saying “I don't understand, walk me through it again.”
In build: If you're solving a problem, you optimize for adoption, not elegance. That might mean the system lives where people already work — in their email, their WhatsApp, their existing CRM — instead of behind a new dashboard. It might mean starting small (automate 30% of the workflow) instead of trying to handle every edge case at once.
In maintenance: If you're solving a problem, you iterate based on what people actually do after launch, not what you guessed before. Sarah uses the system differently than you expected. That's not a bug in the system — it's a signal that the workflow needs adjustment. Good implementation means tuning based on reality, not defending the original design.
What actually works
The AI implementations I've seen stick share three things:
First: they're scoped around actual workflow, not technology. “We're going to automate the intake process” instead of “we're going to use this AI vendor's platform.” Problem first, technology second.
Second: there's a test phase before commitment. You don't sign a six-month contract. You build something small, use it for two weeks, and then decide if it's worth expanding. This removes risk and surfaces whether it'll actually be adopted.
Third: maintenance is built in from the start. A system that works on day one might need tuning on day 15. Most projects treat launch as the end. Good implementations treat launch as the beginning. “We built this. Now let's make it actually work in your business.”
The realistic timeline
This means good implementation is slower than feature-building.
Discovery: one to two weeks.
Test build: three to four weeks.
Try it: two weeks, in your business, with your people.
Decide: should we keep going?
If yes, tune and expand. If no, you keep what we built and we part cleanly.
Four to six weeks from start to first meaningful test. That sounds slow if you're thinking in terms of “features shipped.” It's fast if you're thinking in terms of changing how we work.
Why this matters for operators
If you're evaluating an AI implementation, ask these questions.
Before you start: What workflow are we actually automating? Listen for specificity. “Invoicing” is vague. “Sarah spends 45 minutes every morning manually matching invoices to POs in the accounting system” is real. What does success look like? Not “we have an AI system.” Real: “Sarah spends 10 minutes on this, not 45.” Are we trying something small first, or committing six months to a big build? Small first. Always.
After launch: Are people actually using this? Not on day one — on day 30. What's different about how it works vs. how we expected? Differences are normal; they're the signal. What should we tune? And is tuning included, or are you paying per change?
If a consultant or vendor can't answer these clearly, you're probably buying features instead of solving problems.
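One footnote on the “are people actually using it on day 30” question: adoption is measurable if the system logs its own usage. A minimal sketch, assuming you can export a list of timestamped usage events (the ISO-timestamp format here is an assumption, not a feature of any particular tool):

```python
# Count weekly usage from a simple event log to see whether adoption holds.
# The event format (ISO timestamps) is a hypothetical assumption.
from collections import Counter
from datetime import datetime

events = [
    "2025-03-03T09:12:00", "2025-03-04T09:05:00", "2025-03-05T09:20:00",
    "2025-03-11T10:02:00",
]  # weeks three and four conspicuously absent

by_week = Counter(datetime.fromisoformat(ts).isocalendar().week for ts in events)
for week, count in sorted(by_week.items()):
    print(f"week {week}: {count} use(s)")

# A curve that starts strong and goes quiet by week three is the
# "used it for three weeks and stopped" pattern from the top of this piece.
```

The specifics don't matter; what matters is that “are people using it” becomes a number you look at on day 30, not an impression.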
One more thing
Most failed AI projects fail silently because nobody talks about them. Companies don't want to admit they spent £50K on something that nobody used. Vendors don't advertise their failures. Consultants move on to the next project.
But if you've got a workflow that's tedious and repetitive and you've been meaning to automate it for two years, that's the signal. There's probably a good implementation waiting there, buried under a bad one.
The work that sticks starts with clarity about the actual problem, not the latest technology. It moves slowly through a test phase. It commits to iteration and tuning.
It's slower. It's messier. It actually works.
Want to explore whether AI implementation makes sense for your business? Begin a correspondence.