IBM Granite 3.0 Models Point The Way To Enterprise AI At Scale

[Image: IBM's new Granite 3.0 AI models reflect its particular approach to enterprise AI for itself and its customers, drawing on the diversity of the IBM technology portfolio and the strength of IBM Consulting. Credit: Getty]

IBM is doing something different in enterprise AI. The company is making a full-stack play that builds on the twin pillars of its corporate strategy since 2020—AI and hybrid cloud—and that draws on Big Blue’s strengths across its portfolio. More importantly, its approach bucks the general trend of doing primarily AI proofs-of-concept, focusing instead on solving for very specific business use cases and doing it much more efficiently. It achieves this by leveraging fit-for-purpose AI foundation models built and tuned on in-house enterprise data rather than giant LLMs based on all the public data in the world.

Over the past couple of weeks, IBM has announced new Granite 3.0 foundation models, which are fundamental to this strategy, and hosted a big analyst event in New York to explain in more detail what it’s doing and how well it’s working—in the field, for real-life use cases, delivering actual measurable results. I’ve also had one-on-one conversations with IBM executives to hone my understanding of what they’re doing. Think of this as building on the year-in-review of IBM’s AI strategy that I wrote in early 2024. Let’s dig in.

Balancing Efficiency And Performance With Granite 3.0 Models

Let’s start with the Granite 3.0 models, which are foundation models in the technical sense as well as foundational to IBM’s market approach. My colleague Paul Smith-Goodson has already written a more technical overview of the new models and their (excellent) performance on benchmarks; here I want to emphasize the business implications.

[Image: According to IBM, enterprise data represents less than 1% of all data in AI foundation models today. Credit: IBM]

The fundamental difference between the new Granite 2B and 8B models and the enormous LLMs we see from OpenAI, Meta and others is that those multi-hundred-billion-parameter LLMs are built using public data—as in, all the data you can imagine finding anywhere on the public Internet, including everything you might want to know about sports scores, recipes, vacation spots and the like. As IBM CEO Arvind Krishna puts it, these models are great if you don’t know what you want to do with AI, or if you’re looking for general-purpose information.

But the boil-the-ocean approach of the big public LLMs has a couple of significant challenges in the enterprise setting. First, those models don’t know your company’s much narrower context for the specific business problems you’re trying to solve, because clearly they don’t have access to your internal billing information or your code base or the data from your supply chain. Second, the brute-force approach to training and inferencing inherent in the big LLMs requires a lot of resources in terms of electricity, datacenter compute (and expensive GPUs), memory—the list goes on.

The ideal would be to use AI models that deliver accurate answers for highly specific use cases and that do it in a way that’s cost-efficient, time-efficient and energy-efficient. Which is exactly what IBM is doing with the new Granite models in the context of its larger data platform.

[Image: The data platform supporting IBM's AI strategy draws from many different areas of its technology portfolio. Credit: IBM]

How Much More Efficient Is It?

Previous versions of the Granite models were highly efficient, but on a range of benchmarks they fell short of the highest levels of performance. The benchmarks that Smith-Goodson cited in the article linked above show clearly that Granite is now matching—or in most cases beating—the performance of the best competing models from other makers. It’s a testament to IBM’s hard work on this over a long period (the company was working on AI models long before the generative AI boom started two years ago), and to the power of aiming a well-engineered model at the structured and unstructured data that lives in your databases and ERP systems.

At the analyst event in New York, IBM’s chief commercial officer Rob Thomas and head of research Dario Gil offered more hard numbers. “We're now delivering 40x improvements in cost on inference,” Thomas said. Gil added that IBM’s internal numbers show “how much cheaper is it with a fit-for-purpose model that you customize so you can have better performance for the use case at a fraction of the cost.”

Beyond that, I would add that these smaller models don’t require state-of-the-art GPUs. You can run them on less powerful GPUs or even on CPUs, which could make running your applications 10 to 20 times less costly.
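To make that concrete, here is a minimal sketch of CPU-only inference with one of the smaller Granite models, using the open source Hugging Face Transformers library. The repo ID, prompt and generation settings are illustrative assumptions; in practice you would tune the dtype, quantization and token budget for your hardware.

```python
# Minimal sketch: running a small Granite model entirely on CPU with
# Hugging Face Transformers. Repo ID and prompt are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ibm-granite/granite-3.0-2b-instruct"  # assumed Hugging Face repo ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # halves memory vs. float32; float32 also works
    device_map="cpu",            # no GPU required for a 2B-parameter model
)

# The instruct variants ship with a chat template, so format the request
# as a conversation rather than raw text.
messages = [{"role": "user", "content": "Summarize the key risks in this supplier contract: ..."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

output_ids = model.generate(input_ids, max_new_tokens=200)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Even without further quantization, a 2B-parameter model at bfloat16 fits in a few gigabytes of RAM, which is what makes the commodity-hardware economics plausible.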

From ‘Plus AI’ To ‘AI First’

This kind of technical and financial practicality supports a bigger shift that Thomas talked about. In previous years, enterprises would take any problem they were trying to solve, any business process, and try to add AI to what they were already doing. Thomas said that IBM has seen a fundamental change in attitude among its customers toward what he called “AI-first” thinking, in which AI is not just something tacked onto existing processes, but built in from the start. “I think the companies that will outperform in the next decade will truly be AI-first,” he said. “That's how they will operate their business.”

For IBM’s customers, this generally means picking a model (Granite, or something from a third party), then using IBM’s watsonx.ai to run the models, its InstructLab offering to fine-tune them on their own data, or both. One of the advantages of IBM’s broad portfolio—all of which has been retooled to support the overall AI strategy—is that customers can use IBM for as much or as little of their AI needs as they like. It could be a complete IBM solution, even down to details like bringing in asset management data from IBM’s Maximo Application Suite, or customers can piece it together: an element from Red Hat here, a data management component there, or even an engagement with IBM Consulting. (By the way, IBM Consulting continues to grow its AI practice like gangbusters—to the tune of a $2.5 billion book of business, per IBM’s latest earnings report—even as the top line for IBM Consulting and the overall consulting industry stays flat.) In other words, you can DIY whatever you want, or you can turn to IBM for as much of your AI needs as you want.
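As an illustration of that pick-a-model-and-run-it flow, a hosted inference call might look like the sketch below, assuming IBM’s ibm-watsonx-ai Python SDK. The endpoint URL, model ID, API key and project ID are placeholders that vary by account, region and SDK version; treat this as the shape of the workflow, not a definitive recipe.

```python
# Sketch: calling a Granite model hosted on watsonx.ai via the
# ibm-watsonx-ai SDK. All identifiers below are placeholder assumptions.
from ibm_watsonx_ai import Credentials
from ibm_watsonx_ai.foundation_models import ModelInference

credentials = Credentials(
    url="https://us-south.ml.cloud.ibm.com",  # example regional watsonx endpoint
    api_key="YOUR_IBM_CLOUD_API_KEY",         # placeholder
)

model = ModelInference(
    model_id="ibm/granite-3-8b-instruct",     # assumed watsonx catalog ID
    credentials=credentials,
    project_id="YOUR_PROJECT_ID",             # placeholder
)

# Generate text against the hosted model; fine-tuning on your own data
# would happen upstream (for example, via InstructLab) before deployment.
print(model.generate_text(prompt="Draft a reply to this customer billing dispute: ..."))
```

The same pattern extends to third-party models in the watsonx catalog, which is the “as much or as little” flexibility described above.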

Thomas went on to explain some of the key areas where IBM already sees this playing out for customers, especially in customer experience, employee experience and IT operations. In area after area, AI models, AI assistants and now AI agents are identifying patterns, solving problems, retrieving relevant information, automating processes and (increasingly) taking action autonomously to take repetitive work off of employees’ plates and create enormous efficiencies. He gave a number of customer examples across specific use cases, but maybe the most impressive number he cited was the $2 billion in savings that IBM has achieved in the last two and a half years “by implementing our own technology to go after productivity—leveraging AI.” (IBM leaders have said that they expect even more savings in this vein still to come.) This is the grand payoff from IBM’s commitment over the past several years to be its own first client for everything that it builds.

Permissive Licensing And Careful Attention To Governance

Adding to the practicality for customers of IBM’s approach, the company has paid close attention to the legal, financial and compliance aspects of enterprise AI. You know, the “boring” corporate stuff without which you can’t do business. The Granite models are covered under the highly permissive Apache 2.0 license, which jibes with IBM’s broad commitment to open source. As my colleagues and I have pointed out in the past, these models are also highly vetted by IBM’s lawyers and covered by full indemnity from IBM.

Without going into great detail, there is an ongoing debate in the open source community about just how “open” some of the models advertised as open source really are. For example, detractors note that Meta puts some meaningful constraints on how its Llama models can be used and adapted for revenue-generating purposes. The contrast here is that the Apache 2.0 license is instantly understandable to the open source community; more than that, according to IBM it is “the most permissive license in the world of open licenses.”

IBM also has a big advantage in governance. At the New York event, Thomas expressed his surprise that there hasn’t been an answer in the market—a legit competing product—since IBM made the initial announcement of watsonx.governance more than a year ago. Meanwhile, every global systems integrator that IBM works with “is using watsonx.governance across the board.” According to Thomas, “It's because there’s really no alternative in the market today that can help you deal with . . . changing regulations, how you deliver or certify a report to a regulator, how you manage data sprawl, how you manage risk.” Data governance is another “boring” area where IBM has excelled for decades; I’ll be interested to see if any other company even tries to take on IBM in this field.

What’s Next For IBM In AI

There’s a lot more I could write about IBM’s partnerships, the growing role of AI agents, how quantum computing could supercharge AI or IBM’s other future plans. For example, CEO Krishna believes that GPUs are fundamentally unscalable for the AI workloads of the coming years, and that a different approach to silicon could mean that in the next few years AI inferencing will be done with far lower latency—and 1% of the electrical power it requires today. Not coincidentally, IBM is designing custom silicon in this area: its NorthPole research chip uses a radically different brain-inspired digital architecture fabricated at 12nm, something my colleagues and I will follow up on in future articles.

The bigger point is that IBM is thinking about enterprise AI—and, more importantly, implementing it—with a full-stack, full-company approach. It has a $3 billion order book for AI, including lots of consulting engagements to replicate the kinds of real-world results it has already been getting with a wide diversity of clients.

IBM makes a compelling case that AI is, in Rob Thomas’s words, “an enormous catalyst that will drive productivity for businesses.” It’s appropriate, then, that AI is IBM’s big move. Now it needs to pour on the gas—on the promotion side, on the selling side and on the collaboration side with partners like Adobe and SAP. In short, it needs to put a resounding exclamation point on what it is achieving. Because as the release of the Granite 3.0 models makes clear, there are models optimized for consumers and then there are models and services optimized for the enterprise. And clearly IBM is all-in on the enterprise.