A new way to optimize and prioritize AI projects for the GPU shortage


Generative AI, enabled by large language models (LLMs) like GPT-4, has sent shockwaves through the tech world. ChatGPT’s meteoric rise has prompted the global tech industry to reassess and prioritize gen AI, reshaping product strategies in real time.

Integration of LLMs has given product developers an easy way to incorporate AI-powered features into their products. But it’s not all smooth sailing. A glaring challenge looms large for product leaders: the GPU shortage and spiraling costs.

Rise of LLMs and GPU shortage

The growing number of AI startups and services has led to high demand for high-end GPUs such as A100s and H100s, overwhelming Nvidia and its manufacturing partner TSMC, both of which are struggling to meet demand. Online forums like Reddit are abuzz with frustrations over GPU availability, echoing the sentiment across the tech community. It has become so dire that both AWS and Azure have had no choice but to implement quota systems.

This bottleneck doesn’t just squeeze startups; it’s a stumbling block for tech giants like OpenAI. At a recent off-the-record meeting in London, OpenAI’s CEO Sam Altman candidly acknowledged that the computer chip shortage is stymieing ChatGPT’s progress. Altman reportedly lamented that the lack of computing power has resulted in subpar API availability and has kept OpenAI from rolling out larger “context windows” for ChatGPT.

Prioritizing AI features

On the one hand, product leaders find themselves caught in a relentless push to innovate, facing expectations to deliver cutting-edge features that leverage the power of gen AI. On the other hand, they grapple with the harsh realities of GPU capacity constraints. It’s a complex juggling act, where ruthless prioritization becomes not just a strategic decision but a necessity.

Given that GPU availability is poised to remain a challenge for the foreseeable future, product leaders must think strategically about GPU allocation. Traditionally, product leaders have leaned on prioritization techniques like the Customer Value/Need vs. Effort Matrix. This method, however logical in a world where computational resources were abundant, now demands a bit of reevaluation.

In our current paradigm, where compute is the constraint and not software talent, product leaders must redefine how they prioritize various products or features, bringing GPU limitations to the forefront of strategic decision-making.

Planning around capacity constraints might seem unusual for the tech industry, but it’s a commonplace strategy in other industries. The underlying concept is simple: The most valuable factor is the time spent on the constrained resource, and the objective is to maximize the value per unit of time spent on that constraint.

Tech success metrics

As a former consultant, I’ve successfully applied this framework across various industries. I believe that tech product leaders can also use a similar approach to prioritize products or features while GPU constraints exist. When applying this framework, the most straightforward measure of value is profitability.

However, in tech, profitability might not always be the appropriate metric, particularly when venturing into a new market or product. Thus, I’ve adapted the framework to align with the success metrics typically used in tech, outlining a simple four-step process:

1. Contribution

First, identify your North Star metric. This is the contribution of each product or feature, something that encapsulates the essence of its worth. Some concrete examples might include:

  • An increase in revenue and profit
  • Gains in market share
  • Growth in the number of daily/monthly active users

2. Number of GPUs required

Gauge the number of GPUs needed for each product or feature. Focus on key factors including:

  • Number of queries per user per day
  • Number of daily active users
  • Complexity of the query (how many tokens each query consumes)
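As a rough illustration of how these factors combine, the sketch below turns them into a GPU count. The function name, the `tokens_per_gpu_per_second` throughput figure, the `peak_to_average` multiplier and every sample number are hypothetical assumptions, not measurements; real throughput depends on the model, the hardware and the serving stack, and should be benchmarked before it is used for planning.

```python
import math

SECONDS_PER_DAY = 86_400


def estimate_gpus(daily_active_users, queries_per_user_per_day, tokens_per_query,
                  tokens_per_gpu_per_second, peak_to_average=1.0):
    """Rough GPU count needed to serve one feature's token load.

    tokens_per_gpu_per_second and peak_to_average are assumed inputs that
    you would have to measure for your own model and traffic pattern.
    """
    if tokens_per_gpu_per_second <= 0:
        raise ValueError("tokens_per_gpu_per_second must be positive")

    # Total tokens the feature has to process in a day.
    tokens_per_day = daily_active_users * queries_per_user_per_day * tokens_per_query

    # Average load, scaled up to leave headroom for peak traffic.
    peak_tokens_per_second = tokens_per_day / SECONDS_PER_DAY * peak_to_average

    # Round up: a fraction of a GPU still requires a whole GPU.
    return math.ceil(peak_tokens_per_second / tokens_per_gpu_per_second)


# Made-up inputs: 1M daily users, 10 queries each, 500 tokens per query,
# and an assumed 1,000 tokens/second per GPU.
print(estimate_gpus(1_000_000, 10, 500, 1_000))  # 58
```

Even a coarse estimate like this is usually enough for prioritization, because the ranking depends on relative GPU needs across features more than on exact counts.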

3. Calculate contribution per GPU

Break it down to the specifics. How much does each GPU contribute to the overall goal? Dividing each product’s contribution by the number of GPUs it requires will give you a clear picture of where your GPUs are best allocated.

4. Prioritize products based on contribution per GPU

Now, it’s time to make the tough decisions. Rank your products by their Contribution per GPU, and then line them up accordingly. Focus on the products with the highest Contribution per GPU first, ensuring that your limited resources are channeled into the areas where they’ll make the most impact.

With GPU constraints no longer a blind spot but a quantifiable factor in the decision-making process, your company can more strategically navigate the GPU shortage. To bring this framework to life, let’s visualize a scenario where you, as a product leader, are grappling with the challenge of prioritizing among four different products:

                                   Product A    Product B    Product C    Product D
Revenue Potential (Contribution)   $100M        $80M         $50M         $25M
Number of GPUs Required            1,000        450          500          50
Contribution Per GPU               $0.1M/GPU    $0.18M/GPU   $0.1M/GPU    $0.5M/GPU

Although Product A has the highest revenue potential, it doesn’t yield the highest contribution per GPU. Surprisingly, Product D, with the least revenue potential, offers the most substantial return per GPU. By prioritizing based on this metric, you could maximize total potential revenue.
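The per-GPU figures in the table can be reproduced with a few lines of Python. This is only a minimal sketch: the data structure and the `contribution_per_gpu` helper are illustrative names of my own, and the numbers are copied from the table above.

```python
# (revenue potential in $M, GPUs required), copied from the table above
products = {
    "Product A": (100, 1000),
    "Product B": (80, 450),
    "Product C": (50, 500),
    "Product D": (25, 50),
}


def contribution_per_gpu(contribution, gpus_required):
    """Contribution divided by the GPUs needed to deliver it."""
    if gpus_required <= 0:
        raise ValueError("gpus_required must be positive")
    return contribution / gpus_required


per_gpu = {
    name: contribution_per_gpu(contribution, gpus)
    for name, (contribution, gpus) in products.items()
}

# Highest contribution per GPU first.
ranked = sorted(per_gpu, key=per_gpu.get, reverse=True)

for name in ranked:
    print(f"{name}: ${per_gpu[name]:.2f}M per GPU")
```

Product D and Product B come out on top, while Products A and C tie at $0.1M per GPU, so their relative order has to be settled by other considerations, such as how many GPUs are left.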

Let’s say you have a total of 1,000 GPUs at your disposal. A straightforward choice might have you opting for Product A, generating a revenue potential of $100 million. However, by applying the prioritization strategy described above, you could achieve $155 million in revenue:

Priority Order   Product      Revenue Gain   GPUs
1                Product D    $25M           50
2                Product B    $80M           450
3                Product C    $50M           500
Total                         $155M          1,000
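The allocation above can be sketched as a simple greedy pass: fund products in descending contribution-per-GPU order and skip any that no longer fit in the remaining budget. The function names are illustrative assumptions, and the figures come from the example tables. Ranking by ratio is a heuristic rather than a guaranteed optimum, so the sketch also includes a brute-force check over every subset, which is practical only for a handful of products.

```python
from itertools import combinations

# (revenue potential in $M, GPUs required), from the example tables
PRODUCTS = {
    "Product A": (100, 1000),
    "Product B": (80, 450),
    "Product C": (50, 500),
    "Product D": (25, 50),
}


def greedy_allocation(products, gpu_budget):
    """Fund products by descending contribution per GPU while they still fit."""
    ranked = sorted(
        products.items(),
        key=lambda item: item[1][0] / item[1][1],
        reverse=True,
    )
    chosen, remaining = [], gpu_budget
    for name, (_contribution, gpus) in ranked:
        if gpus <= remaining:  # skip anything that exceeds what is left
            chosen.append(name)
            remaining -= gpus
    total = sum(products[name][0] for name in chosen)
    return chosen, total


def best_subset(products, gpu_budget):
    """Exhaustively test every subset; feasible only for small product lists."""
    best = ((), 0)
    names = list(products)
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            gpus = sum(products[n][1] for n in combo)
            value = sum(products[n][0] for n in combo)
            if gpus <= gpu_budget and value > best[1]:
                best = (combo, value)
    return best


chosen, total = greedy_allocation(PRODUCTS, gpu_budget=1000)
print(chosen, total)                    # ['Product D', 'Product B', 'Product C'] 155
print(best_subset(PRODUCTS, 1000)[1])   # 155
```

In this example the greedy ranking and the exhaustive search agree on $155 million. With other numbers the two can diverge, which is one reason to treat the ranking as a starting point for discussion rather than a final answer.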

The same method can be applied to other contribution metrics, such as market share gain:

                                   Product A    Product B    Product C    Product D
Market Share Gain (Contribution)   5%           4%           2.5%         1.25%
Number of GPUs Required            1,000        450          500          50
Contribution Per GPU               0.005%/GPU   0.009%/GPU   0.005%/GPU   0.025%/GPU

Similarly, selecting Product A alone would have led to a market share gain of 5%. However, by applying the prioritization strategy described above, you could achieve 7.75% in market share gain:

Priority Order   Product      Market Share Gain   GPUs
1                Product D    1.25%               50
2                Product B    4%                  450
3                Product C    2.5%                500
Total                         7.75%               1,000

Benefits and limitations

This alternative prioritization framework introduces a more nuanced and strategic approach. By zeroing in on the Contribution Per GPU, you’re strategically aligning resources where they’ll make the most substantial difference, whether in terms of revenue, market share or any other defining metric.

But the advantages don’t stop there. This method also fosters a greater sense of clarity and objectivity across product teams. In my experience, including my early days leading digital transformation at a healthcare company and later while working with various McKinsey clients, this approach has been a game-changer in scenarios where capacity constraints are a critical factor. It’s enabled us to prioritize initiatives in a more data-driven and rational manner, sidelining the traditional politics where decisions might otherwise fall to the loudest voice in the room.

However, no one-size-fits-all solution exists, and it’s worth acknowledging the potential limitations of this method. For instance, this approach may not always capture the strategic importance of certain investments. Thus, while exceptions to the framework can and should be made, they should be carefully considered rather than the norm. This maintains the integrity of the process and ensures that any deviations are made with a broader strategic context in mind.

Conclusion

Product leaders are facing an unprecedented situation with the GPU shortage, so finding new ways of managing resources is required. In the words of the great strategist Sun Tzu, “In the midst of chaos, there is also opportunity.”

The GPU shortage is indeed a challenge, but with the right approach, it can also be a catalyst for differentiation and success. The proposed prioritization framework, focusing on Contribution Per GPU, offers a strategic way to prioritize. By zeroing in on Contribution Per GPU, companies can maximize their return on investment, aligning resources where they’ll make the most impact and focusing on what matters the most to the long-term success of their company.

Prerak Garg is senior director of cloud and AI corporate strategy at Microsoft and a former McKinsey and Company engagement manager.
