Demand for generative AI is soaring, putting pressure on data center infrastructure.
Article Source: Network World
Enterprise adoption of generative artificial intelligence (AI), which is capable of generating text, images, or other media in response to prompts, is in its early stages, but is expected to increase rapidly as organizations find new uses for the technology.
“The generative AI frenzy shows no signs of abating,” says Gartner analyst Frances Karamouzis. “Organizations are scrambling to determine how much cash to pour into generative AI solutions, which products are worth the investment, when to get started and how to mitigate the risks that come with this emerging technology.”
Bloomberg Intelligence predicts that the generative AI market will grow at a staggering 42% per year over the next decade, from $40 billion in 2022 to $1.3 trillion.
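As a rough sanity check on those figures, the compounding can be worked out directly. The sketch below assumes "the next decade" means exactly 10 years of compounding from the 2022 base; the function names are illustrative and not taken from the Bloomberg Intelligence report.

```python
def project_market(start_billion: float, annual_growth: float, years: int) -> float:
    """Compound a starting market size at a fixed annual growth rate."""
    return start_billion * (1.0 + annual_growth) ** years


def implied_cagr(start_billion: float, end_billion: float, years: int) -> float:
    """Annual growth rate implied by a start value, an end value, and a horizon."""
    return (end_billion / start_billion) ** (1.0 / years) - 1.0


# Assumption: 10 compounding periods, starting from $40 billion in 2022.
projected = project_market(40.0, 0.42, 10)
rate = implied_cagr(40.0, 1300.0, 10)

print(f"$40B growing 42% a year for 10 years: about ${projected / 1000:.2f} trillion")
print(f"Growth rate implied by $40B -> $1.3T over 10 years: about {rate:.1%}")
```

Under that assumption, 42% annual growth takes $40 billion to about $1.33 trillion, and the rate implied by the rounded $1.3 trillion endpoint is just under 42%, so the quoted numbers are consistent with one another.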
Generative AI can help IT teams in a variety of ways: it can write software code and networking scripts, provide troubleshooting and issue resolution, automate processes, provide training and onboarding, create documentation and knowledge management systems, and help with project management and planning.
It can transform other parts of the business as well, including call centers, customer service, virtual assistants, data analytics, content creation, design and development, and predictive maintenance—to name a few.
But will data center infrastructures be able to handle the growing workload generated by generative AI?
Generative AI impact on computing requirements
There is no doubt that generative AI will be part of most organizations’ data strategies going forward. Networking and IT leaders need to ensure now that their IT infrastructures, as well as their teams, are prepared for the coming changes.
As they build and deploy applications that incorporate generative AI, how will that affect demand for computing power and other resources?
“The demand will increase for data centers as we know them today, and will drastically change what data centers and their associated technology look like in the future,” says Brian Lewis, managing director, advisory, at consulting firm KPMG.
Generative AI applications create significant demand for computing power in two phases: training the large language models (LLMs) that form the core of generative AI systems, and then operating the application with these trained LLMs, says Raul Martynek, CEO of data center operator DataBank.
“Training the LLMs requires dense computing in the form of neural networks, where billions of language or image examples are fed into a system of neural networks and repeatedly refined until the system ‘recognizes’ them as well as a human being would,” Martynek says.
Neural networks require tremendously dense high-performance computing (HPC) clusters of GPU processors running continuously for months, or even years at a time, Martynek says. “They are more efficiently run on dedicated infrastructure that can be located close to the proprietary data sets used for training,” he says.
The second phase is the “inference process” or the use of these applications to actually make inquiries and return data results. “In this operational phase, it requires a more geographically dispersed infrastructure that can scale quickly and provide access to the applications with lower latency—as users who are querying the information will want a fast response for the imagined use cases.”
That will require data centers in many locations as opposed to the centralized public cloud model that currently supports most applications, Martynek says. In this phase, data center computing power demand will still be elevated, he says, “but relative to the first phase such demand is spread out across more data centers.”
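To see why distance matters for the inference phase, consider network propagation delay alone. The sketch below assumes light travels through optical fiber at roughly two-thirds of its speed in a vacuum, or about 200 km per millisecond. The distances are hypothetical examples, and real round trips add routing, queuing, and model compute time on top.

```python
# Assumption: signal speed in fiber is roughly 200 km per millisecond
# (about two-thirds of the speed of light in a vacuum).
FIBER_KM_PER_MS = 200.0


def round_trip_ms(distance_km: float) -> float:
    """Minimum round-trip propagation delay over fiber, ignoring all other delays."""
    return 2.0 * distance_km / FIBER_KM_PER_MS


# Hypothetical distances between a user and the data center serving the query.
sites = [("nearby regional site", 100.0), ("distant centralized region", 3000.0)]

for label, km in sites:
    print(f"{label} ({km:.0f} km): at least {round_trip_ms(km):.1f} ms round trip")
```

Even this lower bound grows linearly with distance, which is one reason a dispersed footprint can respond faster than a single distant region.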
Generative AI drives demand for liquid cooling
Networking and IT leaders need to be cognizant of the impact generative AI will have on server density and how that affects cooling requirements, power demands, and sustainability initiatives.
“It’s not just density, but duty cycle of how often and how much those servers are being used at peak load,” says Francis Sideco, a principal analyst at Tirias Research. “We’re seeing companies like NVIDIA, AMD and Intel with each generation of AI silicon trying to increase performance while keeping power and thermal under control.”
Even with these efforts, power budgets are still increasing, Sideco says. “With how rapidly the workloads are increasing, especially with GenAI, at some point we will be hitting a wall.”
Server density “doesn’t have to rise like we saw with blade technology and virtual hosts,” Lewis adds. “Technical innovations like non-silicon chips, graphics processing units (GPUs), quantum computing, and hardware-aware, model-based software development will be able to get more out of existing hardware.”
The industry has already been experimenting with innovative liquid cooling techniques that are more efficient than air, as well as with sustainable designs in unconventional locations, such as Microsoft’s Project Natick, an undersea data center, Lewis says.
“Traditional air cooling techniques, such as the use of fans, ducts, vents and air-conditioning systems, are not sufficient to meet the cooling demands of high-performance computing hardware such as GPUs,” Lewis says. “Therefore, alternative cooling technologies such as liquid cooling are gaining traction.”
Liquid cooling involves circulating coolants, such as water or other fluids, through heat exchangers to absorb the heat generated by computer components, Lewis says. “Liquid cooling is more energy-efficient than traditional air cooling, as liquids have a higher thermal conductivity than air, which allows for better and more efficient heat transfer.”
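A back-of-the-envelope comparison helps put the difference in scale. The sketch below uses approximate textbook properties for water and air near room temperature, and a hypothetical 10 kW heat load with a 10 K coolant temperature rise; none of these figures come from Lewis or from a specific facility. It compares thermal conductivity, which the quote refers to, and also the volume of coolant that must flow to carry the heat away, which follows from the relation Q = density × volumetric flow × specific heat × temperature rise.

```python
# Approximate textbook values near room temperature (assumptions, not measurements).
WATER = {"density": 997.0, "specific_heat": 4186.0, "conductivity": 0.60}
AIR = {"density": 1.2, "specific_heat": 1005.0, "conductivity": 0.026}
# Units: density kg/m^3, specific_heat J/(kg*K), conductivity W/(m*K)


def required_flow_m3_per_s(heat_w: float, delta_t_k: float, fluid: dict) -> float:
    """Volumetric flow needed to absorb heat_w watts with a delta_t_k temperature rise."""
    return heat_w / (fluid["density"] * fluid["specific_heat"] * delta_t_k)


# Hypothetical load: 10 kW of heat, coolant allowed to warm by 10 K.
heat_load_w = 10_000.0
delta_t_k = 10.0

air_flow = required_flow_m3_per_s(heat_load_w, delta_t_k, AIR)
water_flow = required_flow_m3_per_s(heat_load_w, delta_t_k, WATER)
flow_ratio = air_flow / water_flow
conductivity_ratio = WATER["conductivity"] / AIR["conductivity"]

print(f"Air flow needed:   {air_flow:.2f} m^3/s")
print(f"Water flow needed: {water_flow * 1000:.2f} L/s")
print(f"Air needs roughly {flow_ratio:.0f}x the volume of water")
print(f"Water conducts heat roughly {conductivity_ratio:.0f}x better than air")
```

With these assumed values, air has to move on the order of thousands of times more volume than water to remove the same heat, which is consistent with the shift toward liquid as hardware gets denser.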
New data center designs will need to satisfy higher cooling requirements and power demands, Martynek says, meaning future data centers will have to rely on new cooling methods such as rear chilled doors, water to the chip or immersion technologies to provide the right mix of power, cooling and sustainability.
Data center operators are already rolling out advancements in liquid cooling, Martynek says. For instance, DataBank uses a new ColdLogik Dx Rear Door cooling solution from QCooling at its facility in Atlanta housing the Georgia Tech Supercomputer.
“We expect a significant increase in water to the door and water to the chip cooling technologies, especially as future generations of GPUs will consume even more power,” Martynek says. “The demand for more compute space and power stemming from generative AI adoption will undoubtedly drive the search for more efficiencies in power consumption and cooling.”
How generative AI impacts power requirements
It might become more prevalent for data center operators to build their own power substations, Martynek says. “Strains on the electric grid due to demand and the transition to renewable power sources are creating more uncertainty around power supply, and new data center project schedules are heavily influenced by the utility company’s workload and its capabilities to handle the power needs of new facilities,” he says.
Having a reliable and scalable source of power will increasingly be top of mind for data center operators, both to keep up with the demand for power generated by HPC clusters and to get around the timelines and limitations of utilities, Martynek says.
DataBank is rolling out a new data center design standard called the Universal Data Hall Design (UDHD), which features a slab floor with perimeter air cooling and greater spacing between cabinets. The design is ideal for hyperscale cloud deployments and can be deployed quickly, Martynek says.
“This approach also allows us to easily add raised-flooring and closer cabinet spacing for more traditional enterprise workloads,” Martynek says. “And, we can add next-generation cooling technologies like rear door heat exchangers, water-chilled door configurations or direct chip cooling infrastructure with minimal effort.”
In the future, technology design for data centers “will need to adapt to higher compute demands like quick-access memory, robust storage/storage area networks, high-performance delay/disruption tolerant networking, and big data database technologies,” Lewis says.
IT teams need to get ready
Network and data center teams should be preparing now. “These changes are happening too fast for anyone to be fully ready,” Sideco says. “It’s not just the network/data center teams, but it’s really the whole ecosystem that needs to address all of the changes that are needed.”
That includes silicon suppliers, which must handle the increased workloads and power needs. “They provide the different options that the network/data center teams then use to try to [address] the changing requirements,” Sideco says. “Collaboration across all of these is going to be important to try to keep pace with the demand.”
Others are more confident about preparations. “We in IT are always ready for the next disruption,” Lewis says. “The real question is: Will the business invest in what is needed to change? Cost savings remain at the forefront of data center outsourcing. However, the business has not yet adopted modern IT total-cost-of-ownership and value-realization frameworks to measure the ability of IT to be responsive and adapt at the speed technologies like AI are driving the business.”
“To prepare for AI adoption, data centers need to identify the right business and capital strategy so they can invest in the necessary infrastructure and tools and develop an appropriately skilled workforce,” Martynek says. “Having the right people in place to execute the strategy is as important as having the right strategy.”