AWS Expands Tools for Enterprise AI, From Custom LLMs to On-Prem AI Hubs

AWS introduced new tools for enterprise customers to build and fine-tune custom large language models, alongside AI infrastructure offerings for on-premises deployments.

By Olivia Grant Edited by Maria Konash Published: Dec 3, 2025 at 5:05 pm UTC Updated: Feb 26, 2026 at 12:20 pm UTC

AWS advances enterprise AI capabilities by introducing a new set of tools for building custom LLMs. Photo: Woliul Hasan / Unsplash

Amazon Web Services (AWS) announced expanded AI offerings at its AWS re:Invent conference, introducing new capabilities in Amazon Bedrock and Amazon SageMaker AI. The updates aim to simplify building and fine-tuning custom large language models (LLMs) for enterprise developers.

The cloud provider now offers serverless model customization in SageMaker, allowing developers to build models without managing compute resources or infrastructure. Users can work through a self-guided point-and-click workflow or through an agent-led interface driven by natural language prompts; the agent-led option is currently in preview. The system supports AWS’s Nova models and select open-source models, including DeepSeek and Meta’s Llama.

Ankur Mehrotra, AWS general manager of AI platforms, explained that the tools can be used for domain-specific applications. “If you’re a healthcare customer and want a model to understand medical terminology, you can provide labeled data, select the technique, and SageMaker fine-tunes the model automatically,” he said.

Reinforcement Fine-Tuning in Bedrock

AWS is also launching Reinforcement Fine-Tuning in Bedrock, which lets developers customize models using either a reward function or a pre-set workflow. The platform manages the model customization process from start to finish. Together, these launches reflect AWS’s focus on frontier LLMs and enterprise-specific model customization.
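To make the idea of a reward function concrete, here is a minimal sketch of what one might look like for the healthcare case Mehrotra describes. It is an illustration only: the function name, its parameters, and the scoring rule are assumptions made for this example and are not Bedrock's API or AWS's method.

```python
def reward(response: str, required_terms: list, max_words: int = 120) -> float:
    """Hypothetical reward: share of required terms present, minus a length penalty.

    Returns a score in [0, 1]. Higher means the response covers more of the
    expected terminology without running past max_words.
    """
    words = response.lower().split()
    text = " ".join(words)

    # Coverage: fraction of required terms that appear in the response.
    if required_terms:
        hits = sum(1 for term in required_terms if term.lower() in text)
        coverage = hits / len(required_terms)
    else:
        coverage = 1.0

    # Penalty: grows linearly once the response exceeds max_words.
    overflow = max(0, len(words) - max_words)
    penalty = min(1.0, overflow / max_words)

    return max(0.0, coverage - penalty)
```

During reinforcement fine-tuning, a scoring function of this kind would be applied to sampled model outputs so that higher-scoring responses are reinforced; how Bedrock wires that in is not described in the announcement.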

Nova Forge, announced earlier in the week, provides fully managed custom Nova models for $100,000 annually. Mehrotra noted that differentiation through model customization is key for enterprises seeking unique AI solutions.

AWS AI Factories Bring AI On-Premises

In a related development, AWS unveiled AI Factories, a service delivering dedicated AI infrastructure inside customer data centers. The offering combines AWS AI services, AI accelerators from NVIDIA alongside AWS’s own Trainium chips, high-speed networking, and storage to enable organizations to develop and deploy AI applications at scale without building infrastructure from scratch.

AWS AI Factories operate like a private AWS Region, providing managed access to AI tools, foundation models, and storage while leveraging customers’ existing power, network, and space. The service is designed for regulated industries and public sector organizations that require secure, low-latency AI environments. Partnerships with NVIDIA and projects such as the AI Zone in Saudi Arabia demonstrate the scale and performance of these on-premises deployments.

These combined innovations—customizable LLMs through SageMaker and Bedrock, managed Nova Forge models, and AI Factories—position AWS to compete more effectively with established enterprise AI providers while reducing operational complexity for large-scale AI adoption.


Netflix Buys Ben Affleck’s AI Startup for Up to $600 Million

Netflix has acquired InterPositive, an AI startup focused on post-production editing tools co-founded by Ben Affleck. The deal could reach $600 million as streaming platforms expand AI use in filmmaking.

By Samantha Reed Edited by Maria Konash Published: Mar 12, 2026 at 2:08 pm UTC

Netflix acquires Ben Affleck’s AI startup for up to $600M to develop post-production editing tools for filmmakers. Image: Jakob Owens / Unsplash

Netflix has acquired InterPositive, an artificial intelligence company focused on post-production tools for filmmakers. The startup was co-founded by actor and filmmaker Ben Affleck and develops software designed to assist editors and production teams in refining footage during the editing process.

According to Bloomberg, the transaction could be worth up to $600 million, potentially making it one of the largest acquisitions in Netflix’s history. Netflix has not publicly confirmed the full financial terms of the deal.

Sources familiar with the agreement told Bloomberg that the upfront cash payment may be lower, with additional payouts tied to performance targets. If those targets are met, the total value of the acquisition could approach the reported figure.

The largest acquisition previously completed by Netflix was the purchase of the Roald Dahl Story Company for about $700 million.

AI Tools Designed for Film Post-Production

InterPositive develops artificial intelligence tools aimed at helping filmmakers work more efficiently during the editing phase of production. The company’s software focuses on improving workflows rather than generating entirely new content.

Its technology can assist with tasks such as identifying continuity issues, enhancing scenes, and streamlining the process of reviewing and organizing large volumes of footage. These capabilities are intended to reduce time spent on manual editing work while maintaining creative control for filmmakers.

The company’s tools do not use copyrighted footage without permission and do not generate new scenes from scratch, two practices that have concerned many entertainment professionals as generative AI becomes more widely adopted.

Netflix has not yet announced how InterPositive’s technology will be integrated into its production pipeline. However, the acquisition aligns with the company’s ongoing efforts to incorporate artificial intelligence into film and television production workflows.

Streaming Platforms Accelerate AI Adoption

Netflix has already experimented with AI-assisted visual effects in its original content. One example is the use of generative AI to create a building-collapse sequence in the Argentine series The Eternaut.

Other entertainment companies are also expanding AI initiatives. Amazon has been developing internal AI teams focused on film and television production, while Disney recently signed an agreement with OpenAI to explore the use of artificial intelligence in media workflows.

The growing role of AI in filmmaking has also raised concerns among industry professionals. Workers in the film and television sectors have warned that AI tools could affect employment in editing, visual effects, and other creative roles.

There are also ongoing debates about how AI models should compensate creators when training on copyrighted material. Industry unions and advocacy groups have called for clearer guidelines to ensure that creative professionals are fairly credited and compensated as AI technologies become more integrated into production pipelines.


Palantir and Nvidia Launch Sovereign AI Data Center Architecture

Palantir and Nvidia unveiled a sovereign AI OS reference architecture designed to deliver turnkey AI data centers. The platform integrates Nvidia Blackwell systems with Palantir’s enterprise AI software stack.

By Olivia Grant Edited by Maria Konash Published: Mar 12, 2026 at 1:49 pm UTC

Palantir Technologies has announced a new sovereign AI operating system reference architecture developed in partnership with Nvidia. The system is designed to provide organizations with a turnkey AI data center environment that integrates hardware infrastructure, data management, and AI application deployment.

The platform, called the Palantir AI OS Reference Architecture (AIOS-RA), combines Nvidia’s AI hardware and software stack with Palantir’s enterprise data and AI platforms. The companies said the architecture delivers a production-ready system capable of running advanced AI workloads while maintaining strict data sovereignty and operational control.

The reference architecture is built on Nvidia’s Enterprise Reference Architectures and is optimized to run Palantir’s full software ecosystem. This includes the company’s Artificial Intelligence Platform (AIP), Foundry data platform, Apollo deployment system, Rubix security framework, and AIP Hub.

Full-Stack AI Data Center Design

The system integrates Nvidia’s Blackwell Ultra AI infrastructure, including servers equipped with eight Blackwell Ultra GPUs and Spectrum-X Ethernet networking designed for AI training and inference workloads.

Alongside the hardware layer, the architecture includes a hardened Kubernetes-based compute environment for running Foundry services such as data cataloging, pipeline development, and analytics workloads.

Palantir’s Rubix platform provides a zero-trust security model for Kubernetes infrastructure, while the Apollo system automates deployment and lifecycle management across distributed environments.

The architecture also incorporates Nvidia’s AI Enterprise software stack, CUDA-X libraries, Nemotron open AI models, and Magnum IO data acceleration tools to improve training and inference performance.

Supporting Sovereign and Edge AI Deployments

The companies said the architecture is particularly suited for organizations operating in sensitive environments where data control and regulatory compliance are critical. These include government agencies, defense organizations, and industries with strict data residency requirements.

By enabling on-premises and edge deployments, the sovereign AI architecture allows organizations to run AI workloads locally without relying on public cloud infrastructure. This approach can also reduce latency for mission-critical applications and enable AI deployments in geographically distributed environments.

“From our first deployment with the United States government and in every deployment since, our software has had to meet the moment in the most complex and sensitive environments where customers must maintain control,” said Palantir Chief Architect Akshay Krishnaswamy.

Nvidia’s enterprise AI platforms team said the collaboration reflects growing demand for integrated AI infrastructure solutions.

“AI is redefining the infrastructure stack — demanding, latency-sensitive and data-sovereign environments require a full-stack architecture,” said Nvidia Vice President Justin Boitano.

The companies said the joint architecture is designed to help organizations convert large volumes of operational data into actionable intelligence while maintaining control over infrastructure, models, and applications.


Musk Confirms Tesla–xAI Collaboration on Digital Optimus AI Agent Project

Elon Musk revealed Digital Optimus, a joint Tesla–xAI AI agent project powered by Grok and Tesla hardware. The announcement deepens links between the companies amid ongoing shareholder litigation.

By Daniel Mercer Edited by Maria Konash Published: Mar 12, 2026 at 1:37 pm UTC

Elon Musk has announced a new artificial intelligence project called Digital Optimus, a collaboration between Tesla and xAI designed to build a computer-controlling AI agent. The system will combine xAI’s Grok large language model with Tesla hardware and software infrastructure.

Macrohard or Digital Optimus is a joint xAI-Tesla project, coming as part of Tesla’s investment agreement with xAI.

Grok is the master conductor/navigator with deep understanding of the world to direct digital Optimus, which is processing and actioning the past 5 secs of…

— Elon Musk (@elonmusk) March 11, 2026

Musk described the system as a dual-layer AI architecture in which Grok performs high-level reasoning while a Tesla-built component processes real-time computer interactions. The system is designed to analyze screen activity and user inputs, including keyboard and mouse actions, from the previous several seconds.
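The "previous several seconds" of activity can be pictured as a rolling window over timestamped input events. The sketch below is a generic illustration of that data structure, assuming a five-second window as mentioned in Musk's post; the class and method names are hypothetical and nothing here reflects Tesla's or xAI's actual implementation.

```python
from collections import deque


class RecentEvents:
    """Hypothetical rolling buffer holding only the last few seconds of events.

    Assumes timestamps (in seconds) arrive in non-decreasing order.
    """

    def __init__(self, window_seconds: float = 5.0):
        self.window = window_seconds
        self._events = deque()  # (timestamp, event) pairs, oldest first

    def add(self, timestamp: float, event: str) -> None:
        """Record an event and drop anything older than the window."""
        self._events.append((timestamp, event))
        self._evict(timestamp)

    def _evict(self, now: float) -> None:
        # Remove events from the left while they fall outside the window.
        while self._events and now - self._events[0][0] > self.window:
            self._events.popleft()

    def snapshot(self, now: float) -> list:
        """Return the events still inside the window at time `now`."""
        self._evict(now)
        return [event for _, event in self._events]
```

In a two-layer design of the kind described, a fast component could read such a snapshot on every step, while a slower reasoning layer consults it only when a higher-level decision is needed.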

The approach mirrors psychologist Daniel Kahneman’s “dual-process” theory of cognition. According to Musk, Tesla’s component will handle fast, instinctive responses while Grok functions as the strategic reasoning layer.

In a post on X, Musk said the project will run on Tesla’s AI4 chip alongside Nvidia-powered cloud infrastructure used by xAI. He also introduced the internal nickname “Macrohard,” a reference aimed at Microsoft, and claimed the architecture could eventually emulate complex organizational workflows.

The initiative forms part of Tesla’s previously disclosed $2 billion investment agreement with xAI, expanding the relationship between the electric vehicle maker and Musk’s artificial intelligence startup.

Reversal From Earlier Tesla Strategy

The announcement represents a shift from Musk’s earlier statements about the relationship between the two companies. In September 2024, Musk said Tesla had “no need to license anything from xAI,” arguing that Tesla’s real-world AI systems differed significantly from large language models.

At the time, Musk maintained that Tesla’s AI development was focused on autonomous driving and robotics systems rather than generative AI models. The statement also came shortly after Tesla shareholders filed a lawsuit alleging that Musk had breached fiduciary duties by creating xAI outside Tesla while redirecting resources and talent.

Digital Optimus now explicitly integrates xAI’s Grok model with Tesla hardware, signaling deeper technical collaboration between the companies. The system positions Grok as the central reasoning engine while Tesla supplies the specialized hardware and interface layer.

The partnership also follows Tesla’s $2 billion investment in xAI earlier this year during the startup’s Series E funding round, which valued the company at roughly $230 billion.

Legal and Strategic Context

The growing relationship between Tesla and xAI continues to draw scrutiny from investors and courts. The ongoing lawsuit filed in Delaware Chancery Court alleges that Musk diverted Tesla resources, including engineers and computing hardware, to build xAI as a separate company under his control.

Plaintiffs in the case argue that technologies developed within xAI could have been built inside Tesla, particularly given Musk’s long-standing positioning of Tesla as a leader in artificial intelligence and robotics. The lawsuit seeks remedies that could potentially transfer Musk’s xAI ownership stake to Tesla.

Further developments have intensified the connection between Musk’s companies. Earlier this year, SpaceX completed an all-stock acquisition of xAI that valued the combined entity at roughly $1.25 trillion, with plans for a future public offering. Tesla’s investment in xAI therefore represents an indirect stake in the merged SpaceX-xAI structure.

The Digital Optimus project also aligns with Musk’s broader vision for Tesla’s Optimus humanoid robot. xAI executives previously told investors that their long-term goal was to develop advanced AI systems capable of powering robotics platforms such as Tesla’s Optimus.

Taken together, the project signals a tighter technological link between Tesla’s hardware platforms and xAI’s AI models, as Musk increasingly positions the companies’ technologies to work together in building advanced AI-driven systems.


Google Launches Gemini Embedding 2 Multimodal Model

Google has introduced Gemini Embedding 2, a multimodal embedding model that processes text, images, video, audio, and documents in a unified vector space. The model is available in preview through the Gemini API and Vertex AI.

By Daniel Mercer Edited by Maria Konash Published: Mar 12, 2026 at 1:12 pm UTC

Google launches Gemini Embedding 2, a multimodal model for text, images, video, audio, and documents. Image: Google

Google has released Gemini Embedding 2, its first fully multimodal embedding model built on the Gemini architecture. The model is now available in public preview through the Gemini API and Vertex AI platform.

The new system expands beyond traditional text embeddings by mapping multiple data types into a single vector space. Gemini Embedding 2 can process text, images, video, audio, and documents, enabling applications that require semantic understanding across different media formats.

Embeddings are numerical representations of content that allow AI systems to understand relationships between pieces of information. They are widely used in tasks such as semantic search, recommendation systems, clustering, and retrieval-augmented generation.
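As a rough illustration of how embeddings support semantic search, the sketch below ranks a few items by cosine similarity to a query vector. The three-dimensional vectors and the helper names are made up for the example; real embeddings come from a model and have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def rank(query, corpus):
    """Return corpus keys ordered from most to least similar to the query."""
    return sorted(
        corpus,
        key=lambda name: cosine_similarity(query, corpus[name]),
        reverse=True,
    )


# Toy vectors standing in for model-produced embeddings.
corpus = {
    "a": [0.9, 0.1, 0.0],
    "b": [0.0, 1.0, 0.0],
    "c": [0.5, 0.5, 0.0],
}
ranking = rank([1.0, 0.0, 0.0], corpus)
```

Because a multimodal model places text, images, and other media in one vector space, the same ranking step can in principle compare a text query against image or video embeddings.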

By integrating multiple modalities, the model allows developers to simplify pipelines that previously required separate systems for different types of data. According to Google, the unified embedding space can capture relationships between media types, improving the accuracy of multimodal search and analysis.

Multimodal Capabilities and Input Support

Gemini Embedding 2 supports a wide range of inputs across several formats. The model can process up to 8,192 tokens of text, up to six images per request in PNG or JPEG format, and up to 120 seconds of video in MP4 or MOV format.

The model can also ingest audio directly without requiring transcription and can embed PDF documents of up to six pages. Developers can combine multiple input types within a single request, such as pairing images with text prompts to generate richer semantic representations.
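The reported limits lend themselves to a simple client-side check before a request is sent. The following is a sketch under stated assumptions: the limits are the ones quoted in this article, while the `EmbedRequest` class and `find_violations` function are hypothetical names invented here and are not part of the Gemini API or Vertex AI.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Limits as reported in the article; all names below are hypothetical.
MAX_TEXT_TOKENS = 8192
MAX_IMAGES = 6
MAX_VIDEO_SECONDS = 120
MAX_PDF_PAGES = 6
IMAGE_FORMATS = {"png", "jpeg"}
VIDEO_FORMATS = {"mp4", "mov"}


@dataclass
class EmbedRequest:
    """Summary of what a caller intends to send in one request."""
    text_tokens: int = 0
    image_formats: List[str] = field(default_factory=list)
    video_seconds: float = 0.0
    video_format: Optional[str] = None
    pdf_pages: int = 0


def find_violations(req: EmbedRequest) -> List[str]:
    """Return a human-readable message for every limit the request exceeds."""
    problems = []
    if req.text_tokens > MAX_TEXT_TOKENS:
        problems.append("text exceeds 8,192 tokens")
    if len(req.image_formats) > MAX_IMAGES:
        problems.append("more than six images")
    bad_images = sorted({f.lower() for f in req.image_formats} - IMAGE_FORMATS)
    if bad_images:
        problems.append("unsupported image format: " + ", ".join(bad_images))
    if req.video_seconds > MAX_VIDEO_SECONDS:
        problems.append("video longer than 120 seconds")
    if req.video_format is not None and req.video_format.lower() not in VIDEO_FORMATS:
        problems.append("unsupported video format: " + req.video_format)
    if req.pdf_pages > MAX_PDF_PAGES:
        problems.append("PDF longer than six pages")
    return problems
```

An empty list means the request fits within the limits described above. The preview service may enforce additional constraints that the article does not mention.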

This multimodal capability allows AI systems to analyze complex datasets where information appears in different formats. For example, developers could use a photo or video clip as a search query to retrieve related media assets or documents.

Flexible Embedding Dimensions and Performance Gains

Like earlier Google embedding models, Gemini Embedding 2 uses Matryoshka Representation Learning, a training technique that concentrates the most important information in the leading dimensions of the vector, so that a truncated prefix still works as a usable embedding. This allows developers to adjust embedding sizes depending on performance and storage requirements.

The model’s default vector size is 3,072 dimensions, though developers can scale outputs to smaller dimensions such as 1,536 or 768 for more efficient storage and processing.
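The trade-off between dimension count and storage can be sketched in a few lines. The example assumes that a Matryoshka-style embedding is shortened by keeping its leading values and rescaling to unit length, and that each value is stored as a 4-byte float; both are assumptions for illustration, and the function names are hypothetical rather than part of any Google SDK.

```python
import math


def truncate_embedding(vec, dim):
    """Keep the first `dim` values and rescale the result to unit length."""
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0:
        return head  # nothing to rescale
    return [x / norm for x in head]


def storage_bytes(dim, n_vectors, bytes_per_value=4):
    """Raw storage for n_vectors embeddings, assuming 4-byte floats by default."""
    return dim * n_vectors * bytes_per_value


# Per-vector footprint at the sizes mentioned in the article.
sizes = {dim: storage_bytes(dim, 1) for dim in (3072, 1536, 768)}
```

Under these assumptions a 3,072-dimension vector takes 12,288 bytes and a 768-dimension vector takes 3,072 bytes, a fourfold reduction, at some cost in retrieval quality that the article does not quantify.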

Google said the model demonstrates improved performance across text, image, video, and speech tasks compared with previous embedding models. The company noted that the model’s multimodal capabilities could enable new applications in areas such as media search, sentiment analysis, and data clustering.

Early partners are already using the technology to build advanced search tools. Paramount Skydance reported that the model improved its ability to locate video assets based on text queries, allowing the system to identify visual expressions and retrieve related media more accurately.

Gemini Embedding 2 is now accessible through Google’s developer tools and can integrate with AI frameworks such as LangChain and LlamaIndex and with vector databases such as Weaviate, Qdrant, and ChromaDB. Google said the model is designed to serve as a foundation for next-generation multimodal AI systems that work with diverse forms of data.


Meta Unveils New MTIA Chips for AI Data Centers

Meta introduced four new in-house MTIA chips designed for AI training and inference as the company accelerates data center expansion. The chips aim to improve performance and reduce reliance on external hardware suppliers.

By Olivia Grant Edited by Maria Konash Published: Mar 11, 2026 at 5:01 pm UTC

Meta has unveiled four new custom chips designed for artificial intelligence workloads as part of its expanding data center infrastructure. The processors are part of the company’s Meta Training and Inference Accelerator (MTIA) family, a line of chips built to handle specific AI tasks within Meta’s platforms.

The announcement marks the latest step in Meta’s push to reduce reliance on external chip vendors by designing its own silicon optimized for internal workloads. According to Meta Vice President of Engineering Yee Jiun Song, custom chips allow the company to improve price-performance efficiency across its data centers while diversifying its hardware supply chain.

“This also provides us with more diversity in terms of silicon supply and insulates us from price changes to some extent,” Song said.

Meta first revealed the MTIA architecture in 2023 and released a second-generation version in 2024. The new lineup significantly expands the platform as the company scales its AI infrastructure to support recommendation systems and generative AI features across its products.

Four New Chips Target Different AI Workloads

The first of the new chips, MTIA 300, has already been deployed in Meta data centers. It is designed to train smaller AI models used for core platform tasks such as ranking and recommendation algorithms. These systems determine which posts, ads, and videos appear in users’ feeds across services including Facebook and Instagram.

Three additional chips are currently in development. The MTIA 400 is nearing deployment after completing testing, while the MTIA 450 and MTIA 500 are scheduled to become operational by 2027.

Unlike the MTIA 300, the upcoming chips will focus on generative AI inference workloads such as generating images and videos from text prompts. However, Meta said the processors will not be used to train large language models.

Song noted that Meta plans to release new chips roughly every six months as the company rapidly expands computing capacity. Each chip generation is expected to remain in service for more than five years.

Data Center Expansion and AI Infrastructure

The new processors will support Meta’s massive data center expansion. The company is currently building a large facility in Louisiana and additional centers in Ohio and Indiana. Reports also indicate that Meta is exploring leasing space at a major AI data center site in Texas.

The custom chips are manufactured by Taiwan Semiconductor Manufacturing Company, though Meta did not confirm whether production will occur at the company’s new fabrication facilities in Arizona.

Meta’s in-house silicon strategy follows a broader industry trend among major technology companies developing application-specific integrated circuits, or ASICs, for AI workloads. These chips are typically smaller and more energy-efficient than general-purpose GPUs but are optimized for narrower tasks.

Google pioneered this approach with its Tensor Processing Units in 2015, and Amazon followed with custom chips for its cloud services in 2018. Unlike those companies, Meta uses its MTIA chips exclusively for internal operations rather than offering them through a public cloud platform.

Despite building its own silicon, Meta continues to rely heavily on external hardware suppliers. The company recently signed agreements to deploy millions of Nvidia GPUs and up to six gigawatts of AMD GPUs across its data centers.

Song acknowledged that securing high-bandwidth memory remains a potential constraint as AI infrastructure spending increases across the technology industry. However, he said Meta has diversified its supply chain and believes it has secured the resources required for its planned deployments.
