At 3M, AI Agents are Making Data Pipelines ‘Self-Healing’


Data engineering is shifting from reactive maintenance to intelligent automation. As enterprises grapple with constant schema changes, growing data volumes and evolving source systems, there is a growing push to make pipelines more adaptive and resilient. At the heart of this shift is the use of AI agents, not as replacements for engineers, but as tools that reduce manual intervention and bring consistency to everyday operations.

While speaking at AIM’s event DES 2025, Manjunatha G, engineering and site leader at the 3M Global Technology Centre, laid out a practical path to integrate AI agents into data engineering workflows.

“Transformation in data is going to be an easy change if we embrace the technology,” he said. However, the change he referred to isn’t flashy. It’s incremental, often mundane, like moving from 10 fields to 12 in a schema, or switching a source system from mainframe to SAP. “These are the kind of standard changes [which every company faces],” he noted.

Schema Changes are Constant

Manjunatha pointed out that schema evolution is inevitable as businesses change. “New dimensions of the data will be introduced,” he said. Traditionally, such changes trigger a long series of manual updates to source definitions, mapping documents, transformation logic and destination schemas.

He offered an alternative: introducing AI into the pipeline itself, specifically large language models (LLMs) guided by carefully crafted system prompts. “This change can be done with any full-stack developer or data engineer who knows how to develop and ingest data pipelines,” he said.

He described a setup using prompts to define what the LLM should do. “Be very clear,” he advised. For example, one might instruct the system to ingest only if the file is in CSV format, or to log instances where data volumes exceed 20 MB. With such guardrails in place, a pipeline can dynamically detect new fields, validate them, and update the destination schema, without manual intervention.

“It is self-healing,” he said. “Instead of updating the mapping, transformation engine, and destination schema manually, we can make it totally dynamic.”
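To make this concrete, the sketch below shows one way such guardrails might be encoded around an ingestion step. It is not 3M’s code: the `call_llm` helper is a stand-in for whichever model endpoint a team uses, and the schema, thresholds and JSON contract are illustrative assumptions drawn from the examples in the talk.

```python
import csv
import json
import logging
import os

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("ingest")

# Guardrails from the talk, encoded as a system prompt: accept only CSV,
# log files over 20 MB, and report any fields not in the known schema.
SYSTEM_PROMPT = """You are a data-ingestion assistant.
Rules:
- Only accept files in CSV format; reject anything else.
- If the incoming header contains fields not present in the known schema,
  return JSON: {"new_fields": [...], "action": "extend_schema"}.
- Otherwise return JSON: {"new_fields": [], "action": "none"}.
Return JSON only, no explanations."""

KNOWN_SCHEMA = ["order_id", "customer_id", "amount", "order_date"]  # illustrative

def call_llm(system_prompt: str, user_prompt: str) -> str:
    """Stand-in for whatever LLM endpoint the pipeline actually uses."""
    raise NotImplementedError("wire this to your model provider")

def ingest(path: str) -> None:
    if not path.lower().endswith(".csv"):
        log.warning("Rejected %s: not a CSV file", path)
        return
    if os.path.getsize(path) > 20 * 1024 * 1024:
        log.info("File %s exceeds 20 MB; logging for review", path)

    with open(path, newline="") as f:
        header = next(csv.reader(f))

    decision = json.loads(call_llm(
        SYSTEM_PROMPT,
        f"Known schema: {KNOWN_SCHEMA}\nIncoming header: {header}",
    ))
    if decision["action"] == "extend_schema":
        # e.g. issue ALTER TABLE ... ADD COLUMN downstream instead of
        # editing mapping documents and destination schemas by hand
        log.info("Extending destination schema with %s", decision["new_fields"])
```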

System Prompts are Key

The success of this approach depends on the quality of system prompts. “System prompt is where the trick is. User prompt is very easy to build,” he said. A robust system prompt ensures consistent behaviour across pipeline executions and helps reduce hallucinations.

Manjunatha explained how system prompts can also embed controls for schema validation. For instance, new fields can be compared against a gold dataset before they are accepted. This prevents spurious changes from corrupting downstream data.
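A minimal sketch of that kind of gold-dataset check is shown below, assuming pandas and a reference file named `gold_orders.csv`; the file name and the dtype comparison are hypothetical choices, not details from the talk.

```python
import pandas as pd

def validate_new_fields(new_fields, candidate_df, gold_path="gold_orders.csv"):
    """Accept a proposed new field only if it also exists in the gold
    (reference) dataset and its values coerce to the same dtype."""
    gold = pd.read_csv(gold_path)
    accepted, rejected = [], []
    for field in new_fields:
        if field not in gold.columns:
            rejected.append(field)  # unknown to the reference data
            continue
        try:
            # crude type check: coerce the candidate column to the gold dtype
            candidate_df[field].astype(gold[field].dtype)
            accepted.append(field)
        except (ValueError, TypeError):
            rejected.append(field)
    return accepted, rejected
```

Only the accepted fields would then flow into the destination schema, keeping spurious or malformed columns out of downstream tables.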

Beyond Schema, Volume and Business Logic

AI agents are useful for more than schema handling; they can track ingestion volumes, latency, and error rates. Manjunatha shared an example in which the system flagged increased latency and volume, automatically prompting further investigation.

“Please do something,” the system might prompt, indicating a need for action. “And you can only ask what needs to be done,” he said, reinforcing that this is about augmenting engineers, not replacing them.
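The monitoring side can be equally lightweight. The following sketch, with illustrative baselines and thresholds rather than anything described at the event, shows how a run’s volume, latency and error counts might be compared against history and surfaced as flags for an engineer to act on.

```python
from dataclasses import dataclass

@dataclass
class RunMetrics:
    rows_ingested: int
    latency_seconds: float
    error_count: int

# Illustrative baseline; in practice this would come from historical runs.
BASELINE = RunMetrics(rows_ingested=100_000, latency_seconds=120.0, error_count=5)

def flag_anomalies(current: RunMetrics, tolerance: float = 1.5) -> list[str]:
    """Return human-readable flags when a run drifts past the baseline."""
    flags = []
    if current.latency_seconds > BASELINE.latency_seconds * tolerance:
        flags.append(f"Latency {current.latency_seconds:.0f}s exceeds baseline")
    if current.rows_ingested > BASELINE.rows_ingested * tolerance:
        flags.append(f"Volume {current.rows_ingested} rows exceeds baseline")
    if current.error_count > BASELINE.error_count * tolerance:
        flags.append(f"Error count {current.error_count} above baseline")
    return flags

if __name__ == "__main__":
    run = RunMetrics(rows_ingested=260_000, latency_seconds=340.0, error_count=4)
    for flag in flag_anomalies(run):
        # the agent surfaces these and asks the engineer what to do next
        print("ACTION NEEDED:", flag)
```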

He also mentioned how these methods can support live transactional systems. Predictive models could be layered onto smart pipelines to forecast demand surges or prevent stockouts.

Small Change, Large Impact

Manjunatha’s message was clear: small code changes backed by AI logic can lead to significant operational improvements.

“Most of the data pipelines, the schema changes are going to be the common scenario,” he said. “Change is minimal. The impact is going to be big.”

Manjunatha emphasised practicality. This approach works across tooling—whether Terraform, ERP tools, or ingestion frameworks—and can be embedded as a lightweight step in existing pipelines.

He urged organisations to begin experimenting. “Wherever there is an opportunity, try to leverage this thought process.” With proactive monitoring, smart prompts, and validation logic, data pipelines can evolve into intelligent systems—less fragile, more responsive, and better aligned with business needs.


