In industrial environments, data analysis is crucial for optimizing processes, detecting anomalies, and making informed decisions. Manufacturing plants, energy systems, and industrial IoT generate massive amounts of data from sensors, machines, and control systems. Traditionally, analyzing this data requires specialized knowledge in both industrial processes and data science, creating a bottleneck for quick insights.
I’ve been exploring agentic AI frameworks lately, particularly for complex data analysis tasks. While working on industrial data problems, I realized that combining the reasoning capabilities of Large Language Models with specialized tools could create a powerful solution for industrial data analysis. This project demonstrates how to build a ReAct (Reasoning and Acting) AI agent using LangGraph that can analyze manufacturing data, understand industrial processes, and provide actionable insights.
The goal of this project is to create an AI agent that can analyze industrial datasets (manufacturing metrics, sensor readings, process control data) and provide expert-level insights about production optimization, quality control, and process efficiency. Using LangGraph’s ReAct agent framework with AWS Bedrock, the system can execute Python code dynamically in a sandboxed environment, process large datasets, and reason about industrial contexts.
The dataset is a fake sample of industrial data with manufacturing metrics such as temperature, speed, humidity, pressure, operator experience, scrap rates, and unplanned stops. In fact, I generated the dataset with ChatGPT.
This project uses several key components:
- LangGraph ReAct Agent: For building the multi-tool AI agent with ReAct (Reasoning and Acting) patterns that can dynamically choose tools and reason about results
- AWS Bedrock: Claude Sonnet 4 as the underlying LLM for reasoning and code generation
- Sandboxed Code Interpreter: Secure execution of Python code for data analysis using AWS Agent Core; the tool comes from the strands-agents-tools library.
- Industrial Domain Expertise: Specialized system prompts with knowledge of manufacturing processes, quality control, and industrial IoT
The agent has access to powerful tools:
- Code Interpreter: Executes Python code safely in a sandboxed AWS environment using pandas, numpy, scipy, and other scientific libraries
- Data Processing: Handles large industrial datasets with memory-efficient strategies
- Industrial Context: Understands manufacturing processes, sensor data, and quality metrics
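The "memory-efficient strategies" mentioned above typically mean processing a large file in chunks rather than loading it all at once. The project's actual implementation isn't shown, but a minimal sketch with pandas' `chunksize` parameter looks like this (the CSV content here is a synthetic stand-in for a real sensor log):

```python
import io
import pandas as pd

# Simulate a large CSV with an in-memory buffer (stand-in for a real sensor log file)
csv_data = "machine_id,temperature\n" + "\n".join(
    f"M{i % 3},{20 + i % 5}" for i in range(1000)
)

# Process the file in fixed-size chunks instead of loading everything into memory,
# accumulating per-machine aggregates as each chunk arrives
totals = {}
for chunk in pd.read_csv(io.StringIO(csv_data), chunksize=250):
    for machine, temp_sum in chunk.groupby("machine_id")["temperature"].sum().items():
        totals[machine] = totals.get(machine, 0) + temp_sum

print(totals)
```

Only one chunk is ever resident in memory, so the peak footprint is bounded by `chunksize` rather than by the size of the dataset.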
The system uses AWS Agent Core’s sandboxed code interpreter, which means:
- Python code is executed in an isolated environment
- No risk to the host system
- Access to scientific computing libraries (pandas, numpy, scipy)
- Memory management for large datasets
The core of the system is surprisingly simple. The ReAct agent is built using LangGraph’s `create_react_agent` with custom tools:
```python
from typing import List

import pandas as pd
from langchain_core.callbacks import BaseCallbackHandler
from langgraph.prebuilt import create_react_agent


def analyze_df(df: pd.DataFrame, system_prompt: str, user_prompt: str,
               callbacks: List[BaseCallbackHandler], streaming: bool = False):
    # CodeInterpreter, get_llm and DEFAULT_MODEL are defined elsewhere in the project
    code_interpreter_tools = CodeInterpreter()
    tools = code_interpreter_tools.get_tools()
    agent = create_react_agent(
        model=get_llm(model=DEFAULT_MODEL, streaming=streaming,
                      budget_tokens=12288, callbacks=callbacks),
        tools=tools,
        prompt=system_prompt
    )
    agent_prompt = f"""
    I have a DataFrame with the following data:
    - Columns: {list(df.columns)}
    - Shape: {df.shape}
    - data: {df}
    The output must be an executive summary with the key points.
    The response must be only markdown, not plots.
    """
    messages = [
        ("user", agent_prompt),
        ("user", user_prompt)
    ]
    agent_input = {"messages": messages}
    return agent.invoke(agent_input)
```
The ReAct pattern (Reasoning and Acting) allows the agent to:
- Reason about what analysis is needed
- Act by calling the appropriate tools (in this case: code interpreter)
- Observe the results of code execution
- Re-reason and potentially call more tools if needed
This creates a dynamic loop in which the agent can iteratively analyze data, examine results, and refine its approach, which is far more powerful than a single code execution.
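The loop above can be illustrated with a toy sketch. The "model" here is a scripted stand-in for the LLM and the tool returns a canned result; LangGraph's `create_react_agent` implements the same reason/act/observe cycle with a real model and real tools:

```python
def scripted_model(history):
    """Pretend LLM: first requests a tool call, then produces a final answer."""
    if not any(step.startswith("observation:") for step in history):
        return {"action": "code_interpreter", "input": "df['temperature'].mean()"}
    return {"final_answer": "Average temperature computed from the observation."}

def fake_code_interpreter(code):
    """Stand-in for the sandboxed tool: returns a canned result."""
    return f"executed {code!r} -> 72.4"

def react_loop(question, max_steps=5):
    history = [f"question: {question}"]
    for _ in range(max_steps):
        decision = scripted_model(history)                 # Reason
        if "final_answer" in decision:
            return decision["final_answer"], history
        result = fake_code_interpreter(decision["input"])  # Act
        history.append(f"observation: {result}")           # Observe
    return "step limit reached", history

answer, trace = react_loop("What is the mean temperature?")
print(answer)
```

Each observation is fed back into the model's context, which is what lets the agent re-reason and decide whether another tool call is needed.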
The magic happens in the system prompt, which provides the agent with industrial domain expertise:
```python
SYSTEM_PROMPT = """
# Industrial Data Analysis Agent - System Prompt
You are an expert AI agent specialized in industrial data analysis and programming.
You excel at solving complex data problems in manufacturing, process control,
energy systems, and industrial IoT environments.

## Core Capabilities
- Execute Python code using pandas, numpy, scipy
- Handle large datasets with chunking strategies
- Process time-series data, sensor readings, production metrics
- Perform statistical analysis, anomaly detection, predictive modeling

## Industrial Domain Expertise
- Manufacturing processes and production optimization
- Process control systems (PID controllers, SCADA, DCS)
- Industrial IoT sensor data and telemetry
- Quality control and Six Sigma methodologies
- Energy consumption analysis and optimization
- Predictive maintenance and failure analysis
"""
```
The code interpreter tool is wrapped with safety validations:
```python
import ast

from langchain_core.tools import tool


def validate_code_ast(code: str) -> bool:
    """Validate Python code using AST to ensure safety."""
    try:
        ast.parse(code)
        return True
    except SyntaxError:
        return False


@tool
def code_interpreter(code: str) -> str:
    """Executes Python code in a sandboxed environment."""
    # UnsafeCodeError, code_tool and session_name are defined elsewhere in the project
    if not validate_code_ast(code):
        raise UnsafeCodeError("Unsafe code or syntax errors.")
    return code_tool(code_interpreter_input={
        "action": {
            "type": "executeCode",
            "session_name": session_name,
            "code": code,
            "language": "python"
        }
    })
```
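Note that `validate_code_ast` only confirms the code parses; the sandbox is what actually contains it. If you wanted a stricter pre-check, you could walk the AST and reject specific constructs before they ever reach the interpreter. A hypothetical sketch (the blocked-module list is illustrative, not from the project):

```python
import ast

# Illustrative deny-list: modules we refuse to let the agent import at all
BLOCKED_MODULES = {"os", "subprocess", "socket"}

def validate_code_strict(code: str) -> bool:
    """Reject code that fails to parse or imports a blocked module."""
    try:
        tree = ast.parse(code)
    except SyntaxError:
        return False
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # `import os.path` has alias name "os.path"; check the top-level package
            if any(alias.name.split(".")[0] in BLOCKED_MODULES for alias in node.names):
                return False
        elif isinstance(node, ast.ImportFrom):
            if node.module and node.module.split(".")[0] in BLOCKED_MODULES:
                return False
    return True
```

A deny-list like this is easy to bypass (e.g. via `__import__`), so it complements the sandbox rather than replacing it.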
The system uses Claude Sonnet 4 through AWS Bedrock with optimized parameters for industrial analysis:
```python
def get_llm(model: str = DEFAULT_MODEL, max_tokens: int = 4096,
            temperature: float = TemperatureLevel.BALANCED,
            top_k: int = TopKLevel.DIVERSE,
            top_p: float = TopPLevel.CREATIVE) -> BaseChatModel:
    model_kwargs = {
        "max_tokens": max_tokens,
        "temperature": temperature,
        "top_k": top_k,
        "top_p": top_p
    }
    return ChatBedrock(
        model=model,
        client=aws_get_service('bedrock-runtime'),
        model_kwargs=model_kwargs
    )
```
The project includes fake sample industrial data with manufacturing metrics:
- `machine_id`: Equipment identifier
- `shift`: Production shift (A/M/N for morning/afternoon/night)
- `temperature`, `speed`, `humidity`, `pressure`: Process parameters
- `operator_experience`: Years of operator experience
- `scrap_kg`: Quality metric (waste produced)
- `unplanned_stop`: Equipment failure indicator
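Since the post's dataset was generated with ChatGPT, the exact values don't matter, only the schema. A minimal sketch of how a comparable synthetic frame could be produced with numpy (all distributions and value ranges here are illustrative assumptions):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 200

# Synthetic rows matching the columns described above (values are illustrative)
df = pd.DataFrame({
    "machine_id": rng.choice(["M01", "M02", "M03"], size=n),
    "shift": rng.choice(["A", "M", "N"], size=n),       # morning/afternoon/night
    "temperature": rng.normal(70, 5, size=n).round(1),
    "speed": rng.normal(1200, 100, size=n).round(0),
    "humidity": rng.uniform(30, 60, size=n).round(1),
    "pressure": rng.normal(4.5, 0.3, size=n).round(2),
    "operator_experience": rng.integers(0, 20, size=n), # years
    "scrap_kg": rng.exponential(2.0, size=n).round(2),  # waste produced
    "unplanned_stop": rng.integers(0, 2, size=n),       # 0/1 failure indicator
})
df.to_csv("fake_data.csv", index=False)
```

Seeding the generator makes the fake data reproducible, which is handy when comparing agent runs.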
A typical analysis query might be: "Do temperature and speed setpoints vary across shifts?"
The agent will stream the response as it generates it.
The agent will:
1. Load and examine the dataset structure
2. Generate appropriate Python code for analysis
3. Execute the code in a sandboxed environment
4. Provide insights about shift-based variations
5. Suggest process optimization recommendations
```python
import logging

import pandas as pd
from langchain_core.callbacks import StreamingStdOutCallbackHandler

from modules.df_analyzer import analyze_df
from prompts import SYSTEM_PROMPT

logging.basicConfig(
    format='%(asctime)s [%(levelname)s] %(message)s',
    level='INFO',
    datefmt='%d/%m/%Y %X')
logger = logging.getLogger(__name__)


class StreamingCallbackHandler(StreamingStdOutCallbackHandler):
    def on_llm_new_token(self, token: str, **kwargs):
        print(token, end='', flush=True)


df = pd.read_csv('fake_data.csv')
user_prompt = "Do temperature and speed setpoints vary across shifts?"

for chunk in analyze_df(
        user_prompt=user_prompt,
        df=df,
        system_prompt=SYSTEM_PROMPT,
        callbacks=[StreamingCallbackHandler()],
        streaming=True):
    logger.debug(chunk)
```
This project demonstrates the power of agentic AI for specialized domains. Instead of building custom analytics dashboards or writing specific analysis scripts, we provide the agent with:
- Domain Knowledge: Through specialized system prompts
- Tools: Safe code execution capabilities
- Context: The actual data to analyze
The agent can then:
- Generate appropriate analysis code
- Execute it safely
- Interpret results with industrial context
- Provide actionable recommendations
The result is a flexible system that can handle various industrial analysis tasks without pre-programmed solutions. The agent reasons about the problem, writes the necessary code (sandboxed), and provides expert-level insights.
The full code is available on my GitHub.