What Is an AI Agent?
An AI Agent is an intelligent entity capable of perceiving its environment, making decisions, and executing actions. Its functionality is built on a large language model (LLM). However, unlike a direct conversation with an LLM, an AI Agent can think independently, use tools, and complete a given task step by step. Depending on the developer’s skill level, it can accomplish a wide range of specialized tasks.
In this case study, we will build a simple single-agent system based on the DeepSeek large language model. The execution logic is straightforward and consists of the following steps: Definition + Observation + Thinking + Action + Memory.
Since an offline LLM lacks internet retrieval capabilities, it requires data input to learn from. This ensures that whenever we activate the Agent, it is already prepared to assume its designated role and has sufficient knowledge to answer our queries. During use, it records user-approved responses, stores them in a database, and continues learning from them.
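The "Memory" step described above can be sketched with a small JSON-backed store of user-approved answers. This is a minimal illustration only; the file name and function names are assumptions, and the original project may persist memory differently.

```python
import json
import os

# Hypothetical memory store: keep user-approved question/answer pairs in a
# JSON file so the Agent can reuse them on later runs. The file name is an
# assumption for this sketch.
MEMORY_FILE = "approved_answers.json"

def load_memory(path=MEMORY_FILE):
    """Load previously approved question/answer pairs, if any exist."""
    if os.path.exists(path):
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)
    return []

def remember(question, answer, path=MEMORY_FILE):
    """Append a user-approved answer to the memory file."""
    memory = load_memory(path)
    memory.append({"question": question, "answer": answer})
    with open(path, "w", encoding="utf-8") as f:
        json.dump(memory, f, ensure_ascii=False, indent=2)

remember("How many joints does the myCobot 280 have?", "Six.")
print(len(load_memory()))
```

In a fuller system, the stored pairs could be prepended to the prompt context so the Agent benefits from earlier approved answers.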
myCobot 280 Pi
The myCobot 280 series, created by Elephant Robotics, is a line of 6-DOF collaborative robot arms designed primarily for personal DIY projects, education, and research applications. The myCobot 280 Pi is equipped with a Raspberry Pi as its control board, so it can be used without a PC. Moreover, a full Python API is provided for easy control, designed to be user-friendly and approachable for beginners.
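To give a flavor of that Python API, here is a minimal sketch of commanding the arm through pymycobot. The serial port, baud rate, and the ±165° joint limit used here are assumptions; check the myCobot 280 documentation for the exact values for your unit.

```python
# Hedged sketch: clamp joint angles to a safe range before sending them to
# the arm. The 165-degree limit is an assumption for illustration.
def clamp_angles(angles, limit=165.0):
    """Clamp each of the six joint angles to [-limit, limit]."""
    return [max(-limit, min(limit, a)) for a in angles]

target = clamp_angles([0, 30, -200, 45, 0, 90])
print(target)

# On the robot itself (requires the pymycobot package and a connected arm):
# from pymycobot.mycobot import MyCobot
# mc = MyCobot("/dev/ttyAMA0", 1000000)   # port and baud rate are assumptions
# mc.send_angles(target, 50)              # move all six joints at speed 50
```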
Project Setup
1. Provide Knowledge Input
To enable the Agent to function effectively, we need to create a knowledge base and input relevant information.
We save the following information as separate DOCX files:
● Introduction to myCobot
● Technical details of 6 DOF collaborative robotic arms
● Usage instructions for the pymycobot API function library
(These resources can be found on myCobot’s GitBook.)
For example, save the reference text as ‘.docx’ documents.
2. Load the DeepSeek Model
Before that, you need to obtain your own API key from the official DeepSeek website.
Then, we pass the knowledge to the DeepSeek model as context in our code.
import os
from docx import Document
from openai import OpenAI

def extract_text_from_word(doc_path):
    """Extract text from a Word (.docx) file."""
    doc = Document(doc_path)
    return "\n".join([para.text for para in doc.paragraphs])

def load_local_documents(directory):
    """Read all Word documents in a directory."""
    texts = []
    for filename in os.listdir(directory):
        if filename.endswith(".docx"):
            file_path = os.path.join(directory, filename)
            text = extract_text_from_word(file_path)
            texts.append(text)
    return texts

word_documents = load_local_documents(r"E:\MyCode\Agent_Deepseek\RobotData")
context = "\n".join(word_documents)  # Merge all text

client = OpenAI(
    api_key="xxxx {Your API}",
    base_url="https://api.deepseek.com"
)

query = ""

completion = client.chat.completions.create(
    model="deepseek-chat",
    temperature=0.6,
    messages=[
        {"role": "system", "content": "You are mainly researching Python tasks for collaborative robotic arms. You are familiar with and proficient in the Python language, and can utilize the 'pymycobot' robotic API interface to provide a complete Python code that can be used."},
        {"role": "user", "content": f"Word reference:\n{context}\n\n:{query}"}
    ]
)
print(completion.choices[0].message.content)
After executing the code, the DeepSeek model will generate a complete myCobot example script based on our input.
3. Output Formatting
LLM-generated output is typically presented as a continuous text stream, which cannot be directly executed as a program.
To allow the Agent to achieve our desired result, automatically saving AI-generated code as a ‘.py’ file and executing it to control the robot, we must format the output properly.
We can use regular expressions to extract the Python code from the DeepSeek model’s response and save it as a file.
import re

# Use a regular expression to match fenced Python code blocks
code_pattern = r"```python(.*?)```"
matches = re.findall(code_pattern, message_content, re.DOTALL)

# If code blocks are found, extract them
if matches:
    python_code = "\n".join(matches).strip()
else:
    # If no Markdown code block is found, fall back to the raw response
    python_code = message_content.strip()

# Specify the Python file path
file_path = "generated_script.py"

# Write the extracted Python code to a file
with open(file_path, "w", encoding="utf-8") as f:
    f.write(python_code)

print(f"Python code has been saved to {file_path}")
4. Execute the Script Automatically
To enable automatic execution, we need to call the system terminal and run the generated script.
import subprocess

def execute_command(command):
    """Execute a shell command and return stdout and stderr."""
    try:
        result = subprocess.run(command, shell=True, capture_output=True, text=True)
        return result.stdout, result.stderr
    except Exception as e:
        return None, str(e)

def run_command():
    """Run the generated script and print its output."""
    command = "conda activate base && python generated_script.py"
    print(f"\nRunning command: {command}")
    stdout, stderr = execute_command(command)
    if stdout:
        print(f"\nOutput:\n{stdout}")
    if stderr:
        print(f"\nError:\n{stderr}")

# Run the command
run_command()
5. Test with Robot
By adding a while True: loop to the execution process, we can run tasks continuously.
At this point, we have successfully built a simple AI Agent to control the myCobot robot. We can now connect the robot and test its functionality.
Code
import osfrom docx import Documentfrom openai import OpenAIimport subprocessimport re def extract_text_from_word(doc_path): """Extract From Word (.docx) """ doc = Document(doc_path) return "\n".join([para.text for para in doc.paragraphs]) def load_local_documents(directory): """Read Word """ texts = [] for filename in os.listdir(directory): if filename.endswith(".docx"): file_path = os.path.join(directory, filename) text = extract_text_from_word(file_path) texts.append(text) return texts def execute_command(command): """Executes a shell command and returns stdout and stderr.""" try: result = subprocess.run(command, shell=True, capture_output=True, text=True) return result.stdout, result.stderr except Exception as e: return None, str(e) def run_command(): """Runs the specific command and prints the output.""" command = "conda activate base && E: && cd E:\MyCode\Agent_Deepseek\RobotData && python generated_script.py" #print(f"\nRunning command: {command}") stdout, stderr = execute_command(command) if stdout: print(f"\nOutput:\n{stdout}") if stderr: print(f"\nError:\n{stderr}") word_documents = load_local_documents("E:\MyCode\Agent_Deepseek\RobotData")context = "\n".join(word_documents) # Merge all text client = OpenAI( api_key="xxxx {Your API}", base_url="https://api.deepseek.com") while True: # Get user input query = input("\n input command ( 'exit' ):") if query.lower() == "exit": print("exit") break completion = client.chat.completions.create( model="deepseek-chat", temperature=0.6, messages=[ {"role": "system", "content": "You are mainly researching Python tasks for collaborative robotic arms. 
You are familiar with and proficient in the Python language, and can utilize the 'pymycobot' robotic API interface to provide a complete Python code that can be used."}, {"role": "user", "content": f"reference text:\n{context}\n\n:{query},generate Python Script"} ] ) # Extract the generated Python code message_content = completion.choices[0].message.content code_pattern = r"```python(.*?)```" # Extract code between ```python ... ``` matches = re.findall(code_pattern, message_content, re.DOTALL) if matches: python_code = "\n".join(matches).strip() else: python_code = message_content.strip() # Save the extracted Python code to a file file_path = "E:\\MyCode\\Agent_Deepseek\\RobotData\\generated_script.py" with open(file_path, "w", encoding="utf-8") as f: f.write(python_code) print(f" running..... ") # Run the generated script run_command()
Summary
By building a simple AI agent to control the 6-axis cobot myCobot 280 Pi, we have learned how to create a basic LLM-driven robotics application.
Since this version of the agent does not include additional vision models, speech models, or sensors, it can only perform simple actions. However, if developers integrate vision, speech processing, and sensors, the AI Agent can autonomously complete far more complex tasks.