In Part 1, we learned how to connect to Alibaba Cloud's Qwen model using Python. We asked it general questions and it gave us great answers.
But what if you ask Qwen: "How do I reset my password on my company's internal portal?" or "Summarize this specific legal contract?"
Qwen will fail. Why? Because generic LLMs do not know your private data.
To fix this, we need RAG (Retrieval-Augmented Generation). In this episode, we are going to build a "Chat with PDF" tool. We will feed a PDF document into our system, and Qwen will answer questions based only on that document.
Standard LLMs work like a student taking a closed-book exam. They have to rely on their memory (training data). If they don't know the answer, they might guess (hallucinate).
RAG changes this to an open-book exam: before the model answers, we look up the relevant pages and hand them to it along with the question.
To achieve this, we will use two Alibaba Cloud models:
● Text-Embedding-v3: To convert text into search-friendly numbers.
● Qwen-Plus: To read the text and generate the answer.
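If you haven't called these two models before, here is a rough sketch of what each call looks like through the OpenAI-compatible SDK. The full, working script appears later in this article; client is assumed to be the same OpenAI client we configured in Part 1, pointed at the DashScope endpoint, and the example strings are made up.
# Minimal sketch -- assumes `client` is the OpenAI client configured as in Part 1
# Text-Embedding-v3: turns a sentence into a vector of 1024 numbers
emb = client.embeddings.create(
    model="text-embedding-v3",
    input=["How do I descale the coffee machine?"],
    dimensions=1024
)
vector = emb.data[0].embedding  # e.g. [0.012, -0.348, ...]

# Qwen-Plus: reads the retrieved text and writes the answer in plain language
chat = client.chat.completions.create(
    model="qwen-plus",
    messages=[{"role": "user", "content": "Summarize this page: ..."}]
)
print(chat.choices[0].message.content)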
We need a library to read PDFs. We will use pdfplumber because it handles text extraction very accurately. We also need numpy for the vector math.
Open your terminal and run:
pip install openai numpy pdfplumber python-dotenv
Computers can't understand text directly; they understand numbers. To search a PDF, we have to:
● Split the PDF into chunks (in our case, one chunk per page).
● Convert each chunk into a Vector (a long list of numbers, also called an embedding) using Text-Embedding-v3.
When the user asks a question, we convert their question into a Vector, too. Then, we simply look for the PDF chunk that is "mathematically closest" to the question.
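To make "mathematically closest" concrete, here is a tiny, self-contained sketch with made-up 3-dimensional vectors (the real embeddings from text-embedding-v3 have 1024 dimensions):
import numpy as np

# Toy vectors standing in for real embeddings
question          = np.array([0.9, 0.1, 0.0])
page_about_wifi   = np.array([0.8, 0.2, 0.1])   # similar meaning -> similar numbers
page_about_coffee = np.array([0.0, 0.1, 0.9])   # different meaning -> different numbers

def cosine_similarity(a, b):
    # 1.0 means "same direction", 0.0 means "unrelated"
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine_similarity(question, page_about_wifi))    # high score
print(cosine_similarity(question, page_about_coffee))  # low score
In the script below we skip the division and use the raw dot product, which gives the same ranking as long as the embedding vectors are normalized.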
Create a file named pdf_rag.py.
We are sticking to the OpenAI-Compatible method we used in Episode 1. This ensures your code works globally without region errors.
Copy and paste the following code:
import os
import numpy as np
import pdfplumber
from openai import OpenAI
from dotenv import load_dotenv

# 1. Load API Key
# Make sure you have a .env file with: DASHSCOPE_API_KEY=sk-your_key
load_dotenv()

# 2. Setup Client (International Endpoint)
# We point the base_url to Alibaba Cloud's International server.
client = OpenAI(
    api_key=os.getenv("DASHSCOPE_API_KEY"),
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
)

def extract_text_from_pdf(pdf_path):
    """
    Reads a PDF and splits it into chunks (one chunk per page).
    """
    chunks = []
    if not os.path.exists(pdf_path):
        print(f"Error: File '{pdf_path}' not found.")
        return []

    print(f"Loading {pdf_path}...")
    with pdfplumber.open(pdf_path) as pdf:
        for i, page in enumerate(pdf.pages):
            text = page.extract_text()
            if text:
                # We prepend "Page X" so the AI can cite its sources!
                chunks.append(f"[Page {i+1}] {text}")

    print(f"Successfully loaded {len(chunks)} pages.")
    return chunks

def get_embedding(text):
    """
    Calls Alibaba Cloud 'text-embedding-v3' to turn text into vectors.
    """
    # Clean up newlines to improve embedding quality
    text = text.replace("\n", " ")
    try:
        response = client.embeddings.create(
            model="text-embedding-v3",
            input=[text],
            dimensions=1024
        )
        return response.data[0].embedding
    except Exception as e:
        print(f"Error getting embedding: {e}")
        return []

def find_best_match(query, corpus_embeddings, corpus_text):
    """
    Compares the user's question against all PDF pages using Cosine Similarity.
    """
    # 1. Embed the user's question
    query_vec = get_embedding(query)
    if not query_vec:
        return None

    # 2. Compare against every page in the PDF
    scores = []
    for doc_vec in corpus_embeddings:
        # Dot product is a simple way to measure similarity between normalized vectors
        score = np.dot(query_vec, doc_vec)
        scores.append(score)

    # 3. Get the index of the highest score
    best_idx = np.argmax(scores)
    return corpus_text[best_idx]

# --- MAIN APPLICATION ---
if __name__ == "__main__":
    # SETUP: Put a PDF file in the same folder and name it 'manual.pdf'
    pdf_filename = "manual.pdf"

    print("--- Step 1: Processing PDF ---")
    kb_text = extract_text_from_pdf(pdf_filename)
    if not kb_text:
        print("Exiting: No text found.")
        exit()

    print("--- Step 2: Generating Embeddings (This takes a few seconds) ---")
    kb_vectors = [get_embedding(text) for text in kb_text]

    print("--- Ready! Ask questions about your PDF. ---\n")

    while True:
        user_query = input("You: ")
        if user_query.lower() in ['exit', 'quit']:
            break

        print("Searching document...")

        # 1. RETRIEVE: Find the best page
        best_context = find_best_match(user_query, kb_vectors, kb_text)
        if best_context is None:
            # Guard: if embedding the question failed, don't send "None" to the model
            print("Could not search the document. Please try again.\n")
            continue

        # 2. AUGMENT: Create the prompt
        prompt = f"""
        You are a helpful assistant. Answer the user's question based ONLY on the content below.
        If the answer is not in the content, say "I don't know."

        Document Content:
        {best_context}

        User Question: {user_query}
        """

        # 3. GENERATE: Ask Qwen
        try:
            completion = client.chat.completions.create(
                model="qwen-plus",  # Strong reasoning model
                messages=[
                    {'role': 'system', 'content': 'You are a helpful assistant.'},
                    {'role': 'user', 'content': prompt}
                ]
            )
            print(f"\nQwen: {completion.choices[0].message.content}\n")
            print("-" * 50)
        except Exception as e:
            print(f"Error: {e}")
Rename your PDF to manual.pdf and place it in the same folder as your Python script, then run:
python pdf_rag.py
Example Output:
Imagine I uploaded a PDF about a coffee machine.
You: Why is the red light blinking?
Searching document...
Qwen: According to [Page 4], the red light blinks when the water tank is empty. Please refill the tank.
Notice what just happened:
● Retrieval: We used text-embedding-v3 (International version) to understand the meaning of your question, not just keywords, and found the right page.
● Generation: We used qwen-plus to read that page and explain it to you in plain English.
We didn't need a massive vector database or complex infrastructure. Just a few lines of Python code and Alibaba Cloud Model Studio.
If you want to build this for a real startup or enterprise app:
● Persistent storage: Instead of keeping kb_vectors in a Python list (which disappears when you close the script), store them in Alibaba Cloud AnalyticDB for PostgreSQL. It has built-in vector search.
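The database is the right long-term answer, but even before that you can stop paying to re-embed the same PDF on every run by caching the vectors on disk. Here is a minimal sketch, assuming the extract_text_from_pdf and get_embedding functions from the script above; the cache file names are just placeholders.
import os
import json
import numpy as np

CACHE_FILE = "kb_vectors.npy"   # hypothetical cache file names
TEXT_FILE = "kb_text.json"

if os.path.exists(CACHE_FILE) and os.path.exists(TEXT_FILE):
    # Reuse the embeddings from a previous run
    kb_vectors = np.load(CACHE_FILE).tolist()
    with open(TEXT_FILE, "r", encoding="utf-8") as f:
        kb_text = json.load(f)
else:
    # First run: embed the PDF once, then save the results
    kb_text = extract_text_from_pdf("manual.pdf")
    kb_vectors = [get_embedding(text) for text in kb_text]
    np.save(CACHE_FILE, np.array(kb_vectors))
    with open(TEXT_FILE, "w", encoding="utf-8") as f:
        json.dump(kb_text, f)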
You now have a working RAG chatbot! But what if you want to build an AI that can take action? What if you want it to not just read the manual, but actually book a meeting or send an email?
In Episode 3, we will explore AI Agents and Function Calling. We will teach Qwen how to use tools to interact with the real world.
See you then!
Disclaimer: The views expressed herein are for reference only and don't necessarily represent the official views of Alibaba Cloud.