Building a RAG powered AI Agent using Langchain.js

Recently I was challenged to create a conversational AI Agent capable of reading a PDF and answering questions about it using Langchain.js. I had tried Langchain in the past and didn't like the experience. I found it over-abstracted, complex to set up, and verbose, in the sense that I had to write many lines of code just to do simple things. I had also tried it in Python, a programming language with which I am not as comfortable and experienced as I am with Javascript.

For most of my AI Agents, I have been using Vercel's AI SDK. For someone with a strong background in React and Next.js, it was perfect for me. I understand that the Langchain community is very active and the framework has improved a lot since the last time I tried it. Now that the Javascript version of the framework is more robust and production-ready, I decided to leave my comfort zone and give it a try.

The simplest use case

The idea was to read a book containing the rules of an RPG named Pathfinder. It is a 578-page PDF divided into 15 chapters. For simplicity and speed, I decided to save the embeddings in a local vector store called HNSWLib.

My approach for chunking and embedding the contents of the book was quite simple. First I would check whether a vector store was already saved in my local storage. If so, I would use it for my RAG; otherwise I would parse the book using Langchain's PDFLoader and split the content into chunks using RecursiveCharacterTextSplitter. I chose a chunk size of 1,000 characters and an overlap of 200. Based on the content of the book, these values seemed good enough to guarantee the relevant information would be fetched without inflating the token count of the LLM's context too much.

After getting the chunks, I used OpenAIEmbeddings to convert them into vectors before saving them to my database.

Here you can see the complete source code for this implementation:

import { PDFLoader } from "@langchain/community/document_loaders/fs/pdf";
import { OpenAIEmbeddings } from "@langchain/openai";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { Document } from "@langchain/core/documents";
import path from "path";
import { HNSWLib } from "@langchain/community/vectorstores/hnswlib";

// Load the PDF file
export const loadPDF = async (filePath: string) => {
    const loader = new PDFLoader(filePath);
    const docs = await loader.load();
    return docs;
};

// Split the PDF into chunks
export const splitDocs = async (docs: Document[]) => {
    const splitter = new RecursiveCharacterTextSplitter({
        chunkSize: 1000, // max characters per chunk
        chunkOverlap: 200, // overlap between chunks so context isn't cut at the boundaries
    });

    return await splitter.splitDocuments(docs);
};

// Get or create the vector store
export async function getOrCreateVectorStore() {
    const directory = "vectorstore";
    const embeddings = new OpenAIEmbeddings({ apiKey: process.env.OPEN_AI_API_KEY });

    try {

        // Try to load existing vector store
        const existingStore = await HNSWLib.load(
            directory,
            embeddings
        );

        console.log("✅ Vector store already exists, loaded from disk");
        return existingStore;

    } catch (error) {

        // Vector store doesn't exist, create it
        console.log("🆕 Creating new vector store...");

        const filePath = path.resolve("pathfinder_rule_book.pdf");
        const documents = await loadPDF(filePath);

        const splitDocuments = await splitDocs(documents);
        const vectorStore = await HNSWLib.fromDocuments(splitDocuments, embeddings);

        await vectorStore.save(directory);

        console.log("✅ New vector store created and saved locally");
        return vectorStore;
    }

}
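
Before wiring the store into an agent, you can sanity-check it by loading it and running a quick similarity search. Here is a minimal sketch; the query string is just an example:

import { getOrCreateVectorStore } from "./rag.mts";

const vectorStore = await getOrCreateVectorStore();

// Fetch the 3 chunks most similar to an example query
const results = await vectorStore.similaritySearch("How does combat initiative work?", 3);

for (const doc of results) {
    console.log(doc.metadata.source, doc.pageContent.slice(0, 200));
}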

Creating a RAG tool using Langchain.js

Now that the vector store and the embeddings are all set up, we can create the tool responsible for the RAG using Langchain's tool helper.

import { getOrCreateVectorStore } from "./rag.mts";
import { z } from "zod";
import { tool } from "@langchain/core/tools";

const vectorStore = await getOrCreateVectorStore();

const retrieveSchema = z.object({ query: z.string() });

const retrieve = tool(
    async ({ query }) => {
        console.log("query", query);
        const retrievedDocs = await vectorStore.similaritySearch(query, 10);
        console.log("retrievedDocs", retrievedDocs.length);
        const serialized = retrievedDocs
            .map(
                (doc) => `Source: ${doc.metadata.source}\nContent: ${doc.pageContent}`
            )
            .join("\n");
        return serialized;
    },
    {
        name: "retrieve",
        description: "Retrieve information about the Pathfinder RPG.",
        schema: retrieveSchema,
    }
);
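
Because tools created with the tool helper are runnables, you can also invoke retrieve directly, which is handy for testing the retrieval step in isolation before handing it to an agent. A quick sketch, with an example query:

const context = await retrieve.invoke({ query: "What are the core ability scores?" });
console.log(context);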

Creating an Agent using Langchain.js

Now that we have the RAG tool, we need to create an Agent. An important feature for any conversational AI is the ability to keep the conversation history in memory. This can be quite tricky, because feeding the LLM the entire list of questions and answers can push it over its token limit and break it. There are several strategies for managing history without letting token usage grow unbounded, such as summarising the conversation down to a target token count, or keeping only the most recent messages that fit within a token budget, as sketched below.
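
To illustrate the second strategy, here is a minimal hand-rolled sketch that keeps only the newest messages that fit within a rough token budget. The 4-characters-per-token estimate is an assumption for illustration, not an exact count:

import { BaseMessage } from "@langchain/core/messages";

// Rough token estimate: ~4 characters per token (an approximation)
const estimateTokens = (message: BaseMessage) =>
    Math.ceil(String(message.content).length / 4);

// Walk the history backwards, keeping the newest messages that fit the budget
export const keepRecentMessages = (messages: BaseMessage[], maxTokens: number) => {
    const kept: BaseMessage[] = [];
    let total = 0;
    for (let i = messages.length - 1; i >= 0; i--) {
        const cost = estimateTokens(messages[i]);
        if (total + cost > maxTokens) break;
        kept.unshift(messages[i]);
        total += cost;
    }
    return kept;
};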

Langchain provides an implementation called MemorySaver that allows us to delegate this responsibility to the framework instead of having to worry about it ourselves. For this simple test use case, it works fine. To create an Agent that relies on the Llama 3.3 70B Versatile LLM and uses MemorySaver to handle history, we can do as follows:

import { ChatGroq } from "@langchain/groq"
import { MemorySaver } from "@langchain/langgraph";
import { createReactAgent } from "@langchain/langgraph/prebuilt";

const agentTools = [retrieve];
const agentModel = new ChatGroq({
    apiKey: process.env.GROQ_API_KEY,
    model: "llama-3.3-70b-versatile",
    temperature: 0.0,
})
// Note: createReactAgent binds the tools to the model internally,
// so a separate agentModel.bindTools() call is not needed

const agentCheckpointer = new MemorySaver();
const agent = createReactAgent({
    llm: agentModel,
    tools: agentTools,
    checkpointSaver: agentCheckpointer,
    prompt: "You are a helpful assistant that can answer questions about the Pathfinder RPG."
})

Now we want to allow the user to ask questions to the Agent. I decided to use the terminal, which is the simplest way to provide a conversational textual interface for our AI. When you run the script, you can start talking to the Agent simply by typing questions into your command-line terminal:

import readline from "readline";
import { HumanMessage } from "@langchain/core/messages";

const rl = readline.createInterface({
    input: process.stdin,
    output: process.stdout,
});

const run = async () => {

    const askQuestion = () => {
        rl.question("\nAsk your question (or type 'exit' to quit): \n", async (question) => {
            if (question.toLowerCase() === "exit") {
                rl.close();
                process.exit(0);
            }

            const result = await agent.invoke(
                { messages: [new HumanMessage(question)] },
                { configurable: { thread_id: "42" } }
            );

            const lastMessage = result.messages.at(-1);
            console.log(lastMessage?.content);

            askQuestion();
        });
    };

    askQuestion();
};

run();

Conclusion

Well, that was fast. With less than 200 lines of code, it was possible to create a fully RAG-powered AI Agent using Langchain in Javascript. It was very interesting to see how much the framework has evolved since I last tried it, almost six months ago. I believe the Langchain community has been doing great work. There are still things I like and don't like about Langchain, but it seems to be gaining a lot of momentum, and the community is still growing.

If you want to check the full source code, it is available in my Github repo. The PDF book used for this example is also included in the repo. I understand it might not be good practice to include a PDF with more than 500 pages in a Github repository, but for the simplicity and learning purposes of this tutorial I decided to leave it like this.

I hope you enjoyed the read, and feel free to follow me on social media to keep up to date with new content on this blog!
