Optimizing Knowledge Graphs for AI Search and Discovery

Summary:
A behind-the-scenes look at our knowledge graph development.
We share our technical journey working with GraphRAG, showing how intentional design choices led to an efficient 9.1 MB implementation.
Our approach to knowledge graph development for AI Discovery

The ideas behind knowledge graphs (KGs) have been around for centuries: their origins trace back to the roots of graph theory itself. While graph theory has long been an integral part of mathematical and scientific studies, it is only recently that graph databases have been standardized and productionized. Many now offer the ACID guarantees that developers expect from business-ready databases, and they solve real-world problems across various fields, from business analytics to sociology. Yet, despite these advances, graph databases are still an emerging technology, and many organizations find them challenging to adopt because of their complexity. Creating an effective KG means organizing information into nodes (entities) and the relationships between them. It is a sophisticated technical task that mirrors human cognitive processes but requires significant expertise to implement.
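To make "nodes and relationships" concrete, here is a minimal Python sketch of a knowledge graph held in memory. It is only an illustration: the `KnowledgeGraph` class and the relationship labels (`FOUNDED_IN`, `RELOCATED_TO`, `PUBLISHES`) are names we made up for this example, not the schema used in our actual graph.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """A directed, labeled edge between two entities (nodes)."""
    source: str
    label: str
    target: str


class KnowledgeGraph:
    """A tiny in-memory graph: entities are nodes, relationships are edges."""

    def __init__(self):
        self.entities = set()
        self.outgoing = defaultdict(list)  # entity -> relationships leaving it

    def add(self, source, label, target):
        """Register both entities and the labeled edge between them."""
        rel = Relationship(source, label, target)
        self.entities.update((source, target))
        self.outgoing[source].append(rel)
        return rel

    def neighbors(self, entity, label=None):
        """Return targets reachable from `entity`, optionally filtered by label."""
        return [
            r.target
            for r in self.outgoing.get(entity, [])
            if label is None or r.label == label
        ]


kg = KnowledgeGraph()
# Hypothetical labels, chosen for illustration only.
kg.add("Living Assets", "FOUNDED_IN", "Copenhagen")
kg.add("Living Assets", "RELOCATED_TO", "Los Angeles")
kg.add("Living Assets", "PUBLISHES", "Living Assets Lab")

print(kg.neighbors("Living Assets", "RELOCATED_TO"))
```

Even at this scale the design question is visible: the graph is only as useful as the labels you choose, which is why the relationships need to be created intentionally.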

So, how does one make sense of KGs in order to make them useful? By addressing the core problem: each graph has to be unique, reasonably scaled, and built from a critical, timely, and informative source.

At Living Assets, we’ve started applying the entity relationship mapping principles we developed for AI Search, or Search Agent Optimization (SAO), to various types of files to make them efficient and effective resources for LLMs. In our upcoming product launch, codenamed Beacon, knowledge graphs go from unmanageable and expensive undertakings to useful tools in a number of business and enterprise situations. From information retrieval (as in this example) to digital marketing, KGs can unlock new opportunities across the Agentic Web.

If this sounds complex, that's because it is!

Before Fine-Tuning: Image shows the KG representation of our blog, Living Assets Lab, without any fine-tuning. As you can see, relationships are somewhat random and concepts are not linked together.

This complexity is partly why graph databases are still considered to be in their infancy: a significant learning curve hinders broader adoption, and graph databases are still really hard to organize. LLMs can certainly help, but as the example above shows, where we used a basic LLM prompt and an open-source RAG library without any fine-tuning, they are still not good at creating meaningful relationships without guidance from a human being.

Open-source tools like GraphRAG aim to simplify knowledge graph creation so that more developers and companies can use this technology to strengthen their LLMs’ RAG capabilities, but out of the box they still fall short.

Why Businesses Struggle with LLMs and Knowledge Graphs

Every graph must be created with specific intentions. To illustrate this, we compare two examples of KGs and LLM prompt engineering: one without any fine-tuning and the other with Living Assets entity relationship mapping.

In this first example, with no fine-tuning, we created a knowledge graph of our entire blog, Living Assets Lab, and asked an LLM with access to the graph the following question:

When did Living Assets move from Copenhagen to Los Angeles?

The response was a typical long-winded LLM answer, missing the mark.

Living Assets' strategic relocation to Los Angeles reflects a deliberate move to be closer to its core user base and investors…. It underscores a theme of strategic engagement and collaboration, aimed at fostering closer ties with the ecosystem that Living Assets seeks to serve and innovate within….

Blah, blah, blah: A Typical LLM Response That Doesn’t Answer the Question.

Since we know the dataset very well, we know that it includes an article specifically about our decision to move from our founding home, Copenhagen, Denmark, to Los Angeles, USA. The response? Disappointing, but remember that this is with no guidance and no special technique. It shows that creating an efficient graph isn’t simply a matter of opening ChatGPT and telling it to “create a knowledge graph”.

Next, we fine-tuned the knowledge graph creation process and carefully chose a combination of models, claude-sonnet-3 and chatgpt-4o, to create the KG. As the snippet of the relations shows, the result makes much more sense.

Image shows the fine-tuned knowledge graph representation of our blog, Living Assets Lab. Relationships are intuitive and make sense, allowing LLMs to efficiently query the graph to answer user questions.

We can improve on this implementation further by customizing the knowledge graph creation pipeline to fit our needs. When we rebuild the knowledge graph using custom settings, Living Assets’ proprietary configurations, and a more appropriate ensemble of models, even the XML representation of the graph becomes much clearer.
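For readers who have not seen a graph written out as XML, here is a rough sketch of what serializing a few relations could look like. The element and attribute names (`graph`, `node`, `edge`, `label`) and the two sample relations are assumptions for illustration; they are loosely modeled on GraphML-style files and are not the exact format or content of our graph.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample relations as (source, label, target) triples.
relations = [
    ("Living Assets", "FOUNDED_IN", "Copenhagen"),
    ("Living Assets", "RELOCATED_TO", "Los Angeles"),
]


def to_xml(triples):
    """Serialize triples into a simple node/edge XML document."""
    root = ET.Element("graph", edgedefault="directed")
    seen = set()
    for source, label, target in triples:
        # Emit each entity once as a <node> before the edge that uses it.
        for name in (source, target):
            if name not in seen:
                seen.add(name)
                ET.SubElement(root, "node", id=name)
        edge = ET.SubElement(root, "edge", source=source, target=target)
        edge.set("label", label)
    return ET.tostring(root, encoding="unicode")


xml_text = to_xml(relations)
print(xml_text)
```

When the relationship labels are meaningful, a file like this is readable by a person as well as by an LLM, which is the kind of clarity we were aiming for.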

We then asked the same question again.

When did Living Assets move from Copenhagen to Los Angeles?

The response? It clearly answers our question and adds accurate, informative detail by pulling directly from the source article.

Living Assets transitioned from Copenhagen to Los Angeles as part of its evolution from the claai project. This move was highlighted in a blog post by Dima Durah, published on September 30, 2024… The decision to move was driven by the desire to be closer to the user base and investors… This move signifies a new chapter for Living Assets…

This visualization demonstrates how we created our graph with the intentional connections that we ourselves know to exist. As a result, any LLM, even a third-party one, can follow the more intuitive relationships, and an AI agent can now understand the true contents of Living Assets Lab and answer very specific questions. Best of all, the entire knowledge graph takes up just 9.1 MB on disk, a very modest resource requirement.
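Why do explicit relationships help an LLM answer a question like ours? The sketch below is a deliberately simplified stand-in for graph-based retrieval, not GraphRAG's actual algorithm: it finds the entities named in the question, then collects every fact within a couple of hops so they can be handed to the model as context. The triples reuse details mentioned in this post, while the labels and the helper names `find_entities` and `retrieve_facts` are hypothetical.

```python
# Hypothetical triples built from details mentioned in this post.
triples = [
    ("Living Assets", "FOUNDED_IN", "Copenhagen"),
    ("Living Assets", "RELOCATED_TO", "Los Angeles"),
    ("Relocation blog post", "DESCRIBES_MOVE_OF", "Living Assets"),
    ("Relocation blog post", "WRITTEN_BY", "Dima Durah"),
    ("Relocation blog post", "PUBLISHED_ON", "2024-09-30"),
]


def find_entities(question, triples):
    """Return graph entities whose names appear verbatim in the question."""
    names = {t[0] for t in triples} | {t[2] for t in triples}
    lowered = question.lower()
    return {n for n in names if n.lower() in lowered}


def retrieve_facts(question, triples, hops=2):
    """Collect triples within `hops` edges of the entities named in the question."""
    frontier = find_entities(question, triples)
    visited = set(frontier)
    facts = []
    for _ in range(hops):
        next_frontier = set()
        for triple in triples:
            source, _label, target = triple
            touches = source in frontier or target in frontier
            if touches and triple not in facts:
                facts.append(triple)
                # Newly discovered entities are expanded on the next hop.
                next_frontier.update({source, target} - visited)
        visited |= next_frontier
        frontier = next_frontier
    return facts


question = "When did Living Assets move from Copenhagen to Los Angeles?"
for source, label, target in retrieve_facts(question, triples):
    print(f"{source} -[{label}]-> {target}")
```

In this toy graph the publication date sits two hops away from the entities in the question, so it is only retrieved because the blog post is explicitly linked to Living Assets. Without that intentional connection, the model has nothing specific to work with.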

The Takeaway:

Through our work at Living Assets, we've demonstrated that, with the right approach, KGs go from complex, resource-heavy systems to lean, practical tools for the AI era. This compact knowledge graph implementation effectively organized and enhanced content for AI search and discovery with only a 9.1 MB footprint.

Our experiment with GraphRAG shows that while AI tools are advancing rapidly, the human element remains crucial in creating meaningful relationships within KGs. By combining technical expertise with intentional design, we are able to build bridges between traditional content organization and the emerging needs of the Agentic Web.

You’ve Made It This Far!

Living Assets specializes in creating targeted knowledge graphs that revolutionize digital marketing strategies. Our approach combines traditional SEO principles with innovative Search Agent Optimization (SAO) techniques, designed specifically for the emerging Agentic Web. In this AI-first world, where LLMs and AI-powered search engines dominate, we help businesses increase their visibility and conversion rates through specialized knowledge graph implementations. Our focus on precise terminology and semantic relationships enables businesses to effectively communicate with both human users and AI agents across digital marketplaces, social media, and the broader web.

Want to be among the first to implement our knowledge graphs for your business? We’d love to hear from you.

Dima Durah

Dec 23, 2024
