In this article, we’ll explore how to create an intelligent chat assistant that can remember information across conversations. Solutions like this are increasingly used in software modernization projects to extend existing applications with intelligent assistants and automation capabilities. Using Microsoft’s Semantic Kernel and OpenAI’s powerful models, we’ll build a console application that not only responds to questions but can also store and recall important information—just like a human assistant would.
Package details
Below are the packages in our project, a console application targeting .NET 8.0. Note that the System.Numerics.Tensors package is also required for the TensorPrimitives.CosineSimilarity call we use later:

<PackageReference Include="Microsoft.SemanticKernel" Version="1.67.1" />
<PackageReference Include="Microsoft.Extensions.AI" Version="9.10.0" />
<PackageReference Include="System.Numerics.Tensors" Version="9.0.0" />
And these are the namespaces we need:
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using System.Net.Http.Json;
using System.Numerics.Tensors;
using System.Text.Json;
What We’re Building
Our application will be a conversational AI with three key capabilities:
- Natural conversation using GPT-4o
- Memory storage – saving important information with semantic embeddings
- Memory recall – finding relevant memories using semantic search
Let’s break down how each part works.
Setting Up the Foundation
First, we need to set up our environment and initialize the AI services:
namespace article3_SemanticContext
{
    internal class Program
    {
        static async Task Main()
        {
            Console.WriteLine("=== Semantic-Kernel Chat Console with Memory ===");
            Console.WriteLine("Type 'exit' to quit.");
            Console.WriteLine("Type 'remember: <text>' to save something to memory.");
            Console.WriteLine("Type 'recall: <query>' to search memory.\n");

            // 1. Obtain your OpenAI key
            var apiKey = Environment.GetEnvironmentVariable("OPENAI_API_KEY");
            if (string.IsNullOrWhiteSpace(apiKey))
            {
                Console.Write("OpenAI API key: ");
                apiKey = Console.ReadLine();
            }
This initial setup creates our console interface and handles the OpenAI API key configuration. The key can be provided either through environment variables or direct input.
Initializing the AI Brain
Next, we initialize the Semantic Kernel with OpenAI’s chat completion service:
            // 2. Build the kernel with chat completion
            var builder = Kernel.CreateBuilder();
            builder.AddOpenAIChatCompletion(modelId: "gpt-4o", apiKey: apiKey!);
            var kernel = builder.Build();
            var chatService = kernel.GetRequiredService<IChatCompletionService>();

            // Simple in-memory storage for memories
            var memoryStore = new List<MemoryRecord>();

            // 3. Create the chat history and prime the system prompt
            var chatHistory = new ChatHistory();
            chatHistory.AddSystemMessage("You are a helpful assistant with memory capabilities.");
Here we’re setting up the following crucial components:
- Chat Service: Powered by GPT-4o, this handles the conversational AI
- Memory Store: A simple list that will hold our remembered information
- Chat History: Maintains the conversation context with a system prompt that defines the assistant’s personality
The Main Conversation Loop
The heart of our application is the REPL (Read-Eval-Print Loop) that processes user input:
| 123456789101112131415161718192021222324252627282930313233343536 | // 4. REPL loop
while (true)
{
Console.Write("\nYou: ");
var userText = Console.ReadLine();
if (string.IsNullOrWhiteSpace(userText) ||
userText.Equals("exit", StringComparison.OrdinalIgnoreCase))
break;
// Handle memory commands
if (userText.StartsWith("remember:", StringComparison.OrdinalIgnoreCase))
{
var textToRemember = userText.Substring(9).Trim();
await SaveToMemory(textToRemember, apiKey!, memoryStore);
Console.WriteLine($"\n✓ Saved to memory: {textToRemember}");
continue;
}
if (userText.StartsWith("recall:", StringComparison.OrdinalIgnoreCase))
{
var query = userText.Substring(7).Trim();
await RecallFromMemory(query, apiKey!, memoryStore);
continue;
}
// Normal chat
chatHistory.AddUserMessage(userText);
var assistantReply = await chatService.GetChatMessageContentAsync(chatHistory);
chatHistory.AddAssistantMessage(assistantReply.Content ?? string.Empty);
Console.WriteLine($"\nAssistant: {assistantReply.Content}");
}
} |
This loop handles three types of commands:
- remember: <text> – Stores information in memory
- recall: <query> – Searches for relevant memories
- Normal messages – Regular conversation with the AI
How Memory Storage Works
When you use the remember: command, here’s what happens behind the scenes:
        static async Task SaveToMemory(
            string text,
            string apiKey,
            List<MemoryRecord> memoryStore)
        {
            var embedding = await GenerateEmbeddingAsync(text, apiKey);
            var record = new MemoryRecord
            {
                Id = Guid.NewGuid().ToString(),
                Text = text,
                Embedding = embedding
            };
            memoryStore.Add(record);
        }
The key innovation here is semantic embeddings. Instead of just storing the text, we convert it into a mathematical representation (vector) that captures its meaning.
The Magic of Semantic Search
When you want to recall information, we use cosine similarity to find the most relevant memories:
        static async Task RecallFromMemory(
            string query,
            string apiKey,
            List<MemoryRecord> memoryStore)
        {
            if (memoryStore.Count == 0)
            {
                Console.WriteLine("\n— Memory is empty —");
                return;
            }

            var queryEmbedding = await GenerateEmbeddingAsync(query, apiKey);
            var results = memoryStore
                .Select(record => new
                {
                    Record = record,
                    Similarity = CosineSimilarity(queryEmbedding.Span, record.Embedding.Span)
                })
                .OrderByDescending(x => x.Similarity)
                .Take(3)
                .ToList();

            Console.WriteLine("\n— Memory Recall —");
            foreach (var result in results)
            {
                Console.WriteLine($"• {result.Record.Text} (Similarity: {result.Similarity:F3})");
            }
            Console.WriteLine("——————-");
        }
This is where the real power of semantic memory shines. The system doesn’t just look for keyword matches—it finds conceptually similar information. For example, searching for “feline pets” might recall memories about “cats” even if the word “feline” was never stored.
Generating Embeddings: The Brain’s Understanding
Embeddings are the secret sauce that makes semantic search possible:
        static async Task<ReadOnlyMemory<float>> GenerateEmbeddingAsync(string text, string apiKey)
        {
            using var client = new HttpClient();
            client.DefaultRequestHeaders.Add("Authorization", $"Bearer {apiKey}");

            var request = new
            {
                input = text,
                model = "text-embedding-3-small"
            };

            var response = await client.PostAsJsonAsync(
                "https://api.openai.com/v1/embeddings",
                request);

            if (response.IsSuccessStatusCode)
            {
                var content = await response.Content.ReadAsStringAsync();
                using var document = JsonDocument.Parse(content);
                var embeddingArray = document.RootElement
                    .GetProperty("data")[0]
                    .GetProperty("embedding")
                    .EnumerateArray()
                    .Select(x => x.GetSingle())
                    .ToArray();
                return embeddingArray;
            }
            else
            {
                throw new Exception($"Embedding generation failed: {response.StatusCode}");
            }
        }
Embeddings convert text into a high-dimensional vector (1,536 numbers for OpenAI’s text-embedding-3-small model) where similar meanings produce similar vectors. This mathematical representation allows us to compute how closely related different pieces of text are.
The Similarity Measurement
We use cosine similarity to measure how closely related two pieces of text are:
        static float CosineSimilarity(ReadOnlySpan<float> vector1, ReadOnlySpan<float> vector2)
        {
            return TensorPrimitives.CosineSimilarity(vector1, vector2);
        }
    }
Cosine similarity measures the cosine of the angle between two vectors, giving us a value between -1 (completely opposite) and 1 (identical). Values closer to 1 indicate higher semantic similarity.
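To make the formula concrete, here is a hand-rolled version that computes the same value TensorPrimitives produces, run on a pair of toy three-dimensional vectors (the numbers are purely illustrative, not real embeddings):

```csharp
// cosine(a, b) = dot(a, b) / (|a| * |b|)
float[] a = { 1f, 2f, 3f };
float[] b = { 2f, 4f, 6f }; // b points in the same direction as a

float dot = 0f, sumA = 0f, sumB = 0f;
for (int i = 0; i < a.Length; i++)
{
    dot  += a[i] * b[i]; // accumulate the dot product
    sumA += a[i] * a[i]; // squared magnitude of a
    sumB += b[i] * b[i]; // squared magnitude of b
}

float similarity = dot / (MathF.Sqrt(sumA) * MathF.Sqrt(sumB));
Console.WriteLine(similarity); // 1: identical direction, maximum similarity
```

Because b is just a scaled copy of a, the angle between them is zero and the similarity is exactly 1; two unrelated embeddings would land somewhere well below that.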
Data Structure for Memories
Finally, we define a simple class to store our memories:
public class MemoryRecord
{
    public string Id { get; set; } = string.Empty;
    public string Text { get; set; } = string.Empty;
    public ReadOnlyMemory<float> Embedding { get; set; }
}
Each memory contains the original text and its vector representation, allowing for both human-readable content and machine-understandable semantics.
Real-World Example
Imagine you’re using this assistant to plan a trip:
You: Remember: My passport expires in June 2025
✓ Saved to memory: My passport expires in June 2025
You: remember: I’m allergic to peanuts
✓ Saved to memory: I’m allergic to peanuts
You: recall: travel documents
— Memory Recall —
• My passport expires in June 2025 (Similarity: 0.824)
——————-
You: recall: food restrictions
— Memory Recall —
• I’m allergic to peanuts (Similarity: 0.791)
——————-
The system successfully finds relevant information even when the search terms don’t exactly match the stored text.
Why This Matters
This approach demonstrates several important AI concepts:
- Semantic Understanding: The system understands meaning, not just keywords
- Contextual Memory: Memories are retrieved based on conceptual relevance
- Scalable Architecture: While we use simple in-memory storage, this pattern can scale to databases like Redis or vector databases like Pinecone
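To keep that swap painless, the `List<MemoryRecord>` could sit behind a small interface. The sketch below (the interface and its member names are my own suggestion, not part of the article’s code) shows the shape such an abstraction might take, with the in-memory implementation mirroring what we built above:

```csharp
// Hypothetical storage abstraction: an in-memory list, Redis, or a vector
// database like Pinecone could each provide an implementation.
public interface IMemoryStore
{
    Task SaveAsync(MemoryRecord record);
    Task<IReadOnlyList<MemoryRecord>> SearchAsync(
        ReadOnlyMemory<float> queryEmbedding, int top);
}

// In-memory implementation mirroring the article's List-based approach.
public sealed class InMemoryStore : IMemoryStore
{
    private readonly List<MemoryRecord> _records = new();

    public Task SaveAsync(MemoryRecord record)
    {
        _records.Add(record);
        return Task.CompletedTask;
    }

    public Task<IReadOnlyList<MemoryRecord>> SearchAsync(
        ReadOnlyMemory<float> queryEmbedding, int top)
    {
        // Rank every stored record by cosine similarity and keep the best matches.
        IReadOnlyList<MemoryRecord> results = _records
            .OrderByDescending(r => TensorPrimitives.CosineSimilarity(
                queryEmbedding.Span, r.Embedding.Span))
            .Take(top)
            .ToList();
        return Task.FromResult(results);
    }
}
```

With this in place, `SaveToMemory` and `RecallFromMemory` would depend only on `IMemoryStore`, so moving to a persistent or hosted vector store becomes a matter of writing one new class rather than touching the chat loop.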
This foundation can be extended to build more sophisticated AI applications like personalized assistants, intelligent documentation systems, or context-aware applications that adapt to user preferences and history.
The complete code provides a working example of how to combine conversational AI with semantic memory, creating a more intelligent and context-aware application that truly understands and remembers.