In this article, we’ll explore how to create an intelligent chat assistant that can remember information across conversations. Solutions like this are increasingly used in software modernization projects to extend existing applications with intelligent assistants and automation capabilities. Using Microsoft’s Semantic Kernel and OpenAI’s powerful models, we’ll build a console application that not only responds to questions but can also store and recall important information—just like a human assistant would.
Package details
Below are the packages in our project, a console application targeting .NET 8.0. Note that the System.Numerics.Tensors package is also required for the TensorPrimitives.CosineSimilarity call we use later:

<PackageReference Include="Microsoft.SemanticKernel" Version="1.67.1" />
<PackageReference Include="Microsoft.Extensions.AI" Version="9.10.0" />
<PackageReference Include="System.Numerics.Tensors" Version="9.0.0" />
And these are the namespaces we need:
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using System.Net.Http.Json;
using System.Numerics.Tensors;
using System.Text.Json;
What We’re Building
Our application will be a conversational AI with three key capabilities:
- Natural conversation using GPT-4o
- Memory storage – saving important information with semantic embeddings
- Memory recall – finding relevant memories using semantic search
Let’s break down how each part works.
Setting Up the Foundation
First, we need to set up our environment and initialize the AI services:
namespace article3_SemanticContext
{
    internal class Program
    {
        static async Task Main()
        {
            Console.WriteLine("=== Semantic-Kernel Chat Console with Memory ===");
            Console.WriteLine("Type 'exit' to quit.");
            Console.WriteLine("Type 'remember: <text>' to save something to memory.");
            Console.WriteLine("Type 'recall: <query>' to search memory.\n");

            // 1. Obtain your OpenAI key
            var apiKey = Environment.GetEnvironmentVariable("OPENAI_API_KEY");
            if (string.IsNullOrWhiteSpace(apiKey))
            {
                Console.Write("OpenAI API key: ");
                apiKey = Console.ReadLine();
            }
This initial setup creates our console interface and handles the OpenAI API key configuration. The key can be provided either through environment variables or direct input.
Initializing the AI Brain
Next, we initialize the Semantic Kernel with OpenAI’s chat completion service:
            // 2. Build the kernel with chat completion
            var builder = Kernel.CreateBuilder();
            builder.AddOpenAIChatCompletion(modelId: "gpt-4o", apiKey: apiKey!);
            var kernel = builder.Build();
            var chatService = kernel.GetRequiredService<IChatCompletionService>();

            // Simple in-memory storage for memories
            var memoryStore = new List<MemoryRecord>();

            // 3. Create the chat history and prime the system prompt
            var chatHistory = new ChatHistory();
            chatHistory.AddSystemMessage("You are a helpful assistant with memory capabilities.");
Here we’re setting up the following crucial components:
- Chat Service: Powered by GPT-4o, this handles the conversational AI
- Memory Store: A simple list that will hold our remembered information
- Chat History: Maintains the conversation context with a system prompt that defines the assistant’s personality
The Main Conversation Loop
The heart of our application is the REPL (Read-Eval-Print Loop) that processes user input:
| 123456789101112131415161718192021222324252627282930313233343536 | // 4. REPL loop
while (true)
{
Console.Write("\nYou: ");
var userText = Console.ReadLine();
if (string.IsNullOrWhiteSpace(userText) ||
userText.Equals("exit", StringComparison.OrdinalIgnoreCase))
break;
// Handle memory commands
if (userText.StartsWith("remember:", StringComparison.OrdinalIgnoreCase))
{
var textToRemember = userText.Substring(9).Trim();
await SaveToMemory(textToRemember, apiKey!, memoryStore);
Console.WriteLine($"\n✓ Saved to memory: {textToRemember}");
continue;
}
if (userText.StartsWith("recall:", StringComparison.OrdinalIgnoreCase))
{
var query = userText.Substring(7).Trim();
await RecallFromMemory(query, apiKey!, memoryStore);
continue;
}
// Normal chat
chatHistory.AddUserMessage(userText);
var assistantReply = await chatService.GetChatMessageContentAsync(chatHistory);
chatHistory.AddAssistantMessage(assistantReply.Content ?? string.Empty);
Console.WriteLine($"\nAssistant: {assistantReply.Content}");
}
} |
This loop handles three types of commands:
- remember: <text> – Stores information in memory
- recall: <query> – Searches for relevant memories
- Normal messages – Regular conversation with the AI
How Memory Storage Works
When you use the remember: command, here’s what happens behind the scenes:
        static async Task SaveToMemory(
            string text,
            string apiKey,
            List<MemoryRecord> memoryStore)
        {
            var embedding = await GenerateEmbeddingAsync(text, apiKey);
            var record = new MemoryRecord
            {
                Id = Guid.NewGuid().ToString(),
                Text = text,
                Embedding = embedding
            };
            memoryStore.Add(record);
        }
The key innovation here is semantic embeddings. Instead of just storing the text, we convert it into a mathematical representation (vector) that captures its meaning.
The Magic of Semantic Search
When you want to recall information, we use cosine similarity to find the most relevant memories:
        static async Task RecallFromMemory(
            string query,
            string apiKey,
            List<MemoryRecord> memoryStore)
        {
            if (memoryStore.Count == 0)
            {
                Console.WriteLine("\n— Memory is empty —");
                return;
            }

            var queryEmbedding = await GenerateEmbeddingAsync(query, apiKey);
            var results = memoryStore
                .Select(record => new
                {
                    Record = record,
                    Similarity = CosineSimilarity(queryEmbedding.Span, record.Embedding.Span)
                })
                .OrderByDescending(x => x.Similarity)
                .Take(3)
                .ToList();

            Console.WriteLine("\n— Memory Recall —");
            foreach (var result in results)
            {
                Console.WriteLine($"• {result.Record.Text} (Similarity: {result.Similarity:F3})");
            }
            Console.WriteLine("——————-");
        }
This is where the real power of semantic memory shines. The system doesn’t just look for keyword matches—it finds conceptually similar information. For example, searching for “feline pets” might recall memories about “cats” even if the word “feline” was never stored.
Generating Embeddings: The Brain’s Understanding
Embeddings are the secret sauce that makes semantic search possible:
        static async Task<ReadOnlyMemory<float>> GenerateEmbeddingAsync(string text, string apiKey)
        {
            using var client = new HttpClient();
            client.DefaultRequestHeaders.Add("Authorization", $"Bearer {apiKey}");

            var request = new
            {
                input = text,
                model = "text-embedding-3-small"
            };

            var response = await client.PostAsJsonAsync(
                "https://api.openai.com/v1/embeddings",
                request);

            if (response.IsSuccessStatusCode)
            {
                var content = await response.Content.ReadAsStringAsync();
                using var document = JsonDocument.Parse(content);
                var embeddingArray = document.RootElement
                    .GetProperty("data")[0]
                    .GetProperty("embedding")
                    .EnumerateArray()
                    .Select(x => x.GetSingle())
                    .ToArray();
                return embeddingArray;
            }
            else
            {
                throw new Exception($"Embedding generation failed: {response.StatusCode}");
            }
        }
Embeddings convert text into a high-dimensional vector (1,536 numbers for OpenAI’s text-embedding-3-small model) where similar meanings produce similar vectors. This mathematical representation allows us to compute how closely related different pieces of text are.
The Similarity Measurement
We use cosine similarity to measure how closely related two pieces of text are:
        static float CosineSimilarity(ReadOnlySpan<float> vector1, ReadOnlySpan<float> vector2)
        {
            return TensorPrimitives.CosineSimilarity(vector1, vector2);
        }
    }
Cosine similarity measures the cosine of the angle between two vectors, giving us a value between -1 (completely opposite) and 1 (identical). Values closer to 1 indicate higher semantic similarity.
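To make the formula concrete, here is a hand-rolled version that computes the same value TensorPrimitives produces, run on a pair of toy three-dimensional vectors (the numbers are purely illustrative, not real embeddings):

```csharp
// cosine(a, b) = dot(a, b) / (|a| * |b|)
float[] a = { 1f, 2f, 3f };
float[] b = { 2f, 4f, 6f }; // b points in the same direction as a

float dot = 0f, sumA = 0f, sumB = 0f;
for (int i = 0; i < a.Length; i++)
{
    dot  += a[i] * b[i]; // accumulate the dot product
    sumA += a[i] * a[i]; // squared magnitude of a
    sumB += b[i] * b[i]; // squared magnitude of b
}

float similarity = dot / (MathF.Sqrt(sumA) * MathF.Sqrt(sumB));
Console.WriteLine(similarity); // 1: identical direction, maximum similarity
```

Because b is just a scaled copy of a, the angle between them is zero and the similarity is exactly 1; two unrelated embeddings would land somewhere well below that.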
Data Structure for Memories
Finally, we define a simple class to store our memories:
public class MemoryRecord
{
    public string Id { get; set; } = string.Empty;
    public string Text { get; set; } = string.Empty;
    public ReadOnlyMemory<float> Embedding { get; set; }
}
Each memory contains the original text and its vector representation, allowing for both human-readable content and machine-understandable semantics.
Real-World Example
Imagine you’re using this assistant to plan a trip:
You: Remember: My passport expires in June 2025
✓ Saved to memory: My passport expires in June 2025
You: remember: I’m allergic to peanuts
✓ Saved to memory: I’m allergic to peanuts
You: recall: travel documents
— Memory Recall —
• My passport expires in June 2025 (Similarity: 0.824)
——————-
You: recall: food restrictions
— Memory Recall —
• I’m allergic to peanuts (Similarity: 0.791)
——————-
The system successfully finds relevant information even when the search terms don’t exactly match the stored text.
Why This Matters
This approach demonstrates several important AI concepts:
- Semantic Understanding: The system understands meaning, not just keywords
- Contextual Memory: Memories are retrieved based on conceptual relevance
- Scalable Architecture: While we use simple in-memory storage, this pattern can scale to databases like Redis or vector databases like Pinecone
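To keep that swap painless, the `List<MemoryRecord>` could sit behind a small interface. The sketch below (the interface and its member names are my own suggestion, not part of the article’s code) shows the shape such an abstraction might take, with the in-memory implementation mirroring what we built above:

```csharp
// Hypothetical storage abstraction: an in-memory list, Redis, or a vector
// database like Pinecone could each provide an implementation.
public interface IMemoryStore
{
    Task SaveAsync(MemoryRecord record);
    Task<IReadOnlyList<MemoryRecord>> SearchAsync(
        ReadOnlyMemory<float> queryEmbedding, int top);
}

// In-memory implementation mirroring the article's List-based approach.
public sealed class InMemoryStore : IMemoryStore
{
    private readonly List<MemoryRecord> _records = new();

    public Task SaveAsync(MemoryRecord record)
    {
        _records.Add(record);
        return Task.CompletedTask;
    }

    public Task<IReadOnlyList<MemoryRecord>> SearchAsync(
        ReadOnlyMemory<float> queryEmbedding, int top)
    {
        // Rank every stored record by cosine similarity and keep the best matches.
        IReadOnlyList<MemoryRecord> results = _records
            .OrderByDescending(r => TensorPrimitives.CosineSimilarity(
                queryEmbedding.Span, r.Embedding.Span))
            .Take(top)
            .ToList();
        return Task.FromResult(results);
    }
}
```

With this in place, `SaveToMemory` and `RecallFromMemory` would depend only on `IMemoryStore`, so moving to a persistent or hosted vector store becomes a matter of writing one new class rather than touching the chat loop.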
This foundation can be extended to build more sophisticated AI applications like personalized assistants, intelligent documentation systems, or context-aware applications that adapt to user preferences and history.
The complete code provides a working example of how to combine conversational AI with semantic memory, creating a more intelligent and context-aware application that truly understands and remembers.