Importing Recipes from Copy Me That to Paprika

Tags: just for fun

My girlfriend and I have been using Copy Me That for a few years now to store our recipes and plan meals. It’s been a great tool for us! But I’ve been hearing amazing things about a different app called Paprika. Since I’m always on the lookout for new and fun tools, and Paprika is currently running a 50% off Black Friday sale, I thought I’d give it a shot.

We currently have 1782 recipes saved in Copy Me That. It’s absolutely essential that we’re able to export our recipes into any new tool that we want to use. Luckily for us, Paprika supports importing from Copy Me That. Unluckily, their importer does not import collections (or tags, in Paprika terminology). This is kind of a dealbreaker. I was able to find someone on reddit talking about how Paprika uses a SQLite database to store everything, so I thought I’d give it a shot.

Once I bought and logged into the Paprika Windows app, I used Everything to search for the directory where the database might be. Found it! It’s in C:\Users\Andrew\AppData\Local\Paprika Recipe Manager 3\Database.

Screenshot of the Everything app for Windows, showing the Paprika database file.

I made a backup copy of the database, just in case. Then I opened the database file in DB Browser for SQLite to see what sort of structure I was working with, and it’s pretty straightforward. All I really care about are the recipes, recipe_categories, and recipes_to_categories tables.

Rewind! Last night I requested a recipe export from Copy Me That. They have two options; I chose the full HTML export. The structure of the HTML is pretty straightforward. It’s just a single HTML file with a bunch of nodes that look like this:

<div class="recipe">
<div id="name">
Example Recipe Name
</div>
<div id="link">
Adapted from <a id="original_link" href="https://example.com/recipe">https://example.com/recipe</a>
</div>
<img class="recipeImage" src="images/recipe_image.jpg" width="120px">
<div id="categories">
<span>Tags: </span>
<span class="recipeCategory">Cat 1</span>
<span> </span>
<span class="recipeCategory">Cat 2</span>
<span> </span>
</div>
<div id="servings">
Servings: <a id="recipeYield">3 servings</a>
</div>
<div id="recipeIngredient_header">
Ingredients
</div>
<ul id="recipeIngredients">
...
</ul>
<div id="recipeInstructions_header">
Steps
</div>
<ol id="recipeInstructions">
...
</ol>
<div id="recipeNotes_header">
Notes
</div>
<ul id="recipeNotes">
...
</ul>
</div>

I was glad to see how well-structured the HTML is, since that makes it easy to parse. I created a new .NET console project and added the HtmlAgilityPack package. My idea here is to read the HTML file, parse out the recipes, and then create a SQL script I can run to import the recipe categories into the Paprika database.

First, we load the HTML file into the doc variable.

var doc = new HtmlDocument();
doc.Load("recipes.html");
 
var recipeNodes = doc.DocumentNode.SelectNodes("//div[@class='recipe']");
 
if (recipeNodes == null) return;

Then we loop through each recipe node, extract the data we care about, and add it to our recipes list.

var recipes = new List<Recipe>();
foreach (var recipeNode in recipeNodes)
{
var recipe = new Recipe
{
Name = recipeNode.SelectSingleNode(".//div[@id='name']")
?.InnerText.Trim(),
Url = recipeNode.SelectSingleNode(".//a[@id='original_link']")?.InnerText.Trim(),
Tags = recipeNode.SelectNodes(".//span[@class='recipeCategory']")
?.Select(node => node.InnerText.Trim())
.ToList() ?? []
};
 
recipes.Add(recipe);
}
 
// Recipe class
public class Recipe
{
public string Name { get; set; }
public string Url { get; set; }
public List<string> Tags { get; set; }
 
public override string ToString()
{
var tagsString = string.Join(",", Tags);
return $"{Name}: {tagsString}";
}
}

The SelectSinglNode and SelectNodes methods are from HTML Agility Pack. They use XPath to select the nodes we care about.

Now I use a StringBuilder and some logic to build the SQL script:

var sb = new StringBuilder();
foreach (var r in recipes.Where(x => x.Tags.Count > 0))
{
var cleanName = r.Name.Replace("\t", "").Replace("\n", "").Replace("\r", "").Replace("'", "''").Trim();
var cleanUrl = string.IsNullOrWhiteSpace(r.Url) ?
string.Empty :
r.Url.Replace("\t", "").Replace("\n", "").Replace("\r", "").Replace("'", "''").Trim();
foreach (var t in r.Tags)
{
var cleanTag = t.Replace("\t", "").Replace("\n", "").Replace("\r", "").Replace("'", "''").Trim();
sb.AppendLine(@$"INSERT OR IGNORE INTO recipes_to_categories
SELECT r.uid, c.uid
FROM recipes r, recipe_categories c
WHERE r.name = '{cleanName}'
AND r.source_url = '{cleanUrl}'
AND c.name = '{cleanTag}'
AND NOT EXISTS (
SELECT 1 FROM recipes_to_categories rtc
WHERE rtc.recipe_uid = r.uid
AND rtc.category_uid = c.uid
);");
}
}

I had to do some string manipulation to clean up the data, since the HTML contains a lot of whitespace and line breaks.

I could have used Entity Framework to run the resulting SQL script, but I decided to just run it by hand in DB Browser for SQLite.

Finally, in order to actually get Paprika to recognize the new recipes_to_categories entries (and this took multiple hours of trial and error), I had to create a new _test category and add every single recipe to it. This is done easily in the desktop app by hitting Ctrl+A to select all recipes, then dragging them to the new category. Let your desktop app finish syncing, delete the _test category, sync one more time, and then you’re done!