Create Anki decks based on a CSV file

Anki is flashcard software that makes memorizing knowledge easier. It relies on spaced repetition, a technique also used by the popular app Duolingo. When used properly, it is very effective for learning languages. Anki is available on many platforms, including Linux and Android. Personally, I use AnkiDroid to learn Chinese.

A common use of Anki is to create a few flashcards manually every day, containing unknown words you heard or read during the day. However, if you want to test your knowledge against a predefined list of words, it is more convenient to generate a deck automatically. In this article, we will write a Python script that reads a CSV or TSV file and turns its data into a deck.

To generate decks, we will use the genanki library. The example data file in this article is a list of Chinese words with their meaning and pinyin (the romanization of Chinese characters). Several lists can be found on this website: HSKHSK.

First, get the data file (like this one). Let us write the preamble of the script:

import csv
import random
import genanki

# Filename of the data file
data_filename = "HSK Official With Definitions 2012 L1.txt"

# Filename of the Anki deck to generate
deck_filename = "HSK1.apkg"

# Title of the deck as shown in Anki
anki_deck_title = "HSK1"

# Name of the card model
anki_model_name = "HSK"

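A flashcard is typically composed of a front side and a back side. Each flashcard of our deck will contain three data fields: hanzi (the Chinese word in simplified Chinese), pinyin and meaning. Using this data, we want to make two types of flashcards: one that shows the hanzi on the front and the pinyin and meaning on the back, and one that shows the meaning on the front and the hanzi and pinyin on the back.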

Our deck model is defined as follows:

# Create the deck model

model_id = random.randrange(1 << 30, 1 << 31)

style = """
.card {
    font-family: arial;
    font-size: 24px;
    text-align: center;
    color: black;
    background-color: white;
}
.hanzi {
    font-size: 64px;
}
"""


anki_model = genanki.Model(
    model_id,
    anki_model_name,
    fields=[{"name": "hanzi"}, {"name": "pinyin"}, {"name": "meaning"}],
    templates=[
        {
            "name": "Card 1",
            "qfmt": '<p class="hanzi">{{hanzi}}</p>',
            "afmt": '{{FrontSide}}<hr id="answer"><p class="pinyin">{{pinyin}}</p><p class="meaning">{{meaning}}</p>',
        },
        {
            "name": "Card 2",
            "qfmt": '<p class="meaning">{{meaning}}</p>',
            "afmt": '{{FrontSide}}<hr id="answer"><p class="hanzi">{{hanzi}}</p><p class="pinyin">{{pinyin}}</p>',
        },
    ],
    css=style,
)

model_id is a unique integer that identifies the model (this script later reuses the same value as the deck ID). style describes the CSS style properties of the flashcards. Feel free to change the font sizes according to your needs. The fields argument describes the data fields that each flashcard has. The templates argument represents the two types of flashcards. The qfmt value is an HTML template of the front side, and the afmt value is an HTML template of the back side. You can customize these values to show the flashcards in different ways, using the field names surrounded by {{ and }}. Note that {{FrontSide}} is a special field that stands for the HTML content of the front side. The next step is to read the data file and build the list of flashcards.
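To get a feeling for what the two templates produce, the placeholder substitution can be imitated outside Anki. The following is only a rough sketch: preview_template is a hypothetical helper written for this illustration, not part of genanki or Anki, and it handles plain {{name}} placeholders only, not the full template syntax that Anki supports. The sample word is also just an example.

```python
import re


def preview_template(template, fields, front_side=""):
    """Rough preview: replace each {{name}} placeholder with its field value.

    Hypothetical helper for illustration only; Anki's real renderer
    supports more syntax than this.
    """
    values = dict(fields, FrontSide=front_side)
    # Unknown placeholders are left untouched
    return re.sub(
        r"\{\{(\w+)\}\}",
        lambda match: values.get(match.group(1), match.group(0)),
        template,
    )


# Example note data (an assumed sample, not taken from the data file)
sample = {"hanzi": "你好", "pinyin": "nǐ hǎo", "meaning": "hello"}

# Templates of "Card 1" from the model above
qfmt = '<p class="hanzi">{{hanzi}}</p>'
afmt = '{{FrontSide}}<hr id="answer"><p class="pinyin">{{pinyin}}</p><p class="meaning">{{meaning}}</p>'

front = preview_template(qfmt, sample)
back = preview_template(afmt, sample, front_side=front)

print(front)
print(back)
```

Anki does the real rendering once the deck is imported, so the script itself does not need this helper.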

# The list of flashcards
anki_notes = []

# Open as UTF-8 so that the Chinese characters are decoded the same way on every platform
with open(data_filename, "r", encoding="utf-8", newline="") as csv_file:

    csv_reader = csv.reader(csv_file, delimiter="\t")

    for row in csv_reader:
        anki_note = genanki.Note(
            model=anki_model,
            # simplified writing, pinyin, meaning
            fields=[row[0], row[3], row[4]],
        )
        anki_notes.append(anki_note)

This code reads the CSV file and generates a note for each row (Anki then derives two flashcards from each note, one per template). Note that our data file contains tab-separated values (it is a TSV file), so we specify delimiter="\t". In our data file, hanzi, pinyin and meaning are in the 1st, 4th and 5th columns, respectively.
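The column selection can be tried out without the data file or genanki. The sketch below parses a small in-memory sample with the same csv.reader settings. The two sample rows and the content of the 2nd and 3rd columns (traditional writing and numbered pinyin) are assumptions made for this illustration, and read_fields is a hypothetical helper name. Skipping rows with fewer than five columns is an extra precaution that the script above does not take.

```python
import csv
import io

# Assumed sample rows: simplified, traditional, numbered pinyin, pinyin, meaning
sample_tsv = "爱\t愛\tai4\tài\tto love\n八\t八\tba1\tbā\teight\n"


def read_fields(tsv_file):
    """Return [hanzi, pinyin, meaning] for each complete row of a TSV file."""
    fields = []
    for row in csv.reader(tsv_file, delimiter="\t"):
        # Skip blank or incomplete rows instead of raising IndexError
        if len(row) < 5:
            continue
        # 1st, 4th and 5th columns
        fields.append([row[0], row[3], row[4]])
    return fields


notes_fields = read_fields(io.StringIO(sample_tsv))
print(notes_fields)
```

Each inner list has the same order as the fields declared in the model, which is the order genanki.Note expects.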

Let's randomize the order of the flashcards:

# Shuffle flashcards
random.shuffle(anki_notes)

Then we create the deck, add the flashcards to it and save it to a file:

anki_deck = genanki.Deck(model_id, anki_deck_title)
anki_package = genanki.Package(anki_deck)

# Add flashcards to the deck
for anki_note in anki_notes:
    anki_deck.add_note(anki_note)

# Save the deck to a file
anki_package.write_to_file(deck_filename)

print("Created deck with {} flashcards".format(len(anki_deck.notes)))
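One thing to keep in mind: model_id is drawn at random on every run, so each generated package carries a new model ID and deck ID, and Anki will most likely treat a regenerated package as a separate deck rather than an update of the previous one. The genanki documentation suggests generating the IDs once and hardcoding them. As an alternative sketch (my own assumption, not something genanki provides), a fixed ID can be derived from a name; stable_id and the two names passed to it are hypothetical.

```python
import hashlib


def stable_id(name):
    """Map a name to a fixed integer in the range [1 << 30, 1 << 31).

    Hypothetical helper: the same name always gives the same ID,
    so the ID no longer changes between runs of the script.
    """
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    # Fold the first four bytes of the hash into the desired range
    return (1 << 30) + int.from_bytes(digest[:4], "big") % (1 << 30)


# Separate, reproducible IDs for the model and the deck
model_id = stable_id("HSK model")
deck_id = stable_id("HSK1 deck")

print(model_id, deck_id)
```

With such IDs, the deck would be created with genanki.Deck(deck_id, anki_deck_title) instead of reusing model_id.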

And voilà! The full script can be downloaded here.