Create Anki decks based on a CSV file
Anki is flashcard software that helps you memorize things. It relies on spaced repetition, the technique also used by the popular app Duolingo. Used properly, it is very effective for learning languages. Anki is available on many platforms, including Linux and Android. Personally, I use AnkiDroid to learn Chinese.
A common use of Anki is to create a few flashcards manually every day, containing unknown words you heard or read during the day. However, if you want to test your knowledge of a predefined list of words, it is more convenient to generate a deck automatically. In this article, we will write a Python script that reads a CSV or TSV file and turns its data into a deck.
To generate decks, we will use genanki. The example data file in this article is a list of Chinese words with their meaning and pinyin (the romanization of Chinese characters). Several such lists can be found on this website: HSKHSK.
First, get the data file (like this one). Let us write the preamble of the script:
# Filename of the data file
=
# Filename of the Anki deck to generate
=
# Title of the deck as shown in Anki
=
# Name of the card model
=
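A minimal sketch of such a preamble is shown here. Every variable name and value in it is an assumption made for illustration, since the original assignments are not shown; replace them with your own file names and titles.

```python
# All names and values below are illustrative assumptions.

# Filename of the data file (a tab-separated word list)
data_filename = "hsk1.tsv"
# Filename of the Anki deck to generate
deck_filename = "hsk1.apkg"
# Title of the deck as shown in Anki
anki_deck_title = "HSK 1"
# Name of the card model
anki_model_name = "HSK vocabulary"
```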
A flashcard is typically composed of a front side and a back side. Each flashcard in our deck will contain three data fields: hanzi (the word in simplified Chinese), pinyin and meaning. Using this data, we want to make two types of flashcards:
- (Card 1) shows the hanzi; the user has to guess the pinyin and the meaning.
- (Card 2) shows the meaning; the user has to guess the hanzi and the pinyin.
Our deck model is defined as follows:
# Create the deck model
=
=
=
model_id is a unique integer that identifies the model. style describes the CSS style properties of the flashcards. Feel free to change the font sizes according to your needs. The fields argument describes the data fields that each flashcard has. The templates argument represents the two types of flashcards. The qfmt value is an HTML template for the front side; the afmt value is the template for the back side. You can customize these values to show the flashcards in different ways, using the field names surrounded by {{ and }}. Note that {{FrontSide}} is a special field that stands for the HTML content of the front side.
Then we create the flashcards:
# The list of flashcards
=
=
=
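The reading step can be sketched with the standard csv module alone. The helper name read_flashcard_fields and the two sample rows are made up for this example, and the content of the unused columns is an assumption about the file layout. In the full script, each returned list would be passed as the fields argument of a genanki.Note.

```python
import csv
import io


def read_flashcard_fields(tsv_file):
    """Return one [hanzi, pinyin, meaning] list per complete row of a TSV file."""
    rows = csv.reader(tsv_file, delimiter="\t")
    # The 1st, 4th and 5th columns are indices 0, 3 and 4 in Python.
    # Rows with fewer than five columns (such as blank lines) are skipped.
    return [[row[0], row[3], row[4]] for row in rows if len(row) >= 5]


# Made-up sample standing in for the data file.
sample = io.StringIO(
    "爱\t愛\tai4\tài\tto love\n"
    "我\t我\two3\twǒ\tI; me\n"
)

# The list of flashcard fields
flashcard_fields = read_flashcard_fields(sample)
print(flashcard_fields)
```

With a real file, the same helper can be given an open file object instead of the sample, for example one obtained from open(data_filename, encoding="utf-8", newline="").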
This code reads the CSV file and generates a flashcard for each row. Note that our data file contains tab-separated values (it is a TSV file), so we specify delimiter="\t". In our data file, hanzi, pinyin and meaning are in the 1st, 4th and 5th columns, respectively.
Let’s randomize the order of the flashcards:
# Shuffle flashcards
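One simple way to do this is random.shuffle from the standard library. In this sketch the list anki_notes is a hypothetical stand-in for the list of flashcards built in the previous step.

```python
import random

# Hypothetical stand-in for the list of flashcards built earlier.
anki_notes = ["note 1", "note 2", "note 3", "note 4"]

# Shuffle flashcards: random.shuffle reorders the list in place
# and returns None, so its result must not be assigned.
random.shuffle(anki_notes)
```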
Then create the deck, add the flashcards to it, and save it:
=
=
# Add flashcards to the deck
# Save the deck to a file
And voilà! The full script can be downloaded here.