I’ve been working through the r/Fantasy 2021 Book Bingo this year:

2021 Book Bingo

Five SFF Short Stories Any short story as long as there are five of them. Hard Mode: Read an entire SFF anthology or collection.	Set in Asia Any book set in Asia or an analogous fantasy setting that is based on a real-world Asian setting. Hard Mode: Written by an Asian author.	A Selection from the r/Fantasy A to Z Genre Guide Any book listed in our A to Z Genre Guide. Hard Mode: A book by a BIPOC author.	Found Family Or as TV Tropes calls it - Family of Choice. Often not biologically related, these relationships in a group typically form through bonds of shared experiences and become as important (in some cases more) as family members. Hard Mode: Featuring an LGBTQ+ character as a member of the found family.	First Person POV Defined as: a literary style in which the narrative is told from the perspective of a narrator speaking directly about themselves. Link for examples. Hard Mode: There is more than one perspective, but each perspective is written in First Person.
Book Club OR Readalong Book Any past or active r/Fantasy book clubs count as well as past or active r/Fantasy readalongs. See our full list of book clubs here. Hard Mode: Must read a current selection of either a book club or readalong and participate in the discussion.	New to You Author This would be an author whose work you’ve yet to read, meaning no novel, no novella, no short fiction, etc. Hard Mode: Not only have you never read their work before but you’ve not heard much about this author or their work before deciding to try a book by them.	Gothic Fantasy Gothic Fantasy is similar to Gothic Fiction but it includes fantasy elements or settings. Gothic Fiction is “a style of writing that is characterized by elements of fear, horror, death, and gloom, as well as romantic elements, such as nature, individuality, and very high emotion. These emotions can include fear and suspense.” (Source) Here is a good ‘introductory post’ on Gothic Fantasy for further reading from Book Riot. Hard Mode: NOT one of the ten titles listed in the Book Riot article.	Backlist Book For our purposes we’re considering ‘backlist’ an author’s older titles that are not their latest published book or part of a currently running series (no further sequels announced when you read it). The author must also be a currently publishing author. Hard Mode: Published before the year 2000.	Revenge-Seeking Character Book has a character whose main motivation in the story is revenge. Hard Mode: Revenge is central to the plot of the entire book.
Mystery Plot The main plot of the book centers around solving a mystery. Hard Mode: Not a primary world Urban Fantasy (secondary world urban fantasy is okay!)	Comfort Read This is one of those ‘personal to you’ squares. Any book that brings you comfort while reading it. You can use a reread on this square and it WON’T count for your ‘1 reread’. Hard Mode: Don’t use a reread, find a brand new comfort read!	Published in 2021 A book published for the first time in 2021 (no reprints or new editions). Hard Mode: It’s also a debut novel–as in it’s the author’s first published novel.	Cat Squasher: 500+ Pages Time to go tome hunting–find a book that is over 500 pages in length. Hard Mode: Lion Squasher - a book that is over 800 pages.	SFF-Related Nonfiction Back by popular demand! Any nonfiction book that is related to SFF. Could be a book about the history of something in SFF, writing SFF, essays from a SFF writer, etc. Hard Mode: Published within the last five years.
Latinx or Latin American Author Author is from Latin America or of Latinx/Hispanic heritage. Hard Mode: Book has fewer than 1000 Goodreads ratings.	Self-Published Only self-published novels will count for this square. If the novel has been picked up by a publisher as long as you read it when it was self-pubbed it will still count. Hard Mode: Self-pubbed and has fewer than 50 ratings on Goodreads.	Forest Setting This setting must be used be for a good portion of the book. Hard Mode: The entire book takes place in this setting.	Genre Mashup A book that utilizes major elements from two or more genres. Examples: a romance set in a fantasy world, a book that combines science fiction and fantasy, etc. Hard Mode: Three or more genres are combined.	Has Chapter Titles A book where each chapter has a title (other than numbers or just a character’s name). Hard Mode: Chapter title is more than a single word FOR EVERY SINGLE CHAPTER
Title: _____ of _____ The title of the book must feature the format X of Y. Example: The Harp of Kings by Juliet Marillier. Hard Mode: _____ of ______ and ________. Format of title must be X of Y and Z.	First Contact From Wikipedia: Science Fiction about the first meeting between humans and extraterrestrial life, or of any sentient species’ first encounter with another one, given they are from different planets or natural satellites. Hard Mode: War does not break out as a result of contact.	Trans or Nonbinary Character A book featuring a trans or nonbinary character that isn’t an alien or a robot. Hard Mode: This character is a main protagonist.	Debut Author An author’s debut novel or novella. Hard Mode: The author has participated in an AMA. AMA List linked here.	Witches A book featuring witches. Note - characters practicing what is traditionally in their culture referred to as witchcraft would also count. For example brujos or brujas would count for this square. Hard Mode: A witch is a main protagonist.

One thing I’ve been having a bit of trouble with is working out which squares a given book counts for. There is a very active recommendations thread, but since Reddit won’t load the entire thread at once, it … isn’t great to search. So let’s make that easier.

First things first, let’s get the raw data. It turns out that Reddit has a wonderfully simple API to start with: just add .json to a URL to get a thread in JSON format. Example. It’s a bit of a weird format, but it’s parsable. From there, you have references to child nodes that you can download in order to get one giant JSON object for the entire thread. Which sounds like a fascinating problem, but this time around, I just skipped that and used this code. Give it a thread, wait a bit (for such a large thread), get JSON.
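For reference, here is a rough sketch of what that weird format looks like and how it could be boiled down to something simpler. This is not the code linked above: `json_url` and `simplify` are hypothetical helpers, the thread URL is made up, and the output shape (a list of dictionaries with `body` and `replies`) is my assumption, chosen to match what the search further down expects. It also skips the "more" placeholders, so on a thread this size it would miss the replies that have to be downloaded separately.

```python
def json_url(thread_url):
    """Turn a thread URL into its JSON equivalent by appending .json."""
    return thread_url.rstrip('/') + '.json'


def simplify(listing):
    """Reduce a Reddit comment Listing to [{'body': ..., 'replies': [...]}, ...]."""
    comments = []
    # A comment without replies has an empty string where the Listing would be
    if not isinstance(listing, dict):
        return comments
    for child in listing['data']['children']:
        # 't1' is a comment; 'more' is a placeholder for replies not loaded yet
        if child['kind'] != 't1':
            continue
        comment = child['data']
        comments.append({
            'body': comment.get('body', ''),
            'replies': simplify(comment.get('replies')),
        })
    return comments


# Hand-written sample in the shape of the comment Listing from a thread's JSON
sample = {
    'kind': 'Listing',
    'data': {'children': [
        {'kind': 't1', 'data': {
            'body': '**Mystery Plot**',
            'replies': {'kind': 'Listing', 'data': {'children': [
                {'kind': 't1', 'data': {'body': 'Six Wakes', 'replies': ''}},
                {'kind': 'more', 'data': {'children': ['abc123']}},
            ]}},
        }},
    ]},
}

# Made-up thread URL, purely for illustration
print(json_url('https://www.reddit.com/r/Fantasy/comments/example/'))
print(simplify(sample))
```

Handling those "more" placeholders is the part that turns this into the fascinating problem I skipped.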

You could keep that as a JSON file, but I wanted to be sneaky/weird and put it straight in the script. It’s a fair chunk of data with a number of weird characters, so storing it could be tricky… unless you just base64 encode the entire thing. You can then store it straight inline and get it all out with data = json.loads(base64.b64decode('W3siYm9keS...')). It’s actually not that unusual of an idea. You see the same thing with inline data: URIs for images in webpages, or with games that directly embed art assets in the compiled file for optimization/distribution reasons.
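The encoding side is just the reverse of that one-liner. A minimal sketch of the round trip, where `thread` is a tiny made-up stand-in for the real data:

```python
import base64
import json

# Stand-in for the real thread data (the actual thread is much larger)
thread = [{'body': 'Café “quoted” \\- text\nwith a newline', 'replies': []}]

# Encode: JSON text -> UTF-8 bytes -> base64 -> plain ASCII string
encoded = base64.b64encode(json.dumps(thread).encode('utf-8')).decode('ascii')

# Decode: exactly the expression used in the script
decoded = json.loads(base64.b64decode(encoded))

assert decoded == thread
print(encoded[:10])  # → W3siYm9keS
```

The encoded string only ever contains letters, digits, `+`, `/`, and `=`, so it can be pasted into a quoted Python string without any escaping. The first few characters match the literal above because both start with `[{"body`.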

Next, parsing. In this case, the recommendations thread has one first level response for each of the categories in the bingo, but after that just about any level of response could contain book titles. So what we want is to search the JSON object recursively.

For dictionaries, search the ‘body’ (for text) and ‘replies’ (for further children)
For lists, search all entries (lists of replies)
For strings (bodies), search the text (case insensitive)

import base64
import json
import sys

data = json.loads(base64.b64decode('W3siYm9keS...'))
    
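def search(key, data, path=None):
    """Recursively yield the path of comment bodies leading to each match for key."""
    path = path or []

    if isinstance(data, list):
        # A list of replies: search each one in turn
        for child in data:
            yield from search(key, child, path)
    elif isinstance(data, dict):
        # A comment: check its own text, then its replies with this comment added to the path
        if body := data.get('body'):
            yield from search(key, body, path)
            yield from search(key, data.get('replies'), path + [body])
    elif isinstance(data, str):
        # A comment body: case-insensitive substring match
        if key.lower() in data.lower():
            yield path + [data]

def top_level_search(key):
    """Return the sorted, de-duplicated top-level comments (bingo squares) with a match."""
    results = set()
    for result in search(key, data):
        results.add(result[0])
    return sorted(results)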

for arg in sys.argv[1:]:
    print(arg)
    for result in top_level_search(arg):
        print(result)
    print()

I really do love generators in this case, with yield from. You can recursively scan through the entire structure and just sort of return a flat list for free. In this case, I’m keeping track of the path through the nodes that I took to get to a specific point, although top_level_search only ends up keeping the first element of each path, which is the top-level comment naming the bingo square (I did the whole path at first, which was neat).
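To give an idea of what the whole path looks like, here is a small self-contained sketch. The `search` function is the same recursion as in the script, and `toy_thread` is invented sample data, not the real thread:

```python
def search(key, data, path=None):
    """Yield the list of comment bodies leading to each match for key."""
    path = path or []
    if isinstance(data, list):
        for child in data:
            yield from search(key, child, path)
    elif isinstance(data, dict):
        body = data.get('body')
        if body:
            yield from search(key, body, path)
            yield from search(key, data.get('replies'), path + [body])
    elif isinstance(data, str):
        if key.lower() in data.lower():
            yield path + [data]


# Invented sample data in the same body/replies shape
toy_thread = [
    {'body': '**Mystery Plot**', 'replies': [
        {'body': 'Six Wakes by Mur Lafferty', 'replies': [
            {'body': 'Seconding six wakes!', 'replies': []},
        ]},
    ]},
    {'body': '**Forest Setting**', 'replies': [
        {'body': 'Annihilation', 'replies': []},
    ]},
]

# Each result is the full chain of comments from the square down to the match
for path in search('six wakes', toy_thread):
    print(' > '.join(path))
```

Every match comes back with its own chain of parent comments, which is why the same square can show up more than once and why top_level_search collects the first elements into a set.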

And as a result:

$ python3 ~/Dropbox/book-bingo.py 'Six Wakes'

Six Wakes
**Mystery Plot** \- The main plot of the book centers around solving a mystery. **HARD MODE:** Not a primary world Urban Fantasy (secondary world urban fantasy is okay!)

$ python3 ~/Dropbox/book-bingo.py 'Annihilation'

Annihilation
**First Contact** \- From Wikipedia:  Science Fiction about the first meeting between humans and extraterrestrial life, or of any sentient species' first encounter with another one, given they are from different planets or natural satellites. **HARD MODE:** War does not break out as a result of contact.
**First Person POV** \- defined as:  a literary style in which the narrative is told from the perspective of a narrator speaking directly about themselves. [Link for examples.](https://examples.yourdictionary.com/examples-of-point-of-view.html) **HARD MODE:**  There is more than one perspective, but each perspective is written in First Person.
**Forest Setting** \-  This setting must be used be for a good portion of the book. **HARD MODE:** The entire book takes place in this setting.
**Mystery Plot** \- The main plot of the book centers around solving a mystery. **HARD MODE:** Not a primary world Urban Fantasy (secondary world urban fantasy is okay!)

Pretty handy!

JP's Blog

Categorizing r/Fantasy Book Bingo Books
