Becoming Pangu with GNU sed

In case you aren't familiar with Chinese mythology or the Chinese blogosphere, there's an old meme aptly named "Space of Pangu": a typesetting rule of thumb that favors additional spacing between Chinese characters (but not punctuation marks) and Latin characters or numbers. My variant of the rule also includes additional spacing around any HTML elements like links and emphasis.

Up till now, I've been manually adding spaces in my source files (in Markdown or org), which is admittedly the worst way to do it. Aside from the additional chore, such a typesetting rule should, in my opinion, be implemented in the output/rendering format, not the source. Besides, manually fixing all the old posts I just brought back is not exactly a rewarding task. Unwilling to load additional JavaScript, I turned to the all-mighty GNU sed. To add Space of Pangu to the final HTML and XML files that Hugo produces (normally in the ./public directory), I used the following shell script:

#! /usr/bin/env sh
# For punctuation marks to be recognized correctly.
export LC_CTYPE=en_US.UTF-8
find . -path "./public/*" \( -name "*.html" -or -name "*.xml" \) -print -exec sed \
     -e 's/\([a-zA-Z0-9]\|<\/[a-z]*>\)\([^[:punct:][:space:]a-zA-Z0-9]\)/\1 \2/g' \
     -e 's/\([^[:punct:][:space:]a-zA-Z0-9]\)\([a-zA-Z0-9]\|<[a-z]\)/\1 \2/g' \
     -i {} ";"

In case you are adamant about adhering to the recommendation by this W3C Working Draft and wouldn't mind bloating up the resulting web page, using CSS to create the spacing should do the trick:

find . -path "./public/*" \( -name "*.html" -or -name "*.xml" \) -print -exec sed \
     -e 's/\([a-zA-Z0-9]\|<\/[a-z]*>\)\([^[:punct:][:space:]a-zA-Z0-9]\)/\1<span style="margin:0.25ch;"><\/span>\2/g' \
     -e 's/\([^[:punct:][:space:]a-zA-Z0-9]\)\([a-zA-Z0-9]\|<[a-z]\)/\1<span style="margin:0.25ch;"><\/span>\2/g' \
     -i {} ";"

If you are another one of those Space of Pangu disciples, just note that there's no need to worry about adding spaces when leaving comments here: thanks to Hyperskip comments being inserted at Hugo's building stage, they are affected by those scripts as well. Just sit back, relax, and enjoy staring at the blank spaces.

March Goes out Like a Lion, Too

It was not until a few weeks ago (while watching Level1 News) that I learned about the complete version of the saying "March comes in like a lion and goes out like a lamb." I knew the first half of the saying from the manga series March Comes in Like a Lion, but I had no idea the saying was describing the weather in March.

What you have read so far was actually my entire motivation for starting this post, but it has indeed been a rather unusual March. Because of COVID-19, I'm spending time at home "social distancing", or rather, indulging myself in the company of solitude. In preparation for (a.k.a. using as an excuse) extended periods of working from home, I went on an upgrade spree for electronics: I got a second monitor, a monitor stand, and larger hard drives for my NAS. In fact, I've been gradually expanding my arsenal of devices since last Fall, so look out for a potential setup post.

Amazon's Prime Now service has been keeping me fed for two out of the past three years during which I cooked for myself. It's a bit alarming that Amazon of all things has become literally something I can't live without. But until I have my underground bunker and algae farm, I'll have to make do with this symbiotic (or should I say parasitic) relationship. I'm not sure if I really enjoy cooking though; at least, most of the effort I've devoted to it has gone into reducing the amount of time I spend in the kitchen. Fortunately I hardly ever get tired of eating the same dishes, so I just keep making the same ones while gradually optimizing the preparation: I have yogurt and trail mix for breakfast, beef curry with rice for lunch, and pan-fried salmon with rice and stir-fried cabbage for dinner.

The pandemic also put my running plans on hold: the trail I normally run on has been closed down. I did get plenty of mileage in before social distancing started (4 weeks ahead of schedule in terms of total mileage now), so I should still be on track to hit my 2020 target. Perhaps due to the snow and ice along the way, my running shoes (Mizuno Wave Rider 23) are wearing out faster than before: at the 250-mile mark, I'm already feeling arch discomfort on longer runs, while previous iterations of those shoes lasted until around 300 miles. Aside from shoe issues, shin pain also started to creep up as I've been doing longer runs, so this just might be the opportunity I needed to take some rest. I have converted myself to a morning runner as I plan to ultimately sneak in a run or two on weekdays. So far I'm enjoying my morning routines, despite a few snow-stormy days that were extra tough (but fun). Plus, I get to see sunrise instead of its less cheerful sibling.

Reading the news during the outbreak frequently struck me with an unreal feeling, because of both the things that are actually happening and the way news articles cover them behind a deliberately divisive facade. To be fair, asking an organization that preys on human attention to report in a plain and down-to-earth way is an oxymoron in itself. It's probably hypocritical for me to pick on the news agencies though, as I am also guilty of deriving excitement from the current situation: the mere thought that what is ordinarily just an apartment is now my personal fortress against an uncured pathogen is enough to keep me up at night.

Should this indeed be the downfall of humanity, at least my blog and Emacs configuration will (assuming Microsoft means it) live on thanks to the GitHub Archive Program. Before that, be safe, stay in your personal living pods, and prepare for the neon-colored Space-Age algae diet we've all been waiting for.

Static Alternatives to Mastodon and Gitea

Like how I decided to switch off WordPress, I think I've had enough of running Mastodon and Gitea.

Keeping up with configuration changes in Gitea has been annoying, whereas with Mastodon, breakages are common due to mismatched system library versions (mostly protobuf) in the dependencies. While the latter is not a fault of Mastodon itself, having to install two package managers (for Ruby and Node.js, respectively) just to run a program is rather ridiculous to me.

I started hosting both applications in 2018: Mastodon first as a replacement for Twitter, and Gitea later in reaction to Microsoft's acquisition of GitHub. Looking back, they were probably overkill for my needs: my primary use cases for a git server and a micro blog are both very much single-user focused and write-only, which means the content only needs to be available in read-only form for my site's visitors, making static pages the perfect replacement for both web front ends.

Starting with Mastodon, I'm using the twtxt format to store and serve my micro blog. The format has existed for some time now, but enjoyed a recent resurgence in the tildeverse (a series of websites offering public access Unix-like systems). While there is now a whole community-supported ecosystem of various syntax extensions and software seeking to add more features to the format, I have found the bare-bones timestamp-tab-and-then-text syntax to be sufficient. The write-and-forget cycle is really addicting, and even more so when using a command line client (mine is aptly named twixter).
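
To show just how little there is to the basic syntax, here is a rough sketch of producing a single entry by hand. The twtxt_entry function is a made-up stand-in, not the twixter client mentioned above, and it assumes a date command that accepts -u and a + format string.

```sh
#!/usr/bin/env sh
# Hypothetical helper: prints one twtxt entry, i.e. an RFC 3339 timestamp
# (UTC here), a tab character, and then the text of the post.
twtxt_entry() {
    printf '%s\t%s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$1"
}

# Print a sample entry; redirect with >> to append it to a feed file instead.
twtxt_entry "Hello from the static side."
```

Serving the feed is then just a matter of putting the resulting text file somewhere a web server can reach.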

As for Gitea, while it is an excellent GitHub replacement in my opinion, it is more suitable for community collaboration than as a personal project dumping ground. I opted to manage the git repositories directly (see Chapters 4.4 and 4.5 of Pro Git), and use stagit to generate the corresponding HTML files. These stagit-generated pages have replaced Gitea as the new Trantor Holocron.

Now that I have found a satisfactory solution for the write-only portion of my online presence, I will continue to explore options for the remaining two pillars: read-only (content consumption) and interaction (means of communication). Web feeds and email are my best answers now, but they still don't cover all the bases in my experience.

Blog 9 from Outer Space

Recently, I've been thinking about ways to unify my micro blog entries with my current site, and I've been reconsidering the ideas from IndieWeb: unlike ActivityPub (the protocol Mastodon, Pleroma and the like use for federation), which seems to want everything to be done dynamically via server APIs and JSON responses, the various standards recommended by the IndieWeb community allow a machine-readable feed to be generated straight from a correctly marked-up static HTML file. A core idea that IndieWeb seems to implicitly rely on is the lifetime of the URIs, and to a greater extent, the site owner's control over the domain name. With the recent drama regarding the .ORG domain, I came to realize that a future in which domain names are too expensive to maintain (or are subject to seizures by various entities) may not actually be too distant, and this could seriously undermine the entire premise IndieWeb is built upon, not to mention the far more common problem of link rot. Fortunately, I think IPFS (InterPlanetary File System) has the potential to solve both problems.

A Crash Course on IPFS

Now, now, I know that compared to similar projects like the Dat protocol, pingfs, or even Scuttlebutt, IPFS has a really buzz-wordy vibe to it (trust me, I was as skeptical as you are at the beginning), and the various cryptocurrency start-ups that bundle IPFS and all kinds of acronyms in their marketing materials surely don't do it any favors, but it does seem like the most established and ready-to-use option. Here's my best attempt at explaining IPFS, with information mostly obtained from the official documentation and this talk. In case you are interested in further implementation details, this session from IPFS Camp 2019 is a great starting point.

In a simplified view, a link to a web page is but a fancy way to point to a file on some server. Just like a path to a file, the link becomes unreachable if the server is down, even if someone sitting in the same room might have the contents cached. In IPFS, files (or data blocks) are addressed by the cryptographic hashes of their contents, and stored in a distributed fashion across all peers. This means no centralized facility is required to access the files, file integrity can be easily verified, P2P sharing can be used to speed up access, and files stored this way are inherently immutable.
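
The content-addressing idea can be tried out with nothing more than a hash utility. This is a toy illustration and not how IPFS actually computes addresses: real IPFS addresses are multihash-based identifiers over chunked data blocks, while the made-up content_address function below just takes a plain SHA-256 digest, assuming sha256sum from GNU coreutils is available.

```sh
#!/usr/bin/env sh
# Toy model of content addressing: the "address" is derived purely from
# the bytes of the content, never from where the content is stored.
content_address() {
    printf '%s' "$1" | sha256sum | cut -d ' ' -f 1
}

a=$(content_address "hello, interplanetary world")
b=$(content_address "hello, interplanetary world")
c=$(content_address "hello, interplanetary world!")

# Identical content always maps to the same address, whoever holds the copy;
# changing a single character yields a different address.
[ "$a" = "$b" ] && echo "same content, same address"
[ "$a" != "$c" ] && echo "changed content, new address"
```

This is also why integrity checks come for free (re-hash what you received and compare) and why an address can never point at modified content.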

Not being able to change files seems like a rather large price to pay, but just like any other problem in computer science, this can be solved by adding a layer of abstraction. IPNS (InterPlanetary Name System) utilizes public-key cryptography to create immutable addresses that can point to different files. An IPNS address is basically the hash of a public key. An IPNS lookup involves retrieving the public key, searching for files (each containing an IPFS address) signed by the corresponding private key, identifying the most recent one, and finally redirecting to the correct file. To utilize IPNS, the user starts by creating a public-private key pair, then uploads the desired files into IPFS, and finally signs and uploads a pointer file containing the IPFS address of the uploaded content. When an update is desired, the user only needs to sign and upload another pointer file pointing to the new location.

A lot of the ideas used in IPFS have been explored before by projects like BitTorrent (peer-to-peer sharing), Fossil and Venti from Plan 9 (write-once data blocks and path redirection), git (Merkle tree/directed acyclic graph), etc. However, the killer feature is how easily IPFS integrates with existing infrastructure. Not only are there HTTP gateways that allow accessing IPFS/IPNS from web browsers instead of IPFS clients, but there is also compatibility with FUSE (Filesystem in Userspace), which actually allows you to mount the entire IPFS as a read-only partition: sure, this also makes hosting static websites possible, but you have to admit that having access to a global-scale (or should I say, interplanetary?) P2P shared drive is way cooler.

Hosting Static Websites on IPFS

The official guide already outlines the general usage pattern pretty well. Here's the TLDR:

  • Run ipfs init and ipfs daemon to initialize and start the IPFS client.
  • Generate the website files and run ipfs add -r <website-root> to send its contents onto the IPFS. The last few lines of the output should tell you the hash for the root directory.
  • If you want to make use of IPNS, run ipfs name publish <website-root-hash> to direct the IPNS link to the folder you just uploaded. The IPNS public key hash can be obtained via ipfs key list -l.
  • Repeat the last two steps every time the website files are updated or rebuilt. The process has little overhead due to the inherent deduplication in addressing, making it particularly suitable for static sites where larger files (like photos) tend to change less often.

Once this is done, you can access your website at either <gateway-address>/ipfs/<website-root-hash> or <gateway-address>/ipns/<ipns-address> from any HTTP gateway: you can use the local one (likely at 127.0.0.1:8080) started by the IPFS daemon, or any of the public ones (which come with the extra risk of MITM attacks from the gateway owners, as file retrieval is done on the gateway servers). In case you have multiple websites, you can generate more IPNS key pairs using ipfs key gen, and specify --key when running ipfs name publish to publish to a specific IPNS address.
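
The add-then-publish cycle is short enough to wrap in a small function. The following is a sketch rather than a drop-in script: publish_site and its arguments are names made up for this example, and it assumes the repository has been initialized and the daemon is already running.

```sh
#!/usr/bin/env sh
# Hypothetical wrapper around the two steps that get repeated on every rebuild.
# Usage: publish_site <website-root> [key-name]
publish_site() {
    site_root=$1
    key_name=${2:-self}   # "self" is the key created by ipfs init

    # -r adds the directory recursively; -Q prints only the root hash.
    root_hash=$(ipfs add -Qr "$site_root") || return 1

    # Point the IPNS name of the chosen key at the new root directory.
    # The command's own status message is sent to stderr so that the
    # function prints nothing but the hash on stdout.
    ipfs name publish --key="$key_name" "/ipfs/$root_hash" >&2 || return 1

    printf '%s\n' "$root_hash"
}

# Example (not run here): hash=$(publish_site ./public)
```

Keeping the root hash as the only output makes it easy to feed into later steps, such as updating a DNS record.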

Until IPFS supports import/export of the IPNS keys though (so that we can back up keys and publish from multiple devices), DNSLink can be used to more conveniently maintain access to a site, albeit at the cost of depending on owning a domain name and trusting the DNS host provider. To allow access to the site from the gateways via /ipns/<domain-name>, simply add a TXT record to the domain:

dnslink=/ipfs/<website-root-hash>

or

dnslink=/ipns/<ipns-address>

For instance, you can now access this site at /ipns/shimmy1996.com (this is a link using the ipfs.io gateway). While not flawless, to me this is a reasonable compromise for now. I find IPFS to be generally faster than IPNS, so using an IPFS address with DNSLink probably makes more sense. To avoid manually copy-pasting the IPFS address each time, I added the following to my blog build script to automatically upload the website to IPFS and update the DNS record (using DigitalOcean's API):

echo "Uploading to IPFS..."
hash=$(/usr/bin/ipfs add -Qr "<website-root>")

echo "Updating DNSLink record..."
token="<digitalocean-api-token>"
curl -X PUT \
     -H "Content-Type: application/json" \
     -H "Authorization: Bearer $token" \
     -d "{\"data\":\"dnslink=/ipfs/$hash\"}" \
     "https://api.digitalocean.com/v2/domains/<domain>/records/<record-id>"

The record ID for a DNS record on DigitalOcean can also be retrieved via their API. You may need to add ?page=2 or later to the request to find the record you want.

Do note that, as with any offline HTML files, we need to use relative URLs in the generated web pages. In Hugo, this can be achieved by setting

relativeURLs = true

in config.toml.

Of course, being a P2P network, IPFS won't be able to retrieve the files if there is no copy to work with at all. By default, the IPFS client pins anything you share from the local machine: pinned contents won't get deleted, ensuring at least one copy of the shared content is available on IPFS. You can unpin outdated versions of the website, or if you want, find and pin the shared directory on multiple machines for some redundancy.

The Stars, Like Dust

Back to the issue with IndieWeb: the increasingly shady domain name system and link rot make URI stability in HTTP hard to maintain. However, what if we use IPFS/IPNS addresses as URIs? It's a match made in heaven: we get robust distributed access to static web pages, gated by Mathematics instead of FBI warnings, that can theoretically last forever. Removing the need for maintaining a server also lowers the barrier to entry for owning a website. The HTTP protocol has existed for 29 years, and IPFS, only 5. I don't know if IPFS will continue to exist for the next 24 years to come, but if it does, I hope we will be looking at a perhaps more chaotic, but more robust, lively, and colorful, online world.

TIReD: A Personal Rating System

As the pandemic gives me a chance to look through my backlog of movies, shows, and books (read: anime and manga), I started to consider establishing a personal rating system to ease up writing (hypothetical) reviews.

Guiding Principles

Typical rating scales feature 10 or more levels, which is in my opinion way too wide a range to choose from, not to mention those featuring 100-point scales. Even the most common 5-star system gets cumbersome fast as soon as we take half-stars into consideration. What exactly differentiates a 6 from a 7 or a 4.6 from a 5.1? Higher granularity could be useful in aggregated ratings, but not so much from an individual reviewer's perspective. I much prefer the approach s1vote took: give the users fewer but more distinctive levels to pick from.

My anecdotal evidence suggests that most online ratings converge around the 70% mark, a rating just as safe and useless as predicting a 40% success rate for anything. In other words, the lower half of most rating scales is underutilized: how often would you rate something one-and-a-half stars instead of just one? Besides, more often than not, I read ratings and reviews to find out about good shows, not the bad ones. It should be sufficient to only focus on "the better half": why would I sit through the entirety of a bad show and take the effort to give it a rating anyways? There is no -1 star in the Michelin Guide, is there?

Summarizing the quality of anything with a single metric seems unfair. I want the rating system to be more expressive, capable of conveying the different aspects of a show that I find enjoyable. At the very minimum, an opinionated pick should be distinct from something with a more general appeal.

Rating Methodology

Enter the TIReD scale! The following uses anime/TV shows as the example, but much of this methodology also applies to other art forms. A show is scored in the following categories, with the sum of points forming the final rating:

Category         Range
Tangible         0-2
Intangible       0-2
Revisit-ability  0-1
Discretionary    0-1

Tangible aspects of a show include visual style, animation, soundtrack, CG quality, special effects, etc. To put it simply, how physically well-made a show is. Starting from a score of 0, a show would be scored a

  • +1 if the show is overall attractive to watch and either has consistently high quality with very few shortcomings (perfection) or utilizes unique ideas/techniques to great effect (ingenious);
  • +2 if its physical quality/way of expression alone would be sufficient reason to watch the show, even if it gets a 0 in all other categories.

Intangible aspects include story, character building, plot pacing, cultural references, etc. This quality should be relatively medium-independent, i.e. I would enjoy a faithful recreation of the story in other art forms at least just as much. The scoring criteria are similar, except for remakes/adaptations that show a clear intent to follow the original when I have seen/read the source material: there, scoring is based on the source material's intangible score adjusted downwards by 1 point, with an extra adjustment of at most 1 point based on the quality/difficulty/effect of the remake/adaptation, staying within the range of 0-2. For instance, a mediocre retelling of a +2 story should only be awarded at most a +1. Remakes and adaptations probably have an easier starting point than original content, so I wanted to adjust for "how good the show could have been", provide an answer to "should I still see this if I've seen the original", and pick out the "watch this instead of the original" or "transcended and elevated the original story" shows.
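
The adaptation rule boils down to a bit of clamped arithmetic. Here is a rough sketch under one possible reading of the rule: it assumes the extra adjustment can be -1, 0, or 1 (the text only says "at most" one point), and the adapted_intangible function name is made up.

```sh
#!/usr/bin/env sh
# adapted_intangible SOURCE_SCORE ADJUSTMENT
#   SOURCE_SCORE: intangible score of the original work (0-2)
#   ADJUSTMENT:   -1, 0, or 1 for the adaptation itself (assumed range)
adapted_intangible() {
    score=$(( $1 - 1 + $2 ))
    # Keep the result within the 0-2 range of the category.
    [ "$score" -lt 0 ] && score=0
    [ "$score" -gt 2 ] && score=2
    echo "$score"
}

adapted_intangible 2 0   # mediocre retelling of a +2 story: prints 1
adapted_intangible 2 1   # adaptation that elevates a +2 story: prints 2
adapted_intangible 0 0   # nothing to subtract from: prints 0
```

Under this reading, an adaptation can only reach the full 2 points by earning the extra adjustment on top of a +2 source.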

Revisit-ability, as the name indicates, represents whether I would want to revisit/rewatch the show later. This correlates more with my own taste or nostalgia: is this something that I would gladly jump into on a leisurely afternoon? Longer shows tend to suffer a bit by this metric, so I would take especially memorable segments/episodes into account. However, in the event of remakes and adaptations, this point should generally only be awarded to the best version of the work from my point of view.

The discretionary point should be awarded sparingly and only when a show doesn't already achieve full scores in all other categories, making the possible maximum score 5 instead of 6. This is used as an adjustment for shows that I feel the current rating system doesn't do justice. Common situations where this applies include but are not limited to:

  • categorical superiority: best of its kind;
  • a tight coupling between tangible and intangible aspects of the work: it simply won't be the same without one another;
  • quality in spite of objective limitations, especially for older shows or those with a tight budget.

Format

A TIReD rating is recorded as X=T/I/Re[+D]. For instance:

  • a show scoring 1 in tangible, 2 in intangible, 0 in revisit-ability, and 0 in discretionary would be recorded as 3=1/2/0;
  • a show scoring 1 in tangible, 0 in intangible, 0 in revisit-ability, and 1 in discretionary would be recorded as 2=1/0/0+1.
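
The notation and the category limits are simple enough to check mechanically. Below is a minimal sketch of a formatter, assuming each score is passed as a plain integer; tired_rating and in_range are hypothetical names, not part of any existing tool.

```sh
#!/usr/bin/env sh
# in_range VALUE MIN MAX: succeeds if MIN <= VALUE <= MAX.
in_range() { [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]; }

# tired_rating T I RE [D]: prints the rating as X=T/I/Re[+D].
tired_rating() {
    t=$1; i=$2; re=$3; d=${4:-0}

    # Category ranges: tangible and intangible 0-2, the other two 0-1.
    in_range "$t" 0 2 && in_range "$i" 0 2 &&
        in_range "$re" 0 1 && in_range "$d" 0 1 ||
        { echo "score out of range" >&2; return 1; }

    # No discretionary point for a show with full marks everywhere else,
    # which keeps the maximum total at 5.
    if [ "$d" -eq 1 ] && [ $((t + i + re)) -eq 5 ]; then
        echo "discretionary point not allowed on top of full marks" >&2
        return 1
    fi

    total=$((t + i + re + d))
    if [ "$d" -eq 1 ]; then
        echo "$total=$t/$i/$re+$d"
    else
        echo "$total=$t/$i/$re"
    fi
}

tired_rating 1 2 0 0   # prints 3=1/2/0
tired_rating 1 0 0 1   # prints 2=1/0/0+1
```

The two calls at the end reproduce the examples from the list above.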

Shows that I abandoned halfway, meaning I won't be able to give a rating, will be marked as DNF (did not finish).

Self Q&A

Some fragments of thoughts that I came across when designing TIReD.

Q: How should tangible points for books be awarded?

A: I'd say it's how good the writing is at face value, i.e. is it "literature" worthy. While I'm not really confident in my ability to identify great works, a +2 should at least be something better than Harry Potter.

Q: How should world settings built up in previous/related works affect the rating?

A: World building actually fits into both revisit-ability (if the system/world is interesting and makes me want to read more about it) and intangible quality (whether the character actions are justified).

Q: How was the rule for discretionary point determined?

A: The best shows should always get full score regardless of the exact scale, so awarding them discretionary points is meaningless. However, there are seemingly not-so-impressive works that really show the passion/devotion/love/good faith of the production team/author, and shows whose existence alone is a boon for their fans. I want to express my enjoyment in a way that still allows me to assess the tangible and intangible aspects of a show on an absolute scale, as any further complication can be accounted for by the discretionary point.

Q: What happens to ratings for a remake before and after you watch the original?

A: I'll adjust the score for the remake once I have experienced the original.

Q: A lot of details could be lost in translation. How to deal with translated works?

A: For now I will treat these the same way as remakes: adjust the rating if someday I come across the original.

Q: How did you come up with the name "TIReD" (and name for the categories)?

A: The first category to have a concrete name is revisit-ability. From there on it's mostly just playing around with words and initials. I almost settled on "TIRD" thanks to Urban Dictionary. Well, not everything is sh*t. 😜

Get GOing

Yes, I finished the Advent of Code this year! Aside from the problems being easier (for me) than 2019, I'm also using Go for this year's challenge and I find it to be particularly suited for this type of endeavor.

This year's puzzles mostly involve string parsing and finding efficient data structures. The majority of the logic is pretty straightforward and there's little need for sophisticated algorithms.

For string parsing, regex, which Go has built-in support for, is definitely the way to go. The abundance of parsing related problems means using only basic string manipulation could be rather painful, and I've definitely seen my share of horrible blobs of find/substr/trim.

Most of the time, slices and maps are all I needed. Go has multiple return values but no tuples, whose usage, I find, is largely replaced by either arrays or structs. The versatility of these data structures is actually increased by the language's encouragement to use constants instead of enums: storing all information as ints opens the door to some shortcuts and less conversion between types. Surely they don't give you the peace of mind that type-checked enums provide, but (ab)using them in short programs does provide the odd walking-on-a-knife-edge (or not-wearing-pants-during-Zoom-call) kind of satisfaction.

Imperative programs are easy to write in Go, mostly because of the language's plain and simple control flow and lack of mixed paradigms. There's no need to worry about whether we should use an STL algorithm or chained iterator methods: just write the loop. Reasonable mutability behaviors also help: whether it's changing a map while looping through it or passing a struct containing a slice to another function, I can get the language to do what I mean without checking the specification line by line.

There's the ZOI rule about how the only reasonable numbers are zero, one, and infinity. Quite a few other languages I know, such as Python, C++, and Rust, all seem to hinge on the extreme ends of the spectrum in pursuit of consistency: everything follows the same rules and users can dictate what the syntax means as much as the base language. Go definitely has more exceptions (without supporting them) and "one" moments: built-in containers are magically generic, their methods can have a variable number of return values, and everything else is denied the privilege of being eligible to be iterated over.

While this is just a quick comparison that doesn't touch on other traits of Go (say interfaces or goroutines, but you don't really need them for Advent of Code), I do find Go's choices peculiar and interesting: everything is, well, just its own thing.

2020 in Review

Rooftops are covered in patches of white this morning. All the billboards have lost their typical splendor to the gloomy sky. Even the street lamps' orange glow failed to add any warmth to the car-free roads. Spots of light from a handful of building windows, however, do appear extra dazzling.

What a year. It feels like space-time has a higher viscosity than usual—dense enough to reduce sunlight to just an ivory ambiance—given how eventful the past 300-or-so days have been.

I'm actually glad that the first day of 2021 still feels like any day in 2020. Not very much should physically change simply because of a number flip, not to mention a rather arbitrary one, but perhaps it's exactly for the lack of change that we need to forge something new, something that gives an adrenaline kick, no matter how small.

Ugh, fine. I see no harm in giving in to this cheap psychological trick every once in a while.

Happy New Year, we made it.

2020: Apocalypse

I'm not cutting myself any more slack this time around.

  • ☑ Run 550 miles, revised to: run 205 miles and cycle 865 miles (2.5x). [205/205][872/865]
  • ☑ Write 14 blog posts. [16/14]
  • ☑ No donuts.
  • ☐ Dive into Go and C++20. [1/2]
  • ☐ Set up proper backup workflow.
  • ☐ Read non-technical books.

Because of COVID-19, I have stopped running outdoors since early March. After a few months of hiatus, I got a bike and a trainer in June and started cycling indoors instead. The 2.5x scaling factor is based on the speed differences between cycling and running. Working out in a more controlled environment is very enjoyable. Aside from easy access to fueling and shielding from the weather, being able to watch anime/listen to seiyuu radio while riding is a game changer. Behold, technology!
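
The 865-mile figure is consistent with converting the unfinished part of the running goal at the 2.5x factor. A quick check, assuming that is how the target was derived and that the result was then rounded up to the nearest five miles (cycling_equivalent is a made-up name):

```sh
#!/usr/bin/env sh
# cycling_equivalent TARGET_RUN_MILES MILES_ALREADY_RUN FACTOR
# Converts the running miles still owed into cycling miles.
cycling_equivalent() {
    LC_ALL=C awk -v target="$1" -v run="$2" -v factor="$3" \
        'BEGIN { printf "%.1f\n", (target - run) * factor }'
}

cycling_equivalent 550 205 2.5   # (550 - 205) * 2.5, prints 862.5
```

With 872 miles actually ridden, the converted goal is met either way.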

Blogging about the blog itself still takes up a sizable portion of my posts (and is a frustratingly self-defeating practice), but I did at least accumulate quite a number of hoots: these fleeting thoughts aren't organized enough to be their own posts, but are still interesting enough that I want to write them down. I also use hoots to house my replies to other blogs, and the rather cumbersome process of doing so makes me realize how little I really have to say most of the time. Not to paint my still largely manual approach as superior, but I do think there is some merit in eliminating low-effort-high-noise content, both for myself and others.

Ah, donuts, the honey glazed shackles of guilt, the deep-fried cuffs of indulgence. While I would like to attribute this to my will of steel, it is COVID-19 that got the better of such temptations. My laziness and excitement for bunker life eliminated any chances of late night Dunkin' visits. Guess it's time to turn up the dial.

Writing Go was quite the mindless fun exercise. Finding an effective way to learn the C++20 features proved to be harder. <format> is the straightforward one and pretty much works as you'd expect (no compiler supports the standard version yet, so check out the original). <ranges> is similar to Rust's iterator methods and allows chaining, too. Maybe I should update my enumerate() with C++ post. <concepts> seems like the logical solution to the problems SFINAE tried to solve, but I don't have a good context to test out its prowess yet. On a related note, Zig's compile-time function approach to generics is also intriguing.

3 copies, check. 2 different media, check. 1 offsite backup, not yet. I'm also counting Syncthing copies here, and whether they can be relied upon as full-fledged backups is debatable. Still some way to go here.

Technically, I did read non-technical books; I just didn't finish any (not counting manga at least). The truth is, aside from those I read purely for entertainment, I am not so sure about what to read. Most non-fiction books look like success stories marinated in flattery and survivorship bias. Fiction, on the other hand, just doesn't attract me that much: knowing another story to tell isn't as exciting as learning a new algorithm for me. Gee, that sounded harsh. Do I really think my blog posts fare any better? Anyways, before admitting defeat, I will give this a more serious attempt this year.

2021: Days of Future Past

The ongoing pandemic sparked nostalgia like never before. People look back at the "normal days" with fondness that I find repulsive. Not that I'm completely immune to the atmosphere though, just that it rubs me the opposite way: I find myself growing more assertive than before. After all, doesn't everyone secretly think they are above average and thus know better, especially after reading the news? At the same time, the voice of reason tells me to suppress this urge before it turns into arrogance, or even worse, ignorance. Perhaps I should learn to let these out in the form of blog posts, like EWDs, except non-technical.

On a positive note, my transition to wake-up-at-5-sleep-before-10 schedule is a resounding success. The lockdown WFH actually helped in that I have more leeway to adjust my sleep schedule. Now I have plenty of time for exercise every morning or even the option of another two—or three if I'm really pushing it—hours of sleep. Given how I was able to clock in the last 100 miles of rides within the winter holidays, I'll bump the target mileage up a bit this year.

The schedule change also made me realize how unproductive the few hours before bed really are for me: after a day of work and a much needed dinner, I don't feel motivated enough to exercise or focus on anything for an extended period of time. Since I started beancount-ing in 2020, I'm now looking to apply a similar methodology to my time. I've been testing out Toggl Track to log how I spend the larger chunks of my day and how many minutes in between slipped away with me blanking out watching YouTube. In particular, I figured having a crude "Strava for reading" system would also make my reading goals easier to achieve. As for which books to read, I'm thinking classic fiction.

After donuts, my challenge this year is to abstain from cookies, which can frequently be found in my workplace lunch bags. It's strange how exponentially more attractive cookies are than their ingredients, i.e. sticks of butter and bags of sugar, which would be sickening to consume directly.

I wonder if this is an age thing: at some point, human auditory perception just clicks with the sound of electric guitars, making it impossible to resist. I'm looking to sink more time into learning the instrument and be good enough to play a song or two by the end of 2021.

The generation after Z is named Alpha, which makes no sense at all. To hell with inconsistent naming. To hell with COVID-19 (for other reasons, of course).

Un de ces matins disparaissent
Le soleil brillera toujours.

Bio Pages, Multiscale Writing, and XPA

Or, a roundabout way of explaining why I don't have a dedicated "About" page.

Bio Pages

I find bio pages hard to write.

I've always despised bio pages that sound like:

Scott Danger Solo, MIB, is a WHSA certified Sigma-level worm-hole surfing professional that shoots first, crosses the streams, and thinks 4th-dimensionally.

It grosses me out the same way that ego-flavored bubble gum would. I can't help but take these statements as a desperate attempt at smearing online content with every last drop of legitimacy squeezed out of grand-yet-insincere-sounding words.

Most of the time, I opt to not include a bio on my online presences. Among the few exceptions is my old WordPress blog where I put:

EE major; new to WP and not very good at it; weeb; disproportional appetite for new hardware compared to my wallet size; may appear on social networks as shimmy1996; let's be friends XDD.

Even that felt too revealing for me. In other cases, I just use random made-up sci-fi one-liners, for instance:

University of Trantor, Extraterrestrial Lifeform Breeding and Culinary Arts Major

Coming up with imaginary professions is actually a lot of fun and I can do this all day long. Just to give you a sneak peek at my stockpile:

  • Supervillain mechanic (the kind that engages in their repair and restoration, not actually in taking over the world);
  • Saturnian folklore and Demonology enthusiast;
  • Native speaker of Fishish (a dialect of Atlantish, used by most crustaceans and aquatic mammals in the North Atlantic Ocean; confusing name, I know);
  • Genff panel (chrono-voltaic modules, think of it as a reversed flux capacitor) technician;
  • Collector of ultrasonic music (no, that does not include Snake Jazz, which is inferior to Whale Blues or Bat Rock);
  • Star magnitude calibration specialist;
  • Dream composition and cinematography expert.

The list would have been longer if full-spectrum photography were not actually a thing.

Ah, see how easily I get distracted by these? Back to bio pages on a version of Earth where birds (or Biofueled InspectoR Drones if you prefer) are real and tree octopuses aren't, unfortunately.

Why do I always read bio pages under the assumption that they are written with the purpose of exerting authority or "crafting your personal brand"? Wouldn't that make me, who is showing contempt and animosity towards others' qualifications, the one actually displaying symptoms bordering on a superiority complex? Is being brought up hearing "modesty is the best policy" all the time finally backfiring? What should the bio page contain anyways? If the purpose is to sprinkle a few hashtags for others to shoehorn my personality into, I would rather not provide such a distraction from the content of the site. Then again, one can argue that if my personality as manifested through the site is easily swayed by the bio page, perhaps the content isn't really speaking much for itself after all.

Multiscale Writing

I currently classify contents on this site loosely into three categories:

  • Posts: anything with a publish time and a title;
  • Hoots: anything with a publish time but without a title;
  • Fixed: anything without a publish time.

Up till now, I have always put bio pages under the "fixed" category. However, I have come to realize this has more subtle implications.

This first struck me as I was casually browsing my RSS reader and landed on a blog post without any indication of publish time. Since I vaguely recognized the page title from memory, I instinctively scanned through the page, searching for a timestamp of any kind. After some detective work, I was able to date the post by checking the HTML source. Realizing that this page was published long ago and only showed up in my RSS reader again due to updated feeds, I promptly left the page. Could there have been subtle wording changes? Maybe, but I didn't remember my first read well enough to recognize them. Could there have been substantial additions? Equally likely, but unless there's a FOMO-inducing "updated XXXX-XX-XX" in a huge red font, I doubt I would have scrolled down. On a related note, I also see blogs displaying not only publish time, but also a glaring banner warning the readers that the contents may be out of date and the author's opinions may have changed since. Funny how the latter is apparently no longer obvious short of an explicit no-responsibility clause now, but it does illustrate the point: I treat pages without any indication of publish time as ones set in stone, completed works, and ultimate truths of the universe (from the author's view).

There's a mismatch between what I hoped to express through bio pages and the typical fixed page format itself. Well, what are the alternatives? I don't want an E/N site, as I value the process of organizing my fragments of thoughts as much as, if not more than, the process of collecting them. I've played around with the idea of a personal wiki, but I would like to have separate pages for "major versions", instead of cramming all edits, regardless of importance, into the editing history. While for technical content the latest edition with all the errata incorporated is naturally the most desirable, I don't necessarily view my former self as obsolete or wrong, yet I also don't want to mix past and present on the same page "long content" style.

I want bio pages to be condensed me-flavored words, which would be a moving target that a fixed page will forever be playing catch-up with as my thoughts evolve over time. Between fixed pages and posts, there is a missing time scale: I need something that manifests change faster than a fixed page, but is more long-lasting than a regular dated post.

XPA (eXtensible Personality Archive)

Cool name, right? It's a happy accident that XPA is also the name of a protein (and the corresponding gene) responsible for repairing DNA damage.

Now, now, before discounting this as unnecessary formality, hear me out. Instead of a single fixed bio page, I think the most fitting substitute is a collection of gradually updated documents, not dissimilar to chapters of a book. While some books, like manga or web novels, are normally published chapter after chapter non-stop Markov-process-style, I'm thinking more of a non-linear progression where rewrites and revisions can happen more frequently.

Some blogs I visit feature sections named "articles" or "opinions" that are distinct from "posts" and serve similar purposes. The format I have in mind though is closer to RFCs, PEPs, etc. XPAs would be numbered, each XPA would be a dump of my current thoughts and personality pertaining to a specific topic, and they can be superseded by a later one with similar coverage. Meanwhile, posts are reserved for concrete things I did or experienced. In other words, XPAs contain literal states of my mind and posts/hoots serve to document some of the incremental changes between those states.

Following its definition strictly, XPA is actually a much more flexible format than I originally thought: reviews could also fall under its umbrella, for instance. Great Scott, just think about all the possibilities! Now the only remaining bike-shedding to be done before I can get started is to determine how XPAs should be presented on the site, whether I count from 0, which numeral system to use, how to format the identifiers...

Hmm, naming really is hard, isn't it?