Spreading my ignorance

End of feed, I’m going anonymous

It’s the end of this feed after a decade.

I started Scratchpad as a blog for my lesser writings. I wanted to publish rough essays on it, and later turn the best of them into polished articles for my main blog.

The statistics

I had a decent rate of publishing over the past decade: 221 posts; 40,888 words; and 282,533 characters. Unfortunately I have not turned a single post into a quality long-form article. Still, I am happy with how the experiment turned out, and I am glad that I sustained a semi-regular pace of production over a decade.

Publishing is the number one priority

When I write something, even if it’s a rough draft with half-baked ideas, I try to publish it. It makes sense to put most of your work out there for the world to see, even mediocre and flawed artifacts. We are bad at judging and assessing our own creations; we have blind spots; we need others to help us, guide us, and steer us towards our most promising musings. Exposing the fruit of the creator’s labor to the audience is the only way to evaluate it.

Write, get your thoughts checked by the world, and see if somebody cares. Maybe some of your readers will send feedback, criticism, or compliments to your inbox.

Publishing with your name is risky

It doesn’t matter who you are or where you live: publishing under your personal identity is always risky. When I publish here with my real name, I expose myself to small perils: my writing may be subpar, my thinking may be faulty; this may reflect poorly on me in the future. My opinions may go against the grain of public opinion or the authorities’ interests. Now, or in 10, 20, 40 years…

Nobody knows if the author has sinned until the work has been published for long enough for the public mood to become hostile.

Publishing anonymously is less risky

If they can’t find you, they can’t get you. Anonymity can be liberating if you live in an environment hostile to deviant work and ideas.

The major downside of anonymity is total obscurity at the beginning.

Starting an anonymous blog means I have to start from absolute zero. I have to build my audience from scratch. I can’t rely on any of my existing connections and assets.

I still think it’s worth it. My hypothesis is that publishing under a pseudonym can set my creativity free. It would give me the freedom to step into topics I would normally avoid, the freedom to explore areas considered off limits.

I want to see what happens.

Publish both ways

I will split my presence on the web in two:

  1. My personal homepage, which I will revive after about 10 years of dormancy. There I will publish the material I am comfortable putting my name on.
  2. My secret playground: a feed where I can experiment and take weird, controversial, or misinformed positions, all without suffering bad consequences in my life.

Anonymity is the Internet’s greatest gift

Anonymity is the Internet’s best feature. The anonymous Internet is a parallel universe with less friction than real life; we can play in this space. The anonymous netizen can explore and experiment further.

Au revoir

I am going back to the old Internet: on the Internet, nobody knows you’re a dog. That’s the way I like it. Meanwhile the real me will still publish at henry.precheur.org; subscribe to the feed to keep yourself updated.

I installed a new battery on my Thinkpad X1 Carbon

I have loved Thinkpads since I got my first one in 2005: a T42. It worked well under Linux, a rare feat for a laptop at the time. Its keyboard was superb, with satisfying tactile feedback and a crisp, quiet click; it was the best keyboard I had ever used. I unfortunately had to give this Thinkpad back when I quit my job to move to Canada in 2007.

A few years later I got a used X61 and replaced its internal HDD with a much faster SSD, which gave the laptop a new lease on life. I used it for a few years before giving it to one of my brothers, who was attending university.

In 2013 I bought the then brand new Thinkpad X1 Carbon 1st generation. I loved it when I first saw it: it was like a MacBook Air, but in black, with somewhat open hardware and a decent keyboard. Initially a full battery lasted 6 hours.

Nine years later, the battery had aged. It held about 60% of its original capacity, but I would get at most a couple of hours out of a single charge. The laptop felt sluggish and ran hot under load. I assumed this was because modern software was more demanding than what the machine ran when it was new. My X1 Carbon was sparsely used in the past few years because it was unpleasant to use: slow, toasty, and dead in two hours.

A few weeks ago I decided to get a new Thinkpad as my work laptop. I settled on a used X270, which I’ll talk about in a later post. I got a new battery for the X270, and I saw that the store also sold batteries for my X1 Carbon, so I ordered one for 55 CAD plus shipping to try to revive my aging device.

Before the battery arrived in the mail I cleaned the outside of the Thinkpad and vacuumed it to suck the dust out before opening it. To my surprise the laptop performed better after this quick clean: the machine felt snappier and cooler, and the battery lasted a bit longer. I believe the accumulated dust impeded the cooling system, and cleaning it made the laptop work better overall. The cooling was more efficient, saving some energy, and the processor had more headroom to clock up when needed. Fewer fans spinning, fewer laps roasted, and less energy wasted. \o/

Once I got the package with the batteries in the mail, swapping the old battery for the new one was relatively easy. I followed the instructions from iFixit, and 15 minutes later the new battery was installed.

After a couple of days of use to let the battery calibrate, the laptop is back to full health. I get between 4 and 6 hours of battery life, and the computer feels responsive and cool. I did the upgrade a week ago and I’m still delighted to use this old friend of mine again.

Replacing the battery on your devices is one of the best ways to preserve the environment, and get more utility out of your electronics.

Highlights from Gerd Gigerenzer’s interview with Russ Roberts.

On the public’s concern about online privacy:

And, as you hinted before, there’s the so-called Privacy Paradox, which is that, in many countries, people say that their greatest concern about their digital life is that they don’t know where the data is going and what’s done with that.

If that’s the greatest concern, then you would expect that they would be willing to pay something. That’s the economic view. […]

[…] Germany is a good case. Because in Germany, we had the East German Stasi. We had another history before that—the Nazis, who would have enjoyed such a surveillance system.

And, so Germans would be a good candidate for a people who are worried about their privacy and would be willing to pay. […]

I have done three surveys since 2018, the last one this year. With representative sample of all Germans over 18. And asked them the question: ‘How much would you be willing to pay for all social media if you could keep your data?’

We are talking about the data about whether you are depressed, whether you’re pregnant, and all those things that they really don’t need.

So: ‘How much are you willing to pay to get your privacy back?’

75% of Germans said nothing. Not a single Euro. […]

So, if you have that situation where people say, ‘My greatest worry is about my data’; at the same time, ‘No, I’m not paying anything for that,’ then that’s called the Privacy Paradox.

The public’s concern about surveillance is similar to the concern about the environment: the public understands the problem, but doesn’t really care.

I believe most people fake their concerns about surveillance and environmental decay because that’s what they are expected to do in polite company. The public shows its true colors once it has to expend resources on solving the problem instead of merely virtue signaling.

Gerd Gigerenzer made another great point about surveillance; we get our citizens started early these days:

I think there’s already surveillance in a child’s life. Remember Mattel’s Barbie? The first Barbie was modeled after a German tabloid cartoon, the Bild-Zeitung, and it just gave totally unrealistic long legs and tailored figures. The result was that quite a few little girls found their body not right. In 1998, the second version of Ken could talk briefly—utter sentences like, ‘Math is hard. Let’s go shopping.’

The little girls got a second message: They’re not up to math. They are consumers. And the 2015 generation, called Hello Barbie, which got the Big Brother Award, can actually do a conversation with the little girl. But, the little girl doesn’t know that all the hopes and fears and anxieties it trusts to the Barbie doll are all recorded and sent off to third parties, analyzed by algorithms for advertisement purposes.

And also, the parents can buy the record on a daily or weekly basis to spy on their child.

Now, two things may happen, Russ. One is the obvious, that maybe when the little girl is a little bit older, then she will find out, and trust is gone in her beloved Barbie doll and also maybe in her parents.

But, what I think is the even deeper consequence is: the little girl may not lose trust. The little girl may think that being surveilled, even secretly, that’s how life is.

And so, here is another dimension that the potential of algorithms for surveillance changes our own values. We are no longer concerned so much about privacy. We still say we are concerned, but not really. And then, we’ll get a new generation of people.

There are already plenty of scary stories, like this one: Google falsely told the police that a father was molesting his son.

Despite all this I’m still running most of my digital life on Google’s infrastructure. I must make a move.

Bullshit is a zero-sum game

When two bullshitters meet, they usually start competing within minutes. Bullshit works best when one has a monopoly on it: as soon as there’s competition, nonsense loses some of its power. If two narcissists take part in a group conversation, they have to exaggerate more and more to grab attention. This leads to an arms race that quickly undermines the whole lying-for-status shtick.

We can see how attention seekers ruin polite conversations on social media. Being the bullshitter-in-chief is hard work: politicians and media personalities must constantly raise their game, and it is exhausting. Most bullshitters dabble in politics but aren’t really in the political arena: it’s the big league, and it’s too competitive.

Bullshitters are in the game for the easy status. They usually avoid each other and hang out in small circles of normies who will just go along with them.

Being anti-social can be a competitive advantage

A few weeks ago on Hacker News, someone asked a question about a job offer they got from Amazon. An Amazon employee replied:

For me Amazon took an unprecedented toll on my mental and physical health. I did earn enough money, but I immensely regret all the time I didn’t spend with my family over the years, all the friendships that faded, and the constant reminder from leaders how I could always do better - nothing was ever good enough.

Amazons leadership fundamentally does not see their employees as human beings. As I grew the ranks over the years, I was directly coached on removing myself from certain day to day interactions, because it would simplify decision making if I didn’t have an interest in my own people, that simply forming just work bonds was a conflict of interest in terms of doing what’s “right” for the company.

Being anti-social with colleagues can be a competitive advantage in vast bureaucracies like Amazon. Giant corporations want easily replaceable employees. When you don’t have emotional attachment to your co-workers, you are a better pawn to play with. You’ll get an excellent paycheck at Amazon, but the price is more than the time spent working. The price is a bit of your soul.

I got my first job in 2004, 18 years ago. Of these 18 years there are 4 that I regret: I worried too much, worked too hard, or felt entitled to something I didn’t earn. I stayed because of the money. It was never worth trading my peace of mind for that extra cash.

Work isn’t only about trading one’s time for money. It’s also about the sense of meaning it gives to life. Toil is meaningful because it connects us to the rest of humanity. We work alongside folks who make meaningful economic contributions to our community. When we retire, these connections are all we have left: the memories and the friendships that grew over the years are precious.

When my dad retired from the place he had worked at for almost 40 years, it was hard on him: his old company wasn’t doing well and was being dismantled by unscrupulous executives. He found solace with his old co-workers; they meet from time to time to talk and reminisce about the good old days.

What’s the point of material comfort when one’s mind is starved of meaning?

Make friends at work, and stay in touch with them.

6 things I got from “L’Enracinement” by Simone Weil

The English title of this 200-page book is The Need for Roots. I read it in French.

Simone Weil is acute, delightful, bold, insightful, and scandalous. Her vision is bucolic, and human.

Here are some salient ideas I got from her work.

1

Rome corrupted Catholicism by tying itself to the religion. God was made more like a king ruling over the world rather than the world itself.

2

France has a long history as a dictatorial police state ruled by a single person. Democracy hasn’t changed this.

3

Science replaced religion and tradition as the source of truth. While science allows us to know the world better than these old principles, it can’t tell us what is good or bad.

4

Christianity has become a matter of convenience.

5

The priesthood of science is just as corrupt as the religious priesthoods. The noble scientist is an illusion.

6

France has 3 classes: bourgeois, workers, and peasants. Moral values flow from the bourgeoisie to the workers, and then to the peasants. Money is what matters most, because that’s what matters to the bourgeois class.

Gmail’s UI replacement

Before I moved to Gmail I used mutt, a command line mail client. I downloaded and uploaded emails with offlineimap, and used mutt to view, move, and delete the downloaded emails. Offlineimap was fine, but I had a few issues with it. Most of the time it was the local state getting out of sync with the remote side: I’d hit Ctrl+C or close the terminal at the wrong moment and some local state files would get corrupted. It was usually easy to fix; occasionally I had to remove all my local emails and re-download everything.

I set up isync to see how it does, and thus far I like it. It seems a bit more lightweight than offlineimap. I haven’t had any problems with it yet.

Back when I used mutt, it was a pain to configure the way I wanted, and I never really got comfortable with it even with my custom configuration. I disliked that I had to write a lot of configuration to get it to work how I wanted. Switching to Gmail was a breath of fresh air: because shortcuts weren’t configurable I just had to learn Gmail’s defaults. Today mutt is in better shape than it was 10 years ago. I could give it another try, but I decided to go with aerc, a relatively recent email reader. aerc is opinionated about its workflow; it’s configurable, but not to the same extent mutt is. I am still learning how to use aerc, and things are a bit difficult to figure out at times: the current documentation isn’t that great. Otherwise I like its philosophy.

The end of the free Google lunch

In July 2008 I moved my email with a custom domain to Gmail. Previously I was self-hosting it on a Linode virtual private server.

The move was seamless. I was happy to avoid maintaining my own Email infrastructure; Gmail was fast, reliable, and its user interface was nicer than the other webmails I tried. I have been a happy customer for a long time, and best of all it didn’t cost me anything. Back in 2008 this system was called Google Apps, and it was free for personal use. It was renamed to G Suite, and is now called Google Workspace.

Today I received an email saying that this gravy train was about to stop unless I switched to a paid plan. I don’t mind paying for this: Google provided an excellent service for almost 14 years.

Unfortunately, since I moved my email to Gmail, I lost faith in the company’s ethics. The days when the corporation’s motto was “don’t be evil” are gone. Since the firm’s main source of revenue is advertising I believe its values are fundamentally at odds with mine. The big G has grown enormously and expanded in various areas over the past decade. I don’t feel comfortable having my personal information stirred into its gargantuan data cauldron. Anybody paying money can get a ladle of data laced with some of my dark secrets.

So it looks like 2022 will be the year I move away from Gmail. I plan to document the journey here.

I recently got myself a new phone, and I noticed that my feed-reading app, gReader Pro, didn’t get migrated to the new phone. It turns out it was no longer on Google’s app store. I paid for the app’s Pro version, and I didn’t want to subscribe for $5/month to gReader Premium. Without a Premium subscription there’s an ad at the bottom that takes up about 10% of the screen.

Luckily the app’s updated APKs are still available via GitHub. I was able to download one and get rid of the ads.

That is all.

2020 assessment

The end of the calendar year is an opportunity to look back and reflect: How has this year transformed me?

The body

The Covid-19 pandemic has kept everyone locked down at home since March. I used to bike to and from work every weekday, and I missed these 40 minutes on the bike. After a month inside I felt weaker, and my lower back ached from the constant sitting. I tried biking in the morning, but doing a loop and coming back home 30 minutes later didn’t work: I did it once and never found the motivation to do it again.

A month into the lockdown a coworker —Marty— suggested we do push-ups and publicly post our daily ‘score’, so we created a dedicated Slack channel, #beat-marty-fagan, to track our progress. Marty walks across Antarctica with his wife for fun; he knows what commitment to fitness is like.

I started doing 40 push-ups daily; after a week I added some squats and lunges. Having a group of people posting their daily workout accomplishments on Slack helped with motivation. Unfortunately after a couple of months I was the only one still posting regularly on the channel. By that point the habit was sufficiently ingrained that I kept going without the need for peer pressure. Initially I only did body weight exercises; after reading and watching Pavel Tsatsouline I got a set of kettlebells for a more effective workout. I worked out every weekday and most weekends for about 8 months: I feel more energetic, and my lower back handles long sitting sessions without getting sore. When I get out of the shower, I check myself out in the mirror for an inappropriate amount of time; it does wonders for my self-esteem. I love being a hotter me.

My current workout routine takes about 10 minutes from start to finish. I’m planning to keep doing it five times a week in 2021.

The mind

The best blog post I read in 2020 is How I read by Slava Akhmechet. This post gave me a renewed sense of purpose about reading and learning. Its salient insight is that reading 5 books on a subject will give you a better perspective on it than 99% of the people in the world have.

The advice from the article I loved the most is: “try to read 40 pages a day”; I put it into practice right away. I fell short of the 40-page objective most days, but still made steady progress. Since November I have finished 3 books: The Brothers Karamazov, Line by Line, and One Day in the Life of Ivan Denisovich. I only read 4 books in 2020; doing better next year would be nice.

Cold showers

I had a difficult Monday: I woke up with low energy and felt dizzy. So instead of being productive that day I read articles online. One of the posts I read was the excellent On stress and comfort, where Slava Akhmechet discusses how he adapted to the Covid lockdown and how cold morning showers lift his mood.

The next day I woke up still feeling tired and sluggish. So I took a 10-second cold shower that morning, and it rebooted my mind: I felt energized and had a renewed sense of purpose. I had a productive day, and went to bed feeling good.

Every day my friend Cyrille takes a cold shower or goes for a cold swim; he started 6 weeks ago and he loves it thus far. I have often heard and read about the beneficial effects of cold water. When I was a student I took cold showers to wake myself up after late night partying, and it worked great. I stopped taking cold showers after graduating, possibly because the late night partying also stopped.

There’s a randomized study that shows that people taking daily cold showers tend to be less sick. The study seems robust and its results significant. Many health professionals already use cold showers and baths to help athletes recover.

With all these anecdotes about cold showers, it’s time I jump on the bandwagon and take one daily. I’ll update this blog with my experiences.

Books from the bricks and mortar

Over the past 20 years I bought most of my books on Amazon; its inventory and delivery are the best in the world. I can find rare or specialized books priced competitively on Amazon, and its delivery is fast. The last time I got a book at a store was in 2009, and it was a gift; the last time I bought a book for myself at a store was in 2006. Amazon swallowed the book distribution sector whole. I felt sad to see small book stores close down, but I didn’t miss the book retail chains that Amazon killed one after the other.

Today Amazon is in a dominant position, but its value proposition has eroded: books got harder to find and are more expensive. Amazon’s delivery is still the best in the business, but I have had numerous issues with it over the past years. While the company always reimbursed me, these issues are a tax on my time and tranquility.

For the first time in 14 years I got books from a local bookstore this week. I did it for a few reasons:

  1. I picked up the books in store the day after ordering online. Even small inventories may have the book you’re looking for, while with Amazon you always wait for the delivery. It was a breath of fresh air to order, pick up, and start reading a book in less than 24 hours.
  2. A brick and mortar shop can order the books they don’t have in store. It’s possible to order most books Amazon sells from any book shop, at least at the ones I have looked at. I may still get some books online, but for most of my needs my local retailer serves me better.
  3. I like local pick-up. There are too many deliveries now because of the Covid-19 pandemic. I like that my local shop keeps the books around for when it’s convenient for me to pick up. Not dealing with delivery contributes to my tranquility.

Support your local book store, and see if they have what you want in stock. It’s nice to go out and talk to people.

Random things I learned about Kakoune today

I finally got around to reading Vi(m) to Kakoune. This link was buried in my bookmark list for a while, and as a former Vim user who switched to Kakoune I should have read it earlier. There are a couple of things I learned from it.

Delete to line end:

alt-ld or Gld

This made me realize alt-h and alt-l can be used to extend the selection buffer both ways. Nice.

Edit alternate file / Previously edited file:

ga (in Normal mode)

I’m going to use this one all the time from now on. ga is the gangster command I needed when editing multiple buffers.

After 3 months of using Kakoune, I already feel more ‘connected’ to it than I was with Vim. Kakoune’s commands make more sense, and I love that my configuration file is only 29 lines.

I’m not quite at the level of productivity and speed I had with Vim. It’s hard to outdo 14 years of use and fine-tuned configuration and megabytes of plugins.

Stay tuned for more random bits about Kakoune.

gmi2html

Gemini is a hip alternative to the HTTP/HTML based Internet. I don’t want to miss out on the hype, so I wrote gmi2html. It’s a text/gemini to HTML converter written in Go, and it’s hosted on sourcehut:

$ go get git.sr.ht/~henryprecheur/gmi2html
$ go install git.sr.ht/~henryprecheur/gmi2html

Its design is inspired by Rob Pike’s talk: Lexical Scanning in Go. The state of the lexer is kept in a callback; this neat trick simplifies the lexer and makes it more efficient (sketched below). gmi2html reads its input from stdin and writes the result to stdout, and there are no flags:

$ gmi2html < input.gmi > output.html
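
Here is roughly what the state-function pattern from Pike’s talk looks like; a minimal sketch of the idea, not gmi2html’s actual code:

// stateFn is the callback that holds the lexer’s state: each state
// function consumes some input and returns the next state function.
type stateFn func(*lexer) stateFn

type lexer struct {
    input string // the text/gemini source
    pos   int    // current position in the input
}

// lexLine is a stand-in for a real state function: it stops the
// lexer once the input is exhausted.
func lexLine(l *lexer) stateFn {
    if l.pos >= len(l.input) {
        return nil // no more input, stop the loop
    }
    l.pos++ // a real lexer would scan the line and emit HTML here
    return lexLine
}

func (l *lexer) run() {
    for state := lexLine; state != nil; {
        state = state(l)
    }
}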

It doesn’t support extensions like lists and headings yet. I’ll add these features in the coming weeks.

The text/gemini markup format

A few weeks ago I read the Gemini specification, and I really like the project’s markup format: text/gemini, a markup language with only essential features. It’s a line-oriented language with only four core types of lines:

  1. Text
  2. Link
  3. Preformatting toggle
  4. Preformatted text

Special formatting like headings, unordered lists, and quote blocks are also supported. And that’s all, that’s the entire markup language!
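
Here is what a small text/gemini document looks like, a made-up sample based on my reading of the spec:

# A heading

A plain text line, rendered as a paragraph.

=> gemini://example.com/ A link line: the URL first, then a label

* An unordered list item
> A quoted line

``` the preformatting toggle starts a preformatted block
everything until the next toggle line is preformatted text
```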

I love the minimalism of text/gemini and I’m considering using it for future publications instead of Markdown.

I have used Markdown for my writings over the past 15 years, and I have now grown tired of it. I used different Markdown to HTML converters over the years, and they all have different quirks. I’ve been thinking about ditching Markdown for about 5 years, but couldn’t find an alternative I liked. The text/gemini markup feels like a good fit for me. Unfortunately I couldn’t find any command line text/gemini to HTML converter; maybe I should write one?

OpenBSD’s sysupgrade

I run OpenBSD on my laptop and on a server hosted in the cloud. When I upgraded OpenBSD on my server, I provisioned a new server instance running the OpenBSD version to upgrade to; copied the configuration from the old to the new server; altered my DNS to point to the new server; and shut down the old server. For my laptop, I usually downloaded & installed the new system from the tarballs using a script I wrote, and ran pkg_add after rebooting. My script didn’t always work; I occasionally had to fix breakages after the upgrade.

That was until last week, when I used sysupgrade for the first time. Sysupgrade automatically upgrades OpenBSD: it downloads the new tarballs along with the firmware files, reboots the machine, installs the new system, and finally upgrades the packages.

In both cases the upgrade was fast, didn’t require baby-sitting, and everything worked out of the box once the computer rebooted. I had to upgrade my server twice to move from 6.6 to 6.8, since sysupgrade can’t skip intermediate versions. There was some downtime: about 2 to 3 minutes for each upgrade, for about 10 minutes of downtime in total. I also upgraded my laptop with sysupgrade: I started the upgrade, made myself some tea, and when I came back the laptop was all upgraded and ready to go.

And if you like to live on the bleeding edge, sysupgrade also lets you upgrade to a snapshot via the -s option. I used my own upgrade script to do that before, and it didn’t always work well. Now I can use sysupgrade and be confident it will work.
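
For reference, the whole upgrade boils down to one command run as root; the -s flag is the snapshot option mentioned above:

$ doas sysupgrade        # upgrade to the next release
$ doas sysupgrade -s     # upgrade to the latest snapshot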

I’m studying accounting these days. Learning about balance sheets, income statements, and cash flows brought back memories of a story I read about 5 years ago: the story of Crazy Eddie.

Crazy Eddie was an electronics retailer from New York City. Its owners committed various securities fraud from 1969 until they got caught in 1987.

For its first 10 years in business the company was crazy profitable. Crazy Eddie’s management and owners skimmed –stole & hid– cash from the company and under-reported income to pay less taxes.

In the 80’s Crazy Eddie was getting ready for its IPO, and its managers gradually reduced the amount of cash they skimmed to artificially increase income over time. The goal was to inflate its Statement of Cash Flows, to make it look like the company was getting more and more profitable in order to get a big fat valuation and raise tons of cash.

After its IPO, Crazy Eddie’s administrators didn’t slow down and committed more securities fraud. They overstated the assets’ value, laundered the money they had skimmed by re-investing it into the company, and understated accounts payable to benefit insiders and fool regulators. The company had tons of debt, and didn’t include these liabilities in its statement of cash flows, overstating the company’s position.

The Crazy Eddie story from whitecollarfraud.com is wildly entertaining; I highly recommend it. Sam Antar, Crazy Eddie’s CFO, orchestrated many of these frauds, and created this website to talk about it once he got out of prison for his shenanigans. Securities fraud ain’t no joke.

bspwm and sxhkd are a great window manager

The X window system –the most popular graphical environment for Unix operating systems like Linux and the *BSDs– gives its users the option to choose their window manager. A window manager is a program that arranges windows around the screen, and often adds decorations like a title bar with a close button, and maybe maximize and minimize buttons beside it. Tiling window managers arrange windows into mutually non-overlapping frames; that’s what I use.

I used to run dwm, and switched to bspwm and sxhkd four weeks ago. These programs work in tandem to manage windows and handle input events, and it’s a beautiful thing.

First, here’s an overview of how traditional window management works. Most window managers use something called reparenting, where the window manager becomes the top window and all the other windows are its children. This lets the window manager decorate these sub-windows. A typical event loop handles both the administration of the windows and the input events like keyboard, mouse, or touch. That’s a traditional X application event loop.

Bspwm & sxhkd are different: they split the event loop into two separate processes, and sxhkd pilots bspwm via a command line tool called bspc. The name bspwm comes from BSP, or binary space partition, while sxhkd means Simple X hotkey daemon. Sxhkd handles keyboard, mouse, and other input events; bspwm only handles window events and ignores all input events. Sxhkd drives bspwm by mapping hotkeys to invocations of the bspc command line tool that tell bspwm what to do.

Because of this split, configuration is straightforward: there are two different configuration files instead of one. Since these files have different purposes, they can use different syntaxes. Sxhkd’s is simple and powerful: a hotkey on one line, followed by the command to execute on the next line, indented.

So if you want to start xterm when pressing the Alt and the Return keys simultaneously, you put the following in sxhkd’s configuration:

alt + Return
    xterm

Bspwm’s configuration is an executable that can be written in any language; it’s executed after the window manager starts. The executable is usually a shell script that calls the bspc tool to configure bspwm, as in the sketch below. Clear configuration & minimalism make these two programs attractive options.
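
Here is a minimal bspwmrc sketch; the option names come from bspwm’s documentation, and the values are only examples:

#!/bin/sh
# bspwmrc: run once when bspwm starts
bspc config border_width 4
bspc config window_gap   8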

I use the “default” configuration that comes with sxhkd & bspwm, and the only change I made was to reduce the border between windows from 8 pixels to 4. What are my thoughts after 4 weeks? I got used to the new setup within a few days. It was easy to learn coming from dwm; your experience may be different if you have never used a tiling window manager.

bspwm and sxhkd are a great window manager. If you are running dwm, i3, xmonad or some other tiling window manager, they may be a good alternative.

Khan Academy course review: Finance and capital markets / Housing

I finished the Housing module of Khan Academy’s course Finance and capital markets. First it explains the process of evaluating if one should buy or rent depending on one’s personal circumstances. It’s an important evaluation that everyone should do, preferably away from their Realtor.

The course then goes over all the steps of evaluating, buying, and paying off a place. Seeing the calculations and detailed explanations for the different examples helps one understand how slightly different inputs can create vastly different outcomes. For example, if the price of the house goes up, the buyer’s equity in the house goes up, and she can borrow cash with that newly created equity as collateral. This is called a home equity loan, and that thing looks terrifying to me. One line from that part I really liked: when people buy a place with a mortgage, they go from renting a place to renting money for a place. The interest on the loan is the rent.
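
To make the “renting money” line concrete, here is a back-of-the-envelope example with numbers of my own, not the course’s:

loan: $400,000 at a 3% annual interest rate
yearly “rent” on the money: 400,000 × 0.03 = $12,000
monthly: 12,000 / 12 = $1,000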

The course also explains the etymology of the word mortgage: the bank pledges to give the title after the debt is paid off. Mortgage is an old French word, a portmanteau composed of two words: mort –dead– and gage –pledge–. It’s called a dead pledge because the deal dies when it is fully paid or when payment fails.

The course then introduces the different types of mortgages: fixed, adjustable, and balloon. Fixed mortgages have a fixed rate for a set number of periods. Adjustable mortgages, or ARMs, are based on an index, like one-year treasury bonds; these are riskier than fixed-rate mortgages. ARMs are often mixed with fixed-rate mortgages to make the risk more palatable to the buyer. Then there are balloon mortgages. If you like to take on as much risk as possible, and want a mortgage that can blow up in your face, you’ll love balloon mortgages.

The module finishes with a high-level overview of the house buying process, and explains what title, deed, and escrow are. I’m not planning to buy a place anytime soon, and this part gave me another reason to wait: buying a place is long, bureaucratic, stressful, and can be risky.

Khan Academy course review: Finance and capital markets / Interest and debt

I watched the first module of the course Finance and capital markets: Interest and debt. It’s a series of 16 videos, each about 10 minutes long. I had rudimentary knowledge of finance and accounting before watching these videos; here’s what I learned.

I learned the term Principal: it’s the initial amount of money invested or loaned, from which interest is calculated. The course also introduced the Rule of 72, which I already knew, and went into more detail explaining why it is a good heuristic. I learned about the credit card system and the interchange fees: where the money goes and in what proportion. I learned about payday loans and how they work, and how some people get around the credit system to still get some money when they need it to pay their bills. Finally I learned more about the eerie number e, and how it can be used to calculate interest continuously.
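
As a quick illustration of the Rule of 72 (my numbers, not the course’s): divide 72 by the growth rate in percent to estimate how long money takes to double.

doubling time ≈ 72 / rate
at 6% per year: 72 / 6 = 12 years
exact value: ln(2) / ln(1.06) ≈ 11.9 years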

The pace of the lectures is mellow, but one needs to pay active attention to absorb the material. I casually watched most of these videos years ago and didn’t retain much; this time I took notes, and I felt I covered and absorbed the material much better.

One of the things I lost when I switched from Vim to Kakoune is digraphs. Vim’s digraphs let you input non-ASCII characters by replacing multi-character combinations with the corresponding Unicode character. I was born in France and I need to input accented letters when I write to my folks back home. Because I use a QWERTY keyboard, I used Vim’s digraphs to get around the lack of accented Latin characters on that keyboard’s layout. With Kakoune I needed to find another solution. I looked at different options like digraph.kak, but I didn’t want to depend on an external program or a plug-in for this.

After digging a bit more I found this blog post that mentioned the altgr-intl variant of the US keyboard layout in X.org. It uses the right Alt key as a dead key to input accented characters. It’s easy to set up; I added the following line to my .xsession file:

setxkbmap -rules evdev -model evdev -layout us -variant altgr-intl

The US altgr-intl layout works well and I got used to it quickly; it doesn’t interfere with my usual workflow since I rarely use the right Alt key. For me this is a better solution than Vim’s digraphs, because it works with every X11 application, like web browsers and terminals, and it keeps my Kakoune configuration lightweight.
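
For example, assuming the standard altgr-intl mappings:

AltGr + e           →  é   (direct mapping)
AltGr + ` then a    →  à   (dead grave, then the letter)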

Old mistakes we keep on fixing

Releasing software publicly is scary: with new features come bugs and errors. Some of these bugs may be unfixable in practice, and stick around forever.

Today I encountered such a bug with aspell, a Unix command-line spell checker. I was integrating aspell with the Kakoune text editor. When I executed the command :spell fr in Kakoune nothing happened. I was expecting the bad words to be highlighted, but none of the glaring mistakes in my prose got highlighted. I then looked at the *debug* buffer to see what was up, and I saw this:

Error: The file "..." is not in the proper format. Incompatible hash function.

Luckily it was easy to track down the issue: there’s a page in aspell’s documentation about it. Long story short: language spell files built on 32-bit systems aren’t compatible with 64-bit systems. Someone didn’t realize that using the size_t type for an on-disk data structure was a bad idea. Oops.

It’s this type of bug that makes releasing software terrifying. The bug is baked into aspell forever unless someone adds a way to handle both types, and that could be a significant refactoring of an unknown code base. The tragedy of the commons keeps on playing.

Most distributions handle this by making aspell dictionaries architecture-specific. I bumped into this bug because I use Void Linux, which ships aspell dictionaries as an architecture-independent package. This has been fixed already, but I’m convinced I won’t be the last person to work around this bug.

14 years ago I ditched Emacs as my main text editor and switched to Vim. I felt Emacs was too bloated and slow; I tried Vim for a few days and never looked back once I got comfortable with it.

Vim is still fast today, but after 14 years of use I’ve come to dislike its limitations and weird quirks. It does too much now, and it isn’t as ‘elegant’ as I wish it was.

I tried the Acme editor, and I loved the ideas behind it, but it wasn’t suited to my keyboard-focused workflow. I switched back to Vim after a single day of use, but the experience made me crave something better.

I also tried Neovim on and off over the past year, but I didn’t see the point of switching to a fork of an editor I’ve come to dislike.

After reading good things about it online, I started using the Kakoune text editor, and I’m impressed by its design and trade-offs thus far. I’m still getting used to Kakoune’s editing and navigation style, and yet I already feel productive. After just a day using it I think it may be the one. Expect more posts about Kakoune in the near future.

Zettelkasten with plain Vim

The zettelkasten (German: “slip box”) is a knowledge management and note-taking method used in research and study.

I have seen posts about this method pop up on Hacker News & Lobsters lately. A zettelkasten is like a personal wiki for your research; its salient idea is linking cards or files together. I have maintained a spark file since 2012, and the concept is similar to a zettelkasten: it’s a collection of notes, each about a single idea, revisited and re-edited regularly. Mine is a Git repository with about 160 text files containing 400,000 characters as of today. Since learning about the zettelkasten method I link between files more, and it helps me revisit old notes and material more effectively.

I navigate it with Vim’s gf command, which lets you edit the file whose name is under or after the cursor. With this command I can easily link files together; it keeps the organization of my zettelkasten simple, and it works out of the box with Vim.

For example, if I want to link network/http.txt to network/ip.txt, I add a line to network/http.txt with the filename I want to link to: network/ip.txt. When I position my cursor over this line and press gf, Vim opens network/ip.txt.

For tagging files together I use outline files: when I want to link a set of files together, I create a new file and add all the filenames to it, like an index file.
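
For instance, a hypothetical outline file network/index.txt tying my networking notes together would just contain a filename per line (network/dns.txt is made up for the example):

network/http.txt
network/ip.txt
network/dns.txt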

Since I got into the habit of linking files together I feel more inspired to write and produce. I have revisited old ideas, and I feel the creative juices flowing again.

Playing with web fonts

I updated this blog’s fonts because I wanted to make it snappier to download and display. It uses Charter and Anonymous Pro as its body and monospace fonts. Before the update, Charter was a WOFF font hosted on my server and Anonymous Pro was a TTF font hosted on Google® Fonts™ —another piece of Google®’s surveillance network—.

Instead, I stole the ideas in the CSS from Butterick’s Practical Typography, and embedded the WOFF2-formatted fonts encoded as base64 strings in the CSS style-sheet. Before this update, the browser used the default system fonts while the external fonts loaded, and then swapped the fonts. This made the page blink a second after it appeared; that blinking effect is now gone.
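
The embedding looks roughly like this, a sketch with the base64 payload elided:

@font-face {
    font-family: 'Charter';
    /* the WOFF2 file is inlined as a base64 data URI */
    src: url('data:font/woff2;base64,...') format('woff2');
}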

All the fonts together weigh about 140KB, which is a bit much for a mandatory resource. My index page with all the posts I have made since 2012, plus the fonts, weighs 210KB gzipped as of today. That’s far from the tens of megabytes of data the typical webpage eats in bandwidth. So hopefully this translates into a better experience for you, my readers.

According to Firefox’s debugger this blog takes 32 seconds to fully download and display on a GPRS connection —about 10KB/second—; that works out to about 2.5 seconds on a 3G connection.

There’s something else I’d like to improve: WOFF2 fonts are supported only by relatively new browsers. I would like to find a way to load the fonts only if the browser supports them. Maybe a second CSS file that’s conditionally loaded would do the trick.

That’s all for today.

Spring cleaning bucket list: Scrub out surveillance capitalism

I’m doing some virtual spring cleaning: I moved all my websites to a new server because the old one will shut down soon.

This reminded me that I track you all with the largest surveillance capitalism network in the world: Google®, with their service Analytics™. This service lets me see how little you all care about my writing, which depresses me a little. Also I feel like an ass for selling your data to an already too powerful corporation.

I removed the Google® Analytics™ tracker from all my websites, including this blog.

This place is safe now, and I feel better.

I’m working on updating this blog’s look. Charter is the blog’s body font; it’s served in the EOT and WOFF formats. Now there’s a new hot web font format in town: WOFF 2, like WOFF but better I guess.

I looked for a WOFF2 version of Charter, and I found it in the Wikimedia UI style guide’s repository. This may be a good option for those looking for a great-looking font with a permissive license.

Pango 1.44 dropped support for bitmap fonts. This is a problem for me, since my favorite font —Terminus— is a bitmap font. It means that Vim’s GTK front-end no longer works with Terminus.

To get around this I simply run Vim, built with X11 clipboard support, in a terminal. Here’s the little script I put together to do this:

#!/bin/sh

# Void Linux vim command doesn't support X11 clipboards
if command -v vim-x11 > /dev/null
then
    readonly editor=vim-x11
else
    readonly editor=vim
fi

exec urxvt -fn 'xft:Terminus:size=18' -e "$editor" "$@"

Go CBOR encoder: Episode 11, timestamps

This is a tutorial on how to write a CBOR encoder in Go, where we’ll learn more about reflection and type introspection.

Make sure to read the previous episodes; each episode builds on the previous one.


In the previous episode we improved floating point number support in our encoder. We have implemented all the Go native types; now we’ll implement a custom type: time.Time, the timestamp type from Go’s standard library. The CBOR format natively supports 3 timestamp representations: RFC3339 strings, epoch-based integers, and epoch-based floating point numbers.

The CBOR format has special values called tags to represent data with additional semantics, like timestamps. A tag’s header has major type 6 and encodes an integer that determines the tagged content’s type; each tagged type has a unique integer identifier.

For example, URIs are represented as a tagged unicode string: first there’s the header with major type 6 —indicating it’s a tagged value— encoding the integer 32 —the URIs’ identifier—, followed by the URI encoded as a UTF-8 CBOR string.
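
The spec’s examples include the URI http://www.example.com tagged this way; byte by byte it looks like this:

d8 20                      tag header: major type 6, value 32 (URI)
76                         UTF-8 string of length 22
68 74 74 70 3a 2f 2f       "http://"
77 77 77 2e 65 78 61       "www.exa"
6d 70 6c 65 2e 63 6f 6d    "mple.com"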

How can we detect if we have a time.Time value in the encoder? Looking at time.Time’s definition we see that it’s a struct, a kind of value we already handle in the encoder. The reflect package lets us query and compare values’ types, so we will check if the value’s type is time.Time when we have a reflect.Struct kind, and write a CBOR timestamp when that’s the case.

There’s a bit of gymnastics needed to get time.Time’s type without allocating extra objects; we can either do:

reflect.TypeOf(time.Time{})

Or:

reflect.TypeOf((*time.Time)(nil)).Elem()

In the first case we create an empty time.Time object, pass an interface pointing to it to reflect.TypeOf, and get its reflect.Type back. In the second case we create a typed nil pointer to time.Time and retrieve the type it points to with Elem(). We’ll use the second way because it doesn’t create an empty time.Time object and is therefore a bit more efficient.
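
Since this type never changes, one option (a habit of mine; the episode keeps the expression inline) is to compute it once in a package-level variable:

// timeType is computed once, when the package is initialized
var timeType = reflect.TypeOf((*time.Time)(nil)).Elem()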

In the main switch block we add a conditional statement in the reflect.Struct case to check if the struct’s type is time.Time:

case reflect.Struct:
	if x.Type() == reflect.TypeOf((*time.Time)(nil)).Elem() {
		return ErrNotImplemented
	}
	return e.writeStruct(x)

Timestamps have two tagged data item types: 0 for RFC3339 timestamps encoded as unicode strings, and 1 for epoch-based timestamps —floating point & integer values—. Let’s add a new function to write the timestamps: writeTime. We’ll handle string timestamps first, and implement scalar epoch-based timestamp types second. Starting with RFC3339 strings, we look up the example from the spec, and add our first test case:

func TestTimestamp(t *testing.T) {
    var rfc3339Timestamp, _ = time.Parse(time.RFC3339, "2013-03-21T20:04:00Z")

    var cases = []struct {
        Value    time.Time
        Expected []byte
    }{
        {
            Value: rfc3339Timestamp,
            Expected: []byte{
                0xc0, 0x74, 0x32, 0x30, 0x31, 0x33, 0x2d, 0x30, 0x33, 0x2d,
                0x32, 0x31, 0x54, 0x32, 0x30, 0x3a, 0x30, 0x34, 0x3a, 0x30,
                0x30, 0x5a,
            },
        },
    }

    for _, c := range cases {
        t.Run(fmt.Sprintf("%v", c.Value), func(t *testing.T) {
            testEncoder(t, c.Value, c.Expected)
        })
    }
}

Back in cbor.go we add a few header constants required to encode the new tagged types:

const (
    // major types
    ...
    majorTag             = 6
    ...

    // major type 6: tagged values
    minorTimeString = 0
    minorTimeEpoch  = 1
    ...
)

The function writeTime writes the tag’s header with minorTimeString to indicate a string follows, then it converts the timestamp into an RFC3339 string and writes it to the output:

func (e *Encoder) writeTime(v reflect.Value) error {
    if err := e.writeHeader(majorTag, minorTimeString); err != nil {
        return err
    }
    var t = v.Interface().(time.Time)
    return e.writeUnicodeString(t.Format(time.RFC3339))
}

We hook it up to the rest of the code by adding a call to writeTime in our main switch statement:

case reflect.Struct:
	if x.Type() == reflect.TypeOf((*time.Time)(nil)).Elem() {
		return e.writeTime(x)
	}
	return e.writeStruct(x)

A quick go test confirms that writing string timestamps works; now let’s get started with epoch-based timestamps.

Epoch-based timestamps are scalar values where 0 corresponds to the Unix epoch (January 1, 1970); they can be either integer or floating point values. We’ll minimize the size of our output by using the most compact type that doesn’t lose precision. The timestamp can be an integer, a floating point number, or an RFC3339 string. If the timestamp’s timezone isn’t UTC we have to use the largest representation, RFC3339 strings, because we need to encode the timezone information and we can’t do that with scalar timestamps. If the timestamp’s timezone is UTC or nil we can use a scalar timestamp, because scalar timestamps are implicitly in UTC. We’ll use an integer when the timestamp can be represented as whole seconds, and a floating point number otherwise.

First we add a condition to only use RFC3339 strings when the timestamp has a timezone that’s not UTC:

func (e *Encoder) writeTime(v reflect.Value) error {
    var t = v.Interface().(time.Time)
    if t.Location() != time.UTC && t.Location() != nil {
        if err := e.writeHeader(majorTag, minorTimeString); err != nil {
            return err
        }
        return e.writeUnicodeString(t.Format(time.RFC3339))
    }
    return ErrNotImplemented
}

Because we are changing the behavior of writeTime when the timezone is UTC, we have to fix the first test case to use a timestamp with a non-UTC timezone; otherwise the test will fail with ErrNotImplemented returned. We replace the Z —a shortcut for the UTC timezone— at the end of rfc3339Timestamp with +07:00:

func TestTimestamp(t *testing.T) {
    var rfc3339Timestamp, _ = time.Parse(time.RFC3339, "2013-03-21T20:04:00+07:00")

    var cases = []struct {
        Value    time.Time
        Expected []byte
    }{
        {
            Value: rfc3339Timestamp,
            Expected: []byte{
                0xc0, 0x78, 0x19, 0x32, 0x30, 0x31, 0x33, 0x2d, 0x30, 0x33, 0x2d,
                0x32, 0x31, 0x54, 0x32, 0x30, 0x3a, 0x30, 0x34, 0x3a, 0x30, 0x30,
                '+', '0', '7', ':', '0', '0',
            },
        },
    }
    ...
}

Let’s implement floating point numbers when there’s no timezone information to encode. As usual we start by adding a test case for this from the spec:

func TestTimestamp(t *testing.T) {
    ...
    var cases = []struct {
        Value    time.Time
        Expected []byte
    }{
        ...
        {
            Value:    time.Unix(1363896240, 0.5*1e9).UTC(),
            Expected: []byte{0xc1, 0xfb, 0x41, 0xd4, 0x52, 0xd9, 0xec, 0x20, 0x00, 0x00},
        },
    }
    ...
}

Note that we had to call the .UTC() method on the time.Time object returned by time.Unix: otherwise the object would have the computer’s local timezone associated with it, and calling the UTC method gets us a UTC timestamp.

Since time.Time stores its internal time as an integer counting the number of nanoseconds since the Epoch, we’ll have to convert it into a floating point number in seconds before writing it. To do this we define a constant to convert from nanoseconds to seconds, using the time package’s units:

const nanoSecondsInSecond = time.Second / time.Nanosecond

Then we add the code after the block handling string timestamps. We write the header with minorTimeEpoch as its sub-type to indicate we have a scalar timestamp, then write the converted value as a floating point number:

func (e *Encoder) writeTime(v reflect.Value) error {
    var t = v.Interface().(time.Time)
    if t.Location() != time.UTC && t.Location() != nil {
        if err := e.writeHeader(majorTag, minorTimeString); err != nil {
            return err
        }
        return e.writeUnicodeString(t.Format(time.RFC3339))
    }

    // write an epoch timestamp to preserve space
    if err := e.writeHeader(majorTag, minorTimeEpoch); err != nil {
        return err
    }
    var unixTimeNano = t.UnixNano()
	return e.writeFloat(
		float64(unixTimeNano) / float64(nanoSecondsInSecond))
}

If the timestamp in seconds is an integer number we can write it as an integer timestamp without losing precision. Integers are usually more compact than floating point numbers, so we’ll always use them when possible. Another test case from the spec makes it into cbor_test.go:

func TestTimestamp(t *testing.T) {
    ...
    var cases = []struct {
        Value    time.Time
        Expected []byte
    }{
        ...
        {
            Value:    time.Unix(1363896240, 0).UTC(),
            Expected: []byte{0xc1, 0x1a, 0x51, 0x4b, 0x67, 0xb0},
        },
    }

    ...
}

To determine if we can write an integer timestamp we check that the fractional part of the timestamp in seconds is zero; then we convert unixTimeNano into seconds, set the CBOR integer header’s minor type depending on the timestamp’s sign, and use writeInteger to write the timestamp:

const nanoSecondsInSecond = time.Second / time.Nanosecond

func (e *Encoder) writeTime(v reflect.Value) error {
    ...

    // write an epoch timestamp to preserve space
    if err := e.writeHeader(majorTag, minorTimeEpoch); err != nil {
        return err
    }
    var unixTimeNano = t.UnixNano()
    if unixTimeNano%int64(nanoSecondsInSecond) == 0 {
        var unixTime = unixTimeNano / int64(nanoSecondsInSecond)
        var sign byte = majorPositiveInteger
        if unixTime < 0 {
            sign = majorNegativeInteger
            unixTime = -unixTime
        }
        return e.writeInteger(sign, uint64(unixTime))
    } else {
        return e.writeFloat(
            float64(unixTimeNano) / float64(nanoSecondsInSecond))
    }
}

And that’s all we needed to do to support the non-native type time.Time!

We are done writing our CBOR encoder. If you would like to see other things covered, feel free to reach me at henry@precheur.org.

Go CBOR encoder: Episode 10, special floating point numbers

This is a tutorial on how to write a CBOR encoder in Go, where we’ll learn more about reflection and type introspection.

Make sure to read the previous episodes; each episode builds on the previous one.


In the previous episode we added floating point number support to our encoder.

We minimized the size of the output without losing precision. There’s still room for improvement though: we encode all regular floating point numbers as 16-bit numbers when possible, but the IEEE 754 standard also has special numbers —infinities, NaN (not a number), and subnormal numbers— that can be packed more efficiently.

The way the encoder works now, these special values are all encoded as 32- or 64-bit floats, and lots of them can be encoded as 16-bit numbers without losing information.

We’ll start with infinite values, then not-a-number values, and finish with subnormal numbers.

For infinite values, there are two variants: positive and negative. The only thing that changes between them is the sign bit; the exponent is all 1’s, and the fractional part is all 0’s. Infinite values are easy to detect in Go with the math.IsInf function. We add an if block with math.IsInf at the beginning of the writeFloat function, and write a 16-bit float with all 1’s in the exponent and all 0’s in the fraction:

func (e *Encoder) writeFloat(input float64) error {
    if math.IsInf(input, 0) {
        return e.writeFloat16(math.Signbit(input), (1<<float16ExpBits)-1, 0)
    }
    ...
}

NaN, or Not a Number, is similar to the infinities but has a varying fractional part. The fractional part of a NaN carries some information; we’ll copy it as is and just chop off the end, since all the important information is in the first few bits. We add the following to the second switch statement in writeFloat:

func (e *Encoder) writeFloat(input float64) error {
    ...
    var (
        exp, frac     = unpackFloat64(input)
    )
    ...
    switch {
    case math.IsNaN(input):
        return e.writeFloat16(math.Signbit(input), 1<<float16ExpBits-1, frac)
    ...
    }
}

And that’s all we need for not a number. To verify we implemented it correctly we add the corresponding test cases from the CBOR spec in cbor_test.go:

func TestFloat(t *testing.T) {
    var cases = []struct {
        Value    float64
        Expected []byte
    }{
        ...
        {Value: math.Inf(1), Expected: []byte{0xf9, 0x7c, 0x00}},
        {Value: math.NaN(), Expected: []byte{0xf9, 0x7e, 0x00}},
        {Value: math.Inf(-1), Expected: []byte{0xf9, 0xfc, 0x00}},
        ...
    }

    for _, c := range cases {
        t.Run(fmt.Sprintf("%v", c.Value), func(t *testing.T) {
            testEncoder(t, c.Value, c.Expected)
        })
    }
}

We now store infinities and NaN tightly, but here comes the hard part: subnormal numbers. There’s a lot of bit fiddling ahead.

When an exponent’s binary value is all 0’s, it means we have a subnormal number, and zero is encoded as one. Zero needs a special representation because it cannot be represented when the fractional part is prefixed by an implicit 1 like with regular floating point numbers: even if the fractional were all zeros and the exponent very small, a regular floating point number can’t precisely represent 0 because there’s always a 1 somewhere in the significand (like 0.000…01). Therefore we have subnormal numbers that start with a 0 instead of a 1, to represent zero exactly and other very small numbers more accurately.

Let’s start by efficiently encoding zero and negative zero. Negative zero is zero with its sign bit set to one. Here are the two test cases we add to our TestFloat test in cbor_test.go:

...
{Value: 0.0, Expected: []byte{0xf9, 0x00, 0x00}},
{Value: math.Copysign(0, -1), Expected: []byte{0xf9, 0x80, 0x00}},
...
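To read the expected bytes: 0xf9 is the header for a 16 bits float (major type 7, additional value 25, the minorFloat16 constant from the previous episode), followed by the two bytes of the half-precision value itself. For negative zero only the sign bit, the top bit 0x80, is set.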

To get a negative zero in Go we have to use the math.Copysign function, because the compiler turns the expression -0.0 into a positive zero. We turn the if statement at the beginning into a switch, with an additional case to detect zero, and encode it as a 16 bits float to preserve space:

func (e *Encoder) writeFloat(input float64) error {
    switch {
    case input == 0:
        return e.writeFloat16(math.Signbit(input), 0, 0)
    case math.IsInf(input, 0):
        ...
    }
    ...
}

We don’t check if the input equals -0 because -0 equals 0. Zeros are done!

What other numbers can we represent as subnormal numbers? Let’s learn more about them and how they differ from regular numbers. Here’s the formula for regular 16 bits floating point numbers:

(−1)^signbit × 2^(exponent−15) × 1.significantbits₂

When we have 16 bits subnormal numbers the formula turns into:

(−1)^signbit × 2^(−14) × 0.significantbits₂

Regular numbers’ significands are prefixed with a 1 bit, while subnormal numbers’ start with a 0 bit. This means that by shifting the fractional bits to the right, we can represent regular numbers with an exponent lower than -14 as subnormal numbers. We’ll use the smallest 16 bits subnormal number, 5.960464477539063e-8, as an example. Its regular floating point representation is:

2^(−24) × 1.0000000000₂

The fractional part is all zeros and the exponent is -24. How can we represent it as a 16 bits floating point number when the exponent is fixed at -14 and can’t be changed? We shift the fractional part to the right: every time we shift the fractional part right by 1 bit it’s equivalent to lowering the effective exponent by 1.

For our example we shift the fractional part right by 10 bits, which is equivalent to lowering the exponent by 10, from -14 down to -24:

2^(−24) × 1.0000000000₂ = 2^(−14) × 0.0000000001₂

As long as we can shift the fractional part to the right without dropping any 1’s we can represent the number as a 16 bits float. In summary, to encode a value as a 16 bits subnormal number we have to:

  1. Verify the exponent and the number of trailing zeros are within the ranges required to encode the input exactly
  2. Add the implicit leading 1 to the head of the fractional, since subnormal numbers’ fractionals don’t carry the implicit 1 that regular numbers’ do
  3. Shift the fractional part right to match the number’s exponent

The smallest possible 16 bits subnormal number is one of the examples in the CBOR spec. Let’s add it to the TestFloat test suite:

...
{Value: 5.960464477539063e-8, Expected: []byte{0xf9, 0x00, 0x01}},
...

To check if we have a number that can be encoded as a subnormal number we add a predicate function subnumber() with two parameters: the exponent, and the number of trailing zeros in the fractional part. It verifies that the exponent is within the range of what’s representable by a subnormal number, and that we don’t drop any 1’s from the fractional when we cut it:

func subnumber(exp int, zeros int) bool {
    var d = -exp + float16MinBias
    var canFitFractional = d <= zeros-float64FracBits+float16FracBits
    return d >= 0 && d <= float16FracBits && canFitFractional
}
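subnumber relies on a float16MinBias constant that isn’t shown in this excerpt. Assuming it follows the naming of the other float16 constants, it would be something like:

const (
    // smallest exponent a 16 bits float can represent; subnormal
    // numbers are encoded relative to this fixed exponent
    float16MinBias = -14
)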

Then we add a case statement at the beginning of the second switch, so that we encode the value as a 16 bits subnormal number when possible, and fall back to a 32 bits float otherwise:

func (e *Encoder) writeFloat(input float64) error {
    ...
    var (
        exp, frac     = unpackFloat64(input)
        trailingZeros = bits.TrailingZeros64(frac)
    )
    if trailingZeros > float64FracBits {
        trailingZeros = float64FracBits
    }
    switch {
    ...
    case subnumber(exp, trailingZeros):
        // this number can be encoded as a 16 bits subnormal number
        frac |= 1 << float64FracBits
        frac >>= uint(-exp + float16MinBias)
        return e.writeFloat16(math.Signbit(input), 0, frac)
    case float64(float32(input)) == input:
    ...
    }
}

Let’s take a closer look, step by step. When subnumber() matches, we build the new fractional part by prefixing the fractional part with a 1; this is the implicit 1 prefix from the regular number formula:

frac |= 1 << float64FracBits

Then we shift the fractional by the difference between the number’s exponent and the fixed exponent: -14 for 16 bits subnormal numbers:

frac >>= uint(-exp + float16MinBias)

Finally we write the number as a 16 bits floating point with a zero exponent:

return e.writeFloat16(math.Signbit(input), 0, frac)

One last run of go test confirms that everything works. We now pack all the special float values tightly, and with the subnormal numbers optimization we just implemented we also pack 2^10 more numbers efficiently as 16 bits floats.

We successfully encoded one of the most complex types Go natively supports. Next time we’ll implement a custom type: timestamps.

Check out the repository with the full code for this episode.

Go CBOR encoder: Episode 9, floating point numbers

This is a tutorial on how to write a CBOR encoder in Go, where we’ll learn more about reflection and type introspection.

Make sure to read the previous episodes; each episode builds on the previous one:


This episode is about floating point numbers. There are 3 kinds of floating point numbers supported by CBOR: 16, 32, and 64 bits floats.

Go only supports float32 & float64 natively. To support 16 bits numbers we will build the 16 bits values ourselves. We’ll implement 32 & 64 bits floats first, and then do the 16 bits numbers. We’ll minimize the size of the output by encoding numbers as tightly as possible: we’ll use 64 bits numbers only when a smaller size would lose precision. We don’t want to lose information or precision; the encoded numbers have to be exact.

As usual we take some examples from the CBOR spec, and look for numbers that can only be represented as 32 and 64 bits floats and add a test case for them. We find that 100,000.0 can be encoded exactly with a float32, while 1.1 can only be represented by a float64.

We start with those two examples and add the new test:

func TestFloat(t *testing.T) {
    var cases = []struct {
        Value    float64
        Expected []byte
    }{
        {
            Value:    1.1,
            Expected: []byte{0xfb, 0x3f, 0xf1, 0x99, 0x99, 0x99, 0x99, 0x99, 0x9a},
        },
        {Value: 100000.0, Expected: []byte{0xfa, 0x47, 0xc3, 0x50, 0x00}},
    }

    for _, c := range cases {
        t.Run(fmt.Sprintf("%v", c.Value), func(t *testing.T) {
            testEncoder(t, c.Value, c.Expected)
        })
    }
}

To decide whether to use float32 or float64 for a value we convert the value to float32 and compare it to the original float64 value. If both values are the same we can safely encode the number as a float32 without losing precision. Let’s add a new function writeFloat to do that:

const (
    // floating point types
    minorFloat16 = 25
    minorFloat32 = 26
    minorFloat64 = 27
)

func (e *Encoder) writeFloat(input float64) error {
    if float64(float32(input)) == input {
        if err := e.writeHeader(majorSimpleValue, minorFloat32); err != nil {
            return err
        }
        return binary.Write(e.w, binary.BigEndian, float32(input))
    } else {
        if err := e.writeHeader(majorSimpleValue, minorFloat64); err != nil {
            return err
        }
        return binary.Write(e.w, binary.BigEndian, input)
    }
}

We add writeFloat to our big switch statement on the input’s type:

switch x.Kind() {
...
case reflect.Float32, reflect.Float64:
	return e.writeFloat(x.Float())
}

go test confirms TestFloat passes. We are done with 32 and 64 bits floats. The first part was easy, but the second part won’t be this simple: there’s more work ahead of us.

Next let’s add support for 16 bits floats. As mentioned before Go doesn’t support float16 natively, so we’ll generate the binary value ourselves. What kind of number can we store in a 16 bits float? A 16 bits float looks like this:

SEEEEEFFFFFFFFFF

S is the sign bit, 0 for positive, 1 for negative. EEEEE is the 5 bits exponent, and FFFFFFFFFF is the 10 bits fractional part.

According to the IEEE 754 spec the 5 bits exponent’s range is -14 to 15. If a number’s exponent is within those limits we can encode it as a 16 bits float.

The 10 bits fractional is quite a bit smaller than the 23 bits fractional of 32 bits floats, let alone the 52 bits of 64 bits floats. We lose precision when we chop off the end of a number’s fractional part if there’s a 1 anywhere in the dropped bits. To prevent this we’ll count the fractional’s trailing zeros to ensure we’re not dropping any 1’s. In summary we can encode a number as a 16 bits float if and only if:

  1. Its exponent is between -14 and 15
  2. Its fractional part doesn’t have any 1’s after its 10th bit

Let’s write some code: first we break down our numbers into those 3 parts. We add the function unpackFloat64 to decompose float64 into its sign bit, exponent, and fractional part. We unpack 64 bits floats because it’s the type with the highest precision we support, and all numbers can be represented as float64. We also add constants at the top to use for bit mask and shifting operations:

const (
    float64ExpBits  = 11
    float64ExpBias  = 1023
    float64FracBits = 52

    expMask  = (1 << float64ExpBits) - 1
    fracMask = (1 << float64FracBits) - 1
)

func unpackFloat64(f float64) (exp int, frac uint64) {
    var r = math.Float64bits(f)
    exp = int(r>>float64FracBits&expMask) - float64ExpBias
    frac = r & fracMask
    return
}

math.Float64bits converts the floating point number to a uint64 containing the float64’s raw binary value. We then extract the exponent by shifting r right by float64FracBits and masking it with expMask to trim off the sign bit. The result is converted to an integer and we subtract the exponent’s bias from it to get the real exponent value. The fractional part is extracted with a bit mask.
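To make the decomposition concrete, here’s what unpackFloat64 returns for 1.5, worked out by hand from the float64 bit layout (1.5 is 1.1 in binary, i.e. 2^0 × 1.1₂):

var exp, frac = unpackFloat64(1.5)
fmt.Println(exp)          // 0
fmt.Printf("%#x\n", frac) // 0x8000000000000: the single 1 bit of the
                          // fractional, left-aligned in the 52 bits field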

We’ll refactor writeFloat to use unpackFloat64, and use the exponent and the trailing zero count to determine what type we should use. For a 16 bits float the exponent must be between -14 and 15, and we need at least float16MinZeros = 52 - 10 = 42 trailing zeros at the end of the 52 bits fractional part so that no 1’s get dropped.

We use a switch case with the smallest type first and use float64 only if float16 and float32 don’t work:

func (e *Encoder) writeFloat(input float64) error {
    var (
        exp, frac     = unpackFloat64(input)
        trailingZeros = bits.TrailingZeros64(frac)
    )
    if trailingZeros > float64FracBits {
        trailingZeros = float64FracBits
    }
    switch {
    case (-14 <= exp) && (exp <= 15) && (trailingZeros >= float16MinZeros):
        // FIXME write float16 here
        return ErrNotImplemented
    case float64(float32(input)) == input:
        if err := e.writeHeader(majorSimpleValue, minorFloat32); err != nil {
            return err
        }
        return binary.Write(e.w, binary.BigEndian, float32(input))
    default:
        if err := e.writeHeader(majorSimpleValue, minorFloat64); err != nil {
            return err
        }
        return binary.Write(e.w, binary.BigEndian, input)
    }
}

go test still works, because we haven’t added any test to verify 16 bits floats work. Let’s add support for 16 bits floats and add a test case for it. 1.0 is an easy number to represent with 16 bits, so we start with that:

...
{Value: 1.0, Expected: []byte{0xf9, 0x3c, 0x00}},
...

To write 16 bits floats we add a new method writeFloat16 that takes all three parameters needed to build a 16 bits float: sign bit, exponent, and fractional. We turn them into a single 16 bits integer, and write the value to the output:

func (e *Encoder) writeFloat16(negative bool, exp uint16, frac uint64) error {
    if err := e.writeHeader(majorSimpleValue, minorFloat16); err != nil {
        return err
    }
    var output uint16
    if negative {
        output = 1 << 15  // set sign bit
    }
    output |= exp << float16FracBits
    output |= uint16(frac >> (float64FracBits - float16FracBits))
    return binary.Write(e.w, binary.BigEndian, output)
}
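The float16 constants used in these snippets aren’t shown in the excerpts. Based on the names in the code and the IEEE 754 half-precision format, they would be defined roughly like this:

const (
    float16ExpBits  = 5
    float16ExpBias  = 15
    float16FracBits = 10

    // trailing zeros needed in the 52 bits fractional so that nothing
    // is dropped when we keep only its top 10 bits
    float16MinZeros = float64FracBits - float16FracBits // 42
)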

Finally we hook up writeFloat16 to writeFloat with a switch case. For float16 we check the exponent’s range and that we’re not dropping any 1’s at the end of the fractional; for float32 we check that the value survives a round-trip conversion; if neither matches we fall back to float64:

func (e *Encoder) writeFloat(input float64) error {
    var (
        exp, frac     = unpackFloat64(input)
        trailingZeros = bits.TrailingZeros64(frac)
    )
    switch {
    case (-14 <= exp) && (exp <= 15) && (trailingZeros >= float16MinZeros):
        return e.writeFloat16(math.Signbit(input), uint16(exp+float16ExpBias), frac)
    case float64(float32(input)) == input:
        if err := e.writeHeader(majorSimpleValue, minorFloat32); err != nil {
            return err
        }
        return binary.Write(e.w, binary.BigEndian, float32(input))
    default:
        if err := e.writeHeader(majorSimpleValue, minorFloat64); err != nil {
            return err
        }
        return binary.Write(e.w, binary.BigEndian, input)
    }
}

Our encoder handles float16, so we’ve covered all 3 floating point number types. It looks like we’re done with floats, but there are still more cases and special numbers to take care of. In the next episode we’ll add support for more special numbers: Zero, Infinity, Not A Number, and subnormal numbers.

Check out the repository with the full code for this episode.

3 years ago I opened up about an intimate and controversial subject: the font I use with my terminal and text editor, the superb Terminus.

My old monitor was a 24” 1080p monitor with a pixel pitch of 0.2715mm: it had fat pixels. Terminus looked sharper on it than outline fonts because the monitor’s pixels were so big; I liked how Terminus popped out with its sharp edges. I tried to use outline fonts like Source Code Pro, but they didn’t look as sharp and defined under either Linux or Windows. I played with the font hinting but it didn’t improve the outline fonts’ look enough to match the good old Terminus’ sharpness. I have stuck with my bitmap font of choice for over a decade.

A year ago I got a bigger 27” monitor with a 2560 × 1440 resolution and a pixel pitch of 0.233mm. The pixels on it are 15% smaller than on my old monitor, and it’s noticeable: everything looks smaller and sharper on this screen.

I tried outline fonts again to see if the finer pixels meant I could ditch my bitmap font for a smoother outline font. I tried Source Code Pro and Deja Vu Mono. They looked better than on my old monitor: the smaller pitch helped of course, and the new screen is generally better so the fonts didn’t ‘bleed’ as much. The improvement wasn’t enough for me to stick with outline fonts on the new monitor: compared side by side, Terminus still looked sharper. I eventually switched back to Terminus; it is still the font that looks the best to me on that new screen.

When I get a monitor with a pixel pitch below 0.2mm I may give outline fonts another try, but for now I’ll stick with my favorite bitmap font as I did for the past 10 years.

From the TC-4310 cable modem manual:

The value is provided in Hertz. So, for 333 MHz, you must type: 333000000.

Go CBOR encoder: Episode 8, structs

This is a tutorial on how to write a CBOR encoder in Go, where we’ll learn more about reflection and type introspection.

Read the previous episodes; each episode builds on the previous one:


To encode structs we’ll mimic what the standard JSON encoder does and encode structs into maps of strings to values. For example if we pass the following to the JSON encoder:

struct {
    A int
    B string
}{
    A: 1,
    B: "hello",
}

It outputs:

{"a": 1, "b": "hello"}

The struct kind is different from the map kind we implemented in the previous episode: with structs the fields are ordered and the keys are always strings. Because struct keys are strings we can’t use all the examples from the spec like we did with maps; we can only use the examples with string-only keys. This leaves us with these three test cases:

{}
{"a": 1, "b": [2, 3]}
{"a": "A", "b": "B", "c": "C", "d": "D", "e": "E"}

On the flip side, because the keys are ordered, we don’t have to look for each individual pair in the output like we did with maps. We can use the function testEncoder as-is for our test. Let’s add TestStruct to cbor_test.go:

func TestStruct(t *testing.T) {
    var cases = []struct {
        Value    interface{}
        Expected []byte
    }{
        {Value: struct{}{}, Expected: []byte{0xa0}},
        {
            Value: struct {
                a int
                b []int
            }{a: 1, b: []int{2, 3}},
            Expected: []byte{
                0xa2, 0x61, 0x61, 0x01, 0x61, 0x62, 0x82, 0x02, 0x03,
            },
        },
        {
            Value: struct {
                a string
                b string
                c string
                d string
                e string
            }{"A", "B", "C", "D", "E"},
            Expected: []byte{
                0xa5, 0x61, 0x61, 0x61, 0x41, 0x61, 0x62, 0x61, 0x42, 0x61,
                0x63, 0x61, 0x43, 0x61, 0x64, 0x61, 0x44, 0x61, 0x65, 0x61,
                0x45,
            },
        },
    }

    for _, c := range cases {
        t.Run(fmt.Sprintf("%v", c.Value), func(t *testing.T) {
            testEncoder(t, c.Value, c.Expected)
        })
    }
}

To encode structs we’ll iterate over the fields of the struct with an index, using Value.NumField and Value.Field, like this:

var v = reflect.ValueOf(struct {
	AKey string
	BKey string
}{AKey: "a value", BKey: "b value"})

for i := 0; i < v.NumField(); i++ {
    fmt.Println(v.Field(i))
}

This prints:

a value
b value

We have the fields’ values; we still need their names to write the map. The fields’ names aren’t stored in the value itself, they are stored in its type. We’ll use v.Type().Field() to get a StructField with the name of this particular field. For instance if we added the following at the end of the listing above:

for i := 0; i < v.NumField(); i++ {
    fmt.Println(v.Type().Field(i).Name)
}

We’d get the names of each field printed at the end:

AKey
BKey

Let’s assemble all that into a new function writeStruct in cbor.go. writeUnicodeString writes the keys, and we encode the value recursively with the encode() method:

func (e *Encoder) writeStruct(v reflect.Value) error {
    if err := e.writeInteger(majorMap, uint64(v.NumField())); err != nil {
        return err
    }
    // Iterate over each field and write its key & value
    for i := 0; i < v.NumField(); i++ {
        if err := e.writeUnicodeString(v.Type().Field(i).Name); err != nil {
            return err
        }
        if err := e.encode(v.Field(i)); err != nil {
            return err
        }
    }
    return nil
}

We add a call to writeStruct in the main switch statement:

case reflect.Struct:
	return e.writeStruct(x)

A quick run of go test confirms everything works as intended:

$ go test -v
...
--- PASS: TestStruct (0.00s)
    --- PASS: TestStruct/{} (0.00s)
    --- PASS: TestStruct/{1_[2_3]} (0.00s)
    --- PASS: TestStruct/{A_B_C_D_E} (0.00s)
PASS
ok

Basic structs work, but we aren’t done yet. We’ll extend support for structs by mimicking the standard JSON encoder and add support for struct tagging. Here’s a summary of the options the JSON encoder supports:

// Field appears in JSON as key "myName".
Field int `json:"myName"`

// Field appears in JSON as key "myName" and
// the field is omitted from the object if its value is empty[...]
Field int `json:"myName,omitempty"`

// Field appears in JSON as key "Field" (the default), but
// the field is skipped if empty.
// Note the leading comma.
Field int `json:",omitempty"`

// Field is ignored by this package.
Field int `json:"-"`

// Field appears in JSON as key "-".
Field int `json:"-,"`

We’ll implement these features with the cbor tag instead of json, like this:

Field int `cbor:"name,omitempty"`

Let’s write a test for the features we want to verify; we’ll re-use this example from the CBOR spec:

{"a": 1, "b": [2, 3]}

In TestStructTag we call testEncoder with a tagged struct and check the output. AField & BField have the names a & b respectively, while all the other fields must be ignored:

func TestStructTag(t *testing.T) {
    testEncoder(t,
        struct {
            AField int   `cbor:"a"`
            BField []int `cbor:"b"`
            Omit1  int   `cbor:"c,omitempty"`
            Omit2  int   `cbor:",omitempty"`
            Ignore int   `cbor:"-"`
        }{AField: 1, BField: []int{2, 3}, Ignore: 12345},
        []byte{0xa2, 0x61, 0x61, 0x01, 0x61, 0x62, 0x82, 0x02, 0x03},
    )
}

If we run TestStructTag now the struct won’t be encoded correctly: every field will be in the output and the first two fields won’t have the right key.

The encoding/json package implements best-in-class tagging: we are going to steal some of its code to save time. Why write something new when we have some battle-tested code available?

We’ll copy encoding/json/tags.go into our project and we’ll add the function isEmptyValue from encoding/json/encode.go to it. We’ll replace package json with package cbor at the top to import the new code into our package. The new file tags.go looks like this:

// Source: https://golang.org/src/encoding/json/tags.go
//
// Copyright 2011 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

package cbor

import (
    "reflect"
    "strings"
)

// tagOptions is the string following a comma in a struct field's "json"
// tag, or the empty string. It does not include the leading comma.
type tagOptions string

// parseTag splits a struct field's json tag into its name and
// comma-separated options.
func parseTag(tag string) (string, tagOptions) {
    if idx := strings.Index(tag, ","); idx != -1 {
        return tag[:idx], tagOptions(tag[idx+1:])
    }
    return tag, tagOptions("")
}

// Contains reports whether a comma-separated list of options
// contains a particular substr flag. substr must be surrounded by a
// string boundary or commas.
func (o tagOptions) Contains(optionName string) bool {
    if len(o) == 0 {
        return false
    }
    s := string(o)
    for s != "" {
        var next string
        i := strings.Index(s, ",")
        if i >= 0 {
            s, next = s[:i], s[i+1:]
        }
        if s == optionName {
            return true
        }
        s = next
    }
    return false
}

// Source for isEmptyValue:
//
// https://golang.org/src/encoding/json/encode.go
func isEmptyValue(v reflect.Value) bool {
    switch v.Kind() {
    case reflect.Array, reflect.Map, reflect.Slice, reflect.String:
        return v.Len() == 0
    case reflect.Bool:
        return !v.Bool()
    case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64:
        return v.Int() == 0
    case reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64, reflect.Uintptr:
        return v.Uint() == 0
    case reflect.Float32, reflect.Float64:
        return v.Float() == 0
    case reflect.Interface, reflect.Ptr:
        return v.IsNil()
    }
    return false
}

Copying code like this may be bad for long-term maintenance: if the Golang developers fix something in the upstream code we won’t get the fix until we copy it ourselves. It’s OK to do that for this exercise because we’re here to learn, not to ship! Here’s what each function does:

  1. parseTag splits a struct field’s tag into its name and its comma-separated options
  2. tagOptions.Contains reports whether a particular option, like omitempty, is present in the tag’s options
  3. isEmptyValue reports whether a value counts as empty for the omitempty option
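A quick sketch of how the two tag helpers behave, with made-up values:

name, opts := parseTag("myName,omitempty")
fmt.Println(name)                       // myName
fmt.Println(opts.Contains("omitempty")) // true
fmt.Println(opts.Contains("string"))    // false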

Let’s refactor writeStruct to handle tagging. We need to know how many elements are in our map before we write the header. For example if we had a struct with 3 fields but one of them had the tag cbor:"-" indicating the field must be ignored, the encoded map would only have 2 key-value pairs. So instead of iterating and writing key-values on the fly, we’ll parse the fields first and build the list of fields to encode, then write the encoded map from that list.

We define a new type fieldKeyValue to hold our key-value pairs, iterate over each field in the struct, and skip the fields marked with the ignore tag. Then we write the list of fields to the output. The new writeStruct function looks like this:

func (e *Encoder) writeStruct(v reflect.Value) error {
    type fieldKeyValue struct {
        Name  string
        Value reflect.Value
    }
    var fields []fieldKeyValue
    // Iterate over each field and add its key & value to fields
    for i := 0; i < v.NumField(); i++ {
        var fType = v.Type().Field(i)
        var fValue = v.Field(i)
        var tag = fType.Tag.Get("cbor")
        if tag == "-" {
            continue
        }
        name, opts := parseTag(tag)
        // with the option omitempty skip the value if it's empty
        if opts.Contains("omitempty") && isEmptyValue(fValue) {
            continue
        }
        if name == "" {
            name = fType.Name
        }
        fields = append(fields, fieldKeyValue{Name: name, Value: fValue})
    }
    // write map from fields
    if err := e.writeInteger(majorMap, uint64(len(fields))); err != nil {
        return err
    }
    for _, kv := range fields {
        if err := e.writeUnicodeString(kv.Name); err != nil {
            return err
        }
        if err := e.encode(kv.Value); err != nil {
            return err
        }
    }
    return nil
}

As you can see we get the information about each field’s tag via fType.Tag.Get("cbor"). We skip the field if its tag is “-”, or if it has an empty value with the “omitempty” option.

go test runs and confirms that struct tagging is implemented correctly. Structs are done and our encoder is getting closer to being usable by a third party. We only have a few reflect.Kind’s left to handle: the floating point kinds, the complex kinds, and UnsafePointer.

We’ll implement floating and complex numbers, and ignore the UnsafePointer kind since we can’t reliably encode it. We’ll cover floating point numbers in the next episode.

Check out the repository with the full code for this episode.

Go CBOR encoder: Episode 7, maps

This is a tutorial on how to write a CBOR encoder in Go, where we’ll learn more about reflection and type introspection in Go.

Read the previous episodes; each episode builds on the previous one:


CBOR has an object or map type like JSON: an ordered list of key/value pairs. We’ll use it to encode two different kinds of Go types: maps and structs. We’ll implement maps first and add support for structs in the next episode.

Here’s what the spec says about maps:

Major type 5: a map of pairs of data items. Maps are also called tables, dictionaries, hashes, or objects (in JSON). A map is comprised of pairs of data items, each pair consisting of a key that is immediately followed by a value. The map’s length follows the rules for byte strings (major type 2), except that the length denotes the number of pairs, not the length in bytes that the map takes up. For example, a map that contains 9 pairs would have an initial byte of 0b101_01001 (major type of 5, additional information of 9 for the number of pairs) followed by the 18 remaining items. The first item is the first key, the second item is the first value, the third item is the second key, and so on. […]

As usual we’ll use the examples from the CBOR RFC to write the tests. Maps are challenging to test because CBOR maps are ordered while Go maps aren’t. The order in which Go map keys are returned is unspecified, according to Value.MapKeys’s documentation:

MapKeys returns a slice containing all the keys present in the map, in unspecified order.

It means testEncoder cannot verify maps with more than one key/value pair in them, because it expects a single, unique result. Consider this map:

{1: 2, 3: 4}

There are multiple valid CBOR encodings for this map, because Go maps’ items can come out in any order. With the example above the first key could be either 1 or 3.
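A quick way to see this for yourself (the order may change from one run to the next):

var m = map[int]int{1: 2, 3: 4}
for _, key := range reflect.ValueOf(m).MapKeys() {
    fmt.Println(key) // prints 1 and 3, in unspecified order
}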

We’ll have to check the different possibilities in the tests. For example from the CBOR spec we see that:

{1: 2, 3: 4}

Turns into:

0xa201020304

Here’s the breakdown of the output:

0xa2            → header for a map of 2 pairs
0x01            → first key: 1
0x02            → first value: 2
0x03            → second key: 3
0x04            → second value: 4

Because the map has two elements there’s another valid CBOR encoding for it with 3 as the first key and 1 as the second key like this:

0xa2            → header for a map of 2 pairs
0x03            → first key: 3
0x04            → first value: 4
0x01            → second key: 1
0x02            → second value: 2

Our tests will have to handle unordered keys in the output. To verify the results we will search for every individual key/value pair in the encoded map to ensure all the values are here regardless of the order.

We don’t need to worry about this for maps with fewer than two entries. We have two examples from the CBOR spec like that: the empty map {} and the single-pair map nested in the array ["a", {"b": "c"}].

We’ll use testEncoder for those two cases. Let’s add a new test function TestMap in cbor_test.go with two subtests for the easy examples:

func TestMap(t *testing.T) {
    // {}
    t.Run("{}", func(t *testing.T) {
        testEncoder(t, map[struct{}]struct{}{}, nil, []byte{0xa0})
    })
    // ["a", {"b": "c"}]
    t.Run("[\"a\", {\"b\": \"c\"]", func(t *testing.T) {
        testEncoder(
            t,
            []interface{}{"a", map[string]string{"b": "c"}},
            nil,
            []byte{0x82, 0x61, 0x61, 0xa1, 0x61, 0x62, 0x61, 0x63},
        )
    })
}

Now we’ll add what’s needed for multi-item maps: we verify the header’s major type and the map length, then search for all key-value pairs in the output. The test cases we’ll use for maps are:

{1: 2, 3: 4}
{"a": 1, "b": [2, 3]}
{"a": "A", "b": "B", "c": "C", "d": "D", "e": "E"}

To verify unordered maps the test needs the list of encoded key-value pairs. In our previous tests the test cases were stored in a structure like this:

struct {
    Value    interface{}
    Expected []byte
}

We’ll change it to hold what we need to verify the map: we turn Expected from a slice of bytes into a slice of slices of bytes. The length of Expected is the size of the map. Items in Expected are encoded key-value pairs to look up in the result:

struct {
    Value    interface{}
    Expected [][]byte
}

We add the new test cases and the code to check the result in the TestMap function:

var cases = []struct {
	Value    interface{}
	Expected [][]byte
}{
	{
		Value: map[int]int{1: 2, 3: 4},
		Expected: [][]byte{
			[]byte{0x01, 0x02}, // {1: 2}
			[]byte{0x03, 0x04}, // {3: 4}
		},
	},
	{
		Value: map[string]interface{}{"a": 1, "b": []int{2, 3}},
		Expected: [][]byte{
			[]byte{0x61, 0x61, 0x01},             // {"a": 1}
			[]byte{0x61, 0x62, 0x82, 0x02, 0x03}, // {"b": [2, 3]}
		},
	},
	{
		Value: map[string]string{
			"a": "A", "b": "B", "c": "C", "d": "D", "e": "E",
		},
		Expected: [][]byte{
			[]byte{0x61, 0x61, 0x61, 0x41}, // {"a": "A"}
			[]byte{0x61, 0x62, 0x61, 0x42}, // {"b": "B"}
			[]byte{0x61, 0x63, 0x61, 0x43}, // {"c": "C"}
			[]byte{0x61, 0x64, 0x61, 0x44}, // {"d": "D"}
			[]byte{0x61, 0x65, 0x61, 0x45}, // {"e": "E"}
		},
	},
}

For each case the test extracts the major type and the length of the map from the header using a bit mask, then verifies their values. Our test cases have at most 23 elements in them, so the header is only one byte. If we had a case with more than 23 elements we would have to change the code accordingly.

Let’s add the loop to iterate over the cases and verify what’s in the header:

for _, c := range cases {
    t.Run(fmt.Sprintf("%v", c.Value), func(t *testing.T) {
        var buffer bytes.Buffer

        if err := NewEncoder(&buffer).Encode(c.Value); err != nil {
            t.Fatalf("err: %#v != nil with %#v", err, c.Value)
        }

        var (
            header     = buffer.Bytes()[0]
            result     = buffer.Bytes()[1:]
            lengthMask = ^uint8(0) >> 3 // bit mask to extract the length
            length     = header & lengthMask
        )
        if header>>5 != majorMap {
            t.Fatalf("invalid major type: %#v", header)
        }

        if int(length) != len(c.Expected) {
            t.Fatalf("invalid length: %#v != %#v", length, len(c.Expected))
        }
    })
}

We haven’t verified the map’s content yet, so let’s add it: we search for each pair in the encoder’s output, then remove it from the output. Once we’re done verifying all the key-values, we check that the slice is empty to ensure there’s nothing left over in the output. We add that code at the end of the loop:

for _, c := range cases {
    t.Run(fmt.Sprintf("%v", c.Value), func(t *testing.T) {
        ...

        // Iterate over the key/values we expect in the map
        for _, kv := range c.Expected {
            if !bytes.Contains(result, kv) {
                t.Fatalf("key/value %#v not found in result", kv)
            }
            // remove the value from the result
            result = bytes.Replace(result, kv, []byte{}, 1)
        }

        // ensure we got everything in the map
        if len(result) > 0 {
            t.Fatalf("leftover in result: %#v", result)
        }
    })
}

Tests are done, now let’s get them to pass. To encode the map we write its size in the header, then recursively encode each key followed by its value. Then we add a case clause matching reflect.Map in encode()’s switch statement and call writeMap from it:

const majorMap = 5
...

func (e *Encoder) writeMap(v reflect.Value) error {
    if err := e.writeInteger(majorMap, uint64(v.Len())); err != nil {
        return err
    }

    for _, key := range v.MapKeys() {
        if err := e.encode(key); err != nil {
            return err
        }
        if err := e.encode(v.MapIndex(key)); err != nil {
            return err
        }
    }
    return nil
}

func (e *Encoder) encode(x reflect.Value) error {
    switch x.Kind() {
    ...
    case reflect.Map:
        return e.writeMap(x)
    }
}

As you can see we didn’t have to add much code to encode maps; the real challenge was the tests. Implementation was easy this time, but it won’t be next time: we’ll work with structs in the next episode and it’ll be a big one.

Check out the repository with the full code for this episode.

Go CBOR encoder: Episode 6, negative integers and arrays

This is a tutorial on how to write a CBOR encoder in Go. Its goal is to teach reflection and type introspection. I recommend you read the previous episodes before jumping into this one:


Our CBOR encoder only accepts unsigned integers at the moment; to support all integer types we have to handle negative numbers. Negative number encoding is similar to positive numbers’ with a different major type. The spec says:

Major type 1: a negative integer. The encoding follows the rules for unsigned integers (major type 0), except that the value is then -1 minus the encoded unsigned integer. For example, the integer -500 would be 0b001_11001 (major type 1, additional information 25) followed by the two bytes 0x01f3, which is 499 in decimal.

As usual the tests come first; we re-use the examples from the CBOR specification to add TestNegativeIntegers in cbor_test.go:

import "math"  // for math.MinInt64
...
func TestNegativeIntegers(t *testing.T) {
    var cases = []struct {
        Value    int64
        Expected []byte
    }{
        {Value: -1, Expected: []byte{0x20}},
        {Value: -10, Expected: []byte{0x29}},
        {Value: -100, Expected: []byte{0x38, 0x63}},
        {Value: -1000, Expected: []byte{0x39, 0x03, 0xe7}},
        {
            Value: math.MinInt64,
            Expected: []byte{
                0x3b, 0x7f, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
            },
        },
    }

    for _, c := range cases {
        t.Run(fmt.Sprintf("%d", c.Value), func(t *testing.T) {
            testEncoder(t, c.Value, nil, c.Expected)
        })
    }
}

For the encoder to recognize all integer types we add a new case clause in Encode()’s switch statement with the additional integer kinds like reflect.Int. It checks the sign of the integer: if the integer is positive we write it as a positive number; if it’s negative we turn it into an unsigned integer using the formula -(x+1), and we write that number to the output:

const majorNegativeInteger = 1

...

func (e *Encoder) encode(x reflect.Value) error {
    ...
    case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64:
        var i = x.Int()
        if i < 0 {
            return e.writeInteger(majorNegativeInteger, uint64(-(i + 1)))
        } else {
            return e.writeInteger(majorPositiveInteger, uint64(i))
        }
    ...
}
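A quick sanity check with the -100 case from the test above: -(-100 + 1) is 99, or 0x63 in hexadecimal, so the encoder writes the header 0x38 (major type 1, additional value 24 for an 8 bits integer) followed by 0x63, matching the expected bytes.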

8 lines of code was all we needed to support all integer types. That was easy; now we move on to something harder: arrays.

Arrays are the first composite type we add to the encoder. An array is a list of objects; it can contain any type of object, like a JSON array:

[null, true, 1, "hello"]

Arrays have their own major type according to the spec:

Major type 4: an array of data items. Arrays are also called lists, sequences, or tuples. The array’s length follows the rules for byte strings (major type 2), except that the length denotes the number of data items, not the length in bytes that the array takes up. Items in an array do not need to all be of the same type. For example, an array that contains 10 items of any type would have an initial byte of 0b100_01010 (major type of 4, additional information of 10 for the length) followed by the 10 remaining items.

Because arrays can contain any type we’ll have to recursively encode objects, like we did in episode 4 with pointers.

Before we get started we’ll refactor how we recursively encode objects. Our encoder works with reflect.Value but the Encode() method takes an interface{} not a reflect.Value. When we call Encode() recursively we convert the reflect.Value into an interface which is then converted back into a reflect.Value. Those conversions aren’t efficient, so we’ll move all the code in the Encode() method into a new method called encode() —all lowercase— that takes a reflect.Value as parameter. Encode() is now just a call to this new method:

func (e *Encoder) Encode(v interface{}) error {
    return e.encode(reflect.ValueOf(v))
}

func (e *Encoder) encode(x reflect.Value) error {
    switch x.Kind() {
    ...
    case reflect.Ptr:
        if x.IsNil() {
            return e.writeHeader(majorSimpleValue, simpleValueNil)
        } else {
            // this replaces e.Encode(reflect.Indirect(x).Interface())
            return e.encode(reflect.Indirect(x))
        }
    ...
    }
    return ErrNotImplemented
}

With this small refactoring done, let’s add tests based on the examples from the CBOR specification to our test suite. We have four cases:

[]
[1, 2, 3]
[1, [2, 3], [4, 5]]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14,
 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25]

We add TestArray to cbor_test.go that runs a subtest for each of the cases above:

func TestArray(t *testing.T) {
    var cases = []struct {
        Value    []interface{}
        Expected []byte
    }{
        {Value: []interface{}{}, Expected: []byte{0x80}},
        {Value: []interface{}{1, 2, 3}, Expected: []byte{0x83, 0x1, 0x2, 0x3}},
        {
            Value:    []interface{}{1, []interface{}{2, 3}, []interface{}{4, 5}},
            Expected: []byte{0x83, 0x01, 0x82, 0x02, 0x03, 0x82, 0x04, 0x05},
        },
        {
            Value: []interface{}{
                1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
                19, 20, 21, 22, 23, 24, 25,
            },
            Expected: []byte{
                0x98, 0x19, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08,
                0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f, 0x10, 0x11, 0x12,
                0x13, 0x14, 0x15, 0x16, 0x17, 0x18, 0x18, 0x18, 0x19,
            },
        },
    }

    for _, c := range cases {
        t.Run(fmt.Sprintf("%v", c.Value), func(t *testing.T) {
            testEncoder(t, c.Value, nil, c.Expected)
        })
    }
}

To get the tests to pass we have to match all array and slice types, except byte arrays and byte slices. We already matched arrays and slices in the previous episode when we implemented byte strings.

When we have an array-like object to encode, we pass it to a new method writeArray. It writes the header with the length of the array, then iterates over the array’s elements and encodes them recursively. To iterate over the array all we need are the methods reflect.Value.Len and reflect.Value.Index: we write a simple for loop and retrieve each item with v.Index(i):

majorArray           = 4
...
func (e *Encoder) writeArray(v reflect.Value) error {
    if err := e.writeInteger(majorArray, uint64(v.Len())); err != nil {
        return err
    }
    for i := 0; i < v.Len(); i++ {
        if err := e.encode(v.Index(i)); err != nil {
            return err
        }
    }
    return nil
}

Let’s hook up writeArray to the main switch statement in encode(). We want to match arrays and slices not made of bytes. To achieve this we just need to add a call to writeArray after the if statement that checks for a byte string in the reflect.Slice case clause. We literally add a single line to cbor.go:

func (e *Encoder) encode(x reflect.Value) error {
    switch x.Kind() {
    ...
    case reflect.Array:
        // Create slice from array
        var n = reflect.New(x.Type())
        n.Elem().Set(x)
        x = reflect.Indirect(n).Slice(0, x.Len())
        fallthrough
    case reflect.Slice:
        if x.Type().Elem().Kind() == reflect.Uint8 {
            return e.writeByteString(x.Bytes())
        }
        // We don’t have a byte string, therefore we have an array
        return e.writeArray(x)
    ...
    }
    return ErrNotImplemented
}

TestArray successfully runs; we are done with arrays. Check out the repository with the full code for this episode.

With the addition of arrays our encoder can now encode complex data structures. We’re about to make it even better and dive deeper into Go reflection with the next major type: maps. See you in the next episode.

Go CBOR encoder: Episode 5, strings: bytes & unicode characters


CBOR strings are more complex than the types we already implemented; they come in two flavors: byte strings and Unicode strings. Byte strings are meant to encode binary content like images, while Unicode strings are for human-readable text.

We’ll start with byte strings; here’s what the spec says:

Major type 2: a byte string. The string’s length in bytes is represented following the rules for positive integers (major type 0). For example, a byte string whose length is 5 would have an initial byte of 0b010_00101 (major type 2, additional information 5 for the length), followed by 5 bytes of binary content. A byte string whose length is 500 would have 3 initial bytes of 0b010_11001 (major type 2, additional information 25 to indicate a two-byte length) followed by the two bytes 0x01f4 for a length of 500, followed by 500 bytes of binary content.

If we encoded the 5 bytes “hello” as a CBOR byte string we’d have something like this:

0x45    // header for byte string of size 5: (2 << 5) | 5 → 0x45
0x68 0x65 0x6C 0x6C 0x6f   // The 5 byte string hello

To encode byte strings we’ll write a regular CBOR integer with major type 2, then write the byte string itself right after. The header carries the type and the size of the string as a positive integer (we implemented this in episode 3), and the data follows, sized by the integer in the header. Before we can write this header we change the function writeInteger from episode 3 to add a parameter for the major type, so it’s now configurable by the caller, and we modify the call to writeInteger() in Encode() to match the new signature:

func (e *Encoder) writeInteger(major byte, i uint64) error {
    switch {
    case i <= 23:
        return e.writeHeader(major, byte(i))
    case i <= 0xff:
        return e.writeHeaderInteger(major, minorPositiveInt8, uint8(i))
    case i <= 0xffff:
        return e.writeHeaderInteger(major, minorPositiveInt16, uint16(i))
    case i <= 0xffffffff:
        return e.writeHeaderInteger(major, minorPositiveInt32, uint32(i))
    default:
        return e.writeHeaderInteger(major, minorPositiveInt64, uint64(i))
    }
}

...
case reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64:
    // we pass major type to writeInteger
    return e.writeInteger(majorPositiveInteger, x.Uint())

This change will be useful later when we implement more complex types: it’s common to write an integer with the header for variable sized CBOR types.
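For example writeInteger(majorByteString, 500) now writes the header 0b010_11001 (0x59) followed by the two bytes 0x01 0xf4: exactly the 500-byte length header from the spec quote above.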

Now we have to figure out how to match byte string types with reflect. Go has two distinct types that match CBOR byte strings: byte slices and byte arrays. If you don’t know the difference between a slice and an array, I recommend the splendid article from the Golang blog: Go Slices: usage and internals. We’ll focus on slices first, then arrays.

We start by adding tests based on the examples from the CBOR spec:

func TestByteString(t *testing.T) {
    var cases = []struct {
        Value    []byte
        Expected []byte
    }{
        {Value: []byte{}, Expected: []byte{0x40}},
        {Value: []byte{1, 2, 3, 4}, Expected: []byte{0x44, 0x01, 0x02, 0x03, 0x04}},
        {
            Value:    []byte("hello"),
            Expected: []byte{0x45, 0x68, 0x65, 0x6c, 0x6c, 0x6f},
        },
    }

    for _, c := range cases {
        t.Run(fmt.Sprintf("%v", c.Value), func(t *testing.T) {
            testEncoder(t, c.Value, nil, c.Expected)
        })
    }
}

Slices have their own reflect kind: reflect.Slice. We only handle slices of bytes, so we’ll have to check the slice elements’ type like this:

var exampleSlice = reflect.ValueOf([]byte{1, 2, 3})

if exampleSlice.Type().Elem().Kind() == reflect.Uint8 {
    fmt.Println("Slice of bytes")
}

We use reflect.Uint8 in the if clause, because the byte type is an alias for uint8 in Go.

We add another case clause in Encode’s switch statement for slices and we check the slice’s elements’ type like this:

case reflect.Slice:
    if x.Type().Elem().Kind() == reflect.Uint8 {
        // byte string
    }

Now all we have left to do is write the header and the byte string to the output. We’ll add a writeByteString method to tuck all the boilerplate code away from our main switch statement:

// we add the major type for byte string
majorByteString      = 2

...

func (e *Encoder) writeByteString(s []byte) error {
    if err := e.writeInteger(majorByteString, uint64(len(s))); err != nil {
        return err
    }
    _, err := e.w.Write(s)
    return err
}

... In Encode() ...

case reflect.Slice:
    if x.Type().Elem().Kind() == reflect.Uint8 {
        return e.writeByteString(x.Bytes())
    }

A quick run of go test confirms byte slices work, but we’re not done with byte strings yet: we still have to handle arrays. It’s easier to work with slices in general, so we’ll convert arrays to slices to avoid writing array-specific code and re-use what we just wrote. We add the following code to our existing test TestByteString:

// for arrays
t.Run("array", func(t *testing.T) {
    a := [...]byte{1, 2}
    testEncoder(t, a, nil, []byte{0x42, 1, 2})
})

Let’s add another case clause right before the case clause matching reflect.Slice:

case reflect.Array:
	// turn x into a slice
    x = x.Slice(0, x.Len())
	fallthrough
case reflect.Slice:
    ...

We create a slice from our backing array with Value.Slice(), then we run the tests and we get a surprise:

$ go test -v .
...
=== RUN   TestByteString/array
panic: reflect.Value.Slice: slice of unaddressable array [recovered]
    panic: reflect.Value.Slice: slice of unaddressable array
...

It turns out we have an “unaddressable” array, and we cannot create a slice of it with Value.Slice() according to the doc. How are we going to get out of this? reflect doesn’t let us reference the array directly; we need to turn the array into something addressable: a pointer to the array. We create a new pointer with reflect.New, copy the array into it, and then use reflect.Indirect on the pointer to create our slice:

case reflect.Array:
    // Create slice from array
    var n = reflect.New(x.Type())
    n.Elem().Set(x)
    x = reflect.Indirect(n).Slice(0, x.Len())
    fallthrough
case reflect.Slice:
    ...

A quick run of go test confirms this solved our issue with the unaddressable array. All TestByteString tests now pass! We’re done with byte strings; Unicode strings are next.

Text strings are like byte strings with a different major type. We have the header with the length of the string in bytes, and the data at the end. Text data is encoded in UTF-8 —Go’s native string encoding— so there’s no need to re-encode it: we can just write the string to the output as it is. Like we did for byte strings we add examples from the CBOR spec in a new test called TestUnicodeString:

func TestUnicodeString(t *testing.T) {
    var cases = []struct {
        Value    string
        Expected []byte
    }{
        {Value: "", Expected: []byte{0x60}},
        {Value: "IETF", Expected: []byte{0x64, 0x49, 0x45, 0x54, 0x46}},
        {Value: "\"\\", Expected: []byte{0x62, 0x22, 0x5c}},
        {Value: "\u00fc", Expected: []byte{0x62, 0xc3, 0xbc}},
        {Value: "\u6c34", Expected: []byte{0x63, 0xe6, 0xb0, 0xb4}},
    }

    for _, c := range cases {
        t.Run(fmt.Sprintf("%s", c.Value), func(t *testing.T) {
            testEncoder(t, c.Value, nil, c.Expected)
        })
    }
}

We add a case clause for the kind reflect.String, then we write the header with the size of our string, and finally we write the string to the output. Note that len(s) is the string’s length in bytes, not in characters, which is exactly what the CBOR header expects:

majorUnicodeString   = 3
...
func (e *Encoder) writeUnicodeString(s string) error {
    if err := e.writeInteger(majorUnicodeString, uint64(len(s))); err != nil {
        return err
    }
    _, err := io.WriteString(e.w, s)
    return err
}
...
case reflect.String:
    return e.writeUnicodeString(x.String())

And we are done with CBOR strings. Check out the code for this episode.

In the next episode we’ll implement signed integers, and our first composite type: arrays.

Go CBOR encoder: Episode 4, reflect and pointers


In the previous episode we encoded positive integers and learned how to write a CBOR item with a variable size. Our CBOR encoder can now encode nil, true, false, and unsigned integers. cbor.Encoder has grown strong, but type switches have their limits; we need more powerful weapons for the battles ahead: we’re about to take on pointers, and reflect will be our sword.

In the first episode of the series we encoded the nil value since it was the easiest value to start with, but we aren’t finished with nil: we still have work to do to cover all cases. That’s because our encoder only handles the “naked” nil value, but not typed pointers that are nil. Whaaat? There are two kinds of nil pointers? Yep, that’s because nil by itself is special. Consider the following code:

var p *int = nil
var v interface{} = p
switch v.(type) {
case nil:
    fmt.Println("nil")
case *int:
    fmt.Println("int pointer")
}

The example above prints “int pointer”, because v isn’t a regular value but an interface that points to an int pointer value. Go interfaces are pairs of 32 or 64 bits addresses: one for the type and one for the value. So in the type switch above we match the *int case because p’s type is *int. If we replaced the v definition with var v interface{} = nil, the program would print “nil”. That’s because the type of a nil value is itself nil, but a typed pointer’s type isn’t. Russ Cox’s article Go Data Structures: Interfaces is a superb introduction to how Go interfaces work if you’d like to learn more.

Let’s exhibit the problem in our code and add a test for typed nil pointers:

func TestNilTyped(t *testing.T) {
    var i *int = nil
    testEncoder(t, i, nil, []byte{0xf6})

    var v interface{} = nil
    testEncoder(t, v, nil, []byte{0xf6})
}

And run our tests with go test to see what happens:

--- FAIL: TestNilTyped (0.00s)
    cbor_test.go:18: err: &errors.errorString{s:"Not Implemented"} != <nil> with (*int)(nil)

The *int(nil) value isn’t recognized. So why did plain nil work? Because it’s special: both its type and its value are nil. The Encode function matches the naked nil with the case nil statement in the type switch, which means only interfaces with a nil type will be matched. Therefore the code only works with the naked nil value, not with typed pointers.

It turns out there’s a package to address that: reflect introspects the type system and lets us match pointer types individually. The Laws of Reflection is a great introduction to reflection and the use of this package.

So we want to know if a value is a pointer. How does reflect help us? Consider this snippet:

fmt.Println(reflect.ValueOf(nil).Kind())
var i *int = nil
fmt.Println(reflect.ValueOf(i).Kind())

It prints:

invalid
ptr

What happens here? First we convert each Go value to a reflect.Value, then we query its type with the method Kind that returns a reflect.Kind enumeration. reflect.Kind represents the specific kind of type that a Type represents. Kinds are families of types. For example there is a kind for structs —reflect.Struct—, for functions —reflect.Func—, and for pointers —reflect.Ptr.

We see above that the naked nil value and a nil pointer to integer have different kinds: invalid, and ptr. We’ll have to handle the two cases separately.

Refactoring time! We replace the type switch with a switch statement on the Kind of our value. In the example below x.Kind() allows us to distinguish types the same way the type switch x.(type) did:

func (e *Encoder) Encode(v interface{}) error {
    x := reflect.ValueOf(v)
    switch x.Kind() {
    case reflect.Invalid:
        // naked nil value == invalid type
        return e.writeHeader(majorSimpleValue, simpleValueNil)
    case reflect.Bool:
        var minor byte
        if x.Bool() {
            minor = simpleValueTrue
        } else {
            minor = simpleValueFalse
        }
        return e.writeHeader(majorSimpleValue, minor)
    case reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64:
        return e.writeInteger(x.Uint())
    }
    return ErrNotImplemented
}

To identify pointer types reflect has a Kind called reflect.Ptr. We add another case clause for reflect.Ptr; if the pointer is nil we write the encoded nil value to the output:

case reflect.Ptr:
	if x.IsNil() {
		return e.writeHeader(majorSimpleValue, simpleValueNil)
	}

After we add that, a quick run of go test confirms that TestNilTyped works.

Splendid! We solved nil pointers. How about non-nil pointers? They are relatively easy to handle: if we detect a pointer we can fetch the value it refers to via reflect.Indirect. So when we get a pointer we get the value it references instead of the memory address. Here’s an example of how reflect.Indirect works:

var i = 1
var p = &i
var reflectValue = reflect.ValueOf(p)
fmt.Println(reflectValue.Kind())
fmt.Println(reflect.Indirect(reflectValue).Kind())

It prints:

ptr
int

When we find a non-nil pointer type, we call the Indirect function to retrieve the pointed-to value and we recursively call the Encode method on that value. We add a new test, TestPointer, that verifies pointer dereferencing works as intended:

func TestPointer(t *testing.T) {
    i := uint(10)
    pi := &i  // pi is a *uint

    // should output the number 10
    testEncoder(t, pi, nil, []byte{0x0a})
}

With our test written let’s add the code necessary to handle valid pointers in our case clause:

case reflect.Ptr:
	if x.IsNil() {
		return e.writeHeader(majorSimpleValue, simpleValueNil)
	} else {
		return e.Encode(reflect.Indirect(x).Interface())
	}

reflect.Indirect(x).Interface() retrieves an interface to x’s underlying value; we pass it recursively to Encode and return the result. So if we passed a pointer to pointer to pointer to integer (***int) we’d have 3 recursive calls to Encode. TestPointer now passes, we are done with pointers!

There’s a repository with the code for this episode.

The reflect package will help us handle more complex types in subsequent episodes. Next time we will encode string types: byte strings and Unicode strings.

kgpdemux is a TCP demultiplexer that uses the KGP protocol. I wrote it about a year ago as an experiment to use with Sauce Connect Proxy. It’s my best example of how to use channels to implement complex control flows in Go efficiently. Sources are on Bitbucket: https://bitbucket.org/henry/kgp/src

The juicy part is kgp.go, where most of the concurrency is implemented.

That is all.

Go CBOR encoder: Episode 3, positive integers

In the previous episode we wrote a CBOR encoder that can handle the values nil, true, and false. Next we’ll focus on positive integers.

To proceed we have to learn more about how values are encoded. A CBOR object’s type is determined by the first 3 bits of its first byte. That first byte is called the header: it describes the data type and tells the decoder how to decode what follows. Sometimes the 5 leftover bits of the header carry data about the value itself, but most of the time they hold extra information about the type.

For example: the encoded nil value is a single byte with the value 246, in binary that’s 0b11110110. The first 3 bits are all 1’s, that’s 7 in decimal: the nil value’s major type is 7, which corresponds to the “simple values” major type. The last 5 bits are 0b10110, or 22 in decimal: that’s the additional value, and it identifies which simple value we have, in our case nil. To summarize: the nil value’s major type is 7, and the additional value 22 identifies it as nil. Here’s how you’d reconstruct the header for nil from the major type and the additional value:

byte(majorType << 5) | additionalValue
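
Going the other way, we can take 0xf6 apart again with a shift and a mask (a quick sketch, assuming fmt is imported):

h := byte(0xf6)       // the encoded nil value
fmt.Println(h >> 5)   // 7: the major type
fmt.Println(h & 0x1f) // 22: the additional value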

The booleans true and false have the same major type as nil, 7, and their additional values are 20 and 21 respectively. We’d build booleans from their major type and additional value like this:

fmt.Printf("%x\n", byte(7 << 5) | 20)   // prints f4
fmt.Printf("%x\n", byte(7 << 5) | 21)   // prints f5

Positive integers have their own major type: 0. The 5 additional bits of the header can only hold values from 0 to 31, which isn’t enough for most integers, therefore integer encoding is more complex than booleans and nil. The first 24 additional values are reserved for the integers 0 to 23; for integers bigger than 23 we have to write extra bytes to the output. To indicate how much data is needed to decode the integer we have the special additional values 24, 25, 26, and 27; they correspond to 8-, 16-, 32-, and 64-bit integers respectively.

For example to encode 500 we need at least a 2-byte integer, because 500 is too large to fit in a single byte. So the first byte would be major type 0 with additional value 25 to tell the decoder: “hey, what follows is a two-byte positive integer”. The header would look like this: 0b000_11001, followed by the two bytes 0x01 0xf4, that’s 500 encoded as a 16-bit big-endian integer.
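
We can check that byte sequence by hand with the encoding/binary package (a fragment; assumes bytes, encoding/binary, and fmt are imported):

var buf bytes.Buffer
buf.WriteByte(byte(0<<5) | 25)                    // header: major type 0, additional value 25
binary.Write(&buf, binary.BigEndian, uint16(500)) // the value, big-endian
fmt.Printf("% x\n", buf.Bytes())                  // prints: 19 01 f4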

Let’s start with the easy case: integers from 0 to 23. We add a method called writeHeader to cbor.go that writes the single-byte header to the output. To avoid using magic numbers all over our code we’ll also define constants for the types we can encode thus far. We add the following to cbor.go:

const (
    // major types
    majorPositiveInteger = 0
    majorSimpleValue     = 7

    // simple values == major type 7
    simpleValueFalse = 20
    simpleValueTrue  = 21
    simpleValueNil   = 22
)

func (e *Encoder) writeHeader(major, minor byte) error {
    h := byte((major << 5) | minor)
    _, err := e.w.Write([]byte{h})
    return err
}

We use writeHeader to clear the magic numbers we put in the Encode method from the previous episodes. Our Encode method looks tighter now:

func (e *Encoder) Encode(v interface{}) error {
    switch v.(type) {
    case nil:
        return e.writeHeader(majorSimpleValue, simpleValueNil)
    case bool:
        var minor byte
        if v.(bool) {
            minor = simpleValueTrue
        } else {
            minor = simpleValueFalse
        }
        return e.writeHeader(majorSimpleValue, minor)
    }
    return ErrNotImplemented
}

Our mini-refactoring is done; a quick run of go test confirms everything still works. Now that we’ve cleaned that up we add tests for the small integers in cbor_test.go:

func TestIntSmall(t *testing.T) {
    for i := 0; i <= 23; i++ {
        testEncoder(t, uint64(i), nil, []byte{byte(i)})
    }
}

We loop from 0 to 23, build our expected return value, and check that it corresponds to what the encoder gives us: in this case a single byte with the major type 0 and our value i.

Some of you may have noticed that we turn our value i into a uint64 when we pass it to testEncoder instead of a plain int. That’s because Go has many integer types, like uint64, int16, and plain int; they are all distinct for the Go type system, and each of them requires extra code to support. We will handle the other integer types later; for now we’ll stick to uint64.

Small integers are easy to implement: in Encode’s switch statement we add a case uint64: clause, and if the integer is between 0 and 23 we output the header with the right additional value, and that’s all:

case uint64:
	var i = v.(uint64)
	if i <= 23 {
		return e.writeHeader(majorPositiveInteger, byte(i))
	}

A quick run with go test confirms TestIntSmall works. Time to work on the extended integers: as usual we’ll write the tests first. To get good coverage, we’re going to copy the examples given in the appendix of the CBOR spec for our tests.

We’ll use subtests to make it easier to track which test fails; subtests let you define multiple sub-tests with different names inside a single test function. Our subtests’ names will be the numbers we’re checking; for example to test the integer 10 we’d do something like this:

func TestExample(t *testing.T) {
    t.Run(
        "10",                 // name of the subtest
        func(t *testing.T) {  // function to execute
            testEncoder(t, uint64(10), nil, []byte{0x0a})
        },
    )
}

When we run go test with this example we’ll have a test named “TestExample/10”; we could add another call to t.Run() with the string “foo” as its name to create a second subtest named “TestExample/foo”.
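
For instance, adding this inside TestExample would create that second subtest (a throwaway sketch):

    t.Run("foo", func(t *testing.T) {
        t.Log("this runs as TestExample/foo")
    })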

Let’s replace this example with real tests. We’ll use a table to store our test cases, iterate over it, and verify each result. Our test values and expected outputs are taken from the CBOR spec examples:

func TestIntBig(t *testing.T) {
    var cases = []struct {
        Value    uint64
        Expected []byte
    }{
        {Value: 0, Expected: []byte{0x00}},
        {Value: 1, Expected: []byte{0x01}},
        {Value: 10, Expected: []byte{0x0a}},
        {Value: 23, Expected: []byte{0x17}},
        {Value: 24, Expected: []byte{0x18, 0x18}},
        {Value: 25, Expected: []byte{0x18, 0x19}},
        {Value: 100, Expected: []byte{0x18, 0x64}},
        {Value: 1000, Expected: []byte{0x19, 0x03, 0xe8}},
        {Value: 1000000, Expected: []byte{0x1a, 0x00, 0x0f, 0x42, 0x40}},
        {
            Value: 1000000000000,
            Expected: []byte{
                0x1b, 0x00, 0x00, 0x00, 0xe8, 0xd4, 0xa5, 0x10, 0x00,
            },
        },
        {
            Value: 18446744073709551615,
            Expected: []byte{
                0x1b, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
            },
        },
    }

    for _, c := range cases {
        t.Run(fmt.Sprintf("%d", c.Value), func(t *testing.T) {
            testEncoder(t, uint64(c.Value), nil, c.Expected)
        })
    }
}

If we run the tests as they are now, the ones with numbers less than 24 will pass, but all the bigger numbers will fail with a not implemented error:

--- PASS: TestIntBig/0 (0.00s)
--- PASS: TestIntBig/1 (0.00s)
--- PASS: TestIntBig/10 (0.00s)
--- PASS: TestIntBig/23 (0.00s)
--- FAIL: TestIntBig/24 (0.00s)
	cbor_test.go:18: err: &errors.errorString{s:"Not Implemented"} != <nil> with 0x18
--- FAIL: TestIntBig/25 (0.00s)
	cbor_test.go:18: err: &errors.errorString{s:"Not Implemented"} != <nil> with 0x19
--- FAIL: TestIntBig/100 (0.00s)
	cbor_test.go:18: err: &errors.errorString{s:"Not Implemented"} != <nil> with 0x64
...

Big CBOR integers have 2 parts: a header that determines the type, followed by the value encoded as a big-endian integer. For example 25 is encoded as 0x1819, that’s 2 bytes: the header is 0x18, or 24 in decimal, which corresponds to an 8-bit integer type. The byte after the header is 0x19, or 25 in decimal: the integer we encoded. To re-iterate: the header gives us the type of the value, and the bytes following the header are the value being encoded.

The first thing we’ll do is add a helper function to write our native integers as big-endian integers. It takes an interface{} as parameter instead of an integer because the encoding/binary package uses the type of the value it writes to determine how much data to write. For example passing the value 1 typed as a uint16 to binary.Write will output 2 bytes: 0x0001. This lets us cast our integer to the right type to encode a correctly sized integer with binary.Write:

// writeHeaderInteger writes out a header created from major and minor magic
// numbers, then writes the value v as a big-endian value
func (e *Encoder) writeHeaderInteger(major, minor byte, v interface{}) error {
    if err := e.writeHeader(major, minor); err != nil {
        return err
    }
    return binary.Write(e.w, binary.BigEndian, v)
}
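
To see the sizing behavior concretely, here’s a quick sketch (again assuming bytes, encoding/binary, and fmt are imported):

var buf bytes.Buffer
binary.Write(&buf, binary.BigEndian, uint16(1))
fmt.Printf("% x\n", buf.Bytes()) // 00 01
buf.Reset()
binary.Write(&buf, binary.BigEndian, uint64(1))
fmt.Printf("% x\n", buf.Bytes()) // 00 00 00 00 00 00 00 01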

We don’t want the big switch statement in the Encode method to become messy as we’re adding more code, so we create a new method for our encoder: writeInteger where we’ll put all the code to encode integers.
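
writeInteger below uses named constants for the special additional values 24 to 27 we saw earlier. They haven’t appeared in the snippets so far, so here’s a plausible definition to add to cbor.go’s const block (the values come straight from the spec; the minorPositiveInt* names are assumed from the calls below):

const (
    // additional values for extended positive integers
    minorPositiveInt8  = 24
    minorPositiveInt16 = 25
    minorPositiveInt32 = 26
    minorPositiveInt64 = 27
)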

The writeInteger method casts our single integer value to the smallest integer type that can hold it, then encodes it:

func (e *Encoder) writeInteger(i uint64) error {
    switch {
    case i <= 23:
        return e.writeHeader(majorPositiveInteger, byte(i))
    case i <= 0xff:
        return e.writeHeaderInteger(
            majorPositiveInteger, minorPositiveInt8, uint8(i),
        )
    case i <= 0xffff:
        return e.writeHeaderInteger(
            majorPositiveInteger, minorPositiveInt16, uint16(i),
        )
    case i <= 0xffffffff:
        return e.writeHeaderInteger(
            majorPositiveInteger, minorPositiveInt32, uint32(i),
        )
    default:
        return e.writeHeaderInteger(
            majorPositiveInteger, minorPositiveInt64, uint64(i),
        )
    }
}

As you can see we cast the value i into different integer types depending on how big it is, to minimize the size of what we write to the output. The fewer bytes we use the better.

Encode now looks like this:

func (e *Encoder) Encode(v interface{}) error {
    switch v.(type) {
    case nil:
        return e.writeHeader(majorSimpleValue, simpleValueNil)
    case bool:
        var minor byte
        if v.(bool) {
            minor = simpleValueTrue
        } else {
            minor = simpleValueFalse
        }
        return e.writeHeader(majorSimpleValue, minor)
    case uint64:
        return e.writeInteger(v.(uint64))
    }
    return ErrNotImplemented
}

Once we add this little bit of code our integer tests will pass:

--- PASS: TestIntBig (0.00s)
    --- PASS: TestIntBig/0 (0.00s)
    --- PASS: TestIntBig/1 (0.00s)
    --- PASS: TestIntBig/10 (0.00s)
    --- PASS: TestIntBig/23 (0.00s)
    --- PASS: TestIntBig/24 (0.00s)
    --- PASS: TestIntBig/25 (0.00s)
    --- PASS: TestIntBig/100 (0.00s)
    --- PASS: TestIntBig/1000 (0.00s)
    --- PASS: TestIntBig/1000000 (0.00s)
    --- PASS: TestIntBig/1000000000000 (0.00s)
    --- PASS: TestIntBig/18446744073709551615 (0.00s)

Let’s add the integer types we ignored thus far to be more exhaustive with what our encoder supports. In a case clause that lists several types, v stays an interface{}, so we can’t assert it straight to uint64 like before; one way to handle this is to extend the case uint64: clause into two clauses and convert each family of types with the reflect package (which we’ll see much more of in the next episode):

case uint, uint8, uint16, uint32, uint64:
	return e.writeInteger(reflect.ValueOf(v).Uint())
case int, int8, int16, int32, int64:
	if i := reflect.ValueOf(v).Int(); i >= 0 {
		return e.writeInteger(uint64(i))
	}

Now we can pass a positive int, int8, int16, int32, or int64 and it will work. We can’t handle negative numbers yet.

That’s all for now. There’s a repository with the code for this episode. In the next episode we’ll introduce the reflect package to take care of pointers.

Go CBOR encoder: Episode 2, booleans

In the previous episode, we learned how to encode the nil value. Now we’ll do booleans. According to the CBOR specification, booleans are represented by a single byte: 0xf4 for false, and 0xf5 for true.

We’ll write the tests first, but before we do that let’s write a helper function for our encoder tests: we want to avoid copy-pasting the same code all over. Looking at the test we wrote in the previous episode, this is how all of our future tests will look:

func TestNil(t *testing.T) {
    var buffer = bytes.Buffer{}
    var err = NewEncoder(&buffer).Encode(nil)

    if !(err == nil && bytes.Equal(buffer.Bytes(), []byte{0xf6})) {
        t.Fatalf(
            "%#v != %#v or %#v != %#v",
            err, nil, buffer.Bytes(), []byte{0xf6},
        )
    }
}

We’re testing something with a well-defined interface: the encoder gets a value, returns an error, and outputs an array of bytes. This means we can factor out most of the code into a single helper function named testEncoder. We add this to our test file:

// testEncoder tests the CBOR encoder with the value v, and verifies that err
// and expected match what's returned and written by the encoder.
func testEncoder(t *testing.T, v interface{}, err error, expected []byte) {
    // buffer is where we write the CBOR encoded values
    var buffer = bytes.Buffer{}
    // create a new encoder writing to buffer, and encode v with it
    var e = NewEncoder(&buffer).Encode(v)

    if e != err {
        t.Fatalf("err: %#v != %#v with %#v", e, err, v)
    }

    if !bytes.Equal(buffer.Bytes(), expected) {
        t.Fatalf(
            "(%#v) %#v != %#v", v, buffer.Bytes(), expected,
        )
    }
}

testEncoder will save quite a bit of typing. TestNil turns into a single line —saving 8 lines— with testEncoder doing all the work:

func TestNil(t *testing.T) {
    testEncoder(t, nil, nil, []byte{0xf6})
}

Let’s write the tests for booleans now. We only need to test true and false, so TestBool is a terse two lines:

func TestBool(t *testing.T) {
    testEncoder(t, false, nil, []byte{0xf4})
    testEncoder(t, true, nil, []byte{0xf5})
}

Our current encoder is only able to encode nil, so if we run the tests now we’ll get a not implemented error thrown at us:

$ go test -v .
=== RUN   TestNil
--- PASS: TestNil (0.00s)
=== RUN   TestBool
--- FAIL: TestBool (0.00s)
        cbor_test.go:19: err: &errors.errorString{s:"Not Implemented"} != <nil> with false
FAIL
FAIL    _/home/henry/cbor   0.003s

Now we’ll implement boolean encoding and get those tests passing. From the previous episode our Encode function looked like this:

var ErrNotImplemented = errors.New("Not Implemented")

// Can only encode nil
func (enc *Encoder) Encode(v interface{}) error {
    switch v.(type) {
    case nil:
        var _, err = enc.w.Write([]byte{0xf6})
        return err
    }
    return ErrNotImplemented
}

We need to add another case to the switch block to detect booleans, and then turn the generic interface{} named v into a boolean value to know what the encoder should output into its writer:

// Can only encode nil, false, and true
func (enc *Encoder) Encode(v interface{}) error {
    switch v.(type) {
    case nil:
        var _, err = enc.w.Write([]byte{0xf6})
        return err
    case bool:
        var err error
        if v.(bool) {
            _, err = enc.w.Write([]byte{0xf5}) // true
        } else {
            _, err = enc.w.Write([]byte{0xf4}) // false
        }
        return err
    }
    return ErrNotImplemented
}

The tricky part here is v.(bool): it turns the untyped interface v into a boolean value using a type assertion.
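
Inside a case bool: clause the assertion can’t fail, but elsewhere the single-result form panics when the type doesn’t match; the comma-ok form is the safe variant. A standalone sketch:

var v interface{} = true
if b, ok := v.(bool); ok {
    fmt.Println("v holds a bool:", b) // v holds a bool: true
}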

Encode now works with booleans and our tests pass:

$ go test -v .
=== RUN   TestNil
--- PASS: TestNil (0.00s)
=== RUN   TestBool
--- PASS: TestBool (0.00s)
PASS
ok  	_/home/henry/cbor	0.003s

This wraps up the 2nd episode. Next we’ll encode a type that’s more than a single byte of output: positive integers.

I created a public repository with the code for this episode.

Go CBOR encoder: Episode 1, getting started

Let’s write a CBOR encoder in Go. We’ll learn more about type switching and type manipulation with the reflect package. This is going to be a series of posts, each building on the previous one. It requires a good understanding of Go’s syntax.

CBOR is a data format described in RFC 7049; it’s like JSON, but binary instead of text. Its design goals include extremely small code size, fairly small message size, and extensibility without the need for version negotiation.

We’ll use an interface similar to the encoding/json package’s. If you are unfamiliar with the encoding sub-packages, I recommend you read the JSON and Go article.

To start our empty package we’ll create a file named cbor.go like this:

// Implements CBOR encoding:
//
//   https://tools.ietf.org/html/rfc7049
//
package cbor

import (
        "io"
)

type Encoder struct {
        w   io.Writer
}

func NewEncoder(w io.Writer) *Encoder {
        return &Encoder{w: w}
}

func (enc *Encoder) Encode(v interface{}) error {
        return nil
}

Here’s how we use the encoder:

var output = bytes.Buffer{}
var encoder = NewEncoder(&output)
var myvalue = 1234
// write the integer 1234 CBOR encoded into output
if err := encoder.Encode(&myvalue); err != nil {
    ...
}

We have our basic structure; we can now start working on the encoder’s implementation. In the previous example we encoded the integer 1234, but we won’t start with integers: instead we will encode the value nil, because it’s the easiest value to encode.

According to the CBOR specification, the nil value is represented by a single byte: 0xf6.

Let’s write a test with the testing package: we’ll verify the encoder outputs the single byte 0xf6 into the result buffer when we pass nil. We create a new file cbor_test.go beside cbor.go for our tests:

package cbor

import (
    "bytes"
    "testing"
)

func TestNil(t *testing.T) {
    var buffer = bytes.Buffer{}
    var err = NewEncoder(&buffer).Encode(nil)

    if !(err == nil && bytes.Equal(buffer.Bytes(), []byte{0xf6})) {
        t.Fatalf(
            "%#v != %#v or %#v != %#v",
            err, nil, buffer.Bytes(), []byte{0xf6},
        )
    }
}

If we run the test with what we currently have we’ll get an error. No surprise: we haven’t implemented anything yet, so the encoder won’t write to the output buffer:

$ go test .
--- FAIL: TestNil (0.00 seconds)
    cbor_test.go:15: <nil> != <nil> or []byte{} != []byte{0xf6}
FAIL
FAIL    _/home/henry/essays/cbor    0.011s

To implement the nil value encoding we write the byte to the output when Encode() is called with nil, and we return an error for any other value since we haven’t implemented anything else yet:

var ErrNotImplemented = errors.New("Not Implemented")

// Can only encode nil
func (enc *Encoder) Encode(v interface{}) error {
    switch v.(type) {
    case nil:
        var _, err = enc.w.Write([]byte{0xf6})
        return err
    }
    return ErrNotImplemented
}

Here we’re using a type switch to determine the type of the value we got. There’s only one case for now, nil, where we write the 0xf6 value to the output and return the error to the caller.
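
A type switch can match any number of types; here’s a minimal standalone sketch of the same mechanism (a hypothetical helper, not part of the encoder):

func describe(v interface{}) string {
    switch v.(type) {
    case nil:
        return "nil"
    case bool:
        return "a bool"
    default:
        return "something we can't encode yet"
    }
}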

And now the test succeeds:

=== RUN   TestNil
--- PASS: TestNil (0.00s)
PASS
ok  	_/home/henry/essays/cbor	0.027s

We created the initial encoder that can encode a single value successfully. In the next episode we’ll implement CBOR encoding for more basic Go types.

Boot OpenBSD with EFI for full resolution display

I got a new Radeon graphics card for gaming. Unfortunately it’s not supported yet by the radeon(4) driver on OpenBSD. Luckily there’s a workaround: booting with UEFI. UEFI does all the talking with the graphics card, which allows OpenBSD to use the screen’s full resolution on unsupported cards: it should work better than the vesa(4) driver. EFI boots the operating system differently than old-school BIOS: it uses a special partition for the operating system’s bootloader.

First I needed to enable UEFI. It wasn’t called UEFI or EFI in the BIOS setup but something like “Windows 8 / 10 boot method”; I picked the regular version, not the WHQL one.

Second I created the EFI system partition to store the EFI bootloaders with gparted: I made a 100 MB partition formatted as FAT32 at the beginning of the disk, then set the flags “boot” & “esp” on it.

Third I created the system’s partition and did the install via a USB stick as usual. Once the install was done and before rebooting, I copied OpenBSD’s EFI bootloader to the EFI system partition like this:

# mount /dev/sd0i /mnt2
# mkdir -p /mnt2/efi/boot
# cp /mnt/usr/mdec/BOOTX64.EFI /mnt2/efi/boot

I rebooted and it worked out of the box: OpenBSD can now use my Radeon RX 570 and my 2560×1440 screen at full resolution.

Source

How to set the timezone on an OpenBSD system:

# ln -sf /usr/share/zoneinfo/America/Vancouver /etc/localtime

Or if one feels lazier:

# zic -l America/Vancouver

Updated reminder for later: pkg.conf doesn’t exist anymore on OpenBSD; now it’s installurl(5) you use to set up the mirror to install packages.

I work at a company that does Selenium stuff. So I build Selenium load testing tools on the side, because I think it’s cool. Maybe someone will be impressed:

https://bitbucket.org/henry/selenium-surfer https://bitbucket.org/henry/selenium-stresser

I’m a programmer.

I’m always a byte away from disaster.

Vim is my day to day editor: I use it for coding and writing prose. Vim was made for programmers, not really for writers. Here are a few plugins I use when I write prose with Vim.

  1. Goyo offers a distraction-free mode for Vim. It lets me resize the writing surface in the editor’s window however I want.

  2. I like soft line wraps when I write prose. Soft line wrap is when text that reaches the end of the window wraps back to the beginning of the next line without inserting a new-line. Vim Pencil lets you do that.

  3. vim-textobj-quote offers support for smart quoting as you type, like Unicycle, but this plugin is still maintained.

I have this snippet in my .vimrc to get in and out of the writer mode:

" I use a 80 columns wide editing window
function s:WriteOn()
    call pencil#init({'wrap': 'soft', 'textwidth': 80})
    Educate
    Goyo 80x100%
endfunction

function s:WriteOff()
    NoPencil
    NoEducate
    Goyo!
endfunction

command WriteOn call s:WriteOn()
command WriteOff call s:WriteOff()

Reminder for later, how to selectively roll back a file to a specified version:

git reset -p '<hash>' '<filename>'

The function ScanLines splits lines from an io.Reader; it returns a channel where the results are written. Handy when you want to read things on the fly, line by line:

package main

import (
        "bufio"
        "fmt"
        "io"
        "os"
)

//
// Read input line by line and send it to the returned channel. Once there's
// nothing left to read, it closes the channel.
//
func ScanLines(input io.Reader) <-chan string {
        var output = make(chan string)

        go func() {
                var scanner = bufio.NewScanner(input)

                for scanner.Scan() {
                        output <- scanner.Text()
                }

                if err := scanner.Err(); err != nil {
                        fmt.Fprintln(os.Stderr, "reading input:", err)
                }
                close(output)
        }()

        return output
}

func main() {
        var input = ScanLines(os.Stdin)

        for x := range input {
                fmt.Printf("%#v\n", x)
        }
}

Reminder for later, OpenBSD’s pkg_delete utility can remove unused dependencies automatically:

# pkg_delete -a

How to fill a PDF form with pdftk

I had a rather lengthy PDF form to fill; it took me 2 hours because copy-pasting didn’t work with my PDF editor.

After I saved the file I realized that I had clicked on a radio button I shouldn’t have clicked on: Kids. I do not have kids, and the radio selection didn’t contain a zero option, only one and more. After trying to get rid of that radio selection for 5 minutes, it looked like there was no way to undo it: I had selected something I couldn’t unselect.

I didn’t want to waste another 2 hours filling out the form; I needed to fix this by editing the PDF.

After a bit of googling I found pdftk, a command-line toolkit that can fill & extract information out of PDF forms.

To unselect the radio box, I had to extract the form data. Pdftk can extract the information into a text file that you can edit with a text editor.

pdftk input.pdf generate_fdf output form_data.fdf

Here it will generate form_data.fdf from input.pdf’s form values. After that I had to modify the fdf file to get rid of my selection. In my case, I wanted to reset the Kids radio selection.

/Kids [
<<
/V (1)
/T (RadioButtonList[0])
>>]

I changed it from “1 kid” to “nothing selected”.

/Kids [
<<
/V /Off
/T (RadioButtonList[0])
>>]

Then I had to re-enter the information from the FDF file into the PDF.

pdftk input.pdf fill_form form_data.fdf output output.pdf

It took me around an hour to do all this, so pdftk saved me time. I liked it; check out pdftk’s own examples to learn more, the documentation is terse and complete.

Still bitmap after all those years

Bitmap fonts are pixel-art fonts. Unlike outline fonts they cannot be automatically scaled with good results: to create a multi-size bitmap font you have to create a different version for each size. They can’t be anti-aliased, so they tend to look blocky compared to outline fonts.

Outline fonts use Bézier curves: they are scalable, and their edges can be anti-aliased to make them look nicer. Today everybody is running an operating system that can render outline fonts decently, and can use those smooth-looking beauties with superior results compared to bitmap fonts.

Bitmap fonts are a thing of the past.

Yet, I still use a bitmap font for my day to day programming tasks. I transitioned to an outline font for a while, but ultimately switched back after a few months because the outline font didn’t seem as sharp.

It may be silly, but nothing looks as sharp as a bitmap font to me. I’m talking about what it looks like on a computer screen in 2016 with a dot pitch of 0.27 mm. Because each pixel is either black or white and nothing smooths out the edges, it’s sharp.

I salivate like everybody over those screenshots of multi-colored terminal windows with a fancy outline font that supports ligatures, cool emoji icons, and rainbows of bright pastel colors. I’m sure it’s great to feel like you’re on acid while you write code, but I like my bitmap font and my bland terminal colors. It gets the job done and it’s easy on my eyes.

I’ll switch to outline fonts when I get a screen with a high pixel density for my workstation, but for now I’ll use my bitmap font; it doesn’t look so bad with today’s fat pixels.

Reminder for later, how to set up pkg.conf after a fresh OpenBSD install:

installpath=http://ftp.openbsd.org/pub/OpenBSD/%c/packages/%a

You may have to replace %c with snapshots.

Reminder for later: how to mount MTP device on Ubuntu with jmtpfs.

  1. Your user has to be part of plugdev group: usermod -a -G plugdev $(whoami)
  2. user_allow_other in /etc/fuse.conf
  3. jmtpfs path/to/mount/point will mount the 1st MTP device on path/to/mount/point

When I need to scrape data online, I use Python with requests and lxml, two libraries that make it easy to extract data without going crazy.

Often I come across HTML tables with data formatted like this:

<td>
    <a href='/data1'><strong>data1</strong></a>
</td>
<td>
    data2
</td>
<td>
    data<em>3</em>
</td>

In that case we’d just like to extract the list data1, data2, & data3 from the table. With the different markup in each cell it would take quite a bit of elbow grease to clean it up. lxml has a special method that makes all that easy: text_content. Here’s what the documentation says about it:

Returns the text content of the element, including the text content of its children, with no markup.

For the previous HTML snippet we’d extract the data like this:

>>> from lxml import html
>>> root = html.fromstring('''    <td>
...         <a href='...'><strong>data1</strong></a>
...     </td>
...     <td>
...         data2
...     </td>
...     <td>
...         data<em>3</em>
...     </td>
... ''')
>>> [i.text_content().strip() for i in root.xpath('//td')]
['data1', 'data2', 'data3']

I got new speakers with a built-in USB-DAC for my home computer.

Once plugged in, OpenBSD recognized it as a USB audio device, so far so good. Unfortunately I couldn’t get any sound out of it, but my computer’s sound card —which is recognized as a separate device— worked.

It turns out that by default sndiod —the system audio mixer— uses the first audio device it finds and ignores the others. To get it to use other devices you must specify them in rc.conf.local like this:

sndiod_flags="-f rsnd/0 -f rsnd/1"

I restarted sndiod with sudo /etc/rc.d/sndiod restart, and everything now works nicely.

I keep diaries on different subjects. When I add a new entry I start by inserting a timestamp at the top of the entry. I used to do it ‘manually’ –by copying the current date and pasting it into Vim–, and yesterday I decided to write a Vim function to automate that.

There’s nothing especially hard about that, but it took me a while to figure out how to insert the timestamp at the current cursor position. It didn’t look like there was any built-in Vim function to do it, and most solutions I found online seemed overly complicated.

It turns out that all I needed was an execute statement like this: execute ":normal itext to insert"; this will insert the string “text to insert” at the current cursor position.

I added this to my vimrc:

function s:InsertISODate()
    let timestamp = strftime('%Y-%m-%d')
    execute ":normal i" . timestamp
    echo 'New time: ' . timestamp
endfunction

function s:InsertISODatetime()
    let timestamp = strftime('%Y-%m-%d %H:%M:%S')
    execute ":normal i" . timestamp
    echo 'New time: ' . timestamp
endfunction

command Today	call s:InsertISODate()
command Now     call s:InsertISODatetime()

Reminder for later, how to reset a branch to what its remote branch is:

git checkout -B master origin/master

I looked for a decent weather app on Android for a while. I tried many; they tended to be cluttered and overly complicated for what they were doing. I’m now using Weather Timeline: it’s clear, fast, and simple. I check it every morning, it gives me a quick and clean overview of the forecast, no need to dig for the information in sub-menus, there are no ads, and it’s just $1.

Sometimes you need an up-to-date virtualenv for your Python project, but the one installed is an old version. I read virtualenv’s installation manual, but I didn’t much like that you have to use sudo to bootstrap it. I came up with an alternative way of installing an up-to-date virtualenv as long as you have an old version. In my case it was an Ubuntu 12.04 machine, which ships virtualenv 1.7.

1st install the outdated version of virtualenv:

$ sudo apt-get install python-virtualenv

Then setup a temporary environment:

$ virtualenv $HOME/tmpenv

Finally use the environment created before to bootstrap an up-to-date one:

$ "$HOME/tmpenv/bin/pip" install virtualenv


$ "$HOME/tmpenv/bin/virtualenv" $HOME/env
$ rm -rf "$HOME/tmpenv"  # Delete the old one if needed

I love Rob Pike’s talks: fast-paced and intense. It’s a nice change from the typical talks about programming: slow, and often more about self-promotion than teaching.

I usually feel I miss a few things here and there when I listen to Rob; he’s smart and expects you to be smart, he doesn’t talk down to you. I rarely understand everything perfectly: this is good, it means I don’t fully master the subject, it means I’m learning, it means he’s making good use of my time.

This talk about implementing a bignum calculator is the perfect example: Rob doesn’t spend much time reading and explaining the code or the examples, he assumes that his audience is smart enough to understand most of the details; he focuses on the big picture and the hard details.

I needed a set data structure for a Go program; after a quick search on the interweb I saw this reddit thread about sets, queues, etc…

Short answer: for sets use maps, specifically map[<element>]struct{}. My first intuition was to use map[<element>]interface{}, but it turns out that an empty interface takes two words of memory —8 bytes on a 32-bit system: one word for the type, and one for the value, which is always nil— while an empty structure doesn’t use any space.
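
You can check the sizes on your own machine with unsafe.Sizeof (a fragment; assumes fmt and unsafe are imported):

var i interface{}
fmt.Println(unsafe.Sizeof(struct{}{})) // 0
fmt.Println(unsafe.Sizeof(i))          // 8 on 32-bit systems, 16 on 64-bit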

There weren’t many details on how to do it, so I just gave it a try. It was pretty easy to figure out the implementation, as long as operations like union and intersection aren’t needed.

That’s how I would implement an integer set:

type set map[int]struct{}

var myset = make(set)  // Allocate the map

// Add an element to the set by adding an empty structure for the key 1
myset[1] = struct{}{}

// Check if we have 1 in our set
if _, ok := myset[1]; ok {
    println("1 in myset")
} else {
    println("1 not in myset")
}

// Remove the element from the set
delete(myset, 1)

REST toolbox for the command-line hacker

I work with lots of REST services these days. RESTful services are easy to access and use because they’re based on well-known tech; this eliminates half of the tedious work. Unfortunately the other tedious half is still here: interfacing. We still need to get and convert the data from the original format to the format we want. Lately I found two tools that help a great deal with HTTP and JSON: HTTPie and jq. Today I’ll talk about HTTPie.

I used cURL for almost a decade to deal with HTTP from the command line. A few months ago I heard about a command-line client called HTTPie, which has a nice interface that totally makes sense:

$ http --form POST localhost:8000/handler foo=1 bar=hello

What does it do? It does an HTTP POST on localhost:8000/handler with the following request:

POST /handler HTTP/1.1
Host: localhost:8000
Content-Length: 15
Accept-Encoding: gzip, deflate
Accept: */*
User-Agent: HTTPie/0.8.0
Connection: keep-alive
Content-Type: application/x-www-form-urlencoded; charset=utf-8

foo=1&bar=hello

It’s exactly the kind of stuff I want. I often automate the common stuff away with a function, like this:

http() {
    # 1st parameter is the path, we pop it out of the parameter list
    local urlpath="$1"; shift

    # since we use http as our function name we have to use `command' to
    # call the executable http and not the function
    command http --form POST "example.com$urlpath" "$@"
}

# Do a POST with the following parameters: foo=1&bar=2
http /test foo=1 bar=2

If you’d rather submit JSON instead of an url-encoded form, replace the --form option with --json.

Give HTTPie a shot next time you want to talk to an HTTP service from the command line: it may take you less time to learn it from scratch than to remember how to use cURL.

One day at work the Internet went out around 4pm; most of my co-workers couldn’t work: most of the information they needed was online, and they didn’t have local copies. If I had been writing Python at the time, being offline would have been a problem: I rely on the information on docs.python.org, and on frameworks’ and libraries’ documentation, all of which are online.

With Go, it’s less of a problem: if you have godoc installed you can access the installed packages’ documentation using a local HTTP server:

$ "$GOPATH/bin/godoc" -http=:8080

Point your browser to localhost:8080 and here you have it: the documentation for all your installed packages.

A few tips & tricks for properly managing views with Postgres:

  1. Name the return values with AS
  2. Type constant values by prefixing them with their type

For example consider the following:

$ CREATE VIEW myview AS SELECT 'bar';
WARNING:  column "?column?" has type "unknown"
DETAIL:  Proceeding with relation creation anyway.
CREATE VIEW

Here’s what the Postgres documentation says about it:

Be careful that the names and types of the view’s columns will be assigned the way you want. For example:

  CREATE VIEW vista AS SELECT 'Hello World';

is bad form in two ways: the column name defaults to ?column?, and the column data type defaults to unknown. If you want a string literal in a view’s result, use something like:

  CREATE VIEW vista AS SELECT text 'Hello World' AS hello;

First we’ll name our string to get rid of the “?column?” name:

$ CREATE VIEW myview AS SELECT 'bar' AS bar;
WARNING:  column "bar" has type "unknown"
DETAIL:  Proceeding with relation creation anyway.
CREATE VIEW

Second we set the type of our return value by prefixing with TEXT:

$ CREATE VIEW myview AS SELECT TEXT 'bar' AS bar;
CREATE VIEW

That is all.

I wanted to upgrade to Go 1.3 on my desktop at work. The version of Go shipped with Ubuntu 14.10 is 1.2. I found this article talking about godeb, a Go program that packages Go into a .deb file installable on Ubuntu.

You still need Go 1.0+ installed, otherwise the installation is straightforward:

$ go get gopkg.in/niemeyer/godeb.v1/cmd/godeb

And it’s easy to use:

$ $GOPATH/bin/godeb list
1.4beta1
1.3.3
[...]
1.0.1

$ sudo $GOPATH/bin/godeb install 1.3.3

You may need to rebuild your Go packages after the install; the easiest way is to delete the old versions and let Go rebuild them when they are needed:

$ rm -rf $GOPATH/pkg/*

Reminder for later, how to setup an SSH tunnel to a remote PostgreSQL database.

Create the script postgres_tunnel.sh:

#!/bin/sh

: ${REMOTEHOST:=example.com}
: ${LOCALPORT:=12345}

# Assuming postgres listens on localhost:5432
ssh "$REMOTEHOST" -L "$LOCALPORT":localhost:5432 -N

Execute it, and connect to the remote database like this:

$ psql 'postgres://username:password@localhost:12345/mydb'

I work in open spaces a lot; my current job is in a shared open office. Open spaces are known to impede workers’ productivity. It’s one of the worst office arrangements, yet they are popular in IT. They offer a short-term gain: cheaper space, for an invisible –but real– price: less productive and satisfied workers. Noise and lack of privacy are the main causes of dissatisfaction, and while it’s difficult to address the lack of privacy, something can be done about the noise. I deal with it via a two-pronged attack:

  1. Earplugs, I use Howard Leight Max, and there are lots of other good earplugs around with different shapes and foam types.
  2. Headphones with a Pink noise playing in a loop

This way I get good isolation from the environment, and it makes interruptions awkward for the interrupter: he has to wait for me to take off my headphones and earplugs. This makes interrupting me more costly, which is a nice side effect.

I have regular headphones; I wonder how good the combo earplugs & noise-cancelling would be.

I used Swiftkey as my Android keyboard: I found it worked better than the default Android keyboard. I switch between English & French often; Swiftkey just works without selecting a language, while the default Android keyboard needs to be switched between the 2 languages to work properly.

I was hanging out on security.google.com, and saw that Swiftkey Cloud had full access to my Email: read & write access! I didn’t remember giving them any of these permissions; I must have at some point, but I don’t know when.

Reading one’s emails is a great way to improve one’s predictive typing, but it wasn’t clear to me that they’d read my Emails. I’m almost certain there was no big fat pop-up saying so.

That kind of thing really annoys me: Emails are sacred, you don’t mess with them unless you’re a company with no morals or ethics like Linkedin or Facebook… I made fun of people who gave their Email and password to 3rd parties, but I kind of did the same…

I revoked the access, deleted my Swiftkey Cloud account, removed the Swiftkey app from my phone, and switched back to the Google keyboard; it has come a long way since I replaced it with Swiftkey a year ago.

I started a project in Go; when I got started everything was in a single file. Now this file is too big for my own taste, so I split it into 2 separate files, let’s call them main.go & util.go. In main.go I have the main() function; in util.go I have functions used by main.go.

When I tried to run main.go directly I got this error:

$ go run main.go
# command-line-arguments
./main.go:150: undefined: SomeFunction

I didn’t want to create a package just for util.go; sometimes source files really are specific to a program and aren’t reusable.

My search for a solution on the web didn’t yield anything useful. I knew it was possible: I had seen programs like godeb do it. After a while I built the program with go build to see if the error would be different, and it worked this time. Weird… What’s going on?

Everything was the same except I didn’t specify what to build, in that case main.go: Go just built every Go source file in the directory. I got the same error with go build when only main.go was specified:

$ go build main.go
# command-line-arguments
./main.go:150: undefined: SomeFunction

That’s when it hit me, I just needed to list all the files necessary to run the program on the command line:

$ go run main.go util.go

Here it is: go run needs the complete list of files to execute in the main package. I’ll know it for next time!

I’m a hater, especially when it comes to programming languages. I approach most of them with pessimism; I rarely look at a new language and think it’s great right away.

C

I started programming in high-school with QBasic. I made a small choose-your-own-adventure style game in text-mode, then moved on to C. I didn’t like C much: it didn’t feel expressive enough, and was too hard to use for the noob I was. I started programming seriously after high-school and discovered C++ during my 2nd year studying computer science. I instantly became a fan: C++ had so many features, the language felt more expressive, more powerful. I thought I could master it within a few years. It took me a good 5 years to realize that C++ was too big: it seemed baroque and overly complex after all this time. After 5 years I still didn’t master most of the language; like everybody I just used a subset. I went back to C and saw what I wasn’t able to see then: C was expressive and simple. It took me years of struggling with the seemingly cool features of C++ to realize C was the best part of C++.

Javascript

I was a Javascript hater for a long time; the language seemed so absurdly hard to work with, there were traps and gotchas. If you had asked me 5 years ago: PHP or Javascript, I’d have replied: “PHP of course! Javascript is terrible.” Then I learned more about it thanks to Douglas Crockford’s videos. While Javascript is not my favorite language I came to appreciate it; today I’d pick it over PHP if I had to start a new project.

Python

Python looked a bit ridiculous when I first used it. I didn’t like the indentation defining blocks, or that the language was interpreted; I didn’t get that a dynamic language opens up a realm of new possibilities. At the beginning Python felt like a slow, dumbed-down C++. It took time writing Python every day to fall in love with it, but after a year it was my favorite language. I’ve been writing Python personally and professionally for 10 years now.

Go

My first impression of Go was: it’s kind of like a cleaned-up C. My main problem was that concurrency was part of the language, like in Erlang; I thought it’d be better if the tools for concurrency were contained in a library, like multiprocessing in Python. Also there were a few things that really bothered me, like the semi-colon insertion, a known Javascript gotcha.

Then I heard about goroutines, channels, & Go’s select statement; after that it all made sense. Go has an elegant solution to a fundamental problem of modern computing: concurrency.

The semi-colon insertion turned out to be a convenient quirk.

Go became my new toy a month ago; it’s now on track to replace Python as my favorite programming language.

1.2 GB is easier to understand than 1234567890 bytes, at least for humans. I write functions to ‘humanize’ numbers often, but it never seems worth keeping those functions around since they are generally quick and easy to write. Today I decided to finally stop rewriting the same thing over and over, and headed to PyPI –the Python module repository–, and of course there’s a module to do that on it:

>>> import humanize
>>> humanize.naturalsize(1234567890)
'1.2 GB'

The Oak Island Money Pit is the story of a 2-century-long treasure hunt on a small island in Nova Scotia, Canada. What makes this treasure hunt special is how many resources were sunk into it. For 200 years adventurers lost their time, money, and sometimes their lives trying to find the elusive gold & jewels.

Long story short: in 1795, after seeing lights on the island, three teenagers found a mysterious depression in the ground, and started treasure hunting by digging. Clues and signs of treasure were found, fortunes were wasted on fruitless digging, and 6 people died. To this day nothing of value has been found. Speculations & theories about the origin of the supposed treasure in the money pit abound.

Something that wasn’t addressed in the article: how the hell did all that stuff get so deep into the ground? If digging deep enough was still a problem in the 1960’s, how did 17th-century men manage to dig a hole 100 feet deep, along with booby traps and flooding tunnels? Given the numerous difficulties the treasure hunters went through for the past 200 years, it would have been a great engineering feat. All of that without being detected by the locals, and keeping it secret for 200 years.

Most adventurers probably thought about it, but they gave their imaginary enemy —the treasure digger— too much credit, and didn’t give their predecessors enough credit.

The Oak Island Money Pit is a great story because it’s a great tragedy. The only treasure on the island is the memories of this great human adventure.

I use xdm(1) as my login manager under OpenBSD. After I log in, it starts xconsole(1). It’s not a big deal, but I’d rather not have it happen.

To stop xdm from starting a new xconsole for every session, edit /etc/X11/xdm/xdm-config and remove or comment out the following lines:

DisplayManager._0.setup:      /etc/X11/xdm/Xsetup_0
DisplayManager._0.startup:    /etc/X11/xdm/GiveConsole
DisplayManager._0.reset:      /etc/X11/xdm/TakeConsole

I’m not a fan of 37signals, but I must admit their DNS service xip.io is handy. I’m setting up some web servers right now, and I needed a domain to test my configuration. The whole DNS dance is a bit time-consuming: add a record to the zone file & wait for my DNS to pick it up. With xip.io there’s no need to wait: prepend your host’s IP address to .xip.io and the domain will resolve to your own IP.

For example 127.0.0.1.xip.io will resolve to 127.0.0.1.

There are other services like this, such as ipq.co or localtest.me, but as far as I know they don’t work out of the box: you have to register your subdomain first, or can only use it with localhost.

How to run PostgreSQL as a non-privileged user

The quick and dirty guide to setting up a postgres database without root access.

Create a directory where your data will live:

$ postgres_dir="$HOME/postgres"
$ mkdir -p "$postgres_dir"
$ initdb -D "$postgres_dir"
[lots of output...]

Then run postgres:

postgres -D "$postgres_dir"

Create a database for yourself:

$ createdb $(whoami)
$ psql
user=#

To stop the server type Ctrl-C, or you can use pg_ctl if postgres runs in the background:

pg_ctl stop -D "$postgres_dir"

$ sudo apt-get install zsh-doc
[...]
$ man zsh
No manual entry for zsh
See 'man 7 undocumented' for help when manual pages are not available.
$ man 7 undocumented
[...]
NAME
       undocumented - No manpage for this program, utility or function
[...]

Damn you Ubuntu/Debian/whoever decided that man pages were ‘too big’ to be part of documentation packages!

Copy a branch between Git repositories

Git is tricky to use: after 4 years I still have a hard time figuring out how to do simple operations with it. I just spent 30 minutes on that one:

Say you have 2 copies of the same repository, repo1 & repo2. In repo1 there’s a branch called copybranch that you want to copy to repo2 without merging it: just copy the branch. git pull repo1 copybranch from repo2 doesn’t work because it will try to merge copybranch into the current branch: no good.

It looks like git fetch repo1 copybranch would be the way to go, but when I did it, here’s what I saw:

From repo1
 * branch            copybranch -> FETCH_HEAD

After that a quick look at the logs doesn’t show copybranch, FETCH_HEAD, or any of the commits from copybranch. What happened? Git copied the content of copybranch, but instead of creating another branch named copybranch it created a temporary reference called FETCH_HEAD, and FETCH_HEAD doesn’t appear in the logs. In summary: Git copied the branch & made it invisible, because you know… it makes perfect sense to hide what you just copied.

So how do you copy the branch, and create a branch with the same name referencing the commits? Here it is:

git fetch repo1 copybranch:copybranch

I use VMWare Player to run OpenBSD under Windows on my laptop –a Thinkpad X1 Carbon–; its newer hardware wasn’t fully supported by OpenBSD when I got it. I had issues with VirtualBox: it was slow, 50%+ of the CPU time was spent on interrupts, and I couldn’t find a solution on the Internets. After reading Ted Unangst’s blog where he describes his setup I decided to switch to VMWare Player.

I use Putty to connect to the VM, and while VMWare worked well, sometimes the dynamically assigned IP changed. I had to reopen Putty to change the IP from time to time, and it was getting annoying. It turns out that you can use static IPs: VMWare uses a 255.255.255.0 netmask, and it reserves the 3-127 range for static IPs. I put this in my /etc/hostname.em0:

inet 192.168.234.3 255.255.255.0

It didn’t work right away. It turns out that the gateway was at 192.168.234.2. I put the following in /etc/mygate:

192.168.234.2

And things are now working nicely.

CSS3 Quickies

I did some web-design for a friend this January. I hadn’t used HTML & CSS in a while, though I did quite a bit of tinkering with the graphic design of scratchpad 6 months ago, using a custom font and trying a few ‘newish’ features like media queries. It was a good opportunity to discover & use the new HTML5 & CSS3 features.

My friend had a Joomla template he wanted to use for his site. His needs were limited: 5 pages and maybe a contact form. Hosting this with Joomla seemed a bit overkill, so I decided to ‘rip off’ the template and create a clean HTML skeleton for him to use.

First we tried to work from the source of the template, but the template’s HTML & CSS were very hairy; I couldn’t wrap my head around them, so I decided to rewrite it from scratch. Who doesn’t love to reinvent the wheel? :)

I used purecss on the 1st version, but I wasn’t satisfied with the way it worked. I like to minimize HTML markup: I really dislike it when there are 5 divs to size what should be a single box, when all you need to do is use CSS correctly. Unfortunately purecss works this way: you need to nest your boxes inside other boxes to get things to work correctly. It’s understandable why they do that: it’s a CSS framework, the CSS directs the way the DOM is structured, and CSS is complicated to get working without a few intermediate steps. Since I was here to learn more about CSS, I dropped purecss and started using what I learned studying it for the new template.

Here are the few things I tried while working on the site:

box-sizing

box-sizing: border-box is handy: it includes the border & the padding in the box’s size. If you have 2px borders on a 200px box, the box stays 200px wide, with the borders taken from the inside: 200px - 2 × 2px = 196px of usable space (any padding comes out of that too). It simplifies box placement, no more: my borders are 4px, my box is 200px so that’s 4 × 2 + 200 = 208px… It’s only supported by IE9+, and it needs a prefix on some browsers like Firefox. I used it when developing the site; at the end of the design process I removed it, and I had to make a few adjustments here and there, but it was easy to do. border-box was neat though: no more pointless tinkering. I’ll use it again for sure.

Media queries

Media queries are the basis of responsive design. Instead of using pixels as a unit like most do, I use ems, the typographic unit of measure. That made many things simpler, like re-calculating the size of the grid when adjusting the font size.

While media queries aren’t that easy to use & lack expressive power, they weren’t too bad, and I managed to do what I wanted without too much tinkering.

inline-block

display: inline-block; allows you to simplify box packing: designing layouts requires fewer tweaks and hacks. inline-blocks are well supported by all modern browsers. IE6 supports it –sort of–, and it even works correctly on IE7! I’m kind of late to the party; better late than never.

CSS3 transition

Fancy, but meh. It’s all eye-candy, and I don’t think it improves usability / readability one bit. I’ll still use them here and there to fade bits of interface in and out.

I was trying to get Tornado’s AsyncHTTPTestCase to work with Motor, but the tests were blocking as soon as there was a call to Motor. It turns out that Motor wasn’t hooked to the tests’ IO loop, therefore the callbacks were never called. I found the solution after looking at Motor’s own tests:

motor.MotorClient(host, port, io_loop=self.io_loop)

In HTML whitespace isn’t significant, but a single space between tags can wreak havoc with your CSS layout. Consider the following:

<div></div> <div></div>

Or

<div></div>
<div></div>

The space between the 2 div tags will insert a single space between the 2 divs. If the 2 divs’ widths were 50% of the parent, they wouldn’t fit in it because of the added space. To fix this you have to remove the spaces:

<div></div><div></div>

It looks kind of ugly to have everything on the same line. My favorite way to deal with this is to move each closing tag’s chevron right before the next tag, like so:

<div></div
><div></div
><div></div>

It doesn’t look super nice, but it’s better than having everything glued on the same line.

3rd party sharing & social-media buttons are a waste of your and your reader’s time: Sweep the Sleaze

size_t

Random C fact of the day: I thought that size_t was a built-in type, because that’s what the operator sizeof is supposed to return. Roman Ligasor –a co-worker– proved me wrong. It turns out that it’s defined in the header stddef.h. C is a minimalist language: why define a built-in type that would just be an alias to another built-in type?

Without transition: According to the CERT Secure Coding Standards, one should use size_t instead of integer types for object sizes.

Don’t:

int copy(void *dst, const void *src, int size);

Do:

size_t copy(void *dst, const void *src, size_t size);

I use DuckDuckGo these days; one of its best features is the bang, smart shortcuts to other websites. I use !man & !posix all the time: they give you direct access to the POSIX standard manuals & specification. That’s better than relying on Linux manuals, as I have to at work.

I started drinking coffee 15 years ago, when I was a student. Like many students my sleep schedule was messed-up: I was working late, and getting up late. I loved working at night: it’s quiet, there’s almost no distraction. To compensate for my lack of sleep during the day I drank coffee, sodas, and occasionally tea. After graduating, I stopped drinking coffee and soda for a while. I switched to tea, 2 to 4 cups of tea every weekday for 10 years.

I started drinking coffee again 2 years ago when I started my new job. I drank 1 to 3 cups of coffee at the start of the day, and 2 to 4 cans of diet soda during the day on top of that: I ingested 150mg to 400mg of caffeine every day. I thought that coffee was by far the biggest source of caffeine, but it turns out that sodas and tea also contain a significant amount. A cup of coffee contains around 100mg, a can of Diet Pepsi has 35mg, while a cup of tea is around 40mg.

How much caffeine is too much? According to Wikipedia, 100mg per day is enough to get you addicted:

[…] people who take in a minimum of 100 mg of caffeine per day (about the amount in one cup of coffee) can acquire a physical dependence that would trigger withdrawal symptoms that include headaches, muscle pain and stiffness, lethargy, nausea, vomiting, depressed mood, and marked irritability.

The Mayo Clinic recommends cutting back for those who get more than 500mg every day; I suspect the limit is lower for me.

I had my last coffee Sunday morning, almost 4 days ago. I’ve experienced most of the withdrawal symptoms. It’s getting better, but I think I have another day or two before I feel normal again, and I didn’t even consume that much caffeine. It must be awful to be nauseous or vomit on top of the other symptoms; I imagine only big consumers get those, but it tells you a lot about how strong the addiction can be. The headaches are especially annoying: they’re caused by an increase of blood flow in the head. I usually exercise when I want to get my mind off something or to get back into a healthy routine, but in the case of caffeine withdrawal, exercise seems to make the headaches even worse. Aspirin works well, but it still hurts quite a bit. The worst part is how irritable I am right now. I tend to go crazy when I’m on my own and idle: I get restless and my mind wanders, thinking of past personal injustices and how I’ll get revenge. I get angry for nothing. I can’t even focus on a book for more than 10 minutes without my mind wandering.

The good news is: it’s almost over.

[…] withdrawals occurred within 12 to 24 hours after stopping caffeine intake and could last as long as nine days.

There were positive side effects: I used to pee 3 to 5 times a day, not anymore, and my sleep seems to be improving. Sleep is why I stopped consuming caffeine in the first place: I don’t sleep well most nights, waking up tired but not sleepy.

Like most things, caffeine isn’t bad, but it has to be consumed in moderation. I don’t plan to ban caffeine from my life, but I do need to reduce my consumption, and take a break from time to time.

I always forget about the HTTP server in Python. For a while I’ve been using a quick’n dirty shell script with netcat to serve a single file over HTTP, but this is easier, and works better:

python -m SimpleHTTPServer [port number]

It will serve the content of the current directory.
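Note for the future: on Python 3 the module was renamed, so the equivalent is:

python -m http.server [port number]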

I’ve redesigned this space after reading the excellent Practical Typography by Matthew Butterick. I picked Charter as the font for the body text. Charter is recommended in the appendix of the aforementioned book as one of the best free fonts by far. I tried Vollkorn from Google web fonts for a while before switching to Charter. While Vollkorn looked fine to me, Charter looks even better: it feels crisper.

I picked fonts from the Source Pro family by Adobe as my sans & monospaced fonts, but I may switch to one of the DejaVu fonts if I find one that I like better.

I had problems with KiTTY: session management doesn’t seem to work with multiple sessions. I looked for an alternative and found this one: http://jakub.kotrla.net/putty/. It’s basically a normal PuTTY with a patch to store sessions on disk. It lacks the URL click-to-open feature, but I think I can live without it. I’ve been using it for 2 weeks now, and I’m happy with it.

curl is a useful tool if you’re working with HTTP. I’m fond of the -w option: it prints all kinds of information about the transfer, including timing:

$ curl -s -o /dev/null -w '
url_effective: %{url_effective}
http_code: %{http_code}
time_total: %{time_total}
time_namelookup: %{time_namelookup}
time_connect: %{time_connect}
time_pretransfer: %{time_pretransfer}
time_starttransfer: %{time_starttransfer}
size_download: %{size_download}
size_upload: %{size_upload}
size_header: %{size_header}
size_request: %{size_request}
speed_download: %{speed_download}
speed_upload: %{speed_upload}
content_type: %{content_type}' http://google.ca/

url_effective: http://google.ca/
http_code: 301
time_total: 0.062
time_namelookup: 0.038
time_connect: 0.045
time_pretransfer: 0.045
time_starttransfer: 0.062
size_download: 218
size_upload: 0
size_header: 320
size_request: 153
speed_download: 3504.000
speed_upload: 0.000

My C is rusty. Here are a few tricks I forgot and had to rediscover:

int array[42];
int *pointer = array + 8;

// This will be 8, not 8 * sizeof int
size_t x = pointer - array;

Number of elements in an array:

int array[42];

// x == 42 * sizeof int. Not what we want
size_t x = sizeof array;

// The right way to do it: y == 42
size_t y = sizeof array / sizeof array[0];

That is all.

Reminder for later. Put a file-descriptor in non-blocking mode with fcntl:

static int
set_non_blocking(int fd)
{
        return fcntl(fd, F_SETFL, fcntl(fd, F_GETFL)|O_NONBLOCK);
}

I’ve tried to write regularly for more than 5 years. Yet I still struggle to publish content monthly, let alone write something every week. It’s not really a time problem: every week I waste many hours slacking on the Internet, at work or at home. I could probably turn one of those wasted hours into a semi-productive writing session. I realize that ‘wasting’ time is a necessary evil, nobody can be productive all the time: sometimes you just need to turn the brain off to recharge.

I’m not posting high quality content here. Most of the posts on my blog took 2 to 4 hours of focused effort to research and write, not including the time it takes for the idea to mature. I want to post less, but post better articles.

Most of my writing happens in short bursts over a few days. I’m writing a lot at the moment because I have a shiny new Bitbucket repository with all my essays in progress. Unfortunately once the novelty wears off, I don’t think I’ll keep writing at the same rate…

Scratchpad made me write more. It was inspired by Steven Johnson’s Spark file. I think this new repository with short essays in progress is a good complement to it. I’ll see how things turn out…

Cal Newport argues that the best way for wannabe writers like me is to have an adaptable writing schedule every week. I wonder if I should reserve a time-slot during the week for writing. One hour focused on writing may turn out to be the trigger I need to think up and publish valuable articles.

I used a bitmap font for many years for all my development-related applications. Bitmap fonts aren’t scalable, but they usually look sharper and clearer than the scalable variety, because they map directly to the pixels on the screen.

My font used to be Terminus: it’s sharp and I like its shape, but it’s missing quite a few glyphs, and it’s becoming harder to use as screen resolutions and DPI increase. I’ve been looking for new fonts to try these past few weeks. There are many fonts for programming; here’s my selection of monospaced scalable fonts:

Comparison of monospaced fonts: What Are The Best Programming Fonts?

It took me a while to figure out how to dump the content of UDP packets with tcpdump/Wireshark. Here’s how you do it:

# Dump all the UDP traffic going to port 1234 into dump.pcap
$ sudo tcpdump -i lo -s 0 -w dump.pcap udp and port 1234
# Then we print the data on stdout
$ tshark -r dump.pcap -T fields -e data

This will print all the data in hex-encoded form, 1 packet per line. You’ll have to decode it to get the data in binary form. The following Python program does that:

import sys

for line in sys.stdin:
    # Binary data
    data = line.rstrip('\n').decode('hex')
    print repr(data)
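The two steps glue together nicely in a pipeline; assuming the script above is saved as decode.py (a name I made up):

$ tshark -r dump.pcap -T fields -e data | python decode.py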

Last week I installed Bootstrap as scratchpad’s CSS. Before that I had a minimal CSS with normalize.css, but I couldn’t get it to display the feed correctly on my smartphone: the article block occupied only half the screen, and I had to readjust the zoom level after loading the page to make it ‘fullscreen’.

I’m unhappy to add a dependency like that to scratchpad, but I can’t be bothered with CSS anymore. The folks from Twitter like to deal with that ridiculous shit, so I figured I should use the thing that just works, no matter how much I dislike the idea. At least the typography is nicer and the colors are prettier.

I had a few problems with KiTTY, a clone of PuTTY, and tmux, a terminal multiplexer: pane separators were displayed as ‘q’ or ‘x’ instead of lines. It turns out it’s a PuTTY problem, according to the tmux FAQ:

PuTTY is using a character set translation that doesn’t support ACS line drawing.

I had this problem for a while, and I didn’t manage to solve it with my old bitmap font, Terminus, maybe because the font is missing the glyphs to draw lines. There were actually a few problems:

  1. My old font didn’t have line-drawing glyphs (?)
  2. PuTTY’s character set translation problem
  3. The encoding wasn’t properly detected by tmux

I recently switched to a new font: Source Code Pro by Adobe, which fixed the missing glyphs. I also had to tweak tmux & KiTTY a little bit. In KiTTY’s settings, in the Window > Translation category: set the remote character set to ‘UTF-8’, and select the Unicode line-drawing code points. Make sure your font supports the line-drawing code points. Start tmux with the -u option to tell it to use UTF-8, and you should be good to go.

I instantly became a fan of micro-credit when I first heard about the idea. It seemed to be a perfect way to get people out of poverty: loans to small businesses in under-developed parts of the world. Business, exchange, trade, and the ability to sustain oneself are what lift people out of poverty: not charities and government hand-outs. Micro-credit fits nicely into my view of the world: individuals stepping up to raise their standard of living with help from non-governmental bodies.

Planet Money posted an article about micro-credit, or more precisely about studies to determine how effective micro-loans are. It turns out there was a lot of hype, and not a lot of results. It doesn’t look like micro-loans do much to improve the standard of living of those who receive them.

It’s disappointing, but after thinking about it for a while it seemed foolish to think that small sums of money here and there could have a significant impact on the lives of those who live in poverty.

With Unix becoming more and more ubiquitous, the POSIX shell is the de-facto scripting language on most computers. Unfortunately it’s difficult to write proper scripts with it: few really know it, and there’s not much good documentation available. Rick’s sh tricks is one of the best resources I’ve found about pure POSIX shell scripting.

After reading The Best Water Bottles on The Wirecutter –a review website– I got myself the 800ml Klean Kanteen Classic. It’s exactly what I wanted: a water bottle with a mouth wide enough for ice cubes. That should help me get over my diet soda addiction. I made myself lemon water this afternoon: plain water has a hard time competing with diet soda, but lemon water is another story… :)

I’m in relatively good shape for a programmer. I’m not overweight, but I have a little bit of extra fat that I usually work off during summer. As I get older, I notice that it’s getting harder to get back in shape. After reading Hacking strength: Gaining muscle with least resistance & Least resistance weight loss by the excellent Matt Might, I started using MyFitnessPal, a food diary application. It has great ratings on the Google Play Store with millions of downloads, and I managed to log a full week with it without going crazy.

Getting up early in the morning has always been a problem for me. It got better last year mostly because we have scrum stand-up meetings at 9:45am every morning. I’d like to start work earlier, especially with summer coming up: I want to get out by 5:00pm to enjoy the sun and the outdoors.

I started tracking my time of arrival at work using Joe’s Goals, a simple log book. It’s quick and easy to get started with and to use; I like it.

I just ran into a weird problem with sh’s syntax: the following function doesn’t parse:

f() {
    return { false; } | true
}

f should return 0: a successful exit value. The problem is that using return like that is invalid: according to the POSIX spec, return expects a plain number, not a list or compound command. The solution is simply to write:

f() {
    { false; } | true
}

This will return the last command’s exit code; in this case that’s true, so the value is zero. It’s still difficult to find good information about shell scripting on the net, so I thought I’d throw this out here.

Google announced that they will shut down Google Reader, their RSS feed reader, this summer. I’ve been using it for many years. Feeds and blogs aren’t trendy anymore: the cool places to post content are social networks. I thought about letting Reader go without finding a replacement: RSS feeds are a better source of quality content than social networks, but checking Reader every day takes 10 to 20 minutes.

I still decided to look for an alternative. I briefly tried the NewsBlur demo, but I didn’t like the interface: it felt a little too cluttered, and the site wasn’t working well when I tried it. Then I tried Feedly for a week, and so far I’m impressed. The interface is clean and minimal, while still providing everything I had in Reader and more.

Shell job control can be a good alternative to terminal multiplexers like screen or tmux. Here’s a small example:

$ vi file

[1]+  Stopped(SIGTSTP)        vi file
$ jobs
[1]+  Stopped(SIGTSTP)        vi file
$ vi other

[2]+  Stopped(SIGTSTP)        vi other
$ jobs
[1]-  Stopped(SIGTSTP)        vi file
[2]+  Stopped(SIGTSTP)        vi other
$ fg %1  # resume 'vi file'
vi file

[1]+  Stopped(SIGTSTP)        vi file
$ jobs
[1]+  Stopped(SIGTSTP)        vi file
[2]-  Stopped(SIGTSTP)        vi other
$ fg %2  # resume 'vi other'

I used Todoist for a few years to manage my TODO list. Then I gave up on electronic TODO lists and Todoist: it just didn’t work very well, because I needed to be near a computer at all times. I switched to paper, but it didn’t work either, at least for my personal use: I forgot the notebook constantly, and sometimes didn’t check it for weeks. Now when I get home after work I write my tasks for the evening on a small piece of paper. This works rather well, but it’s only good for short-term tasks.

Since I got a Nexus 4, I can run modern apps without too much frustration, so I’m trying a few TODO list apps to see if any of them is any good. I tried Todoist again: the web app, and the Android app. I didn’t like it. I remember Todoist being more keyboard driven; all the shortcuts I learned back then seem to be gone or broken. A lot of features are reserved for paid users, and I didn’t like it enough to pay the $30. Moving on.

I looked for a better alternative and found Wunderlist. It looked nice and simple, but after a week of usage I’m not really convinced. It doesn’t do nested tasks, or more precisely you can only have 2 levels of nesting, and adding subtasks isn’t super intuitive. I use nested tasks quite heavily, so that may be a deal breaker. On the other hand the interface seems to be more or less the same everywhere, which is a nice plus. I’ll keep using it for a little while, and probably get rid of it, but I want to give it an honest try.

I liked Emacs Org-mode back in the days when I used Emacs. It could be an interesting option.

Last Christmas I got the book Breakthrough Rapid Reading. I haven’t read it yet, and after listening to Skeptoid’s podcast on speed-reading I may not read it at all.

According to the podcast, the only good way to improve reading speed is… wait for it:

To read faster, concentrate on reading slower, and read more often.

All those techniques they teach you in rapid reading courses are gimmicky: you just trade comprehension for speed. Speed reading doesn’t look that attractive when you know that… I’ve put the book at the bottom of my reading pile: I’ll probably never read it.

My biggest problem when reading is distraction. I read a few paragraphs, I get bored, and I look away for a few minutes. When I come back to the book, I’m out of it, and I have to re-read those paragraphs to get back into it. When that happens, I estimate it takes me triple the time to read and comprehend that chunk of text.

Meditation is supposed to help with focus, but I think the thing I need most is sleep. I’m still pretty bad at going to bed at a reasonable time.

Reminder for later: How to manage FreeBSD ports.

FreeBSD handbook:

http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/ports.html

Man page:

http://www.freebsd.org/cgi/man.cgi?query=ports&sektion=7

Configure port:

$ make config
$ make rmconfig  # Delete old config
$ make config-recursive  # Configure port & all its dependencies
$ make config-conditional  # Configure port & deps without options

I love Nick Johnson’s Damn Cool Algorithms series, where he writes about new or unusual algorithms. I just finished the post about homomorphic hashing. It’s a cool idea, but it’s based on modular arithmetic like RSA, which is rather slow even on modern computers. I wonder if an algorithm based on elliptic curve cryptography would be more practical.

Idea for later: a web-based Carcassonne-like game with a bunch of ‘credible’ AIs playing it to get it started. This would ‘solve’ the chicken-and-egg problem that most multiplayer games have: to attract users you need to already have a bunch of them. The challenge would be to make an AI good enough to be mildly challenging to a human player.

Apparently Disqus –an online commenting service for websites– unilaterally decided to put ads on their customers’ websites:

http://jacquesmattheij.com/disqus-bait-and-switch-now-with-ads

I liked Disqus, but this is a bad move on their part. I’m not going to use their service after that…

Note to self: apparently boiled eggs are easier to peel when you add baking soda to the water. I should try that.

The W3C doesn’t seem to be a big fan of Twitter. From the HTML5 spec:

The following extract shows how a messaging client’s text entry could be arbitrarily restricted to a fixed number of characters, thus forcing any conversation through this medium to be terse and discouraging intelligent discourse.

<label>What are you doing? <input name=status maxlength=140></label>

On asserts

It’s common to see C/C++ projects disable asserts when building releases. The book Cryptography Engineering argues that it’s a mistake: production code is exactly the place where assertions are most needed. That’s where things should never go wrong, and if they do we shouldn’t sweep the problem under the rug.

Patrick Wyatt, an ex-Blizzard developer who worked on the early Warcraft, Diablo, and StarCraft games, came to the same conclusion after working on Guild Wars: it’s OK to “waste” a little bit of CPU to make sure production code runs correctly.

Assertions aren’t that expensive, and we really shouldn’t remove them in production. These days speed is rarely an issue, while correctness always is.

Do-It-Yourself Dropbox based on Mercurial:

#!/bin/sh

set -o errexit  # Exit as soon as a command returns an error

hg pull --update
hg commit --addremove --message "Update from $(hostname)" "$@"
hg push

How to use it:

$ hg clone <remote repo> ./shared
$ cd shared
$ cp ..../sync.sh .  # sync.sh is the script above
$ touch file1 file2
$ ./sync.sh  # This will add file1 & file2
$ rm file2
$ ./sync.sh  # This will delete file2
$ touch file3 file4
$ ./sync.sh file3  # This will add file3 but not file4

I also have a script update.sh that doesn’t synchronize remotely:

#!/bin/sh

hg commit --addremove --message "Update from $(hostname)" "$@"

If you’re using an editor that writes temporary files in the directory, like Vim or Emacs, don’t forget to add the relevant regexes to the repository’s .hgignore:

\..*\.sw[op]
\.*~

If you have difficulties sleeping at night because of noise, or if you work in an open space and can’t focus for very long, you should give earplugs a try. They require a bit of adaptation, but after around 10 hours of use the initial discomfort almost vanishes.

I’ve tried 4 different types of foam earplugs over the past 10 years. Initially I used EAR foam earplugs for many years: the classic, and the neon yellow. The classics weren’t great: they’re a bit rough against the skin, which makes them rather difficult to wear at night and for long periods of time. The neon yellow were a bit softer and isolated better; I used them for 4 years after I ditched the classics.

6 months ago, I decided to try some new ones. I ordered 20 pairs of Howard Leight MAX & Hearos Ultimate Softness Series after reading reviews on the web (links below).

The Howard Leight MAX are great for work. They fit snugly, and isolate well: a notch above the EAR neon yellow in comfort and isolation. They aren’t that great for sleeping though: if you wear them for more than 2 hours, they start hurting your ear canal a bit. For sleeping, the Hearos Ultimate Softness are great. They don’t isolate as well as the others, but when you’re sleeping that isn’t as important; what matters is comfort, and the Ultimate Softness are the most comfortable earplugs I’ve ever tried. After a night of sleep your ears won’t hurt a bit. I’m planning to order 100 pairs: focus and sleep are 2 things I can’t afford to lose, and I need all the help I can get.

Review links:

New Year’s resolutions were trendy last week; I like to be fashionably late to parties… I won’t set ambitious goals for 2013, like winning an Olympic gold medal or having sex with twin Swedish super-models. I’ll go for small, manageable goals for the first 3 months:

Ideas for the rest of the year (1 per month):

Great blog post about Content centric networking or CCN:

http://apenwarr.ca/log/?m=201211#11

I often wonder what will come after the current “Internet”. CCN is a good candidate to replace the whole IP/TCP/HTTP stack or parts of it, and it can run on top of the existing layers (IP, TCP, etc…), unlike, say, IPv6.

Awesome isn’t awesome anymore. It’s overused and everybody knows it: as of today the first Google result for awesome is Urban Dictionary’s definition:

An overused adjective intended to denote something as “cool” or “great” but instead winds up meaning “lame.”

Instead of awesome, use great: Alexander the Great is better than Alexander the Awesome. The worst part of the word’s overuse: awesome is almost always used inappropriately. It comes from awe: a feeling of fear and wonder.

Article about memory corruption, and how to detect whether it’s a hardware problem or not, from antirez, the guy who started Redis:

http://antirez.com/news/43

I love his final idea: Get the kernel to check the memory incrementally.

I tried a few new recipes lately. Fall is a great time for some soup:

The white bean / chicken chili tasted OK, but I find it a bit long to cook.

I really like the black bean soup: it’s faster to make and tastes better. I spiced it up with some red Thai chili in the relish. I’m not big on spicy food, but most soups are bland, and they often need a bit more punch.

Japanese toilets:

http://priceonomics.com/toilets/#japanese

I have to try one.

Procrastination: I’ve finally filled out the paperwork for a new savings account. I got the form in April, looked at it, put it on a shelf, and left it there for more than 6 months. I didn’t forget about the form; it just sat at the back of my mind during those 6 months. It took me literally 2 minutes to fill it out. Now I have to send the letter, and if I don’t do it right away I may leave it on my desk for weeks.

I’m like everybody: an everyday-life underachiever.

Frank and Oak is an online menswear maker. They release a new collection every month, and sell it exclusively on their website. They target guys with slimmer bodies. I ordered a shirt and a blazer; both were quickly shipped and delivered.

The shirt was a good surprise: it fits well out of the box. I’m a skinny guy; the shirts that fit me best are the ultra-slim 1MX from Express. Frank and Oak’s shirt isn’t as snug, but it’s still one of the best-fitting shirts I’ve had. For $45 you get a decent shirt, and with the money you save you can get it tailored just right.

The blazer is okay. It’s a little bit narrow around the shoulders, but overall it’s pretty good considering it was only $50. It took me a while to realize the side-pockets were sewn shut; they looked like those fake pockets you get on low-end blazers.

I’ll probably order more from them.

I used to have a Wrap –a tiny x86 computer– as a router. It wasn’t doing much routing though, since it only had my desktop connected to it. I messed it up a year ago while flashing the firmware, broke it, and never managed to get it working again.

I just ordered an Alix 2d13 as a replacement. It’s a nice upgrade, with a USB port and an IDE connector. I’m planning to install OpenBSD 5.2 on it: it will be released tomorrow, right before I get the new hardware.

It’s an expensive toy –$300+ not including shipping– but I get an open platform I can hack and play with. I tried to use an old wireless router with OpenWRT, but the wireless signal was pretty bad.

Creating a queue in MySQL isn’t a great idea, as highlighted in this article: 5 subtle ways you’re using MySQL as a queue, and why it’ll bite you. Yet it’s possible to create a relatively efficient queue as long as you avoid SELECT FOR UPDATE. I had to create one for work a little while ago.

Here’s the schema for such a queue:

CREATE TABLE queue (
        id INTEGER PRIMARY KEY,
        available BOOLEAN NOT NULL DEFAULT TRUE
        ...
);

The queue table is only used to lock items, and mark them as done. You can store data in the queue table, but I’d recommend storing it in a separate table to keep the queue table small.

To lock the next item in the queue:

UPDATE queue SET id = @last_queue_id := id, available = FALSE
    WHERE available = TRUE ORDER BY id LIMIT 1

The key part is id = @last_queue_id := id: this will mark the next item with available = FALSE and set the user variable @last_queue_id to its ID. You can then get it with:

SELECT @last_queue_id

Once you’re done with the item, you delete it from the queue:

DELETE FROM queue WHERE id = @last_queue_id AND available = FALSE

The available = FALSE clause isn’t necessary, but I like to keep it just to be extra safe.
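To make the whole flow concrete, here’s a minimal consumer sketch with MySQLdb; the connection parameters are placeholders, and I reset @last_queue_id first because it would otherwise keep its old value when the queue is empty:

import MySQLdb

db = MySQLdb.connect(host='localhost', user='worker', db='app')

def pop_item():
    c = db.cursor()
    c.execute("SET @last_queue_id = NULL")  # reset: stays NULL if nothing matches
    c.execute("UPDATE queue SET id = @last_queue_id := id, available = FALSE"
              " WHERE available = TRUE ORDER BY id LIMIT 1")
    c.execute("SELECT @last_queue_id")
    (item_id,) = c.fetchone()
    db.commit()
    return item_id  # None when the queue was empty

def finish_item(item_id):
    c = db.cursor()
    c.execute("DELETE FROM queue WHERE id = %s AND available = FALSE", (item_id,))
    db.commit()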

Vancouver is expensive, save money on booze: http://www.vancitydrinkspecials.ca/

Last night I ironed 5 shirts & 1 polo in less than 1 hour, following the instructions from the videos I posted yesterday. That’s pretty good considering I hadn’t ironed in 5+ years.

That’d work out to 1h30m weekly if I include pants. That’s too long; I’ll try to lower it to 45m with the same “quality”.

My life is very exciting.

I try to look sharper these days; that’s why I started regularly ironing shirts and pants again. I used to iron weekly, but stopped after I moved to Vancouver. Back then I never really tried to improve my ironing technique: it was a mindless chore not worth perfecting.

That was a mistake. I should try to perfect my ironing technique precisely BECAUSE ironing is boring and time-consuming.

There are quite a few videos on how to iron clothes on YouTube. These 2 are the best I’ve found so far:

Impressive video presentation of acme(1), a “text editor” written by Rob Pike. If you saw the video introducing Xiki, it may look familiar: many of Xiki’s ideas are already in Acme. The idea that text IS the interface is pushed quite far in Acme.

I’m a Vim user: I like to drive my computer with the keyboard only. Acme takes a completely different approach. The mouse is used everywhere, and it does more than in regular programs: button 2 executes the selected command, for example. I’ve never used Acme, but I’m very tempted to look into it. My main grievance with Vim is that it’s difficult to script: Vimscript is yet another language you have to learn, and Vim’s programming interface seems ad-hoc and looks difficult to learn and use.

Acme seems to be easier to interact with programmatically. It may very well be my next text editor.

I posted on scratchpad a TED talk about aging, and how living past 100 years may be common in a not-so-distant future. I wasn’t convinced, but the talk was cool and informative nonetheless.

It turns out that some guy named Edward Marks did his homework and looked at life expectancy numbers, and where we’re headed if we keep going at the same rate. It looks like the dream that average people could live past 100 years is indeed just a dream. We die later, but progress is slowing down.

Cooking tip: apparently lemon is a cooking silver bullet

Steven Johnson is the author of 2 books about creativity: Where Good Ideas Come From, and The Invention of Air. He wrote an article about what he calls a Spark file, which is just a list of ideas.

He writes down every idea he has in this Spark file, and once in a while revisits it to see if there’s anything of value, or ideas he could combine.

That was more or less what my scratchpad was for: a blog & note-taking kind of thing. I need to post more ideas here.

Another great post by Rob Pike about UTF-8’s creation.

A while ago I posted links to a few Python projects worth checking out.

I tried them all for a bit, and I really like PyRepl: an alternative to the standard Python interpreter’s readline interface.

It crashes less than bpython, it’s lighter than IPython, and it’s written 100% in Python. What’s not to like? Well, it crashes from time to time when I use the arrow keys, kind of like bpython. I haven’t managed to reproduce the problem consistently yet, mostly because it’s rare.

You should use PyRepl for its better completion and coloring; it’s a pretty good alternative to the bloated IPython.

Another great talk from Rich Hickey: Deconstructing the Database

He talks about Datomic, a new database with an innovative architecture.

I’m making pizzas for dinner these days.

I used to coat the top of my pizzas with a little bit of generic tomato sauce. I saw a pizza sauce recipe on my favorite cooking blog last week, and decided to try it. It changes everything: because the previous sauces weren’t thick enough, I had to put a ton of toppings on my pizzas. This sauce is thicker, and much tastier than what I used before. Result: fewer toppings, less work, and the pizzas taste better.

The Economist had a good editorial about Money last week.

I found it fascinating that money gets routinely reinvented by people who don’t have access to “regular” currency. Bartering creates such a big transaction cost that we’re almost hardwired to come up with something better.

Another surprising part is that metal money (coins) is likely a state invention. It turns out that the private sector naturally uses things with real “value”, like rice.

I’ve tried DuckDuckGo a few times since it launched. Every time I wanted to like it, but it had a few shortcomings: the results weren’t quite as good as Google’s, or it was a little bit too unfamiliar.

I tried it once again this week, and I think it’s now good enough for me to switch permanently. Google is full of spam right now; it looks like Google refuses to ban content farms like Demand Media, which operates eHow & Cracked. This switch to DuckDuckGo has more to do with Google Search deteriorating than with DuckDuckGo giving me something new.

Yesterday I tried this recipe:

http://foodwishes.blogspot.ca/2012/07/veracruz-style-red-snapper-new-take-on.html

It was easy enough to prepare, but I made a big mistake: I didn’t taste the pepper before integrating it into the vegetable mixture. It was spicy, too spicy. Otherwise I recommend this recipe.

Side note: Fish is easy to cook, healthy, tasty, and often better for the environment than meat.

Simplicity Matters by Rich Hickey is a great talk about simplicity.

I’m glad I took 40 minutes to watch it; I learned and understood so much.

I started programming in C 15 years ago, but I don’t consider myself an expert or anything like that: there are just too many aspects of C I’m not comfortable with. One of them is the preprocessor.

Something I never fully understood is the STRINGIFY macro hack. The preprocessor has a “#” operator that turns its argument into a string.

#define STRINGIFY(x) #x
STRINGIFY(foo)
#define HELLO(x) "Hello " #x
HELLO(world)

STRINGIFY(foo) is expanded to “foo”, HELLO(world) is expanded to “Hello ” “world”. So far, so good, but when you try to stringify another macro it doesn’t work as expected:

#define STRINGIFY(x) #x
#define FOO bar
STRINGIFY(FOO) /* This will output ... "FOO", not "bar" */
STRINGIFY(__LINE__) /* This will output "__LINE__", not "4" */

If you look for a solution on the interweb, the answer is usually to use another auxiliary macro, and it “magically” works:

#define STRINGIFY(x) STRINGIFY2(x)
#define STRINGIFY2(x) #x
#define FOO bar
STRINGIFY(FOO) /* This will output "bar", why?!?!? */

Why does that work? Because the preprocessor doesn’t expand a macro’s arguments when they are used with the # operator, but it does expand them otherwise, and the result of an expansion gets rescanned. Here’s what happens:

  1. STRINGIFY(FOO) is expanded to STRINGIFY2(FOO) because of #define STRINGIFY(x) STRINGIFY2(x)
  2. FOO is expanded to bar using #define FOO bar, we now have STRINGIFY2(bar)
  3. STRINGIFY2(bar) is expanded to “bar”

Rob Pike is a pretty good photographer.

I’m excited by 2 new languages at the moment: Go and Rust.

Rust is still pretty much in development, but Go is already stable. There are a few introductions to Go floating around, but the latest one by Russ Cox just shows you the good stuff:

A tour of Go

I’m trying to get over a Diet Coke addiction. I drink 3 or 4 cans of Diet Coke each day. There’s salt and caffeine in Diet Coke to make people pee, feel thirsty, and drink more.

Going to the bathroom every 2 hours is not pleasant, and may have harmful health effects in the long run.

There’s 45mg of caffeine, and 35mg of salt in a Diet Coke can. A can of Diet Pepsi contains only 36mg of caffeine, and the same amount of salt. From now on I’ll drink Diet Pepsi: 9mg isn’t a big reduction of caffeine, but it’s a step in the right direction.

I like cycling caps: I like the snug fit, and the short visor pointing down protects well against the sun, the wind, and the rain. I have a wool one from Walz Caps; I’ve had it for 3 months now and it instantly became my favorite hat. I’ll get rid of my regular caps, and get 2 more of those.

Cycling caps are kind of hipster hats, but I can deal with that.

Try one once, you may like it ;)

Finding pants that fit used to be a problem for me. It takes a while to find a pair that fits just right: each brand has a slightly different cut, and the right combination of hip / inseam size is not always available. I usually ended up buying pants that were slightly too long or too short. There’s an easy solution to this problem: tailors. For $10 or less you can get your pants shortened just right.

I’m embarrassed I never went to a tailor before: in retrospect it seems like such an obvious thing to do.

Captain Clueless, signing-off.

Small Python script to calculate a file’s entropy. It reads its input from stdin.

#!/usr/bin/env python

from collections import defaultdict
from math import log

def entropy(input):
    alphabet = defaultdict(int)
    total = 0 # How many bytes we have in the buffer
    buf = True
    while buf:
        buf = input.read(1024 * 64)

        total += len(buf)
        for c in buf:
            alphabet[c] += 1

    if total == 0 or len(alphabet) < 2:
        return 0.0

    entropy = 0.0
    for c in alphabet.values():
        x = float(c) / total
        if x != 0:
            entropy += x * log(x, len(alphabet))
    return -entropy

if __name__ == '__main__':
    import sys
    print entropy(sys.stdin)
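Usage, assuming you saved the script as entropy.py (a name I just made up); compressed or encrypted data should score close to 1, repetitive text much lower:

$ python entropy.py < archive.gz
$ python entropy.py < plain.txt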

If you’re looking for narcissistic people, there’s a new site cataloging them:

http://about.me/

Cool infographic detailing bike parts.

I planned to write a funny essay about how I’m a nerd who can’t dress, how I finally realized the obvious, and how I would take better care of my look from now on. It turns out that writing funny essays takes a long time, and it wouldn’t have been that funny anyway.

In May I decided to get into that “style” thing. I grew a light beard back in March, and it altered the way people perceive me more than I expected. No need to show ID at the liquor store anymore, at 32 it was about time…

Others look at you a lot, way more than they listen to you. Looks are a quick and relatively reliable way to size someone up. A guy wearing a nicely cut suit isn’t a homeless guy to anyone; someone wearing torn skinny jeans and a dirty ‘Punk not dead’ T-shirt is probably not a banker.

I want to get better at looking good, and I want to talk about it. I’ll post more later.

Too few updates lately… I’ve been pretty busy at work. Here’s something I learned today: in C, when I wanted to initialize an array to all zeros, I used memset. But there’s a simpler way:

int array[3] = {0, 0, 0};

OK, that’s nice. But what if the array is really big? There’s a shorter version with the same effect (note that empty braces are a GCC extension; the portable spelling is {0}):

int array[3] = {};

When you omit elements in an initializer, they automatically default to the type’s zero value. If you want to initialize an array without specifying all the elements, you can do something like this:

int array[10] = {1, [3] = 3, [8] = 2};

This will produce an array like that:

[1, 0, 0, 3, 0, 0, 0, 0, 2, 0]

See GCC’s doc about Designated Initializers.

I bought a new bike a few months ago: a fixed gear. After a week of getting used to it, it feels great, like being connected to the ground. It’s easy to adjust your speed, and most of the time there’s no need for brakes: you can slow down with just the pedals. I’m going to keep the brakes on for a little while though, better safe than sorry ;)

I still have my old bike. It’s going to be my “rainy days” bike. I plan on converting it to fixed gear sometime after this summer.

I’ve been looking for a cheap track frame to replace my old road frame. I found this one for $169.

libtomcrypt is a pleasure to work with: the code is clean, readable, and things are well laid-out. One of the few things I disliked was how ECC keys are represented internally:

typedef struct {
    void *x, *y, *z;
} ecc_point;

/** An ECC key */
typedef struct {
    /** Type of key, PK_PRIVATE or PK_PUBLIC */
    int type;

    [...]

    /** The public key */
    ecc_point pubkey;

    /** The private key */
    void *k;
} ecc_key;

If type is PK_PUBLIC, the private component of the key should probably be NULL. I think this is suboptimal and potentially confusing. It seems to me that the following would be better:

struct ecc_public_key {
    void *x, *y, *z;
};

struct ecc_private_key {
    struct ecc_public_key public;
    void* k;
};

Introducing 2 different types for public and private keys allows us to be more specific with our type requirements. For example, the function ecc_shared_secret looks like this:

int  ecc_shared_secret(ecc_key *private_key, ecc_key *public_key,
                       unsigned char *out, unsigned long *outlen);

A new API could enforce the key’s type more easily:

int  ecc_shared_secret(const struct ecc_private_key *private_key,
                       const struct ecc_public_key *public_key,
                       unsigned char *out, unsigned long *outlen);

This way you can get rid of checks at the beginning of some functions, like this one:

/* type valid? */
if (private_key->type != PK_PRIVATE) {
   return CRYPT_PK_NOT_PRIVATE;
}

Now the key type is explicit: private keys can only be struct ecc_private_key, public ones struct ecc_public_key. If you want a function to accept both key types, you can do something like this:

int ecc_some_function(struct ecc_public_key* public, void* private, ...);

And pass the private component of the key manually.

Designing a good API is hard; even the little choices can be difficult. Let’s take a function to decrypt AES blocks, for example: it consumes a buffer 16 bytes at a time. Here’s what such a function could look like:

void decrypt(const void* input, size_t nbytes);

(There’s no output parameter, we’re just looking at the input here)

input is a pointer to the buffer we’re working with, nbytes is how many bytes to read from the buffer.

The function consumes blocks of 16 bytes, so what happens when nbytes is not a multiple of 16? Should we silently ignore the few extra bytes? Should we have an assert(nbytes % 16 == 0)? Maybe we could specify how many blocks to consume? But then the API’s user would have to remember to divide the buffer size by 16.

I don’t know what the good answer is there.

Reading or sending data on a TCP socket looks simple, but it can be tricky. read(2) & write(2) don’t have to process all the data they’re given. If you call read on a socket requesting a billion bytes, you’ll probably get less.

If you need to read a set number of bytes, you’ll have to repeat calls until you have all the data you want. Here’s an example of such a loop in Python:

def readall(sock, count):
    r = bytes()
    while count:
        x = sock.recv(count)

        if x == '': # End of file
            raise ...

        r += x
        count -= len(x)
    return r

In Python, sockets have a sendall method which does more or less that, but there’s no recvall. There’s another way: socket.makefile() creates a file-like object for the socket. With this file-like object there’s no need to loop to get the whole buffer back:

sock = socket.create_connection(...)

f = sock.makefile()
x = f.read(1234) # Will return a 1234 characters long string

You must make sure buffering is enabled for this to work.

PS: note that calling read/write on a socket is basically the same as calling recv/send without flags.

Reminder to self.

The Open Group Base Specifications Issue 7, also known as IEEE Std 1003.1-2008, also known as POSIX.1, is publicly available here:

http://pubs.opengroup.org/onlinepubs/9699919799/

tmux, the terminal multiplexer from heaven, can copy-paste!

Here’s how to use it:

  1. C-b [ starts copy mode
  2. Move around and stop on the character where you want the selection to start
  3. Press Space to start the selection
  4. Press Enter to finish the selection
  5. C-b ] pastes the copied text

I finished the Kobold Guide to Board Game Design last night. I’ve never tried to create or modify a board game, but I think what makes board games great also applies to other kinds of games, especially online browser games.

The book is a collection of articles by various game designers sharing their ideas and stories. The authors include: Richard Garfield (Magic: The Gathering), Steve Jackson (Munchkin), and Dale Yu (Dominion). The book is easy to read and mostly jargon-free.

I expected the book’s content to be more analytical than it is. Most articles turned out to be practical and concrete, and that’s great. I wish there was a little bit more ‘theory’, but board game designers don’t seem to use math & statistics to balance and design their games.

Great read overall: highly recommended, five stars, and all.

Rob Pike: The byte order fallacy:

http://commandcenter.blogspot.ca/2012/04/byte-order-fallacy.html

http://thread.gmane.org/gmane.linux.kernel/1126136

Don’t just make random changes. There really are only two acceptable models of development: “think and analyze” or “years and years of testing on thousands of machines”. Those two really do work.

If you can ignore the irritating InfoQ video/presentation player, this Rich Hickey interview is full of fresh and innovative ideas:

http://www.infoq.com/interviews/hickey-datomic

John Carmack posted on Twitter:

I can send an IP packet to Europe faster than I can send a pixel to the screen. How f’d up is that?

Somebody asked about it on Super User, and John Carmack went on to explain why that is. He didn’t just cross-reference a few sources to come to his conclusion: he measured, and he checked his assumptions with experiments. He acted like a scientist, adapting his mental model to the world.

While hacking on my dwm I noticed this line in dwm.c:

while(running && !XNextEvent(dpy, &ev))

Notice the ‘not’ before XNextEvent. I wondered why it’s there: as far as I can remember, XNextEvent isn’t supposed to return anything meaningful. A quick look at the manual helped, but didn’t solve the mystery:

int XNextEvent(Display *display, XEvent *event_return);

The function returns an int, but there’s no explanation of what this return value is. In X.org’s libX11 source code, XNextEvent always returns 0.

I imagine that somewhere there’s a version of Xlib that returns a non-zero value when there’s an error, and maybe a doc somewhere explaining what it means, but I couldn’t find it. Or maybe it’s simply undefined behavior, a mistake…

When I connect to a remote host via SSH, I like to start a new tmux session or re-attach to an existing one. Here’s how I used to do it:

if tmux has-session
then
    tmux attach-session
else
    tmux new-session
fi

There’s something much simpler: tmux attach || tmux new
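Combined with ssh -t to force terminal allocation, it fits in the connection command itself (replace host with the machine you’re connecting to):

$ ssh -t host 'tmux attach || tmux new'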

Things that would be nice to fix with URLs.

  1. Drop the //. http:google.com is shorter and easier to read. Tim Berners-Lee regretted putting those 2 extra characters in when he started the web.

  2. Reverse domain names. Example: in sub.domain.com, com is the top level, and sub is the bottom level: domain names put the top level last. Paths in URLs do the opposite, the top level comes first:

    http://sub.domain.com/blog/2012/04/18/
           <--<------<---+---->---->-->-->
    

With those principles in mind, new URLs would look like this:

http:com.domain.sub/blog/2012/04/18/
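Out of curiosity, here’s a toy Python 2 sketch of the transformation; urlsplit does the parsing, and the function name is made up:

from urlparse import urlsplit

def new_style(url):
    # Reverse the dotted host name, then glue scheme, host, and path back
    parts = urlsplit(url)
    host = '.'.join(reversed(parts.hostname.split('.')))
    return '%s:%s%s' % (parts.scheme, host, parts.path)

print new_style('http://sub.domain.com/blog/2012/04/18/')
# -> http:com.domain.sub/blog/2012/04/18/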

Create chaos in your C program with just 1 line:

#define struct union

Aubrey de Grey: Why we age and how we can avoid it

I don’t buy it: pushing death back by 30+ years sounds too good to be true. It’s nonetheless one of the best TED talks I’ve seen.

Are Bananas Really as Bad for you as Cookies?

I’ll certainly not switch from bananas to cookies for my mid-afternoon snack, but switching from bananas to apples might be good. Starting today :)

A long, but good and detailed article about RPython: Fast Enough VMs in Fast Enough Time.

Alice is a librarian: she has a house full of books. Bob likes books: he wants to read as many as he can, and is willing to pay for it. Alice –being a savvy businesswoman– wants to open her library to the public for a fee. She wants to maximize profits; Bob wants to maximize the number of books he gets, up to the limit of what he can read, while minimizing the money he spends.

The books are a limited resource: if Bob takes all the copies out of the library, Alice can’t serve more clients. She needs to manage her library to make sure customers don’t abuse the system.

We’ll consider everything else to be equal. All books have the same value, and they all take the same time to read.

Let’s consider 2 different business models:

All-you-can-read

Alice gets $20 from Bob, and he gets free access to the library for the rest of the month.

To increase her profits Alice needs more customers; that’s the only way, since she can’t charge Bob more. Since the number of books is limited, more customers means fewer books per customer. Every time a customer takes a copy out of the library, it reduces Alice’s potential profit. To maximize her profit Alice should minimize her resource usage: limit the number of customers, or how many books a customer can take each month. Bob tries to read as many as he can: one extra book doesn’t cost him anything, and without a quota he can get more than his fair share.

Alice’s goal is not aligned with Bob’s: Alice wants to reduce the number of books Bob reads; Bob wants to maximize it.

Pay-per-book

With a price of $1 per book, Alice doesn’t really care how many customers she has: she wants to rent out as many books as possible. 1 or 100 customers doesn’t make a difference to her bottom line.

Here our 2 characters’ goals are aligned: Alice wants Bob to read as much as he can, and Bob can choose how much he reads and spends.

So What?

It’s clear that the pay-per-use formula works better than all-you-can-read.

So why do we use the worse model for our Internet access and phones? Why don’t we try to align the goals of carriers and customers?

Getting connected is not all about bandwidth; there are fixed costs that can’t be easily covered by a pay-per-use model. A price per gigabyte with a minimum price per month could lower our bills while holding back the freeloaders.

That’s not going to happen anytime soon though. Many have an interest in keeping the old fixed price per month going. Internet and phone carriers like it, because it’s an easy way to maintain and increase their profits: customers pick plans that exceed their needs, and pay more than they should as a result. The price of bandwidth tends to fall steadily over time, but carriers don’t lower their prices very often. Freeloaders also want to keep the system the way it is: customers who use most of their quota are the ones getting the best value from their broadband access.

If we want broadband to be more ubiquitous and cheaper, we need to treat it as a real commodity, like water or electricity.

I’m reading http://0x10c.com/doc/dcpu-16.txt, the DCPU-16 specification: a small computer for Notch’s next game. Notch is the guy who built Minecraft.

Instructions are simple 16-bit words with the following format (bits):

bbbbbbaaaaaaoooo

oooo is a 4-bit op code; a and b are two 6-bit operands. The instruction format is a little more complicated than that, but that’s roughly it.
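Here’s a quick Python sketch of how the fields unpack, from my reading of the format above:

def decode(word):
    # bbbbbbaaaaaaoooo: o in the low 4 bits, a and b in the two
    # 6-bit fields above it
    o = word & 0xf
    a = (word >> 4) & 0x3f
    b = (word >> 10) & 0x3f
    return o, a, b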

To me this looks like a pretty good candidate for a perfect hash function like the ones created by Gperf. What kind of tree do we have?

The 4-bit value spawns 16 branches. Each 6-bit value spawns –according to my quick glance at the spec– 12 branches. That’s 16 * 12 * 12 = 2304 combinations, quite a bit more than I expected. Gperf might not be such a good idea after all.

Cooperation and Engagement: What can board games teach us?

Google just announced Account Activity, a new feature that lets you see what you’ve done on its services every month. Big companies like Google and Facebook know how valuable personal data is: it helps get more customers, and it can be sold more or less directly to advertisers.

It’s also useful to their customers. I suspect my personal data is most valuable to me: I’m sure I can get more use out of it than Google or advertisers can. This extra data is another reason to use Google’s services. Well played Google, well played.

Linux 3.3: Finally a little good news for bufferbloat

From 7 Years of YouTube Scalability Lessons in 30 Minutes:

You have to measure. Vitess swapped out one its protocols for an HTTP implementation. Even though it was in C it was slow. So they ripped out HTTP and did a direct socket call using python and that was 8% cheaper on global CPU. The enveloping for HTTP is really expensive.

Not a surprise for me, but I guess it would be for most people. HTTP is not a good general-purpose protocol; it’s not even good at doing what it was designed for. I try to avoid HTTP like the plague, but it’s difficult to go against the herd of web-developers who think “HTTP is th3 4w3s0me!”

I haven’t maintained my bike for a year; last week-end it got a long-overdue tune-up.

Saturday I went to the Pedal Depot, a small bike workshop where you can rent stands and tools to fix your bike, with friendly employees to help you when you’re stuck or have a question. They also sell second-hand parts.

There’s another, bigger workshop nearby on Main St. I went there 3 years ago and was able to buy and assemble my current bike for around $200. It took me almost 10 hours over 2 days to get it all done. It was long, but it was fun, and I learned a lot.

The Green Biscuit seems like a good way to improve ice hockey stick handling during summer. I play roller hockey with a ball in the summer, and when I go back to ice hockey it takes a few weeks to get used to a regular puck again.

Something to look at again in a few months.

I realized this week-end that I had an artificial scarcity problem with bus tickets.

My old process was:

  1. Realize I’m out of tickets
  2. Go to the store and buy 1 booklet of 10
  3. Repeat after a few weeks.

The new one is:

  1. Realize I’m out of tickets
  2. Go to the store and buy 10 booklets
  3. Chill for 6 months because I now have 100 tickets

I don’t know if the post about nonces was really clear.

To encrypt the counter, you need to put it in a buffer the size of a block. For example, to encrypt the counter 1234567890 you’d have something like this (bytes in big-endian order):

0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x00 0x00 0x00 0x00 0x49 0x96 0x02 0xd2
                    ^^^^^^^^^^^^^^^^^^^ 1234567890

You encrypt that block, then use the resulting encrypted block as the initialization vector to decrypt the first block of the encrypted message.
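Here’s a sketch of the trick with PyCrypto, assuming AES in CBC mode and the big-endian layout above; the key is a dummy:

import struct
from Crypto.Cipher import AES

KEY = b'0123456789abcdef'  # dummy 16-byte key

def iv_from_counter(counter):
    # Zero-pad the counter to a full 16-byte block, counter last
    block = struct.pack('>QQ', 0, counter)
    # One raw block-cipher operation: ECB on a single block
    return AES.new(KEY, AES.MODE_ECB).encrypt(block)

iv = iv_from_counter(1234567890)
cipher = AES.new(KEY, AES.MODE_CBC, iv)
# cipher.decrypt(first_block) decrypts the 1st block of the message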

I wanted to use only unixy tools to build scratchpad, but it turned out to be cumbersome. The HTML and Atom generators are written in Python: 103 lines of code as of today.

I tried to use rc and sh, but they were quite inconvenient. After a while I decided to fall back to Python.

These days I’m reading Cryptography Engineering. I just started the part about block cipher modes; that’s where I learned about nonces.

Using nonces with block ciphers is a good way to minimize the space taken by initialization vectors, or IVs. Instead of sending an additional block with the IV, you associate a number (counter) with each message. The counter doesn’t necessarily need to be transmitted with each message; it can be implicit: for example, the first message could have 0 as its nonce, the second 1, etc…

Then you encrypt the counter with the raw block cipher and use the result as the IV for the 1st block. Simple and elegant; I really like this crypto ‘trick’.

From what I’ve read so far, I highly recommend Cryptography Engineering. It’s a pleasure to read, and you might learn a thing or two.

I just played another game of Dominion. I’m just getting started with it, but I can already understand why it’s such a popular game.

Looking forward to playing more Dominion! :)

I usually clone Virtualenv’s repository to use it. Here’s a quicker solution:

curl -s https://raw.github.com/pypa/virtualenv/master/virtualenv.py | python - .env

Cool Python projects worth checking out:

Possible framesets for a future fixed gear:

The Levi’s 511 Commuter looks nice:

I don’t like it when Mercurial opens vimdiff or other external tools during merges. So I added this to my hgrc:

[ui]
merge = internal:merge

I believed that passphrases were pretty strong. I was probably wrong:

by our metrics, even 5-word phrases would be highly insecure against offline attacks, with fewer than 30 bits of work compromising over half of users.

http://www.lightbluetouchpaper.org/2012/03/07/some-evidence-on-multi-word-passphrases/

I just played Dominion with co-workers. I think it was the 1st time I played a 4-player game of Dominion. It was surprisingly fast: less than 40 minutes including setup time.

Looking forward to playing some more Dominion.

I’ve added Google Analytics’ tracker to Scratchpad, because I can.

Scratchpad’s database is a simple log file like this:

2011-12-30T21:22:48-08:00

First post
^L
2011-12-31T01:42:24-08:00

Second post
^L

Just the time, the content, and ^L –the ASCII character 12, or form feed– as separator.

This gives me ‘super easy’ back-ups. On another server’s crontab I just add:

curl http://henry.precheur.org/scratchpad/log > backup/$(date)
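Parsing the log back takes a few lines of Python; a sketch, assuming the file is named log and records are separated by a form feed on its own line, like above:

entries = []
for record in open('log').read().split('\f\n'):
    if not record.strip():
        continue  # skip the empty chunk after the last separator
    # Each record is a timestamp, a blank line, then the content
    time, _, content = record.partition('\n\n')
    entries.append((time, content))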

People get that long passwords with many different characters are safer. Short password = bad, long password = good.

Humans are good at making analogies, so a lot of us think that long keys are better than short ones. A crypto-system with 1024-bit keys is safer than one with 256-bit keys, right?

YES! Yes, if “everything else is equal”. If you know a little bit of cryptography, you know there’s a lot more to a crypto-system’s security than its key length. Passwords are much weaker than keys: most passwords are probably weaker than a random 32-bit key.

Marketers use our –correct– assumption that more bits in a key add security to sell us insecure products with very long keys. Something true makes us believe something wrong. 10 years ago I believed that long keys alone significantly improved security. I should be more wary in all those areas I don’t know much about; if I make an analogy with cryptography: my lack of knowledge will make me believe wrong things :)

Quick’n dirty shell script to serve a single file over HTTP:

#!/bin/sh

trap 'echo stop; exit' SIGHUP SIGINT SIGTERM

while true
do
    (echo -e "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r"; cat "$@") | \
        nc -l 8000 > /dev/null
done

You’ll need netcat (nc) to run it; it listens on port 8000.

I’m writing the HTML and Atom generation scripts in Python. I tried to use Awk, but it was kind of inconvenient. Lua seemed promising, but its lack of the idioms I’m used to from Python made it a bit frustrating.

I wanted to finish a first version of Scratchpad before the end of January; it’s late February now: I’d better get on with what I know. So Python it is.

Why are score pages so darn awful?

I follow the NHL, and I haven’t found any score page that does a decent job of showing the information I want efficiently.

It might be a good project idea for later :)

/usr/bin shouldn’t exist, /usr was what /home is today, and today’s Unix hierarchy doesn’t make sense:

http://lists.busybox.net/pipermail/busybox/2010-December/074114.html

I just finished reading “The Development of the C Language” by Dennis Ritchie: a great read for every C programmer.

I find it remarkable that the early versions of C were developed and used in a very constrained environment. Memory and CPU time were scarce, the language had to be simple to implement. I’m convinced that constraints and limitations fuel creativity, not restrain it. It’s easier to find a great solution to a problem when the set of solutions is limited.

From “The Development of the C Language”:

Thompson’s PDP-7 assembler outdid even DEC’s in simplicity; it evaluated expressions and emitted the corresponding bits. There were no libraries, no loader or link editor: the entire source of a program was presented to the assembler, and the output file, with a fixed name, that emerged was directly executable. (This name, a.out, explains a bit of Unix etymology; it is the output of the assembler. Even after the system gained a linker and a means of specifying another name explicitly, it was retained as the default executable result of a compilation.)

I read Rust’s tutorial last night.

So far I’m pleased by its design choices. I like that everything is constant by default. If you want to modify something, you have to declare it mutable from the start.

The thing I’m not a big fan of is vectors. Vectors are Rust’s version of arrays; they are ‘uniquely’ allocated on the heap. I like to have arrays on the stack in C, and I’m not sure it’s OK to have ‘everything’ on the heap in practice.

I just looked at Rust from Mozilla. It’s more or less a Go-like language. After looking at it for 10 minutes, I think I like it better than Go.

Turns out that using multiple characters for RS is a GNU extension.

So my reverse example doesn’t work with other implementations of Awk. I should know better; every time I’ve used GNU versions of Unix programs I’ve run into portability problems.
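
If I ever need something portable, a quick Python sketch doing the same reversal (reading the log from standard input) would do:

import sys

# Same idea as the Awk version below: split on the form-feed
# separator, then print the records in reverse order.
records = sys.stdin.read().split('\f\n')
for record in reversed(records):
    if record:
        print(record + '\f')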

For scratchpad I need to reverse the order of the records in my log file. The tricky thing is that each record is one or more lines, and records are separated by a form feed or page break: “\f” or ^L.

Here’s my 1st version using getline. getline allows us to consume lines without exiting the current rule:

BEGIN { i = 0 }
{
      do {
              sort[i] = sort[i] $0 "\n";
              getline; # Where the magic happens
      } while ($0 != "\f");
      sort[i] = sort[i] $0;
      i += 1;
}
END {
      for (x = i - 1; x > -1; x--) {
             print sort[x];
      }
}

It worked well enough, but there’s something much simpler. In Awk the record separator can be any string, it doesn’t have to be a newline. We can change the record separator at runtime using the RS variable. This simplifies things:

BEGIN { RS="\f\n"; i = 0 }
{
      sort[++i] = $0
}
END {
      for (x = i; x > 0; x--) {
             print sort[x] "\f";
      }
}

Played 2 games today. Both were 3 Rax all-ins.

1st was against a Protoss on Shattered Temple. I fucked up: I lost an SCV and forgot to put one back to build the Barracks. I left right away; I don’t think it’s smart to stay in a game if you screw up early. Better to start over and get a ‘real’ game going.

2nd, against another Protoss, was a win. He didn’t manage to scout me before my 3 Rax completed. Easy win. 3 Rax all-in is pretty strong at low levels.

I’m learning rc. This one took me a while to figure out:

; for (i in 1 100 10) { echo $i } | sort
1
100
10

It turns out that this is not what I expected: I thought that the 3 numbers would be sorted. Here’s how rc interprets it:

for (i in 1 100 10) { echo $i | sort }

What I wanted was:

{ for (i in 1 100 10) echo $i } | sort

It took me a long time to figure that one out:

    $ awk '/\.\./ { print "2. " $0; next }
           /\./ { print "1. " $0; next }
           { print }' <<EOF
    > foo
    > .foo
    > ..foo
    > EOF
    foo
    1. .foo
    2. ..foo

I didn’t know about the next statement, which lets you jump to the next line without executing the other rules.

I forgot to mention the warning that Mercurial gives me:

warning: filename contains ':', which is reserved on Windows: 'scratchpad/2012-01-19T18:35:19-08:00'

Using RFC 3339 times as filenames was not a good idea.

I think I’m going to change the architecture of my scratchpad. Instead of a collection of files in Mercurial, I’ll use a simple log file. It’ll give me version control for ‘free’, but I’ll lose filename completion in the shell. On the other hand I won’t need Mercurial anymore, so that’s one less tool to worry about.

Not sure if that’ll work. I’ll write a prototype and see how it goes.

I played 2 games of Starcraft during my lunch break today. I lost against 2 silver players. I was way too passive during both games. I need to be more aggressive: I should go 2 Rax pressure every time.

3rd day of practice. I’ll try to play more tonight.

I finally did a biggish blog post today. It’s been a while since I last blogged. Time to get back into it seriously: 1 article per month. It doesn’t matter if it’s bad. Maybe I’ll take stuff from this scratchpad. That’s what it’s for ;-)

There’s going to be a Starcraft 2 tournament at work next month. I’m planning to participate, but I haven’t played Starcraft in a while. It’s time to get back into it.

1st thing: practice a set build order tonight and play a ladder game.

I lost my rear bike light today. I went to Canadian Tire to get a new one, and I was underwhelmed by the choice and the prices. A crappy-looking rear light was $9, batteries not included. I’m cheap, so I didn’t buy it.

I went to MEC instead. There, for the same price, I got a rear light and a small front light, batteries included. Friendly service was also included.

I should know better: specialist stores are almost always cheaper and/or have more choice than generalist ones.

After working at Image-Engine, I realized how important color temperature is. When you take a picture indoors and it looks all yellowish, that’s because the ambient color temperature is ‘low’. Most cheapish light bulbs have a color temperature around 3000K. When you are outside, the color temperature is usually between 4000K and 6500K.

Today I decided to replace the light bulb over my desk at home with a 5000K+ light. Since almost nobody knows about color temperature, it is mostly absent from the technical specifications of cheapish light bulbs, so it’s hard to know which one to buy. Luckily the Energy Star website has a complete list of its certified light bulbs, with their respective color temperatures.

I’ll probably get a Philips ‘Mini Twister Daylight Compact Fluorescent Bulb’; it is supposed to be ‘daylight’-like, with a color temperature of 6500K and good luminance.

Video is full of stuff like that:

http://mjg59.dreamwidth.org/8705.html

I’m about to write the HTML generator for my scratchpad. I’ll probably use rc and shell tools; there’s no need for something more complicated, like Python and Jinja for example.

Socks are a problem for me. I never have enough of them. I have plenty of T-shirts, but only 10 pairs of socks at any given time.

It is time to end this, and buy 20 pairs in one go!

http://matt.might.net/articles/artificial-scarcity/

I’ve just recorded my first test screencast. I’ll need some more practice before posting on YouTube…

Here’s the command I use under OpenBSD to record:

ffmpeg -f sndio -i rsnd/0 \
    -f x11grab -r 10 -s 1284x772 -i :0.0+1,16 \
    -acodec pcm_s16le \
    -vcodec libx264 -vpre lossless_ultrafast -threads 0 \
    -y "./$(date +%FT%T).mkv"

I wanted to generate all the HTML for the scratchpad using Python, but I’ll try to use rc instead.

I just wrote a small pipe editor inspired by vipe. It’s a short shell script:

#!/bin/sh

case $1 in
  -h|--help)
      cat <<EOF
Usage:
${0} [command] [arguments...]

${0} is a pipe editor. It reads its standard input, opens it in a text editor,
and writes the result to its standard output.

EXAMPLES

Edit a file and write it into another file:
  cat input | ${0} > output
or
  ${0} < input > output

Edit a file using gvim --nofork, and pipe the result into wc:
  ${0} gvim --nofork < input | wc

To call ${0} without any input, use /dev/null:
  ${0} < /dev/null
EOF
      exit
esac

args=$*

# 'function' is a bashism; this is the portable sh definition syntax
edit() {
  ${args:-${VISUAL:-${EDITOR:-vi}}} $*
}

tmp=`mktemp`
cat > $tmp
edit $tmp < /dev/tty > /dev/tty
cat < $tmp
rm $tmp

Yay, I’m finally done with the first version of my little scratchpad.

I’ll tell you some more later, when this is published ;-)

Let’s talk about the builtin fc in zsh. fc allows you to edit the last command in your editor. It might sound kind of pointless, but it makes the command line that much more powerful. You can have ‘real’ programs doing something useful in 1 minute, right from the shell.

This makes languages like Awk that much more interesting.