I started Scratchpad as a blog for my lesser writings. I wanted to publish
rough essays on it, and later turn the best of them into polished articles for
my main blog.
The statistics
I had a decent rate of publishing over the past decade: 221 posts; 40,888
words; and 282,533 characters. Unfortunately, I have not turned a single
post into a quality long-form article. Still, I am happy with how the experiment
turned out, and I am glad that I sustained a semi-regular pace of production
over a decade.
Publishing is the number one priority
When I write something, even if it’s a rough draft with half-baked ideas,
I try to publish it. It makes sense to put most of your work out there for
the world to see, even mediocre and flawed artifacts. We are bad at judging
and assessing our own creations; we have blind spots; we need others to help
us, guide us, and steer us towards our most promising musings. Exposing the
fruit of the creator’s labor to an audience is the only way to evaluate it.
Write, get your thoughts checked by the world, and see if somebody cares. Maybe
some of your readers will send feedback, criticism, or compliments to your inbox.
Publishing with your name is risky
It doesn’t matter who you are and where you live: publishing with your
personal identity publicly is always risky. When I publish here with my real
name, I expose myself to small perils: my writing may be subpar, my thinking
may be faulty; this may reflect poorly on me in the future. My opinions may
go against the grain of public opinion or the authorities’ interests. Now,
or in 10, 20, 40 years…
Nobody knows if the author has sinned until the work has been published for
long enough for the public mood to become hostile.
Publishing anonymously is less risky
If they can’t find you, they can’t get you. Anonymity can be liberating if
you live in an environment hostile to deviant work and ideas.
The major downside of anonymity is: total obscurity at the beginning.
Starting an anonymous blog means I have to start from absolute zero. I have
to build my audience from scratch. I can’t rely on any of my existing
connections and assets.
I still think it’s worth it. My hypothesis is that publishing with a pseudonym
can set my creativity free. It would give me the freedom to step into topics I
would normally avoid, the freedom to explore areas considered off limits.
I want to see what happens.
Publish both ways
I will split my presence on the web in two:
My personal homepage, which I will revive after about 10 years of dormancy.
This is where I will publish the material I am comfortable putting my name
on.
My secret playground. A feed to experiment, and take weird, controversial,
or misinformed positions. All without suffering bad consequences in my life.
Anonymity is Internet’s greatest gift
Anonymity is the Internet’s best feature. Anonymous Internet is a parallel
universe with less friction than real life, we can play in this space. The
anonymous netizen can explore and experiment further.
I installed a new battery on my Thinkpad X1 Carbon
I have been in love with Thinkpads since I got my first one in 2005: it was the
T42 model. It worked well under Linux, a rare feat for a laptop at the
time. Its keyboard was superb, with a satisfying tactile feedback, and a crisp
quiet click. It was the best keyboard I ever used back then. I unfortunately
had to give this Thinkpad back when I quit my job to move to Canada in 2007.
A few years later I got a used X61. I replaced its internal HDD with a
much faster SSD drive, which gave the laptop a new lease on life. I used it
for a few years before giving it to one of my brothers attending university.
In 2013 I bought the then brand new Thinkpad X1 Carbon 1st generation. I
loved it when I first saw it: it was like a MacBook Air, but in black, with
somewhat open hardware and a decent keyboard. Initially a full battery
lasted 6 hours.
Nine years later, the battery had aged. It held about 60% of its original
capacity, but I would get at most a couple of hours out of a single charge. The
laptop felt sluggish and ran hot under load. I assumed this was because
the software running on it was more expensive to execute than what it ran
back when it was new. My X1 Carbon was sparsely used in the past few years,
because it was unpleasant to use: slow, toasty, and dead in two hours.
A few weeks ago I decided to get a new Thinkpad as my work laptop. I settled
on a used X270, which I’ll talk about in a later post. I got a new battery
for the X270, and I saw that the store also sold batteries for my X1 Carbon,
so I ordered one for 55 CAD plus shipping to try to revive my aging device.
Before the battery arrived in the mail, I cleaned the outside of the Thinkpad
and vacuumed it to suck as much dust out as possible before opening it. To my
surprise the laptop performed better after this quick clean: the machine
felt snappier, cooler, and the battery lasted a bit longer. I believe the
accumulated dust impeded the cooling system, and cleaning it made the laptop
work better overall. The cooling was more efficient, saving some energy, and
the processor had more headroom to clock up when needed. Fewer fans spinning,
fewer laps roasted, and less energy wasted. \o/
Once I got the package with the batteries in the mail, swapping the old
battery for the new one was relatively easy. I followed the instructions from
iFixit, and 15 minutes later the new battery was installed.
After a couple of days of use to let the battery calibrate, the laptop is
back to full health. I get between 4 and 6 hours of battery time, and the
computer feels responsive and cool. I did the upgrade a week ago and I’m
still delighted to use this old friend of mine again.
Replacing the battery on your devices is one of the best ways to preserve
the environment, and get more utility out of your electronics.
Highlights from Gerd Gigerenzer’s interview with Russ Roberts.
On the public’s concern about online privacy:
And, as you hinted before, there’s the so-called Privacy Paradox, which is
that, in many countries, people say that their greatest concern about their
digital life is that they don’t know where the data is going and what’s done
with that.
If that’s the greatest concern, then you would expect that they would be
willing to pay something. That’s the economic view. […]
[…] Germany is a good case. Because in Germany, we had the East German
Stasi. We had another history before that—the Nazis, who would have
enjoyed such a surveillance system.
And, so Germans would be a good candidate for a people who are worried
about their privacy and would be willing to pay. […]
I have done three surveys since 2018, the last one this year. With
representative sample of all Germans over 18. And asked them the question:
‘How much would you be willing to pay for all social media if you could
keep your data?’
We are talking about the data about whether you are depressed, whether
you’re pregnant, and all those things that they really don’t need.
So: ‘How much are you willing to pay to get your privacy back?’
75% of Germans said nothing. Not a single Euro. […]
So, if you have that situation where people say, ‘My greatest worry is
about my data’; at the same time, ‘No, I’m not paying anything for
that,’ then that’s called the Privacy Paradox.
The public’s concern about surveillance is similar to the concern
about the environment: the public understands the problem, but doesn’t
really care.
I believe most people fake their concerns about surveillance and environmental
decay because that’s what they are expected to do in polite company. The
public shows its true colors once it has to expend resources on solving
the problem instead of merely virtue signaling.
Gerd Gigerenzer made another great point about surveillance; we get our
citizens started early these days:
I think there’s already surveillance in a child’s life. Remember
Mattel’s Barbie? The first Barbie was modeled after a German tabloid
cartoon, the Bild-Zeitung, and it just gave totally unrealistic long
legs and tailored figures. The result was that quite a few little girls
found their body not right. In 1998, the second version of Ken could talk
briefly—utter sentences like, ‘Math is hard. Let’s go shopping.’
The little girls got a second message: They’re not up to math. They are
consumers. And the 2015 generation, called Hello Barbie, which got the Big
Brother Award, can actually do a conversation with the little girl. But,
the little girl doesn’t know that all the hopes and fears and anxieties it
trusts to the Barbie doll are all recorded and sent off to third parties,
analyzed by algorithms for advertisement purposes.
And also, the parents can buy the record on a daily or weekly basis to
spy on their child.
Now, two things may happen, Russ. One is the obvious, that maybe when the
little girl is a little bit older, then she will find out, and trust is
gone in her beloved Barbie doll and also maybe in her parents.
But, what I think is the even deeper consequence is: the little girl may not
lose trust. The little girl may think that being surveilled, even secretly,
that’s how life is.
And so, here is another dimension that the potential of algorithms for
surveillance changes our own values. We are no longer concerned so much
about privacy. We still say we are concerned, but not really. And then,
we’ll get a new generation of people.
Despite all this I’m still running most of my digital life on Google’s
infrastructure. I must make a move.
Bullshit is a zero sum game
When two bullshitters meet, they usually start competing within minutes.
Bullshit works best when one has a monopoly on it. As soon as there’s
competition, nonsense loses some of its power. If two narcissists take part
in a group conversation, they have to exaggerate more and more to grab
attention. This leads to an arms race that quickly undermines the whole
lying-to-get-status shtick.
We can see how attention seekers ruin polite conversations on social
media. Being the bullshitter-in-chief is hard work: politicians and
media personalities must constantly raise their game, it is exhausting.
Most bullshitters dabble in politics but aren’t really in the political
arena: it’s the big league, it’s too competitive.
Bullshitters are in the game for the easy status. They usually avoid each other
and hang out in small circles of normies that will just go along with them.
Being anti-social can be a competitive advantage
A few weeks ago on Hacker News, someone asked a question about a job offer
they got from Amazon. An Amazon employee replied:
For me Amazon took an unprecedented toll on my mental and physical health. I
did earn enough money, but I immensely regret all the time I didn’t
spend with my family over the years, all the friendships that faded,
and the constant reminder from leaders how I could always do better -
nothing was ever good enough.
Amazons leadership fundamentally does not see their employees as human
beings. As I grew the ranks over the years, I was directly coached on
removing myself from certain day to day interactions, because it would
simplify decision making if I didn’t have an interest in my own people,
that simply forming just work bonds was a conflict of interest in terms
of doing what’s “right” for the company.
Being anti-social with colleagues can be a competitive advantage in vast
bureaucracies like Amazon. Giant corporations want easily replaceable
employees. When you don’t have emotional attachment to your co-workers,
you are a better pawn to play with. You’ll get an excellent paycheck at
Amazon, but the price is more than the time spent working. The price is a
bit of your soul.
I got my first job in 2004, 18 years ago. Of these 18 years there are 4 that
I regret: I worried too much, worked too hard, or felt entitled to something
I didn’t earn. I stayed because of the money. It was never
worth trading my peace of mind for that extra cash.
Work isn’t only about trading one’s time for money. It’s also about the
sense of meaning it gives to life. Toil is meaningful because it connects us
to the rest of humanity. We work to be with folks who make meaningful economic
contributions to our community. When we retire, these connections are all
we have left: the memories and the friendships that grew over the years are
precious.
When my dad retired from the place he worked at for almost 40 years, it was
hard on him: his old company wasn’t doing well and was being dismantled
by unscrupulous executives. He found solace with his old co-workers; they
meet from time to time to talk and reminisce about the good old days.
What’s the point of material comfort when one’s mind is starved of meaning?
Make friends at work, and stay in touch with them.
6 things I got from “L’Enracinement” by Simone Weil
The English title of this 200-page book is The Need for
Roots. I read it
in French.
Simone Weil is acute, delightful,
bold, insightful, and scandalous. Her vision is bucolic, and human.
Here are some salient ideas I got from her work.
1
Rome corrupted Catholicism by tying itself to the religion. God was made more
like a king ruling over the world rather than the world itself.
2
France has a long history as a dictatorial police state ruled by a single
person. Democracy hasn’t changed this.
3
Science replaced religion and tradition as the source of truth. While science
allows us to know the world better than these old principles, it can’t tell us
what is good or bad.
4
Christianity has become a matter of convenience.
5
The priesthood of science is just as corrupt as the religious priesthoods. The
noble scientist is an illusion.
6
France has 3 classes: bourgeois, workers, and peasants. Morality flows from
the bourgeoisie to the workers, and then to the peasants. Money is what matters
most, because that’s what matters to the bourgeois class.
Gmail’s UI replacement
Before I moved to Gmail I used mutt, a command line mail client. I
downloaded and uploaded emails with offlineimap, and used mutt to view,
move, and delete the downloaded emails. Offlineimap was fine, but I had
a few issues with it. Most of the time it was the local state getting out
of sync with the remote side. I’d hit Ctrl+C or close the terminal at
the wrong moment and some local state files would get corrupted. It was
usually easy to fix, occasionally I had to remove all my local emails and
re-download everything.
I set up isync to see how it does, and thus far I like it. It seems a
bit more lightweight than offlineimap. I haven’t had any problem with it yet.
Back when I used mutt, it was a pain to configure the way I wanted and I
never really got comfortable with it even with my custom configuration.
I disliked that I had to write a lot of configuration to get it to work how
I wanted. Switching to Gmail was a breath of fresh air: because shortcuts
weren’t configurable I just had to learn Gmail’s defaults. Today mutt
is in better shape than it was 10 years ago. I could give it another try,
but I decided to go with aerc, a relatively recent email reader. aerc
is opinionated about its workflow; it’s configurable, but not to the
same extent mutt is. I am still learning how to use aerc, and things are a bit
difficult to figure out at times: the current documentation isn’t that
great. Otherwise I like its philosophy.
The end of the free Google lunch
In July 2008 I moved my email with a custom domain to Gmail. Previously I
was self-hosting it on a Linode virtual private server.
The move was seamless. I was happy to avoid maintaining my own Email
infrastructure; Gmail was fast, reliable, and its user interface was nicer
than the other webmails I tried. I have been a happy customer for a long
time, and best of all it didn’t cost me anything. Back in 2008 this system
was called Google Apps, and it was free for personal use. It was
renamed to G suite, and is now called Google Workspace.
Today I received an email saying that this gravy train was about to stop,
unless I switched to a paid plan. I don’t mind paying for this, Google
provided an excellent service for almost 14 years.
Unfortunately, since I moved my email to Gmail, I lost faith in the
company’s ethics. The days when the corporation’s motto was “don’t be
evil” are gone. Since the firm’s main source of revenue is advertising I
believe its values are fundamentally at odds with mine. The big G has grown
enormously and expanded in various areas over the past decade. I don’t
feel comfortable having my personal information stirred into its gargantuan
data cauldron. Anybody paying money can get a ladle of data laced with some
of my dark secrets.
So it looks like 2022 will be the year I move away from Gmail. I plan to
document the journey here.
I recently got myself a new phone, and I noticed that my feed-reading app,
gReader Pro, didn’t get migrated to my new phone. It turns out it was
no longer on Google’s app store. I paid for the app’s Pro version,
and I didn’t want to subscribe for $5/month to gReader Premium. Without a
Premium subscription there’s an ad at the bottom that takes about 10%
of the screen.
Luckily the app’s updated APKs are still available via GitHub. I was able to
download one and get rid of the ads.
That is all.
2020 assessment
The end of the calendar year is an opportunity to look back and reflect:
How has this year transformed me?
The body
The Covid-19 pandemic has kept everyone locked down at home since March. I used
to bike to and from work every weekday, and I missed these 40 minutes on the
bike. After a month inside I felt weaker, and my lower back ached from the
constant sitting. I tried biking in the morning, but doing a tour and coming
back home 30 minutes later didn’t work: I did it once and never found the
motivation to do it again.
A month into the lock-down a coworker —Marty— suggested we do push-ups and
publicly post our daily ‘score’, so we created a dedicated Slack channel:
#beat-marty-fagan to track our progress. Marty walks across Antarctica
with his wife for fun; he knows what commitment to fitness is like.
I started doing 40 push-ups daily, after a week I added some squats and
lunges. Having a group of people posting their daily workout accomplishments
on Slack helped with motivation. Unfortunately after a couple of months
I was the only one still posting regularly on the channel. By that point
the habit was sufficiently ingrained that I kept going without the need for
peer pressure. Initially I only did body-weight exercises; after reading and
watching Pavel Tsatsouline I got a set of kettlebells for a more effective
workout.
I worked out every weekday and most weekends
for about 8 months: I feel more energetic, and my lower back handles long
sitting sessions without getting sore. When I get out of the shower, I check
myself out in the mirror for an inappropriate amount of time; it does wonders
for my self-esteem. I love being a hotter me.
My current workout routine takes about 10 minutes from start to finish. I’m
planning to keep doing it five times a week in 2021.
The mind
The best blog post I read in 2020 is: How I read by Slava Akhmechet. This
post gave me a renewed sense of purpose about reading and learning. Its
salient insight is that reading 5 books on a subject will give you a better
perspective on it than 99% of the people in the world.
The advice from the article I loved the most is: “try to read 40 pages a
day”; I put it into practice right away. I fell short of the 40-page objective
most days, but still made steady progress. Since November I finished 3 books:
The Brothers Karamazov, Line by Line, and One Day in the Life
of Ivan Denisovich. I only read 4 books in 2020; doing better next year
would be nice.
Cold showers
I had a difficult Monday, I woke up with low energy and felt dizzy.
So instead of being productive that day I read articles online.
One of the posts I read was the excellent On stress and
comfort, where Slava Akhmechet discusses how he adapted to the Covid
lock-down and how cold morning showers lift his mood.
The next day I woke up still feeling tired and sluggish. So I took a 10-second
cold shower that morning, and it rebooted my mind: I felt energized and had a
renewed sense of purpose. I had a productive day, and went to bed feeling good.
Every day my friend Cyrille takes cold showers or goes for cold swims;
he started 6 weeks ago and he loves it thus far. I have often heard or read
about the beneficial effects of cold water. When I was a student I took
cold showers to wake myself up after some late night partying, and it worked
great. I stopped taking cold showers after graduating, possibly because the
late night partying also stopped.
There’s a randomized study that shows that people taking daily
cold showers tend to be less sick. The study seems robust and its results
significant. Many health professionals already use cold showers and baths
to help athletes recover.
With all these anecdotes about cold showers, it’s time I jump on the
bandwagon and take one daily. I’ll update this blog with my experiences.
Books from the bricks and mortar
Over the past 20 years I bought most of my books on Amazon; its inventory and
delivery are the best in the world. I can find rare or specialized books
priced competitively on Amazon, and its delivery is fast. The last time
I got a book at a store was in 2009, and it was a gift; the last time I
bought a book for myself at a store was in 2006. Amazon swallowed the book
distribution sector whole. I felt sad to see small book stores close down, but
I didn’t miss the book retail chains that Amazon killed one after the other.
Today Amazon is in a dominant position, but its value proposition has eroded:
books got harder to find and are more expensive. Amazon’s delivery is
still the best in the business, but I have had numerous issues with it over the
past few years. While the company always reimbursed me, these issues are a tax
on my time and tranquility.
For the first time in 14 years I got books from a local bookstore this
week. I did it for a few reasons:
I picked up the books in store the day after ordering online. Even small
inventories may have the book you’re looking for, while with Amazon
you always wait for the delivery. It was a breath of fresh air to order,
pick up, and start reading a book in less than 24 hours.
A brick and mortar shop can order the books they don’t have in
store. It’s possible to order most books Amazon sells from any book shop,
at least at the ones I have looked at. I may still get some books online,
but for most of my needs my local retailer serves me better.
I like local pick-up. There are too many deliveries now because of the
Covid-19 pandemic. I like that my local shop keeps the books around
for when it’s convenient for me to pick up. Not dealing with delivery
contributes to my tranquility.
Support your local book store, and see if they have what you want in
stock. It’s nice to go out and talk to people.
Random things I learned about Kakoune today
I finally got around to reading Vi(m) to Kakoune. This link was buried in my
bookmark list for a while, and as a former Vim user who switched to Kakoune
I should have read it earlier. There are a couple of things I learned from it.
Delete to line end:
alt-ld or Gld
This made me realize alt-h and alt-l can be used to extend the selection
buffer both ways. Nice.
Edit alternate file / Previously edited file:
ga (in Normal mode)
I’m going to use this one all the time from now on. ga is the gangster
command I needed when editing multiple buffers.
After 3 months of using Kakoune, I already feel more ‘connected’ to it
than I was with Vim. Kakoune’s commands make more sense, and I love that
my configuration file is only 29 lines.
I’m not quite at the level of productivity and speed I had with Vim.
It’s hard to outdo 14 years of use and fine-tuned configuration and
megabytes of plugins.
Stay tuned for more random bits about Kakoune.
gmi2html
Gemini is a hip alternative to the HTTP/HTML based Internet. I don’t want
to miss out on the hype, so I wrote gmi2html. It’s a text/gemini
to HTML converter written in Go, and it’s hosted on sourcehut:
$ go get git.sr.ht/~henryprecheur/gmi2html
$ go install git.sr.ht/~henryprecheur/gmi2html
Its design is inspired by Rob Pike’s talk: Lexical Scanning in Go. The
state of the lexer is kept in a callback; this neat trick simplifies the
lexer and makes it more efficient. gmi2html reads its input from stdin and
writes the result to stdout, and there are no flags:
$ gmi2html < input.gmi > output.html
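Here is a minimal sketch of that state-function pattern; this is my own illustration of Pike’s idea with assumed names, not gmi2html’s actual source. The lexer’s current state is the callback to invoke next, so there is no explicit state enum and no big dispatch switch:
package main

// stateFn is the lexer's state: a callback that returns the next state.
type stateFn func(*lexer) stateFn

type lexer struct {
    input string // the text/gemini source being scanned
    pos   int    // current position in the input
}

// lexLine is a hypothetical state: it would classify the current line
// (text, link, preformat toggle, ...), emit a token, and pick the next state.
func lexLine(l *lexer) stateFn {
    if l.pos >= len(l.input) {
        return nil // returning nil stops the machine
    }
    // ... emit a token and advance l.pos past the current line ...
    l.pos = len(l.input)
    return lexLine
}

// run drives the machine: each state function returns its successor.
func (l *lexer) run() {
    for state := lexLine; state != nil; {
        state = state(l)
    }
}

func main() {
    (&lexer{input: "# hello\n"}).run()
}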
It doesn’t support extensions like lists and headings yet. I’ll add
these features in the coming weeks.
The text/gemini markup format
A few weeks ago I read the Gemini specification, and I really like the
project’s markup format: text/gemini, a markup language with only essential
features. It’s a line-oriented language with only four types of lines:
Text
Link
Preformatting toggle
Preformatted text
More advanced formatting —headings, unordered lists, and quote blocks— is also
supported. And that’s all; that’s the entire markup language!
I love the minimalism of text/gemini and I’m considering using it for future
publications instead of Markdown.
I have used Markdown for my writings over the past 15 years, and I have
now grown tired of it. I used different Markdown to HTML converters over the
years, and they all have different quirks. I’ve been thinking about ditching
Markdown for about 5 years, but couldn’t find an alternative I liked. The
text/gemini markup feels like it’s a good fit for me. Unfortunately I
couldn’t find any command line text/gemini to HTML converter; maybe I
should write one?
OpenBSD’s sysupgrade
I run OpenBSD on my laptop and a server hosted in the Cloud. When I upgraded
OpenBSD on my server: I provisioned a new server instance running the OpenBSD
version to upgrade to; copied the configuration from the old to the new server;
altered my DNS to point to the new server; and shut down the old server. For
my laptop, I usually downloaded & installed the new system from the tarballs
using a script I wrote, and ran pkg_add after rebooting. My script didn’t
always work; I occasionally had to fix breakages after the upgrade.
That was until last week, when I used sysupgrade for the first time.
Sysupgrade automatically upgrades OpenBSD: it downloads the new tarballs
along with the firmware files, reboots the machine, installs the new system,
and finally upgrades the packages.
In both cases the upgrade was fast, didn’t require baby-sitting, and
everything worked out-of-the-box once the computer rebooted. I had to
upgrade my server twice to move from 6.6 to 6.8, since sysupgrade can’t
skip intermediate versions. There was some downtime: about 2 to 3 minutes
for each upgrade, or about 10 minutes of downtime in total. I also upgraded
my laptop with sysupgrade: I started the upgrade, made myself some tea,
and when I came back the laptop was all upgraded and ready to go.
And if you like to live on the bleeding edge, sysupgrade also allows you
to upgrade to a snapshot via the -s option. I used my own upgrade script to
do that, and it didn’t always work well. Now I can use sysupgrade and be
confident it will work.
I’m studying accounting these days. Learning about balance sheets, income
statements, and cash flows brought back memories of a story I read
about 5 years ago: the story of Crazy Eddie.
Crazy Eddie was an electronics retailer from New York City. Its owners
committed various securities fraud from 1969 until they got caught in 1987.
For its first 10 years in business the company was crazy profitable. Crazy
Eddie’s management and owners skimmed –stole & hid– cash from the company
and under-reported income to pay less taxes.
In the 80’s Crazy Eddie was getting ready for its IPO, and its managers
gradually reduced the amount of cash they skimmed to artificially increase
income over time. The goal was to inflate its statement of cash flows,
to make it look like the company was getting more and more profitable in
order to get a big fat valuation and raise tons of cash.
After its IPO, Crazy Eddie’s administrators didn’t slow down and
committed more securities fraud. They overstated the assets’ value,
laundered the money they skimmed by re-investing it into the company,
and understated accounts payable to benefit insiders and fool regulators.
The company had tons of debt, and didn’t include these liabilities in
its statement of cash flows, overstating the company’s position.
The Crazy Eddie story from whitecollarfraud.com is wildly entertaining,
I highly recommend it. Sam Antar, Crazy Eddie’s CFO, orchestrated many
of these frauds, and created this website to talk about it once he got out of
prison for his shenanigans. Securities fraud ain’t no joke.
bspwm and sxhkd are a great window manager
The X window system –the most popular graphical environment for Unix operating
systems like Linux and *BSD’s– gives its users the option to choose their
window manager. A window manager is a program that arranges windows around the
screen, and often adds decorations like a title bar with a close button, and
maybe maximize and minimize buttons beside it. Tiling window managers arrange
windows into mutually non-overlapping frames; that’s what I use.
I used to run dwm, and switched to bspwm and sxhkd four weeks
ago. These programs work in tandem to manage windows and handle input events,
and it’s a beautiful thing.
First, here’s an overview of how traditional window management works. Most
window managers use something called reparenting, where the window manager
becomes the top window and all the other windows are its children. This lets
the window manager decorate these sub-windows. A typical event loop handles
both the administration of the windows and the input events like keyboard,
mouse, or touch. That’s a traditional X application event loop.
Bspwm & sxhkd are different: they split the event loop into two separate
processes, and sxhkd pilots bspwm via a command line tool called bspc. The name
bspwm comes from BSP, or binary space partition, while sxhkd means Simple
X hotkey daemon. Sxhkd handles keyboard, mouse, and other input events,
while bspwm only handles window events and ignores all input events. Sxhkd
drives bspwm by mapping hotkeys to invocations of the bspc command line tool
to tell bspwm what to do.
Because of this split, configuration is straightforward: there are two
different configuration files instead of one. Since these files have
different purposes, they can use different syntaxes. Sxhkd has a simple
and powerful configuration syntax. Each line of the configuration file is
interpreted as follows:
If it is empty or starts with #, it is ignored.
If it starts with a space, it is read as a command.
Otherwise, it is read as a hotkey.
So if you want to start xterm when pressing the Alt and the Return keys
simultaneously, you put the following in sxhkd’s configuration:
alt + Return
    xterm
Bspwm’s configuration is an executable that can be written in any language;
it’s executed after the window manager starts. The executable is usually a
shell script that calls the bspc tool to configure bspwm, like the sketch
below. Clear configuration & minimalism make these two programs attractive
options.
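For illustration, a hypothetical bspwmrc could be as small as this; border_width and window_gap are real bspwm settings, but the values are made up:
#!/bin/sh
# bspc talks to the running bspwm instance
bspc config border_width 4
bspc config window_gap 8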
I use the “default” configuration that comes with sxhkd & bspwm; the
only change I made was to reduce the border between windows from 8 pixels to
4. What are my thoughts after 4 weeks? I got used to the new setup within
a few days. It was easy to learn coming from dwm; your experience may be
different if you have never used a tiling window manager.
bspwm and sxhkd are a great window manager. If you are running dwm, i3,
xmonad, or some other tiling window manager, they may be a good alternative.
Khan Academy course review: Finance and capital markets / Housing
I finished the Housing module of Khan Academy’s course Finance and
capital markets. First it explains the process of evaluating if one should
buy or rent depending on one’s personal circumstances. It’s an important
evaluation that everyone should do, preferably away from their Realtor.
The course then goes over all the steps of evaluating, buying, and paying off
a place. Seeing the calculations and detailed explanations for the different
examples is helpful to understand how slightly different inputs can create
vastly different outcomes. For example, if the price of the house goes up,
the buyer’s equity in the house goes up, and she can borrow cash with that
newly created equity as collateral. This is called a home equity loan, and
that thing looks terrifying to me. One line from that part I really liked:
when people buy a place with a mortgage they go from renting a place to
renting money for a place. The interest on the loan is the rent.
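To make the home equity mechanics concrete, here is a toy calculation with invented numbers:
package main

import "fmt"

func main() {
    price, mortgage := 500_000.0, 400_000.0
    fmt.Println("equity before:", price-mortgage) // 100000
    // The house appreciates while the debt stays put, so the
    // owner's equity absorbs the entire price increase.
    price = 550_000
    fmt.Println("equity after:", price-mortgage) // 150000
    // A home equity loan borrows against that new 150,000 of equity.
}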
The course also explains the etymology of the word mortgage: it is the bank
pledging to give the title after the debt is paid off. Mortgage is an old
French word, a portmanteau composed of two words: mort –dead– and
gage –pledge–. It’s called a dead-pledge because the deal dies when it
is fully paid or when payment fails.
The course then introduces the different types of mortgages: fixed,
adjustable, and balloon. Fixed mortgages are just a fixed rate for a set
number of periods. Adjustable mortgages, or ARMs, are based on an index, like
one-year treasury bonds; these are riskier than fixed rate mortgages. ARMs are
often mixed with fixed rate mortgages to make the risk more palatable to
much risk as possible, and want a mortgage that can blow up in your face,
you’ll love balloon mortgages.
The module finishes with a high-level overview of the house buying process,
and explains what title, deed, and escrow are. I’m not planning
to buy a place anytime soon, and this part gave me another reason to wait:
buying a place is long, bureaucratic, stressful, and can be risky.
Khan Academy course review: Finance and capital markets / Interest and debt
I watched the first module of the course Finance and capital markets:
Interest and debt. It’s a series of 16 videos about 10 minutes long. I
had rudimentary knowledge of finance and accounting before watching these
videos; here’s what I learned.
I learned the term Principal: it’s the initial amount of money
invested or loaned, from which interest is calculated. The course also
introduced the Rule of 72, which I already knew, and went into more detail
explaining why it is a good heuristic. I learned about the credit card
system and interchange fees: where the money goes and in what proportion. I
learned about payday loans and how they work, and how some people get around
the credit system to still get some money when they need it to pay their
bills. Finally I learned more about the eerie number e, and how it
can be used to calculate interest continuously.
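As a quick illustration of the Rule of 72 (my own sketch, not from the course): dividing 72 by the interest rate in percent approximates the exact doubling time ln(2)/ln(1 + r):
package main

import (
    "fmt"
    "math"
)

func main() {
    const rate = 8.0 // annual interest, in percent
    approx := 72 / rate
    exact := math.Log(2) / math.Log(1+rate/100)
    // prints: rule of 72: 9.0 years, exact: 9.0 years
    fmt.Printf("rule of 72: %.1f years, exact: %.1f years\n", approx, exact)
}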
The pace of the lectures is mellow, but one needs to actively pay attention
to absorb the material. I casually watched most of these videos years ago
and didn’t retain much; this time I took notes, and I felt I covered and
absorbed the material much better.
One of the things I lost when I switched from Vim to Kakoune is
digraphs. Vim’s digraphs let you input non-ASCII characters
by replacing multiple-character combinations with the corresponding Unicode
character. I was born in France and I need to input accented letters when I
write to my folks back home. Because I use a QWERTY keyboard, I relied on
Vim’s digraphs to get around the lack of accented Latin characters on that
keyboard’s layout. With Kakoune I needed to find
another solution. I looked at different solutions like digraph.kak,
but I didn’t want to depend on an external program or a plug-in to do this.
After digging a bit more I found this blog post that mentioned the
altgr-intl variant of the US keyboard layout in X.org. It uses the right
Alt key as a dead key to input accented characters. It’s easy to set up:
I added the following line to my .xsession file:
setxkbmap -rules evdev -model evdev -layout us -variant altgr-intl
The US-altgr-intl layout works well and I got used to it quickly; it doesn’t
interfere with my usual workflow since I rarely use the right Alt key. For
me this is a better solution than Vim’s digraphs, because it works with
every X11 application, like web browsers and terminals, and it keeps my
Kakoune configuration lightweight.
Old mistakes we keep on fixing
Releasing software publicly is scary: with new features come bugs and
errors. Some of these bugs may be unfixable in practice, and stick around
forever.
Today I encountered such a bug with aspell, a Unix command-line spell
checker. I was integrating aspell with the Kakoune text editor. When I
executed the command :spell fr in Kakoune nothing happened. I was expecting
the bad words to be highlighted, but none of the glaring mistakes in my
prose got highlighted. I then looked at the *debug* buffer to see what
was up, and I saw this:
Error: The file "..." is not in the proper format. Incompatible hash function.
Luckily it was easy to track down the issue: there’s a page in
aspell’s documentation talking about it. Long story short: language spell
files built on 32 bits systems aren’t compatible with 64 bits systems.
Someone didn’t realize that using the size_t type for an on-disk data
structure was a bad idea. Oops.
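Go has the same trap with its platform-sized int and uint types. A small sketch of the problem, mine rather than aspell’s code:
package main

import (
    "fmt"
    "unsafe"
)

// record mimics an on-disk structure built around a platform-sized
// integer, like aspell's size_t-based hash field.
type record struct {
    hash uint // 4 bytes on 32 bits builds, 8 bytes on 64 bits builds
}

func main() {
    // The size, and therefore the file layout, changes with the
    // architecture the program was compiled for.
    fmt.Println("record size:", unsafe.Sizeof(record{}))
}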
It’s this type of bug that makes releasing software terrifying. The bug is
baked into aspell forever unless someone adds a way to handle both types, and
that could be a significant refactoring of an unknown code base. The tragedy
of the commons keeps on playing.
Most distributions handle this by making aspell dictionaries architecture
specific. I bumped into this bug because I use Void Linux, which handles
aspell dictionaries as an architecture-independent package. This was fixed
already, but I’m convinced I won’t be the last person to work around
this bug.
14 years ago I ditched Emacs as my main text editor and switched to Vim. I
felt Emacs was too bloated and slow; I tried Vim for a few days and never
looked back once I got comfortable with it.
Vim is still fast today, but after 14 years of use I’ve come to dislike its
limitations and weird quirks. It does too much now, and it isn’t as ‘elegant’
as I wish it was.
I tried the Acme editor, and I loved the ideas behind it, but it wasn’t
suited to my keyboard-focused workflow. I switched back to Vim after a single
day of use, but the experience made me crave something better.
I also tried Neovim on and off over the past year, but I didn’t see the
point of switching to a fork of an editor I’ve come to dislike.
After reading good things about it online, I started using the Kakoune
text editor, and I’m impressed by its design and trade-offs thus far. I’m
still getting used to Kakoune’s editing and navigation style, and yet
I already feel productive. After just a day using it I think it may be the
one. Expect more posts about Kakoune in the near future.
Zettelkasten with plain Vim
The zettelkasten (German: “slip box”) is a knowledge management and
note-taking method used in research and study.
I saw posts about this method pop up on Hacker News & Lobsters lately. A
zettelkasten is like a personal wiki for your research; its salient idea is
linking cards or files together. I have maintained a spark file since 2012; the
concept is similar to a zettelkasten: it’s a collection of notes; each note is
about a single idea and is revisited and re-edited regularly. Mine is a Git
repository with about 160 text files containing 400,000 characters as of today.
Since learning about the zettelkasten method I link between files more and it
helps revisit old notes and material more effectively.
I use Vim to navigate it with the gf command, which lets you edit the file
whose name is under or after the cursor. With this command I can easily link
files together; it makes the organization of my zettelkasten simple, and it
works out of the box with Vim.
For example, if I want to link network/http.txt to network/ip.txt, I add a
line to the file network/http.txt with the filename I want to link to:
network/ip.txt. When I position my cursor over this line and press gf, Vim
opens network/ip.txt.
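To illustrate, a hypothetical network/http.txt could contain:
HTTP/1.1 is a text protocol that runs on top of TCP.
See also:
network/ip.txt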
For tagging files together I use outline files. When I want to link a set of
files together, I create a new file and add all the filenames in it, like an
index file.
Since I got into the habit of linking files together I feel more inspired to
write and produce. I have revisited old ideas, and I feel the creative juices
flowing again.
Playing with web fonts
I updated this blog’s fonts because I wanted to make it snappier to download
and display. It uses Charter and Anonymous Pro as its body and monospace fonts.
Before the update, Charter was a WOFF font hosted on my server and Anonymous Pro
was a TTF font hosted on Google® Fonts™ —another piece of Google®’s surveillance
network—.
Instead, I stole the ideas in the CSS from Butterick’s Practical
Typography, and embedded the WOFF2-formatted
fonts encoded as base64 strings in the CSS style-sheet. Before this update, the
browser used the default system fonts while the external fonts loaded, and then
swapped the fonts. This made the page blink a second after it appeared; that
blinking effect is now gone.
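For reference, here is roughly what an embedded rule looks like; a sketch rather than my exact style-sheet, with the base64 payload truncated (d09GMg is how a WOFF2 file’s magic bytes begin once base64-encoded):
@font-face {
    font-family: 'Charter';
    font-style: normal;
    font-weight: normal;
    /* the whole font file rides along inside the style-sheet */
    src: url('data:font/woff2;base64,d09GMg...') format('woff2');
}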
All the fonts together weigh about 140KB, which is a bit much for a mandatory
resource. My index page with all the posts I made since 2012, fonts included,
weighs 210KB gzipped as of today. That’s far from the tens of megabytes of data
the typical webpage eats in bandwidth. So hopefully this translates into a
better experience for you, my readers.
According to Firefox’s debugger this blog takes 32 seconds to fully download
and display on a GPRS connection —about 10KB/second—; that works out to about
2.5 seconds on a 3G connection.
There’s something else I’d like to improve: WOFF2 fonts are supported only by
relatively new browsers. I would like to find a way to load the font only if the
browser supports it. Maybe a second CSS that’s conditionally loaded would do the
trick.
That’s all for today.
Spring cleaning bucket list: Scrub out surveillance capitalism
I’m doing some virtual spring cleaning: I moved all my websites to a new server
because the old one will shut down soon.
This reminded me that I track you all with the largest surveillance capitalism
network in the world: Google®, with their service Analytics™. This service lets
me see how little you all care about my writing, which depresses me a little.
Also I feel like an ass for selling your data to an already too powerful
corporation.
I removed the Google® Analytics™ tracker from all my websites, including this
blog.
This place is safe now, and I feel better.
I’m working on updating this blog’s look. Charter is the blog’s body font;
it’s served in the EOT and WOFF formats. Now there’s a new hot web font format
in town: WOFF 2, like WOFF but better I guess.
I looked for a WOFF2 version of Charter, and I found it in the Wikimedia UI
style guide’s repository. This may be a good option for those looking for a
great-looking font with a permissive license.
Pango 1.44 dropped support for bitmap fonts. This is a problem for me, with
my favorite font —Terminus— being a bitmap font. It means that Vim’s GTK
front-end no longer works with Terminus.
To get around this I simply use Vim with X11 clipboard support in a
terminal. Here’s the little script I put together to do this:
#!/bin/sh
# Void Linux vim command doesn't support X11 clipboards
if command -v vim-x11 > /dev/null
then
    readonly editor=vim-x11
else
    readonly editor=vim
fi
exec urxvt -fn 'xft:Terminus:size=18' -e "$editor" "$@"
Go CBOR encoder: Episode 11, timestamps
This is a tutorial on how to write a CBOR encoder in
Go, where we’ll learn more about reflection and type
introspection.
Make sure to read the previous episodes; each episode builds on the previous
one:
In the previous episode we improved floating point number support in our
encoder. We implemented all the Go native types, now we’ll implement a custom
stype: time.Time, a timestamp type from Go’s standard library. The CBOR format
supports 3 timestamp types
natively:
RFC3339 string like “2019-02-01T17:45:23Z”
floating point epoch based values
integer value epoch based values
The CBOR format has special values called tags to represent data with
additional semantics, like timestamps. A tag’s header has major type 6 and
encodes an integer used to determine the tag content’s type. Each
tagged type has a unique integer identifier number.
For example, URIs are represented as a tagged unicode string: first there’s the
header with the major type 6 —indicating it’s a tagged value— encoding the
integer 32 —the URIs’ identifier—, followed by the URI encoded as a UTF-8 CBOR
string.
How can we detect if we have a time.Time value in the encoder? Looking at
time.Time’s definition we see that it’s a struct,
a kind of value we already handle in the encoder. The reflect package lets us
query and compare values’ types, so we will check if the value’s type is
time.Time when we have a reflect.Struct kind, and write a CBOR timestamp when
that’s the case.
There’s a bit of gymnastics needed to get time.Time’s type without allocating
extra stuff; we can either do:
reflect.TypeOf(time.Time{})
Or:
reflect.TypeOf((*time.Time)(nil)).Elem()
In the first case we create an empty time.Time object and pass an interface
pointing to it to reflect.TypeOf, which returns its reflect.Type. In the
second case we create a typed nil pointer to time.Time and retrieve the type it
points to directly. We’ll use the second way because it doesn’t create an empty
time.Time object and is therefore a bit more efficient.
In the main switch block we add a conditional statement in the reflect.Struct
case to check if the struct’s type is time.Time:
case reflect.Struct:
    if x.Type() == reflect.TypeOf((*time.Time)(nil)).Elem() {
        return ErrNotImplemented
    }
    return e.writeStruct(x)
Timestamps have two tagged data item types: 0 for RFC3339 timestamps encoded as
unicode strings, and 1 for epoch-based timestamps —floating point & integer
values—. Let’s add a new function to write the timestamps: writeTime. We’ll
handle string timestamps first, and implement scalar epoch-based timestamp types
second. Starting with RFC3339 strings, we look up the example from the
spec, and add our first test case:
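The spec’s example is 0("2013-03-21T20:04:00Z"), which encodes as the tag header 0xc0 followed by a 20 bytes text string. A sketch of the test; the scaffolding names, like NewEncoder(...).Encode, are my assumptions based on the earlier episodes:
var rfc3339Timestamp = "2013-03-21T20:04:00Z"

func TestWriteTimeString(t *testing.T) {
    var buf bytes.Buffer
    ts, err := time.Parse(time.RFC3339, rfc3339Timestamp)
    if err != nil {
        t.Fatal(err)
    }
    if err := NewEncoder(&buf).Encode(ts); err != nil {
        t.Fatal(err)
    }
    // 0xc0 is tag 0, 0x74 is a text string of length 20
    if expected := "\xc0\x74" + rfc3339Timestamp; buf.String() != expected {
        t.Errorf("got % x, want % x", buf.String(), expected)
    }
}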
Back in cbor.go we add a few header constants required to encode the new tagged
types:
const (
    // major types
    ...
    majorTag = 6
    ...
    // major type 6: tagged values
    minorTimeString = 0
    minorTimeEpoch  = 1
    ...
)
The function writeTime writes the tag’s header with minorTimeString to indicate
a string follows, then it converts the timestamp into a RFC3339 string and
writes it to the output:
func (e *Encoder) writeTime(v reflect.Value) error {
    if err := e.writeHeader(majorTag, minorTimeString); err != nil {
        return err
    }
    var t = v.Interface().(time.Time)
    return e.writeUnicodeString(t.Format(time.RFC3339))
}
We hook it up to the rest of the code by adding a call to writeTime in our main
switch statement:
case reflect.Struct:
    if x.Type() == reflect.TypeOf((*time.Time)(nil)).Elem() {
        return e.writeTime(x)
    }
    return e.writeStruct(x)
A quick go test confirms that writing string timestamps works, so let’s get
started with epoch-based timestamps.
Epoch-based timestamps are scalar values where 0 corresponds to the Unix epoch
(January 1, 1970); they can either be integer or floating point values. We’ll
minimize the size of our output by using the most compact type without losing
precision. The timestamp can either be an integer, a floating point number, or an
RFC3339 string. If the timestamp’s timezone isn’t UTC we’ll have to use the
largest type, RFC3339 strings, because we need to encode the timezone
information and we can’t do that with scalar timestamps. If the timestamp’s
timezone is UTC or nil we can use a scalar timestamp, because scalar timestamps
are set in UTC time. We’ll use an integer when the timestamp can be represented
as whole seconds, and a floating point number otherwise.
First we add a condition to only use RFC3339 strings when the timestamp has a
timezone that’s not UTC:
func (e *Encoder) writeTime(v reflect.Value) error {
    var t = v.Interface().(time.Time)
    if t.Location() != time.UTC && t.Location() != nil {
        if err := e.writeHeader(majorTag, minorTimeString); err != nil {
            return err
        }
        return e.writeUnicodeString(t.Format(time.RFC3339))
    }
    return ErrNotImplemented
}
Because we are changing the behavior of writeTime when the timezone is UTC, we
have to fix the first test case to use a timestamp with a non-UTC timezone set,
otherwise the test will fail with ErrNotImplemented. We replace the Z
—a shortcut for the UTC timezone— at the end of rfc3339Timestamp with +07:00:
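Assuming the test keeps the timestamp in a plain string variable, the updated value would be:
var rfc3339Timestamp = "2013-03-21T20:04:00+07:00"
The expected output changes too: the string is now 25 bytes long, so its text header grows accordingly.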
Note that we have to call the .UTC() method on the time.Time object returned
by time.Unix; otherwise the object would have the computer’s local timezone
associated with it, and a call to the UTC method gets us a UTC timestamp.
Since time.Time stores its internal time as an integer counting the number of
nanoseconds since the Epoch, we’ll have to convert it into a floating point
number in seconds before writing it. To do this we define a constant to convert
from nanoseconds to seconds using the time module’s units:
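const nanoSecondsInSecond = time.Second / time.Nanosecond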
Then we add the code after the block to handle string timestamps. We write the
header with minorTimeEpoch as its sub-type to indicate we have a scalar
timestamp, then write the converted value as a floating point number:
func (e *Encoder) writeTime(v reflect.Value) error {
    var t = v.Interface().(time.Time)
    if t.Location() != time.UTC && t.Location() != nil {
        if err := e.writeHeader(majorTag, minorTimeString); err != nil {
            return err
        }
        return e.writeUnicodeString(t.Format(time.RFC3339))
    }
    // write an epoch timestamp to preserve space
    if err := e.writeHeader(majorTag, minorTimeEpoch); err != nil {
        return err
    }
    var unixTimeNano = t.UnixNano()
    return e.writeFloat(
        float64(unixTimeNano) / float64(nanoSecondsInSecond))
}
If the timestamp in seconds is an integer number we can write it as an integer
timestamp without losing precision. Integers are usually more compact than
floating point numbers, so we’ll use them whenever possible. Another test case
from the spec makes it into cbor_test.go:
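The spec’s example here is 1(1363896240), which encodes as 0xc1 followed by the 32 bits unsigned integer 0x514b67b0. A sketch of the test, with the same assumed scaffolding as before:
func TestWriteTimeEpoch(t *testing.T) {
    var buf bytes.Buffer
    var ts = time.Unix(1363896240, 0).UTC()
    if err := NewEncoder(&buf).Encode(ts); err != nil {
        t.Fatal(err)
    }
    // 0xc1 is tag 1, 0x1a introduces a 32 bits unsigned integer
    if expected := "\xc1\x1a\x51\x4b\x67\xb0"; buf.String() != expected {
        t.Errorf("got % x, want % x", buf.String(), expected)
    }
}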
To determine if we can write an integer timestamp we check if the fractional
part of the timestamp in seconds is zero. Then we convert unixTimeNano into
seconds, set the CBOR integer’s header major type depending on the timestamp’s
sign, and use writeInteger to write the timestamp:
const nanoSecondsInSecond = time.Second / time.Nanosecond

func (e *Encoder) writeTime(v reflect.Value) error {
    ...
    // write an epoch timestamp to preserve space
    if err := e.writeHeader(majorTag, minorTimeEpoch); err != nil {
        return err
    }
    var unixTimeNano = t.UnixNano()
    if unixTimeNano%int64(nanoSecondsInSecond) == 0 {
        var unixTime = unixTimeNano / int64(nanoSecondsInSecond)
        var sign byte = majorPositiveInteger
        if unixTime < 0 {
            sign = majorNegativeInteger
            unixTime = -unixTime
        }
        return e.writeInteger(sign, uint64(unixTime))
    } else {
        return e.writeFloat(
            float64(unixTimeNano) / float64(nanoSecondsInSecond))
    }
}
And that’s all we needed to do to support the non-native type time.Time!
We are done writing our CBOR encoder. If you would like to see other things
covered, feel free to reach me at henry@precheur.org.
Go CBOR encoder: Episode 10, special floating point numbers
This is a tutorial on how to write a CBOR encoder in
Go, where we’ll learn more about reflection and type
introspection.
Make sure to read the previous episodes; each episode builds on the previous
one:
In the previous episode we added floating point number support to our
encoder.
We minimized the size of the output without losing precision. There’s still room
for improvement though: we encode all regular floating point numbers as 16 bits
numbers when possible, but there are also special numbers in the IEEE 754
standard that can be packed more efficiently:
Subnormal numbers, also called denormal numbers, denormalized numbers, or
subnumbers. They include 0, which can’t be encoded accurately as a regular
floating point number
Infinities
Not a Number
The way the encoder works now, these special values are all encoded as 32 or 64
bits floats, and lots of them can be encoded as 16 bits numbers without
losing information.
We’ll start with infinite values, then not-a-number values, and finish with
subnormal numbers.
For infinite values, there are two types: positive and negative. The only thing
that changes between them is the sign bit; the exponent is all 1’s, and
the fractional part is all 0’s. Infinite values are easy to detect in Go with
the math.IsInf function. To detect infinite values we add an if block with
math.IsInf at the beginning of the writeFloat function, and write a 16 bits
float with all 1’s in the exponent and all 0’s in the fractional:
func (e *Encoder) writeFloat(input float64) error {
    if math.IsInf(input, 0) {
        return e.writeFloat16(math.Signbit(input), (1<<float16ExpBits)-1, 0)
    }
    ...
}
NaN, or Not a Number, is similar to infinities but has a varying fractional
part. The fractional part of a NaN carries some information; we’ll copy it as is
and just chop off the end, since all the important information is in the first
few bits. We add the following to the second switch statement in writeFloat:
func (e *Encoder) writeFloat(input float64) error {
    ...
    var (
        exp, frac = unpackFloat64(input)
    )
    ...
    switch {
    case math.IsNaN(input):
        return e.writeFloat16(math.Signbit(input), 1<<float16ExpBits-1, frac)
    ...
    }
}
And that’s all we need for Not a Number. To verify we implemented it correctly
we add the corresponding test cases from the CBOR spec in
cbor_test.go:
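From the spec: Infinity encodes as 0xf97c00, NaN as 0xf97e00, and -Infinity as 0xf9fc00. Assuming TestFloat is table-driven, the new entries would look like:
{math.Inf(1), "\xf9\x7c\x00"},
{math.NaN(), "\xf9\x7e\x00"},
{math.Inf(-1), "\xf9\xfc\x00"},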
We now store infinities and NaNs tightly, but here comes the hard part:
subnormal numbers. There’s a lot of bit fiddling ahead.
When an exponent’s binary value is all 0’s, it means we have a subnormal
number, and zero is a subnormal number. Zero needs a special representation
because it cannot be represented precisely when the fractional part is prefixed
by a 1 like with regular floating point numbers. Even if the fractional was all
zeros and the exponent very small, a regular floating point number can’t
precisely represent 0 because there’s always a 1 somewhere in the fractional
(like 0.000…01). Therefore we have subnormal numbers, which start with a 0
instead of a 1, to represent zero precisely and other very small numbers more
accurately.
Let’s start by encoding zero and negative zero efficiently. Negative zero is
zero with its sign bit set to one. Here are the two test cases we add to our
TestFloat test in cbor_test.go:
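From the spec, 0.0 encodes as 0xf90000 and -0.0 as 0xf98000; with the same assumed table layout:
{0.0, "\xf9\x00\x00"},
{math.Copysign(0, -1), "\xf9\x80\x00"},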
To get a negative zero in Go we have to use the math.Copysign function, because
the compiler turns the expression -0.0 into a positive zero. We turn the if
statement at the beginning into a switch, with an additional case to detect
zero, and encode it as a 16 bits float to preserve space:
func (e *Encoder) writeFloat(input float64) error {
    switch {
    case input == 0:
        return e.writeFloat16(math.Signbit(input), 0, 0)
    case math.IsInf(input, 0):
        ...
    }
    ...
}
We don’t check if the input equals -0 because -0 equals 0. Zeros are done!
What other numbers can we represent as subnormal numbers? Let’s learn more about
them and the difference with regular numbers. Here’s the formula for regular 16
bits floating point numbers:
(−1)^signbit × 2^(exponent − 15) × 1.significantbits₂
When we have 16 bits subnormal numbers the formula turns into:
(−1)^signbit × 2^(−14) × 0.significantbits₂
Regular numbers are prefixed with a 1 bit, and subnormal numbers start with a 0
bit. This means that by shifting the bits, we can represent regular
numbers with an exponent lower than -14 as subnormal numbers. We’ll use the
smallest 16 bits subnormal number, 5.960464477539063e-8, as an example. Its
regular floating point representation is:
2^(−24) × 1.0000000000₂
The fractional part is all zeros and the exponent is -24. How can we represent
it as a 16 bits floating point number when the exponent is set to -14 and can’t
be changed? We shift the fractional part to the right; it’s like lowering the
exponent by the same amount. Every time we shift the fractional part right by 1
bit, it’s equivalent to lowering the exponent by 1.
For our example we shift the fractional part by 10 bits, which is equivalent to
lowering the exponent by 10 to -24:
2^(−24) × 1.0000000000₂ = 2^(−14) × 0.0000000001₂
As long as we can shift the fractional part without dropping any 1’s
we can represent the number as a 16 bits float. In summary, to encode a value as
a 16 bits subnormal number we have to:
Verify the exponent and the number of trailing zeros are within the ranges
required to encode the input precisely
Add the implicit leading 1 at the head of the fractional, since a subnormal
number’s fractional doesn’t have a leading 1 like a regular number’s does
Shift the fractional part to match the number’s exponent
The smallest possible 16 bits subnormal number is one of the examples in the
CBOR spec. Let’s add it to the TestFloat test suite:
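A sketch of the row, with the spec’s expected encoding 0xf9 0x00 0x01:

{Value: 5.960464477539063e-8, Expected: []byte{0xf9, 0x00, 0x01}},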
To check if we have a number that can be encoded as a subnormal number we add a
predicate function subnumber() with two parameters: the exponent, and the number
of trailing zeros in the fractional part. It verifies that the exponent is
within the range of what’s representable by a subnormal number, and that we
don’t drop any 1 from the fractional when we cut it:
func subnumber(exp int, zeros int) bool {
var d = -exp + float16MinBias
var canFitFractional = d <= zeros-float64FracBits+float16FracBits
return d >= 0 && d <= float16FracBits && canFitFractional
}
Then we add a case statement at the beginning of the second switch, so that we
encode the value as a 16 bits subnormal number when possible, and fall back to
32 bits floats otherwise:
func (e *Encoder) writeFloat(input float64) error {
...
var (
exp, frac = unpackFloat64(input)
trailingZeros = bits.TrailingZeros64(frac)
)
if trailingZeros > float64FracBits {
trailingZeros = float64FracBits
}
switch {
...
case subnumber(exp, trailingZeros):
// this number can be encoded as 16 bits subnormal numbers
frac |= 1 << float64FracBits
frac >>= uint(-exp + float16MinBias)
return e.writeFloat16(math.Signbit(input), 0, frac)
case float64(float32(input)) == input:
...
}
}
Let’s take a closer look, step by step. When subnumber() matches, we build the
new fractional part by prefixing it with a 1; this is the implicit 1 prefix from
the regular number formula:
frac |= 1 << float64FracBits
Then we shift the fractional right by the difference between the number’s
exponent and the fixed exponent of 16 bits subnormal numbers, −14:
frac >>= uint(-exp + float16MinBias)
Finally we write the number as a 16 bits floating point with a zero exponent:
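It’s the same call we added in the subnormal case above:

return e.writeFloat16(math.Signbit(input), 0, frac)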
One last run of go test confirms that everything works. We now pack all special
float values tightly, and with the subnormal numbers optimization we just
implemented we also pack 2^10 numbers more efficiently as 16 bits
floats.
We successfully encoded one of the most complex types Go natively supports. Next
time we’ll implement a custom type: timestamps.
Check out the repository with the full code for this episode.
Go CBOR encoder: Episode 9, floating point numbers
This is a tutorial on how to write a CBOR encoder in
Go, where we’ll learn more about reflection and type
introspection.
Make sure to read the previous episodes; each episode builds on the previous
one:
Go only supports float32 & float64 natively. To support 16 bits numbers we will
build the 16 bits values ourselves. We’ll implement 32 & 64 bits floats first,
and then do the 16 bits numbers. We’ll minimize the size of the output by
encoding numbers as tightly as possible: this means we’ll use 64 bits numbers
only when smaller numbers would lose precision. We don’t want to lose
information or precision; the encoded numbers have to be exact.
As usual we take some examples from the CBOR spec and look for
numbers that can only be represented as 32 and 64 bits floats, then add a test
case for each. We find that 100,000.0 can be encoded exactly with a float32,
while 1.1 can only be represented by a float64.
We start with those two examples and add the new test:
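A sketch of the two rows, with the spec’s expected bytes (100,000.0 → 0xfa 0x47 0xc3 0x50 0x00 as a float32, and 1.1 → 0xfb 0x3f 0xf1 0x99 0x99 0x99 0x99 0x99 0x9a as a float64):

{Value: 100000.0, Expected: []byte{0xfa, 0x47, 0xc3, 0x50, 0x00}},
{Value: 1.1, Expected: []byte{0xfb, 0x3f, 0xf1, 0x99, 0x99, 0x99, 0x99, 0x99, 0x9a}},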
To decide whether to use float32 or float64 for a value we convert the value to
float32 and compare it to the original float64 value. If both values are the
same we can safely encode the number as a float32 without losing precision.
Let’s add a new function writeFloat to do that:
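A minimal sketch of what writeFloat could look like at this stage; I’m assuming constants minorFloat32 = 26 and minorFloat64 = 27, named after the minorFloat16 constant used further down:

func (e *Encoder) writeFloat(input float64) error {
    if float64(float32(input)) == input {
        // the round trip through float32 is lossless: 32 bits are enough
        if err := e.writeHeader(majorSimpleValue, minorFloat32); err != nil {
            return err
        }
        return binary.Write(e.w, binary.BigEndian, float32(input))
    }
    if err := e.writeHeader(majorSimpleValue, minorFloat64); err != nil {
        return err
    }
    return binary.Write(e.w, binary.BigEndian, input)
}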
We add writeFloat to our big switch statement on the input’s type:
switch x.Kind() {
...
case reflect.Float32, reflect.Float64:
return e.writeFloat(x.Float())
}
go test confirms TestFloat passes. We are done with 32 and 64 bits floats. The
first part was easy, but the second part won’t be this simple: there’s more work
ahead of us.
Next let’s add support for 16 bits floats. As mentioned before, Go doesn’t
support float16 natively, so we’ll generate the binary value ourselves. What
kind of numbers can we store in a 16 bits float? A 16 bits float looks like this:
SEEEEEFFFFFFFFFF
S is the sign bit, 0 for positive, 1 for negative. EEEEE is the 5 bits exponent,
and FFFFFFFFFF is the 10 bits fractional part.
According to the IEEE 754 spec the 5 bits exponent’s range is -14 to 15. If a
number’s exponent is within those limits we can encode it as a 16 bits float.
The 10 bits fractional is quite a bit smaller than the 23 bits of the 32 bits
floats’ fractional. We may lose precision when we chop off the end of a number’s
fractional part: if there’s a 1 anywhere in those dropped bits we lose
precision. To prevent this we will use a bit mask to ensure we’re not
dropping any bits. In summary we can encode a number as a 16 bits float if and
only if:
Its exponent is between -14 and 15
Its fractional part doesn’t have any 1’s after its 10th bit
Let’s write some code: first we break down our numbers into those 3 parts. We
add the function unpackFloat64 to decompose float64 into its sign bit, exponent,
and fractional part. We unpack 64 bits floats because it’s the type with the
highest precision we support, and all numbers can be represented as float64. We
also add constants at the top to use for bit mask and shifting operations:
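A sketch of the constants and of unpackFloat64 matching the description below; the mask constant names expMask and fracMask are my assumption:

const (
    float64FracBits = 52 // size of a float64's fractional part
    float64ExpBias  = 1023
    float16FracBits = 10 // size of a float16's fractional part
    float16ExpBits  = 5
    float16MinBias  = -14
    expMask         = 1<<11 - 1              // the 11 exponent bits of a float64
    fracMask        = 1<<float64FracBits - 1 // the 52 fractional bits
)

// unpackFloat64 decomposes a float64 into its unbiased exponent and its
// fractional part; the sign is read separately with math.Signbit
func unpackFloat64(input float64) (exp int, frac uint64) {
    r := math.Float64bits(input)
    exp = int(r>>float64FracBits&expMask) - float64ExpBias
    frac = r & fracMask
    return exp, frac
}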
math.Float64bits converts the floating number to a uint64 type containing
float64’s raw binary value. We then extract the exponent by shifting r by
float64FracBits and mask it with expMask to trim off the bit sign. The result is
converted to an integer and we subtract the exponent’s bias from it to get the
real exponent value. The fractional part is extracted with a bit mask.
We’ll refactor writeFloat to use unpackFloat64, and use a bit mask to
determine what type we should use. The exponent range of 32 bits floats is -126
to 127, and we need at least float32MinZeros = 52 - 23 = 29 trailing zeros at
the end of the 64 bits fractional part so no 1’s are dropped when it’s cut down
to the 23 bits of a float32.
We use a switch case with the smallest type first and use float64 only if
float16 and float32 don’t work:
go test still passes, but only because we haven’t added any test to verify that
16 bits floats work. Let’s add support for 16 bits floats, along with a test
case. 1.0 is an easy number to represent with 16 bits, so we start with that:
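The spec encodes 1.0 as 0xf9 0x3c 0x00, so the row would look something like:

{Value: 1.0, Expected: []byte{0xf9, 0x3c, 0x00}},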
To write 16 bits floats we add a new method writeFloat16 that takes all three
parameters needed to build a 16 bits float: sign bit, exponent, and fractional.
We turn them into a single 16 bits integer, and write the value to the output:
func (e *Encoder) writeFloat16(negative bool, exp uint16, frac uint64) error {
if err := e.writeHeader(majorSimpleValue, minorFloat16); err != nil {
return err
}
var output uint16
if negative {
output = 1 << 15 // set sign bit
}
output |= exp << float16FracBits
output |= uint16(frac >> (float64FracBits - float16FracBits))
return binary.Write(e.w, binary.BigEndian, output)
}
Finally we hook up writeFloat16 to writeFloat with a switch case. We check the
exponent’s range and that we’re not dropping any 1’s at the end of the
fractional for float16 and float32; if none match we fall back to float64:
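A sketch of the assembled writeFloat under the same assumptions as before (minorFloat32 = 26, minorFloat64 = 27, and the unpackFloat64 helper); the repository’s version may organize the conditions differently:

func (e *Encoder) writeFloat(input float64) error {
    var (
        exp, frac     = unpackFloat64(input)
        trailingZeros = bits.TrailingZeros64(frac)
    )
    if trailingZeros > float64FracBits {
        trailingZeros = float64FracBits
    }
    switch {
    case exp >= -14 && exp <= 15 && trailingZeros >= float64FracBits-float16FracBits:
        // the exponent fits in 5 bits and no 1's are dropped from the
        // fractional: 16 bits are enough
        return e.writeFloat16(math.Signbit(input), uint16(exp+15), frac)
    case float64(float32(input)) == input:
        // the round trip through float32 is lossless: 32 bits are enough
        if err := e.writeHeader(majorSimpleValue, minorFloat32); err != nil {
            return err
        }
        return binary.Write(e.w, binary.BigEndian, float32(input))
    default:
        if err := e.writeHeader(majorSimpleValue, minorFloat64); err != nil {
            return err
        }
        return binary.Write(e.w, binary.BigEndian, input)
    }
}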
Our encoder handles float16, so we’ve covered all 3 floating point number types.
It looks like we’re done with floats, but there are still more cases and special
numbers we have to take care of. In the next episode we’ll add support for more
special numbers: Zero, Infinity, Not A Number, and subnormal numbers.
Check out the repository with the full code for this episode.
3 years ago I opened up about an intimate and controversial subject: the
font I use in my terminal and text editor, the superb
Terminus.
My old monitor was a 24” 1080p monitor with a pixel pitch of 0.2715mm: it had
fat pixels. Terminus looked sharper on it than outline fonts because the
monitor’s pixels were so big, and I liked how Terminus popped out with its sharp
edges. I tried to use outline fonts like Source Code Pro, but they didn’t
look as sharp and defined under either Linux or Windows. I played with the font
hinting, but it didn’t improve the outline fonts’ look enough to match the good
old Terminus’ sharpness. I stuck with my bitmap font of choice for over a
decade.
A year ago I got a bigger 27” monitor with a 2560 × 1440 resolution and a pixel
pitch of 0.233mm. The pixels on it are 15% smaller than on my old monitor, and
it’s noticeable: everything looks smaller and sharper on this screen.
I tried outline fonts again to see if the finer pixels meant I could ditch
my bitmap font and use a smoother outline font. I tried Source Code Pro and Deja
Vu Mono. They looked better than on my old monitor: the smaller pitch helped of
course, and the new screen is generally better, so the fonts didn’t ‘bleed’ as
much. The improvement wasn’t enough to make me stick with outline fonts on the
new monitor: Terminus still looked sharper when compared side by side. I
eventually switched back to Terminus; it is still the font that looks best to me
on that new screen.
When I get a monitor with a pixel pitch below 0.2mm I may give outline fonts
another try, but for now I’ll stick with my favorite bitmap font as I did for
the past 10 years.
To encode structs we’ll mimic what the standard JSON encoder does and
encode structs into maps of strings to values. For example if we pass the
following to the JSON encoder:
struct {
a int
b string
}{
a: 1,
b: "hello",
}
It outputs:
{"a": 1, "b": "hello"}
The struct kind is different from the map kind we implemented in the previous
episode: with structs the fields are ordered and the keys are always
strings. Because struct keys are strings, we can’t use all the examples from
the spec like we did with maps; we can only use the examples with
string-only keys. This leaves us with these three test cases:
On the flip side because the keys are ordered we don’t have to look for each
individual pair in the output like we did with maps. We can use the function
testEncoder as it is for our test. Let’s add TestStruct to
cbor_test.go:
func TestStruct(t *testing.T) {
var cases = []struct {
Value interface{}
Expected []byte
}{
{Value: struct{}{}, Expected: []byte{0xa0}},
{
Value: struct {
a int
b []int
}{a: 1, b: []int{2, 3}},
Expected: []byte{
0xa2, 0x61, 0x61, 0x01, 0x61, 0x62, 0x82, 0x02, 0x03,
},
},
{
Value: struct {
a string
b string
c string
d string
e string
}{"A", "B", "C", "D", "E"},
Expected: []byte{
0xa5, 0x61, 0x61, 0x61, 0x41, 0x61, 0x62, 0x61, 0x42, 0x61,
0x63, 0x61, 0x43, 0x61, 0x64, 0x61, 0x44, 0x61, 0x65, 0x61,
0x45,
},
},
}
for _, c := range cases {
t.Run(fmt.Sprintf("%v", c.Value), func(t *testing.T) {
testEncoder(t, c.Value, c.Expected)
})
}
}
To encode struct we’ll iterate over the fields of the struct with an index using
Value.NumField and Value.Field, like this:
var v = reflect.ValueOf(struct {
AKey string
BKey string
}{AKey: "a value", BKey: "b value"})
for i := 0; i < v.NumField(); i++ {
fmt.Println(v.Field(i))
}
This prints:
a value
b value
We have the fields’ values, but we still need their names to write the map. The
fields’ names aren’t stored in the value itself; they are stored in its
type. We’ll use v.Type().Field() to get a StructField with the name of
this particular field. For instance, if we added the following at the end of the
listing above:
for i := 0; i < v.NumField(); i++ {
fmt.Println(v.Type().Field(i).Name)
}
We’d get the names of each field printed at the end:
AKey
BKey
Let’s assemble all that into a new function writeStruct in cbor.go.
writeUnicodeString writes the keys, and we encode the value recursively with the
encode() method:
func (e *Encoder) writeStruct(v reflect.Value) error {
if err := e.writeInteger(majorMap, uint64(v.NumField())); err != nil {
return err
}
// Iterate over each field and write its key & value
for i := 0; i < v.NumField(); i++ {
if err := e.writeUnicodeString(v.Type().Field(i).Name); err != nil {
return err
}
if err := e.encode(v.Field(i)); err != nil {
return err
}
}
return nil
}
We add a call to writeStruct in the main switch statement:
case reflect.Struct:
return e.writeStruct(x)
A quick run of go test confirms everything works as intended:
$ go test -v
...
--- PASS: TestStruct (0.00s)
--- PASS: TestStruct/{} (0.00s)
--- PASS: TestStruct/{1_[2_3]} (0.00s)
--- PASS: TestStruct/{A_B_C_D_E} (0.00s)
PASS
ok
Basic structs work, but we aren’t done yet. We’ll extend support for structs by
mimicking the standard JSON encoder and adding support for struct tagging.
Here’s a summary of the options the JSON encoder supports:
// Field appears in JSON as key "myName".
Field int `json:"myName"`
// Field appears in JSON as key "myName" and
// the field is omitted from the object if its value is empty[...]
Field int `json:"myName,omitempty"`
// Field appears in JSON as key "Field" (the default), but
// the field is skipped if empty.
// Note the leading comma.
Field int `json:",omitempty"`
// Field is ignored by this package.
Field int `json:"-"`
// Field appears in JSON as key "-".
Field int `json:"-,"`
We’ll implement these features with the cbor tag instead of json, like this:
Field int `cbor:"name,omitempty"`
Let’s write a test with the feature we want to verify, we’ll re-use this example
from the CBOR spec:
{"a": 1, "b": [2, 3]}
In TestStructTag we call testEncoder with a tagged struct and
check the output. AField & BField have the names a & b respectively, while all
the other fields must be ignored:
func TestStructTag(t *testing.T) {
testEncoder(t,
struct {
AField int `cbor:"a"`
BField []int `cbor:"b"`
Omit1 int `cbor:"c,omitempty"`
Omit2 int `cbor:",omitempty"`
Ignore int `cbor:"-"`
}{AField: 1, BField: []int{2, 3}, Ignore: 12345},
[]byte{0xa2, 0x61, 0x61, 0x01, 0x61, 0x62, 0x82, 0x02, 0x03},
)
}
If we run TestStructTag now the struct won’t be encoded correctly: every field
will be in the output and the first two fields won’t have the right key.
The encoding/json package implements best-in-class tagging: we are going to
steal some of its code to save time. Why write something new when we have
some battle-tested code available?
We’ll copy encoding/json/tags.go into our project and we’ll add the function
isEmptyValue from encoding/json/encode.go to it. We’ll replace package
json with package cbor at the top to import the new code into our package.
The new file tags.go looks like this:
// Source: https://golang.org/src/encoding/json/tags.go
//
// Copyright 2011 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package cbor
import (
"reflect"
"strings"
)
// tagOptions is the string following a comma in a struct field's "json"
// tag, or the empty string. It does not include the leading comma.
type tagOptions string
// parseTag splits a struct field's json tag into its name and
// comma-separated options.
func parseTag(tag string) (string, tagOptions) {
if idx := strings.Index(tag, ","); idx != -1 {
return tag[:idx], tagOptions(tag[idx+1:])
}
return tag, tagOptions("")
}
// Contains reports whether a comma-separated list of options
// contains a particular substr flag. substr must be surrounded by a
// string boundary or commas.
func (o tagOptions) Contains(optionName string) bool {
if len(o) == 0 {
return false
}
s := string(o)
for s != "" {
var next string
i := strings.Index(s, ",")
if i >= 0 {
s, next = s[:i], s[i+1:]
}
if s == optionName {
return true
}
s = next
}
return false
}
// Source for isEmptyValue:
//
// https://golang.org/src/encoding/json/encode.go
func isEmptyValue(v reflect.Value) bool {
switch v.Kind() {
case reflect.Array, reflect.Map, reflect.Slice, reflect.String:
return v.Len() == 0
case reflect.Bool:
return !v.Bool()
case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64:
return v.Int() == 0
case reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64, reflect.Uintptr:
return v.Uint() == 0
case reflect.Float32, reflect.Float64:
return v.Float() == 0
case reflect.Interface, reflect.Ptr:
return v.IsNil()
}
return false
}
Copying code like this may be bad for long-term maintenance: if the Golang
developers fix something in the upstream code we won’t get the fix until we copy
it ourselves. It’s OK to do that for this exercise because we’re here to learn,
not to ship! Here’s what each function does:
parseTag returns the name and the options for the tag
tagOptions.Contains lets us query the omitempty option
isEmptyValue is used with the omitempty option
Let’s refactor writeStruct to handle tagging. We need to know how many elements
are in our map before we write the header. For example if we had a struct with 3
fields but one of them had the tag cbor:"-" indicating the field must be
ignored, the encoded map would only have 2 key-value pairs. Instead of iterating
and writing key-values on the fly, we’ll parse the fields first, and write the
result to the output second. We’ll build the list of fields to encode and then
write the encoded map from that list.
We define a new type fieldKeyValue to hold our key-value pairs, and iterate over
each field in the struct and skip the fields marked with a tag to ignore it.
Then we write the list of fields to the output. The new
writeStruct function looks like this:
func (e *Encoder) writeStruct(v reflect.Value) error {
type fieldKeyValue struct {
Name string
Value reflect.Value
}
var fields []fieldKeyValue
// Iterate over each field and add its key & value to fields
for i := 0; i < v.NumField(); i++ {
var fType = v.Type().Field(i)
var fValue = v.Field(i)
var tag = fType.Tag.Get("cbor")
if tag == "-" {
continue
}
name, opts := parseTag(tag)
// with the option omitempty skip the value if it's empty
if opts.Contains("omitempty") && isEmptyValue(fValue) {
continue
}
if name == "" {
name = fType.Name
}
fields = append(fields, fieldKeyValue{Name: name, Value: fValue})
}
// write map from fields
if err := e.writeInteger(majorMap, uint64(len(fields))); err != nil {
return err
}
for _, kv := range fields {
if err := e.writeUnicodeString(kv.Name); err != nil {
return err
}
if err := e.encode(kv.Value); err != nil {
return err
}
}
return nil
}
As you can see we get the information about each field’s tag via
fType.Tag.Get("cbor"). We skip the field if its tag is “-”, or if it has an
empty value and the “omitempty” option.
go test runs and confirms that struct tagging is implemented correctly.
Structs are done, and our encoder is getting closer to being usable by a third
party. We only have a few reflect.Kind’s left that need to be handled:
Float32
Float64
Complex64
Complex128
UnsafePointer
We’ll implement floating and complex numbers, and ignore the UnsafePointer kind
since we can’t reliably encode it. We’ll cover floating point numbers in the
next episode.
Check out the repository with the full code for this episode.
Go CBOR encoder: Episode 7, maps
This is a tutorial on how to write a CBOR encoder in Go, where we’ll learn
more about reflection and type introspection in Go.
Read the previous episodes, each episode builds on the previous one:
CBOR has a object or map type like JSON: it’s an ordered list of key/value
pairs. We’ll use it to encode two different kinds of Go types: maps and structs.
We’ll implement maps first and add support for structs in the next episode.
Major type 5: a map of pairs of data items. Maps are also called tables,
dictionaries, hashes, or objects (in JSON). A map is comprised of pairs of
data items, each pair consisting of a key that is immediately followed by a
value. The map’s length follows the rules for byte strings (major type 2),
except that the length denotes the number of pairs, not the length in bytes
that the map takes up. For example, a map that contains 9 pairs would have an
initial byte of 0b101_01001 (major type of 5, additional information of 9 for
the number of pairs) followed by the 18 remaining items. The first item is
the first key, the second item is the first value, the third item is the
second key, and so on. […]
As usual we’ll use the examples from the CBOR RFC to write the tests.
Maps are challenging to test because CBOR maps are ordered while Go maps aren’t.
The order in which Go map keys are returned is unspecified according to
Value.MapKeys’s documentation:
MapKeys returns a slice containing all the keys present in the map, in
unspecified order.
This means testEncoder cannot verify maps with more than one key/value pair in
them, because it expects a unique result. Consider this map:
{1: 2, 3: 4}
There are multiple valid CBOR encodings for this map, because Go maps’ items can
be in any order. With the example above the first key could either be 1 or 3.
We’ll have to check the different possibilities in the tests. For example from
the CBOR spec we see that:
{1: 2, 3: 4}
Turns into:
0xa201020304
Here’s the breakdown of the output:
0xa2 → header for a map of 2 pairs
0x01 → first key: 1
0x02 → first value: 2
0x03 → second key: 3
0x04 → second value: 4
Because the map has two elements there’s another valid CBOR encoding for it with
3 as the first key and 1 as the second key like this:
0xa2 → header for a map of 2 pairs
0x03 → first key: 3
0x04 → first value: 4
0x01 → second key: 1
0x02 → second value: 2
Our tests will have to handle unordered keys in the output. To verify the
results we will search for every individual key/value pair in the encoded map,
to ensure all the values are there regardless of their order.
We don’t need to worry about this for maps with fewer than two entries, and we
have two examples from the CBOR spec like that:
The empty map: {}
Array containing a map with a single element: ["a", {"b": "c"}]
We’ll use testEncoder for those two cases. Let’s add a new test function TestMap
in cbor_test.go with two subtests for the easy examples:
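A sketch of the two subtests, with the spec’s expected bytes ({} → 0xa0, and ["a", {"b": "c"}] → 0x82 0x61 0x61 0xa1 0x61 0x62 0x61 0x63); I’m assuming testEncoder still has the (t, value, err, expected) signature from the earlier episodes:

func TestMap(t *testing.T) {
    t.Run("empty map", func(t *testing.T) {
        testEncoder(t, map[int]int{}, nil, []byte{0xa0})
    })
    t.Run("array with one map", func(t *testing.T) {
        testEncoder(t,
            []interface{}{"a", map[string]string{"b": "c"}},
            nil,
            []byte{0x82, 0x61, 0x61, 0xa1, 0x61, 0x62, 0x61, 0x63},
        )
    })
}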
Now we’ll add what’s needed for multi-item maps: we verify the header’s major
type and the map length, then search for all the key-value pairs in the output.
The test cases we’ll use come from the spec’s multi-item map examples, shown
below.
To verify unordered maps the test needs the list of encoded key-value pairs. In
our previous tests the test cases were stored in a structure like this:
struct {
Value interface{}
Expected []byte
}
We’ll change it to hold what we need to verify the map: we’ll turn Expected from
a slice of bytes into a slice of slices of bytes. The length of Expected is the
size of the map, and the items in Expected are the encoded key-value pairs to
look up in the result:
struct {
Value interface{}
Expected [][]byte
}
We add the new test cases and the code to check the result in the TestMap
function:
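A sketch of the table, using the spec’s multi-item map examples; the repository’s exact cases may differ:

var cases = []struct {
    Value    interface{}
    Expected [][]byte
}{
    {
        Value:    map[int]int{1: 2, 3: 4},
        Expected: [][]byte{{0x01, 0x02}, {0x03, 0x04}},
    },
    {
        Value: map[string]interface{}{"a": 1, "b": []int{2, 3}},
        Expected: [][]byte{
            {0x61, 0x61, 0x01},
            {0x61, 0x62, 0x82, 0x02, 0x03},
        },
    },
}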
For each case the test extracts the major type and the length of the map from
the header using a bit mask, then verifies their values. Our test cases have at
most 23 elements in them, so the header is only one byte. If we had a case
with more than 23 elements we would have to change the code accordingly.
Let’s add the loop to iterate over the cases and verify what’s in the header:
for _, c := range cases {
t.Run(fmt.Sprintf("%v", c.Value), func(t *testing.T) {
var buffer bytes.Buffer
if err := NewEncoder(&buffer).Encode(c.Value); err != nil {
t.Fatalf("err: %#v != nil with %#v", err, c.Value)
}
var (
header = buffer.Bytes()[0]
result = buffer.Bytes()[1:]
lengthMask = ^uint8(0) >> 3 // bit mask to extract the length
length = header & lengthMask
)
if header>>5 != majorMap {
t.Fatalf("invalid major type: %#v", header)
}
if int(length) != len(c.Expected) {
t.Fatalf("invalid length: %#v != %#v", length, len(c.Expected))
}
})
}
We haven’t verified the map’s content yet, so let’s add that: we search for each
pair in the encoder’s output, then remove it from the output. Once we’re done
verifying all the key-values, we check that the slice is empty to ensure there’s
nothing left over in the output. We add that code at the end of the loop:
for _, c := range cases {
t.Run(fmt.Sprintf("%v", c.Value), func(t *testing.T) {
...
// Iterate over the key/values we expect in the map
for _, kv := range c.Expected {
if !bytes.Contains(result, kv) {
t.Fatalf("key/value %#v not found in result", kv)
}
// remove the value from the result
result = bytes.Replace(result, kv, []byte{}, 1)
}
// ensure we got everything in the map
if len(result) > 0 {
t.Fatalf("leftover in result: %#v", result)
}
})
}
Tests are done, now let’s get them passing. To encode the map we’ll
write its size in the header, then recursively encode each key followed by its
value. Then we’ll add a case clause matching reflect.Map in encode’s switch
statement and call writeMap from it:
const majorMap = 5
...
func (e *Encoder) writeMap(v reflect.Value) error {
if err := e.writeInteger(majorMap, uint64(v.Len())); err != nil {
return err
}
for _, key := range v.MapKeys() {
if err := e.encode(key); err != nil {
return err
}
if err := e.encode(v.MapIndex(key)); err != nil {
return err
}
}
return nil
}
func (e *Encoder) encode(x reflect.Value) error {
switch x.Kind() {
...
case reflect.Map:
return e.writeMap(x)
}
}
As you can see we didn’t have to add much code to encode maps; the real
challenge was the tests. Implementation was easy this time, but it won’t be next
time: we’ll work with structs in the next episode and it’ll be a big one.
Check out the repository with the full code for this episode.
Go CBOR encoder: Episode 6, negative integers and arrays
This is a tutorial on how to write a CBOR encoder in Go. Its goal is to teach
reflection and type introspection. I recommend you read the previous
episodes before jumping into this one:
Our CBOR encoder only accepts unsigned integers at the moment; to
support all integer types we have to handle negative numbers. Negative number
encoding is similar to positive number encoding, but with a different major
type. The spec says:
Major type 1: a negative integer. The encoding follows the rules for unsigned
integers (major type 0), except that the value is then -1 minus the encoded
unsigned integer. For example, the integer -500 would be 0b001_11001 (major
type 1, additional information 25) followed by the two bytes 0x01f3, which is
499 in decimal.
As usual the tests come first: we re-use the examples from the CBOR
specification to add TestNegativeIntegers in cbor_test.go:
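A sketch of TestNegativeIntegers with the spec’s examples (-1 → 0x20, -10 → 0x29, -100 → 0x38 0x63, -1000 → 0x39 0x03 0xe7), assuming testEncoder still takes (t, value, err, expected):

func TestNegativeIntegers(t *testing.T) {
    var cases = []struct {
        Value    int64
        Expected []byte
    }{
        {Value: -1, Expected: []byte{0x20}},
        {Value: -10, Expected: []byte{0x29}},
        {Value: -100, Expected: []byte{0x38, 0x63}},
        {Value: -1000, Expected: []byte{0x39, 0x03, 0xe7}},
    }
    for _, c := range cases {
        t.Run(fmt.Sprintf("%d", c.Value), func(t *testing.T) {
            testEncoder(t, c.Value, nil, c.Expected)
        })
    }
}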
For the encoder to recognize all integer types we add a new case
clause in Encode()’s switch statement with the additional integer
kinds like reflect.Int. It checks the sign of the integer: if the integer is
positive we write it as a positive number; if it’s negative we turn it into an
unsigned integer using the formula -(x+1), and we write that number to the
output:
const majorNegativeInteger = 1
...
func (e *Encoder) encode(x reflect.Value) error {
...
case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64:
var i = x.Int()
if i < 0 {
return e.writeInteger(majorNegativeInteger, uint64(-(i + 1)))
} else {
return e.writeInteger(majorPositiveInteger, uint64(i))
}
...
}
8 lines of code were all we needed to support all integer types. That was
easy; now we move on to something harder: arrays.
Arrays are the first composite type we add to the encoder. An array is a
list of objects; it can contain any type of object, like a JSON array:
[null, true, 1, "hello"]
Arrays have their own major type according to the spec:
Major type 4: an array of data items. Arrays are also called lists, sequences,
or tuples. The array’s length follows the rules for byte strings (major type
2), except that the length denotes the number of data items, not the length in
bytes that the array takes up. Items in an array do not need to all be of the
same type. For example, an array that contains 10 items of any type would have
an initial byte of 0b100_01010 (major type of 4, additional information of 10
for the length) followed by the 10 remaining items.
Because arrays can contain any type we’ll have to recursively encode objects,
like we did in episode 4 with pointers.
Before we get started we’ll refactor how we recursively encode objects. Our
encoder works with reflect.Value but the Encode() method takes an interface{}
not a reflect.Value. When we call Encode() recursively we convert the
reflect.Value into an interface which is then converted back into a
reflect.Value. Those conversions aren’t efficient, so we’ll move all the code in
the Encode() method into a new method called encode() —all lowercase— that takes
a reflect.Value as parameter. Encode() is now just a call to this new method:
func (e *Encoder) Encode(v interface{}) error {
return e.encode(reflect.ValueOf(v))
}
func (e *Encoder) encode(x reflect.Value) error {
switch x.Kind() {
...
case reflect.Ptr:
if x.IsNil() {
return e.writeHeader(majorSimpleValue, simpleValueNil)
} else {
// this replaces e.Encode(reflect.Indirect(x).Interface())
return e.encode(reflect.Indirect(x))
}
...
}
return ErrNotImplemented
}
With this small refactoring done, let’s add tests based on the
examples from the CBOR specification to our test suite; we have five
cases:
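A sketch of TestArray with three of the spec’s examples (the episode uses five cases), assuming the four-argument testEncoder:

func TestArray(t *testing.T) {
    var cases = []struct {
        Value    interface{}
        Expected []byte
    }{
        {Value: []int{}, Expected: []byte{0x80}},
        {Value: []int{1, 2, 3}, Expected: []byte{0x83, 0x01, 0x02, 0x03}},
        {
            Value: []interface{}{1, []int{2, 3}, []int{4, 5}},
            Expected: []byte{
                0x83, 0x01, 0x82, 0x02, 0x03, 0x82, 0x04, 0x05,
            },
        },
    }
    for _, c := range cases {
        t.Run(fmt.Sprintf("%v", c.Value), func(t *testing.T) {
            testEncoder(t, c.Value, nil, c.Expected)
        })
    }
}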
To get the tests to pass we have to match all array and slice types, except byte
arrays and byte slices. We already matched arrays and slices in the previous
episode when we implemented byte strings.
When we have an array-like object to encode, we pass it to a new method
writeArray. It writes the header with the length of the array, then iterates
over the array’s elements and encodes them recursively. To iterate over the
array all we need are the methods reflect.Value.Len and
reflect.Value.Index: we write a simple for loop and retrieve each item
with v.Index(i):
majorArray = 4
...
func (e *Encoder) writeArray(v reflect.Value) error {
if err := e.writeInteger(majorArray, uint64(v.Len())); err != nil {
return err
}
for i := 0; i < v.Len(); i++ {
if err := e.encode(v.Index(i)); err != nil {
return err
}
}
return nil
}
Let’s hook up writeArray to the main switch statement in encode(). We want to
match arrays and slices not made of bytes. To achieve this we just need to add a
call to writeArray after the if statement that checks if we have a byte string
in the reflect.Slice case clause. We literally add a single line to
cbor.go:
func (e *Encoder) encode(x reflect.Value) error {
switch x.Kind() {
...
case reflect.Array:
// Create slice from array
var n = reflect.New(x.Type())
n.Elem().Set(x)
x = reflect.Indirect(n).Slice(0, x.Len())
fallthrough
case reflect.Slice:
if x.Type().Elem().Kind() == reflect.Uint8 {
return e.writeByteString(x.Bytes())
}
// We don’t have a byte string, therefore we have an array
return e.writeArray(x)
...
}
return ErrNotImplemented
}
TestArray successfully runs: we are done with arrays. Check out the
repository with the full code for this episode.
With the addition of arrays our encoder can now encode complex data structures.
We’re about to make it even better and dive deeper into Go reflection with the
next major type: maps. See you in the next episode.
Go CBOR encoder: Episode 5, strings: bytes & unicode characters
CBOR strings are more complex than the types we already implemented: they come
in two flavors, byte strings and unicode strings. Byte strings are
meant to encode binary content like images, while unicode strings are for
human-readable text.
We’ll start with byte string, here’s what the spec says:
Major type 2: a byte string. The string’s length in bytes is represented
following the rules for positive integers (major type 0). For example, a byte
string whose length is 5 would have an initial byte of 0b010_00101 (major type
2, additional information 5 for the length), followed by 5 bytes of binary
content. A byte string whose length is 500 would have 3 initial bytes of
0b010_11001 (major type 2, additional information 25 to indicate a two-byte
length) followed by the two bytes 0x01f4 for a length of 500, followed by 500
bytes of binary content.
If we encoded the 5 bytes “hello” as a CBOR byte string we’d have something like
this:
0x45 // header for byte string of size 5: (2 << 5) | 5 → 0x45
0x68 0x65 0x6C 0x6C 0x6f // The 5 byte string hello
To encode byte strings we’ll encode a regular CBOR integer with major type 2,
and then write the byte string itself right after. The header holds the type
and the size of the string as a positive integer —we implemented this in
episode 3— and it’s followed by as many bytes of data as the integer indicates.
Before we can write this special header we’ll change the writeInteger function
we wrote in episode 3 to add a parameter for the major type, so it is now
configurable by the caller, and we modify the call to writeInteger() in Encode()
to match:
func (e *Encoder) writeInteger(major byte, i uint64) error {
switch {
case i <= 23:
return e.writeHeader(major, byte(i))
case i <= 0xff:
return e.writeHeaderInteger(major, minorPositiveInt8, uint8(i))
case i <= 0xffff:
return e.writeHeaderInteger(major, minorPositiveInt16, uint16(i))
case i <= 0xffffffff:
return e.writeHeaderInteger(major, minorPositiveInt32, uint32(i))
default:
return e.writeHeaderInteger(major, minorPositiveInt64, uint64(i))
}
}
...
case reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64:
// we pass major type to writeInteger
return e.writeInteger(majorPositiveInteger, x.Uint())
This change will be useful later when we implement more complex types: it’s
common to write an integer with the header for variable sized CBOR types.
Now we have to figure out how to match byte string types with reflect. Go
has two distinct types that match CBOR byte strings: byte slices and byte
arrays. If you don’t know the difference between a slice and an array, I
recommend the splendid article from the Golang blog: Go Slices: usage and
internals. We’ll focus on slices first, then arrays.
We start by adding tests based on the examples from the CBOR spec:
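A sketch of the slice subtests, using the spec’s byte string examples (h'' → 0x40 and h'01020304' → 0x44 0x01 0x02 0x03 0x04):

func TestByteString(t *testing.T) {
    t.Run("empty", func(t *testing.T) {
        testEncoder(t, []byte{}, nil, []byte{0x40})
    })
    t.Run("bytes", func(t *testing.T) {
        testEncoder(t, []byte{1, 2, 3, 4}, nil, []byte{0x44, 1, 2, 3, 4})
    })
}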
Slices have their own reflect kind: reflect.Slice. We only handle slices of
bytes, so we’ll have to check the slice elements’ type like this:
var exampleSlice = reflect.ValueOf([]byte{1, 2, 3})
if exampleSlice.Type().Elem().Kind() == reflect.Uint8 {
fmt.Println("Slice of bytes")
}
We use reflect.Uint8 in the if clause, because the byte type is an alias to
uint8 in Go.
We add another case clause in Encode’s switch statement for slices and we check
the slice’s elements’ type like this:
case reflect.Slice:
if x.Type().Elem().Kind() == reflect.Uint8 {
// byte string
}
Now all we have left to do is write the header and the byte string into the
output, we’ll add the writeByteString method to tuck all the boilerplate code
away from our main switch statement:
// we add the major type for byte string
majorByteString = 2
...
func (e *Encoder) writeByteString(s []byte) error {
if err := e.writeInteger(majorByteString, uint64(len(s))); err != nil {
return err
}
_, err := e.w.Write(s)
return err
}
... In Encode() ...
case reflect.Slice:
if x.Type().Elem().Kind() == reflect.Uint8 {
return e.writeByteString(x.Bytes())
}
A quick run of go test confirms byte slices work, but we’re not done with byte
strings yet: we still have to handle arrays. It’s easier to work with slices in
general, so we’ll convert arrays to slices to avoid writing array-specific code
and re-use what we just wrote. We add the following code to our existing test
TestByteString:
// for arrays
t.Run("array", func(t *testing.T) {
a := [...]byte{1, 2}
testEncoder(t, &a, nil, []byte{0x42, 1, 2})
})
Let’s add another case clause right before the case clause matching
reflect.Slice:
case reflect.Array:
// turn x into a slice
x = x.Slice(0, x.Len())
fallthrough
case reflect.Slice:
...
We create a slice from our backing array with Value.Slice(), then
we run the tests and we get a surprise:
$ go test -v .
...
=== RUN TestByteString/array
panic: reflect.Value.Slice: slice of unaddressable array [recovered]
panic: reflect.Value.Slice: slice of unaddressable array
...
It turns out we have an “unaddressable” array, and we cannot create a slice on
it with Value.Slice() according to the doc. How are we going to
get out of this? reflect doesn’t let us reference the array directly, we need to
turn the array into something addressable: a pointer to the array. We create a
pointer to it with reflect.New, then we use the pointer with
reflect.Indirect to create our slice:
case reflect.Array:
// Create slice from array
var n = reflect.New(x.Type())
n.Elem().Set(x)
x = reflect.Indirect(n).Slice(0, x.Len())
fallthrough
case reflect.Slice:
...
A quick run of go test confirms this solved our issue with the unaddressable
array. All TestByteString tests now pass! We’re done with byte strings; unicode
strings are next.
Text strings are like byte strings with a different major type. We have the
header with the length of the string in bytes, and the data at the end. Text
data is encoded in UTF-8 —Go’s native string encoding— so there’s no need to
re-encode it: we can just write the string to the output as it is. Like we did
for byte strings we add examples from the CBOR spec in a new test called
TestUnicodeString:
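A sketch of TestUnicodeString with some of the spec’s text string examples:

func TestUnicodeString(t *testing.T) {
    var cases = []struct {
        Value    string
        Expected []byte
    }{
        {Value: "", Expected: []byte{0x60}},
        {Value: "a", Expected: []byte{0x61, 0x61}},
        {Value: "IETF", Expected: []byte{0x64, 0x49, 0x45, 0x54, 0x46}},
        {Value: "\u00fc", Expected: []byte{0x62, 0xc3, 0xbc}},
        {Value: "\u6c34", Expected: []byte{0x63, 0xe6, 0xb0, 0xb4}},
    }
    for _, c := range cases {
        t.Run(fmt.Sprintf("%q", c.Value), func(t *testing.T) {
            testEncoder(t, c.Value, nil, c.Expected)
        })
    }
}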
Go CBOR encoder: Episode 4, pointers
In the previous episode we encoded positive integers and learned how to
write a CBOR item with a variable size. Our CBOR encoder can now
encode nil, true, false, and unsigned integers. cbor.Encoder has grown strong,
but type switches have their limits, and we need more powerful weapons for the
battles ahead: we’re about to take on pointers, and reflect will be our
sword.
In the first episode of the series we encoded the nil value, since it
was the easiest value to start with, but we aren’t finished with nil: we
still have work to do to cover all cases. That’s because our encoder only
handles the “naked” nil value, not typed pointers that are nil. Whaaat? There
are two kinds of nil pointers? Yep, that’s because nil by itself is special.
Consider the following code:
var p *int = nil
var v interface{} = p
switch v.(type) {
case nil:
fmt.Println("nil")
case *int:
fmt.Println("int pointer")
}
The example above prints “int pointer”, because v isn’t a regular value but an
interface that points to an int pointer value. Go interfaces are pairs of 32 or
64 bits addresses: one for the type and one for the value. So in the type switch
above we match the *int case because p’s type is *int. If we replaced the v
definition with var v interface{} = nil, the program would print “nil”. That’s
because the type of a nil value is itself nil, but a typed pointer’s type isn’t.
Russ Cox’s article Go Data Structures: Interfaces is a superb
introduction to how Go interfaces work if you’d like to learn more.
Let’s exhibit the problem in our code and add a test for typed nil pointers:
func TestNilTyped(t *testing.T) {
var i *int = nil
testEncoder(t, i, nil, []byte{0xf6})
var v interface{} = nil
testEncoder(t, v, nil, []byte{0xf6})
}
And run our tests with go test to see what happens:
The *int(nil) value isn’t recognized. So why did plain nil work? Because it’s
special: both its type and its value are nil. The Encode function matches the
naked nil with the case nil statement in the type switch, which means only
interfaces with a nil type will be matched. Therefore the code only works with
the naked nil value, and not with typed pointers.
It turns out there’s a package to address that: reflect introspects the type
system and lets us match pointer types individually. The Laws of
Reflection is a great introduction to reflection and the use of
this package.
So we want to know if a value is a pointer. How does reflect help us? Consider
this snippet:
fmt.Println(reflect.ValueOf(nil).Kind())
var i *int = nil
fmt.Println(reflect.ValueOf(i).Kind())
It prints:
invalid
ptr
What happens here? First we convert each Go value to a reflect.Value,
then we query its type with the method Kind that returns a reflect.Kind
enumeration. reflect.Kind represents the specific kind of type that a Type
represents. Kinds are families of types. For example there is a kind for
structs —reflect.Struct—, for functions —reflect.Func—, and for pointers
—reflect.Ptr.
We see above that the naked nil value and a nil pointer to integer have
different kinds: invalid, and ptr. We’ll have to handle the two cases
separately.
Refactoring time! We replace the type switch with a switch statement on the Kind
of our value. In the example below x.Kind() allows us to distinguish types the
same way the type switch x.(type) did:
func (e *Encoder) Encode(v interface{}) error {
x := reflect.ValueOf(v)
switch x.Kind() {
case reflect.Invalid:
// naked nil value == invalid type
return e.writeHeader(majorSimpleValue, simpleValueNil)
case reflect.Bool:
var minor byte
if x.Bool() {
minor = simpleValueTrue
} else {
minor = simpleValueFalse
}
return e.writeHeader(majorSimpleValue, minor)
case reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64:
return e.writeInteger(x.Uint())
}
return ErrNotImplemented
}
To identify pointer types reflect has a Kind called reflect.Ptr. We add
another case clause for reflect.Ptr, and if the pointer is nil we write the
encoded nil value to the output:
case reflect.Ptr:
if x.IsNil() {
return e.writeHeader(majorSimpleValue, simpleValueNil)
}
After we add that, a quick run of go test confirms that TestNilTyped works.
Splendid! We solved nil pointers. How about non-nil pointers? They are
relatively easy to handle: if we detect a pointer we can fetch the value it
refers to via reflect.Indirect. So when we get a pointer we
get the value it references instead of the memory address. Here’s an example of
how reflect.Indirect works:
var i = 1
var p = &i
var reflectValue = reflect.ValueOf(p)
fmt.Println(reflectValue.Kind())
fmt.Println(reflect.Indirect(reflectValue).Kind())
It prints:
ptr
int
When we find a non-nil pointer type, we call the Indirect function to retrieve
the pointed-to value, and we recursively call the Encode method on that value.
We add a new test, TestPointer, that verifies pointer dereferencing works as
intended:
func TestPointer(t *testing.T) {
i := uint(10)
pi := &i // pi is a *uint
// should output the number 10
testEncoder(t, pi, nil, []byte{0x0a})
}
With our test written let’s add the code necessary to handle valid pointers in
our case clause:
case reflect.Ptr:
if x.IsNil() {
return e.writeHeader(majorSimpleValue, simpleValueNil)
} else {
return e.Encode(reflect.Indirect(x).Interface())
}
reflect.Indirect(x).Interface() retrieves an interface to x’s underlying value;
we pass it recursively to Encode and return the result. So if we passed a
pointer to a pointer to a pointer to an integer (***int) we’d have 3 recursive
calls to Encode. TestPointer now passes: we are done with pointers!
There’s a repository with the code for this episode.
The reflect package will help us to handle more complex types in subsequent
episodes. Next time we will encode string types: byte string, and Unicode
strings.
kgpdemux is a TCP demultiplexer that uses the KGP protocol. I wrote it about a
year ago as an experiment to use with Sauce Connect Proxy. It’s my best
example of how to use channels to implement complex control flows in Go
efficiently. Sources are on Bitbucket: https://bitbucket.org/henry/kgp/src
The juicy part is kgp.go,
where most of the concurrency is implemented.
That is all.
Go CBOR encoder: Episode 3, positive integers
In the previous episode we wrote a CBOR encoder that can handle the
values nil, true, and false. Next we’ll focus on positive integers.
To proceed we have to learn more about how values are encoded. A CBOR object’s
type is determined by the first 3 bits of its first byte. The first byte is
called the header: it describes the data type and tells the decoder how to
decode what follows. Sometimes the header contains data about the value in the
5 leftover bits, but most of the time it contains information about
the type.
For example: the encoded nil value is a single byte with the value 246, in
binary that’s 0b11110110. The first 3 bits are all 1’s, that’s 7 in decimal.
The nil value’s major type is 7, which corresponds to the “simple values”
major type. The last 5 bits are 0b10110, or 22 in decimal: that’s the additional
value identifying the type of the value, in our case nil. To summarize: the nil
value’s major type is 7, and the additional value 22 identifies it as nil.
Here’s how you’d reconstruct the header for nil from the major type and the
additional value:
byte(majorType << 5) | additionalValue
The booleans true and false have the same major type as nil: 7 and their
additional values are 20 and 21 respectively. We’d build booleans from their
major type and additional value like this:
fmt.Printf("%x\n", byte(7 << 5) | 20) // prints f4
fmt.Printf("%x\n", byte(7 << 5) | 21) // prints f5
Positive integers have their own major type: 0. With only 5 bits left in the
header, that’s not enough to encode values higher than 31, so integers’ encoding
is more complex than booleans’ and nil’s. The first 24 additional values are
reserved for the integers from 0 to 23; for integers bigger than 23 we have to
write extra bytes to the output to encode them. To indicate how much data is
needed to decode the integer we have the special additional values 24, 25, 26,
and 27; they correspond to 8, 16, 32, and 64 bits integers respectively.
For example to encode 500 we need to use at least a 2 bytes integer, because 500
is too much to be represented as a single byte. So the first byte would be major
type 0 and additional value 25 to tell the decoder: “hey, what follows is a two
byte positive integer”. The header would look like this: 0b000_11001, followed
by two byte 0x01 0xf4, that’s 500 encoded as a 16 bits big-endian integer.
Let’s start with the easy case: integers from 0 to 23. We add a method called
writeHeader to cbor.go that writes the single byte header to the output. To
avoid magic numbers all over our code we’ll also define constants for
the types we can encode thus far. We add the following to cbor.go:
const (
// major types
majorPositiveInteger = 0
majorSimpleValue = 7
// simple values == major type 7
simpleValueFalse = 20
simpleValueTrue = 21
simpleValueNil = 22
)
func (e *Encoder) writeHeader(major, minor byte) error {
h := byte((major << 5) | minor)
_, err := e.w.Write([]byte{h})
return err
}
We use writeHeader to clear the magic numbers we put in the Encode method from
the previous episodes. Our Encode method looks tighter now:
func (e *Encoder) Encode(v interface{}) error {
switch v.(type) {
case nil:
return e.writeHeader(majorSimpleValue, simpleValueNil)
case bool:
var minor byte
if v.(bool) {
minor = simpleValueTrue
} else {
minor = simpleValueFalse
}
return e.writeHeader(majorSimpleValue, minor)
}
return ErrNotImplemented
}
Our mini-refactoring is done; we check that everything is still working with go
test, and it does. Now that we’ve cleaned that up and verified it works,
we add tests for the small integers in cbor_test.go:
func TestIntSmall(t *testing.T) {
for i := 0; i <= 23; i++ {
testEncoder(t, uint64(i), nil, []byte{byte(i)})
}
}
We loop from 0 to 23, build our expected return value, and check that it
corresponds to what the encoder gives us: in this case a single byte with
major type 0 and our value i.
Some of you may have noticed that we turn our value i into a uint64 when we
pass it to testEncoder instead of a plain int. That’s because Go has different
integer types like uint64,
int16, and plain int; unfortunately all those types are distinct for the Go
type system and require extra code to handle. We will deal with the other
integer types later; for now we’ll stick to uint64.
Small integers are easy to implement: in Encode’s switch statement we add a
case uint64: clause, and if the integer is between 0 and 23 we output the
header with the right additional value, and that’s all:
case uint64:
var i = v.(uint64)
if i <= 23 {
return e.writeHeader(majorPositiveInteger, byte(i))
}
}
A quick run with go test confirms TestIntSmall works. Time to work on the
extended integers: as usual we’ll write the tests first. To get good coverage,
we’re going to copy the examples given in the appendix of the CBOR spec for
our tests.
We’ll use subtests to make it easier to track which test fails; subtests
allow you to define multiple sub-tests with different names inside a single
test function. Our subtests’ names will be the numbers we’re checking. For
example, to test the integer 10 we’d do something like this:
func TestExample(t *testing.T) {
t.Run(
"10", // name of the subtest
func(t *testing.T) { // function to execute
testEncoder(t, uint64(10), nil, []byte{0x0a})
},
)
}
When we run go test with this example we’ll have a test named
“TestExample/10”, we could add another call to t.Run() with the string “foo” as
name to create another subtest named “TestExample/foo”.
Let’s replace this example with real tests. We’ll use a table to store our test
cases, iterate over it, and verify each result. Our test values and expected
outputs are taken from the CBOR spec examples:
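A sketch of the table test (the name TestIntBig is my choice; the original may differ), with values and outputs from the spec:

func TestIntBig(t *testing.T) {
    var cases = []struct {
        Value    uint64
        Expected []byte
    }{
        {Value: 24, Expected: []byte{0x18, 0x18}},
        {Value: 25, Expected: []byte{0x18, 0x19}},
        {Value: 100, Expected: []byte{0x18, 0x64}},
        {Value: 1000, Expected: []byte{0x19, 0x03, 0xe8}},
        {Value: 1000000, Expected: []byte{0x1a, 0x00, 0x0f, 0x42, 0x40}},
        {Value: 1000000000000, Expected: []byte{
            0x1b, 0x00, 0x00, 0x00, 0xe8, 0xd4, 0xa5, 0x10, 0x00,
        }},
    }
    for _, c := range cases {
        t.Run(fmt.Sprintf("%d", c.Value), func(t *testing.T) {
            testEncoder(t, c.Value, nil, c.Expected)
        })
    }
}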
Big CBOR integers have 2 parts: a header to determine the type, followed by
the value encoded as a big endian integer. For example 25 is encoded as
0x1819, that’s 2 bytes: the header is 0x18, or 24 in decimal, which corresponds
to the 8 bits integer type. The second byte after the header is 0x19, or 25 in
decimal: the integer we encoded. To re-iterate: the header gives us the type of
the value, and the bytes following the header are the value being encoded.
The first thing we’ll do is add a helper function to write our native integers
as big endian integers. It takes an interface{} as parameter instead of an
integer because the package encoding/binary uses the type of the value it
writes to determine how much data to write. For example, passing the value 1
typed as a uint16 to binary.Write will output 2 bytes: 0x0001. This allows
us to cast our integer to the right type and encode a correctly sized integer
with binary.Write:
// writeHeaderInteger writes out a header created from major and minor magic
// numbers and write the value v as a big endian value
func (e *Encoder) writeHeaderInteger(major, minor byte, v interface{}) error {
if err := e.writeHeader(major, minor); err != nil {
return err
}
return binary.Write(e.w, binary.BigEndian, v)
}
We don’t want the big switch statement in the Encode method to become messy as
we’re adding more code, so we create a new method for our encoder: writeInteger
where we’ll put all the code to encode integers.
The writeInteger method encodes our single integer value and casts it to the
smallest integer type that can hold its value:
func (e *Encoder) writeInteger(i uint64) error {
switch {
case i <= 23:
return e.writeHeader(majorPositiveInteger, byte(i))
case i <= 0xff:
return e.writeHeaderInteger(
majorPositiveInteger, minorPositiveInt8, uint8(i),
)
case i <= 0xffff:
return e.writeHeaderInteger(
majorPositiveInteger, minorPositiveInt16, uint16(i),
)
case i <= 0xffffffff:
return e.writeHeaderInteger(
majorPositiveInteger, minorPositiveInt32, uint32(i),
)
default:
return e.writeHeaderInteger(
majorPositiveInteger, minorPositiveInt64, uint64(i),
)
}
}
As you can see we cast the value i into different integer types depending on how
big it is, to minimize the size of what we write to the output. The fewer bytes
we use, the better.
Encode now looks like this:
func (e *Encoder) Encode(v interface{}) error {
switch v.(type) {
case nil:
return e.writeHeader(majorSimpleValue, simpleValueNil)
case bool:
var minor byte
if v.(bool) {
minor = simpleValueTrue
} else {
minor = simpleValueFalse
}
return e.writeHeader(majorSimpleValue, minor)
case uint64:
return e.writeInteger(v.(uint64))
}
return ErrNotImplemented
}
Once we add this little bit of code our integer tests pass.
Let’s add the integer types we ignored thus far to be more exhaustive with what
our encoder supports:
case uint, uint8, uint16, uint32, uint64, int, int8, int16, int32, int64:
// v.(uint64) only matches values that really are uint64, so we
// convert the other integer types first (toUint64 is sketched below)
if i, ok := toUint64(v); ok {
return e.writeInteger(i)
}
Now we can pass a positive int, int8, int16, int32, or int64 and it will work.
We can’t handle negative numbers yet.
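A plain v.(uint64) assertion only succeeds when v actually holds a uint64, so the conversion needs a small helper; here’s a minimal sketch of what a toUint64 helper could look like (the name and shape are my assumption, not necessarily what the repository does):

// toUint64 converts any of Go's integer types to a uint64. The boolean
// result reports whether v was an integer with a non-negative value.
// (hypothetical helper, for illustration)
func toUint64(v interface{}) (uint64, bool) {
    switch n := v.(type) {
    case uint:
        return uint64(n), true
    case uint8:
        return uint64(n), true
    case uint16:
        return uint64(n), true
    case uint32:
        return uint64(n), true
    case uint64:
        return n, true
    case int:
        return uint64(n), n >= 0
    case int8:
        return uint64(n), n >= 0
    case int16:
        return uint64(n), n >= 0
    case int32:
        return uint64(n), n >= 0
    case int64:
        return uint64(n), n >= 0
    }
    return 0, false
}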
That’s all for now. There’s a repository with the code for this
episode. In the next episode we’ll introduce the reflect package to take care
of pointers.
Go CBOR encoder: Episode 2, booleans
In the previous episode, we learned how to encode the nil value. Now we’ll
do booleans. According to the CBOR specification, booleans are represented
by a single byte: 0xf4 for false, and 0xf5 for true.
We’ll write the tests first, but before we do that let’s write a helper function
for our encoder tests: we want to avoid copy-pasting the same code all over our
tests. Looking at the test we wrote in the previous episode, this is what all of
our future tests will look like:
func TestNil(t *testing.T) {
var buffer = bytes.Buffer{}
var err = NewEncoder(&buffer).Encode(nil)
if !(err == nil && bytes.Equal(buffer.Bytes(), []byte{0xf6})) {
t.Fatalf(
"%#v != %#v or %#v != %#v",
err, nil, buffer.Bytes(), []byte{0xf6},
)
}
}
We test something with a well defined interface: the encoder gets a value,
returns an error, and outputs an array of bytes. This means we can factor out
most of the code into a single helper function named testEncoder; we
add this to our test file:
// testEncoder test the CBOR encoder with the value v, and verify that err, and
// expected match what's returned and written by the encoder.
func testEncoder(t *testing.T, v interface{}, err error, expected []byte) {
// buffer is where we write the CBOR encoded values
var buffer = bytes.Buffer{}
// create a new encoder writing to buffer, and encode v with it
var e = NewEncoder(&buffer).Encode(v)
if e != err {
t.Fatalf("err: %#v != %#v with %#v", e, err, v)
}
if !bytes.Equal(buffer.Bytes(), expected) {
t.Fatalf(
"(%#v) %#v != %#v", v, buffer.Bytes(), expected,
)
}
}
testEncoder will save quite a bit of typing. TestNil turns into a single line
—saving 8 lines— with testEncoder doing all the work:
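The new TestNil, plus a sketch of the TestBool that the run below exercises:

func TestNil(t *testing.T) {
    testEncoder(t, nil, nil, []byte{0xf6})
}

func TestBool(t *testing.T) {
    testEncoder(t, false, nil, []byte{0xf4})
    testEncoder(t, true, nil, []byte{0xf5})
}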
With our current encoder only able to encode nil, running the tests gets a not
implemented error thrown at us:
$ go test -v .
=== RUN TestNil
--- PASS: TestNil (0.00s)
=== RUN TestBool
--- FAIL: TestBool (0.00s)
cbor_test.go:19: err: &errors.errorString{s:"Not Implemented"} != <nil> with false
FAIL
FAIL _/home/henry/cbor 0.003s
Now we’ll implement boolean encoding and get those tests passing. From the
previous episode our Encode function looked like this:
var ErrNotImplemented = errors.New("Not Implemented")
// Can only encode nil
func (enc *Encoder) Encode(v interface{}) error {
switch v.(type) {
case nil:
var _, err = enc.w.Write([]byte{0xf6})
return err
}
return ErrNotImplemented
}
We need to add another case to the switch block to know when we have a boolean,
and then we turn the generic interface{} named v into a boolean value
to decide what the encoder will output into its writer:
// Can only encode nil, false, and true
func (enc *Encoder) Encode(v interface{}) error {
switch v.(type) {
case nil:
var _, err = enc.w.Write([]byte{0xf6})
return err
case bool:
var err error
if v.(bool) {
_, err = enc.w.Write([]byte{0xf5}) // true
} else {
_, err = enc.w.Write([]byte{0xf4}) // false
}
return err
}
return ErrNotImplemented
}
The tricky part here is v.(bool): this turns the non-typed interface v into a
boolean value using type assertion.
Encode now works with booleans and our tests pass:
$ go test -v .
=== RUN TestNil
--- PASS: TestNil (0.00s)
=== RUN TestBool
--- PASS: TestBool (0.00s)
PASS
ok _/home/henry/cbor 0.003s
This wraps up the 2nd episode. Next we’ll encode a type that’s more than a
single byte of output: positive integers.
Let’s write a CBOR encoder in Go. We’ll learn more about type switching and
type manipulation with the reflect package. This is going to be a series of
posts, each building on the previous one. It requires a good understanding of
the Golang syntax.
CBOR is a data format described in RFC 7049, it’s like JSON but binary
instead of text. Its design goals are:
extremely small code size
fairly small message size
extensibility without the need for version negotiation
We use an interface similar to the encoding/json package’s. If you are
unfamiliar with the encoding sub-packages, I recommend you read the JSON and
Go article.
To start our empty package we’ll create a file named cbor.go like this:
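A minimal sketch of the skeleton, mirroring encoding/json’s Encoder/NewEncoder pair that the rest of the series relies on:

package cbor

import "io"

// Encoder writes CBOR encoded values to an output stream.
type Encoder struct {
    w io.Writer
}

// NewEncoder returns a new encoder that writes to w.
func NewEncoder(w io.Writer) *Encoder {
    return &Encoder{w: w}
}

Once Encode is implemented, using it looks like this: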
var output = bytes.Buffer{}
var encoder = NewEncoder(&output)
var myvalue = 1234
// write the integer 1234 CBOR encoded into output
if err := encoder.Encode(&myvalue); err != nil {
...
}
We have our basic structure, so we can now start working on the encoder’s
implementation. In the previous example we encoded the integer 1234, but we
won’t start with integers; instead we will encode the value nil,
because it’s the easiest value to encode.
According the CBOR specification the nil value is represented by a single byte:
0xf6
Let’s write a test with the testing package: we’ll
verify the encoder outputs the single byte 0xf6 into the result buffer when we
pass nil. We create a new file cbor_test.go beside cbor.go for our tests:
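The initial cbor_test.go, matching the same TestNil listing shown in episode 2:

package cbor

import (
    "bytes"
    "testing"
)

func TestNil(t *testing.T) {
    var buffer = bytes.Buffer{}
    var err = NewEncoder(&buffer).Encode(nil)
    if !(err == nil && bytes.Equal(buffer.Bytes(), []byte{0xf6})) {
        t.Fatalf(
            "%#v != %#v or %#v != %#v",
            err, nil, buffer.Bytes(), []byte{0xf6},
        )
    }
}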
If we run the test with what we currently have we’ll get an error. No surprise:
we haven’t implemented anything yet, so the encoder won’t write to the output
buffer:
$ go test .
--- FAIL: TestNil (0.00 seconds)
cbor_test.go:15: <nil> != nil or []byte{} != []byte{0xf6}
FAIL
FAIL _/home/henry/essays/cbor 0.011s
To implement the nil value encoding we write the byte to the result when
Encode() is called, and we return an error if the value isn’t nil, since we
haven’t implemented anything else yet:
var ErrNotImplemented = errors.New("Not Implemented")

// Can only encode nil
func (enc *Encoder) Encode(v interface{}) error {
	switch v.(type) {
	case nil:
		var _, err = enc.w.Write([]byte{0xf6})
		return err
	}
	return ErrNotImplemented
}
Here we’re using a type switch to determine the type of the value we got. There’s only one case for now: nil, where we write the 0xf6 value to the output and return the error to the caller.
And now the test succeeds:
=== RUN TestNil
--- PASS: TestNil (0.00s)
PASS
ok _/home/henry/essays/cbor 0.027s
We created the initial encoder that can encode a single value successfully. In
the next episode we’ll implement CBOR encoding for more basic Go types.
Boot OpenBSD with EFI for full resolution display
I got a new Radeon graphics card for gaming. Unfortunately it’s not supported yet by the radeon(4) driver on OpenBSD. Luckily there’s a workaround: booting with UEFI. UEFI does all the talking with the graphics card, and this allows OpenBSD to use the screen’s full resolution on unsupported cards; it should work better than the vesa(4) driver. EFI boots the operating system differently than an old-school BIOS: it uses a special partition for the operating system’s bootloader.
First I needed to enable UEFI. It wasn’t called UEFI or EFI in the BIOS setup but something like “Windows 8 / 10 boot method”; I picked the regular version, not the WHQL one.
Second I created the EFI system partition to store the EFI bootloaders with gparted: I made a 100 MB partition formatted as FAT32 at the beginning of the disk, then set the flags “boot” & “esp” on it.
Third I created the system’s partition and did the install via a USB stick as usual. Once the install was done, and before rebooting, I copied OpenBSD’s EFI bootloader to the EFI system partition like this:
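# Reconstructed from memory; sd0i as the EFI system partition's device
# name is an assumption.
mount /dev/sd0i /mnt
mkdir -p /mnt/efi/boot
cp /usr/mdec/BOOTX64.EFI /mnt/efi/boot/bootx64.efi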
Updated reminder for later: pkg.conf doesn’t exist anymore on OpenBSD; now it’s installurl(5) you use to set up the mirror to install packages.
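For example, /etc/installurl contains a single mirror URL:
https://cdn.openbsd.org/pub/OpenBSD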
I work at a company that does Selenium stuff. So I build Selenium load testing
tools on the side, because I think it’s cool. Maybe someone will be impressed:
Vim is my day to day editor: I use it for coding and writing prose. Vim was made for programmers, not really for writers. Here are a few plugins I use when I write prose with Vim.
Goyo offers a distraction-free mode for Vim. It lets me resize the writing surface in the editor’s window however I want.
I like soft line wraps when I write prose: when the text reaches the edge of the window, it wraps to the beginning of the next line without inserting a newline character. Vim Pencil lets you do that.
vim-textobj-quote offers support for smart quoting as you type, like Unicycle, but this plugin is still maintained.
I have this snippet in my .vimrc to get in and out of the writer mode:
" I use a 80 columns wide editing window
function s:WriteOn()
call pencil#init({'wrap': 'soft', 'textwidth': 80})
Educate
Goyo 80x100%
endfunction
function s:WriteOff()
NoPencil
NoEducate
Goyo!
endfunction
command WriteOn call s:WriteOn()
command WriteOff call s:WriteOff()
Reminder for later, how to selectively rollback a file to the specified version:
git reset -p '<hash>' '<filename>'
The function ScanLines splits lines from an io.Reader; it returns a channel where the results are written. Handy when you want to read things on the fly, line by line:
//
// Read input line by line and send it to the returned channel. Once there's
// nothing left to read closes the channel.
//
func ScanLines(input io.Reader) <-chan string {
	var output = make(chan string)
	go func() {
		var scanner = bufio.NewScanner(input)
		for scanner.Scan() {
			output <- scanner.Text()
		}
		if err := scanner.Err(); err != nil {
			fmt.Fprintln(os.Stderr, "reading input:", err)
		}
		close(output)
	}()
	return output
}

func main() {
	var input = ScanLines(os.Stdin)
	for x := range input {
		fmt.Printf("%#v\n", x)
	}
}
Reminder for later, OpenBSD’s pkg_delete utility can remove unused dependencies
automatically:
# pkg_delete -a
How to fill a PDF form with pdftk
I had a rather lengthy PDF form to fill out; it took me 2 hours because copy-pasting didn’t work with my PDF editor.
After I saved the file I realized that I had clicked on a radio button I shouldn’t have clicked on: Kids. I do not have kids, and the radio selection didn’t contain a zero option, only one and more. After trying to get rid of that radio selection for 5 minutes, it looked like there was no way to undo it: I had selected something I couldn’t unselect.
I didn’t want to waste another 2 hours to fill out the form, I needed to fix
this by editing the PDF.
After a bit of googling I found pdftk, a command-line toolkit that can fill
& extract information out of PDF forms.
To unselect the radio box, I had to extract the form data. Pdftk can extract the
information into a text file that you can edit with a text editor.
pdftk input.pdf generate_fdf output form_data.fdf
Here it will generate form_data.fdf from input.pdf’s form values. After that I had
to modify the fdf file to get rid of my selection. In my case, I wanted to reset
the selection for the Kids radio selection.
/Kids [
<<
/V (1)
/T (RadioButtonList[0])
>>]
I changed it from “1 kid” to “nothing selected”.
/Kids [
<<
/V /Off
/T (RadioButtonList[0])
>>]
Then I had to re-enter the information from the FDF file into the PDF.
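pdftk does this with its fill_form operation; the command looked something like this (the output file name here is an assumption):
pdftk input.pdf fill_form form_data.fdf output filled.pdf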
It took me around an hour to do all this, so pdftk saved me time. I liked it; you can check out pdftk’s own examples to learn more, the documentation is terse and complete.
Still bitmap after all those years
Bitmap fonts are pixel-art fonts. Unlike outline fonts they cannot be automatically scaled with good results: to create a multi-size bitmap font you have to create a different version for each size. They can’t be anti-aliased, so they tend to look blocky compared to outline fonts.
Outline fonts use Bézier curves: they are scalable, and their edges can be anti-aliased to make them look nicer. Today everybody is running an operating system that can render outline fonts decently, and can use those smooth-looking beauties with superior results compared to bitmap fonts.
Bitmap fonts are a thing of the past.
Yet, I still use a bitmap font for my day-to-day programming tasks. I transitioned to an outline font for a while, but ultimately switched back after a few months because the outline font didn’t seem as sharp.
It may be silly, but nothing looks as sharp as a bitmap font to me. I’m talking about what it looks like on a computer screen in 2016 with a dot pitch of 0.27 mm. Because each pixel is either black or white and nothing smooths out the edges, it’s sharp.
I salivate like everybody else over those screenshots of multi-colored terminal windows with a fancy outline font that supports ligatures, has cool emoji icons, and rainbows of bright pastel colors. I’m sure it’s great to feel like you’re on acid while you write code, but I like my bitmap font and my bland terminal colors. It gets the job done and it’s easy on my eyes.
I’ll switch to outline fonts when I get a screen with a high pixel density for
my workstation, but for now I’ll use my bitmap font, it doesn’t look so bad with
today’s fat pixels.
Reminder for later, how to set up pkg.conf after a fresh OpenBSD install:
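# /etc/pkg.conf (reconstructed; the mirror and release are examples)
installpath = http://ftp.openbsd.org/pub/OpenBSD/5.6/packages/amd64/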
In that case we’d just like to extract the list data1, data2, & data3 from the
table. With the different markup in each cell it would take quite a bit of elbow
grease to clean it up. lxml has a special method that makes all that easy:
text_content. Here’s what the documentation says about it:
Returns the text content of the element, including the text content of its
children, with no markup.
For the previous HTML snippet we’d extract the data like this:
>>> from lxml import html
>>> root = html.fromstring(''' <td>
... <a href='...'><strong>data1</strong></a>
... </td>
... <td>
... data2
... </td>
... <td>
... data<em>3</em>
... </td>
... ''')
>>> [i.text_content().strip() for i in root.xpath('//td')]
['data1', 'data2', 'data3']
I got new speakers with a built-in USB-DAC for my home computer.
Once plugged in, OpenBSD recognized it as a USB audio device; so far so good. Unfortunately I couldn’t get any sound out of it, but my computer’s sound card —which is recognized as a separate device— worked.
It turns out that by default sndiod —the system audio mixer— uses the first
audio device it finds and ignores the others. To get it to use other devices you
must specify them in rc.conf.local like this:
sndiod_flags="-f rsnd/0 -f rsnd/1"
I restarted sndiod like this sudo /etc/rc.d/sndiod restart, and everything now
works nicely.
I keep diaries on different subjects. When I add a new entry I start by
inserting a timestamp at the top of the entry. I used to do it ‘manually’ –by copying the current date and pasting it into Vim–, and yesterday I decided to write a Vim function to automate that.
There’s nothing especially hard about that, but it took me a while to figure out how to insert the timestamp at the current cursor position. It didn’t look like there was any built-in Vim function to do it, and most solutions I found online seemed overly complicated.
It turns out that all I needed was an execute statement like this: execute ":normal itext to insert"; this will insert the string “text to insert” at the current cursor position.
I added this to my vimrc:
function s:InsertISODate()
    let timestamp = strftime('%Y-%m-%d')
    execute ":normal i" . timestamp
    echo 'New time: ' . timestamp
endfunction

function s:InsertISODatetime()
    let timestamp = strftime('%Y-%m-%d %H:%M:%S')
    execute ":normal i" . timestamp
    echo 'New time: ' . timestamp
endfunction

command Today call s:InsertISODate()
command Now call s:InsertISODatetime()
Reminder for later, reset a branch to what its remote branch is:
git checkout -B master origin/master
I looked for a decent weather app on Android for a while. I tried many, they
tended to be cluttered and overly complicated for what they were doing. I’m now
using Weather Timeline, it’s clear, fast, and simple. I check it every
morning, it gives me a quick and clean overview of the forecast, no need to dig
the information in sub-menus, there’s no ad, and it’s just $1.
Sometimes you need an up-to-date virtualenv for your Python project, but the one installed is an old version. I read virtualenv’s installation manual, but I didn’t much like that you have to use sudo to bootstrap it. I came up with an alternative way of installing an up-to-date virtualenv, as long as you have an old version. In my case it was an Ubuntu 12.04, which ships virtualenv 1.7.
1st install the outdated version of virtualenv:
$ sudo apt-get install python-virtualenv
Then setup a temporary environment:
$ virtualenv $HOME/tmpenv
Finally use the environment created before to bootstrap an up-to-date one:
$ "$HOME/tmpenv/bin/pip" install virtualenv
$ "$HOME/tmpenv/bin/virtualenv" $HOME/env
$ rm -rf "$HOME/tmpenv" # Delete the old one if needed
I love Rob Pike’s talks: fast-paced and intense. They’re a nice change from the typical talks about programming: slow, and often more about self-promotion than teaching.
I usually feel I miss a few things here and there when I listen to Rob: he’s smart and expects you to be smart, he doesn’t talk down to you. I rarely understand everything perfectly: this is good, it means I don’t fully master the subject, it means I’m learning, it means he’s making good use of my time.
This talk about implementing a bignum calculator is the perfect example: Rob doesn’t spend much time reading and explaining the code or the examples; he assumes that his audience is smart enough to understand most of the details, and he focuses on the big picture and the hard details.
I needed a set data structure for a Go program; after a quick search on the interweb I saw this reddit thread about sets, queues, etc…
Short answer: for sets use maps, specifically map[<element>]struct{}. My first
intuition was to use map[<element>]interface{}, but it turns out that an empty
interface takes 8 bytes: 4 bytes for the type, and 4 bytes for the value which
is always nil, while an empty structure doesn’t use any space.
There weren’t many details on how to do it, so I just gave it a try. It was pretty easy to figure out the implementation, as long as operations like union and intersection aren’t needed.
That’s how I would implement an integer set:
type set map[int]struct{}

var myset = make(set) // Allocate the map

// Add an element to the set by adding an empty structure for the key 1
myset[1] = struct{}{}

// Check if we have 1 in our set
if _, ok := myset[1]; ok {
	println("1 in myset")
} else {
	println("1 not in myset")
}

// Remove the element from the set
delete(myset, 1)
REST toolbox for the command-line hacker
I work with lots of REST services these days. RESTful services are easy to access and use because they’re based on well-known tech, and this eliminates half of the tedious work.
Unfortunately the other tedious half is still here: interfacing.
We still need to get and convert the data from the original format to the format
we want. Lately I found two tools that help a great deal with HTTP and JSON:
HTTPie and Jq. Today I’ll talk about HTTPie.
I used cURL for almost a decade to deal with HTTP from the command line. A few months ago I heard about a command-line client called HTTPie, which has a nice interface that totally makes sense:
$ http --form POST localhost:8000/handler foo=1 bar=hello
What does it do? It does an HTTP POST on localhost:8000/handler with the following body:
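foo=1&bar=hello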
It’s exactly the kind of stuff I want. I often automate the common stuff away with a
function, like this:
http() {
    # 1st parameter is the path, we pop it out of the parameter list
    local urlpath="$1"; shift
    # since we use http as our function name we have to use `command' to
    # call the executable http and not the function
    command http --form POST "example.com$urlpath" "$@"
}
# Do a POST with the following parameters: foo=1&bar=2
http /test foo=1 bar=2
If you’d rather submit JSON instead of an url-encoded form, replace the --form
option with --json.
Give HTTPie a shot next time you want to talk with an HTTP service
from the command line: it may take you less time to learn it from scratch than
remember how to use cURL.
One day at work the Internet went out around 4pm; most of my co-workers couldn’t work: most of the information they needed was online, and they didn’t have local copies. If I had been writing Python at the time, being offline would have been a problem: I rely on the information on docs.python.org, and on frameworks’ and libraries’ documentation, all of which are online.
With Go, it’s less of a problem: if you have godoc installed you can access the
installed packages’ documentation using a local HTTP server:
$ "$GOPATH/bin/godoc" -http=:8080
Point your browser to localhost:8080 and here you have it: the documentation for
all your installed packages.
A few tips & tricks for properly managing views with Postgres:
Name the return values with AS
Type constant values by prefixing them with their type
For example consider the following:
$ CREATE VIEW myview AS SELECT 'bar';
WARNING: column "?column?" has type "unknown"
DETAIL: Proceeding with relation creation anyway.
CREATE VIEW
Be careful that the names and types of the view’s columns will be assigned the
way you want. For example:
CREATE VIEW vista AS SELECT 'Hello World';
is bad form in two ways: the column name defaults to ?column?, and the column
data type defaults to unknown. If you want a string literal in a view’s
result, use something like:
CREATE VIEW vista AS SELECT text 'Hello World' AS hello;
First we’ll name our string to get rid of the “?column?” name:
$ CREATE VIEW myview AS SELECT 'bar' AS bar;
WARNING: column "bar" has type "unknown"
DETAIL: Proceeding with relation creation anyway.
CREATE VIEW
Second we set the type of our return value by prefixing with TEXT:
$ CREATE VIEW myview AS SELECT TEXT 'bar' AS bar;
CREATE VIEW
That is all.
I wanted to upgrade to Go 1.3 on my desktop at work. The version of Go shipped with Ubuntu 14.10 is 1.2. I found this article talking about godeb, a Go program to package Go into a .deb file installable on Ubuntu.
You still need Go 1.0+ installed, otherwise the installation is straightforward:
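# A sketch from memory; the exact import path may differ, godeb was
# hosted on launchpad back then.
$ go get launchpad.net/godeb
$ "$GOPATH/bin/godeb" install 1.3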
I work in open spaces a lot; my current job is in a shared open office. Open spaces are known to impede workers’ productivity. It’s one of the worst office arrangements, yet they are popular in IT. They offer a short-term gain: cheaper space, for an invisible –but real– price: less productive and satisfied workers. Noise and lack of privacy are the main causes of dissatisfaction, and while it’s difficult to address the lack of privacy, something can be done about the noise. I deal with it via a two-pronged attack:
Earplugs: I use Howard Leight Max, and there are lots of other good earplugs around with different shapes and foam types.
Headphones, worn over the earplugs.
This way I get good isolation from the environment, and it makes interruptions awkward for the interrupter: he has to wait for me to take off my headphones and earplugs. This makes interrupting me more costly, which is a nice side-effect.
I have regular headphones; I wonder how good the combo of earplugs & noise-cancelling headphones would be.
I used Swiftkey as my Android keyboard: I found it worked better than the default Android keyboard. I switch between English & French often; Swiftkey just works without selecting a language, while the default Android keyboard needs to be switched between the 2 languages to work properly.
I was hanging out on security.google.com, and saw that Swiftkey Cloud had full access to my email: read & write access! I didn’t remember giving them any of these permissions; I did at some point, but I don’t know when.
Reading someone’s emails is a great way to improve their predictive typing, but it wasn’t clear to me that they’d read my emails. I’m almost certain there was no big fat pop-up saying so.
That kind of thing really annoys me: emails are sacred, you don’t mess with them unless you’re a company with no morals or ethics like Linkedin or Facebook… I made fun of people who gave their email and password to 3rd parties, but I kind of did the same…
I revoked the access, deleted my Swiftkey Cloud account, removed the Swiftkey app from my phone, and switched back to the Google keyboard; it has come a long way since I replaced it with Swiftkey a year ago.
I started a project in Go, when I got started everything was in a single file.
Now this file is too big for my own taste, so I split it into 2 separate files,
let’s call them main.go & util.go. In main.go I have the main() function, in
util.go I have functions used by main.go.
When I tried to run main.go directly I got this error:
$ go run main.go
# command-line-arguments
./main.go:150: undefined: SomeFunction
I didn’t want to create a package just for util.go; sometimes source files really are specific to a program and aren’t reusable.
My search for a solution on the web didn’t yield anything useful. I knew it was
possible, I saw programs like godeb do it. After a while I built the program
with go build to see if the error would be different, and it worked this time.
Weird… What’s going on?
Everything was the same except that I didn’t specify what to build: with no arguments, go build compiles every Go source file in the directory. When I specified only main.go, go build gave the same error:
$ go build main.go
# command-line-arguments
./main.go:150: undefined: SomeFunction
That’s when it hit me, I just needed to list all the files necessary to run the
program on the command line:
$ go run main.go util.go
Here it is: go run needs the complete list of files to execute in the main package. I’ll know for next time!
I’m a hater, especially when it comes to programming languages. I approach most of them with pessimism; I rarely look at a new language and think it’s great right away.
C
I started programming in high-school with QBasic. I made a small
choose-your-own-adventure style game in text-mode, then moved on to C. I didn’t
like C much: it didn’t feel expressive enough, and was too hard to use for the
noob I was. I started programming seriously after high-school and
discovered C++ during my 2nd year studying computer science. I instantly became a fan: C++ had so many features, the language felt more expressive, more powerful. I thought I could master it within a few years. It took me a good 5 years to realize that C++ was too big: it seemed baroque and overly complex after all this time. After 5 years I still didn’t master most of the language; like everybody I just used a subset. I went back to C and saw what I wasn’t able to see then: C was expressive and simple. It took me years of struggle with the seemingly cool features of C++ to realize C was the best part of C++.
Javascript
I was a Javascript hater for a long time: the language seemed so absurdly hard to work with, there were traps and gotchas everywhere. If you had asked me 5 years ago: PHP or Javascript? I’d have replied: “PHP of course! Javascript is terrible.” Then I learned more about it thanks to Douglas Crockford’s videos. While Javascript is not my favorite language I came to appreciate it; today I’d pick it over PHP if I had to start a new project.
Python
Python looked a bit ridiculous when I first used it. I didn’t like the indentation to define blocks, or that the language was interpreted; I didn’t get that a dynamic language opens up a realm of new possibilities. At the beginning Python felt like a slow, dumbed-down C++. It took time writing Python every day to fall in love with it, but after a year it was my favorite language. I’ve been writing Python personally and professionally for 10 years now.
Go
My first impression of Go was: it’s kind of like a cleaned-up C. My main problem was that concurrency was part of the language, like in Erlang; I thought it’d be better if the tools for concurrency were contained in a library, like multiprocessing in Python. Also there were a few things that really bothered me with it, like the semicolon insertion, a known Javascript gotcha.
Then I heard about goroutines, channels, & Go’s select statement; after that it all made sense. Go has an elegant solution to a fundamental problem of modern computing: concurrency.
The semicolon insertion turned out to be a convenient quirk.
Go became my new toy a month ago, it’s now on track to replace Python as my
favorite programming language.
1.2 GB is easier to understand than 1234567890 bytes, at least for humans. I write functions to ‘humanize’ numbers often, but it never seems worth keeping those functions around since they are generally quick and easy to write. Today I decided to finally stop rewriting the same thing over and over, and headed to PyPI –the Python module repository–, and of course there’s a module on it to do just that:
The Oak Island Money Pit is the story of a 2-centuries-long treasure hunt on a small island near Nova Scotia, Canada. What makes this treasure hunt special is how much resource was sunk into it. For 200 years adventurers lost their time, money, and sometimes their lives trying to find the elusive gold & jewels.
Long story short: in 1795, after seeing lights on the island, three teenagers found a mysterious depression in the ground, and started treasure hunting by digging. Clues and signs of treasure were found, fortunes were wasted on fruitless digging, and 6 people died. To this day nothing of value has been found. Speculations & theories about the origin of the supposed treasure in the money pit abound.
Something that wasn’t addressed in the article: how the hell did all that stuff get so deep into the ground? If digging deep enough was still a problem in the 1960’s, how did 17th-century men manage to dig a hole 100 feet deep, along with booby traps and flooding tunnels? Given the numerous difficulties the treasure hunters went through for the past 200 years, it would have been a great engineering feat. All of that without being detected by the locals, and keeping it secret for 200 years.
Most adventurers probably thought about it, and they gave their imaginary enemy —the treasure digger— too much credit, and didn’t give their predecessors enough credit.
The Oak Island Money Pit is a great story because it’s a great tragedy. The only
treasure on the island is the memories of this great human adventure.
I use xdm(1) as my login manager under OpenBSD. After I log in, it starts xconsole(1). It’s not a big deal, but I’d rather not have it happen.
To stop xdm from starting a new xconsole for every session, edit
/etc/X11/xdm/xdm-config, remove or comment out the following lines:
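! Reconstructed: the stock OpenBSD xdm-config entry that runs the Xsetup
! script, which is what launches xconsole.
DisplayManager._0.setup:        /etc/X11/xdm/Xsetup_0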
I’m not a fan of 37signals, but I must admit their DNS service xip.io is handy. I’m setting up some web servers right now, and I needed a domain to test my configuration. The whole DNS dance is a bit time-consuming: add a record to the zone file, & wait for my DNS to pick it up. With xip.io there’s no need to wait: prepend your host’s IP address to .xip.io and the domain will resolve to your own IP.
For example 127.0.0.1.xip.io will resolve to 127.0.0.1.
There are other services like this, like ipq.co or localtest.me, but as far as I know they don’t work out of the box: you have to register your subdomain first, or you can only use them with localhost.
How to run PostgreSQL as a non-privileged user
The quick and dirty guide to setting up a postgres database without root access.
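A sketch of the setup, where $postgres_dir is a placeholder for any directory you can write to:
postgres_dir="$HOME/pgdata"
initdb -D "$postgres_dir"    # create the database cluster
postgres -D "$postgres_dir"  # run the server in the foreground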
To stop the server type Ctrl-C, or you can use pg_ctl if postgres runs in the
background:
pg_ctl stop -D "$postgres_dir"
$ sudo apt-get install zsh-doc
[...]
$ man zsh
No manual entry for zsh
See 'man 7 undocumented' for help when manual pages are not available.
$ man 7 undocumented
[...]
NAME
undocumented - No manpage for this program, utility or function
[...]
Damn you Ubuntu/Debian/whoever decided that man pages were ‘too big’ to be part
of documentation packages!
Copy a branch between Git repositories
Git is tricky to use: after 4 years I still have a hard time figuring out how to
do simple operations with it. I just spent 30 minutes on that one:
Say you have 2 copies of the same repository repo1 & repo2. In repo1 there’s a
branch called copybranch that you want to copy to repo2 without merging it: just
copy the branch. git pull repo1 copybranch from repo2 doesn’t work because it
will try to merge the copybranch into the current branch: no good.
It looks like git fetch repo1 copybranch would be the way to go, but when I
did it, here’s what I saw:
From repo1
* branch copybranch -> FETCH_HEAD
After that, a quick look at the logs doesn’t show copybranch, FETCH_HEAD, or any of the commits from copybranch. What happened? Git copied the content of copybranch, but instead of creating another branch named copybranch it created a temporary reference called FETCH_HEAD, and FETCH_HEAD doesn’t appear in the logs. In summary: Git copied the branch & made it invisible, because you know… it makes perfect sense to hide what you just copied.
So how do you copy the branch, and create a branch with the same name
referencing the commits? Here it is:
git fetch repo1 copybranch:copybranch
I use VMWare Player to run my OpenBSD under Windows on my laptop –a Thinkpad X1 Carbon–, whose newer hardware wasn’t fully supported by OpenBSD when I got it. I had issues with VirtualBox, it was slow: 50%+ of the CPU time was spent on interrupts, and I couldn’t find a solution on the Internets. After reading Ted Unangst’s blog where he describes his setup, I decided to switch to VMWare Player.
I use Putty to connect to the VM, and while VMWare worked well, sometimes the dynamically assigned IP changed. I had to reopen Putty to change the IP from time to time, and it was getting annoying. It turns out that you can use static IPs: VMWare uses a 255.255.255.0 netmask, and it reserves the 3-127 range for static IPs. I put this in my /etc/hostname.em0:
inet 192.168.234.3 255.255.255.0
It didn’t work right away. It turns out that the gateway was at 192.168.234.2. I
put the following in /etc/mygate:
192.168.234.2
And things are now working nicely.
CSS3 Quickies
I did some web design for a friend this January. I hadn’t used HTML & CSS for a while; I did quite a bit of tinkering with the graphic design of scratchpad 6 months ago, using a custom font and trying a few ‘newish’ features like media queries. It was a good opportunity to discover & use the new HTML5 & CSS3 features.
My friend had a Joomla template he wanted to use for his site. His needs were limited: 5 pages and maybe a contact form. Hosting this with Joomla seemed a bit overkill, so I decided to ‘rip off’ the template and create a clean HTML skeleton for him to use.
First we tried to work from the source of the template, but the template’s HTML
& CSS were very hairy, I couldn’t wrap my head around it, so I decided to
rewrite it from scratch. Who doesn’t love to reinvent the wheel? :)
I used purecss for the 1st version, but I wasn’t satisfied with the way it worked. I like to minimize HTML markup, and I really dislike it when there are 5 divs to size what should be a single box, when all you need to do is use CSS correctly. Unfortunately purecss works this way: you need to nest your boxes inside other boxes to get things to work correctly. It’s understandable why they do that: it’s a CSS framework, the CSS directs the way the DOM is structured, and CSS is complicated to get to work without a few intermediate steps. Since I was here to learn more about CSS, I dropped purecss and started using what I learned studying it for the new template.
Here are the few things I tried while working on the site:
box-sizing
box-sizing: border-box is handy: it includes the border & the padding in the box’s size. If you have 2px borders and 1em of padding on a 200px box, the box stays 200px wide, with the borders taking 2 × 2px, leaving 200px - 2 × 2px = 196px of usable space inside the borders. It simplifies box placement, no more: my borders are 4px, my box is 200px, so that’s 4 × 2 + 200 = 208px… It’s only supported by IE9+, and it needs a prefix on some browsers like Firefox. I used it when developing the site; at the end of the design process I removed it, and I had to make a few adjustments here and there, but it was easy to do. border-box was neat though: no more pointless tinkering. I’ll use it again for sure.
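For reference, opting the whole page in looks something like this (a sketch; the -moz- prefix covers the Firefox of that era):
* {
    -moz-box-sizing: border-box;
    box-sizing: border-box;
}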
Media queries
Media queries are the basis of responsive design. Instead of using pixels as a unit like most do, I use ems, the typographic unit of measure. That made many things simpler, like re-calculating the size of the grid when adjusting the font size.
While media queries aren’t that easy to use & lack expressive power, they weren’t too bad and I managed to do what I wanted without too much tinkering.
inline-block
display: inline-block; allows you to simplify box packing: designing layouts requires fewer tweaks and hacks. inline-blocks are well supported by all modern browsers. IE6 supports it –sort of–, and it even works correctly on IE7! I’m kind of late to the party; better late than never.
CSS3 transition
Fancy, but meh. It’s all eye-candy, and I don’t think it improves usability / readability one bit. I’ll still use them here and there to fade bits of interface in and out.
I was trying to get Tornado’s AsyncHTTPTestCase to work with Motor, but the tests were blocking as soon as there was a call to Motor. It turns out that Motor wasn’t hooked to the tests’ IO loop, therefore the callbacks were never called. I found the solution after looking at Motor’s own tests:
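The fix, roughly, was to create the Motor client on the test’s own IO loop instead of the default one (a sketch against the Motor API of the time; make_app is an assumed application factory):
import motor
from tornado.testing import AsyncHTTPTestCase

class MyTestCase(AsyncHTTPTestCase):
    def get_app(self):
        # hook Motor to the test's IO loop, not the global one
        self.client = motor.MotorClient(io_loop=self.io_loop)
        return make_app(self.client)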
In HTML whitespace isn’t significant, but a single space between tags can wreak havoc with your CSS layout. Consider the following:
<div></div> <div></div>
Or
<div></div>
<div></div>
The space between the 2 div tags will insert a single space between the 2 divs. If each div’s width were 50% of the parent, they wouldn’t fit in it because of the added space. To fix this you have to remove the spaces:
<div></div><div></div>
It looks kind of ugly to have everything on the same line. My favorite way to deal with this is to move the final tag’s chevron to right before the next tag, like so:
<div></div
><div></div
><div></div>
It doesn’t look super nice, but it’s better than having everything glued on the
same line.
3rd party sharing & social-media buttons are a waste of your and your reader’s
time: Sweep the Sleaze
size_t
Random C fact of the day: I thought that size_t was a built-in type, because that’s what the operator sizeof is supposed to return. Roman Ligasor –a co-worker– proved me wrong. It turns out that it’s defined in the header stddef.h. C is a minimalist language: why define a built-in type that would just be an alias to another built-in type?
Without transition: According to the CERT Secure Coding Standards, one
should use size_t instead of integer types for object sizes.
I use DuckDuckGo these days; one of its best features is the bang, a smart shortcut to other websites. I use !man & !posix all the time: they give you direct access to the POSIX standard manuals & specification. That’s better than relying on the Linux manuals, as I have to at work.
I started drinking coffee 15 years ago, when I was a student. Like many students
my sleep schedule was messed-up: I was working late, and getting up late. I
loved working at night: it’s quiet, there’s almost no distraction. To compensate
for my lack of sleep during the day I drank coffee, sodas, and occasionally tea.
After graduating, I stopped drinking coffee and soda for a while. I switched
to tea, 2 to 4 cups of tea every weekday for 10 years.
I started drinking coffee again 2 years ago when I started my new job. I drank
between 1 to 3 cups of coffee at the start of the day, and 2 to 4 cans of Diet
Soda during the day on top of that. I ingested 150mg to 400mg of caffeine every day. I thought that coffee was by far the biggest source of caffeine, but it turns out that sodas and tea also contain a significant amount. A cup of coffee contains around 100mg, a can of Diet Pepsi has 35mg, while a cup of tea is around 40mg.
How much caffeine is too much? According to Wikipedia, 100mg per day is enough
to get you addicted:
[…] people who take in a minimum of 100 mg of caffeine per day (about the
amount in one cup of coffee) can acquire a physical dependence that would
trigger withdrawal symptoms that include headaches, muscle pain and stiffness,
lethargy, nausea, vomiting, depressed mood, and marked irritability.
The Mayo Clinic recommends cutting back for those who get more than 500mg every day; I suspect this limit is lower for me.
I had my last coffee Sunday morning, almost 4 days ago. I’ve experienced most of the withdrawal symptoms; it’s getting better, but I think I have another day or two before I can feel normal again. I didn’t even consume that much caffeine. It must be awful to be nauseous or vomit on top of the other symptoms. I imagine only big consumers get these problems, but this tells you a lot about how strong the addiction can be. The headaches are especially annoying: they’re caused by an increase of blood flow in the head, compressing the brain. I usually exercise when I want to get my mind off something or try to get back into a healthy routine, but in the case of caffeine withdrawal, exercise seems to make the headaches even worse. Aspirin works well, but it still hurts quite a bit. The worst part is how irritable I am right now; I tend to go crazy when I’m on my own and idle. I get restless and my mind wanders, thinking of past personal injustices, and how I’ll get revenge: I get angry for nothing. I can’t even focus on a book for more than 10 minutes without my mind wandering.
The good news is: it’s almost over.
[…] withdrawals occurred within 12 to 24 hours after stopping caffeine
intake and could last as long as nine days.
There were positive side effects: I used to go pee 3 to 5 times a day, not anymore, and my sleep seems to be improving. Sleep is why I stopped caffeine consumption: I don’t sleep well most nights, waking up tired but not sleepy.
Like most things, caffeine isn’t bad, but it has to be consumed in moderation. I
don’t plan to ban caffeine from my life, but I do need to reduce my consumption,
and take a break from time to time.
I always forget about the HTTP server in Python. I’ve been using a quick’n dirty
shell script with netcat to quickly serve a single file over HTTP for a while,
but this is easier, and works better:
python -m SimpleHTTPServer [port number]
It will serve the content of the current directory.
I’ve redesigned this space after reading the excellent Practical Typography by Matthew Butterick. I picked Charter as the font for the body text. Charter is recommended in the appendix of the aforementioned book as one of the best free fonts by far. I tried Vollkorn from Google web fonts for a while before switching to Charter. While Vollkorn looked fine to me, Charter looks even better: it feels crisper.
I picked fonts from the Source Pro family by Adobe as my sans & mono-spaced fonts, but I may switch to one of the DejaVu fonts if I find one that I like better.
I had problems with KiTTY: the session management doesn’t seem to work with
multiple sessions. I looked for an alternative and found this one:
http://jakub.kotrla.net/putty/. It’s basically a normal PuTTY with a patch to
store Sessions on disk. It lacks the URL click-to-open feature, but I think I
can live without it. I’ve been using it for 2 weeks now, and I’m happy with it.
curl is a useful tool if you’re working with HTTP. I’m fond of the -w option: it prints all kinds of information about the transfer, including timing:
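# For example, print timing details for a request; the format string here
# is just an example:
$ curl -s -o /dev/null -w 'connect: %{time_connect} start: %{time_starttransfer} total: %{time_total}\n' http://example.com/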
My C is rusty. Here are a few tricks I forgot and had to rediscover.
Pointer difference is in number of elements:
int array[42];
int *pointer = array + 8;
// This will be 8, not 8 * sizeof int
size_t x = pointer - array;
Number of elements in an array:
int array[42];
// x == 42 * sizeof int. Not what we want
size_t x = sizeof array;
// The right way to do it: y == 42
size_t y = sizeof array / sizeof array[0];
That is all.
Reminder for later. Put a file-descriptor in non-blocking mode with fcntl:
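/* A minimal sketch of the usual pattern: read the current flags, then add
 * O_NONBLOCK to them. */
#include <fcntl.h>

int set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL, 0);
	if (flags == -1)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}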
I’ve tried to write regularly for more than 5 years. Yet I still struggle to publish content monthly, let alone write something every week. It’s not really a time problem: every week I waste many hours slacking on the Internet, at work or at home. I could probably turn one of those wasted hours into a semi-productive writing session. I realize that ‘wasting’ time is a necessary evil; nobody can be productive all the time: sometimes you just need to turn the brain off to recharge.
I’m not posting high-quality content. Most of the posts on my blog took 2 to 4 hours of focused effort to research and write, not including the time it takes for the idea to mature. I want to post less, but post better articles.
Most of my writing happens in short bursts over a few days. I’m writing a lot at the moment because I have a shiny new bitbucket repository with all my essays in progress. Unfortunately once the novelty wears off, I don’t think I’ll write at the same rate…
Scratchpad made me write more. It was inspired by Steven Johnson’s Spark
file. I think this new repository with short essays in progress is a good
complement to it. I’ll see how things turn out…
Cal Newport argues that the best way to write, for wannabe writers like me, is to have an adaptable writing schedule every week. I wonder if I should reserve a time slot during the week for writing. 1 hour focused on writing may turn out to be the trigger I need to think and publish valuable articles.
I used a bitmap font for many years in all my development-related applications. Bitmap fonts aren’t scalable, but they usually look sharper and clearer than the scalable variety, because they fit perfectly to the pixels on the screen.
My font used to be Terminus: it’s sharp and I like its shape, but it’s missing quite a few glyphs and it’s becoming harder to use as screen resolutions and DPI increase. I’ve looked for new fonts to try these past few weeks. There are many fonts for programming; here’s my selection of monospaced scalable fonts:
Source Code Pro is the font I use now for terminals and editors. This font
doesn’t have much personality, but it’s clear and doesn’t have any of the
common problems that monospaced fonts tend to have.
DejaVu Sans Mono: a classic font, with a sharp look and plenty of symbols. An excellent default: it’s widely available and has been around for a while, based on the classic Bitstream Vera fonts.
Liberation Mono: close to DejaVu Sans Mono. I like the previous one better, but this one is a fine alternative.
Ubuntu Mono has more personality than the other fonts, but I’m not a big
fan of its look. Nice font overall, I may give it a try if I get bored of the
others.
It took me a while to figure out how to dump the content of UDP packets with
tcpdump/Wireshark. Here’s how you do it:
# Dump all the UDP traffic going to port 1234 into dump.pcap
$ sudo tcpdump -i lo -s 0 -w dump.pcap udp and port 1234
# Then we print the data on stdout
$ tshark -r dump.pcap -T fields -e data
This will print all the data in hex-encoded form, 1 packet per line. You’ll have
to decode it to get the data in binary form. The following Python program does
that:
import sys

for line in sys.stdin:
    # Binary data
    data = line.rstrip('\n').decode('hex')
    print repr(data)
Last week I installed Bootstrap as scratchpad’s CSS; before, I had a minimal CSS with normalize.css, but I couldn’t get it to display the feed “correctly” on my smartphone: the article block occupied only half the screen, and I had to readjust the zoom level after loading the page to get it ‘fullscreen’. I’m unhappy to add a dependency like that to scratchpad, but I can’t be bothered with CSS anymore. The folks from Twitter like to deal with that ridiculous shit, so I figured I should use the thing that just works, no matter how much I dislike the idea. At least the typography is nicer and the colors are prettier.
I had a few problems with KiTTY, a clone of PuTTY, and tmux, a terminal multiplexer: pane separators were displayed as ‘q’ or ‘x’ instead of lines. It turns out it’s a PuTTY problem, according to the tmux FAQ:
PuTTY is using a character set translation that doesn’t support ACS line
drawing.
I had this problem for a while, and I didn’t manage to solve it with my old bitmap font, Terminus; maybe because the font was missing the glyphs to draw lines. There were actually a few problems:
My old font didn’t have line-drawing glyphs (?)
PuTTY’s character set translation problem
The encoding wasn’t properly detected by tmux
I recently switched to a new font, Source Code Pro by Adobe, which allowed me to fix the problem with the missing glyphs. I also had to tweak tmux & KiTTY a little bit. In KiTTY’s settings, in the Window > Translation category, set the remote character set to ‘UTF-8’ and select the Unicode line-drawing code points. Make sure your font supports the line-drawing code points. Start tmux with the -u option to tell it to use UTF-8, and you should be good to go.
I instantly became a fan of micro-credit when I first heard about the idea. It seemed to be a perfect way to get people out of poverty: loans to small businesses in under-developed parts of the world. Business, exchange, trade, and the ability to sustain oneself are what lift people out of poverty: not charities and government hand-outs. Micro-credit fits nicely with my view of the world: individuals stepping up to raise their standard of living with help from non-governmental bodies.
Planet Money posted an article about micro-credit, or more precisely about studies to determine how effective micro-loans are. It turns out there was a lot of hype, and not a lot of results. It doesn’t look like micro-loans improve much the standard of living of those who benefit from them.
It’s disappointing, but after thinking about it for a while it seemed foolish to
think that small sums of money here and there could have a significant impact on
the lives of those who live in poverty.
With Unix becoming more and more ubiquitous, the POSIX shell is the de-facto scripting language on most computers. Unfortunately it’s difficult to write proper scripts with it: few really know it, and there’s not much good documentation available. Rick’s sh tricks is one of the best resources I found about pure POSIX shell scripting.
After reading The Best Water Bottles on The Wirecutter –a review
website–, I got myself the 800ml Klean Kanteen Classic. It’s exactly what I
wanted: a water bottle with a mouth opening wide enough for ice cubes. That
should help me get over my diet soda addiction: I made myself lemon water this
afternoon, plain water has a difficult time competing with diet soda, but lemon
water is another story… :)
I’m in relatively good shape for a programmer. I’m not overweight, but I have a
little bit of extra fat, I usually shave it off during summer. As I got older,
I noticed that it’s getting harder to get back in shape. After reading Hacking
strength: Gaining muscle with least resistance & Least resistance weight
loss by the excellent Matt Might, I started using MyFitnessPal, a
food diary application. It has great ratings on Google Play Store with millions
of downloads, and I managed to log a full week with it without going crazy.
Getting up early in the morning has always been a problem for me. It got better
last year mostly because we have scrum stand-up meetings at 9:45am every
morning. I’d like to start work earlier, especially with summer coming up: I
want to get out by 5:00pm to enjoy the sun and the outdoors.
I started tracking my time of arrival at work using joe’s goals, a simple
log book. It’s quick and easy to get started and to use, I like it.
I just had a weird problem with sh’s syntax, the following function didn’t
parse:
f() {
    return { false; } | true
}
f should return 0: a successful exit value. The problem is that using return like that is invalid: return expects an optional numeric argument, not a list or compound command, according to the POSIX spec. The solution is simply to do something like this:
f() {
    { false; } | true
}
This will return the last command’s exit code; in this case it’s true, so the value is zero. It’s still difficult to find good information about shell scripting on the net, so I thought I’d throw that here.
Google announced that they will shut down Google Reader, their RSS feed reader, this summer. I’ve been using it for many years now; feeds and blogs aren’t trendy anymore: the cool places to post content are social networks. I thought about letting Reader go without finding a replacement. RSS feeds are a better source of quality content than social networks, but checking Reader every day takes between 10 to 20 minutes.
I still decided to look for an alternative; I briefly tried the Newsblur demo. I didn’t like the interface: it felt a little bit too cluttered, and the site wasn’t working well when I tried it. Then I tried Feedly for a week, and so far I’m impressed by it. The interface is clean and minimal, while still providing all the things I had in Reader and more.
The shell’s job control can be a good alternative to terminal multiplexers like screen or tmux. Here’s a small example:
$ vi file
[1]+ Stopped(SIGTSTP) vi file
$ jobs
[1]+ Stopped(SIGTSTP) vi file
$ vi other
[2]+ Stopped(SIGTSTP) vi other
$ jobs
[1]- Stopped(SIGTSTP) vi file
[2]+ Stopped(SIGTSTP) vi other
$ fg %1 # resume 'vi file'
vi file
[1]+ Stopped(SIGTSTP) vi file
$ jobs
[1]+ Stopped(SIGTSTP) vi file
[2]- Stopped(SIGTSTP) vi other
$ fg %2 # resume 'vi other'
I used Todoist for a few years to manage my TODO list. Then I gave up on electronic TODO lists and Todoist; it just didn’t seem to work very well because I needed to be near a computer at all times. I switched to paper, but it didn’t work either, at least for my personal use. I forgot the notebook constantly, and sometimes didn’t check it for weeks. Now when I get home after work I write my tasks for the evening on a small piece of paper: this works rather well, but it’s only good for short-term tasks.
Since I got a Nexus 4, I can run modern apps without too much frustration. I’m trying a few TODO list apps to see if any of them are any good. I tried Todoist again: the web app, and the Android app. I didn’t like it: I remember Todoist being more keyboard-driven; all the shortcuts I learned back then seem to be gone or not working. A lot of features are reserved for paid users, and I didn’t like it enough to pay those $30. Moving on.
I looked for a better alternative and found Wunderlist. It looked nice and
simple. After a week of usage, I’m not really convinced. It doesn’t do nested
tasks, or more precisely you can only have 2 levels of nesting, and adding
subtasks isn’t super intuitive. I use nested tasks quite heavily, that may be a
deal breaker. On the other hand the interface seems to be more or less the same
everywhere, which is a nice plus. I’ll keep using it for a little while, and
probably get rid of it, but I want to give it an honest try.
I liked Emacs Org-mode back in the days when I used Emacs. It could be an
interesting option.
According to the podcast, the only good way to improve reading speed is… wait for it:
To read faster, concentrate on reading slower, and read more often.
All those things they teach you in rapid reading are gimmicky: you just trade comprehension for speed. Speed reading doesn’t look that attractive when you know that… I’ve put the book at the bottom of my reading pile: I’ll probably never read it.
My biggest problem when reading is distraction. I read a few paragraphs, I get
bored, and I look away for a few minutes. When I come back to the book, I’m out
of it, and I have to re-read those paragraphs to get back into it. When that
happens I estimate it takes me triple the time to read and comprehend this
chunk of text.
Meditation is supposed to help focus, but I think the thing I need the most is sleep. I’m still pretty bad at going to bed at a reasonable time.
$ make config
$ make rmconfig # Delete old config
$ make config-recursive # Configure port & all its dependencies
$ make config-conditional # Configure port & deps that lack saved options
I love Nick Johnson’s Damn Cool Algorithms series, where he writes about new or unusual algorithms. I just finished the post about homomorphic hashing. It’s a cool idea, but it’s based on modular arithmetic like RSA, which is rather slow even on modern computers. I wonder if an algorithm based on elliptic curve cryptography would be more practical.
Idea for later: a web-based Carcassonne-like game with a bunch of “credible” AIs playing it to get it started. This would “solve” the chicken-and-egg problem that most multiplayer games have: to attract users you need to already have a bunch of them. The challenge would be to have an AI good enough to be mildly challenging to human players.
Apparently Disqus –an online commenting service for websites– decided unilaterally to put ads on their users/customers’ websites:
The following extract shows how a messaging client’s text entry could be
arbitrarily restricted to a fixed number of characters, thus forcing any
conversation through this medium to be terse and discouraging intelligent
discourse.
<label>What are you doing? <input name=status maxlength=140></label>
On asserts
It’s common to see C/C++ projects disable asserts when building releases.
The book Cryptography engineering argues that it’s a mistake: production
code is exactly the place where assertions are most needed. That’s where things
should never go wrong, and if they do we shouldn’t sweep the problem under the
rug.
Patrick Wyatt, an ex-Blizzard developer who worked on the early Warcraft, Diablo, and StarCraft, came to the same conclusion after working on Guild Wars: it’s OK to “waste” a little bit of CPU to make sure production code runs correctly.
Assertions aren’t that expensive, we really shouldn’t remove them in production.
These days speed is rarely an issue while correctness is always an issue.
Do-It-Yourself Dropbox based on Mercurial:
#!/bin/sh
set -o errexit # Exit as soon as a command returns an error
hg pull --update
hg commit --addremove --message "Update from $(hostname)" "$@"
hg push
How to use it:
$ hg clone <remote repo> ./shared
$ cd shared
$ cp ..../sync.sh . # sync.sh is the script above
$ touch file1 file2
$ ./sync.sh # This will add file1 & file2
$ rm file2
$ ./sync.sh # This will delete file2
$ touch file3 file4
$ ./sync.sh file3 # This will add file3 but not file4
I also have a script update.sh that doesn’t synchronize remotely:
#!/bin/sh
hg commit --addremove --message "Update from $(hostname)" "$@"
If you’re using an editor that writes temporary files in the directory, like Vim or Emacs, don’t forget to add the relevant regexes to the directory’s .hgignore:
\..*\.sw[op]
.*~
If you have difficulties sleeping at night because of the noise, or if you work in an open space and can’t focus for very long, you should give earplugs a try. They require a bit of adaptation, but after around 10 hours of use the initial discomfort almost vanishes.
I’ve tried 4 different types of foam earplugs these past 10 years. Initially I used EAR foam earplugs for many years; I tried the classic, and the neon yellow. The classics weren’t great: they’re a bit rough against the skin, which makes them rather difficult to wear at night and for long periods of time. The neon yellow were a bit softer and isolated better; I used them for 4 years after I ditched the classic.
6 months ago, I decided to try some new ones. I ordered 20 pairs of Howard Leight MAX & Hearos Ultimate Softness Series after reading reviews on the web (links below).
The Howard Leight MAX are great for work. They fit snugly, and isolate well. They are a notch above the EAR neon yellow in comfort and isolation. They aren’t that great for sleeping: if you wear them for more than 2 hours, they start hurting your ear canal a bit. For sleeping, the Hearos Ultimate Softness are great: they don’t isolate as well as the others, but when you’re sleeping this isn’t as important. What’s important when you’re sleeping is comfort, and the Ultimate Softness are the most comfortable earplugs I’ve ever tried. After a night of sleep your ears won’t hurt a bit. I’m planning to order 100 pairs of those new ones: focus and sleep are 2 things I can’t afford to lose, and I need all the help I can get in life.
New Year’s resolutions were trendy last week; I like to be fashionably late to parties… I won’t set ambitious goals for 2013, like winning an Olympic gold medal, or having sex with twin Swedish super-models. I’ll go for small manageable goals for the first 3 months:
January: Give a talk. I’ll have to do it for work, so it should be ‘easy’.
February: Go watch a stand-up comedy. I’m planning to see Maria Bamford at the
Comedy Mix.
March: Get a new laptop, and use FreeBSD on it for at least 1 month. I think
I’ll get a Lenovo Thinkpad T430u.
Ideas for the rest of the year (1 per month):
Get a Powerball, use it every day and write my score down. Then create charts
with the data, and brag about it.
Learn Racket & write a smallish (~1000 lines) program in it. Maybe an RFC 3339
parser?
Play roller hockey for 1 hour or more at least 15 times
Buy a green biscuit, practice stick-handling with it for 15 minutes 5 times a
week
I bought Super Dungeon Explore last summer; I need to assemble and paint the miniatures. I may not be able to do this in a single month though…
I often wonder what will come after the current “Internet”. CCN is a good
candidate to replace the whole or parts of the IP/TCP/HTTP stack, and it can run
on top of the existing stuff: IP, IP/TCP, etc… unlike say IPv6.
Awesome isn’t anymore. It’s overused, everybody knows it; as of today the first result from Google for awesome is the Urban Dictionary’s definition:
An overused adjective intended to denote something as “cool” or “great” but
instead winds up meaning “lame.”
Instead of awesome, use great: Alexander the Great is better than Alexander the Awesome. The worst part of the overuse of the word awesome is that it’s almost always used inappropriately. It comes from awe: a feeling of fear and wonder.
An article from antirez –the guy who started Redis– about memory corruption, and how to detect whether it’s a hardware problem or not:
The white bean / chicken chili tasted OK, but I find it a bit long to cook. I really like the black bean soup: it’s faster to make and tastes better. I spiced it up with some red Thai chili in the relish. I’m not big on spicy food, but most soups are bland, and they often need a bit more punch.
Procrastination: I’ve finally filled out the paperwork for a new savings account. I got the form in April: I looked at it, put it on a shelf, and left it there for more than 6 months. I didn’t forget about the form, I just left it there, at the back of my mind, during those 6 months. It took me literally 2 minutes to fill it out. Now I have to send the letter, and if I don’t do it right away I may leave it on my desk for weeks.
I’m like everybody: an everyday-life underachiever.
Frank and Oak is an online menswear maker. They release a new collection
every month, and sell it exclusively on their website. They target guys with
slimmer bodies. I ordered a shirt and a blazer, it was quickly shipped and
delivered.
The shirt was a good surprise: it fits well out of the box. I’m a skinny guy; the
shirts that fit me best are the ultra-slim 1MX from Express. Frank and Oak’s
shirt isn’t as snug, but it’s still one of the best-fitting shirts I’ve had. For
$45 you get a decent shirt, and with the money you saved you can get it tailored
just right.
The blazer is okay. It’s a little bit narrow around the shoulders, but overall
it’s pretty good considering it was only $50. It took me a while to realize the
side pockets were sewn shut; they looked like the fake pockets you get on low-end
blazers.
I’ll probably order more from them.
I used to have a WRAP –a tiny x86 computer– as a router. It wasn’t doing
much routing though, since it only had my desktop connected to it. I messed it
up a year ago while flashing the firmware, broke it, and never managed to
get it to work again.
I just ordered an Alix 2d13 as a replacement. It’s a nice upgrade, with a USB
port and an IDE connector. I’m planning to install OpenBSD 5.2 on it. It
will be released tomorrow, right before I get the new hardware.
It’s an expensive toy –$300+ not including shipping– but I get an open
platform I can hack and play with. I tried to use an old wireless router with
OpenWRT, but the wireless signal was pretty bad.
CREATE TABLE queue (
    id INTEGER PRIMARY KEY,
    available BOOLEAN NOT NULL DEFAULT TRUE
    ...
);
The queue table is only used to lock items and mark them as done. You can store
data in the queue table, but I’d recommend storing it in a separate table to
keep the queue table relatively small.
To lock the next item in the queue:
UPDATE queue SET id = @last_queue_id := id, available = FALSE
WHERE available = TRUE ORDER BY id LIMIT 1
The key part is id = @last_queue_id := id: this marks the next item with
available = FALSE and sets the user variable @last_queue_id to its ID.
You can then get it with:
SELECT @last_queue_id
Once you’re done with the item, you delete it from the queue:
DELETE FROM queue WHERE id = @last_queue_id AND available = FALSE
The available = FALSE clause isn’t necessary, but I like to keep it just to be
extra safe.
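Putting it all together, a full cycle looks like this (MySQL, since @-style
user variables are a MySQL feature; the explicit IDs are just for the example):

INSERT INTO queue (id, available) VALUES (1, TRUE), (2, TRUE);

UPDATE queue SET id = @last_queue_id := id, available = FALSE
    WHERE available = TRUE ORDER BY id LIMIT 1;

SELECT @last_queue_id;  -- returns 1: the item we just locked

DELETE FROM queue WHERE id = @last_queue_id AND available = FALSE;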
Last night I ironed 5 shirts & 1 polo in less than 1 hour. I followed the
instructions from the videos I posted yesterday. That’s pretty good
considering that I hadn’t ironed in 5+ years.
That’d work out to 1h30m weekly if I include pants. That’s too long; I’ll try to
lower it to 45m with the same “quality”.
My life is very exciting.
I’m trying to look sharper these days, that’s why I started regularly ironing
shirts and pants. I used to iron weekly, but stopped after I moved to Vancouver.
Back then I never really tried to improve my ironing technique; it was a mindless
chore not worth perfecting.
That was a mistake, I should try to perfect my ironing technique BECAUSE ironing
is boring and time-consuming.
There are quite a few videos on how to iron stuff on YouTube. These 2 are the
best I’ve found so far:
Impressive video presentation of acme(1), a “text editor” written by
Rob Pike. If you saw the video introducing Xiki, it may look familiar. Many ideas
from Xiki are implemented in Acme. The idea that text IS the interface is pushed
quite far in Acme.
I’m a Vim user, I like to drive my computer with the keyboard only. Acme takes
a completely different approach: the mouse is used everywhere, and it does more
than in regular programs: button 2 executes the selected text as a command, for
example. I’ve never used Acme, but I’m very tempted to look into it. My main
gripe with Vim is that it’s difficult to script: VimScript is yet another
language you have to learn, and Vim’s programming interface seems ad-hoc and
looks difficult to learn and use.
Acme seems to be easier to interact with programmatically. It may very well be
my next text editor.
I posted on scratchpad a TED talk about aging, and how living past 100
years may be common in a not-so-distant future. I wasn’t convinced, but the talk
was cool and informative nonetheless.
It turns out that some guy named Edward Marks did his homework and looked
at life expectancy numbers and where we’re headed if we keep going at the same
rate. It looks like the dream that average people could live past 100 years is
indeed just a dream. We die less, but progress is slowing down.
Steven Johnson is the author of 2 books about creativity: Where Good Ideas Come
From, and The Invention of Air. He wrote an article about what he calls a
Sparkfile, which is just a list of ideas.
He writes down every idea he has in this Sparkfile, and once in a while revisits
it to see if there’s anything of value, or ideas he could combine.
That was more or less what my scratchpad was for. Blog & note taking kind of
thing. I need to post more ideas in here.
I posted a while ago links to a few Python projects worth checking out.
I tried them all for a bit, and I really like PyRepl: an alternative to the
standard Python interpreter’s readline interface.
It crashes less than bpython, it’s lighter than IPython, and it’s
written 100% in Python. What’s not to like? Well it crashes from time to time
when I use the arrow keys, kind of like bpython. I haven’t managed to reproduce
the problem consistently yet, mostly because it’s rare.
You should use PyRepl for its better completion and coloring, it’s a pretty good
alternative to the bloated IPython.
He talks about Datomic, a new database with an innovative architecture.
I’m making pizzas for dinner these days.
I used to coat the top of my pizza with a little bit of generic tomato sauce. I
saw a pizza sauce recipe on my favorite cooking blog last week, and
decided to try it. It changes everything: because the previous sauces weren’t
thick enough, I had to put a ton of toppings on my pizzas. This sauce is thicker,
and much tastier than what I used before. Result: fewer toppings, less work, and
yet the pizzas taste better.
I find it fascinating that money gets routinely reinvented by people who don’t
have access to “regular” currency. Bartering creates such a big transaction cost
that we’re almost hardwired to come up with something better.
Another surprising part is that metal money (coins) is likely a state
invention. It turns out that the private sector naturally uses things with real
“value”, like rice.
I’ve tried DuckDuckGo a few times since it launched. Every time I wanted to
like it, but it had a few shortcomings: the results weren’t quite as good as
Google’s, or it was a little bit too unfamiliar.
I tried it once again this week, and I think it’s now good enough for me to
switch permanently. Google is full of spam right now; it looks like Google
refuses to ban content farms like Demand Media, which operates eHow & Cracked.
This switch to DuckDuckGo has more to do with Google Search deteriorating
than with DuckDuckGo giving me something new.
It was easy enough to prepare, but I made a big mistake: I didn’t taste the
pepper before integrating it into the vegetable mixture. It was spicy, too
spicy. Otherwise I recommend this recipe.
Side note: Fish is easy to cook, healthy, tasty, and often better for the
environment than meat.
I’m glad I took 40 minutes to watch it, I’ve learned and understood so much.
I started programming in C 15 years ago, but I don’t consider myself an expert
or anything like that. There are just too many aspects of C I’m not comfortable
with. One of those is the preprocessor.
Something I never fully understood is the STRINGIFY macro hack. The preprocessor
has a “#” operator that turns a macro argument into a string: given
#define STRINGIFY(x) #x and #define HELLO(x) "Hello " #x, STRINGIFY(foo) expands
to "foo" and HELLO(world) expands to "Hello " "world". So far, so good, but when
you try to stringify another macro it doesn’t work as expected:
#define STRINGIFY(x) #x
#define FOO bar
STRINGIFY(FOO) /* This will output ... "FOO", not "bar" */
STRINGIFY(__LINE__) /* This will output "__LINE__", not "4" */
If you look for a solution on the interweb, the answer is usually to use another
auxiliary macro, and it “magically” works:
#define STRINGIFY(x) STRINGIFY2(x)
#define STRINGIFY2(x) #x
#define FOO bar
STRINGIFY(FOO) /* This will output "bar", why?!?!? */
Why does that work? Because an argument used with the # operator isn’t
macro-expanded, but ordinary macro arguments are, and the result of an expansion
is rescanned for more macros. Here’s what happens:
In STRINGIFY(FOO), the argument FOO is expanded to bar (the body STRINGIFY2(x)
doesn’t stringify x directly), so we get STRINGIFY2(bar)
STRINGIFY2(bar) is then expanded to #bar, i.e. "bar", since STRINGIFY2 is the
one that uses #
I’m excited by 2 new languages at the moment: Go and Rust.
Rust is still pretty much in development, but Go is already stable. There are a
few introductions to Go floating around, but the latest one by Russ Cox just
shows you the good stuff:
I’m trying to get over a Diet Coke addiction. I drink 3 or 4 cans of Diet Coke
each day. There’s salt and caffeine in Diet Coke to make people pee, feel
thirsty, and drink more.
Going to the bathroom every 2 hours is not pleasant, and it may have harmful
health effects in the long run.
There’s 45mg of caffeine, and 35mg of salt in a Diet Coke can. A can of Diet
Pepsi contains only 36mg of caffeine, and the same amount of salt. From now on
I’ll drink Diet Pepsi: 9mg isn’t a big reduction of caffeine, but it’s a step in
the right direction.
I like cycling caps, I like the snug fit. The short visor pointing down protects
well against the sun, the wind, and the rain. I have a wool one from Walz
Caps; I’ve had it for 3 months now and it instantly became my favorite hat.
I’ll get rid of my regular caps, and get 2 more of those.
Cycling caps are kind of a hipster hat, but I can deal with that.
Try one once, you may like it ;)
Finding pants that fit used to be a problem for me. It takes a while to find a
pair that fits just right: each brand has a slightly different cut, and the right
combination of hip / inseam size is not always available. I usually ended up
buying pants that were slightly too long or too short. There’s an easy solution
to this problem: tailors. For $10 or less you can get your pants shortened just
right. I’m embarrassed I never went to a tailor before: in retrospect it seems
like such an obvious thing to do.
Captain Clueless, signing off.
Small Python script to calculate a file’s entropy. It reads its input from
stdin.
#!/usr/bin/env python
# Usage: python entropy.py < somefile  (Python 2)
from collections import defaultdict
from math import log

def entropy(input):
    alphabet = defaultdict(int)  # byte -> number of occurrences
    total = 0  # how many bytes we have read in total
    buf = True
    while buf:
        buf = input.read(1024 * 64)
        total += len(buf)
        for c in buf:
            alphabet[c] += 1
    if total == 0 or len(alphabet) < 2:
        return 0.0
    entropy = 0.0
    for c in alphabet.values():
        x = float(c) / total
        # log base len(alphabet) normalizes the result to [0, 1].
        entropy += x * log(x, len(alphabet))
    return -entropy

if __name__ == '__main__':
    import sys
    print entropy(sys.stdin)
If you’re looking for narcissistic people, there’s a new site referencing
them:
I planned to write a funny essay about how I’m a nerd who can’t dress, that I
finally realized the obvious, and would take better care of my look from now on.
It turns out that writing funny essays takes a long time, and it wouldn’t be
that funny anyway.
In May I decided to get into that “style” thing. I grew a light beard back in
March, and it altered the way people perceive me more than I expected. No need
to show ID at the liquor store anymore, at 32 it was about time…
Others look at you a lot, way more than they listen to you. Looks are a quick and
relatively reliable way to size someone up. A guy wearing a nicely cut suit is
not a homeless guy to anyone; someone wearing skinny torn jeans and a dirty
‘Punk not dead’ T-shirt is probably not a banker.
I want to get better at looking good, and I want to talk about it. I’ll post
more later.
Too few updates lately… I have been pretty busy at work. Here’s something I’ve
learned today. In C, when I wanted to initialize an array to all zeros I used
memset. But there’s a simpler way:
int array[3] = {0, 0, 0};
OK, that’s nice. But what if the array is really big? There’s a shorter version
with the same effect:
int array[3] = {};
When you omit elements in an initializer, they automatically default to the
type’s zero value. (Strictly speaking, the empty braces are a GNU extension;
standard C requires at least one element: {0}.) If you want to initialize an
array without specifying all the elements, you can do something like this:
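/* C99 designated initializers: elements 10 and 20 are set, everything
   else defaults to 0 (the indices here are just an example). */
int array[100] = {[10] = 1, [20] = 2};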
I bought a new bike a few months ago: a fixed gear. After a week of getting used
to it, it feels great. It feels like being connected to the ground: it’s easy to
adjust your speed, and most of the time there’s no need for brakes: you can slow
down with just the pedals. I’m going to keep the brakes on for a little while,
better safe than sorry ;)
I still have my old bike. It’s going to be my “rainy days” bike. I plan on
converting it to fixed gear sometime after this summer.
I’ve been looking for a cheap track frame to replace my old road frame. I found
this one for $169.
libtomcrypt is a pleasure to work with: the code is clean, readable, and
things are well laid out. One of the few things I disliked is how ECC keys are
represented internally:
typedef struct {
    void *x, *y, *z;
} ecc_point;

/** An ECC key */
typedef struct {
    /** Type of key, PK_PRIVATE or PK_PUBLIC */
    int type;
    [...]
    /** The public key */
    ecc_point pubkey;
    /** The private key */
    void *k;
} ecc_key;
If type is PK_PUBLIC, the private component of the key should probably be
NULL.
I think this is suboptimal and potentially confusing. It seems to me that
the following would be better:
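/* A sketch of what I have in mind (the other fields from ecc_key
   would come along too; they're elided here): */
struct ecc_public_key {
    ecc_point pubkey;
};

struct ecc_private_key {
    struct ecc_public_key public;
    void *k;
};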
Introducing 2 different types for public and private keys allows us to be more
specific with our type requirements. For example, the function
ecc_shared_secret currently looks like this:
int ecc_shared_secret(ecc_key *private_key, ecc_key *public_key,
                      unsigned char *out, unsigned long *outlen);
A new API could enforce the key’s type more easily:
int ecc_shared_secret(const struct ecc_private_key *private_key,
                      const struct ecc_public_key *public_key,
                      unsigned char *out, unsigned long *outlen);
This way you can get rid of checks at the beginning of some functions, like this
one:
/* type valid? */
if (private_key->type != PK_PRIVATE) {
    return CRYPT_PK_NOT_PRIVATE;
}
Now the key type is explicit: private keys can only be struct
ecc_private_key, public ones struct ecc_public_key. If you want a function to
accept both key types, you can do something like this:
int ecc_some_function(struct ecc_public_key* public, void* private, ...);
And pass the private component of the key manually.
Designing a good API is hard. Even the little choices can be difficult. Let’s
take a function to decrypt AES blocks, for example; this function consumes
a buffer 16 bytes at a time. Here’s what such a function would look like:
void decrypt(const void* input, size_t nbytes);
(There’s no output parameter, we’re just looking at the input here)
input is a pointer to the buffer we’re working with, nbytes is how many
bytes to read from the buffer.
The function consumes blocks of 16 bytes, so what happens when nbytes is not a
multiple of 16? Should we silently ignore the few extra bytes? Should we have an
assert(nbytes % 16 == 0)? Maybe we could specify how many blocks to consume?
But then the API’s user would have to remember to divide the buffer size by 16.
I don’t know what the good answer is there.
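For what it’s worth, here’s the assert option sketched out (just the check; the
decryption itself is elided):

#include <assert.h>
#include <stddef.h>

void decrypt(const void *input, size_t nbytes)
{
    assert(nbytes % 16 == 0); /* refuse partial blocks loudly */
    (void)input; /* ... decrypt nbytes / 16 blocks of 16 bytes here ... */
}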
Reading or sending data on a TCP socket looks simple, but it can be tricky.
read(2) & write(2) don’t have to process all the data they’re given. If you
call read on a socket requesting a billion bytes, you’ll probably get less.
If you need to read a set number of bytes, you have to repeat the call until you
have all the data you want. Here’s an example of such a loop in Python:
def readall(sock, count):
    """Read exactly count bytes from sock; raise on end of file."""
    r = bytes()
    while count:
        x = sock.recv(count)
        if x == '':  # end of file
            raise EOFError('connection closed with %d bytes left' % count)
        r += x
        count -= len(x)
    return r
In Python sockets have a sendall method, which does more or less that for the
sending side, but there’s no recvall. There’s another option though: the
socket.makefile() function creates a file-like object for the socket. With this
file-like object there’s no need to loop to get the whole buffer back:
sock = socket.create_connection(...)
f = sock.makefile()
x = f.read(1234) # Will return a 1234 characters long string
You must make sure buffering is enabled for this to work.
PS: Note that calling read/write on a socket is basically the same as calling
recv/send without flags.
Reminder to self.
The Open Group Base Specifications Issue 7, also known as IEEE Std 1003.1-2008,
also known as POSIX.1, is publicly available here:
tmux, the terminal multiplexer from heaven, can copy-paste!
Here’s how to use it:
C-b [ enters copy mode
Move around and stop on the character where you want the selection to start
Press Space to start the selection
Press Enter to finish the selection
C-b ] pastes the copied text
I finished the Kobold Guide to Board Game Design last night. I’ve never tried
to create or modify a board game, but I think what makes board games great also
applies to other kinds of games, especially online browser games.
The book is a collection of articles by various game designers sharing their
ideas and stories. The authors include: Richard Garfield (Magic: The Gathering),
Steve Jackson (Munchkin), and Dale Yu (Dominion). The book is easy to read and
mostly jargon-free.
I expected the book’s content to be more analytical than it is. Most articles
turned out to be practical and concrete, and that’s great. I wish there was a
little bit more ‘theory’, but board game designers don’t seem to use math &
statistics to balance and design their games.
Great read overall: highly recommended, five stars, and all.
Don’t just make random changes. There really are only two acceptable models of
development: “think and analyze” or “years and years of testing on thousands
of machines”. Those two really do work.
If you can ignore the irritating InfoQ video/presentation player, this Rich
Hickey interview is full of fresh and innovative ideas:
I can send an IP packet to Europe faster than I can send a pixel to the
screen. How f’d up is that?
Somebody asked about that on Super User, and John Carmack went on to explain
why it was like that. He didn’t just cross-reference a few sources to come
to his conclusion: he measured, and checked his assumptions with experiments. He
acted like a scientist, adapting his mental model to the world.
While hacking on my dwm I noticed this line in dwm.c:
while(running && !XNextEvent(dpy, &ev))
Notice the ‘not’ before XNextEvent. I wondered why it’s there; as far as I can
remember XNextEvent isn’t supposed to return anything meaningful. A quick look
at the manual helped, but didn’t solve the problem:
int XNextEvent(Display *display, XEvent *event_return);
The function returns an int, but there’s no explanation of what this return
value is. In X.org’s libX11 source code, XNextEvent always returns 0.
I imagine that somewhere there’s a version of the Xlibs that returns a non-zero
value when there’s an error, and maybe there’s a doc somewhere explaining what
it means, but I couldn’t find it. Or maybe it’s simply undefined behavior, an
error…
When I connect to a remote host via SSH, I like to start a new tmux session
or re-attach to an existing one. This is how I was doing it before:
if tmux has-session
then
    tmux attach-session
else
    tmux new-session
fi
There’s something much simpler: tmux attach || tmux new
Things that would be nice to fix with URLs.
Drop the //: http:google.com is shorter and easier to read. Tim Berners-Lee
regrets putting those 2 extra characters in when he started the web.
Inverse domain names. Example: in sub.domain.com, com is the top level, and sub
is the bottom level: domain names put the top level last. Paths in URLs do the
opposite, the top level comes first:
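For example, in http://sub.domain.com/articles/2012/january (a made-up URL), the
path reads top-down –articles, then 2012, then january– while the host reads
bottom-up –sub, then domain, then com. A consistent scheme would look like
http://com.domain.sub/articles/2012/january.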
Alice is a librarian: she has a house full of books. Bob likes books, he wants
to read as many as he can, and is willing to pay. Alice –being a savvy
businesswoman– wants to open her library to the public for a fee. She wants to
maximize profits; Bob wants to maximize the number of books he gets, up to the
limit of what he can read, and he wants to minimize the money he spends.
The books are a limited resource: if Bob takes all the copies out of the
library, Alice can’t serve more clients, so she needs to manage her library to
make sure customers don’t abuse the system.
We’ll consider everything else to be equal. All books have the same value, and
they all take the same time to read.
Let’s consider 2 different business models:
All-you-can-read for a monthly fee, say $20.
Pay-per-book: Reading 1 book costs $1 for example.
All-you-can-read
Alice gets $20 from Bob, and he gets free access to the library for the rest of
the month.
To increase her profits Alice needs more customers, that’s the only way; she
can’t charge Bob more. Since the number of books is limited, more customers
means fewer books per customer. Every time a customer takes a copy out of the
library, it reduces Alice’s potential profit. To maximize her profit Alice
should minimize her resource usage: limit the number of customers, or how many
books a customer can take each month. Bob tries to read as many as he can: one
extra book doesn’t cost him anything, so without any quota he can get more than
his fair share.
Alice’s goal is not aligned with Bob’s goal. Alice wants to reduce the number of
books Bob reads, Bob wants to maximize it.
Pay-per-book
With a price of $1 per book, Alice doesn’t really care how many customers she
has. She wants to rent out as many books as possible: 1 or 100 customers doesn’t
really make a difference to her bottom line.
Here our 2 characters’ goals are aligned, Alice wants Bob to read as much as he
can. Bob can choose how much he reads and spends.
So What?
It’s clear that the pay-per-use formula works better than all-you-can-read.
So why do we use the worse model for our Internet access and phones? Why don’t
we try to align the goals of carriers and customers?
Getting connected is not all about bandwidth; there are fixed costs that can’t
be easily covered by a pay-per-use model. A price per gigabyte with a minimum
price per month could lower our bills while holding back the freeloaders.
That’s not going to happen anytime soon though. Many have an interest in keeping
the old fixed price per month going. Internet and phone carriers like it,
because it’s an easy way to maintain and increase their profits: customers pick
plans that exceed their needs, and pay more than they should as a result. The
price of bandwidth tends to fall steadily over time, but carriers don’t lower
their prices very often. Freeloaders also want to keep the system the way it is:
customers who use most of their quota are the ones getting the best value from
their broadband access.
If we want broadband to be more ubiquitous and cheaper, we need to treat it as a
real commodity, like water or electricity.
Instructions are simple 16-bit words with the following format (bits):
bbbbbbaaaaaaoooo
oooo is a 4-bit op code; the a’s and b’s are two 6-bit operands. The
instruction format is a little more complicated than that, but that’s roughly
it.
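Decoding a word is just shifts and masks. A quick sketch in C (the function and
field names are mine):

#include <stdint.h>

/* bbbbbbaaaaaaoooo: op code in the low 4 bits, operands a and b above it. */
static void decode(uint16_t word, unsigned *op, unsigned *a, unsigned *b)
{
    *op = word & 0xf;
    *a = (word >> 4) & 0x3f;
    *b = (word >> 10) & 0x3f;
}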
To me this looks like a pretty good candidate for a perfect hash function like
the ones created by Gperf. What kind of tree do we have?
The 4-bit value spawns 16 branches. Each 6-bit value spawns –according to my
quick glance at the spec– 12 branches. That’s 16 * 12 * 12 = 2304. Quite a bit
more than I expected. Gperf might not be such a good idea after all.
Google just announced Account Activity, a new feature that lets you see what
you’ve done on its services every month. Big companies like Google and Facebook
know how valuable personal data is: it helps them get more customers, and it can
be sold more or less directly to advertisers.
It’s also useful to their customers. I suspect my personal data is most valuable
to me; I’m sure I can get more use out of it than Google or advertisers can.
This extra data is another reason to use Google’s services. Well played Google,
well played.
You have to measure. Vitess swapped out one its protocols for an HTTP
implementation. Even though it was in C it was slow. So they ripped out HTTP
and did a direct socket call using python and that was 8% cheaper on global
CPU. The enveloping for HTTP is really expensive.
Not a surprise for me, but I guess it would be for most people. HTTP is not a
good general-purpose protocol; it’s not even good at doing what it was designed
for. I try to avoid HTTP like the plague, but it’s difficult to go against the
herd of web developers who think “HTTP is th3 4w3s0me!”
I hadn’t maintained my bike for a year; last week-end it got a long overdue
tune-up.
Saturday I went to the Pedal Depot, a small bike workshop where you can rent
stands and tools to fix your bike, with friendly employees to help you when
you’re stuck or have a question. They also sell second-hand parts.
There’s another, bigger workshop nearby on Main St. I went there 3 years
ago and was able to buy and assemble my current bike for around $200. It took me
almost 10 hours over 2 days to get it all done. It was long, but it was fun, and
I learned a lot.
The Green Biscuit seems like a good way to improve ice hockey stick
handling during summer. I play roller hockey with a ball during summer, when I
go back to ice hockey with a regular puck it takes a few weeks to get used to
the puck again.
Something to look at again in a few months.
I realized this week-end that I had an artificial scarcity problem with
bus tickets.
My old process was:
Realize I’m out of tickets
Go to the store and buy 1 booklet of 10
Repeat after a few weeks.
The new one is:
Realize I’m out of tickets
Go to the store and buy 10 booklets
Chill for 6 months because I now have 100 tickets
I don’t know if the post about nonces was really clear.
To encrypt the counter, you need to put it in a buffer the size of a block. For
example, to encrypt the counter 1234567890, you’d have something like this
(little-endian notation):
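d2 02 96 49 00 00 00 00 00 00 00 00 00 00 00 00

(1234567890 is 0x499602d2, stored little-endian and zero-padded; I’m assuming a
128-bit block size here.)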
You encrypt that block, then you use the resulting encrypted block as the
initialization vector to decrypt the first block of the encrypted message.
I wanted to use only unixy tools to build scratchpad, but it turned out to
be cumbersome. The HTML and Atom generators are written in Python: 103 lines
of code as of today.
I tried to use rc and sh, but they were quite inconvenient. After
a while I decided to fall back to Python.
These days I’m reading Cryptography Engineering. I just started the part about
block cipher modes, which is where I learned about nonces.
Using nonces with block ciphers is a good way to minimize the space taken by
initialization vectors (IVs). Instead of sending an additional block with the
IV, you associate a number (counter) with each message. The counter doesn’t
necessarily need to be transmitted with each message, it can be implicit: for
example the first message could have 0 as its nonce, the second 1, etc…
Then you encrypt the counter with the raw block cipher and use the result as
the IV for the 1st block. Simple and elegant, I really like this crypto
‘trick’.
From what I’ve read so far I highly recommend Cryptography Engineering. It’s a
pleasure to read, and you might learn a thing or two.
I just played another game of Dominion. I’m just getting started with it,
but I can already understand why it’s such a popular game:
Simple to learn: you can start playing without knowing all the rules. And
there are no secrets on your turn, so other players can help you.
Quick: you never wait more than a few minutes before your next turn.
Deep gameplay: Dominion has a nice balance of luck and skill, kind of like
Poker.
Looking forward to playing more Dominion! :)
I usually clone Virtualenv’s repository to use it. Here’s a quicker solution:
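virtualenv.py is self-contained, so downloading the single file is enough (the
URL is from memory and may have moved):

curl -O https://raw.github.com/pypa/virtualenv/master/virtualenv.py
python virtualenv.py myenv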
I just played Dominion with co-workers. I think it was the 1st time I
played a 4-player game of Dominion. It was surprisingly fast: less than 40
minutes including setup time.
Looking forward to playing some more Dominion.
I’ve added Google Analytics’ tracker on Scratchpad, because I can.
Scratchpad’s database is a simple log file like this:
2011-12-30T21:22:48-08:00
First post
^L
2011-12-31T01:42:24-08:00
Second post
^L
Just the time, the content, and ^L –the ASCII character 12, or form feed–
as a separator.
This gives me ‘super easy simple’ back-ups. On another server’s crontab I just
add:
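@daily scp scratchpad-host:scratchpad/log backups/scratchpad-log

(The host and paths above are placeholders; the real entry just copies the log
file somewhere else once a day.)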
People get that long passwords with many different characters are safer. Short
password = bad, long password = good.
Humans are good at making analogies, a lot of us think that long keys are
better than short ones. A crypto-system with 1024 bits keys is safer than one
with 256 bits keys, right?
YES! Yes, if “everything else is equal”. If you know a little bit of
cryptography, you know there’s a lot more in a crypto-system’s security than
its key length. Passwords are much weaker than keys; most passwords are
probably weaker than a 32-bit random key.
Marketers use our –correct– assumption that more bits in the key add
security to sell us insecure products with very long keys. Something true makes
us believe something wrong. 10 years ago I believed that long keys
significantly improved security. I should be more wary of all those areas I
don’t know much about. If I make an analogy with cryptography: my lack of
knowledge will make me believe wrong things :)
Quick’n’dirty shell script to serve a single file over HTTP. You’ll need
netcat (nc) to run it; it listens on port 8000.
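Something like this (a sketch; nc option syntax varies between netcat flavors):

#!/bin/sh
# Serve the file given as $1 over HTTP on port 8000, one request at a time.
while true
do
    { printf 'HTTP/1.0 200 OK\r\n\r\n'; cat "$1"; } | nc -l 8000
done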
I’m writing the HTML and Atom generation scripts in Python. I tried to use Awk,
but it was kind of inconvenient. Lua seemed promising, but the lack of the
idioms I got used to with Python made it a bit frustrating.
I wanted to finish a first version of Scratchpad before the end of January;
it’s late February now, so I’d better get on with it using what I know. So
Python it is.
Why are score pages so darn awful?
I follow the NHL, and I haven’t found any score page that does a decent job of
showing the information I want efficiently.
It might be a good project idea for later :)
/usr/bin shouldn’t exist, /usr used to be what /home is today, and today’s Unix
hierarchy doesn’t make sense:
I find it remarkable that the early versions of C were developed and used in a
very constrained environment. Memory and CPU time were scarce, the language had
to be simple to implement. I’m convinced that constraints and limitations fuel
creativity, not restrain it. It’s easier to find a great solution to a problem
when the set of solutions is limited.
Thompson’s PDP-7 assembler outdid even DEC’s in simplicity; it evaluated
expressions and emitted the corresponding bits. There were no libraries, no
loader or link editor: the entire source of a program was presented to the
assembler, and the output file –with a fixed name– that emerged was directly
executable. (This name, a.out, explains a bit of Unix etymology; it is the
output of the assembler. Even after the system gained a linker and a means of
specifying another name explicitly, it was retained as the default executable
result of a compilation.)
I read the Rust tutorial last night.
So far I’m pleased with its design choices. I like that everything is constant
by default: if you want to modify something, you have to declare it mutable from
the start.
The thing I’m not a big fan of is vectors. Vectors are Rust’s version of
arrays; they are ‘uniquely’ allocated on the heap. I like having arrays on the
stack in C, and I’m not sure it’s OK to have ‘everything’ on the heap in
practice.
I just looked at Rust from Mozilla. It’s more or less a Go-like
language. After looking at it for 10 minutes, I think I like it better than Go.
Turns out that using multiple characters for RS is a GNU extension.
So my reverse example doesn’t work with other implementations of Awk. I should
know better, every time I’ve used GNU Unix programs I’ve run into portability
problems.
For scratchpad I need to reverse the order of the records in my log file. The
tricky thing is that each record is one or more lines, with records separated by
a form feed or page break: “\f” or ^L.
Here’s my 1st version, using getline. getline allows us to consume lines
without exiting the current rule:
BEGIN { i = 0 }
{
    do {
        sort[i] = sort[i] $0 "\n";
        getline; # Where the magic happens
    } while ($0 != "\f");
    sort[i] = sort[i] $0;
    i += 1;
}
END {
    for (x = i - 1; x > -1; x--) {
        print sort[x];
    }
}
It worked well enough, but there’s something much simpler. In Awk the record
separator doesn’t have to be a newline, it can be any string, and we can change
it at runtime using the RS variable. This simplifies things:
BEGIN { RS = "\f\n"; i = 0 }
{
    sort[++i] = $0
}
END {
    for (x = i; x > 0; x--) {
        print sort[x] "\f";
    }
}
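To run it (file names made up): gawk -f reverse.awk scratchpad.log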
1st was against a Protoss on Shattered Temple. I fucked up: I lost an SCV, and
forgot to put one back to build the Barracks. I left right away, I don’t think
it’s smart to stay in a game if you screw up early. Better to start over and get
a ‘real’ game going.
2nd, against another Protoss was a win. He didn’t manage to scout me before my
3 Rax completed. Easy win. 3 Rax all-in is pretty strong at low levels.
I’m learning rc. This one took me a while to figure out:
; for (i in 1 100 10) { echo $i } | sort
1
100
10
It turns out that this is not what I expected: I thought the 3 numbers
would be sorted. Here’s how rc interprets it:
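for (i in 1 100 10) { { echo $i } | sort }

The pipe binds to the loop body, so each iteration runs its own sort on a single
line. To sort the combined output, pipe the whole loop instead:
{ for (i in 1 100 10) { echo $i } } | sort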
I didn’t know about the next command, which lets you jump to the next input
line without executing the remaining rules.
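For example, in Awk (made-up rules):

/^#/ { next }  # comment lines stop here...
{ print }      # ...so this rule never sees them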
I forgot to mention the warning Mercurial gives me:
warning: filename contains ':', which is reserved on Windows: 'scratchpad/2012-01-19T18:35:19-08:00'
Using RFC 3339 times as filenames was not a good idea.
I think I’m going to change the architecture of my scratchpad. Instead of a
collection of files in Mercurial, I’ll use a simple log file. It’ll give me
version control for ‘free’, but I’ll lose file name completion in the shell. On
the other hand I won’t need Mercurial anymore, so that’s one less tool to worry
about.
Not sure if that’ll work. I’ll write a prototype and see how it goes.
I played 2 games of Starcraft during my lunch break today. I lost against
2 silver players. I was way too passive during both games. I need to be more
aggressive, I should go 2 Rax pressure every time.
3rd day of practice. I’ll try to play more tonight.
I finally did a biggish blog post today. It’s been a while since I last
blogged. Time to get back into it seriously: 1 article per month. It doesn’t
matter if it’s bad. Maybe I’ll take stuff from this scratchpad. That’s what
it’s for ;-)
There’s going to be a Starcraft 2 tournament at work next month. I’m planning
to participate, but I didn’t play Starcraft for a while now. It’s time to get
back into it.
1st thing: practice a set build order tonight and play a ladder game.
I lost my rear bike light today. I went to Canadian Tire to get a new one, and I
was underwhelmed by the choice and the prices. A crappy-looking rear light was
$9, batteries not included. I’m cheap, so I didn’t buy it.
I went to MEC instead. There, for the same price, I got a rear light and a
small front light, batteries included. Friendly service was also included.
I should know better: specialist stores are almost always cheaper and/or have
more choice than generalists.
After working at Image Engine, I realized how important color temperature is.
When you take a picture indoors and it looks all yellowish, that’s because the
ambient color temperature is ‘low’. Most cheapish light bulbs have a color
temperature around 3000K. When you are outside, the color temperature is
usually between 4000K and 6500K.
Today I decided to replace the light bulb over my desk at home with a 5000K+
light. Since almost nobody knows about color temperature, it is mostly absent
from the technical specifications of cheapish light bulbs, so it’s hard to know
which one to buy. Luckily the Energy Star website has a complete list of
its certified light bulbs, with their respective color temperatures.
I’ll probably get a Philips ‘Mini Twister Daylight Compact Fluorescent Bulb’,
it is supposed to be ‘daylight’-like, with a color temperature of 6500K and a
good luminance.
I’m about to write the HTML generator for my scratchpad. I’ll probably use rc
and shell tools; there’s no need for something more complicated like Python
and Jinja, for example.
Socks are a problem for me. I never have enough of them. I have plenty of
T-shirts, but only 10 pairs of socks at any given time.
It is time to end this, and buy 20 pairs in one go!
I wanted to generate all the HTML for the scratchpad using Python, but I’ll try
to use rc instead.
I just wrote a small pipe editor inspired by vipe. It’s a short shell
script:
#!/bin/sh
case $1 in
-h|--help)
    cat <<EOF
Usage:
    ${0} [command] [arguments...]

${0} is a pipe editor. It reads its standard input, opens it in a text
editor, and writes the result to its standard output.

EXAMPLES
    Edit a file and write it into another file:
        cat input | ${0} > output
    or
        ${0} < input > output

    Edit a file using gvim --nofork, and pipe the result into wc:
        ${0} gvim --nofork < input | wc

    To call ${0} without any input, use /dev/null:
        ${0} < /dev/null
EOF
    exit
esac

args=$*

# edit() rather than 'function edit': /bin/sh doesn't know the function keyword.
edit() {
    ${args:-${VISUAL:-${EDITOR:-vi}}} $*
}

tmp=`mktemp`
cat > $tmp
edit $tmp < /dev/tty > /dev/tty
cat < $tmp
rm $tmp
Yay, I’m finally done with the first version of my little scratchpad.
I’ll tell you some more later, once this is published ;-)
Let’s talk about the fc builtin in zsh. fc lets you edit the last
command in your editor. It might sound kind of pointless, but it makes the
command line that much more powerful: you can have ‘real’ programs doing
something useful in 1 minute, right from the shell.
This makes languages like Awk that much more interesting.
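For example (the one-liner is made up):

% awk '{ total += $1 } END { print total }' numbers.txt
% fc

fc opens the awk line in your $EDITOR; save and quit, and the edited version
runs right away.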