NoSQL and “single-table” design pattern

The NoSQL “single-table” design pattern appears to be a polarizing topic with strong opinions for and against it. As best as I can tell, there’s no good reason for that!

I did a talk at AWS re:Invent last week along with Alex DeBrie. The talk was about deploying modern and efficient data models with Amazon DynamoDB. One small part of the talk was about the “single-table” design pattern. Over the next couple of days I have been flooded with questions about this pattern. I’m not really sure what all the hoopla is about this pattern, and why there is so much passion and almost religious fervor around this topic.

With RDBMS there are clearly defined benefits and drawbacks with normalization (and denormalization). Normalization and denormalization are an exercise in trading off between a well understood set of redundancies and anomalies, and runtime complexity and cost. When you normalize your data, the only mechanism to get it “back together” is using a “join”.

If you happen to use a database that doesn’t support joins, or if joins turn out to be expensive, you may prefer to accept the redundancies and anomalies that come with denormalization. This has been a long established pattern, for example in the analytics realm.

The “single-table” design pattern extends traditional RDBMS denormalization in three interesting ways. First, it quite often uses non-atomic datatypes that are not allowed in the normalized terminology of Codd, Date, and others. Second, makes use of the flexible schema support in NoSQL databases to commingle data from different entities in a single table. Finally, it uses data colocation guarantees in NoSQL databases to minimize the number of blocks read, and the number of API calls required in fetching related data.

Here’s what I think these options look like in practice.

First, this is a normalized schema with three tables. When you want to reconstruct the data, you join the tables. There are primary and foreign key constraints in place to ensure that data is consistent.

The next option is the fully denormalized structure where data from all tables is “pre-joined” into a single table.

The single-table schema is just slightly different. Data for all entities are commingled into a single table.

Application designers, and data modelers should look at their specific use-cases and determine whether or not they want to eliminate redundancies and inconsistencies and pay the cost of joins (or performing the join themselves in multiple steps if the database doesn’t support it), or denormalize and benefit from lower complexity and sometimes lower cost!

The other thing to keep in mind is that nothing in the single table design pattern requires you to bring all entities into a single table. A design where some entities are combined into a single table, coexists perfectly with others that are not.

What’s all the hoopla about?

Three things we should all do this holiday season

With Thanksgiving around the corner, we are getting into the holiday season. The final stretch before we enter the Christmas, New Year breaks.

This is a stressful time for most of us. Between the frantic rush to finish things before the break, family, travel, and the expenses associated with this period is the fact that many companies tighten their belts around this time of the year. Layoffs are not uncommon at this time of the year.

This year is a triple whammy – the usual year-end belt tightening, the aftermath of covid, and the huge layoffs that we are seeing in tech all over the world. This year is particularly bad.

Some of us are luckier than others. Some may have been through this many times, and are prepared (or indifferent). Some may be lucky to not be impacted by the cuts. Some may have strong family and professional networks.

Others are not. Unfortunately some others may be in the midst of personal upheavals. Some may not have been able to visit family overseas for years because of covid and visa issues, and unable to visit anytime soon. And then there are the layoffs.

So if you are one of the lucky ones, here are just three things I urge you to do this holiday season.

  1. Reach out to friends – show that you are there
  2. Be open to connections from strangers – lend an ear
  3. If you are in a position to adjust people’s schedules – ensure that everyone gets to take some time off

I realize that few of us can give strong assurances that everything will be ok, and few of us are in a position to actually hire anyone right now. But that’s not the point – be there for your fellow human being. Just being able to listen, and to say that you are there goes a long way. If you are in a role where you can adjust other’s schedules (on-call rotations, work shifts, …) be considerate and accommodate travel and personal schedules.

Why do you make it so hard to become a customer?

One of the hardest things for any business to do is gaining new customers. Anything that makes it hard for someone to become a customer is therefore a bad thing. So, I find it surprising how hard some companies make it to become a new customer.

Today’s example, my former employer, Verizon. I have been a Verizon customer for years. For (mostly silly) reasons, at 10pm last night, I wanted to add a new line of service to my account. I had a Google Pixel 6 in my hand, it had no SIM, and I just wanted to download an eSIM and get going.

Verizon’s website and mobile app said it could be done “immediately” and service active in 4 to 24 hours. So I entered my IMEI (for SIM2 as directed), I signed up, picked a number prefix, and was waiting for a QR code.

What I got was an email with a link to a website that said I needed to speak with a representative. And representatives aren’t available till 0800. So at 0805 I spoke with a representative who didn’t know what I wanted. With some gentle coaxing I got the representative to understand that I didn’t have an iPhone (never have even though she insisted that I have an iPhone) and that I needed a QR code to activate my phone. No such thing, she assured me. Just power cycle the phone she said. So I played along, no good. After 15m on the phone, Lisa figured out (maybe she did a google search) and said she could read out my QR code to me. Then she realized that she had to email it to me, which she did and in 30s I was all set.

Being curious about this kind of thing, I wondered why I needed a human involved in this process. I entered IMEI/2 (that’s eSIM on the Pixel 6) there’s no reason for a human at this point! I thought (maybe) Verizon encoded something fancy into the QR code, and it was somehow personalized.

As an example, here’s a QR code for an Airtel (Indian cellphone provider) on the left, and the Verizon QR code on the right.

The thing is this, the Airtel eSIM encodes a bunch of information, and if you were to decode this image, you’ll find

QR-Code:LPA:1$$97119........ many chars deleted .....E15A7

But, if you decode the Verizon QR code, what you get is literally this


Which makes perfect sense – the network knows IMEI/2, all you need to do is attempt to connect to the site (listed) and provide it IMEI/2 and you’ll be able to complete provisioning.

Verizon should (literally) be plastering this QR Code on every flat surface they can find and tell people that they just need to enter IMEI/2 on a website (which they do) and then service will be automatic. At the very minimum, they could just put it in the email I received – and I’d have had my phone up and running exactly as expected, in 15m or so.

But no, they have friction – a human in the process, and unfortunately one who doesn’t know how this is supposed to work. And one who is only available at 0800, and not 24×7.

Make it easy for people to onboard to your product, and your odds of success are increased. It doesn’t guarantee that you’ll succeed – if the product is shitty, people won’t come. Thankfully, Verizon’s product (their service and coverage) are great compared to the other providers. But even with that, if you make it hard for people to onboard to your service, they may just go somewhere else – like T-Mobile which has this on its webpage (

Guess what you get when you decode that?


Obvious? Clearly not (to Verizon)!

Getting started with open source

I was looking for a nice introduction to getting started in open source to share with a young person who is an undergraduate student in Computer Science in India. Unfortunately, while some people know about it, it appears that most universities don’t do a good job of preparing their students for the job interview, or the workplace.

One thing that I’ve always recommended is that people get a github account and showcase some of their work there. A second is to contribute to some open source project.

I found these two write-ups about the subject

If you are a student, or early in your career and looking to differentiate yourself from others, you should seriously look into this.

I posted this on linkedIn earlier today.

Not so remote after 2.5 years!

After over two years of working at #aws in the #dynamodb team, I finally got to meet many of my Chime-pals in #seattle last week. I joined this team as covid was just getting started, and had never met the #team that I worked with.

I’ve worked with remote co-workers before, but I’ve never been in a situation where I have had to work with team mates who I’ve not met in person for so long.

There is something primordial, and innately human in an in-person connection that just doesn’t come through in teleconferencing applications. But it is more than that – when you only meet people in meetings, your interactions are strictly focused on the job at hand and it is much harder to make a connection with the person. That connection is the important thing that makes a team tick, that transforms a group of individuals into a team.

I feel blessed to be able to work with this team, to work on fun and interesting problems at truly mind-blowing scale, and to do it from afar.

In writing about Prime Day 2022 [], Jeff Barr says “DynamoDB powers multiple high-traffic Amazon properties and systems including Alexa, the sites, and all Amazon fulfillment centers. Over the course of Prime Day, these sources made trillions of calls to the DynamoDB API. DynamoDB maintained high availability while delivering single-digit millisecond responses and peaking at 105.2 million requests per second.”

Think about that, and then think about the fact that this is just one of DynamoDB’s many customers, a list that includes names like ZoomThe Walt Disney CompanyDropboxNetflix and many more [].

If you are interested in joining us, we (quite literally) have openings all over the world. We are hiring in Dublin, Seattle, Bangalore, Vancouver, and plenty of other places. We are also hiring individuals who will work remotely. Whether your passion is software development, operations, product management, program management, or engineering management, we have roles that you may find interesting. We are hiring engineers to work on the data plane, and the control plane, and if you look at the list of open jobs [] you’ll get a sense for some of the cool features that we’re working on.

Come join us for an exciting ride!!

I posted the above on linkedIn.

Revisiting Prolog

Every decade (or so), I’ve found occasion to go and re-learn Prolog, and it just happened again. If you aren’t familiar with the Prolog programming language, the best description I can give you is this – Prolog is a declarative programming language where you focus on describing the what, and not the how of arriving at a particular result.

Each time I re-learn Prolog, I start with the usual family tree, and three towers problem (Tower of Hanoi|Benares|Bramha|Lucas, …), and get to whatever I’m trying to do. Most of us view the three towers problem as an example of recursion, and there the matter rests.

Simply put, to move N discs from the left tower to the right tower, you first move the top (N-1) discs from the left tower to the middle tower, and then move the Nth disc to the right tower, and then move the (N-1) discs from the middle tower to the right tower. You can prove (a simple proof by induction) that the N disc problem can be solved in (2^n – 1) moves.

In prolog this would look something like this (sample code):

move(1, A, B, _, Ct, Steps) :-
    Steps is Ct + 1,
    write(Steps), write(': '), write('Move one disc from '), 
    write(A), write(' to '), write(B), nl.
move(N, A, B, C, Ct, Steps) :-
    N > 1,
    M is N - 1,
    move(M, A, C, B, Ct, N1),
    move(1, A, B, C, N1, N2),
    move(M, C, B, A, N2, Steps).
towers(N) :-
    move(N, left, right, middle, 0, Steps),
    write('That took a total of '), write(Steps),
    write(' steps.'), nl.

When you run this program, the output looks like this:

$ swipl -g 'towers(3)' -g halt 
1: Move one disc from left to right
2: Move one disc from left to middle
3: Move one disc from right to middle
4: Move one disc from left to right
5: Move one disc from middle to left
6: Move one disc from middle to right
7: Move one disc from left to right
That took a total of 7 steps.

But this kind of solution doesn’t really show an important aspect of Prolog, the ability to explore the problem space, and discover the solution. The recursive solution shown above can be implemented just as well in C or python.

In Prolog, you can implement the solution a different way, and only provide the system a set of rules and have it discover a solution. Here’s one way to do that.

Here we define a high level goal (towers/1) just as above.

towers(N) :-
    findall(X, between(1, N, X), L),
    reverse(L, LeftList),
    towers(LeftList, [], [], [], State),
    showmoves(1, State).

But beyond that, the similarity ends. The implementation of towers/5 is totally different. It consists instead of seven rules.

At any time, there are one of 6 possible moves that one can make. Those 6 moves are to move the top disc from left, center or right, and move them to left, center, or right. That gives us left -> center, left -> right, center -> left, center -> right, right -> center, and right -> left. Each of those is a rule.

There’s the seventh rule to determine whether or not we’ve reached the desired end state.

Here’s an implementation of the check for the desired end state:

towers(Left, Center, Right, StateIn, StateOut) :-
    done(Left, Center, Right),
    StateOut = StateIn.
done([], [], _).

That’s it! The check to see whether, or not we are done is a single line of code. done/3 is called with Left, Center, and Right, and we’re done if Left and Center are empty and Right is something (what it is, we don’t care!).

The move from left to center (for example) looks like this:

towers(Left, Center, Right, StateIn, StateOut) :-
    move(Left, Center, LeftOut, CenterOut),
    State = [LeftOut, CenterOut, Right],
    append(StateIn, [State], X),
    towers(LeftOut, CenterOut, Right, X, StateOut).

That’s it! move/4 tries the move from Left to Center, and if it succeeds, we will record that state, and recurse. We record state so we can print the solution when we are done! You can see that towers/1 calls showmoves/2 which looks like this:

showmoves(_, []) :-
showmoves(N, [H|T]) :-
    format('State[~d]: ', N),
    format('Left ~w, Center ~w, Right ~w\n', H),
    N1 is N + 1,
    showmoves(N1, T).

showmoves/2 is recursive and prints out the moves in order. The lovely thing about this program is that all it has is the rules to adhere to, that’s it. Here’s a simple invocation.

$ swipl -g 'towers(2)' -g halt 
State[1]: Left [2], Center [1], Right []
State[2]: Left [], Center [1], Right [2]
State[3]: Left [], Center [], Right [2,1]

When all the program knows is to follow the rules, it needs some help to prevent ending up in a cycle. Doing that just requires one more line of code in the towers/5 rule.

towers(Left, Center, Right, StateIn, StateOut) :-
    move(Left, Center, LeftOut, CenterOut),
    State = [LeftOut, CenterOut, Right],
    \+ member(State, StateIn),
    append(StateIn, [State], X),
    towers(LeftOut, CenterOut, Right, X, StateOut).

It doesn’t produce the ‘best’ solution, but it does produce valid solutions! The step (highlighted) is an example of one that is legal, but clearly wasted.

$ swipl -g 'towers(4)' -g halt 
State[1]: Left [4,3,2], Center [1], Right []
State[2]: Left [4,3], Center [1], Right [2]
State[3]: Left [4,3], Center [], Right [2,1]
State[4]: Left [4], Center [3], Right [2,1]
State[5]: Left [4], Center [3,1], Right [2]
State[6]: Left [4,1], Center [3], Right [2]
State[7]: Left [4,1], Center [3,2], Right []
State[8]: Left [4], Center [3,2,1], Right []
State[9]: Left [], Center [3,2,1], Right [4]
State[10]: Left [], Center [3,2], Right [4,1]
State[11]: Left [2], Center [3], Right [4,1]
State[12]: Left [2], Center [3,1], Right [4]
State[13]: Left [], Center [3,1], Right [4,2]
State[14]: Left [], Center [3], Right [4,2,1]
State[15]: Left [3], Center [], Right [4,2,1]
State[16]: Left [3], Center [1], Right [4,2]
State[17]: Left [3,1], Center [], Right [4,2]
State[18]: Left [3,1], Center [2], Right [4]
State[19]: Left [3], Center [2,1], Right [4]
State[20]: Left [], Center [2,1], Right [4,3]
State[21]: Left [], Center [2], Right [4,3,1]
State[22]: Left [2], Center [], Right [4,3,1]
State[23]: Left [2], Center [1], Right [4,3]
State[24]: Left [], Center [1], Right [4,3,2]
State[25]: Left [], Center [], Right [4,3,2,1]

Recall that the towers(4) problem can be solved in 15 moves (here’s what the recursive solution looks like:

$ swipl -g 'towers(4)' -g halt 
1: Move one disc from left to middle
2: Move one disc from left to right
3: Move one disc from middle to right
4: Move one disc from left to middle
5: Move one disc from right to left
6: Move one disc from right to middle
7: Move one disc from left to middle
8: Move one disc from left to right
9: Move one disc from middle to right
10: Move one disc from middle to left
11: Move one disc from right to left
12: Move one disc from middle to right
13: Move one disc from left to middle
14: Move one disc from left to right
15: Move one disc from middle to right
That took a total of 15 steps.

While not the best solution, it is fascinating how prolog can find a valid solution with just the “what” and nothing about the “how”.

Complete code is here.

An issue described above is with the code producing sub-optimal results by moving the same disc in consecutive moves. A simple change to keep track of the last disc moved, and prevent it from being moved again. It didn’t have quite the expected results 😦

The first solution it finds is not great, but after a few tries it produces this (to explore alternater results, remove the cut on line 36).

State[1]: Left [4,3,2], Center [1], Right []
State[2]: Left [4,3], Center [1], Right [2]
State[3]: Left [4,3], Center [], Right [2,1]
State[4]: Left [4], Center [3], Right [2,1]
State[5]: Left [4], Center [3,1], Right [2]
State[6]: Left [4,2], Center [3,1], Right []
State[7]: Left [4,2], Center [3], Right [1]
State[8]: Left [4], Center [3,2], Right [1]
State[9]: Left [4], Center [3,2,1], Right []
State[10]: Left [], Center [3,2,1], Right [4]
State[11]: Left [], Center [3,2], Right [4,1]
State[12]: Left [2], Center [3], Right [4,1]
State[13]: Left [2,1], Center [3], Right [4]
State[14]: Left [2,1], Center [], Right [4,3]
State[15]: Left [2], Center [1], Right [4,3]
State[16]: Left [], Center [1], Right [4,3,2]
State[17]: Left [], Center [], Right [4,3,2,1]


Geiger counters and Radon detection

It is that time of the year, and I’ve been in touch with a number of people who I’ve not spoken with in months (in some cases since the same time last year). And a few asked me about Geiger counters, and radon detection that I’d written about in the last few posts.


Geiger counters aren’t a good way to measure household radon concentrations. But read on if you want to know more.

Geiger counters detect any form of radiation. The ones I was working with could detect α, β, or γ particles. That’s pretty much it. It tells you nothing about the source of the ionizing particle, its energy, or anything else. That’s it, it is a counter. It counts clicks per minute (CPM), and from that you can compute fancy things like sieverts an hour and such like.

Radon related problems are much more specific. There is exactly one decay of interest and that is the one from Radon to its decay to Polonium. The issue is that this Polonium is charged, and can adhere to dust and get inhaled. The health risk of radon is related to the concentration of Radon in the air, and there is no way to correlate CPM to Radon concentration.

So to get to Radon concentration, I’d have to go look at mechanisms based on alpha spectroscopy, or alpha track detectors, or traditional charcoal canisters.

That’s what I’m following up on now 🙂

DIY Geiger Counters

In the previous post, I described false starts with off the shelf radon detectors. Radon is radioactive, and anyone who has seen a movie or two knows that the good guys have Geiger counters that make noises when there is radioactivity. So of course, the solution is to get a geiger counter.

The first one I tried was one made by Mighty Ohm. Not knowing any better, I got one that had an SBM-20 tube. This tube detects beta, and gamma particles, but not alpha. Nice geiger counter, great kit, great to put it together and get it working, but let’s skip forward. I need one that’ll measure not just beta, and gamma, but also alpha particles.

Recall that when Radon decays, it releases an alpha particle. See the picture below.

Figure 1. Radioactive decay of Uranium to Lead, including half lives, and emission. Radon (Rn) is the only element that is a gas, the others are all solids.

I had two choices, get another kit, or get something that was pre-built. I chose the GMC-600+ from GQ Electronics. It comes pre-assembled, and pre-calibrated, it detects alpha, beta, and gamma particles, and reviews gave it a good battery life. Most importantly, it was available on Amazon with two day delivery. So I ordered one, and waited. After a false start with the first one (had a line of dead pixels), the second one has proved to be really good.

It also has a USB port on which it appears as a simple serial port device, and you can read, and write from it directly. They also give you some software (I didn’t try it, it was Windows only). I wrote some software based on their documented protocol, and it worked quite easily. GQ Electronics makes some interesting hardware, they clearly are not software people. But, I do like their Geiger counter, and I’ll open source the software I’ve written.

The GMC-600+ uses an LND-7317 tube. As shown on its specification page, it can detect alpha, beta and gamma particles. I found that to convert CPM to uSv/h for this tube, one must divide by 350. I’m not really sure why this is, but for now, I’m using this number and moving forward.

On two different days, I conducted the following experiment. I placed the geiger counter inside the air-conditioning duct, right next to the filter (inside a ziploc bag).

I then ran the circulating fans for two hours, and then shut them off.

On 9/26 the fans were run between 12:30 and 14:30 (local time). On 10/9 the fans were run between 13:45 and 15:45 (local time). Here are the results from the GMC-600+. Note, I converted CPM to mSv/year.

Figure 2. Test on 09/26, fans were run between 12:30 and 14:30 (local time).
Figure 3. Test on 10/09, fans were run between 13:45 and 15:45 (local time).

In the test on 10/09, the counter was placed in the a/c duct many hours before I started to run the fans. The background radiation is about 1 msV/year in both tests. On 9/26 the peak was just over 3 msV/year, on 10/09 it went just over 1.5 msV/year.

In both cases, the radiation level dropped by 50% (over background) in about 50 minutes.

If you look at the radioactive decay chart above, from Po218 to Pb210 takes ~50 minutes. It sure looks like the dust in the filter is radioactive, and has a decay characteristic that could be related to the decay from Po218 to Pb210! Lots of fun and interesting math to follow in the next blog post.

WARNING: You can’t just add half life (times) to get effective decay rates, and half life. A half life is an exponential decay curve, and mere addition is meaningless. It has been a while since I studied Bateman’s equations, but in the simplest form, Bateman’s assumes a chain of decay beginning with all particles of the first type in the chain. That’s not what I’m dealing with here – at the time when the fan goes off, there are a collection of particles on the filter, each with its own decay (half life), and fraction. The effective half life is more complex than Bateman.

Of residential radon tests, and sensors

In the last blog post, I started to describe how radon comes into houses, and the radioactive decay that causes it. During the home inspection process, a radon test would place some canisters in the basement for 12, or 24 hours. These canisters are then sent away to a lab for testing, and you get a result in a day or two. The Commonwealth of Massachusetts has some information about Radon as well as specific details about testing.

Figure 1. Side-by-side comparison of two radon detectors.

These tests give you a number representing the Radon level at the time when the test was done. But radon levels change throughout the day, even after the test is done. When I saw a passive radon system in the basement, I looked into getting a digital radon meter. I purchased a couple [you can find them at your local hardware store, I found mine online] and set them up in my basement.

After 24 hours they started showing numbers, and they consistently showed different numbers! I’ve blurred the manufacturer’s name on purpose.

Within reason, I can imagine differences, but when they were consistently different, and sometimes diverging, I wasn’t sure what to do. A few days later, one of them consistently showed a reading in excess of 6 pCi/L (pico-Curies/liter) and the second stayed stubbornly below 1.75 or so. What would you do?

I purchased a third one, and I put all three through a “reset” cycle, and then tossed them into a solid lead box. And I left them there, in that box for 36 hours. When I took them out, and reviewed the readings over the past 12 hours [there’s a 24 hour period when the meters report nothing], and all three of them were different. I wasn’t comfortable with that end result. I wanted higher confidence in the readings.

The next post will cover what came next, my own radon detector.

Of Radon, Radon tests, and home ownership

If you live in the New England area, and are about to purchase a house, you will likely come face to face with a Radon test. When you get a home inspection the inspector will likely do this for you. [Even if you waive your home inspection contingency, I strongly recommend that you get a home inspection – in the best case it is uneventful, in the worst case, you cap your loss at whatever you put down with your offer to purchase.]

Radon is the number one cause of lung cancer among non-smokers, according to EPA estimates. Overall, radon is the second leading cause of lung cancer. Radon is responsible for about 21,000 lung cancer deaths every year. About 2,900 of these deaths occur among people who have never smoked. On January 13, 2005, Dr. Richard H. Carmona, the U.S. Surgeon General, issued a national health advisory on radon.

Health Risk of Radon

I had a radon inspection, and the result was that the radon level was “acceptable” [1.7 pCi/L]. The EPA suggests the action level of 4 pCi/L so all’s well, right. Nothing to worry about.

When I moved in, I noticed that the basement had a “passive radon mitigation system”. So I started looking into this a bit further, and other than a bunch of companies who are trying to sell me a test. More credible documents from the EPA, and other places are hard to read, and understand. I tried to find something easier to understand. Hopefully this helps someone else looking for comprehensible radon information.

Here is some high school physics that you’ll need to understand what comes next. Radioactive elements decay over time. When a radioactive element decays, it emits some radiation, and transforms into another element which may, or may not itself be radioactive.

The rate of decay is measured by the element’s half-life. If you start with a gram of a radioactive substance, and this substance has a half-life of 1 day, then at the end of a day, you will have 1/2 a gram, after another day you will have 0.25 g, and so on. Hence the name, half-life.

Another thing you’ll need to understand is where Radon comes from. I’ve summarized that below. The radioactive decay begins with Uranium (U238) and progresses through various elements, till we end up with Lead (Pb206). Along the way, each decay has an associated half life, and a radioactive radiation that is either an alpha (α), or beta (β) particle.

Figure 1. Radioactive decay of Uranium to Lead, including half lives, and emission. Radon (Rn) is the only element that is a gas, the others are all solids.

The important thing to notice is that Radon (Rn) is the only element that is a gas, all others are solid. The gas leaks into the house through cracks in the basement floor, and within a few days decays into Polonium (Po218). The solid particles tend to stick to dust particles (due to electrical charge) and end up getting inhaled. If the contaminated dust sticks to the airways, further decay occurs within the body, and can cause the sensitive cells that they are close to. This is what leads to cancer.

The next post continues with a description of my adventures with household radon meters.

What the recent Facebook/WhatsApp announcements could mean

Ever since Facebook acquired WhatsApp (in 2014) I have wondered how long it would take before we found that our supposedly “end to end encrypted” messages were being mined by Facebook for its own purposes.

It has been a while coming, but I think it is now clear that end to end encryption in WhatsApp isn’t really the case, and will definitely be less secure in the future.

Over a year ago, Gregorio Zanon described in detail why it was that end-to-end encryption didn’t really mean that Facebook couldn’t snoop on all of the messages you exchanged with others. There’s always been this difference between one-to-one messages and group messages in WhatsApp, and how the encryption is handled on each. For details of how it is done in WhatsApp, see the detailed write-up from April 2016.

Now we learn that Facebook is going to be relaxing “end to end encrypted”. As reported in Schneier, who quotes Kalev Leetaru,

Facebook’s model entirely bypasses the encryption debate by globalizing the current practice of compromising devices by building those encryption bypasses directly into the communications clients themselves and deploying what amounts to machine-based wiretaps to billions of users at once.



Some years ago, I happened to be in India, and at a loose end, and accompanied someone who went to a Government office to get some work done. The work was something to do with a real-estate transaction. The Government office was the usual bustle of people, hangers-on, sweat, and the sounds of people talking on telephones, and the clacking of typewriters. All of that I was used to, but there was something new that I’d not seen before.

At one point documents were handed to one of the ‘brokers’ who was facilitating the transaction. He set them out on a table, and proceeded to take pictures. Aadhar Card (an identity card), PAN Card (tax identification), Drivers License, … all quickly photographed – and this made my skin crawl (a bit). Then these were quickly sent off to the document writer, sitting three floors down, just outside the building under a tree at his typewriter, generating the documents that would then be certified.

And how was this done: WhatsApp! Not email, not on some secure server with 256 bit encryption and security, just WhatsApp! India in general has a rather poor security practice, and this kind of thing is commonplace, people are used to it.

So now that Facebook says they are going to be intercepting and decrypting all messages and potentially sending them off to their own servers, guess what information they could get their hands on!

It seems pointless to expect that US regulators will do anything to protect consumers ‘privacy’ given that they’re pushing for weakening communication security themselves, and it seems like a foregone conclusion that Facebook will misuse this data, given that they have no moral compass (at least not one that is functioning).

This change has far-reaching implications and only time will tell how badly it will turn out but given Facebook’s track record, this isn’t going to end well.

The importance of longevity testing

airbus_a350_1000I worked for many years with, and for Stratus Technologies, a company that made fault tolerant computers – computers that just didn’t go down. One of the important things that we did at Stratus was longevity testing.

All software errors are not detectable quickly – some take time. Sometimes, just leaving a system to idle for a long time can cause problems. And we used to test for all of those things.

Which is why, when I see stuff like this, it makes me wonder what knowledge we are losing in this mad race towards ‘agile’ and ‘CI/CD’.

Airbus A350 software bug forces airlines to turn planes off and on every 149 hours

The AWD reads, in part

Prompted by in-service events where a loss of communication occurred between some avionics systems and avionics network, analysis has shown that this may occur after 149 hours of continuous aeroplane power-up. Depending on the affected aeroplane systems or equipment, different consequences have been observed and reported by operators, from redundancy loss to complete loss on a specific function hosted on common remote data concentrator and core processing input/output modules.

and this:

Required Action(s) and Compliance Time(s):

Repetitive Power Cycle (Reset):

(1) Within 30 days after 01 August 2017 [the effective date of the original issue of this AD], and, thereafter, at intervals not to exceed 149 hours of continuous power-up (as defined in the AOT), accomplish an on ground power cycle in accordance with the instructions of the AOT .

What is ridiculous about this particular issue is that it comes on the heals of Boeing 787 software bug can shut down planes’ generators IN FLIGHT, a bug where the generators would shutdown after 250 days of continuous operation, a problem that prompted this AWD!

Come on Airbus, my Windows PC has been up longer than your dreamliner!

The GCE outage on June 2 2019

I happened to notice the GCE outage on June 2 for an odd reason. I have a number of motion activated cameras that continually stream to a small Raspberry Pi cluster (where tensor flow does some nifty stuff). This cluster pushes some more serious processing onto GCE. Just as a fail-safe, I have the system also generate an email when they notice an anomaly, some unexplained movement, and so on.

And on June 2nd, this all went dark for a while, and I wasn’t quite sure why. Digging around later, I realize that the issue was that I relied on GCE for the cloud infrastructure, and gmail for the email. So when GCE had an outage, the whole thing came apart – there’s no resiliency if you have a single-point-of-failure (SPOF) and GCE was my SPOF.

WhiScreen Shot 2019-06-05 at 7.17.17 AMle I was receiving mobile alerts that there was motion, I got no notification(s) on what the cause was. The expected behavior was that I would receive alerts on my mobile device, and explanations as email. For example, the alert would read “Motion detected, camera-5 <time>”. The explanation would be something like “NORMAL: camera-5 motion detected at <time> – timer activated light change”,  “NORMAL: camera-3 motion detected at <time> – garage door closed”, or “WARNING: camera-4 motion detected at <time> – unknown pattern”.

I now realize that the reason was that the email notification, and the pattern detection relied on GCE and that SPOF caused delays in processing, and email notification. OK, so I fixed my error and now use Office365 for email generation so at least I’ll get a warning email.

But, I’m puzzled by Google’s blog post about this outage. The summary of that post is that a configuration change that was intended for a small number of servers ended up going to other servers, shit happened, shit cleanup took longer because troubleshooting network was the same as the affected network.

So, just as I had a SPOF, Google appears to have had an SPOF. But, why is it that we still have these issues where a configuration change intended for a small number of servers ends up going to a large number of servers?

Wasn’t this the same kind of thing that caused the 2017 Amazon S3 outage?

At 9:37AM PST, an authorized S3 team member using an established playbook executed a command which was intended to remove a small number of servers for one of the S3 subsystems that is used by the S3 billing process. Unfortunately, one of the inputs to the command was entered incorrectly and a larger set of servers was removed than intended.

Shouldn’t there be a better way to detect the intended scope of a change, and a verification that this is intended? Seems like an opportunity for a different kind of check-and-balance?

Building completely redundant systems sounds like a simple solution but at some point the cost of this becomes exorbitant. So building completely independent control and user networks may seem like the obvious solution but is it cost effective to do that?

Try this DIY Neutral Density Filter for Long Exposure Photos

I have heard of this trick of using welders glass as a cheap ND filter. But from my childhood experience of arc welding, I was not sure how one would deal with the reality that welders glasses are not really precision optics.

This article addresses at least the issue of coloration and offers some nice tips for adjusting color balance in general.

Automate everything

I like things to be automated, everything. Coffee in the morning, bill paautomatoryment, cycling the cable modem when it goes wonky, everything. The adage used to be, if you do something twice, automate it. I think it should be, “if you do anything, automate it, you will likely have to do it one more time”.

So I used to automate stuff like converting DOCX to PDF and PPTX to PDF on Windows all the time. But for the past two years, after moving to a Mac this is one thing that I’ve not been able to automate, and it bugged me, a lot.

No longer.

I had to make a presentation which went with a descriptive document, and I wanted to submit the whole thing as a PDF. Try as I might, Powerpoint and Word on the Mac would not make this easy.

It is disgusting that I had to resort to Applescript + Automator to do this.

I found this, and this.

It is a horrible way to do it, but yes, it works.

Now, before the Mac purists flame me for using Microsoft Word, and Microsoft Powerpoint, let me point out that the Mac default tools don’t make it any easier. Apple Keynote does not appear to offer a solution to this either, you have to resort to automator for this too.

So, eventually, I had to resort to automation based on those two links to make two PDFs and then this to combine them into a single PDF.

This is shitty, horrible, and I am using it now. But, do you know of some other solution, using simple python, and not having to install LibreOffice or a handful of other tools? Isn’t this a solved problem? If not, I wonder why?

Monitoring your ISP – Fun things to do with a Raspberry Pi (Part 1)

I have Comcast Internet service at home. I’ve used it for many years now, and one of the constant things over this period of time has been that the service is quite often very unreliable. I’ve gone for months with no problems, and then for some weeks or months the service gets to be terribly unreliable.

What do I mean by unreliable? That is best described in terms of what the service is like when it is reliable.

  • I can leave an ssh session to a remote machine up and running for days (say, an EC2 instance) – if I have keep-alive and things like that setup
  • VPN sessions stay up for days without a problem
  • The network is responsive, DNS lookups are quick, ICMP response is good, surfing the web is effortless, things like Netflix and Amazon movies work well
  • Both IPv4 and IPv6 are working well

You get the idea. With that in mind, here’s what I see from time to time:

  • Keeping an ssh session up for more than an hour is virtually impossible
  • VPN sessions terminate frequently, sometimes it is so bad that I can’t VPN successfully
  • DNS lookups fail (using the Comcast default DNS servers,,,  2001:558:feed::1, and 2001:558:feed::2). It isn’t any better with Google’s DNS because the issue is basic network connectivity
  • There is very high packet loss even pinging my default gateway!
  • Surfing the web is a pain, click a link and it hangs … Forget about streaming content

During these incidents, I’ve found that the cable modem itself remains fine, I can ping the internal interface, signal strengths look good, and there’s nothing obviously wrong with the hardware.

What I’ve found is that rebooting my cable modem generally fixes the problem immediately. Now, this isn’t always the case – Comcast does have outages from time to time where you just have to wait a few hours. But for the most part, resetting the cable modem just fixes things.

So I was wondering how I could make this all a bit better for myself.

An option is something like this. An “Internet Enabled IP Remote Power Switch with Reboot“. Or this, this, or this. The last one of those, Web Power Switch Pro Model, even sports a little web server, can be configured, and supports SNMP, and a REST API! Some of these gadgets are even Alexa compatible!

But, no – I had to solve this with a Raspberry Pi! Continued in Part 2.


Monitoring your ISP – Fun things to do with a Raspberry Pi (Part 2)

In Part 1 of this blog post, I described a problem I’ve been facing with my internet service, and the desired solution – a gizmo that would reboot my cable modem when the internet connection was down.

The first thing I got was a PiRelay from SB Components. This nifty HAT has four relays that will happily turn on and off a 110v or 250v load. The site claims 7A @ 240V, more than enough for all of my network gear. See image below, left.

Next I needed some way to put this in a power source. Initially I thought I’d get a simple power strip with individual switches on the outlets. I thought I could just connect the relays up in place of the switches and I’d be all set! So I bought one of these (above right).

Finally I just made a little junction box with four power outlets, and wired them up to the relays.

The software to control this is very straightforward.

  1. It turns out that the way Microsoft checks for internet connectivity is to do a get on “;, and that returns the text “Microsoft NCSI”. OK, so I do that.
  2. I also made a list of a dozen or so web sites that I visit often, and I make a conn.request() to them to fetch the HEAD.

If internet connectivity appear to be not working, power cycle “relay 0”, which is where my cable modem is running. And this is a simple cron job, runs every 10 minutes.

Works like a champ. Another simple Raspberry Pi project!

If you are interested, ping me and I’ll post more details. I intend to share the code for the project soon – once I shake out any remaining little gremlins!

Blinking the lights on your Raspberry Pi – as debugging aid

Debugging things on the Raspberry Pi by flashing the power LED.

I’ve often found that the most useful debugging technique is to be able to provide a visual cue that something is going on. And for that, blinking the power light on the Raspberry Pi is the easiest thing to do.

The power light (often called LED1) is always on, and bright red. So turning it off, and back on is a great little debugging technique.

A short note about the LEDs on Raspberry Pi. There are two, one is the green one [led0] for network activity, and the other is the red one [led1] for power.

They are exposed through


To turn off the red LED

echo 0 > /sys/class/leds/led1/brightness

To turn on the red LED

echo 0 > /sys/class/leds/led1/brightness

Doing this requires that you are privileged. So to make things easy I wrote it in C, put the binary in /bin, and turned on the setuid bit on it. I’ve also used a library that blinks the power LED in simple morse code to get a short message across. I can’t do more than about 10 wpm in my head now so while it is slow, it is very very useful.

The relationship between accuracy, precision, and relevance

OK, this is a rant.

graph1It annoys me to no end when people present graphs like this one. Yes, the numbers do in fact add up to 100% but does it make any sense to have so many digits after the decimal when in reality this is based on a sample size of 6? Wouldn’t 1/2, 1/3, 1/6 have sufficed? What about 0.5, 0.33 and 0.67. Do you really really have to go to all those decimal places?

Excel has made it easy for people to make meaningless graphs like this, where merely clicking a little button gives you more decimal places. I’m firmly convinced that just having more digits after the decimal point doesn’t really make a difference in a lot of situations.

Let’s start first with some definitions

accuracy is a “degree of conformity of a measure to a standard or a true value“.

precision is the “the degree of refinement with which an operation is performed or a measurement stated“.

One can be precise, and accurate. For example, when I say that the sun rises in the east 100% every single day, I am both precise, and accurate. (I am just as precise and accurate if I said that the sun rises in the east 100.000% of the time).

One can be precise, and inaccurate. For example, when I say that the sun rises in the east 90.00% of the time, I am being precise but inaccurate.

So, as you can see, it is important to be accurate; the question now is how precise does one have to be. Assume that I conduct an experiment and tabulate the results, I find that 1/2 the time I have outcome A, 1/3 of the time I have outcome B, and 1/3 of the time I have outcome C. It would be both precise, and accurate to state the results are (as shown in the pie chart above) 50.0000%, 16.66667%, and 33.33333% for the various possible outcomes.

But does that really matter? I believe that it does. Consider the following two pictures, these are real pictures, of real street signs.

2018-08-16 18.18.09

This sign is on the outskirts of Mysore, in India.

2018-09-08 10.37.06

This sign is in Lancaster, MA.

In the first picture (the one from Mysore, India), we have distances to various places, accurate to 0.01km (apparently). Mysore Palace is 4.00 km away, the zoo is 4.00 km away, Mangalore is 270.00 km away. What’s 0.01km? That’s about 10m (about 33 feet). It is conceivable that this is accurate (possible, not probably). So I’d say this is precise and may be accurate.

The second picture (the one from Lancaster, MA) is most definitely precise, to 4 places of the decimal point no less. The bridge is 3.3528 meters (the sign claims). It also indicates that it is 11 feet. A foot is 12 inches, an inch is 2.54 centimeters, and therefore a meter (100cm is 39.3701″) is exactly 3.2808 feet. Therefore 11 feet is 3.3528 meters exactly. So this is both precise, and accurate (assuming that the bridge does in fact have a 11′ clearance).

The question is this, is the precision (4.00km, or 3.3528m) really relevant? We’re talking about street signs, measuring things with a fair amount of error. In the case of the bridge, the clearance could change by as much as 2″ between summer and winter because of expansion and contraction of the road surface (frost heaves). So wouldn’t it make more sense to just stick with 11′, or 3.5 meters?

So back to our graph with the 50.0000%, 16.66667% and 33.33333%. Does it really matter to the person looking at the graph that these numbers are presented to a precision of 0.000001%? For the most part, given the fact that the experiment had a sample size of 6, absolutely not.

So please, when presenting facts (and numbers) please do think about accuracy; that’s important. But please make the precision consistent with the relevance. When driving a car to the zoo, is the last 33′ going to really kill me? or am I really interested in the clearance of the bridge accurate to the thickness of a human hair, or a sheet of paper?


Android in a virtual machine

Very often, I’ve found that it is an advantage to have Android running in a virtual machine (say on a desktop or a laptop) and use the Android applications.

Welcome to android-x86

Android runs on x86 based machines thanks to the android-x86 project. I download images from here. What follows is a simple, step-by-step How-To to get Android running on your x86 based computer with VMWare. I assume that much the same thing can be done with some other emulator (like VirtualBox).

Install android-x86

The screenshots below show the steps involved in installing android-x86 on a Mac.


Choose “Create a custom virtual machine”


Choose “FreeBSD 64-bit”


I used the default size (for now; I’ll increase the size later).


For starters, we can get going with this.


I was installing android v8.1 RC1


Increased #cores and memory as above.


Resized the drive to 40GB.


And with that, began the installation.

Options during installation.

The options and choices are self explanatory. Screenshots show the options that are chosen (selected).


One final thing – enable 3D graphics acceleration

Before booting, you should enable 3D graphics acceleration. I’ve found that without this, you end up in a text screen at a shell prompt.


And finally, a reboot!


That’s all there is to it, you end up in the standard android “first boot” process.

What I learned photographing a wedding

Recently, I had the opportunity to photograph a wedding. Some background for people reading this (who don’t have the context). The wedding was an “Indian wedding” in San Francisco, CA. It was a hybrid of a traditional North Indian and a South Indian wedding, and was compressed to two hours. There were several events that occurred before the wedding itself, and there were a few after the wedding.

San Francisco skyline

For example, there was a cruise around the San Francisco Bay (thankfully, good weather).

San Francisco Bay Bridge

There were also several indoor events (which were conducted in a basement, and Lord Ganeshaso little natural light). There were several religious ceremonies, a civil ceremony, and lots of food, and drink, and partying as well.

Before heading out to SFO, I read a bunch of stuff about photographing weddings, and I spoke with one person (Thanks Allison) very familiar with this. I took a bunch of gear with me, and I thought long and hard about how to deal with the professional photographer(s) who would also be covering the event.

I was hoping that I’d be able to work alongside them, and watch and learn (and not get in the way). I hoped that they’d not be too annoyed with a busybody with a bunch of gear, and I hoped that I could stay out of their way.

Thinking back, and looking at the pictures I took, I’ve learned a lot; a lot about taking photographs, a lot about myself, and a lot about the equipment that I have.

Shoot fully manual mode – most of the time

Outdoors, it may be possible to get away with auto ISO, but even there shooting anything other than manual focus, manual exposure and aperture is a bad idea. I’ve tried a number of different options for metering, and focus preference, but did not find them to be particularly fun. But, that did mean that I was shooting stopped down (f5.6 or smaller).

Bounce the flash off the roof

I used a Nikon Speedlight SB-700 and if I didn’t bounce it off the roof, the foreground subject got over exposed. Using the diffuser, and bouncing the flash off the roof produced much better results.

You do really want f2.8 a lot of the time!

While I often shot f5.6 or smaller, I did find myself shooting f2.8 quite a lot. Not as much as I thought I would, but certainly quite a lot. And it was good that I had lenses that could go to f2.8. Most of the time I found that I was shooting between 50mm and 90mm so it was quite annoying that I needed two lenses to cover this range. But I managed …

Shoot RAW (+JPEG, but definitely RAW)

I’ve found that many of the pictures I took needed Benefits of shooting RAWpost-processing that would was much easier with RAW. For example, some of them required significant color (temperature) and exposure adjustment.

One example is at left, I think the color temperature of the picture above is better than the one below. The significant amount of purple in the decorations caused the image to look a little bit too purple for my liking. Luckily the little white altar in the foreground gave me a good color reference.

I don’t want to get into the “can this be done with JPEG” debate; I’m sure that it can, and there are many who prefer JPEG. I just feel lucky that I shot everything RAW+JPEG.

LED Light Panels are a must

I have a great flash, but it is no match for a good LED light panel. I really need to get one of those things if I’m ever going to shoot a wedding, or any other event with a lot of people.

Take more pictures; way more pictures

I’m not a “spray and pray” kind-of person. I tend to look through the view finder a while before clicking. I try to frame a shot well, get everything to look just right, and then the subject has moved, or the ‘moment’ has passed. This happened a lot.

I really have to learn to accept a lower ‘good picture’ ratio, and capture the moment as best as I can, and crop, and post-process later.

Lose a lot of weight

The professionals were at least a hundred pounds lighter than I was. The way they moved clearly reflected a certain difference in our respective ‘momentum’s!

I definitely need more experience with photographing people, something that I’ve known for a while. The wedding was a great excuse for me to happily point a camera at people who were having animated conversations, and click. Now I have to find other venues where I can do the same thing, and learn more about this aspect of photography that I’ve really neglected for too long.

P.S. My thanks to Allison Perkel for all the pointers she gave me before I went on this trip.

How to Use ND Filters Creatively to Make the Most of a Scene

I’ve long known and used ND filters and graduated ND filters in bright light; didn’t realize that you could get some wonderful effects with them in darkness.

The examples in this article (via )  are just outstanding

Like this one of the moon (below).








Here, the “double stacked graduated ND filters” helped bring the brightness of the moon to a level comparable with the foreground.

The takeaway is that ND filters and graduated ND filters can be used in places where there is a huge difference in brightness of the various elements in the photograph.

Daylight Saving Time isn’t worth it, European Parliament members say

I have long suspected that this was the case, especially when no one could give a simple explanation why we did this.

  • For the farmers
  • For the farm animals
  • To save energy

Those are just some of the explanations I have heard.

But I get it, in countries that are wide (west to east) you need more time zones. That is the real solution.

This is Why ‘Zooming with Your Feet’ Isn’t the Same Thing

“Zooming with your feet” means getting closer to your subject physically instead of relying on a longer lens, but you should be aware that the results you won’t be the same. Here’s a 9-minute video from This Place that looks at how different focal lengths affect perspective when compared to “zooming with your feet.” Perspective distortion is often misunderstood — it’s an area of photography that many photographers may not need to explore or understand properly.

Source: This is Why ‘Zooming with Your Feet’ Isn’t the Same Thing

This article appeared almost a month ago

This article appeared almost a month ago, and I just got to reading it. It almost looks like it was written looking in the rear view mirror! Are there any of these seven trends that isn’t already a hot topic?

7 #Cloud Computing Trends to Watch in 2018

Old write-up about CAP Theorem

In 2011, Ken Rugg and I were having a number of conversations around CAP Theorem and after much discussion, we came up with the following succinct inequality. It was able to help us much better speak to the issue of what constituted “availability”, “partition tolerance”, and “consistency”. It also confirmed our suspicions that availability and partition tolerance were not simple binary attributes; Yes or No, but rather that they had shades of gray.

So here’s the write-up we prepared at that time.


Unfortunately, the six part blog post that we wrote (on never made it in the transition to the new owners.

Don’t use a blockchain unless you really need one

Now, that’s easy to say.

But, blockchain today is what docker was 18 months ago. 

Startups got funded for doing “Docker for sandwiches”, or “Docker for underpants” (not really, but bloody close).

Talks were accepted at conferences because of the title “Docker, Docker, Docker, Docker, Docker” (really).

And today it is blockchain.

I hope you weren’t counting on Project Fi …

Google’s Project Fi international data service goes down.

One of the things I have come to realize is that it is not a good plan to depend on services provided by the likes of Google.

They work 99.9 or 99.95% of the time. And they work well. But depend on them for five 9’s, that’s dumb.

Mysore School of Architecture

I happened to be at the Mysore School of Architecture, in Mysore, India and had a chance to walk around. And click some photographs 🙂 The building is bright and airy, and the pictures below show some views of this. There was a lot of art all around the campus, all of it created by the students. Some modern art under the stairs. Modern art in the courtyard. A nice painting which shows two views, depending on where you stand. A nice mural on the wall. Some lovely photos; the sun and the wind weathered the display, but the display and the captions are lovely. Many of the class rooms have nice caricatures of famous architects. And finally, some lovely origami under the stairs! I loved my short trip to the college and will surely be back when there are some students there.

%d bloggers like this: