Getting started with dynamodb-shell

Yesterday I posted a quick introduction to dynamodb-shell. Let’s go a little bit further today. ddbsh has quit a lot of bells and whistles for creating tables.

ap-south-1> help create table;
CREATE TABLE - Creates a DynamoDB table.

   CREATE TABLE [IF NOT EXISTS][NOWAIT] <name>
         ( attribute_name, attribute_type [,...] )
   primary_key billing_mode_and_throughput
   [gsi_list] [lsi_list] [streams] [table_class] [tags] ;

   attribute_type := NUMBER|STRING|BINARY
   primary_key := PRIMARY KEY key_schema
   key_schema := ( attribute_name HASH [, attribute_name RANGE])

   billing_mode_and_throughput := (BILLING MODE ON DEMAND)|BILLING MODE provisioned)
   provisioned := ( RR RCU, WW WCU )

   gsi_list := GSI ( gsi_spec )
   gsi_spec := gsi [, gsi ...]
   gsi := gsi_name ON key_schema index_projection [billing_mode_and_throughput]
   index_projection := (PROJECTING ALL) | (PROJECTING KEYS ONLY) | (PROJECTING INCLUDE projection_list)
   projection_list := ( attribute [, attribute ...] )

   lsi_list := LSI ( lsi_spec )
   lsi_spec := lsi [, lsi ...]
   lsi := lsi_name ON key_schema index_projection

   streams := STREAM ( stream_type ) | STREAM DISABLED
   stream_type := KEYS ONLY | NEW IMAGE | OLD IMAGE | BOTH IMAGES

   table_class := TABLE CLASS STANDARD | TABLE CLASS STANDARD INFREQUENT ACCESS

   tags := TAGS ( tag [, tag ...] )
   tag := name : value

Let’s make a table with a GSI and with DynamoDB Streams enabled. Since the CREATE TABLE command included “nowait”, the command completed immediately, and a subsequent describe shows that the table is being created.

ap-south-1> create table if not exists nowait balances ( id string, accttype string, balance number )
ap-south-1> primary key ( id hash, accttype range )
ap-south-1> billing mode provisioned ( 20 rcu, 20 wcu )
ap-south-1> gsi (balancegsi on (accttype hash, balance range) projecting all
ap-south-1>      billing mode provisioned ( 20 rcu, 20 wcu ))
ap-south-1> stream (both images);
CREATE
ap-south-1> describe balances;
Name: balances (CREATING)
Key: HASH id, RANGE accttype
Attributes:  accttype, S,  balance, N,  id, S
Created at: 2023-01-26T05:01:41Z
Table ARN: arn:aws:dynamodb:ap-south-1:632195519165:table/balances
Table ID: d84f734e-10e9-4c2d-a946-ed8820b82430
Table size (bytes): 0
Item Count: 0
Billing Mode: Provisioned (20 RCU, 20 WCU)
PITR is Disabled.
GSI balancegsi: ( HASH accttype, RANGE balance ),  Provisioned (20 RCU, 20 WCU), Projecting (ALL), Status: CREATING, Backfilling: NO
LSI: None
Stream: NEW_AND_OLD_IMAGES
Table Class: STANDARD
SSE: Not set
ap-south-1> 

After a few minutes, the table is created and we can start loading up some data.

ap-south-1> insert into balances ( id, accttype, balance) 
ap-south-1> values ( "Alice", "Savings", 200 ),
ap-south-1> ("Alice", "Checking", 500 ),
ap-south-1> ("Bob", "Checking", 250 ),
ap-south-1> ("Charlie", "Savings", 500),
ap-south-1> ("Charlie", "Checking", 200),
ap-south-1> ("David", "Savings", 1000);
INSERT
INSERT
INSERT
INSERT
INSERT
INSERT

Now, let’s assume that I want to see all the “Savings” account balances, I could do this:

ap-south-1> select * from balances where accttype = "Savings";
{accttype: Savings, balance: 200, id: Alice}
{accttype: Savings, balance: 500, id: Charlie}
{accttype: Savings, balance: 1000, id: David}
ap-south-1> 

Since the table has a primary key of (id, accttype), this becomes a scan. How do I know that, because I can explain the query.

ap-south-1> explain select * from balances where accttype = "Savings";
Scan({
   "TableName":   "balances",
   "ReturnConsumedCapacity":   "NONE",
   "FilterExpression":   "#ahaa1 = :vhaa1",
   "ExpressionAttributeNames":   {
      "#ahaa1":   "accttype"
   },
   "ExpressionAttributeValues":   {
      ":vhaa1":   {
         "S":   "Savings"
      }
   },
   "ConsistentRead":   false
})
ap-south-1> 

But, I do have a GSI on the account type (and balance), so I can do one better. I can attempt the query against the GSI (observe that where I had a table, I now have balances.balancegsi which is how I reference the GSI). This turns into a Query on the index instead of a table scan.

ap-south-1> select * from balances.balancegsi where accttype = "Savings";
{accttype: Savings, balance: 200, id: Alice}
{accttype: Savings, balance: 500, id: Charlie}
{accttype: Savings, balance: 1000, id: David}
ap-south-1> explain select * from balances.balancegsi where accttype = "Savings";
Query({
   "TableName":   "balances",
   "IndexName":   "balancegsi",
   "ConsistentRead":   false,
   "ReturnConsumedCapacity":   "NONE",
   "KeyConditionExpression":   "#akaa1 = :vkaa1",
   "ExpressionAttributeNames":   {
      "#akaa1":   "accttype"
   },
   "ExpressionAttributeValues":   {
      ":vkaa1":   {
         "S":   "Savings"
      }
   }
})

One other thing, let’s assume that I want to transfer some money from one account to another without the source account going negative – how would I do that?

Let’s say that Alice wants to transfer $100 from Checking to Savings. We could do this.

ap-south-1> select * from balances where id = "Alice";
{accttype: Checking, balance: 500, id: Alice}
{accttype: Savings, balance: 200, id: Alice}
ap-south-1> begin;
ap-south-1> update balances set balance = balance - 100 where id = "Alice" and accttype = "Checking" and balance >= 100;
ap-south-1> update balances set balance = balance + 100 where id = "Alice" and accttype = "Savings";
ap-south-1> commit;
COMMIT
ap-south-1> select * from balances where id = "Alice";
{accttype: Checking, balance: 400, id: Alice}
{accttype: Savings, balance: 300, id: Alice}
ap-south-1> 

For example, let’s say that Bob wants to give $300 to David. Should that be allowed?

ap-south-1> select * from balances where id = "Bob";
{accttype: Checking, balance: 250, id: Bob}
ap-south-1> 

Let’s see what happens if we try …

ap-south-1> begin;
ap-south-1> update balances set balance = balance - 300 where id = "Bob" and balance >= 300 and accttype = "Checking";
ap-south-1> update balances set balance = balance + 300 where id = "David" and accttype = "Savings";
ap-south-1> commit;
Transaction failed. TransactionCanceledException. 3D0KI0CHVS7UDTFDMAQ0E43EINVV4KQNSO5AEMVJF66Q9ASUAAJG. Transaction cancelled, please refer cancellation reasons for specific reasons [ConditionalCheckFailed, None]
ABORT
ap-south-1> select * from balances where id in ("Bob", "David");
{accttype: Checking, balance: 250, id: Bob}
{accttype: Savings, balance: 1000, id: David}
ap-south-1> 

The Condition Check Failure is a list of two statuses – the first one failed (ConditionCheckFailed) and the other produced no error. So there you have it,

  • a table with a GSI and you can query either the table or the GSI (specify table.gsi)
  • you can do multi-line transactions
  • you can use functions like IN in the where clause.

In the next blog post I’ll dig further into all that you can do in a WHERE clause.

I’ll leave it up to you to run explain on the transaction and see what it does. Hint, just change “begin” to “explain begin”.

Hello dynamodb-shell

ddbsh is an interactive shell for AWS DynamoDB.

DynamoDB Shell (ddbsh) is an interactive command line interface for Amazon DynamoDB. ddbsh is available for download at https://github.com/awslabs/dynamodb-shell.

ddbsh is provided for your use on an AS-IS basis. It can delete, and update table data, as well as drop tables. These operations are irreversible. It can perform scans and queries against your data and these can cost you significant money.

The quickest way to understand ddbsh is through a simple interactive session. First download the software and build the binary.

% ddbsh
ddbsh - version 0.1
us-east-1>

You are now at an interactive prompt where you can execute commands. The prompt shows that you are connected to us-east-1 (this is the default). You can override that if you so desire (commands in ~/.ddbsh_config will be automatically executed when you launch ddbsh). You can also dynamically reconnect to another region.

us-east-1> connect ap-south-1;
CONNECT
ap-south-1> 

That’s all there is to it. Now let’s get back to us-east-1 and take ddbsh for a spin. Let’s make a table. Commands are terminated with the ‘;’ character.

ap-south-1> connect us-east-1;
CONNECT
us-east-1> 
us-east-1> create table ddbsh_demo ( id number ) 
us-east-1> primary key ( id hash );
CREATE
us-east-1>

The CREATE TABLE command (by default) will wait till the table is created. You can have it submit the request and return with the NOWAIT option (see HELP CREATE TABLE for complete options).

By default it creates a table that is On-Demand (you can also create a table with provisioned billing mode, more about that later).

Now let’s insert some data and query it.

us-east-1> insert into ddbsh_demo (id, v) values ( 3, 4 ), (4, "a string value"), (5, {a: 4, b: [10, 11, 12], c: true, d: {x: 10, y: 10}});
INSERT
INSERT
INSERT
us-east-1> select * from ddbsh_demo;
{id: 3, v: 4}
{id: 4, v: "a string value"}
{id: 5, v: {a:4, b:[10, 11, 12], c:TRUE, d:{x:10, y:10}}}
us-east-1>  

You can do more fancy things with your query, like this.

us-east-1> select id from ddbsh_demo where v = 4;
{id: 3}
us-east-1> select * from ddbsh_demo where v.c = true;
{id: 5, v: {a:4, b:[10, 11, 12], c:TRUE, d:{x:10, y:10}}}
us-east-1> select * from ddbsh_demo where v.b[1] = 11;
{id: 5, v: {a:4, b:[10, 11, 12], c:TRUE, d:{x:10, y:10}}}
us-east-1> 

How about making some changes to the data? That’s easy enough.

us-east-1> update ddbsh_demo set z = 14, v.b[1] = 13 where id = 5;
UPDATE (0 read, 1 modified, 0 ccf)
us-east-1> select * from ddbsh_demo where id = 5;
{id: 5, v: {a:4, b:[10, 13, 12], c:TRUE, d:{x:10, y:10}}, z: 14}
us-east-1> 

Careful what you do with ddbsh … if you execute a command without a where clause, it can update more items than you expected. For example, consider this.

us-east-1> select * from ddbsh_demo;
{id: 3, v: 4}
{id: 4, v: "a string value"}
{id: 5, v: {a:4, b:[10, 13, 12, 13, 13], c:TRUE, d:{x:10, y:10}}, z: 14}
us-east-1> update ddbsh_demo set newval = "a new value";
UPDATE (3 read, 3 modified, 0 ccf)
us-east-1> select * from ddbsh_demo;
{id: 3, newval: "a new value", v: 4}
{id: 4, newval: "a new value", v: "a string value"}
{id: 5, newval: "a new value", v: {a:4, b:[10, 13, 12, 13, 13], c:TRUE, d:{x:10, y:10}}, z: 14}
us-east-1> 

Equally, you can accidentally delete more data than you expected.

us-east-1> delete from ddbsh_demo;
DELETE (3 read, 3 modified, 0 ccf)
us-east-1> select * from ddbsh_demo;
us-east-1> 

There, all the data is gone! Hopefully that’s what I intended.

There’s a lot more that you can do with ddbsh – to see what else you can do, check out the HELP command which lists all commands and provides help on each.

Two final things. First, ddbsh also supports a number of DDL commands (in addition to CREATE TABLE).

us-east-1> show tables;
ddbsh_demo | ACTIVE | PAY_PER_REQUEST | STANDARD | ba3c5574-d3ca-469b-aeb8-4ad8f8df9d4e | arn:aws:dynamodb:us-east-1:632195519165:table/ddbsh_demo | TTL DISABLED | GSI: 0 | LSI : 0 |
us-east-1> describe ddbsh_demo;
Name: ddbsh_demo (ACTIVE)
Key: HASH id
Attributes: id, N
Created at: 2023-01-25T12:15:15Z
Table ARN: arn:aws:dynamodb:us-east-1:632195519165:table/ddbsh_demo
Table ID: ba3c5574-d3ca-469b-aeb8-4ad8f8df9d4e
Table size (bytes): 0
Item Count: 0
Billing Mode: On Demand
PITR is Disabled.
GSI: None
LSI: None
Stream: Disabled
Table Class: STANDARD
SSE: Not set
us-east-1> 

Now let’s make some changes.

us-east-1> alter table ddbsh_demo set pitr enabled;
ALTER
us-east-1> alter table ddbsh_demo set billing mode provisioned ( 200 rcu, 300 wcu);
ALTER
us-east-1> alter table ddbsh_demo (v number) create gsi gsi_v on (v hash) projecting all billing mode provisioned ( 10 rcu, 20 wcu );
ALTER
us-east-1> describe ddbsh_demo;
Name: ddbsh_demo (ACTIVE)
Key: HASH id
Attributes: id, N, v, N
Created at: 2023-01-25T12:15:15Z
Table ARN: arn:aws:dynamodb:us-east-1:632195519165:table/ddbsh_demo
Table ID: ba3c5574-d3ca-469b-aeb8-4ad8f8df9d4e
Table size (bytes): 0
Item Count: 0
Billing Mode: Provisioned (200 RCU, 300 WCU)
PITR is Enabled: [2023-01-25T12:28:30Z to 2023-01-25T12:28:30Z]
GSI gsi_v: ( HASH v ), Provisioned (10 RCU, 20 WCU), Projecting (ALL), Status: CREATING, Backfilling: YES
LSI: None
Stream: Disabled
Table Class: STANDARD
SSE: Not set
us-east-1> 

Second, if you want to know what ddbsh is doing under the covers, use the EXPLAIN command. For example, how did ddbsh add the GSI?

us-east-1> explain alter table ddbsh_demo (v number) 
us-east-1> create gsi gsi_v on (v hash)
us-east-1> projecting all
us-east-1> billing mode provisioned ( 10 rcu, 20 wcu );
UpdateTable({
"AttributeDefinitions": [{
"AttributeName": "v",
"AttributeType": "N"
}],
"TableName": "ddbsh_demo",
"GlobalSecondaryIndexUpdates": [{
"Create": {
"IndexName": "gsi_v",
"KeySchema": [{
"AttributeName": "v",
"KeyType": "HASH"
}],
"Projection": {
"ProjectionType": "ALL"
},
"ProvisionedThroughput": {
"ReadCapacityUnits": 10,
"WriteCapacityUnits": 20
}
}
}]
})
us-east-1>

You can similarly use EXPLAIN on DML commands too.

us-east-1> explain update ddbsh_demo set z = 14, v.b[6] = 13 where id = 5;
UpdateItem({
   "TableName":   "ddbsh_demo",
   "Key":   {
      "id":   {
         "N":   "5"
      }
   },
   "UpdateExpression":   "SET #awaa1 = :vwaa1, #awaa2.#awaa3[6] = :vwaa2",
   "ConditionExpression":   "attribute_exists(#awaa4)",
   "ExpressionAttributeNames":   {
      "#awaa1":   "z",
      "#awaa2":   "v",
      "#awaa3":   "b",
      "#awaa4":   "id"
   },
   "ExpressionAttributeValues":   {
      ":vwaa1":   {
         "N":   "14"
      },
      ":vwaa2":   {
         "N":   "13"
      }
   }
})
us-east-1> 

When you issue a SELECT, ddbsh automatically decides how to execute it. To understand that, here’s another example. We create a new table with a PK and RK and EXPLAIN several SELECT statements. The first results in GetItem() the second in Query() and the third in Scan().

us-east-1> create table ddbsh_demo2 ( pk number, rk number ) 
us-east-1> primary key (pk hash, rk range);
CREATE
us-east-1> explain select * from ddbsh_demo2 where pk = 3 and rk = 4;
GetItem({
   "TableName":   "ddbsh_demo2",
   "Key":   {
      "pk":   {
         "N":   "3"
      },
      "rk":   {
         "N":   "4"
      }
   },
   "ConsistentRead":   false,
   "ReturnConsumedCapacity":   "NONE"
})
us-east-1> explain select * from ddbsh_demo2 where pk = 3;
Query({
   "TableName":   "ddbsh_demo2",
   "ConsistentRead":   false,
   "ReturnConsumedCapacity":   "NONE",
   "KeyConditionExpression":   "#ayaa1 = :vyaa1",
   "ExpressionAttributeNames":   {
      "#ayaa1":   "pk"
   },
   "ExpressionAttributeValues":   {
      ":vyaa1":   {
         "N":   "3"
      }
   }
})
us-east-1> explain select * from ddbsh_demo2;
Scan({
   "TableName":   "ddbsh_demo2",
   "ReturnConsumedCapacity":   "NONE",
   "ConsistentRead":   false
})
us-east-1> explain select * from ddbsh_demo2 where pk = 3 and rk > 5;
Query({
   "TableName":   "ddbsh_demo2",
   "ConsistentRead":   false,
   "ReturnConsumedCapacity":   "NONE",
   "KeyConditionExpression":   "#aAaa1 = :vAaa1 AND #aAaa2 > :vAaa2",
   "ExpressionAttributeNames":   {
      "#aAaa1":   "pk",
      "#aAaa2":   "rk"
   },
   "ExpressionAttributeValues":   {
      ":vAaa1":   {
         "N":   "3"
      },
      ":vAaa2":   {
         "N":   "5"
      }
   }
})
us-east-1> 

There you have it, a quick introduction to ddbsh. Take it for a ride! And if you like ddbsh, do tell your friends!

Condition Expressions and Data Integrity with DynamoDB

Found a good short read about maintaining data integrity in #dynamodb databases using condition expressions in updates.

TL;DR version your records and use a condition expression on the version. A condition expression is different from a key condition – the key condition identifies the item, the condition expression evaluates to true or false after identifying the item. If false, a CCF is thrown!

https://blog.awsfundamentals.com/understanding-and-handling-race-conditions-at-dynamodb

DynamoDB Condition Checks

DynamoDB Condition Checks are a very powerful (and somewhat) misunderstood capability that makes application development much easier.

Consider this most basic “financial” use-case. I have a table that contains customer account balances. Alice has $500 and Bob has $200. Bob wishes to give Alice $100. This is something really simple that screams “RDBMS” but is actually non-trivial with an RDBMS.

We all know that we need transactions because …

But consider this simple twist – How do you prevent Bob’s account balance from going negative?

There is (and I’d love to be proved wrong) no simple SQL solution to this in RDBMS without using one of (a) constraints, (b) triggers, or (c) stored procedures to implement this operation. The easiest is to stick a >= 0 constraint on the account balance. If Bob tries to give Alice more than $200, the update will fail with a constraint violation.

Why do you need one of these things? Consider this (flawed) implementation with an RDBMS.

--
-- WARNING: This implementation is flawed
--

BEGIN;

UPDATE BALANCES 
SET BALANCE = BALANCE + 300 
WHERE USER = "ALICE";

-- The update below may, or may not update Bob's balance!

UPDATE BALANCES 
SET BALANCE = BALANCE - 300 
WHERE USER = "BOB" AND BALANCE > 300;

COMMIT;

This will complete successfully with the first update giving Alice the money and the second succeeding without doing anything!

This is where the Condition Check in DynamoDB comes along. Consider this with DynamoDB

us-east-1> create table balances ( owner string ) primary key ( owner hash ) billing mode on demand;
CREATE
us-east-1> insert into balances ( owner, balance ) values ( "alice", 500 ), ( "bob", 200 );
INSERT
INSERT
us-east-1> select * from balances;
{balance: 200, owner: bob}
{balance: 500, owner: alice}

us-east-1> begin;
us-east-1> update balances set balance = balance + 300 where owner = "alice";
us-east-1> update balances set balance = balance - 300 where owner = "bob" and balance > 300;
us-east-1> commit;
Transaction failed. TransactionCanceledException. PLRD70PMCA382RM2S0T2OF4O7VVV4KQNSO5AEMVJF66Q9ASUAAJG. Transaction cancelled, please refer cancellation reasons for specific reasons [None, ConditionalCheckFailed]
ABORT
us-east-1> 

We’re being told that the second write in the transaction had a ConditionCheckFailure (see the reasons List, first had None, second had CCF).

Now, we could redo the same thing with a valid amount, less than $200 like this.

us-east-1> select * from balances;
{balance: 200, owner: bob}
{balance: 500, owner: alice}
us-east-1> begin;
us-east-1> update balances set balance = balance + 100 where owner = "alice";
us-east-1> update balances set balance = balance - 100 where owner = "bob" and balance > 100;
us-east-1> commit;
COMMIT
us-east-1> select * from balances;
{balance: 100, owner: bob}
{balance: 600, owner: alice}
us-east-1> 

Now that’s so much more civilized! The WHERE clause “balance > 100” turns into a condition check, and if it fails, so does the transaction.

In DynamoDB, that is all a single API call like so (illustrated using “explain”):

us-east-1> explain begin;
us-east-1> update balances set balance = balance + 100 where owner = "alice";
us-east-1> update balances set balance = balance - 100 where owner = "bob" and balance > 100;
us-east-1> commit;
TransactWriteItems({
   "TransactItems":   [{
         "Update":   {
            "Key":   {
               "owner":   {
                  "S":   "alice"
               }
            },
            "UpdateExpression":   "SET #akaa1 = #akaa1+:vkaa1",
            "TableName":   "balances",
            "ConditionExpression":   "attribute_exists(#akaa2)",
            "ExpressionAttributeNames":   {
               "#akaa1":   "balance",
               "#akaa2":   "owner"
            },
            "ExpressionAttributeValues":   {
               ":vkaa1":   {
                  "N":   "100"
               }
            }
         }
      }, {
         "Update":   {
            "Key":   {
               "owner":   {
                  "S":   "bob"
               }
            },
            "UpdateExpression":   "SET #alaa1 = #alaa1-:vlaa1",
            "TableName":   "balances",
            "ConditionExpression":   "attribute_exists(#alaa2) AND #alaa1 > :vlaa2",
            "ExpressionAttributeNames":   {
               "#alaa1":   "balance",
               "#alaa2":   "owner"
            },
            "ExpressionAttributeValues":   {
               ":vlaa1":   {
                  "N":   "100"
               },
               ":vlaa2":   {
                  "N":   "100"
               }
            }
         }
      }],
   "ClientRequestToken":   "1F087670-87E3-4973-8062-9A0A54E7F200"
})
us-east-1> 

Notice that there is the update expression that updates the balance down by $100, but also a condition check that requires an item to exist that meets the requirement

owner = "bob" and balance > 100

Nifty!

In SQL, different databases give you constructs like this (from SQLServer)

UPDATE BALANCES
SET BALANCE = BALANCE - 300
WHERE OWNER = "Bob" AND BALANCE > 300;

IF @@ROWCOUNT = 0 ...

You’d then be able to decide whether or not you actually updated a row, and take corrective action.

DynamoDB UpdateItem() provides a condition check expression (even without a transaction).

us-east-1> select * from balances;
{balance: 100, owner: bob}
{balance: 600, owner: alice}
us-east-1> update balances set balance = balance - 200 where owner = "bob" and balance > 200;
UPDATE (0 read, 0 modified, 1 ccf)
us-east-1>

That Update gets transformed into this API call.

us-east-1> explain update balances set balance = balance - 200 where owner = "bob" and balance > 200;
UpdateItem({
   "TableName":   "balances",
   "Key":   {
      "owner":   {
         "S":   "bob"
      }
   },
   "UpdateExpression":   "SET #aoaa1 = #aoaa1-:voaa1",
   "ConditionExpression":   "attribute_exists(#aoaa2) AND #aoaa1 > :voaa2",
   "ExpressionAttributeNames":   {
      "#aoaa1":   "balance",
      "#aoaa2":   "owner"
   },
   "ExpressionAttributeValues":   {
      ":voaa1":   {
         "N":   "200"
      },
      ":voaa2":   {
         "N":   "200"
      }
   }
})
us-east-1> 

Observe that in addition to specifying the Key (owner = bob), it provides a condition expression too!

%d bloggers like this: