How to model a relationship in Dynamoose/DynamoDB?

I need the user to be linked with the invoice. In a SQL database this would be a foreign key. How can I do this in Dynamoose?

const dynamoose = require('dynamoose');
const uuid = require('uuid');
const config = require('../config/config').get(process.env.NODE_ENV);

dynamoose.aws.sdk.config.update({
    region: config.awsRegion,
    accessKeyId: config.awsAccessKeyId,
    secretAccessKey: config.awsSecretKey
});

const userSchema = new dynamoose.Schema({
    uid: {
        type: String,
        hashKey: true,
        default: () => uuid.v1(), // a function, so each new document gets its own ID
    },
    name: {
        type: String,
        required: true
    },
    email: {
        type: String,
        required: true
    },
    password: {
        type: String,
        required: true
    },
    token: {
        type: String
    }
});

module.exports = dynamoose.model('User', userSchema);

Every record in the other tables must have an owner: the user who created it.



Solution 1:[1]

This is a really complicated topic compared to SQL.

DynamoDB isn't really a relational database. In SQL you would send one command to the database, and it would process it and return the results, references included. DynamoDB doesn't have that relational functionality built in. You could do a get request for the User, store the key or ID of the Invoice inside the user, and then use that to get the Invoice. But at that point you are making 2 round-trip requests to the server, which is not efficient at scale.
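To make the cost of the stored-ID approach concrete, here is a minimal sketch that uses in-memory Maps as stand-ins for two tables and counts the requests. The data, the `getItem` helper, and the `invoiceId` attribute are all hypothetical; nothing here talks to DynamoDB.

```javascript
// In-memory stand-ins for two DynamoDB tables (hypothetical data).
const users = new Map([["u1", { id: "u1", name: "Ada", invoiceId: "inv1" }]]);
const invoices = new Map([["inv1", { id: "inv1", total: 120 }]]);

let roundTrips = 0;

// Each call models one network request to the database.
function getItem(table, key) {
    roundTrips += 1;
    return table.get(key);
}

// Following the stored ID costs a second request.
function getUserWithInvoice(userId) {
    const user = getItem(users, userId);
    const invoice = getItem(invoices, user.invoiceId);
    return { ...user, invoice };
}

const result = getUserWithInvoice("u1");
console.log(roundTrips); // 2
```

Every extra level of linking adds another request, which is why this pattern gets expensive as the data model grows.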

The reason I say it's not "really" a relational database is that you can still model data in a relational way. It is possible to do that efficiently with DynamoDB.

Below I explain both ways you can do this, starting with the less efficient method of storing an ID and making 2 requests to the DB, followed by the more complicated but very efficient method that DynamoDB recommends.


Dynamoose lets you express that relationship by passing a Model as the type for an attribute. It will store the key for you, and you can call document.populate to get the Invoice. The document.populate call is the one that makes the second request to the DB, so it is not very efficient.

This would look something like the following.

const userSchema = new dynamoose.Schema({
    "id": String,
    // ...
});
const User = dynamoose.model("User", userSchema);

const gameSchema = new dynamoose.Schema({
    "id": String,
    "state": String,
    "user": User
});
const Game = dynamoose.model("Game", gameSchema);

// await needs an async context (or an ES module with top-level await)
(async () => {
    const game = await Game.get("GAMEID1");
    console.log(game.user); // This will print the user key, not the actual user document
    const populatedGame = await game.populate();
    console.log(populatedGame.user); // This will print the actual user document
})();

Remember that this is not efficient. I don't really recommend this method, but for playing around and getting something working quickly (quick as in migrating existing code, not as in high performance), it does the job.


The more recommended solution is to completely rethink your data structure. This is where single table design comes into play. I'd highly recommend watching the "Data Modeling with Amazon DynamoDB" talk from AWS re:Invent 2019 for more information about the best practices around this.

The short version is that you model your data so that you can use Queries to grab the data you are looking for in a single request, using hash and range keys to partition the data into efficient patterns.
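As a rough illustration of that idea, the sketch below keeps users and invoices in one in-memory "table" and models a Query as an exact match on the hash key plus an optional prefix match on the range key. The attribute names `pk` and `sk`, the `USER#`/`INVOICE#` prefixes, and the `query` helper are assumptions chosen for the example, not anything Dynamoose or DynamoDB prescribes.

```javascript
// Hypothetical items sharing one table. "pk" is the hash key, "sk" the range key.
const table = [
    { pk: "USER#u1", sk: "PROFILE", name: "Ada" },
    { pk: "USER#u1", sk: "INVOICE#2021-001", total: 120 },
    { pk: "USER#u1", sk: "INVOICE#2021-002", total: 80 },
    { pk: "USER#u2", sk: "PROFILE", name: "Bob" },
];

// Models a Query: exact match on the hash key, optional prefix match on the
// range key, results ordered by range key.
function query(items, pk, skPrefix = "") {
    return items
        .filter((item) => item.pk === pk && item.sk.startsWith(skPrefix))
        .sort((a, b) => (a.sk < b.sk ? -1 : a.sk > b.sk ? 1 : 0));
}

// One request returns the user together with all of their invoices.
const everything = query(table, "USER#u1");
// Narrowing by range key prefix returns only the invoices.
const invoicesOnly = query(table, "USER#u1", "INVOICE#");

console.log(everything.length, invoicesOnly.length); // 3 2
```

Because the user and the invoices share a hash key, the "join" is already done when the items are written, so reading them back takes a single request.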

You would basically have a User Schema and an Invoice Schema, but 1 model (table) for both. This looks a bit weird in Dynamoose (hopefully in the future we can change the API to make it cleaner), but only because a Model in Dynamoose represents a table, not an entity. I'm hoping we will rethink how that works in v3.

An example of how this would look in Dynamoose would be:

const Cat = dynamoose.model("Cat", [
    new dynamoose.Schema({"id": String, "name": String}),
    {"id": String, "age": Number}
]);

Then you could use queries to get the data you want. It really requires a huge mindset shift, but this is how you see those amazing performance gains at scale that DynamoDB tends to talk about a lot.
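A query against such a table might look like the sketch below. It assumes the table uses a hash key named `pk` and a range key named `sk`, with `USER#` and `INVOICE#` prefixes; those names and the `findInvoicesForUser` helper are hypothetical. The chained `query`, `eq`, `where`, `beginsWith`, and `exec` calls follow the Dynamoose v2 query builder as I understand it, so check the Query documentation for your version before relying on it.

```javascript
// "pk" and "sk" are hypothetical hash/range key attribute names.
// Model is expected to be a Dynamoose model for the shared table.
function findInvoicesForUser(Model, userId) {
    return Model.query("pk").eq(`USER#${userId}`) // all items for this user
        .where("sk").beginsWith("INVOICE#")       // only the invoice items
        .exec(); // with a Dynamoose model this returns a Promise of the results
}
```

Calling `await findInvoicesForUser(Invoice, "u1")` from an async function, where `Invoice` is your own model, would then fetch all of that user's invoices in a single request.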


Hopefully this helps answer your question. Dynamoose also has a Slack room you can join for questions more specific to your use case (you can find the link to join in the GitHub repo README). Feel free to comment on this answer if anything doesn't make sense or you need more clarity.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1