Data Storage

Sooner or later you need to store or retrieve data somewhere. When modelling a pipeline it doesn’t always matter where, and at high abstraction levels in particular you may not want to choose a data store just yet. The data store symbol is modelled after a common UML symbol for a relational database table, but don’t be fooled: it represents any data store, not a relational one specifically.

We mark the type of the data store with the first letter or a short abbreviation of its name.

Data Store Examples

Origin of the Symbol

The symbol is modelled after the table and class icons in UML. It’s easy to draw on a whiteboard: start with something close to a square, then draw a horizontal line about one fifth of the way down from the top. In software, draw a square first, then position a small rectangle on top and join the shapes. That way you can write the letter of your data store in the centre of the bottom rectangle.
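If you want to generate the symbol rather than draw it by hand, the whiteboard recipe translates almost directly into SVG. The sketch below is only an illustration: the function name `data_store_svg`, the default size of 100 and the stroke styling are assumptions, not part of the notation.

```python
from html import escape


def data_store_svg(letter: str, size: int = 100) -> str:
    """Return an SVG string for a data store icon (hypothetical helper).

    Follows the whiteboard recipe: a square, a horizontal line at about
    one fifth from the top, and the letter centred in the bottom part.
    """
    header = size // 5              # height of the small top rectangle
    text_y = (header + size) / 2    # vertical centre of the bottom rectangle
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{size}" height="{size}" viewBox="0 0 {size} {size}">'
        # Outer square, inset by 1 so the stroke is not clipped.
        f'<rect x="1" y="1" width="{size - 2}" height="{size - 2}" '
        f'fill="none" stroke="black"/>'
        # Horizontal divider at one fifth from the top.
        f'<line x1="1" y1="{header}" x2="{size - 1}" y2="{header}" '
        f'stroke="black"/>'
        # Data store letter, centred in the bottom rectangle.
        f'<text x="{size / 2}" y="{text_y}" text-anchor="middle" '
        f'dominant-baseline="middle">{escape(letter)}</text>'
        f'</svg>'
    )


if __name__ == "__main__":
    # Write an icon for an S3 bucket to a file you can open in a browser.
    with open("s3.svg", "w", encoding="utf-8") as handle:
        handle.write(data_store_svg("S3"))
```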

Data Store Types

Below is a non-exhaustive list of data store types. As with the processors, the naming is (roughly) as follows:

Use the first letter of the data store, capitalized, unless the data store has a standard two-character abbreviation.
An ambiguous letter is no problem as long as the diagram makes it clear (to you and your team) which data store is meant.
To disambiguate, add an extra letter or a ‘.’.
For Aurora, add an A after the letter of the database type.
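The naming rules above can be sketched as a small function. This is a minimal illustration, not a tool that ships with the notation: the function name `symbol` and its parameters `abbreviation`, `extra` and `aurora` are assumptions made for this example, and the choice of which extra letter to use stays with you and your team.

```python
from typing import Optional


def symbol(name: str,
           abbreviation: Optional[str] = None,
           extra: Optional[str] = None,
           aurora: bool = False) -> str:
    """Derive a data store symbol from the naming rules (hypothetical helper).

    name         -- name of the data store, e.g. "DynamoDB"
    abbreviation -- standard two-character abbreviation, if one exists
    extra        -- one extra letter or '.' to disambiguate
    aurora       -- True when the database runs on Aurora
    """
    name = name.strip()
    if not name:
        raise ValueError("data store name must not be empty")

    if abbreviation is not None:
        # Rule 1 exception: a standard two-character abbreviation wins.
        if len(abbreviation) != 2:
            raise ValueError("abbreviation must be exactly 2 characters")
        base = abbreviation.upper()
    else:
        # Rule 1: first letter of the data store, capitalized.
        base = name[0].upper()

    if extra is not None:
        # Rule 3: disambiguate with an extra letter or a '.'.
        if len(extra) != 1:
            raise ValueError("extra must be a single character")
        base += extra.upper()

    if aurora:
        # Rule 4: for Aurora, add an A after the database type.
        base += "A"

    return base


if __name__ == "__main__":
    print(symbol("DynamoDB"))                # D
    print(symbol("S3", abbreviation="S3"))   # S3
    print(symbol("Redis", extra="D"))        # RD
    print(symbol("MySQL", extra="."))        # M.
    print(symbol("MySQL", aurora=True))      # MA
```

Rule 2 has no code on purpose: whether a bare letter is clear enough depends on the diagram and the people reading it, so the function only adds a disambiguating character when you ask for one.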

Data store	Symbol
DynamoDB	D
PostgreSQL	P
S3	S3
Redis or Redshift	R
Redis	RD
Redshift	RS
MySQL or Memcached	M
MySQL	M.
Memcached	MD
Microsoft SQL	MS
Aurora	A
Aurora MySQL	MA
Aurora PostgreSQL	PA
Memcached	MC
Oracle	O

Tables

The data store icon marks an individual table or collection. For example, if you store something in MySQL, you draw a data store icon for each table you read from or write to. The same goes for DynamoDB: you draw a data store for each table. For S3, you draw one data store icon per bucket, and so on.
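If you keep your diagrams as data, the one-icon-per-table rule might look like the sketch below. The class `DataStoreIcon`, the helper `icons_for` and the table and bucket names are all hypothetical, chosen only to illustrate the rule.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DataStoreIcon:
    """One icon in the diagram (hypothetical model)."""
    symbol: str  # letter shown in the icon, e.g. "M." for MySQL
    label: str   # name of the table, collection or bucket


def icons_for(symbol: str, units: List[str]) -> List[DataStoreIcon]:
    """Return one icon per table, collection or bucket, never one per server."""
    return [DataStoreIcon(symbol, unit) for unit in units]


if __name__ == "__main__":
    # A pipeline touching two MySQL tables and one S3 bucket needs three icons.
    diagram = icons_for("M.", ["orders", "customers"]) + icons_for("S3", ["raw-events"])
    for icon in diagram:
        print(f"[{icon.symbol}] {icon.label}")
```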