Dynamoid is an ORM for Amazon's DynamoDB for Ruby applications. It provides similar functionality to ActiveRecord and improves on Amazon's existing HashModel by providing better searching tools and native association support.
DynamoDB is not like other document-based databases you might know, and is very different indeed from relational databases. It sacrifices anything beyond the simplest relational queries and transactional support to provide a fast, cost-efficient, and highly durable storage solution. If your database requires complicated relational queries and transaction support, then this modest Gem cannot provide them for you, and neither can DynamoDB. In those cases you would do better to look elsewhere for your database needs.
But if you want a fast, scalable, simple, easy-to-use database (and a Gem that supports it) then look no further!
Installing Dynamoid is pretty simple. First include the Gem in your Gemfile:
gem 'dynamoid'
Dynamoid depends on the aws-sdk gem and is tested against the current version of aws-sdk (~> 3) and Rails (>= 4). AWS itself is configured through the usual aws-sdk setup.
Make sure you are using the version for the right AWS SDK.
Dynamoid version | AWS SDK Version |
---|---|
0.x | 1.x |
1.x | 2.x |
2.x | 2.x |
3.x | 3.x |
Configure AWS access: Reference
For example, to configure AWS access, create config/initializers/aws.rb as follows:
Aws.config.update(
region: 'us-west-2',
credentials: Aws::Credentials.new('REPLACE_WITH_ACCESS_KEY_ID', 'REPLACE_WITH_SECRET_ACCESS_KEY'),
)
Alternatively, if you don't want Aws connection settings to be
overwritten for your entire project, you can specify connection settings
for Dynamoid only, by setting those in the Dynamoid.configure clause:
require 'dynamoid'
Dynamoid.configure do |config|
config.access_key = 'REPLACE_WITH_ACCESS_KEY_ID'
config.secret_key = 'REPLACE_WITH_SECRET_ACCESS_KEY'
config.region = 'us-west-2'
end
Additionally, if you would like to pass in pre-configured AWS credentials (e.g. you have an IAM role credential, you configure your credentials elsewhere in your project, etc.), you may do so:
require 'dynamoid'
credentials = Aws::AssumeRoleCredentials.new(
region: region,
access_key_id: key,
secret_access_key: secret,
role_arn: role_arn,
role_session_name: 'our-session'
)
Dynamoid.configure do |config|
config.region = 'us-west-2'
config.credentials = credentials
end
For a full list of the DDB regions, you can go here.
Then you need to initialize Dynamoid config to get it going. Put code similar to this somewhere (a Rails initializer would be a great place for this if you're using Rails):
require 'dynamoid'
Dynamoid.configure do |config|
# To namespace tables created by Dynamoid from other tables you might have.
# Set to nil to avoid namespacing.
config.namespace = 'dynamoid_app_development'
# [Optional]. If provided, it communicates with the DB listening at the endpoint.
# This is useful for testing with [DynamoDB Local] (http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Tools.DynamoDBLocal.html).
config.endpoint = 'http://localhost:3000'
end
Dynamoid supports Ruby >= 2.3 and Rails >= 4.2.
Its compatibility is tested against following Ruby versions: 2.3, 2.4, 2.5, 2.6, 2.7, 3.0, 3.1, 3.2, 3.3, and 3.4, JRuby 9.4.x and against Rails versions: 4.2, 5.0, 5.1, 5.2, 6.0, 6.1, 7.0, 7.1, 7.2, and 8.0.
You must include Dynamoid::Document in every Dynamoid model.
class User
include Dynamoid::Document
# fields declaration
end
Dynamoid has some sensible defaults for you when you create a new table, including the table name and the primary key column. But you can change those if you like on table creation.
class User
include Dynamoid::Document
table name: :awesome_users, key: :user_id, read_capacity: 5, write_capacity: 5
end
These options will not change an existing table, so specifying a new
read_capacity and write_capacity here only works correctly for entirely
new tables. Similarly, while Dynamoid will look for a table named
awesome_users in your namespace, it won't change any existing tables
to use that name; and if it does find a table with the correct name, it
won't change its hash key, which it expects will be user_id. If this
table doesn't exist yet, however, Dynamoid will create it with these
options.
There is basic support for DynamoDB's Time To Live (TTL) mechanism. If you declare a field as a TTL field, it will be initialised if it doesn't have a value yet. The default value is the current time plus the specified number of seconds.
class User
include Dynamoid::Document
table expires: { field: :ttl, after: 60 }
field :ttl, :integer
end
The field used to store the expiration time (e.g. ttl) should be declared
explicitly and should have a numeric type (integer, number) only.
The datetime type is also possible, but only if it's stored as a number
(there is a way to store time as a string also).
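As a rough illustration of the default described above, the value written to the TTL field is an integer number of epoch seconds. The helper name `ttl_expiry` below is hypothetical and not part of Dynamoid; it only mirrors the "current time + specified seconds" rule.

```ruby
# Hypothetical helper (not a Dynamoid API): computes the value a TTL field
# would default to, i.e. the given time plus `after` seconds, as epoch seconds.
def ttl_expiry(now, after)
  now.to_i + after
end

now = Time.utc(2024, 1, 1, 0, 0, 0)
puts ttl_expiry(now, 60) # integer epoch seconds, 60 seconds after `now`
```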
It's also possible to override the global option Dynamoid::Config.timestamps
at the table level:
table timestamps: false
This option controls generation of the timestamp fields
created_at/updated_at.
It's also possible to override the globally configured table capacity
mode with the table-level option capacity_mode. Valid values are
:provisioned, :on_demand and nil:
table capacity_mode: :on_demand
If the table capacity mode is on-demand, the related table-level options
read_capacity and write_capacity will be ignored.
You'll have to define all the fields on the model and the data type of each field. Every field on the object must be included here; if you miss any they'll be completely bypassed during DynamoDB's initialization and will not appear on the model objects.
By default, fields are assumed to be of type string. Other built-in
types are integer, number, set, array, map, datetime, date, boolean,
binary, raw and serialized. array and map match the List and Map
DynamoDB types respectively. The raw type means you can store Ruby
Array, Hash, String and numbers. If built-in types do not suit you, you
can use a custom field type represented by an arbitrary class, provided
that the class supports a compatible serialization interface. The
primary use case for using a custom field type is to represent your
business logic with high-level types, while ensuring portability or
backward-compatibility of the serialized representation.
Boolean fields are stored as DynamoDB boolean values by default.
Dynamoid can store boolean values as strings as well: 't' and 'f'.
So if you want to change the default format of a boolean field you can
easily achieve this with the store_as_native_boolean field option:
class Document
include Dynamoid::Document
field :active, :boolean, store_as_native_boolean: false
end
By default date fields are persisted as a count of days since 1 January
1970, like UNIX time. If you prefer dates to be stored as ISO-8601
formatted strings instead, then set store_as_string to true:
class Document
include Dynamoid::Document
field :sent_on, :date, store_as_string: true
end
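To make the two date formats concrete, here is a small plain-Ruby sketch of what each representation looks like for one date. It does not call Dynamoid; the helper name `days_since_epoch` is an assumption made for illustration only.

```ruby
require 'date'

EPOCH = Date.new(1970, 1, 1)

# Hypothetical helper: whole days elapsed since 1 January 1970,
# i.e. the shape of the default numeric representation.
def days_since_epoch(date)
  (date - EPOCH).to_i
end

date = Date.new(2024, 1, 1)
puts days_since_epoch(date) # numeric form: 19723
puts date.iso8601           # string form: 2024-01-01
```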
By default datetime fields are persisted as UNIX timestamps with
millisecond precision in DynamoDB. If you prefer datetimes to be stored
as ISO-8601 formatted strings instead, then set store_as_string to true:
class Document
include Dynamoid::Document
field :sent_at, :datetime, store_as_string: true
end
WARNING: Fields in numeric format are stored with nanoseconds as a
fraction part and precision could be lost. That's why a datetime field
in numeric format shouldn't be used as a range key.
You have two options if you need to use a datetime field as a range
key:
- string format
- store datetime values without milliseconds (e.g. cut them manually with the change method: Time.now.change(usec: 0))
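The `change` method mentioned above comes from ActiveSupport. Outside of Rails, a plain-Ruby sketch of the same truncation could look like the following; `truncate_to_seconds` is a hypothetical helper name, not a Dynamoid API.

```ruby
# Hypothetical helper: drops the sub-second part of a Time so that the
# stored numeric value has no fractional component to lose precision on.
def truncate_to_seconds(time)
  Time.at(time.to_i).utc
end

original  = Time.at(1_700_000_000, 123_456, :usec).utc
truncated = truncate_to_seconds(original)

puts original.usec  # 123456
puts truncated.usec # 0
```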
Dynamoid's type set is stored as DynamoDB's Set attribute type.
DynamoDB supports only Sets of strings, numbers and binary. Moreover, a
Set must contain elements of the same type only.
In order to use some of Dynamoid's other types you can specify the of
option to declare the type of set elements.
As a result of that DynamoDB limitation, in Dynamoid only the following
scalar types are supported (note: boolean is not supported):
integer, number, date, datetime, serialized and custom types.
class Document
include Dynamoid::Document
field :tags, :set, of: :integer
end
It's possible to specify field options, like store_as_string for a
datetime field or serializer for a serialized field, for the type of
the set elements:
class Document
include Dynamoid::Document
field :values, :set, of: { serialized: { serializer: JSON } }
field :dates, :set, of: { date: { store_as_string: true } }
field :datetimes, :set, of: { datetime: { store_as_string: false } }
end
DynamoDB doesn't allow empty strings in fields configured as set.
Abiding by this restriction, when Dynamoid saves a document it removes
all empty strings in set fields.
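The effect on a set value can be pictured with plain Ruby. This is only a sketch of the outcome described above, not Dynamoid's internal implementation.

```ruby
require 'set'

# A set field value as assigned in application code.
tags = Set['ruby', '', 'dynamodb']

# The value as it would look after empty strings are dropped on save.
sanitized = Set.new(tags.reject(&:empty?))

puts sanitized.to_a.inspect
```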
Dynamoid's type array is stored as DynamoDB's List attribute type.
It can contain elements of different types (in contrast to the Set
attribute type).
If you need to store elements of datetime, date, serialized or some
custom type in an array field, which DynamoDB doesn't support natively,
you should specify the element type with the of option:
class Document
include Dynamoid::Document
field :dates, :array, of: :date
end
By default binary fields are persisted as a DynamoDB String value
encoded in Base64. DynamoDB supports binary data natively. To use it
instead of String, the store_binary_as_native field option should be
set:
class Document
include Dynamoid::Document
field :image, :binary, store_binary_as_native: true
end
There is also a global config option store_binary_as_native that is
false by default as well.
You get magic columns of id (string), created_at (datetime), and
updated_at (datetime) for free.
class User
include Dynamoid::Document
field :name
field :email
field :rank, :integer
field :number, :number
field :joined_at, :datetime
field :hash, :serialized
end
You can optionally set a default value on a field using either a plain value or a lambda:
field :actions_taken, :integer, default: 0
field :joined_at, :datetime, default: -> { Time.now }
It might be helpful to define an alias for already existing field when naming convention used for a table differs from conventions common in Ruby:
field :firstName, :string, alias: :first_name
This way setters/getters/<name>?/<name>_before_type_cast methods are
generated for both the original field name (firstName) and the alias
(first_name).
user = User.new(first_name: 'Michael')
user.first_name # => 'Michael'
user.firstName # => 'Michael'
To use a custom type for a field, suppose you have a Money type:
class Money
# ... your business logic ...
def dynamoid_dump
'serialized representation as a string'
end
def self.dynamoid_load(_serialized_str)
# parse serialized representation and return a Money instance
Money.new(1.23)
end
end
class User
include Dynamoid::Document
field :balance, Money
end
If you want to use a third-party class (which does not support
#dynamoid_dump and .dynamoid_load) as your field type, you can use
an adapter class providing .dynamoid_dump and .dynamoid_load class
methods for your third-party class. .dynamoid_load can remain the same
from the previous example; here we just add a level of indirection for
serializing. Example:
# Third-party Money class
class Money; end
class MoneyAdapter
def self.dynamoid_load(_money_serialized_str)
Money.new(1.23)
end
def self.dynamoid_dump(money_obj)
money_obj.value.to_s
end
end
class User
include Dynamoid::Document
field :balance, MoneyAdapter
end
Lastly, you can control the data type of your custom-class-backed field
at the DynamoDB level. This is especially important if you want to use
your custom field as a numeric range or for number-oriented queries. By
default custom fields are persisted as a string attribute, but your
custom class can override this with a .dynamoid_field_type class
method, which would return either :string or :number.
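A minimal sketch of a number-backed custom type follows. The Money class and its cents attribute are illustrative assumptions; only the three hook names (dynamoid_dump, dynamoid_load, dynamoid_field_type) come from the text above. It also assumes that a numeric value is an acceptable dumped form when the field type is :number.

```ruby
# Illustrative custom type that asks to be persisted as a number.
class Money
  attr_reader :cents

  def initialize(cents)
    @cents = cents
  end

  # Hook described above: request a numeric DynamoDB attribute.
  def self.dynamoid_field_type
    :number
  end

  # Serialized form handed over when the document is saved
  # (assumption: a plain Integer is suitable for a :number field).
  def dynamoid_dump
    cents
  end

  # Rebuilds a Money from the stored value; to_i tolerates Integer,
  # String or decimal inputs.
  def self.dynamoid_load(serialized)
    new(serialized.to_i)
  end
end

price = Money.new(1999)
restored = Money.dynamoid_load(price.dynamoid_dump)
puts restored.cents
```

In a model this would be declared the same way as in the earlier example, e.g. `field :balance, Money`.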
DynamoDB may support some other attribute types that are not yet supported by Dynamoid.
Along with a partition key, a table may have a sort key. To declare
it in a model, use the range class method:
class Post
include Dynamoid::Document
range :posted_at, :datetime
end
The second argument, type, is optional. The default type is string.
Just like in ActiveRecord (or your other favorite ORM), Dynamoid uses associations to create links between models.
WARNING: Associations are not supported for models with a compound primary key. If a model declares a range key, it should neither declare any association itself nor be referenced by an association in another model.
The only supported associations (so far) are has_many, has_one,
has_and_belongs_to_many, and belongs_to. Associations are very
simple to create: just specify the type, the name, and then any options
you'd like to pass to the association. If there's an inverse association
either inferred or specified directly, Dynamoid will update both objects
to point at each other.
class User
include Dynamoid::Document
# ...
has_many :addresses
has_many :students, class: User
belongs_to :teacher, class_name: :user
belongs_to :group
belongs_to :group, foreign_key: :group_id
has_one :role
has_and_belongs_to_many :friends, inverse_of: :friending_users
end
class Address
include Dynamoid::Document
# ...
belongs_to :user # Automatically links up with the user model
end
Contrary to what you'd expect, association information is always
contained on the object specifying the association, even if it seems
like the association has a foreign key. This is a side effect of
DynamoDB's structure: it's very difficult to find foreign keys without
an index. Usually you won't find this to be a problem, but it does mean
that association methods that build new models will not work correctly.
For example, user.addresses.new returns an address that is not
associated to the user. We may correct this someday, if we get a pull
request.
Dynamoid bakes in ActiveModel validations, just like ActiveRecord does.
class User
include Dynamoid::Document
# ...
validates_presence_of :name
validates_format_of :email, with: /@/
end
To see more usage and examples of ActiveModel validations, check out the ActiveModel validation documentation.
If you want to bypass model validation, pass validate: false to the
save call:
model.save(validate: false)
Dynamoid also employs ActiveModel callbacks. Right now the following callbacks are supported:
- save (before, after, around)
- create (before, after, around)
- update (before, after, around)
- validation (before, after)
- destroy (before, after, around)
- after_touch
- after_initialize
- after_find
Example:
class User
include Dynamoid::Document
# ...
before_save :set_default_password
after_create :notify_friends
after_destroy :delete_addresses
end
Dynamoid supports STI (Single Table Inheritance) like Active Record
does. You just need to specify a type field in a base class. Example:
class Animal
include Dynamoid::Document
field :name
field :type
end
class Cat < Animal
field :lives, :integer
end
cat = Cat.create(name: 'Morgan')
animal = Animal.find(cat.id)
animal.class
#=> Cat
If you already have DynamoDB tables in which a type field already exists
and has its own semantics, this leads to a conflict. It's possible to
tell Dynamoid to use another field (even one that doesn't exist yet)
instead of type with the inheritance_field table option:
class Car
include Dynamoid::Document
table inheritance_field: :my_new_type
field :my_new_type
end
c = Car.create
c.my_new_type
#=> "Car"
Dynamoid supports type casting and tries to do it in the most convenient way. Values for all fields (except custom type) are coerced to declared field types.
Some obvious rules are used, e.g.:
for boolean field:
document.boolean_field = 'off'
# => false
document.boolean_field = 'false'
# => false
document.boolean_field = 'some string'
# => true
or for integer field:
document.integer_field = 42.3
# => 42
document.integer_field = '42.3'
# => 42
document.integer_field = true
# => 1
If a time zone isn't specified for a datetime value, the application
time zone is used.
To access a field value before type casting, the following methods can
be used: attributes_before_type_cast and
read_attribute_before_type_cast.
There is a <name>_before_type_cast method for every field in a model as
well.
Dynamoid supports the Dirty API, which is equivalent to Rails 5.2
ActiveModel::Dirty.
There is only one limitation: an in-place change of a field value isn't
detected automatically.
Dynamoid's syntax is generally very similar to ActiveRecord's. Making new objects is simple:
u = User.new(name: 'Josh')
u.email = '[email protected]'
u.save
Save forces persistence to the data store: a unique ID is also assigned, but it is a string and not an auto-incrementing number.
u.id # => '3a9f7216-4726-4aea-9fbc-8554ae9292cb'
To use associations, you use association methods very similar to ActiveRecord's:
address = u.addresses.create
address.city = 'Chicago'
address.save
To create multiple documents at once:
User.create([{ name: 'Josh' }, { name: 'Nick' }])
There is an efficient and low-level way to create multiple documents (without validation and callbacks running):
users = User.import([{ name: 'Josh' }, { name: 'Nick' }])
Querying can be done in one of the following ways:
Address.find(address.id) # Find directly by ID.
Address.where(city: 'Chicago').all # Find by any number of matching criteria...
# Though presently only "where" is supported.
Address.find_by_city('Chicago') # The same as above, but using ActiveRecord's older syntax.
There is also a way to use #where with a condition expression:
Address.where('city = :c', c: 'Chicago')
A condition expression may contain operators (e.g. <, >=, <>),
keywords (e.g. AND, OR, BETWEEN) and built-in functions (e.g.
begins_with, contains) (see the
[documentation](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Expressions.OperatorsAndFunctions.html)
for a full syntax description).
Warning: Values (specified for a String condition expression) are
sent as is, so Dynamoid field types that aren't supported natively by
DynamoDB (e.g. datetime and date) require explicit casting.
Warning: String condition expressions will be used by DynamoDB only
at filtering, so conditions on key attributes should be specified as a
Hash to perform a Query operation instead of a Scan. Don't use key
attributes in #where's String condition expressions.
And you can also query on associations:
u.addresses.where(city: 'Chicago').all
But keep in mind Dynamoid - and document-based storage systems in general - are not drop-in replacements for existing relational databases. The above query does not efficiently perform a conditional join, but instead finds all the user's addresses and naively filters them in Ruby. For large associations this is a performance hit compared to relational database engines.
Warning: There is a caveat with filtering documents by an attribute
with a nil value. By default Dynamoid ignores attributes with a nil
value and doesn't store them in a DynamoDB document. This behavior can
be changed with the store_attribute_with_nil_value config option.
If Dynamoid ignores nil value attributes, the null/not_null operators
should be used in a query:
Address.where('postcode.null': true)
Address.where('postcode.not_null': true)
If Dynamoid keeps nil value attributes, the eq/ne operators should be
used instead:
Address.where(postcode: nil)
Address.where('postcode.ne': nil)
There are three types of limits that you can query with:
- record_limit - The number of evaluated records that are returned by the query.
- scan_limit - The number of scanned records that DynamoDB will look at before returning.
- batch_size - The number of records requested to DynamoDB per underlying request, good for large queries!
Using these in various combinations results in the underlying requests
being made in the smallest size possible, and the query returns once
record_limit or scan_limit is satisfied. It will attempt to batch
whenever possible.
You can thus limit the number of evaluated records, or select a record from which to start in order to support pagination.
Address.record_limit(5).start(address) # Only 5 addresses starting at `address`
Where address is an instance of the model or a hash
{the_model_hash_key: 'value', the_model_range_key: 'value'}. Keep in
mind that if you are passing a hash to .start() you need to explicitly
define all required keys in it, including range keys, depending on the
table or secondary index signatures; otherwise you'll get an
Aws::DynamoDB::Errors::ValidationException, either for
"Exclusive Start Key must have same size as table's key schema"
or "The provided starting key is invalid".
If you are potentially running over a large data set (and this is especially true when using certain filters), you may want to consider limiting the number of scanned records (the number of records the DynamoDB infrastructure looks through when evaluating data to return):
Address.scan_limit(5).start(address) # Only scan at most 5 records and return what's found starting from `address`
For large queries that return many rows, Dynamoid can use AWS' support for requesting documents in batches:
# Do some maintenance on the entire table without flooding DynamoDB
Address.batch(100).each { |addr| addr.do_some_work && sleep(0.01) }
Address.record_limit(10_000).batch(100).each { |addr| addr.do_some_work && sleep(0.01) } # Batch specified as part of a chain
The implication of batches is that the underlying requests are done in
the batch sizes to make the requests and responses more manageable. Note
that this batching is for Query and Scan and not BatchGetItem
commands.
At times it can be useful to rely on DynamoDB low-level pagination instead of fixed pages sizes. Each page results in a single Query or Scan call to DynamoDB, but returns an unknown number of records.
Access to the native DynamoDB pages can be obtained via the
find_by_pages method, which yields arrays of records.
Address.find_by_pages do |addresses, metadata|
end
Each yielded page comes with page metadata as the second argument, which
is a hash including the key :last_evaluated_key. The value of this key
can be passed to the start method to fetch the next page of records.
This can be used, for instance, to implement efficient pagination in web applications:
class UserController < ApplicationController
def index
next_page = params[:next_page_token] ? JSON.parse(Base64.decode64(params[:next_page_token])) : nil
records, metadata = User.start(next_page).find_by_pages.first
render json: {
records: records,
next_page_token: Base64.encode64(metadata[:last_evaluated_key].to_json)
}
end
end
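The token handling in the controller above can be isolated into two small helpers. The sketch below is one possible shape, assuming URL-safe Base64 is preferred for query parameters (Base64.encode64 inserts line breaks) and that a missing :last_evaluated_key means there are no more pages. The helper names are hypothetical and not part of Dynamoid.

```ruby
require 'base64'
require 'json'

# Hypothetical helper: turns a last_evaluated_key hash into an opaque token.
# Returns nil when there is no further page.
def encode_page_token(last_evaluated_key)
  return nil if last_evaluated_key.nil?

  Base64.urlsafe_encode64(JSON.generate(last_evaluated_key))
end

# Hypothetical helper: restores the key hash from a token (string keys).
def decode_page_token(token)
  return nil if token.nil? || token.empty?

  JSON.parse(Base64.urlsafe_decode64(token))
end

key   = { 'id' => 'a1b2', 'created_at' => 1_700_000_000 }
token = encode_page_token(key)
puts decode_page_token(token) == key
```

Note that a JSON round trip turns symbol keys into strings, which is why the sample key uses string keys.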
You are able to optimize a query with a condition on the sort key. The
following operators are available: gt, lt, gte, lte, begins_with,
between, as well as equality:
Address.where(latitude: 10_212)
Address.where('latitude.gt': 10_212)
Address.where('latitude.lt': 10_212)
Address.where('latitude.gte': 10_212)
Address.where('latitude.lte': 10_212)
Address.where('city.begins_with': 'Lon')
Address.where('latitude.between': [10_212, 20_000])
You are able to filter results on the DynamoDB side and specify
conditions for non-key fields. The following additional operators are
available: in, contains, not_contains, null, not_null:
Address.where('city.in': %w[London Edinburgh Birmingham])
Address.where('city.contains': ['on'])
Address.where('city.not_contains': ['ing'])
Address.where('postcode.null': false)
Address.where('postcode.not_null': true)
WARNING: Please take into account that the NULL and NOT_NULL
operators check attribute presence in a document, not its value. So if
the attribute postcode's value is NULL, the NULL operator will return
false because the attribute exists even if it has a NULL value.
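The presence-versus-value distinction can be pictured with plain Ruby hashes standing in for stored documents. This is only an analogy for the semantics described above; `null_operator` is a hypothetical name and no DynamoDB call is involved.

```ruby
# Two stand-in documents: one without the attribute, one storing NULL in it.
absent_doc = { 'id' => '1' }
null_doc   = { 'id' => '2', 'postcode' => nil }

# Mimics the NULL operator: true only when the attribute is not present.
def null_operator(doc, attribute)
  !doc.key?(attribute)
end

puts null_operator(absent_doc, 'postcode') # attribute missing
puts null_operator(null_doc, 'postcode')   # attribute present with NULL value
```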
Loading only some specific fields can be done with the project
method:
class User
include Dynamoid::Document
field :name
end
User.create(name: 'Alex')
user = User.project(:name).first
user.id # => nil
user.name # => 'Alex'
user.created_at # => nil
Returned models will have only the specified fields filled.
Several fields could be specified:
user = User.project(:name, :created_at)
Querying supports consistent reading. By default, DynamoDB reads are eventually consistent: if you do a write and then a read immediately afterwards, the results of the previous write may not be reflected. If you need to do a consistent read (that is, you need to read the results of a write immediately) you can do so, but keep in mind that consistent reads are twice as expensive as regular reads for DynamoDB.
Address.find(address.id, consistent_read: true) # Find an address, ensure the read is consistent.
Address.where(city: 'Chicago').consistent.all # Find all addresses where the city is Chicago, with a consistent read.
If you have a range index, Dynamoid provides a number of additional convenience methods to make your life a little easier:
User.where('created_at.gt': DateTime.now - 1.day).all
User.where('created_at.lt': DateTime.now - 1.day).all
It also supports gte and lte. Turning those into symbols and
allowing a Rails SQL-style string syntax is in the works. You can only
have one range argument per query, because of DynamoDB's inherent
limitations, so use it sensibly!
In order to update a document you can use the high-level methods
#update_attributes, #update_attribute and .update. They run
validation and callbacks.
Address.find(id).update_attributes(city: 'Chicago')
Address.find(id).update_attribute(:city, 'Chicago')
Address.update(id, city: 'Chicago')
There are also some low-level methods: #update, .update_fields and
.upsert. They don't run validation and callbacks (except #update,
which runs update callbacks). All of them support conditional updates.
.upsert will create a new document if a document with the specified id
doesn't exist.
Address.find(id).update do |i|
i.set city: 'Chicago'
i.add latitude: 100
i.delete set_of_numbers: 10
end
Address.find(id).update(if: { deliverable: true }) do |i|
i.set city: 'Chicago'
end
Address.update_fields(id, city: 'Chicago')
Address.update_fields(id, { city: 'Chicago' }, if: { deliverable: true })
Address.upsert(id, city: 'Chicago')
Address.upsert(id, { city: 'Chicago' }, if: { deliverable: true })
By default, .upsert will update all attributes of the document if it already exists.
To idempotently create-but-not-update a record, apply the unless_exists
condition to its keys when you upsert.
Address.upsert(id, { city: 'Chicago' }, { unless_exists: [:id] })
To delete several items at once, use the delete_all method. No
callbacks are called. Items are deleted efficiently in batches.
Address.where(city: 'London').delete_all
You can define an index with global_secondary_index:
class User
include Dynamoid::Document
field :name
field :age, :number
global_secondary_index hash_key: :age # Must come after field definitions.
end
There are the following options:
- hash_key - is used as the hash key of an index,
- range_key - is used as the range key of an index,
- projected_attributes - list of fields to store in an index, or one of the predefined values :keys_only, :all; :keys_only is the default,
- name - an index will be created with this name when a table is created; by default the name is generated and contains the table name and key names,
- read_capacity - is used when a table is created and used as the index capacity; by default equals Dynamoid::Config.read_capacity,
- write_capacity - is used when a table is created and used as the index capacity; by default equals Dynamoid::Config.write_capacity
The only mandatory option is hash_key.
WARNING: In order to use a global secondary index in Document.where
implicitly you need to have all the attributes of the original table in
the index and declare it with the option projected_attributes: :all:
class User
# ...
global_secondary_index hash_key: :age, projected_attributes: :all
end
Global and Local Secondary Indexes (GSI/LSI) are queried implicitly.
Dynamoid uses your GSI through the where clauses and deduces the index
based on the query fields provided. An added benefit is that this is
built into query chaining, so you can use all the methods used in normal
querying. For example:
where(dynamo_primary_key_column_name => dynamo_primary_key_value,
"#{range_column}.#{range_modifier}" => range_value)
.scan_index_forward(false)
The only caveat with this method is that because it is also used for
general querying, it WILL NOT use a GSI unless the GSI is explicitly
defined with projected_attributes: :all in your model. This is because
GSIs that do not have all attributes projected will only contain the
index keys and therefore will not return objects with fully resolved
field values. It currently opts to provide the complete results rather
than partial results unless you've explicitly looked up the data.
A future improvement could involve implementing select in chaining as
well as resolving the fields with a second query against the table,
since a query against a GSI followed by a query on the base table is
still likely faster than a scan on the base table.
Warning
Please note that this API is experimental and can be changed in future releases.
Multiple modifying actions can be grouped together and submitted as an all-or-nothing operation. Atomic modifying operations are supported in Dynamoid using transactions. If any action in the transaction fails they all fail.
The following actions are supported:
- #create/#create! - add a new model if it does not already exist
- #save/#save! - create or update a model
- #update_attributes/#update_attributes! - modifies one or more attributes of an existing model
- #delete - remove a model without callbacks or validations
- #destroy/#destroy! - remove a model
- #upsert - add a new model or update an existing one, no callbacks
- #update_fields - update a model without its instantiation
These methods are supposed to behave exactly like their non-transactional counterparts.
Models can be created inside of a transaction. The partition and sort
keys, if applicable, are used to determine uniqueness. Creating will
fail with Aws::DynamoDB::Errors::TransactionCanceledException if a
model already exists.
This example creates a user with a unique id and unique email address by
creating 2 models. An additional model is upserted in the same
transaction. Upsert will update updated_at but will not create
created_at.
user_id = SecureRandom.uuid
email = '[email protected]'
Dynamoid::TransactionWrite.execute do |txn|
txn.create(User, id: user_id)
txn.create(UserEmail, id: "UserEmail##{email}", user_id: user_id)
txn.create(Address, id: 'A#2', street: '456')
txn.upsert(Address, 'A#1', street: '123')
end
Models can be saved in a transaction. New records are created;
otherwise the model is updated. Save, create, update, validate and
destroy callbacks are called around the transaction as appropriate.
Validation failures will throw Dynamoid::Errors::DocumentNotValid.
user = User.find(1)
article = Article.new(body: 'New article text', user_id: user.id)
Dynamoid::TransactionWrite.execute do |txn|
txn.save(article)
user.last_article_id = article.id
txn.save(user)
end
A model can be updated by providing a model or primary key, and the fields to update.
Dynamoid::TransactionWrite.execute do |txn|
# change name and title for a user
txn.update_attributes(user, name: 'bob', title: 'mister')
# sets the name and title for a user
# The user is found by id (that equals 1)
txn.update_fields(User, '1', name: 'bob', title: 'mister')
end
To remove a record, either pass a model or specify the model class and key.
#destroy uses callbacks and validations. Use #delete to skip
callbacks and validations.
article = Article.find('1')
tag = article.tag
Dynamoid::TransactionWrite.execute do |txn|
txn.destroy(article)
txn.delete(tag)
txn.delete(Tag, '2') # delete record with hash key '2' if it exists
txn.delete(Tag, 'key#abcd', 'range#1') # when sort key is required
end
All of the transaction methods can be called without the !, which
results in false instead of a raised exception when validation fails.
Ignoring validation failures can lead to confusion or bugs, so always
check the return status when not using a method with !.
user = User.find('1')
user.red = true
Dynamoid::TransactionWrite.execute do |txn|
if txn.save(user) # won't raise validation exception
txn.update_fields(UserCount, user.id, count: 5)
else
puts 'ALERT: user not valid, skipping'
end
end
Transactions can also be built without a block.
transaction = Dynamoid::TransactionWrite.new
transaction.create(User, id: user_id)
transaction.create(UserEmail, id: "UserEmail##{email}", user_id: user_id)
transaction.upsert(Address, 'A#1', street: '123')
transaction.commit # changes are persisted at this point
To run PartiQL statements, use the Dynamoid.adapter.execute method:
Dynamoid.adapter.execute("UPDATE users SET name = 'Mike' WHERE id = '1'")
Parameters are also supported:
Dynamoid.adapter.execute('SELECT * FROM users WHERE id = ?', ['1'])
Listed below are all configuration options.
- adapter - useful only for gem developers to switch to a new adapter. The default and only available value is aws_sdk_v3
- namespace - prefix for table names. Default is dynamoid_#{application_name}_#{environment} for a Rails application and dynamoid otherwise
- logger - by default it's Rails.logger in a Rails application and stdout otherwise. You can disable logging by setting it to nil or false. Set it to true to use the default
- access_key - custom access key for AWS credentials; overrides global AWS credentials if they're present
- secret_key - custom secret key for AWS credentials; overrides global AWS credentials if they're present
- credentials - custom pre-configured AWS credentials; overrides global AWS credentials if they're present
- region - custom AWS region for DynamoDB; overrides the global AWS region if it's present
- batch_size - when you load multiple items at once with a batch_get_item call, Dynamoid loads them not with one API call but piece by piece. Default is 100 items
- capacity_mode - used at table creation; determines whether the table's read/write capacity mode will be on-demand or provisioned. Allowed values are :on_demand and :provisioned. Default is nil, which means provisioned mode will be used
- read_capacity - used at table or index creation. Default is 100 (units)
- write_capacity - used at table or index creation. Default is 20 (units)
- warn_on_scan - log a warning when a table is scanned. Default is true
- endpoint - if provided, Dynamoid communicates with the DynamoDB instance listening at this endpoint. This is useful for testing with DynamoDB Local
- identity_map - ensures that each object gets loaded only once by keeping every loaded object in a map. Looks up objects using the map when referring to them. Isn't thread safe. Default is false. Use Dynamoid::Middleware::IdentityMap to clear the identity map for each HTTP request
- timestamps - by default Dynamoid sets the created_at and updated_at fields when a model is created and updated. You can disable this behavior by setting it to false
- sync_retry_max_times - when Dynamoid creates or deletes a table synchronously, it checks for completion the specified number of times. Default is 60 (times). It's a bit over 2 minutes by default
- sync_retry_wait_seconds - time to wait between retries. Default is 2 (seconds)
- convert_big_decimal - if true, Dynamoid converts numbers stored in a Hash in a raw field to float. Default is false
- store_attribute_with_nil_value - if true, Dynamoid keeps an attribute with a nil value in a document. Otherwise Dynamoid removes it while saving a document. Default is nil, which behaves the same as false
- models_dir - the dynamoid:create_tables rake task loads DynamoDB models from this directory. Default is ./app/models
- application_timezone - Dynamoid converts all datetime fields to the specified time zone when it loads data from the storage. Acceptable values are :utc, :local (to use the system time zone) and a time zone name, e.g. Eastern Time (US & Canada). Default is utc
- dynamodb_timezone - when a datetime field is stored in string format, Dynamoid converts it to the specified time zone when it saves a value to the storage. Acceptable values are :utc, :local (to use the system time zone) and a time zone name, e.g. Eastern Time (US & Canada). Default is utc
- store_datetime_as_string - if true, Dynamoid stores :datetime fields in ISO 8601 string format. Default is false
- store_date_as_string - if true, Dynamoid stores :date fields in ISO 8601 string format. Default is false
- store_empty_string_as_nil - store an attribute's empty String value as NULL. Default is true
- store_boolean_as_native - if true, Dynamoid stores boolean fields as native DynamoDB boolean values. Otherwise boolean fields are stored as the string values 't' and 'f'. Default is true
- store_binary_as_native - if true, Dynamoid stores binary fields as native DynamoDB binary values. Otherwise binary fields are stored as Base64 encoded string values. Default is false
- backoff - a hash: the key is a backoff strategy (symbol), the value is the parameters for the strategy. Used in batch operations. Default is nil
- backoff_strategies - a hash that contains all available strategies. Default is { constant: ..., exponential: ...}
- log_formatter - overrides the default AWS SDK formatter. There are several canned formatters: Aws::Log::Formatter.default, Aws::Log::Formatter.colored and Aws::Log::Formatter.short. See the Aws::Log::Formatter AWS SDK documentation in order to provide your own formatter
- http_continue_timeout - the number of seconds to wait for a 100-continue HTTP response before sending the request body. Default option value is nil. If not specified, the effective value is 1
- http_idle_timeout - the number of seconds an HTTP connection is allowed to sit idle before it is considered stale. Default option value is nil. If not specified, the effective value is 5
- http_open_timeout - the number of seconds to wait when opening an HTTP session. Default option value is nil. If not specified, the effective value is 15
- http_read_timeout - the number of seconds to wait for HTTP response data. Default option value is nil. If not specified, the effective value is 60
- create_table_on_save - if true, Dynamoid creates the corresponding table in DynamoDB when a model is persisted and the table doesn't exist yet. Default is true
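As a rough illustration of how several of these options combine, the sketch below keeps a handful of settings for a DynamoDB Local setup in a plain hash and copies them onto a config object. The option names come from the list above. The LOCAL_SETTINGS hash, the apply_settings and max_sync_wait helpers, the myapp_test namespace and the chosen values are hypothetical and exist only for this example. It also assumes that the config object yielded by Dynamoid.configure exposes a writer for each option, as in the access_key example earlier.

```ruby
# Hypothetical settings for running against DynamoDB Local.
# Option names are taken from the list above; the values are only examples.
LOCAL_SETTINGS = {
  endpoint: 'http://localhost:8000', # DynamoDB Local
  namespace: 'myapp_test',           # hypothetical table-name prefix
  capacity_mode: :on_demand,
  warn_on_scan: false,
  sync_retry_max_times: 30,
  sync_retry_wait_seconds: 1
}.freeze

# Copies each setting onto any object that exposes a matching writer method.
def apply_settings(config, settings)
  settings.each { |name, value| config.public_send("#{name}=", value) }
  config
end

# Upper bound on the time spent waiting between completion checks, in seconds.
def max_sync_wait(settings)
  settings.fetch(:sync_retry_max_times) * settings.fetch(:sync_retry_wait_seconds)
end

# With the gem loaded, the hash could be applied like this (assumption:
# every key above has a writer on the yielded config object):
#   Dynamoid.configure { |config| apply_settings(config, LOCAL_SETTINGS) }
```

Keeping the values in one hash makes it easy to see the worst-case wait for synchronous table creation (here 30 checks with 1 second between them) next to the other settings.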
Dynamoid supports basic, ActiveRecord-like optimistic locking on save operations. Simply add a lock_version column to your table like so:
class MyTable
# ...
field :lock_version, :integer
# ...
end
In this example, all saves to MyTable will raise a Dynamoid::Errors::StaleObjectError if a concurrent process loaded, edited, and saved the same row. Your code should trap this exception, reload the row (so that it will pick up the newest values), and try the save again.
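The trap, reload, retry cycle described above can be sketched as a small helper. This is only one possible shape: with_stale_retry and its max_attempts parameter are hypothetical names, and the exception class is passed in as an argument so the helper does not depend on the gem being loaded.

```ruby
# Retries the block when an optimistic-locking conflict is raised.
# `stale_error` is the exception class to rescue; with Dynamoid this would be
# Dynamoid::Errors::StaleObjectError. `max_attempts` is a hypothetical knob.
def with_stale_retry(stale_error, max_attempts: 3)
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue stale_error
    # Run the block again (it should reload the row) until attempts run out.
    retry if attempts < max_attempts
    raise
  end
end

# Intended use with a Dynamoid model (not run here; `id` and the `name`
# field are hypothetical):
#
#   with_stale_retry(Dynamoid::Errors::StaleObjectError) do
#     row = MyTable.find(id)   # reload to pick up the newest lock_version
#     row.name = 'new name'
#     row.save
#   end
```

The reload happens inside the block, so every attempt starts from the latest stored values rather than re-saving stale ones.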
Calls to update and update! also increment the lock_version; however, they do not check the existing value. This guarantees that an update operation will cause a concurrent save operation to raise an exception, but a save operation will never cause an update to fail. Thus, update is useful and safe only for atomic operations (e.g. incrementing a value, adding to or removing from a set), and should not be used in a read-modify-write pattern.
You can use several methods that run efficiently in batch mode, like .find_all and .import. Backoff affects Query and Scan operations as well.
The backoff strategy will be used when, for any reason, some items could not be processed as part of a batch mode command. Operations will be re-run to process these items.
Exponential backoff is the recommended way to handle exceeded throughput limits and throttling on a table.
There are two built-in strategies - constant delay and truncated binary exponential backoff. By default no backoff is used but you can specify one of the built-in ones:
Dynamoid.configure do |config|
config.backoff = { constant: 2.second }
end
Dynamoid.configure do |config|
config.backoff = { exponential: { base_backoff: 0.2.seconds, ceiling: 10 } }
end
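To make the idea of truncated binary exponential backoff concrete, here is a minimal sketch that only computes the delay schedule and does not sleep. The truncated_backoff_delay helper is hypothetical, and the way it interprets base_backoff and ceiling (the delay doubles on each retry and stops growing after ceiling retries) is an assumption made for illustration, not a description of Dynamoid's internals.

```ruby
# Delay (in seconds) before the given retry under truncated binary exponential
# backoff: the delay doubles on each retry until it stops growing.
# Assumption for this sketch: `ceiling` is the retry number after which the
# delay no longer increases.
def truncated_backoff_delay(retry_number, base_backoff:, ceiling:)
  raise ArgumentError, 'retry_number must be >= 1' if retry_number < 1

  exponent = [retry_number - 1, ceiling - 1].min
  base_backoff * (2**exponent)
end

delays = (1..6).map { |n| truncated_backoff_delay(n, base_backoff: 0.5, ceiling: 4) }
puts delays.inspect
# prints [0.5, 1.0, 2.0, 4.0, 4.0, 4.0]
```

The truncation is what keeps a long run of throttled retries from waiting ever longer between attempts.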
You can specify just the strategy, without any arguments, to use its default presets:
Dynamoid.configure do |config|
config.backoff = :constant
end
You can use your own strategy in the following way:
Dynamoid.configure do |config|
config.backoff_strategies[:custom] = lambda do |n|
-> { sleep rand(n) }
end
config.backoff = { custom: 10 }
end
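A slightly fuller custom strategy, following the same shape as the example above (an outer lambda that receives the configured value and returns a lambda that is called before each retry), might look like the sketch below. The build_jitter_strategy helper, the :jitter key and the sleeper and rng parameters are hypothetical. They are injection points so the strategy can be exercised without actually sleeping.

```ruby
# Builds a randomized ("jittered") backoff strategy. The outer lambda receives
# the configured value (a positive maximum delay in seconds); the inner lambda
# sleeps for a random time whose upper bound doubles on each call, capped at
# the maximum. `sleeper` and `rng` are hypothetical hooks for testing.
def build_jitter_strategy(sleeper: Kernel.method(:sleep), rng: Random.new)
  lambda do |max_delay|
    attempt = 0
    lambda do
      attempt += 1
      # Upper bound: 0.2, 0.4, 0.8, ... seconds, never more than max_delay.
      upper = [max_delay.to_f, 0.1 * (2**attempt)].min
      sleeper.call(rng.rand(upper))
    end
  end
end

# With the gem loaded, it could be registered like this (assumption: the
# strategy is invoked the same way as the custom example above):
#   Dynamoid.configure do |config|
#     config.backoff_strategies[:jitter] = build_jitter_strategy
#     config.backoff = { jitter: 5 }
#   end
```

Randomizing the delay spreads retries from concurrent processes apart, so they are less likely to hit the table at the same moment again.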
There are a few Rake tasks available out of the box:
rake dynamoid:create_tables
rake dynamoid:ping
To use them in a non-Rails application, require them explicitly:
# Rakefile
Rake::Task.define_task(:environment)
require 'dynamoid/tasks'
The Rake tasks depend on the :environment task, so it should be declared as well.
In a test environment you will most likely want to clean the database between test runs to keep tests completely isolated. This can be achieved like so:
module DynamoidReset
def self.all
Dynamoid.adapter.list_tables.each do |table|
# Only delete tables in our namespace
if table =~ /^#{Dynamoid::Config.namespace}/
Dynamoid.adapter.delete_table(table)
end
end
Dynamoid.adapter.tables.clear
# Recreate all tables to avoid unexpected errors
Dynamoid.included_models.each { |m| m.create_table(sync: true) }
end
end
# Reduce noise in test output
Dynamoid.logger.level = Logger::FATAL
If you're using RSpec you can invoke DynamoidReset like so:
RSpec.configure do |config|
config.before(:each) do
DynamoidReset.all
end
end
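The namespace check in DynamoidReset matches table names against a regular expression built from the namespace, which also matches any other namespace that merely starts with the same characters. A stricter filter can be sketched as below. It assumes table names are built as the namespace followed by an underscore and the table name (as in the default dynamoid_users). The tables_in_namespace helper and the sample table names are hypothetical.

```ruby
# Returns only the tables that belong to `namespace`, assuming table names are
# built as "<namespace>_<table>" (as in the default "dynamoid_users").
def tables_in_namespace(table_names, namespace)
  prefix = "#{namespace}_"
  table_names.select { |name| name.start_with?(prefix) }
end

tables = %w[myapp_test_users myapp_test_posts myapp_test2_users other_users]
puts tables_in_namespace(tables, 'myapp_test').inspect
# prints ["myapp_test_users", "myapp_test_posts"]
```

Using a plain prefix comparison also avoids surprises if the namespace ever contains characters that have a special meaning in a regular expression.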
In addition, the first test for each model may fail if the relevant models are not included in included_models. This can be fixed by adding this line before the DynamoidReset module:
Dir[File.join(Dynamoid::Config.models_dir, '**/*.rb')].sort.each { |file| require file }
Note that this will require all models in your models folder - you can also explicitly require only certain models if you would prefer to.
In Rails, you may also want to ensure you do not delete non-test data accidentally by adding the following to your test environment setup:
raise "Tests should be run in 'test' environment only" if Rails.env != 'test'
Dynamoid.configure do |config|
config.namespace = "#{Rails.application.railtie_name}_#{Rails.env}"
end
There is a config option logger. Dynamoid writes requests and responses to DynamoDB using this logger at the debug level. So, to troubleshoot and debug issues, just set the logger level to debug:
class User
include Dynamoid::Document
field :name
end
Dynamoid.config.logger.level = :debug
Dynamoid.config.endpoint = 'http://localhost:8000'
User.create(name: 'Alex')
# => D, [2019-05-12T20:01:07.840051 #75059] DEBUG -- : put_item | Request "{\"TableName\":\"dynamoid_users\",\"Item\":{\"created_at\":{\"N\":\"1557680467.608749\"},\"updated_at\":{\"N\":\"1557680467.608809\"},\"id\":{\"S\":\"1227eea7-2c96-4b8a-90d9-77b38eb85cd0\"}},\"Expected\":{\"id\":{\"Exists\":false}}}" | Response "{}"
# => D, [2019-05-12T20:01:07.842397 #75059] DEBUG -- : (231.28 ms) PUT ITEM - ["dynamoid_users", {:created_at=>0.1557680467608749e10, :updated_at=>0.1557680467608809e10, :id=>"1227eea7-2c96-4b8a-90d9-77b38eb85cd0", :User=>nil}, {}]
The first line is the body of the HTTP request and response. The second line is Dynamoid's internal logging of the API call (PUT ITEM in our case) with timing (231.28 ms).
Dynamoid borrows code, structure, and even its name very liberally from the truly amazing Mongoid. Without Mongoid to crib from none of this would have been possible, and I hope they don't mind me reusing their very awesome ideas to make DynamoDB just as accessible to the Ruby world as MongoDB.
Also, without contributors the project wouldn't be nearly as awesome. So many thanks to:
- Chris Hobbs
- Logan Bowers
- Lane LaRue
- Craig Heneveld
- Anantha Kumaran
- Jason Dew
- Luis Arias
- Stefan Neculai
- Philip White *
- Peeyush Kumar
- Sumanth Ravipati
- Pascal Corpet
- Brian Glusman *
- Peter Boling *
- Andrew Konchin *
* Current Maintainers
Running the tests is fairly simple. You should have an instance of DynamoDB running locally. Follow these steps to set up your test environment.
- First download and unpack the latest version of DynamoDB. We have a script that will do this for you if you use bash, and homebrew on a Mac.
  bin/setup
- Start the local instance of DynamoDB, listening on port 8000.
  bin/start_dynamodblocal
- Lastly, use rake to run the tests.
  rake
- When you are done, remember to stop the local test instance of DynamoDB.
  bin/stop_dynamodblocal
If you run into issues, please try these steps first. NOTE: You can use any version manager: rvm, rbenv, chruby, asdf-ruby
asdf install ruby 3.1.1
asdf local ruby 3.1.1
gem update --system
bundle install
See SECURITY.md.
This documentation may be useful for the contributors:
- https://docs.aws.amazon.com/sdk-for-ruby/v3/developer-guide/welcome.html
- https://docs.aws.amazon.com/sdk-for-ruby/v3/api/index.html
The gem is available as open source under the terms of the MIT License. See LICENSE for the official Copyright Notice.