Is it necessary to doubly link data in MongoDB, similar to MySQL, so that related entries have ids pointing to each other? In other words, is there a performance difference between:
db.events.find({userids: myid}).fetch()
and
db.events.find({_id: {$in: [1, 2, 3, 4]} }).fetch()
Usually, $in queries are surprisingly fast, but whether it's the right approach depends on the cardinality of your data (max. events per user, max. users per event) and on your query patterns.
In general, the idea of indexes is to avoid having to link back and forth. That makes the cumbersome updates you mentioned unnecessary. It's easier to query, easier to maintain, and easier to paginate.
The last argument is particularly important, and it depends on your query patterns: let's say you want to show the ten most recent events a user attended. You can create an index {userids: 1, eventdate: -1} to match that query, so it doesn't have to pull or iterate over all the events the user went to.
If you wanted to query using the other method, you'd have to store the eventdate in the user, which seems rather awkward.
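To make the index-based approach concrete, here is a minimal sketch. It assumes `events` is a collection handle that exposes `createIndex` and `find(...).sort(...).limit(...)` (as the mongo shell and the Node.js driver do). The helper names, the sample document, and its values are made up for illustration.

```javascript
// Hypothetical document shape: each event lists the ids of its attendees.
const exampleEvent = {
  _id: 1,
  eventdate: new Date("2024-05-01"), // made-up value
  userids: [42, 43],                 // made-up user ids
};

// Create the compound index mentioned above.
// Returns whatever the collection's createIndex returns (a promise in the
// Node.js driver).
function ensureRecentEventsIndex(events) {
  return events.createIndex({ userids: 1, eventdate: -1 });
}

// Build a cursor for the most recent events a user attended.
// The filter and the sort both line up with the index above, so the server
// should not need to look at the user's older events.
function recentEventsCursor(events, userId, limit = 10) {
  return events.find({ userids: userId }).sort({ eventdate: -1 }).limit(limit);
}
```

With the other design you would first load the user's list of event ids and then run an $in query with it, and the sort by date would only work on the events side.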
On the other hand, if events are huge, you'll run into problems with the object size of the event (think 1 million participants). You might also want to denormalize the names of the participants for display purposes, which makes the object even larger.
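For a rough sense of scale, here is a back-of-the-envelope sketch. It assumes the participant ids are 12-byte ObjectIds stored in a plain BSON array and counts only that array, nothing else in the event document. The function name is made up; the 16 MiB figure is MongoDB's maximum BSON document size.

```javascript
// MongoDB's maximum BSON document size: 16 MiB.
const MAX_BSON_DOCUMENT_BYTES = 16 * 1024 * 1024;

// Estimated BSON size of an array holding `count` ObjectIds.
// A BSON array is stored as a document whose keys are "0", "1", "2", ...
function objectIdArrayBytes(count) {
  let total = 4 + 1; // int32 length prefix + trailing 0x00 of the array
  for (let i = 0; i < count; i++) {
    // type byte + key digits + key terminator + 12-byte ObjectId value
    total += 1 + String(i).length + 1 + 12;
  }
  return total;
}

const participants = 1000000;
const bytes = objectIdArrayBytes(participants);
console.log(bytes, bytes > MAX_BSON_DOCUMENT_BYTES);
```

Under these assumptions the id array alone comes to roughly 19.9 million bytes, which is already over the limit before any denormalized names are added.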
If you choose to work with $in queries, keep in mind that:
- Performance degrades as the array becomes large. I'm not sure what causes this, but I had trouble when the array grew past 1 or 2 thousand(!) elements.
- If the ids are spread far apart, MongoDB might have to hit a lot of buckets. Depending on the type of keys you're using, this can be painful (e.g. if you're using ObjectIds and the user attended an event every day, you'll hit every _id bucket you have, including the older ones, which can be expensive).
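To check how far apart a given list of ids actually is, you can use the fact that the first 4 bytes of an ObjectId are a creation timestamp in seconds. The sketch below assumes the ids are available as 24-character hex strings; both function names are made up for illustration.

```javascript
// Seconds since the Unix epoch encoded in the first 4 bytes (8 hex chars)
// of an ObjectId given as a 24-character hex string.
function objectIdSeconds(hexId) {
  return parseInt(hexId.slice(0, 8), 16);
}

// Time span, in days, between the oldest and newest id in the list.
// A large span means the $in lookup touches old and new parts of the
// _id index alike.
function idSpanDays(hexIds) {
  if (hexIds.length === 0) return 0;
  const seconds = hexIds.map(objectIdSeconds);
  return (Math.max(...seconds) - Math.min(...seconds)) / 86400;
}
```

A user who attended one event per day for a year would give a span of about 365 days, so the ids in the $in list would be scattered across a year's worth of the index.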