[#16] user based cross device tracking #17

Merged 3 commits on Apr 4, 2017

Conversation

gingerlime
Collaborator

  • adding an optional user_id attribute to experiments
  • when this attribute is present, variant selection is based on
    a hash function seeded with the user_id (along with the
    experiment name and the type of caller)
  • the tracking uuid for Gimel and Keen is also derived from the
    user_id, so the same user is only tracked once across
    different devices, unless the goal is not unique
  • updated tests and added adapter tests for Gimel and Keen.io
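
The cross-device idea in the third bullet can be sketched roughly as below. This is not the code from this PR: it assumes Node's built-in `crypto` module, and `trackingId`, `track` and the seed layout are hypothetical names chosen for illustration.

```javascript
const crypto = require('crypto');

// Hypothetical helper: derive a stable tracking id from the experiment name,
// the goal name and the user_id, so every device produces the same id.
function trackingId(experimentName, goalName, userId) {
  const seed = `${experimentName}.${goalName}.${userId}`;
  return crypto.createHash('sha1').update(seed).digest('hex');
}

// Stand-in for a backend that de-duplicates on the id: the user is counted
// once, no matter which device reported the event.
const seen = new Set();
function track(id) {
  if (seen.has(id)) return false; // already counted
  seen.add(id);
  return true;
}

const fromPhone = trackingId('checkout button', 'purchase', 'user-42');
const fromLaptop = trackingId('checkout button', 'purchase', 'user-42');
console.log(track(fromPhone));  // true
console.log(track(fromLaptop)); // false
```

For a goal that is not unique, the id would presumably stay random so that repeated events are all counted.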

Yoav added 2 commits March 19, 2017 15:48
Collaborator

@joker-777 joker-777 left a comment

Looks very good in general. Only a few comments.

CHANGES Outdated
* 0.15.0-rc.1 BREAKING changes (only if you wrote your own tracking adapter)
- User-based / Cross-device tracking
- If you use your own tracking adapter, you'll have to change it.
* `experiment_start` now accepts the `experiment` as the first parameter
Collaborator

I would use the word 'expects' since you really have to pass this parameter. 'accepts' sounds quite optional.

Collaborator Author

Yes, great suggestion!
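
For anyone maintaining their own adapter, the change described in the CHANGES entry might look roughly like this. It is only a sketch: the `events` array is a hypothetical in-memory sink, and the shape of the props object is assumed.

```javascript
// Hypothetical in-memory sink standing in for a real tracking backend.
const events = [];

const customAdapter = {
  // The first argument is now the experiment object, not its name.
  experiment_start(experiment, variant) {
    events.push({ experiment: experiment.name, variant, event: 'participate' });
  },
  // _props is passed in by the experiment; it is unused here, hence the underscore.
  goal_complete(experiment, variant, goal_name, _props) {
    events.push({ experiment: experiment.name, variant, event: goal_name });
  },
};

// Stand-in for an experiment; only the fields this adapter reads are included.
const experiment = { name: 'checkout button', user_id: 'user-42' };
customAdapter.experiment_start(experiment, 'green');
customAdapter.goal_complete(experiment, 'green', 'purchase', { unique: true }); // assumed props shape
console.log(events.length); // 2
```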

  },
- goal_complete: function(experiment_name, variant, event_name) {
-   keen_client.addEvent(experiment_name, {variant: variant, event: event_name});
+ goal_complete: function(experiment, variant, event_name, _props) {
Collaborator

why _props when you don't use it?

Collaborator Author

By convention, parameters that are passed in but not used are marked with a `_` prefix. I think it's useful to know that it's being passed in from the experiment, even if it's not used.

- callback = @_remove_uuid(item.properties.uuid)
- @_ajax_get(@url, item.properties, callback)
+ callback = @_remove_quuid(item.properties._quuid)
+ @_ajax_get(@url, utils.omit(item.properties, '_quuid'), callback)
Collaborator

Why do you have to remove the _quuid here?

Collaborator Author

There's no need to send it over the wire.
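
As an illustration of the point, the queued item keeps its `_quuid` for local bookkeeping while the request payload goes out without it. The `omit` function below is a hypothetical stand-in for `utils.omit`, not the library's implementation, and the property values are made up.

```javascript
// Hypothetical stand-in for utils.omit: copy an object without the given keys.
function omit(obj, ...keys) {
  const copy = { ...obj };
  for (const key of keys) delete copy[key];
  return copy;
}

const item = { properties: { experiment: 'checkout button', uuid: 'abc', _quuid: 'q-1' } };

// The payload sent over the wire leaves out the queue-local id.
const payload = omit(item.properties, '_quuid');
console.log(payload); // { experiment: 'checkout button', uuid: 'abc' }

// The queued item itself is untouched, so it can still be removed by _quuid later.
console.log('_quuid' in item.properties); // true
```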

- @experiment_start: (experiment_name, variant) =>
-   @_track(@namespace, "#{experiment_name} | #{variant}", 'Visitors')
+ @experiment_start: (experiment, variant) =>
+   @_track(@namespace, "#{experiment.name} | #{variant}", 'Visitors')
Collaborator

the third parameter should be an object, no?

Collaborator Author

Not in this tracking adapter. The tracking adapters don't share any code, and each can implement its internals differently.

Collaborator

In general I think you should try to keep the interfaces of the adapters similar.

Collaborator Author

The interface is the same. This is an internal implementation detail.


- @goal_complete: (experiment_name, variant, goal) =>
-   @_track(@namespace, "#{experiment_name} | #{variant}", goal)
+ @goal_complete: (experiment, variant, goal, _props) =>
Collaborator

In PersistentQueueKeenAdapter the third parameter is the goal_name and here it is just goal. I think you should stay consistent with it.

Collaborator Author

Thanks for spotting. Will change this.

+ _random: (salt) ->
+   return utils.random() unless @user_id
+   seed = "#{@name}.#{salt}.#{@user_id}"
+   utils.random(seed)
Collaborator

Why do you need this seed for the user_id? Can't you just always use utils.random("#{@name}.#{salt}.#{@user_id}")?

Collaborator Author

Are you asking why I'm assigning it to a variable? That's not strictly necessary; I can pass it directly as you suggested. But the variable might make it a bit easier to see what the string is used for.

Collaborator

No, that wasn't my question. My question is why you can't use `utils.random("#{@name}.#{salt}.#{@user_id}")` even when there is no @user_id, simply writing:

_random: ->
  utils.random("#{@name}.#{@user_id}")

Collaborator Author

That won't be random though...

Collaborator

Can you quickly explain how it works with the salt?

Collaborator Author

Take a look at the implementation. It's small and simple. It hashes the seed and turns the hash into a number between 0 and 1. A different seed for each user produces a different hash, which gets "translated" to a different number. So for a given experiment, salt and user_id, we get a pseudo-random number. If you always send the same user_id (or null), you always get the same result, which isn't (pseudo)random...

Collaborator Author

When we don't have a user_id, using a "real" random number is preferable.

Collaborator Author

#16 has a link to planout which might explain this in more detail.
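
The explanation above can be sketched roughly as follows. This is not Alephbet's actual `utils.random`: it assumes Node's built-in `crypto` module with SHA-1, assumes the unseeded case falls back to `Math.random()`, and `seededRandom`, `random` and `pickVariant` are illustrative names.

```javascript
const crypto = require('crypto');

// Map a seed string to a number in [0, 1): hash it, take the first
// 13 hex digits (52 bits, exactly representable as a double) and scale.
function seededRandom(seed) {
  const hex = crypto.createHash('sha1').update(seed).digest('hex');
  return parseInt(hex.slice(0, 13), 16) / Math.pow(16, 13);
}

// Follows the shape of the _random method under review: deterministic
// when a user_id is present, Math.random() (assumed fallback) otherwise.
function random(experimentName, salt, userId) {
  if (!userId) return Math.random();
  return seededRandom(`${experimentName}.${salt}.${userId}`);
}

// Hypothetical helper: turn the number into one of equally weighted variants.
function pickVariant(variants, r) {
  return variants[Math.floor(r * variants.length)];
}

const variants = ['blue', 'green'];
const first = pickVariant(variants, random('checkout button', 'variant', 'user-42'));
const again = pickVariant(variants, random('checkout button', 'variant', 'user-42'));
console.log(first === again); // true
```

If the seeded path were used even without a user_id, every anonymous visitor would share one seed and land in the same variant, which is why the unseeded fallback is kept.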

Collaborator

ok, thanks

@joker-777
Collaborator

@gingerlime Somehow I missed it. Why do I need to specifically add the user_id check to the goals and also to the triggers? Can't you do it internally? Is there a use case where you add the user_id but want to trigger an experiment even if it doesn't exist?

@gingerlime
Collaborator Author

@joker-777 I wrote about it on this wiki page https://github.com/Alephbet/alephbet/wiki/User-based-and-Cross-device-tracking

> Is there a use case where you add the user_id but want to trigger an experiment even if it doesn't exist?

Unfortunately, I don't think you can mix logged-in and not-logged-in users in the same experiment without getting unreliable results. I tried to cover a couple of cases where this might cause inconsistent results on the wiki page.

> Why do I need to specifically add the user_id check to the goals and also to the triggers? Can't you do it internally?

It might be possible to do this internally, but it definitely complicates things. You'd need to store/cache the user_id. You'd need to know when the user_id is missing, as opposed to when it wasn't meant to be used at all... It's probably a good idea, but we'd need to think carefully and design it to work seamlessly without creating unexpected behaviour. For now, I think it's ok to expect the caller to make sure the user_id is available, and it's not too difficult to cache it externally if you want to "remember" who the user was initially.
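
A sketch of what "expect the caller to make sure the user_id is available" and "cache it externally" could look like. Everything here is hypothetical: the `ab_user_id` key, the `Map` standing in for `localStorage`, and the `start` callback standing in for whatever creates the experiment.

```javascript
// Hypothetical stand-in for localStorage so the sketch runs outside a browser.
const storage = new Map();

// Remember the first user_id seen on this device, so it stays
// available on later page views even when the user is logged out.
function rememberUserId(currentUserId) {
  if (currentUserId && !storage.has('ab_user_id')) {
    storage.set('ab_user_id', String(currentUserId));
  }
  return storage.get('ab_user_id') || null;
}

// Caller-side guard: run the experiment only when a user_id is available.
function maybeStartExperiment(currentUserId, start) {
  const userId = rememberUserId(currentUserId);
  if (!userId) return false; // anonymous visitor: skip the experiment
  start(userId);
  return true;
}

const started = [];
console.log(maybeStartExperiment(null, (id) => started.push(id)));      // false
console.log(maybeStartExperiment('user-42', (id) => started.push(id))); // true
console.log(maybeStartExperiment(null, (id) => started.push(id)));      // true, cached id reused
```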

@joker-777
Collaborator

The only way to achieve this is to add another attribute in addition to user_id, something like `cross_device: true`. I don't think it is too complicated to add some small checks internally. In fact, I think you only need to add `return if @cross_device && !@user_id` to the method which calls the trigger callback and to the add_goal and add_goals methods. I didn't dig into the code though.

@gingerlime
Collaborator Author

Two attributes for one purpose make the Alephbet API ugly. If the intention is to make it transparent, then this should be handled internally with only the user_id attribute. It does make things a bit more complicated, however.

@gingerlime gingerlime merged commit b70fe7e into master Apr 4, 2017
@gingerlime gingerlime deleted the 16-user-based-cross-device-tracking branch April 4, 2017 11:24
2 participants