Cohorts and their meaning

***First, an update on the status of the alpha release — It is imminent!! We hit a few snags, but will have you authorized to start entering data this week.***

What are cohorts? 

Cohorts are groups of individuals who share a set of similarities.

How can we exploit this type of grouping?

Let’s say we want to offer advice to an individual (let’s call this person A) on how he/she can alter training/nutrition to perform better. One way to do this would be to randomly pick a training protocol/nutrition system, ask the person to follow it for a month, and see what happens. The probability that this will be successful is small.

How do we increase the probability that we provide advice that works?

We find a group of individuals who, approximately 30 days ago (+/-), were very similar to A in multiple categories. We then narrow this group down to those individuals who also improved their performance output over the past 30 days. As we are all interested in improving overall work capacity, a max effort lift, or some specific movement/skill, this information would be very useful to individual A. Now, what did this group of people do over 30 days to improve their performance? It’s very likely that implementing these changes could have a significant effect on individual A’s performance.
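To make the narrowing step concrete, here is a minimal Python sketch. It assumes the group of similar individuals has already been found, and everything in it (the `Record` fields, the scores, the listed behaviour changes) is invented for illustration and is not our actual data model. It keeps the members who improved over the 30 days and counts which changes they had in common.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Record:
    """One hypothetical cohort member (illustrative fields only)."""
    name: str
    score_then: float   # performance measure 30 days ago
    score_now: float    # same measure today
    changes: list = field(default_factory=list)  # behaviour changes made in between


def improved_members(cohort, min_gain=0.0):
    """Keep only members whose score rose by more than min_gain."""
    return [r for r in cohort if r.score_now - r.score_then > min_gain]


def common_changes(members):
    """Count how many of the given members made each behaviour change."""
    counts = Counter()
    for r in members:
        counts.update(set(r.changes))  # set(): count each change once per person
    return counts.most_common()


# Made-up example data: four people who resembled A 30 days ago.
cohort = [
    Record("p1", 200.0, 215.0, ["more protein", "extra squat day"]),
    Record("p2", 205.0, 204.0, ["more protein"]),
    Record("p3", 198.0, 210.0, ["extra squat day"]),
    Record("p4", 202.0, 211.0, ["extra squat day", "more sleep"]),
]

improved = improved_members(cohort)
ranked = common_changes(improved)
print([r.name for r in improved])
print(ranked)
```

The most frequent changes among the improvers are only candidates for advice. A count like this says nothing about whether a change caused the improvement.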

How do we determine who is in a cohort?

We use what’s referred to as a ‘graphical model’ to determine similarity between individuals and to assess what the commonalities are between/among these groups of highly similar people. Graphical model analysis will compare large groups of individuals and find those who share many edges with individual A. Read: those who are similar to individual A in many aspects. The thicker the edge (i.e., the closer the values are), the more alike the individuals may be. See the figure below for an illustration.

We compare individual A (highlighted by red arrow) to the entire group, with respect to multiple variables (work capacity, muscle group bias, height, weight, race, age, etc.). We find that he/she is more similar to about 30 people within the group (circled) than to the other ~500 people. This allows us to focus in on this group and identify changes they have made to improve. We can then use this as a starting point to make suggestions to individual A.
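As a rough illustration of the "thicker edge = more alike" idea, the Python sketch below scores every person against individual A and keeps the closest few. Note that this is not the graphical model itself: it swaps in a much simpler stand-in (a Euclidean distance on standardized values, turned into a weight between 0 and 1), and the variable names and numbers are invented for the example.

```python
import math
import statistics


def standardize(population, keys):
    """Z-score each variable so that its units do not dominate the distance."""
    stats = {}
    for k in keys:
        values = [p[k] for p in population.values()]
        # Fall back to 1.0 if a variable has no spread, to avoid dividing by zero.
        stats[k] = (statistics.mean(values), statistics.pstdev(values) or 1.0)
    return {
        name: {k: (p[k] - stats[k][0]) / stats[k][1] for k in keys}
        for name, p in population.items()
    }


def edge_weight(a, b, keys):
    """Similarity in (0, 1]: 1.0 for identical profiles, smaller as they diverge."""
    distance = math.sqrt(sum((a[k] - b[k]) ** 2 for k in keys))
    return 1.0 / (1.0 + distance)


def closest(population, target, keys, k):
    """Return the k people with the heaviest edges to the target."""
    z = standardize(population, keys)
    weights = {
        name: edge_weight(z[target], z[name], keys)
        for name in population
        if name != target
    }
    return sorted(weights, key=weights.get, reverse=True)[:k]


# Made-up measurements for individual A and four others.
population = {
    "A":  {"height_cm": 180, "weight_kg": 82, "work_capacity": 310},
    "p1": {"height_cm": 181, "weight_kg": 83, "work_capacity": 305},
    "p2": {"height_cm": 165, "weight_kg": 60, "work_capacity": 220},
    "p3": {"height_cm": 179, "weight_kg": 80, "work_capacity": 315},
    "p4": {"height_cm": 192, "weight_kg": 105, "work_capacity": 400},
}
keys = ["height_cm", "weight_kg", "work_capacity"]
nearest = closest(population, "A", keys, k=2)
print(nearest)
```

Standardizing first matters because otherwise a variable measured in large numbers (work capacity here) would swamp the others. Categorical variables such as race would need a different treatment than this distance provides.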

Cohort analysis = Accurate advice 


5 thoughts on “Cohorts and their meaning”

  1. Deepak, we spoke briefly about this, if this is not the place, email me…
    as you know we’ve been doing a looser version of this for the past 5 years on the blog and looking for these “groups” by only using invalid techniques of years of “watching” and “eyeing” it and being in the trenches….
    was wondering if this creation of the group will lead to some error that we did along the way…that the other small groups are smaller but performing better…do the larger groups in one area as on the chart indicate “better” performance?..or more aligned due to principles of the “norm” of improvement in training – as most will when the training is changed for a small period of time like 30 days…

    if i am not clear, let me know, or at least we can start some ideas on it…and i can give feedback on a large user group

    • deepak8612 says:

      Hi James

      Good to know that you are reading the blog critically.
      A few clarifications:
      1. The ‘groups’ or clusters of individuals are merely individuals who are similar to one another in many ways (height, weight, work capacity, power generation, muscle group bias etc)
      • This is determined by stringent statistical analysis.
      2. The size of the group merely indicates how many individuals belong to that group and gives no indication of ‘performance’. But individuals within a particular group would have similar levels of performance.
      3. The chart is a ‘snapshot’ in time that takes into account all accumulated data for the individuals (and thus is not limited to only 30 days)
      4. Once we have established these groups we can then look at them over any period of time. 30 days was merely used as an example. For the sake of ease let’s consider the same example: individual A is considered equivalent in many metrics to the rest of the individuals in that group. The nuance here is that the metrics compared for individual A are where he/she is currently. These measures are then compared to the rest of the population as they were 30 days ago (for argument’s sake; this could be 60 days or 6 months). Now you have matched individual A to a group of similar individuals. But the advantage is that you also have data from the rest of the group for the 30/60 days past the point where you compared them. So you know how they have changed over that time period and what behavioral changes may/may not have led to those improvements (or diminished performance). This allows us to give you real behavior changes (nutrition/training/supplementation) that led to improvements in individuals who are similar to you (or in this case individual A).

      I hope I have made some sense. If not, please let me know. I can get Dan (the resident stats PhD) to be more accurate, if needed

      • deepak, getting it..
        just have seen outlier groups before that followed different programming in exercise for instance and nutrition as well that got just as good results with different templates…so was wondering if i am on the wrong page in asking that question that “is it possible that outlier groups less significant in number” get “results”?

  2. Hi James,

    Thanks for the comments! Definitely a valid point. I think that we’d expect outlying groups to (potentially) contain relevant information with regard to effective programming. In looking at and analyzing the significance of these smaller groups, what we’re looking for is how consistent the results are in terms of improvement in work capacity, load, etc. I think a great example is what Blair Morrison has done with a one-on-one-off training schedule (where each day-on consists of three workouts focusing on different elements). While this is certainly different from what would be considered a standard approach to training, it certainly has worked for him (4th at the CF 2011 Games, up from 23rd in 2010). It’s not to say this would be right for everyone (or isn’t potentially confounded with other variables), but it would warrant consideration and would probably be considered a statistically significant cohort. What we’re generally looking for is what appear to be generally effective training regimens, and pointing them out to people in a sensible way. So, for example, recommending a training regimen which worked for an experienced athlete to a less experienced one would probably not make a lot of sense. Does this answer your question? Please let me know, thanks!

    Dan Samarov
