2015 | Kelly R. Evenson, Michelle M. Goto, and Robert D. Furberg
This systematic review evaluates the validity and reliability of popular consumer-wearable activity trackers (Fitbit and Jawbone) in estimating steps, distance, physical activity, energy expenditure, and sleep. The review includes 22 studies (20 on adults, 2 on youth) published through July 31, 2015. For step counting, both Fitbit and Jawbone showed high correlation with laboratory-based criteria (Pearson or intraclass correlation coefficients >0.80). Distance estimation was assessed in one Fitbit study, which found over-estimation at slower speeds and under-estimation at faster speeds. Two field-based studies of physical activity showed mixed results: one found a high correlation (Spearman correlation coefficient 0.86), and the other found a wide range of correlations (intraclass correlation coefficient 0.36–0.70). Energy expenditure was often under-estimated. For sleep, total sleep time and sleep efficiency were over-estimated, while wake after sleep onset was under-estimated. Interdevice reliability was reported in seven Fitbit studies but in none for Jawbone. Walking- and running-based trials indicated high interdevice reliability for steps, distance, and energy expenditure. The review concludes that Fitbit trackers generally have higher validity for steps and lower validity for energy expenditure and sleep, and that few studies have assessed distance and physical activity. High interdevice reliability was observed for steps, distance, energy expenditure, and sleep for certain Fitbit models. As new activity trackers enter the market, documenting their measurement properties can guide their use in research settings.
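The validity statistics named above (Pearson and Spearman correlation, and the direction of over- or under-estimation) can be illustrated with a minimal Python sketch. The step counts below are hypothetical values invented for illustration and are not data from any reviewed study. The function names are likewise assumptions, and the intraclass correlation coefficient is omitted for brevity.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def ranks(values):
    """Ranks starting at 1, with tied values given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def mean_percent_error(tracker, criterion):
    """Mean signed error in percent; negative means under-estimation."""
    errors = [(t - c) / c for t, c in zip(tracker, criterion)]
    return 100 * sum(errors) / len(errors)


# Hypothetical step counts (illustrative only, not study data).
criterion_steps = [1000, 1500, 2000, 2500, 3000]  # e.g. direct observation
tracker_steps = [980, 1510, 1950, 2540, 2960]     # e.g. wearable tracker

print("Pearson: ", round(pearson(tracker_steps, criterion_steps), 3))
print("Spearman:", round(spearman(tracker_steps, criterion_steps), 3))
print("Mean % error:", round(mean_percent_error(tracker_steps, criterion_steps), 2))
```

A high correlation alone does not rule out systematic bias, which is why the signed error is reported alongside it: a tracker can correlate strongly with the criterion while still consistently under- or over-estimating.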