The Rosalind Franklin Appathon Prize and Tech day: A celebration of pioneering women in STEMM past, present and future.

by Sarah Griffiths @SarahGriff90

Last week Angela Attwood and I attended the Rosalind Franklin Appathon Prize and Tech day. The appathon was set up by UCL Professor Rachel McKendry with funds from her Royal Society Rosalind Franklin Award. The Rosalind Franklin Award recognises outstanding women in science with an idea for how to raise the profile of other women in their field. Professor McKendry’s idea was to hold an appathon with two challenges. The first challenge was to invent an app that would empower women to become leaders in STEMM (science, technology, engineering, mathematics and medicine). The second challenge was to recognise a woman who had pioneered a new app for research, societal good or enterprise. Marcus had nominated me for the second challenge for my work developing the iPad app About Face for teaching emotion recognition to children with autism. Although I was not shortlisted in the end, I was pleased to be invited along to the prize and tech day to celebrate the winners and their apps.

On the day we were shown videos showcasing the shortlisted apps, some of which you can see here. The winner of Challenge 1 was Amazing STEMM Trailblazers, an app for teaching children about influential women in STEMM through games. The runner-up was STEMM Role Models, an app for finding female experts to speak at conferences. My personal favourite in this category was EyeSTEMM, an app which suggests STEMM careers for young users based on photos they upload to show their interests. There were many exceptional apps shortlisted for Challenge 2, all of which would have been worthy winners. The winner was eSexual Health Clinic, developed by epidemiologist Pam Sonnenberg and her team at UCL. It provides Chlamydia test results and aftercare, including ordering antibiotic medication to the user’s local community pharmacy. This app demonstrates the potential of mobile technology to revolutionise health care and reduce strain on the NHS. The runner-up was Findme, developed by psychologist Sue Fletcher-Watson from Edinburgh University. The app aims to teach children with autism social skills such as following eye gaze and gestures. It was created in consultation with individuals with autism, and its effectiveness has been tested in a randomised controlled trial. Also shortlisted was Drink Less, which may be of particular interest to readers of this blog. This app was developed by Professor Susan Michie and her team of health psychology researchers at UCL. The app aims to help people keep track of, and cut down on, their alcohol intake. The data from this app are being used to test how well particular techniques for cutting down work when delivered in an app, rather than by a clinician.


Professor Dame Athene Donald speaking about how to promote women to become leaders in science research.

As well as announcing the winning apps, the afternoon included a number of brilliant talks addressing the challenges and opportunities for women in STEMM industries today. Speakers included Jennifer Glynn, Rosalind Franklin’s sister, who spoke of the male-only common rooms that were the norm in universities in the 1950s, reminding us of how far we have come in terms of gender equality in academia. Baroness Martha Lane Fox, founder of lastminute.com, spoke about gender imbalance in the technology industry (only 17% of tech jobs in the UK are held by women) and emphasised the importance of encouraging women to consider STEMM jobs in order to address skills shortages. Finally, Professor Dame Athene Donald gave an inspirational talk about her experiences as a leading female physicist and her ideas for how to promote women in science. As a woman seeking a career in this field, I found this talk particularly interesting.

The whole day was very inspiring, both in terms of showing the potential of technology to make positive differences in research, health and education and in terms of showcasing the many women who are playing a leading role in this field. Although there are clearly still challenges being faced by women in STEMM, the message of the day was one of confidence that these can be overcome.

 

Medication for cognitive impairment in traumatic brain injury: little evidence to support its use

by Eleanor Kennedy @Nelllor_

This blog originally appeared on the Mental Elf site on 18th January 2016.

Traumatic brain injury (TBI) is classified by the World Health Organization as the leading cause of death and disability among children and young adults worldwide (WHO, 2006, p. 164). An estimated 235 per 100,000 Europeans acquire a brain injury each year, and more than 6 million TBI survivors are already living in Europe (Tagliaferri et al, 2006).

There are many long-lasting consequences of TBI including cognitive, behavioural and emotional problems (Barnes & Ward, 2005). Pharmacotherapy interventions have been suggested to alleviate cognitive impairment in TBI sufferers. The current review aimed to assess the evidence for such interventions (Dougall et al, 2015).

Traumatic brain injury is the leading cause of death and disability among children and young adults worldwide.

Methods

The Cochrane Dementia and Cognitive Improvement Group’s Specialised Register was searched for studies that examined the effectiveness of pharmacological treatment for cognitive impairment in people with traumatic brain injury. The search included both healthcare databases and trial registers. Studies were included if:

  • The study design was either a randomised controlled trial (RCT) or cross-over design study
  • The study investigated one centrally acting pharmacological agent that modulates one or more of the main neurotransmitter systems
  • Participants had experienced a TBI resulting in chronic cognitive impairment at least 12 months prior to assessment

The primary outcomes of interest were performance on psychometric and neuropsychological tests or screening measures of memory and cognitive function, global severity of cognitive impairment, and global impression of change. Acceptability of treatment (as measured by withdrawal from the trial), safety, mortality and subjective benefit were all secondary outcomes.

Analyses were carried out on results from phase one of each included study.

Results

Four studies were included in the review (three from the United States, one from Sweden). Seven RCTs matching the inclusion criteria were found; however, two cross-over design studies could not be included because data from phase one were not available from the authors, and another study was excluded due to the lack of a placebo control. Table 1 summarises the treatments and participants.

Table 1. Treatments and participants in the included studies.

| Study | N (age range) | Treatment | Duration of treatment |
|---|---|---|---|
| Jha et al. 2008 | 51 (16 to 65) | Modafinil; affects histaminergic, serotonergic and glutaminergic activity | 4 weeks |
| Johansson et al. 2012 | 12 (30 to 65) | (−)-OSU6162; monoamine stabiliser agent with dopaminergic and serotonergic effects | 4 weeks |
| Ripley et al. 2014 | 60 (18 to 65) | Atomoxetine; noradrenaline reuptake inhibitor | 2 weeks |
| Silver et al. 2006 | 157 (18 to 50) | Rivastigmine; an acetylcholinesterase and butyrylcholinesterase inhibitor | 12 weeks |

Primary outcome

Neither modafinil nor atomoxetine demonstrated superiority over placebo on any measure of cognition. The effects of rivastigmine were superior on one measure in the current review (CANTAB RVIP −44.54 milliseconds, 95% CI −88.62 to −0.46), but not in the original trial. Rivastigmine was also effective on the same measure in a subgroup of participants with greater cognitive impairment.

Superiority over placebo for (−)-OSU6162 was demonstrated on Trail Making Test A (−9.20 seconds, 95% CI −12.19 to −6.21), Trail Making Test B (−6.20 seconds, 95% CI −7.81 to −4.59) and WAIS-III digit symbol coding (8.60, 95% CI 6.47 to 10.73); however, the score on Trail Making Test D was higher for placebo (53.80 seconds, 95% CI 36.76 to 70.24) (Johansson 2012).

Secondary outcomes

Safety and acceptability were the two secondary outcomes reported on. Participants reported more adverse effects for modafinil and atomoxetine, although this was not statistically supported. One participant in the (−)-OSU6162 trial required a dose reduction due to adverse effects. More participants taking rivastigmine reported nausea compared with those taking placebo (19/80, 23.8% versus 6/77, 7.8%; risk ratio 3.05, 95% CI 1.29 to 7.22). Two people dropped out of the modafinil treatment arm, compared with none in the placebo group. There were no deaths reported in any of the included studies.
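For readers who like to see where figures such as these come from, the risk ratio and its confidence interval can be reproduced from the reported counts. The sketch below is plain Python using the standard log-scale approximation for a risk ratio; it is an illustration, not the review's own analysis code.

```python
import math

# Reported counts for nausea: rivastigmine 19/80 vs placebo 6/77.
events_treat, n_treat = 19, 80
events_ctrl, n_ctrl = 6, 77

risk_treat = events_treat / n_treat   # 0.2375 (23.8%)
risk_ctrl = events_ctrl / n_ctrl      # 0.0779 (7.8%)
rr = risk_treat / risk_ctrl           # ~3.05

# Approximate 95% CI computed on the log scale (standard risk-ratio formula).
se_log_rr = math.sqrt(1/events_treat - 1/n_treat + 1/events_ctrl - 1/n_ctrl)
lower = math.exp(math.log(rr) - 1.96 * se_log_rr)   # ~1.29
upper = math.exp(math.log(rr) + 1.96 * se_log_rr)   # ~7.22

print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

Running this recovers the quoted result (risk ratio 3.05, 95% CI 1.29 to 7.22).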

Strengths and limitations

The review included only randomised controlled trials assessing the effects of centrally acting pharmacological agents for the treatment of chronic cognitive impairment following traumatic brain injury in adults. The inclusion criteria were very strict, and the authors chose to include data only from phase one of each trial. This is a particular strength for cross-over design studies, as it controls for the possibility of long-term treatment effects carrying over once a group is switched from the active drug to placebo. However, two studies were excluded because data from phase one were unavailable.

The limited number of included studies is less a limitation of the review than an indication of the lack of well-controlled research into pharmacological treatments for cognitive impairment following TBI.

Conclusions

There was no evidence to support modafinil or atomoxetine as treatments for cognitive impairment resulting from TBI. There was weak evidence that rivastigmine may be helpful on one measure of cognitive functioning in this review, although the same effect was not significant in the original study, possibly because a different statistical test was used. The finding that (−)-OSU6162 may be superior to placebo must also be interpreted with caution, as the sample size in that group was very small (n=6).

Overall the authors concluded that:

there is insufficient evidence to determine whether pharmacological treatment is effective in chronic cognitive impairment in TBI.

Two of the four included studies had fatigue as their primary outcome, which further suggests that more research in the specific area of cognition may be necessary.

In closing, the review highlights a gap in research into such treatments for TBI. The authors suggest that future research should also focus on outcomes such as neurobehavioural symptoms, as well as cognitive impairment and memory performance.

This review highlights a lack of RCTs that explore the potential value of medication for cognitive impairment following traumatic brain injury.

Links

Primary paper

Dougall D, Poole N, Agrawal N. Pharmacotherapy for chronic cognitive impairment in traumatic brain injury. Cochrane Database of Systematic Reviews 2015, Issue 12. Art. No.: CD009221. DOI: 10.1002/14651858.CD009221.pub2.

Other references

Barnes M, Ward A. (2005) Oxford Handbook of Rehabilitation Medicine. Oxford University Press.

Jha A, Weintraub A, Allshouse A, Morey C, Cusick C, Kittelson J, Gerber D. (2008) A randomized trial of modafinil for the treatment of fatigue and excessive daytime sleepiness in individuals with chronic traumatic brain injury. Journal of Head Trauma Rehabilitation, 23(1), 52–63. doi:10.1097/01.HTR.0000308721.77911.ea (PubMed abstract)

Johansson B, Carlsson A, Carlsson ML, Karlsson M, Nilsson MKL, Nordquist-Brandt E, Rönnbäck L. (2012) Placebo-controlled cross-over study of the monoaminergic stabiliser (-)-OSU6162 in mental fatigue following stroke or traumatic brain injury. Acta Neuropsychiatrica, 24, 266–274. doi:10.1111/j.1601-5215.2012.00678.x [PubMed record]

Ripley DL, Morey CE, Gerber D, Harrison-Felix C, Brenner LA, Pretz CR, Wesnes K. (2014) Atomoxetine for attention deficits following traumatic brain injury: Results from a randomized controlled trial. Brain Injury, 28, 1514–1522. doi:10.3109/02699052.2014.919530 [PubMed abstract]

Silver JM, Koumaras B, Chen M, Mirski D, Potkin SG, Reyes P, Gunay I. (2006) Effects of rivastigmine on cognitive function in patients with traumatic brain injury. Neurology, 67, 748–755. [PubMed abstract]

Tagliaferri F, Compagnone C, Korsic M, Servadei F, Kraus J. (2006) A systematic review of brain injury epidemiology in Europe. Acta Neurochirurgica, 148(3), 255–68; discussion 268. doi:10.1007/s00701-005-0651-y [PubMed abstract]

WHO. (2006) Neurological Disorders: Public Health Challenges. World Health Organisation (p. 232). http://www.who.int/mental_health/neurology/neurological_disorders_report_web.pdf


Research doesn’t just happen in the lab anymore: Mechanical Turk, Prolific Academic, and online testing.

By Michael Dalili @michaeldalili

Over the years, from assessment to analysis, research has steadily shifted from paper to PC. The modern researcher has an ever-growing array of computer-based and online tools at their disposal for everything from data collection to live-streaming presentations of their work. While shifting to computer- or web-based platforms is easier for some areas of research than others, this has proven to work especially well in psychology. These platforms can be used for anything from simply hosting an online version of a questionnaire, to recruiting and testing participants on cognitive tasks. Throughout the course of my PhD, I have increasingly used online platforms for multiple purposes, ranging from participants completing questionnaires online on Bristol Online Survey, to recruiting participants using Amazon Mechanical Turk and completing a task hosted on the Xperiment platform. And I’m not alone! While it’s impossible to estimate just how many researchers are using computer- and web-based platforms to conduct their experiments, we have a better idea of how many researchers are using online crowdsourcing platforms such as Mechanical Turk and Prolific Academic for study recruitment. Spoiler alert: It’s A LOT! In this blog post I will describe these two platforms and give an account of my experiences using them for online testing.


Amazon Mechanical Turk, or MTurk for short, is the leading online crowdsourcing platform. Described as an Internet marketplace for work that requires human intelligence, MTurk was publicly launched in 2005, having previously been used internally to find duplicates among Amazon’s product webpages. It works as follows: Workers (more commonly known as “Turkers”), individuals who have registered on the service, complete Human Intelligence Tasks (known as HITs) created by Requesters, who approve the completed HITs and compensate the Workers. Before accepting a HIT, Workers are shown information about the task, its duration, and the compensation they will receive upon successfully completing it. Right now there are over 280,000 HITs available, ranging widely in the type and duration of task as well as compensation. Amazon claims its Workers number over 500,000, drawn from 190 countries. Workers can be further sub-divided into “Masters” categories, described by Amazon as “an elite group of Workers who have demonstrated superior performance while completing thousands of HITs across the Marketplace”. At the time of writing, there are close to 22,000 Master Workers, with about 3,800 Categorization Masters and over 4,500 Photo Moderation Masters. As you might imagine, Requesters can limit who can complete their HITs by assigning “Qualifications” that Workers must attain before participating in their tasks. Qualifications can range from requiring Masters status to having a specific number of approved HITs. While most Workers are based in the US, the service does boast an impressive gender balance, with about 47% of its users being women. Furthermore, Turkers are generally younger and have a lower income than the general US internet population, but have a similar racial composition. Additionally, many Workers worldwide cite Mechanical Turk as their main or secondary source of income.
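For a sense of what this Requester workflow looks like programmatically, here is a minimal sketch using the boto3 MTurk client. The task title, reward, question file and numbers are purely hypothetical, and this is not how any of the studies mentioned in this post were set up; it is only intended to illustrate the publish–review–approve cycle described above.

```python
import boto3

# Minimal sketch of the Requester side of MTurk (hypothetical task details).
# For testing, point endpoint_url at the Requester sandbox instead of production.
mturk = boto3.client("mturk", region_name="us-east-1")

# Publish a HIT: Workers see the title, duration and reward before accepting.
hit = mturk.create_hit(
    Title="Short psychology questionnaire (hypothetical example)",
    Description="Answer a 10-minute questionnaire about everyday decisions.",
    Reward="1.50",                       # USD, passed as a string
    MaxAssignments=100,                  # how many Workers may complete it
    LifetimeInSeconds=7 * 24 * 3600,     # how long the HIT stays listed
    AssignmentDurationInSeconds=30 * 60,
    Question=open("questionnaire.xml").read(),  # ExternalQuestion/HTMLQuestion XML
)

# Later: review submitted work and compensate Workers by approving it.
assignments = mturk.list_assignments_for_hit(
    HITId=hit["HIT"]["HITId"], AssignmentStatuses=["Submitted"]
)
for a in assignments["Assignments"]:
    mturk.approve_assignment(AssignmentId=a["AssignmentId"])
```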

Since its launch, MTurk has been very popular, including among researchers. The number of articles on Web of Science with the search term “Mechanical Turk” has gone from just over 20 in 2012 to close to 100 in 2014 (see Figure 1). A similar search on PubMed produces 15 publications since the beginning of 2015.

Figure 1. The number of articles found on the Web of Science prior to 2015 with the search term ‘Mechanical Turk’ within the ‘psychology’ research area. Used with permission from Woods, Velasco, Levitan, Wan, & Spence (in preparation).

However, the popularity of MTurk has not come without controversy. Upon completing a HIT, Workers are not compensated until their work has been “approved” by the Requester. Should the Requester reject the HIT, the Worker receives no compensation and their reputation (their percentage approval rating) decreases. Many Turkers have complained about having HITs unfairly rejected, claiming Requesters keep their task data while withholding payment. Amazon has refused to accept responsibility for Requesters’ actions, claiming it merely creates a marketplace for Requesters and Turkers to contract freely and does not become involved in resolving disputes. Additionally, Amazon does not require Requesters to pay Workers any minimum wage, and a quick search of available HITs reveals many tasks requiring Workers to devote a considerable amount of time for very little compensation. MTurk is, however, only one of several crowdsourcing platforms; others include CloudCrowd, CrowdFlower, and Prolific Academic.


Launched in 2014, Prolific Academic describes itself as “a crowdsourcing platform for academics around the globe”. Founded by collaborating academics from Oxford and Sheffield, Prolific Academic markets itself specifically as a platform for academic researchers. In fact, until August 2014 registration was limited to UK-based individuals with academic email addresses (*.ac.uk); it has since been opened up to everyone with a Facebook account (used for authentication). Going a step further than its competition in appealing to academic researchers, Prolific Academic offers an extensive list of pre-screening questions (covering sociodemographic characteristics, levels of education or certification, and more) that researchers can use to determine whether someone is eligible to complete their study. Before someone can access and complete a study, they have to answer the screening questions selected by the researcher; individuals who have already completed the screening questionnaires (available immediately upon signing up) will only be shown studies they are eligible for on the study page. At the time of writing, according to the site’s homepage there are 5,081 individuals signed up, with over 26,000 data submissions to date. Additionally, the site reports that participants have earned over £26,000 thus far. According to the site’s own demographics report from November 2014, 62% of users are male and the average age of users is about 24. Users are predominantly based in the US or UK; however, 1,500 users have joined since that report alone! Unlike MTurk and most other crowdsourcing platforms, Prolific Academic stipulates that researchers must compensate participants appropriately, which it terms “Ethical Rewards”, requiring that participants be paid a minimum of £5 an hour.


I have had experience using both MTurk and Prolific Academic in conducting and participating in research. With the assistance of Dr Andy Woods and his Xperiment platform, where my experimental task is hosted online, I was able to get an emotion recognition task up and running online. This opened up the possibility of studies on larger and more diverse samples, as well as studies being completed in MUCH shorter time frames. With Andy’s help in setting up studies on MTurk, I have run three studies on the platform since July 2014, ranging in sample size from 100 to 243 participants. Most impressively, each study was completed in a matter of hours; conducting the same study in the lab would have taken months! Similarly, given the short duration of these tasks, and the speed and ease of completing and accessing study documents on a computer, these studies cost less than they would have had they been conducted in the lab.

My experience with Prolific Academic has so far been only as a participant, but it has been very positive. All the studies I completed adhered to the “Ethical Rewards” requirement, and all researchers were prompt in compensating me following study completion. Study duration estimates have been accurate (if anything, generous) and compensation is only withheld in the case of failed catch trials (more on that below). The site is easy to use, with a user-friendly interface, and it is easy to contact researchers, which is helpful for any queries or concerns. Several colleagues I know have had similar experiences, and I hope to run a study on the platform in the near future.

While there have been several criticisms of conducting research on these crowdsourcing platforms, the most common among researchers is that data acquired this way will be of lower quality than data from lab studies. Critics argue that the lack of a controlled testing environment, possible distractions during testing, and participants completing studies for compensation as quickly as possible without attending to instructions are all reasons against conducting experiments on these platforms. Given that research using catch trials (trials included in experiments to assess whether participants are paying attention) has shown failure rates ranging from 14% to 46% in a lab setting, surely participants completing tasks from their own homes would do just as badly, if not worse? We decided to investigate for ourselves. In two of our online studies, we added a catch trial as the study’s final trial.


Out of the 343 people who completed the two studies, only 3 participants failed the catch trial. That is less than 1% of participants! And we are not the only ones who have found promising results from studies using crowdsourcing platforms. Studies have shown that Turkers perform better on online attention checks than traditional subject pool participants and that MTurk Workers with high reputations can ensure high-quality data, even without the use of catch trials. Therefore, the quality of data from crowdsourcing platforms does not appear to be problematic. However, using catch trials is still a very popular and useful way of identifying participants who may not have completed tasks with enough care or attention.
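In practice, handling these exclusions in analysis is straightforward. The sketch below uses hypothetical column names (not our actual study data) to show the kind of filtering step we mean: flag whoever failed the catch trial, report the rate, and drop them before analysing task performance.

```python
import pandas as pd

# Hypothetical per-participant data: one row per participant, with a flag
# recording whether they answered the final catch trial correctly.
df = pd.DataFrame({
    "participant_id": range(1, 344),
    "passed_catch_trial": [False] * 3 + [True] * 340,
})

n_total = len(df)
n_failed = (~df["passed_catch_trial"]).sum()
print(f"{n_failed}/{n_total} failed the catch trial "
      f"({100 * n_failed / n_total:.1f}%)")   # 3/343 (0.9%)

# Exclude catch-trial failures before analysing task performance.
clean = df[df["passed_catch_trial"]]
```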

Since the launch of MTurk, many similar platforms have appeared and advances have been made. MTurk has been used for everything from getting Turkers to write movie reviews to helping with missing persons searches. It’s safe to say that crowdsourcing is here to stay and has changed the way we conduct research online, with many of these sites’ tasks working on mobile and tablet platforms as well. While people have been using computers and web platforms in testing for a long time now, using crowdsourcing platforms for participant recruitment is still in its infancy. With many new possibilities emerging from the use of these platforms, it is an exciting time to be a researcher.

Shifting the Evidence

An excellent paper published a few years ago, Sifting the Evidence, highlighted many of the problems inherent in significance testing, and the use of P-values. One particular problem highlighted was the use of arbitrary thresholds (typically P < 0.05) to divide results into “significant” and “non-significant”. More recently, there has been a lot of coverage of the problems of reproducibility in science, and in particular distinguishing true effects from false positives. Confusion about what P-values actually tell us may contribute to this.

It is often not made clear whether research is exploratory or confirmatory. This distinction is now commonly made in genetic epidemiology, where individual studies routinely report “discovery” and “replication” samples. That in itself is helpful – it’s all too common for post-hoc analyses (e.g., of sub-groups within a sample) to be described as having been based on a priori hypotheses. This is sometimes called HARKing (Hypothesising After the Results are Known), which can make it seem like results were expected (and therefore more likely to be true), when in fact they were unexpected (and therefore less likely to be true). In other words, a P-value alone is often not very informative in telling us whether an observed effect is likely to be true – we also need to take into account whether it conforms with our prior expectations.

‘Statistical power is truth power’ (image by Mark Stokes).

One way we can do this is by taking into account the pre-study probability that the effect or association being investigated is real. This is difficult of course, because we can’t know this with certainty. However, what we perhaps can estimate is the extent to which a study is exploratory (the first to address a particular question, or use a newly-developed methodology) or confirmatory (the latest in a long series of studies addressing the same basic question). Broer et al (2013) describe a simple way to take this into account and increase the likelihood that a reported finding is actually true. Their basic point is that the likelihood that a claimed finding is actually true (which they call the positive predictive value, or PPV) is related to three things: the prior probability (i.e., whether the study is exploratory or confirmatory), the statistical power (i.e., the probability of finding an effect if it really exists), and the Type I error rate (i.e., the P-value or significance threshold used). We have recently described the problems associated with low statistical power in neuroscience (Button et al., 2013).

What Broer and colleagues show is that if we adjust the P-value threshold we use, depending on whether a study is exploratory or confirmatory, we can dramatically increase the likelihood that a claimed finding is true. For highly exploratory research, with a very low prior probability, they suggest a P-value threshold of 1 × 10⁻⁷. Where the prior probability is uncertain or difficult to estimate, they suggest a value of 1 × 10⁻⁵. Only for highly confirmatory research, where the prior probability is high, do they suggest that a “conventional” value of 0.05 is appropriate.
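To make the arithmetic concrete, here is a minimal sketch of the standard positive predictive value calculation, which captures the relationship between prior probability, power and the significance threshold described above. The 80% power figure and the prior probabilities are illustrative assumptions, not values taken from Broer et al.

```python
def ppv(prior, power, alpha):
    """Probability that a 'significant' finding is true, given the pre-study
    probability of a real effect, statistical power, and the alpha threshold."""
    true_positives = power * prior
    false_positives = alpha * (1 - prior)
    return true_positives / (true_positives + false_positives)

# Illustrative numbers, assuming 80% power throughout.
print(ppv(prior=0.001, power=0.8, alpha=0.05))   # exploratory, p < 0.05   -> ~0.02
print(ppv(prior=0.001, power=0.8, alpha=1e-7))   # exploratory, p < 1e-7   -> ~0.9999
print(ppv(prior=0.5,   power=0.8, alpha=0.05))   # confirmatory, p < 0.05  -> ~0.94
```

With a low prior, a finding that just clears p < 0.05 is very unlikely to be true, whereas the stricter threshold makes it almost certainly true; for strongly confirmatory work the conventional threshold already performs well.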

Psychologists are notorious for having an unhealthy fixation on P-values, and particularly the 0.05 threshold. This is unhelpful for lots of reasons, and many journals now discourage or even ban the use of the word “significant”. The genetics literature that Broer and colleagues draw on has learned these lessons from bitter experience. However, if we are going to use thresholds, it makes sense that these reflect the exploratory or confirmatory nature of our research question. Fewer findings might pass these new thresholds, but those that do will be much more likely to be true.

References:

Broer L, Lill CM, Schuur M, Amin N, Roehr JT, Bertram L, Ioannidis JP, van Duijn CM. (2013). Distinguishing true from false positives in genomic studies: p values. Eur J Epidemiol; 28(2): 131-8.

Button KS, Ioannidis JP, Mokrysz C, Nosek BA, Flint J, Robinson ES, Munafò MR. (2013). Power failure: why small sample size undermines the reliability of neuroscience. Nat Rev Neurosci; 14(5): 365-76.

Posted by Marcus Munafò, with thanks to Mark Stokes at Oxford University for the ‘Statistical power is truth power’ image.