Recent quotes:

When the insurance company monitors your driving in real time, does it help? New research finds that it helps on a number of levels, from safety to consumer cost -- ScienceDaily

"We found that UBI users tend to improve the safety of their driving in general, and in one specific area by decreasing their daily average number of hard-brakes by an average of 21 percent after six months," said Miremad Soleymanian. "Our research found that the number of miles driven tends to stay the same and that both younger drivers and females tend to improve their UBI scores more than older drivers and males."

Side-effects not fully reported in more than 30 percent of healthcare reviews -- ScienceDaily

The new study looked at the reporting of adverse events in 187 systematic reviews published between 2017 and 2018. Systematic reviews in health research aim to summarise the results of controlled healthcare interventions and provide evidence of the effectiveness of a healthcare intervention. Research showed that 35 per cent of reviewers did not fully report the side-effects of the medical intervention under review. Dr Su Golder, from the University of York's Department of Health Sciences, said: "Despite reviewers stating in their own protocols that adverse events should be included in the review, 65 per cent fully reported the event as intended by the protocol, eight per cent entirely excluded them, and the remaining 27 per cent either partially reported or changed the adverse event outcomes." "Just over 60 per cent, however, didn't even include adverse events in their protocols, which suggests that a more proactive approach is needed to prompt reviewers to report on potential harmful side-effects in their reporting of healthcare interventions."

Do differences in gait predict the risk of developing depression in later life? -- ScienceDaily

Gait parameters and mental health both have significant impacts on functional status in later life. The study's findings suggest that gait problems may represent a potentially modifiable risk factor for depression.

Gut Bacteria Linked to Depression Identified – Neuroscience News

Mireia Valles-Colomer (VIB-KU Leuven): ‘Many neuroactive compounds are produced in the human gut. We wanted to see which gut microbes could participate in producing, degrading, or modifying these molecules. Our toolbox not only allows to identify the different bacteria that could play a role in mental health conditions, but also the mechanisms potentially involved in this interaction with the host. For example, we found that the ability of microorganisms to produce DOPAC, a metabolite of the human neurotransmitter dopamine, was associated with better mental quality of life.’

‘Omnigenic’ Model Suggests That All Genes Affect Every Complex Trait | Quanta Magazine

Starting about 15 years ago, geneticists began to collect DNA from thousands of people who shared traits, to look for clues to each trait’s cause in commonalities between their genomes, a kind of analysis called a genome-wide association study (GWAS). What they found, first, was that you need an enormous number of people to get statistically significant results — one recent GWAS seeking correlations between genetics and insomnia, for instance, included more than a million people. Second, in study after study, even the most significant genetic connections turned out to have surprisingly small effects. The conclusion, sometimes called the polygenic hypothesis, was that multiple loci, or positions in the genome, were likely to be involved in every trait, with each contributing just a small part. (A single large gene can contain several loci, each representing a distinct part of the DNA where mutations make a detectable difference.)

will.i.am on personal data ownership

Personal data needs to be regarded as a human right, just as access to water is a human right. The ability for people to own and control their data should be considered a central human value. The data itself should be treated like property and people should be fairly compensated for it.

will.i.am on consumer health data services

Today, my gadgets may count my steps, but they aren’t seeing the big picture: what I ate, how I felt, what my blood pressure is. New services, built from the point of view of the consumer, will benefit me by sharing and interconnecting my own data, rather than selling it on. When more trust is established, my personal “agent” or “assistant” should merge relevant things together that are currently just disconnected data points.

BARBARIANS AT THE GATE: CONSUMER-DRIVEN HEALTH DATA COMMONS AND THE TRANSFORMATION OF CITIZEN SCIENCE

A few state court cases have found patients own their medical records under specific circumstances. Unfortunately, the pertinent body of state medical records law generally applies in traditional healthcare settings and seemingly does not govern commercial providers of PHD devices and services, such as purveyors of medical and fitness devices. Courts do not recognize an individual property right in personal information such as one’s name, address, and social security number. Commercial databases that hold such information are generally treated as the property of the companies that compiled them. In a famous case where plaintiffs sought to block a company from disclosing their personal information by selling its mailing lists, Vera Bergelson notes an implicit judicial bias “that, to the extent personal information may be viewed as property, that property belongs to the one who collects it.” This bias, if it exists, is reminiscent of the ancient res nullius doctrine from natural resource law, which treated assets such as subsurface mineral deposits and wild animals as unowned until somebody discovers and captures (takes possession of) them. “Rarely used today, it let private owners stake claims as in the Klondike gold rush.”

BARBARIANS AT THE GATE: CONSUMER-DRIVEN HEALTH DATA COMMONS AND THE TRANSFORMATION OF CITIZEN SCIENCE

This article explores how these mechanisms, imbedded in major federal research and privacy regulations, enshrine institutional data holders—entities such as hospitals, research institutions, and insurers that store people’s health data—as the prime movers in assembling large-scale data resources for research and public health. They rely on approaches—such as de-identification of data and waivers of informed consent—that are increasingly unworkable going forward. They shower individuals with unwanted, paternalistic protections—such as barriers to access to their own research results—while denying them a voice in what will be done with their data.

Privacy in the age of medical big data | Nature Medicine

Big data is often defined by ‘three Vs’: volume (large amounts of data), velocity (high speed of access and analysis), and variety (substantial data heterogeneity across individuals and data types), all of which appear in medical data.

All of Us enrollees can now share health data from their Fitbit accounts with researchers | MobiHealthNews

“Collecting real-world, real-time data through digital technologies will become a fundamental part of the program,” Eric Dishman, director of the All of Us Research Program, said in a statement. “This information in combination with many other data types will give us an unprecedented ability to better understand the impact of lifestyle and environment on health outcomes and, ultimately, develop better strategies for keeping people healthy in a very precise, individualized way.”

Everything big data claims to know about you could be wrong: To understand human health and behavior, researchers would do better to study individuals, not groups -- ScienceDaily

"If you want to know what individuals feel or how they become sick, you have to conduct research on individuals, not on groups," said study lead author Aaron Fisher, an assistant professor of psychology at UC Berkeley. "Diseases, mental disorders, emotions, and behaviors are expressed within individual people, over time. A snapshot of many people at one moment in time can't capture these phenomena." Moreover, the consequences of continuing to rely on group data in the medical, social and behavioral sciences include misdiagnoses, prescribing the wrong treatments and generally perpetuating scientific theory and experimentation that is not properly calibrated to the differences between individuals, Fisher said.

Can Big Data Help Psychiatry Unravel the Complexity of Mental Illness? - Scientific American

Psychiatrist Charles DeBattista of Stanford University and colleagues compared electroencephalograms (EEGs) collected from depressed patients with a database of EEGs from over 1,800 patients that included information about response to specific treatments. Using EEG measures to guide decisions about treatment alternatives led to significantly better outcomes than clinical treatment selection.

Is soda bad for your brain? (And is diet soda worse?): Both sugary, diet drinks correlated with accelerated brain aging -- ScienceDaily

Now, new research suggests that excess sugar -- especially the fructose in sugary drinks -- might damage your brain. Researchers using data from the Framingham Heart Study (FHS) found that people who drink sugary beverages frequently are more likely to have poorer memory, smaller overall brain volume, and a significantly smaller hippocampus -- an area of the brain important for learning and memory. But before you chuck your sweet tea and reach for a diet soda, there's more: a follow-up study found that people who drank diet soda daily were almost three times as likely to develop stroke and dementia when compared to those who did not.

Fitbit creates research library with Fitabase, publishes results of corporate wellness study | MobiHealthNews

The library currently has 163 different published studies that mention using a Fitbit (or a few of them) as part of their study design. The pace of research using the wearables has been accelerating every year, Ramirez said, posing what Fitabase believed was a need for a comprehensive library. “So we wanted to make it a public resource where anyone who wants to explore Fitbit research can have a one-stop shop. It’s meant to be a library down the street, and it will continue to grow as people do more research.”

Instagram photos reveal predictive markers of depression

Using Instagram data from 166 individuals, we applied machine learning tools to successfully identify markers of depression. Statistical features were computationally extracted from 43,950 participant Instagram photos, using color analysis, metadata components, and algorithmic face detection. Resulting models outperformed general practitioners' average diagnostic success rate for depression. These results held even when the analysis was restricted to posts made before depressed individuals were first diagnosed. Photos posted by depressed individuals were more likely to be bluer, grayer, and darker. Human ratings of photo attributes (happy, sad, etc.) were weaker predictors of depression, and were uncorrelated with computationally-generated features. These findings suggest new avenues for early screening and detection of mental illness.
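The "bluer, grayer, and darker" finding can be read as three per-photo color features. The sketch below is only an illustration of that idea, not the study's pipeline: it assumes the features are mean hue, saturation, and value (HSV), and the function name and the two tiny pixel lists are hypothetical.

```python
import colorsys

def mean_hsv(pixels):
    """Average hue, saturation, and value over (r, g, b) pixels with 0-255 channels.

    Assumption: a plain arithmetic mean is good enough for illustration.
    Hue is circular, so this mean is unreliable for photos dominated by reds.
    """
    if not pixels:
        raise ValueError("need at least one pixel")
    totals = [0.0, 0.0, 0.0]
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        totals[0] += h
        totals[1] += s
        totals[2] += v
    n = len(pixels)
    return tuple(t / n for t in totals)

# Hypothetical two-pixel "photos": warm and bright vs. blue-gray and dim.
sunny = [(255, 200, 60), (250, 180, 40)]
gloomy = [(60, 70, 90), (50, 60, 85)]

sunny_h, sunny_s, sunny_v = mean_hsv(sunny)
gloomy_h, gloomy_s, gloomy_v = mean_hsv(gloomy)
```

On these toy inputs the second photo has a hue closer to blue, lower saturation (grayer), and lower value (darker), which is the pattern the excerpt describes.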

How Vector Space Mathematics Reveals the Hidden Sexism in Language

The team does this by searching the vector space for word pairs that produce a similar vector to “she: he.” This reveals a huge list of gender analogies. For example, she:he::midwife:doctor; sewing:carpentry; registered_nurse:physician; whore:coward; hairdresser:barber; nude:shirtless; boobs:ass; giggling:grinning; nanny:chauffeur, and so on. The question they want to answer is whether these analogies are appropriate or inappropriate. So they use Amazon’s Mechanical Turk to ask. They showed each analogy to 10 turkers and asked them whether the analogy was biased or not. They consider the analogy biased if more than half of the turkers thought it was biased.
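The two steps described above, searching for word pairs whose difference points the same way as "she" minus "he" and then taking a majority vote, can be sketched roughly as follows. This is a simplified illustration: the cosine-similarity scoring, the 0.9 threshold, the function names, and the made-up 2-D vectors are all assumptions, not the team's actual scoring or data.

```python
import math
from itertools import permutations

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def diff(u, v):
    """Element-wise difference u - v."""
    return [a - b for a, b in zip(u, v)]

def gender_analogies(vectors, threshold=0.9):
    """Return (x, y, score) where x - y is nearly parallel to she - he."""
    direction = diff(vectors["she"], vectors["he"])
    found = []
    for x, y in permutations(vectors, 2):
        if x in ("she", "he") or y in ("she", "he"):
            continue  # skip the seed pair itself
        score = cosine(diff(vectors[x], vectors[y]), direction)
        if score >= threshold:
            found.append((x, y, score))
    return sorted(found, key=lambda item: -item[2])

def judged_biased(votes):
    """True if more than half of the raters marked the analogy as biased."""
    return sum(votes) > len(votes) / 2

# Made-up 2-D embeddings; real word vectors have hundreds of dimensions.
toy_vectors = {
    "she": [1.0, 0.0], "he": [-1.0, 0.0],
    "midwife": [0.8, 1.0], "doctor": [-0.7, 1.0],
    "nanny": [0.9, -1.0], "chauffeur": [-0.8, -1.1],
}
analogies = gender_analogies(toy_vectors)
```

With ten raters, `judged_biased([True] * 6 + [False] * 4)` is true, while an even five-to-five split is not, matching the "more than half" rule in the excerpt.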

New Depression Model Outperforms Psychiatrists

Data mined from clinical trials may soon help doctors tailor antidepressant therapy to their patients, the authors say. Currently, only about 30% of patients get relief from the first drug they are prescribed, and it can often take a year or more before doctors find the right medication to alleviate symptoms of depression. The Yale team analyzed data from a large clinical trial on depression and pinpointed 25 questions that best predicted the patients’ response to a particular antidepressant. Using these questions, they developed a mathematical model to predict whether a patient will respond to Celexa after three months of treatment. “These are questions any patient can fill out in 5 or 10 minutes, on any laptop or smartphone, and get a prediction immediately,” explained Adam Chekroud, Ph.D. candidate in the Human Neuroscience Lab and lead author of the paper.

@elerianm says too much data dilutes the experience

Multiple metrics can confuse rather than enlighten; and they can add to a sense of underachievement. My Fitbit, even though it’s the most basic model, goes beyond measuring steps and miles. It also claims to be able to tell me how many calories I have burned and the number of “active minutes” in the day -- and it sets a daily target for each. I have no idea how I am supposed to internalize all these data points, including their order of importance. So I find myself pursuing multiple objectives that are highly correlated but, frustratingly, are not sufficiently linear in their relationship -- adding to the potential for performance anxiety.

Apple to Spend $1.9 Billion Building Two Europe Data Centers

Apple Inc. plans to spend 1.7 billion euros ($1.9 billion) building data centers in Ireland and Denmark in its biggest-ever European investment[…] The centers, located in Athenry, Ireland, and Viborg, Denmark, will be powered by renewable energy[…] The project lets Apple address European requests for data to be stored closer to local users and authorities, while also allowing it to benefit from a chilly climate that helps save on equipment-cooling costs.

Edge cases are expensive to solve

The old saying in the machine learning community is that “machine learning is really good at partially solving just about any problem.” For most problems, it’s relatively easy to build a model that is accurate 80–90% of the time. After that, the returns on time, money, brainpower, data etc. rapidly diminish. As a rule of thumb, you’ll spend a few months getting to 80% and something between a few years and eternity getting the last 20%. (Incidentally, this is why when you see partial demos like Watson and self-driving cars, the demo itself doesn’t tell you much — what you need to see is how they handle the 10–20% of “edge cases” — the dog jumping out in front of the car in unusual lighting conditions, etc).

Analytics moves from last touch to holistic

They were doing that analysis for some time actually with a method called “last touch.” That means identifying the last thing that the customer did before they bought—as in they clicked an ad and then they bought whatever. The company figured it must’ve been that ad that caused the customer to buy the print. Or someone got a direct mail campaign message and then they bought the calendar. That was the motivation. [Shutterfly] looked at that process and said, “You know, that’s a good model. It’s a good approximation, but it would be better to look at everything touching the user before their last purchase and since the purchase before that.” This greatly expanded the data that they had to consider to do the analysis, so the process became very slow. It took two days to compute the likely marketing channels for all their orders.
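The difference between the two approaches can be shown on a single customer journey. This is a minimal sketch under stated assumptions: the excerpt does not say how Shutterfly divides credit among touches, so the equal-weight split here is a stand-in, and the function names and channel labels are hypothetical.

```python
from collections import Counter

def last_touch(journey):
    """Give all credit to the final touch before the purchase."""
    if not journey:
        raise ValueError("journey must contain at least one touch")
    return {journey[-1]: 1.0}

def all_touches(journey):
    """Split credit across every touch since the previous purchase.

    Assumption: each touch gets equal weight; a real model may weight
    touches by recency or channel.
    """
    if not journey:
        raise ValueError("journey must contain at least one touch")
    counts = Counter(journey)
    total = len(journey)
    return {channel: n / total for channel, n in counts.items()}

# Touches between two purchases, oldest first (hypothetical labels).
journey = ["email", "search_ad", "direct_mail", "search_ad"]

last_credit = last_touch(journey)
full_credit = all_touches(journey)
```

Here last touch credits only `search_ad`, while the broader view also credits `email` and `direct_mail`. Every order now requires scanning its whole touch history rather than one event, which is consistent with the slowdown the excerpt mentions.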

The Internet Archive tries to remember

Right now, the archive holds around 20 petabytes of data, including 500,000 pieces of software, more than 2 million books, 3 million hours of TV, and 430 billion web pages. In a single day, they digitize more than 1,000 books. They capture TV 24 hours a day. In a week, they save more than 1 billion URLs. As of 2013, only 8 percent of the archive was uploaded by users, some 53,000 people who have accounts with the archive. In order to continue the work of creating “universal access to all knowledge,” as is the archive’s mission, they want to get as many people working on the project as possible.

Biggest data

In a collective farm, a pig gave birth to three piglets. The Party committee was convened and decided that to report about only three piglets would make a bad impression in the district Party committee. So, they reported that five piglets were born in the farm. The district Party committee reported to the Region Party committee that seven piglets were born in the collective farm. In their report to the Ministry of Agriculture, the Region Party committee advised that the socialist obligation to increase the number of pigs by twelve has been successfully fulfilled. To please comrade Brezhnev, the Ministry reported that twenty piglets were born, ahead of the planned date. "Very good," comrade Brezhnev said. "Three piglets you'll give to the workers of Leningrad. Three you'll give to the heroic city of Moscow. Five you'll put aside for exports. Five you'll send to the starving African children. The rest you store as a strategic food reserve. Nobody shall touch it!"

The Milky Way's location

Back in 2010, I signed up for the email lists of 70 advocacy groups. I collected over 2100 emails from them over a six-month period, and hand-coded each of them. I also watched Rachel Maddow and Keith Olbermann every night and recorded the topics of the two shows. The data analysis was tedious and left me with a wicked caffeine addiction. But it also left me with an unmatched understanding of e-mail membership activation strategies. So that’s why I hand-code all my own data. Call me the crotchety old guy of the “big data” age. While everyone else is learning Hadoop and Python, I’m still futzing around with Excel. But there’s a method to the madness. It’s thought-work, which leads to insights, which improve my other methods. Coding my own data gives me a feel for the research topic.