Reusing open data

I was thrilled when I learned that the QUEST center at the BIH was going to award open data reuse awards. The details can be found on their website, but the bottom line is this: open science is not only about opening up your data, but also about actually using open data. If everybody opens up their data but nobody actually uses it, the added value is quite limited.

For that reason I started some projects back in 2015/2016 designed to see how easy it actually is to find data that could be used to answer a question that you are actually interested in. The answer: it is not always that easy. The required variables might not be there, and even if they are, it is quite complex to start using a database that you did not build yourself. To understand the value of your results, you have to understand how the data was collected. One study proved to be so well documented that it was a contender: the English Longitudinal Study on Aging. One of the subsequent analyses that we did was published in a paper that I have mentioned before on this blog. The good news, and the reason why I am writing this blog entry, is that this paper was just awarded the BIH QUEST open data reuse award.

The award comes with 1000 euro, money the group can spend on travel and consumables. Now, do not get me wrong, 1000 euro is nothing to sneeze at. But 1000 euro is not going to be a major driver in your decision whether to reuse open data or not. Still, the award is nice and, I hope, effective in stimulating open science, especially as it can stimulate the conversation about, and critical evaluation of, the value of reusing open data.

Long journey, short(ish) story

This is a short story about a long journey. It is about a paper whose journey started in 2013, if I am not mistaken. In that year, we decided to link the RATIO case-control study to the data from the Central Bureau of Statistics (CBS) in the Netherlands, allowing us to turn the case-control study into a follow-up study.

The first results of these analyses were published some time ago as “Recurrence and Mortality in Young Women With Myocardial Infarction or Ischemic Stroke”. To get these results into that journal, we were asked to reduce the paper to a letter. We did, and we hope we were able to keep the core message clean and clear: the risk of arterial events after arterial events remains high over a long period of time (15+ years), and the events remain true to type.

Just last week (!) we published another analysis of the data, where we contrast the long-term risk for those with a presumably hypercoagulable blood profile to those who do not show a tendency to clotting. The bottom line is that, if anything, there is a dose-response between hypercoagulability and arterial thrombosis for ischemic stroke patients, but not for myocardial infarction patients. This is all in line with the conclusions on the role of hypercoagulability and stroke based on data from the same study. But I have to be honest: the evidence is not that overwhelming. The precision is low, as seen by the broad confidence intervals, and with regard to the point estimates, no clinically relevant effects were seen. Then again, it is a piece of the puzzle that is needed to understand the role of hypercoagulability in young stroke.

Main figure from the paper: Q4 vs Q1 shows almost a doubling in risk

There is a lot to tell about this publication: how difficult it was to get the study data linked to the CBS to get to the 15-year follow-up, how AM did a fantastic job organizing the whole project, how quartile analyses are possibly not the best way to capture all the information that is in the data, how we had tremendous delays because of peer review (especially at the last journal), how bad some of the peer review reports were, how one of the peer reviewers was a commercial enterprise (which for some time paid people to do peer review), how the peer review reports are all open, and how it was to get the funding to keep the paper from being locked away behind a paywall.

But I want to keep this story short and not dwell too much on the past. The follow-up period was long, the time it took us to get this published was long, so let us keep the rest of the story as short as possible. I am just glad that it is published and can finally be shared with the world.

Pre-prints start to sound better and better…

Finding consensus in Maastricht

source https://twitter.com/hspronk

Last week, I attended and spoke at the Maastricht Consensus Conference on Thrombosis (MCCT). This is not your standard, run-of-the-mill conference where people share their most recent research. The MCCT is different and focuses on the larger picture, by giving faculty the (plenary) stage to share their thoughts on opportunities and challenges in the field. Then, with the help of a team of PhD students, these thoughts are further discussed in a break-out session. Everything was wrapped up in a plenary discussion of what was discussed in the workshops. Interesting format, right?

It was my first MCCT, and beforehand I had difficulty envisioning how exactly this format would work out. Now that I have experienced it all, I can tell you that it really depends on the speaker and the people attending the workshops. When it comes to the 20-minute introductions by the faculty, I think that just an overview of the current state of the art is not enough. The best presentations were all about the bigger picture, and had either an open question, a controversial statement or some form of “crystal ball” vision of the future. It really is difficult to “find consensus” when there is no controversy, as was the case in some plenary talks. Given the break-out nature of the workshops, my observations are limited in number. But from what I saw, some controversy (if need be only constructed for the workshop) really did foster discussion amongst the workshop participants.

Two specific activities stand out for me. The first is the lecture and workshop on post-PE syndrome and how we should be able to monitor the functional outcome of PE. Given my recent plea in RPTH for more ordinal analyses in the field of thrombosis and hemostasis (learning from stroke research with its mRS), we not only had a great academic discussion, but also immediately made plans for a couple of projects where we could actually implement this. The second activity I really enjoyed is my own workshop, where I not only gave a general introduction to stroke (prehospital treatment and triage, clinical and etiological heterogeneity etc.) but also focused on the role of FXI and NETs. We discussed the role of DNase as a potential co-treatment with tPA in the acute setting (talking about “crystal ball” type of discussions!). Slides from my lecture can be found here (PDF). An honorable mention has to go out to the PhD students P and V, who did a great job in supporting me during the prep for the lecture and workshop. Their smart questions and shared insights really shaped my contribution.

Now, I said it was not always easy to find consensus, which means that it isn’t impossible. In fact, I am sure that the themes that were discussed all boil down to a couple of opportunities and challenges. A first step was made by HtC and HS from the MCCT leadership team in the closing session on Friday, which will prove to be a great springboard for the consensus paper that will help set the stage for future research in our field of arterial thrombosis.

Messy epidemiology: the tale of transient global amnesia and three control groups

Clinical epidemiology is sometimes messy. The methods and data that you might want to use might not be available or just too damn expensive. Does that mean that you should throw in the towel? I do not think so.

I am currently working in a more clinically oriented setting, as the only researcher trained as a clinical epidemiologist. I could tell you about being misunderstood and feeling lonely as the only one who has seen the light, but that would just be lying. The fact is that my position is one of privilege and opportunity, as I work together with many different groups on a wide variety of research questions that have the potential to influence clinical reality directly and bring small but meaningful progress to the field.

Sometimes that work is messy: not the right methods, a difference in interpretation, a p value in table 1… you get the idea. But sometimes something pretty comes out of that mess. That is what happened with this paper, which just got published online (e-pub) in the European Journal of Neurology. The general topic is the heart-brain interaction, and more specifically to what extent damage to the heart actually has a role in transient global amnesia. Now, the idea that there might be a link is due to some previous case series, as well as the clinical experience of some of my colleagues. The next step would of course be to do a formal case-control study, and if you want true estimates of rate ratios, a lot of effort has to go into the collection of data from a population-based control group. We had neither the time nor the money to do so, and upon closer inspection, we also did not really need that clean control group to answer some of the questions that would bring progress to the field.

So instead, we chose three different control groups, perhaps better referred to as reference groups, all three with some neurological disease. Yes, there are selections at play for each of these groups, but we could argue that those selections might be true for all groups. If these selection processes are similar for all groups, strong differences in patient characteristics or biomarkers suggest that other biological systems are at play. The trick is not to hide these limitations but, like a practiced judoka, to leverage these weaknesses and turn them into strengths. Be open about what you did and show the results, so that others can build on that experience.

So that is what we did. Compared with patients with migraine with aura, vestibular neuritis and transient ischemic attack, patients with transient global amnesia are more likely to exhibit signs of myocardial stress. This study was not designed to understand the cause of this link (nor would it even be able to), nor do we pretend that our odds ratios are in fact estimates of rate ratios or something fancy like that. Still, even though many aspects of this study are not “by the book”, it did provide some new insights that help further thinking about and investigations of this debilitating and impactful disease.

The effort was led by EH, and the final paper can be found here on PubMed.

Genetic determinants of activity and antigen levels of contact system factors

One of my slides with a cartoon of the intrinsic coagulation system. I know, the reality is way more complicated, but still, I like the picture!

The contact system, or intrinsic coagulation system, has for a long time been an undervalued part of the thrombosis and hemostasis field. Not by me. I love FXI & FXII. Not just now, since FXI is suddenly the “new kid on the block” as the new target for antithrombotic treatment through ASOs, but already since I started my PhD in 2007/2008. As any of my colleagues from back then will confirm, I couldn’t shut up about FXI and FXII, as I thought that my topic was the only relevant topic in the world. Although this is common amongst young researchers, I do apologize for it now that I have 20/20 hindsight.

Still, it is only natural that some of my work continues to be focused on those slightly weird coagulation proteins. Are they relevant to hemostasis? Are they relevant in pathological thrombus formation? What is their role in other biological systems? These are questions that the field is only slowly getting answers to. Our latest contribution is an analysis of genetic variation in the genes that code for these proteins, estimating whether the levels of activation and antigen are in fact, in part, genetically determined.

This analysis was performed in the RATIO study, from which we primarily focused on the control group. That control group is relatively small for a genetic analysis, but given that we have a relatively young group, the hope is that the noise is not too bad to pick up some signals. Additionally, given the previous work in the RATIO study, I think this is the only dataset that has a comprehensive phenotyping of the intrinsic coagulation proteins, as it includes measures of protein activity, antigen and activation.

The results, which we published in the JTH, are threefold. First, we were able to confirm previously reported associations between known genetic variations and phenotype. Second, we were able to identify two new loci (i.e. KLKB1 rs4253243 for prekallikrein and KNG1 rs5029980 for HMWK levels). Third, we did not find evidence of strong associations between variation in the studied genes and the risk of ischemic stroke or myocardial infarction. Small effects can, however, not be ruled out, as the sample size of this study is not large enough to yield very precise estimates.

The work was spearheaded by JLR, with tons of help by HdH, and in collaboration with the thrombosis group at the LUMC.

The paper is published in the JTH, and as always, can also be found at my Mendeley profile.

Getting your life back on track after stroke: returning to work

https://goo.gl/CbNPSE

Stroke severity and incidence might be stabilizing, or even decreasing over time in western countries, but this sure is not true for other parts of the world. But here is something to think about: with increasing survival, people will suffer longer from the consequences of stroke. This is of course especially true if the stroke occurred at a young age.

To understand the true impact of stroke, we need to look beyond increased risk of secondary events. We need to understand how the disease affects day-to-day life, especially long term in young stroke patients. The team in Helsinki (HSYR) took a look at the pattern of young stroke patients returning to work. The results:

We included a total of 769 patients, of whom 289 (37.6%) were not working at 1 year, 323 (42.0%) at 2 years, and 361 (46.9%) at 5 years from IS.

That is quite shocking! But how about the pattern? For that we used lasagna plots, something like heatmaps for longitudinal epidemiological data. The results are above: the top panel is just the data as it is in our database, while the lower panel has some sorting to help interpret the results a bit better.
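For readers who have not seen a lasagna plot before, here is a minimal sketch of the idea in plain Python. Everything in it is an assumption for illustration: the toy data, the 0/1 coding and the sorting rule (by total time worked) are made up, and the paper may well have used a different sorting. Each row is one patient, each column one follow-up moment, and sorting the rows puts similar trajectories next to each other without changing the data.

```python
# Hypothetical toy data, NOT from the paper: one row per patient,
# one column per follow-up moment. 1 = working, 0 = not working.
trajectories = [
    [1, 0, 0, 1, 1],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1],
]


def sort_rows(rows):
    """Order patients by total time worked, then by trajectory.

    This is one possible sorting rule (an assumption); it only reorders
    rows, so each patient's trajectory stays intact.
    """
    return sorted(rows, key=lambda row: (sum(row), row), reverse=True)


def render(rows):
    """Draw a text 'lasagna': '#' = working, '.' = not working."""
    return "\n".join("".join("#" if cell else "." for cell in row) for row in rows)


def share_not_working(rows, column):
    """Proportion of patients not working at a given follow-up moment."""
    return sum(1 for row in rows if row[column] == 0) / len(rows)


print("As stored:")
print(render(trajectories))
print("Sorted:")
print(render(sort_rows(trajectories)))
```

Because sorting only moves whole rows around, the proportion not working at each time point is the same in both panels; only the readability changes.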

The paper can be found here, and I am proud to say that it is open access, but you can as always just check my Mendeley profile.

Aarnio K, Rodríguez-Pardo J, Siegerink B, Hardt J, Broman J, Tulkki L, Haapaniemi E, Kaste M, Tatlisumak T, Putaala J. Return to work after ischemic stroke in young adults. Neurology 2018; 0: 1.

Cardiac troponin T and severity of cerebral white matter lesions: quantile regression to the rescue

quantile regression of high vs low troponin T and white matter lesion quantile

A new paper, this time venturing into the field of the so-called heart-brain interaction. We often see stroke patients with cardiac problems, and vice versa. And to make it even more complex, there is also a link to dementia! What to make of this? Is it a case of the chicken and the egg, or just confounding by a third variable? How do these diseases influence each other?

This paper tries to get a grip on this matter by zooming in on a marker of cardiac damage, i.e. cardiac troponin T. We looked at this marker in our stroke patients. Logically, stroke patients should not have increased levels of troponin T, yet they do. More interestingly, the patients that exhibit high levels of this biomarker also have high levels of structural changes in the brain, so-called cerebral white matter lesions.

But the problem is that patients with high levels of troponin T are different from those who have no marker of cardiac damage. They are older and have more comorbidities, so a classic case for adjustment for confounding, right? But then we realize that both troponin and white matter lesions are right-skewed data. You could log-transform the variables before you run a linear regression, but then the interpretation of the results gets a bit complex if you want clear point estimates as answers to your research question.

So we decided to go with a quantile regression, which models the quantile cut-offs with all the multivariable regression benefits. The results remain interpretable and we don’t force our data into a distribution that doesn’t fit. From our paper:

In contrast to linear regression analysis, quantile regression can compare medians rather than means, which makes the results more robust to outliers [21]. This approach also allows to model different quantiles of the dependent variable, e.g. 80th percentile. That way, it is possible to investigate the association between hs-cTnT in relation to both the lower and upper parts of the WML distribution. For this study, we chose to perform a median quantile regression analysis, as well as quantile regression analysis for quintiles of WML (i.e. 20th, 40th, 60th and 80th percentile). Other than that, the regression coefficients indicate the effects of the covariate on the cut-offs of the respective quantiles of the dependent variable, adjusted for potential covariates, just like in any other regression model.
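To make the quoted idea a bit more tangible, here is a minimal sketch in plain Python. It is not the analysis from the paper: the data are made up, there is only one binary exposure (high vs low troponin T) and no adjustment for confounders. In that simplified, unadjusted setting, the quantile regression coefficient is just the difference between the group-specific quantiles, where each quantile is the value that minimises the so-called check (pinball) loss. The function and variable names are my own, purely for illustration.

```python
def pinball_loss(residual, q):
    """Check (pinball) loss for a single residual at quantile level q."""
    return q * residual if residual >= 0 else (q - 1) * residual


def fit_quantile(values, q):
    """Return the value that minimises the total pinball loss.

    A minimiser always exists among the observed values, so searching
    over the data points is enough for a small illustration.
    """
    return min(
        sorted(values),
        key=lambda c: sum(pinball_loss(v - c, q) for v in values),
    )


def binary_quantile_coefficients(low_group, high_group, q):
    """Unadjusted quantile regression with one binary exposure.

    intercept = q-th quantile in the reference (low) group
    slope     = shift of that quantile in the exposed (high) group
    """
    intercept = fit_quantile(low_group, q)
    slope = fit_quantile(high_group, q) - intercept
    return intercept, slope


# Hypothetical, right-skewed toy values for white matter lesion severity.
wml_low_tnt = [1, 2, 2, 3, 4, 6, 9]
wml_high_tnt = [2, 3, 4, 6, 9, 14, 25]

for q in (0.2, 0.5, 0.8):
    intercept, slope = binary_quantile_coefficients(wml_low_tnt, wml_high_tnt, q)
    print(f"q={q}: reference quantile={intercept}, shift with high troponin={slope}")
```

In this toy example the shift grows from the lower to the higher quantiles, which is the kind of pattern a single mean difference would hide. A real analysis would use a statistics package to add covariates and confidence intervals.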

Interestingly, the results show that the association between high troponin T and white matter lesions is the strongest in the higher quantiles. If you want to stretch this to a causal statement, that means that high troponin T has a more pronounced effect on white matter lesions in stroke patients who are already at the high end of the distribution of white matter lesions.

But we shouldn’t stretch it that far. This is a relatively simple study, and the clinical relevance of our insights still needs to be established. For example, our unadjusted results might indicate that the association in itself might be strong enough to help predict post-stroke cognitive decline. The adjusted numbers are less pronounced, but still, it might be enough to help prediction models.

The paper, led by RvR, is now published in J of Neurol, and can be found here, as well as on my Mendeley profile.

von Rennenberg R, Siegerink B, Ganeshan R, Villringer K, Doehner W, Audebert HJ, Endres M, Nolte CH, Scheitz JF. High-sensitivity cardiac troponin T and severity of cerebral white matter lesions in patients with acute ischemic stroke. J Neurol Springer Berlin Heidelberg; 2018; 0: 0.