Post tagged: data-science

Pointers for analytics of healthcare data

A start

In recent time I've been asked for pointers on analysing complex healthcare data. This is a difficult issue. Healthcare analytics / health informatics / medical informatics / etc. range over a wide area, driven a wide variety of interests and outcomes, overlapping hugely in some areas and not at all in others. The …

The rules of analysis

Opinions, I got 'em'

While mentoring some juniors, I started to think about the rules of thumb for analysing data that I've built up over the years. While I'm certainly not the world's greatest data scientist (or it's greatest bioinformatician, statistician, biomedical scientist, etc.), it seems worthwhile trying to capture them here. these are …

What I done learned about REDCap

A few surprises

For those not in the know, REDCap is a platform for creating and editing databases through the web. And by and large, it works fine. It saves a lot of development effort. It provides good reporting tools for users. It's secure and robust. But there are some things to be …

(Re-)building databases with csvsql

Tips and traps when going from a database dump to a database

The scenario

You have a bunch of related CSV files.

Maybe they're the result of a raw database dump. Maybe they've been generated in some other way: experimental results, various public data sets, whatever. But the important thing is that you need to make a database from them. Perhaps because …

Tools for data

How to store data, what to use

Prompted by a recent tweet asking what people used for storing and managing their data, I wrote down my own hard-won lessons on the topic. In rough order of preference and data complexity:

A hierarchical strategy

Use restructured text for documentation

Or markdown / asciidoc. The advantages of this being:

  • It's …