  • Crash Course in Open Targets Part 3: Taming the Data Downloads

    12 min read

    Now that we’ve covered the basics of Open Targets and dug deeper into the evidence that makes up the data, this final part of the crash course covers how to harness that data at a bigger scale through the data downloads.

  • Crash Course in Open Targets Part 2: Genetics Deep Dive

    9 min read

    In Part 1 of this crash course, we learnt how to browse and access data in Open Targets through the web interface and the GraphQL APIs. In Part 2, I’ll go into more detail about the kind of data that’s available, and demonstrate how you might explore the genetic evidence linking targets and diseases.

  • Crash course in Open Targets Part 1: Browsing and Querying Target-Disease Associations

    11 min read

    Open Targets is a public-private partnership for systematic identification, prioritisation, and validation of drug targets. In this three-part blog post series, I will give an overview of Open Targets, the data available, and how to access it through the web interface, programmatically through the APIs, and using the data downloads.

  • Claps and Comments in Blogdown

    6 min read

    I don’t track visitors on this blog, and while that is nice for their privacy and for my conscience, it does sometimes feel like talking into the void. So I recently added three features for readers to communicate: comment functionality via utterances, an applause-button by Colin Eberhardt, and some buttons for sharing posts to social media.

  • rstudio::global(2021) %>% summarise()

    15 min read

    rstudio’s annual conference went online and global this year, which meant that I could attend for the first time. There were lots of interesting talks, and it was great to see familiar faces from GitHub/Twitter avatars come to life! This is a summary of the notes I took during the talks I attended, including links to watch the full talk where available; to see all the available talk videos visit the rstudio website.

  • Making Tables Shiny: DT, formattable, and reactable

    21 min read

    In this blog post I demonstrate the basics of each package, with some formatting examples. All three are nice, but I have a preference for DT because it’s quick and highly customisable. I also give some extra DT examples, covering how to add icons, images, tooltips, heatmap-style column fills, and abbreviated column output, and how to make very simple tables.

  • Packaging Your R Code

    8 min read

    Packages are more than just a convenient way to use methods and data sets developed by others in the community, and to get your own out into the world. Creating a package is also a convenient way to collect your project in a tidy, documented, tested, reproducible and shareable form. Not only that, but with the usethis workflow package, the process is extremely easy and encourages good coding practice.

  • UpSetR: Beyond the Venn diagram

    7 min read

    UpSetR is a package for visualising the intersections of many more sets than is feasible with, for example, Venn diagrams. UpSet plots are particularly useful when there are many sets but the intersections are relatively sparsely populated. In my research, I find these plots extremely powerful for showing large amounts of information in an attractive and intuitive way, with lots of options for exploring the data.

  • Handling JSON data with jq

    3 min read

    For a new project I’ve unwillingly transitioned from the safety and comfort of CSV files to a scary new world of JSON files.

  • Combining inset plots with facets using ggplot

    4 min read

    I recently spent some time working out how to include mini inset plots within ggplot facets, and I thought I would share my code in case anyone else wants to achieve a similar thing.

  • Not-proteins in Parliament

    3 min read

    Last term I took a break from folding proteins to spend three months working in Westminster at the Parliamentary Office of Science and Technology (POST).

  • Brief introduction to ggpairs

    4 min read

    In this blog post I will introduce a fun R plot, ggpairs, that’s useful for exploring distributions and correlations.

  • Introduction to R Markdown

    4 min read

    Two of our esteemed OPIGlets presented a workshop on collaborative research using Jupyter Notebook this week at ISMB in Chicago. Their workshop highlights the importance of finding ways to share your work conveniently and reproducibly. So on a related note, I thought I would share a brief introduction to another useful tool, R Markdown with RStudio, which I use to present updates to various supervisors and to remember what I did three months (or three days) ago. This method of sharing work is highly readable, reproducible, and narrative-driven.

  • Latexing with gvim

    3 min read

    Here I’ll share my set-up for writing LaTeX with gvim instead of a separate LaTeX editor. If you are text-editor averse, this blog post is not for you. But if, like me, you love vim and hate useless GUIs, this might be helpful.

  • Protein Structure Classification: Order in the Chaos

    4 min read

    This post gives an overview of systematic approaches for protein structure classification based on sequence and structural similarity, which are essential for understanding protein function and evolution, as well as for training and benchmarking new methods for protein structure prediction.