Posts

How to host your podcast with GitHub

15 October 2022

If you want to make your own podcast, the usual method is to use a commercial service that handles all the file hosting and distribution. However, the process is surprisingly simple for those with a bit of technical knowledge and patience. Web standards are the key to how podcasting works. RSS (Really Simple Syndication) and its offshoot Atom are how sites create a ‘feed’ that indicates when updates are made. RSS is just XML-formatted plain text. This...

Read more...

Excess mortality in Ireland is still high in 2022

08 September 2022

In a previous post I showed how to use daily deaths from RIP.ie to get up-to-date mortality estimates in Ireland. This is a useful alternative to official GRO data, which lag behind by some months. It is possible to determine sex from the death notices but not age. To summarise, RIP.ie data show unusually high mortality for 2021/22, which continues to the present (September 2022). Below is an updated plot showing how 2022 values are still trending higher....

Read more...

Mapping the historical development of Tallaght

13 August 2022

Tallaght is a large suburb of Dublin about 13 km southwest of Dublin city, near the foothills of the Wicklow Mountains. Originally founded as a monastic settlement in 769 AD, it later became an important defensive outpost along the ‘Pale’ boundary. It remained a rural village until the 1960s, when the Irish government commissioned town planner Myles Wright to devise an expansion plan for Dublin City. The Myles Wright Plan, which was broadly adopted, resulted in the creation of the three...

Read more...

Plotting gridded quantitative data with geopandas - Irish forestry

07 August 2022

A previous post showed how to create grids over polygons using geopandas. Making practical use of this requires some kind of quantitative data that you want to bin into each grid polygon. This could be a summary statistic over that area derived from a more fine-grained spatial dataset. To do this we combine the grid with the original data using sjoin and then aggregate using dissolve with a defined aggregating function. sjoin is like merging...

Read more...

Make regular grids from polygons with geopandas

04 August 2022

Geopandas is a Python package that provides a geospatial extension to pandas. Geodataframes store geographic data such as points and polygons, which can be plotted. This post adapts code from both James Brennan's and Sabrina Chan's blogs to show how to make square and hexagonal grids out of any polygons. This method is often used to bin areas into discrete regions for the purpose of representing summary statistics. The functions are given below. Both use the total boundary area and...

Read more...

Plot phylogenies with annotation in R using ggtree and gheatmap

02 August 2022

There are many online examples of how to draw phylogenetic trees using various R tools. One is ggtree, based on the ggplot2 package, which provides a wide range of options. This example shows how to write some functions that can plot trees with an arbitrary number of heatmap annotations, given the appropriate metadata in a data.frame object. The columns to be used are provided as a list along with color maps for each. The first column is used for...

Read more...

Parallelize a function in Python that returns a pandas DataFrame

10 April 2022

A common way to make functions run faster is to parallelize them. One way to achieve this in Python is to use the multiprocessing library. It can be tricky to get right, though, and doesn’t lend itself well to certain kinds of functions. One application is when the function operates over a range of values that can be split into smaller pieces, each a subset of the whole. These pieces can then be joined together at the end. This...

Read more...

batchfilerename - A simple utility for batch file renaming

28 March 2022

Changing filenames in bulk is time-consuming if done manually. If you are changing the filenames according to some pre-determined pattern then it’s much easier to automate the process. You can even do this with the ren command in Windows, though it’s run from the command line. On Linux you can use smart-file-renamer or a host of command line methods. If you want something simple, batchfilerename is a batch file renaming utility written in pure Python. It provides a graphical dialog that...

Read more...

Using IGV inside Jupyter Lab notebooks

20 March 2022

IGV is a popular Java-based genome browser. You can install IGV on Linux with a package manager such as apt in Ubuntu, or download the binary. Jupyter notebooks are browser-based environments for Python programming with a lot of additional libraries that add functionality. Widgets can also be used. IGV also uses the igv.js JavaScript file to implement browser embedding. This also allows it to be used in Jupyter Lab by using the igv-jupyterlab package. Install using: pip...

Read more...

Scrape paginated tables in Python with beautifulsoup

23 February 2022

Say you want to extract a large table from multiple web pages without having to manually copy and paste. Ideally you would access the data from a website via an API. But some websites are built only for human readability and don’t provide one. However, in many cases the salient information can be parsed out of the elements of the target pages. This is called scraping. Python provides easy methods to automatically retrieve and parse such data. This can be...

Read more...

All posts