Posts

Make networkx Delaunay graphs from geopandas dataframes
23 January 2023
Geopandas GeoDataFrames store spatial data such as points and polygons. This post adapts code from some examples online to show how to convert spatial points into a Delaunay graph. This is an undirected graph with edges between adjacent points only. It may be useful for building a spatial contact network of neighbouring points and doing further processing. Here we convert into networkx graphs. First we make a GeoDataFrame from some random points: def random_points(n): import random # Create an...
Using Molecular Nodes in Blender to visualise proteins
14 January 2023
Blender is a free program capable of advanced 3D modelling, animation and rendering. It is used to create photorealistic images for many applications such as animated films or architecture. I used to be interested in using Blender to create images of proteins and other molecules. This was a rather convoluted process. It seems things have moved on since I last looked at this area. There is now a very useful add-on called Molecular Nodes that can import and render...
Scrape dynamic tables in Python with Playwright
29 December 2022
Sometimes it’s necessary to scrape a website or some pages that contain elements generated dynamically, often via JavaScript. This means you can’t always get all the HTML content in the page directly from the URL. You may have to simulate interaction in the browser, like clicking a button on a form and then waiting to get the content. This can be done with the Playwright library, which was created to support automated testing of websites. Note that some websites won’t...
Can ChatGPT solve bioinformatic problems with Python?
21 December 2022
There has been a lot of talk about OpenAI’s new technology, ChatGPT. This is basically a very advanced chatbot. How it works is beyond my ability to explain. It is trained on a huge amount of information from the internet and can answer general questions or write poems and essays. It can also code in virtually any language quite well. You will see plenty of YouTube videos marvelling at its ability to produce (sometimes) usable code upon description of a...
DALLE-2 and AI generated art.
19 December 2022
DALL-E 2 (or just dalle) is an AI model from OpenAI used to generate art and imagery from text entered by humans. It is ‘a generative language model’ that takes sentences (called prompts) and creates corresponding original images. The actual method I cannot explain as I do not understand it well enough. Basically, it appears to be made of two models. One that converts the semantic meaning of some text into a vector space that is an image representation (CLIP)....
How to host your podcast with github
15 October 2022
If you want to make your own podcasts, the usual method is probably to use a commercial service that handles all the file hosting and distribution. However, the process is actually surprisingly simple for those with a bit of technical knowledge and patience. Web standards are really the key to how podcasting works. RSS (Really Simple Syndication) and its offshoot Atom are how sites create a ‘feed’ that indicates when updates are made. RSS is just XML-formatted plain text. This...
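Since an RSS feed is just XML text, a small one can be assembled with Python's standard library alone. The following is only a minimal sketch and not necessarily the method used in the post: the function name `build_feed`, the episode dictionary keys and the example URLs are all hypothetical, and a real podcast feed would normally carry more tags than shown here.

```python
import xml.etree.ElementTree as ET

def build_feed(title, link, description, episodes):
    """Return a minimal RSS 2.0 document as a string.

    `episodes` is a list of dicts with the (hypothetical) keys
    'title', 'pub_date', 'url' and 'length' (file size in bytes).
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # The three channel elements required by RSS 2.0
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for ep in episodes:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = ep["title"]
        ET.SubElement(item, "pubDate").text = ep["pub_date"]
        # The enclosure element points at the audio file itself
        ET.SubElement(item, "enclosure", url=ep["url"],
                      length=str(ep["length"]), type="audio/mpeg")
    return ET.tostring(rss, encoding="unicode")

# Example values below are placeholders, not real addresses
feed = build_feed(
    "My podcast",
    "https://example.com/podcast",
    "An example feed",
    [{"title": "Episode 1",
      "pub_date": "Sat, 15 Oct 2022 10:00:00 GMT",
      "url": "https://example.com/audio/ep1.mp3",
      "length": 12345678}],
)
print(feed)
```

The resulting string could be saved as a static file (for example `feed.xml`) and served from any host that delivers plain files.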
Excess mortality in Ireland is still high in 2022
08 September 2022
In a previous post I showed how to use daily deaths from RIP.ie to get up-to-date mortality estimates in Ireland. This is a useful alternative to official GRO data, which lag behind by some months. It is possible to determine sex from the death notices but not age. To summarise, RIP.ie data shows unusually high mortality for 2021/22 which continues to the present (September 2022). Below is an updated plot showing how 2022 values are still trending higher....
Mapping the historical development of Tallaght
13 August 2022
Tallaght is a large suburb of Dublin about 13 km southwest of Dublin city, near the foothills of the Wicklow Mountains. Originally founded as a monastic settlement in 769 AD, it later became an important defensive outpost along the ‘Pale’ boundary. It remained a rural village until the 1960s, when the Irish government commissioned town planner Myles Wright to devise an expansion plan for Dublin City. The Myles Wright Plan, which was broadly adopted, resulted in the creation of the three...
Plotting gridded quantitative data with geopandas - Irish forestry
07 August 2022
A previous post showed how to create grids over polygons using geopandas. Making practical use of this requires some kind of quantitative data that you want to bin into each grid polygon. This could be a summary statistic over that area derived from a more fine-grained spatial dataset. To do this we would combine the grid with the original data using sjoin and then aggregate using dissolve with some defined aggregating function. sjoin is like merging...
Make regular grids from polygons with geopandas
04 August 2022
Geopandas is a Python package that provides a geospatial extension to pandas. GeoDataFrames store geographic data such as points and polygons which can be plotted. This post adapts code from both James Brennan’s and Sabrina Chan’s blogs to show how to make square and hexagonal grids out of any polygons. This method is often used to bin areas into discrete regions for the purpose of representing summary statistics. The functions are given below. Both use the total boundary area and...

All posts
- 23 Jan 2023 » Make networkx Delaunay graphs from geopandas dataframes
- 14 Jan 2023 » Using Molecular Nodes in Blender to visualise proteins
- 29 Dec 2022 » Scrape dynamic tables in Python with Playwright
- 21 Dec 2022 » Can ChatGPT solve bioinformatic problems with Python?
- 19 Dec 2022 » DALLE-2 and AI generated art.
- 15 Oct 2022 » How to host your podcast with github
- 08 Sep 2022 » Excess mortality in Ireland is still high in 2022
- 13 Aug 2022 » Mapping the historical development of Tallaght
- 07 Aug 2022 » Plotting gridded quantitative data with geopandas - Irish forestry
- 04 Aug 2022 » Make regular grids from polygons with geopandas
- 02 Aug 2022 » Plot phylogenies with annotation in R using ggtree and gheatmap
- 10 Apr 2022 » Parallelize a function in Python that returns a pandas DataFrame
- 28 Mar 2022 » batchfilerename - A simple utility for batch file renaming
- 20 Mar 2022 » Using IGV inside Jupyter Lab notebooks
- 23 Feb 2022 » Scrape paginated tables in Python with beautifulsoup
- 29 Jan 2022 » Ireland mortality data from RIP.ie, updated for 2021
- 12 Jan 2022 » Pandemic restrictions have caused misery in low income countries
- 14 Nov 2021 » High vaccination rates don't prevent transmission of SARS-CoV-2
- 12 Nov 2021 » Seasonality of SARS-CoV-2
- 18 Oct 2021 » Bacterial SNP detection with nanopore vs. illumina sequencing
- 03 Sep 2021 » Natural immunity to SARS-CoV-2
- 10 Jul 2021 » Comparison of SNP detection using duplicate sequencing runs in SNiPgenie
- 19 Jun 2021 » wgMLST vs the reference-align-SNP-calling method for M.bovis
- 15 Jun 2021 » Deaths in Ireland from RIP.ie - another look
- 10 Jun 2021 » A whole genome MLST (wgMLST) implementation in Python
- 18 May 2021 » Viewing the THOR dataset with Bokeh and Panel
- 15 May 2021 » The scale of US bombing in Southeast Asia revealed in the THOR dataset
- 26 Feb 2021 » A phylogenetic tree viewer with Qt and Toytree
- 16 Feb 2021 » A simple GIS plugin for Tablexplore
- 28 Jan 2021 » Ireland deaths in 2019/2020 compared to previous years
- 25 Jan 2021 » Daily deaths in Ireland from RIP.ie in 2019 and 2020
- 20 Jan 2021 » Visualizing Irish girls names since 1970
- 15 Jan 2021 » M. bovis spoligotyping from WGS reads
- 11 Jan 2021 » Linux application packaging and universal formats
- 23 Dec 2020 » Detecting polymorphisms in the RD900 region of MTBC species
- 19 Dec 2020 » Tablexplore - a desktop tool for table analysis
- 29 Nov 2020 » Epidemics, PCR and the dangers of mass testing
- 28 Nov 2020 » Convert a multi-sample VCF to a pandas DataFrame
- 15 Nov 2020 » A network agent based infection model with Mesa
- 10 Nov 2020 » Find PFAM domains in protein sequences with Python
- 02 Nov 2020 » Covid-19 and T cell immunity
- 28 Oct 2020 » Estimating Irelands tree coverage with QGIS and GeoPandas
- 18 Oct 2020 » Build an exe using pyinstaller with GitHub Actions
- 06 Oct 2020 » A simple image gallery in Jekyll without plugins
- 05 Sep 2020 » An MHC-Class I binding predictor with sklearn, part 2
- 18 Aug 2020 » Ireland COVID-19 trend in positive rate
- 15 Aug 2020 » Predicting cross-reactive T cell epitopes in Sars-CoV-2
- 06 Aug 2020 » COVID tracking project - tests vs positive rates
- 24 Jul 2020 » Death causes in England and Wales comparison - Winton Centre
- 21 Jul 2020 » Sequence alignment viewer with Qt/PySide2
- 11 Jul 2020 » Eurostat deaths from all causes dataset plots
- 07 Jul 2020 » pathogenie - A desktop application for microbial genome annotation
- 19 May 2020 » Fasta alignment from a multi sample VCF - a less terrible method
- 12 May 2020 » SNiPgenie - a tool for SNP site detection from NGS data
- 28 Apr 2020 » Simple MTBC regions of difference analysis with Python
- 19 Apr 2020 » Finding all amino acid mutations in SARS-CoV-2
- 14 Apr 2020 » A simple agent based infection model with Mesa and Bokeh
- 07 Apr 2020 » Create a fasta alignment from a multi sample VCF
- 01 Apr 2020 » COVID-19 ECDC data dashboard with Panel
- 28 Mar 2020 » COVID-19 ECDC data plots with Bokeh
- 18 Mar 2020 » Run bcftools mpileup in parallel with Python
- 11 Mar 2020 » Deploy a Python application with snapcraft
- 02 Mar 2020 » Model of the SARS-CoV-2 spike protein in Blender
- 28 Feb 2020 » Explore the SARS-CoV-2 spike protein sequences using Python tools
- 18 Feb 2020 » Updates to a genome annotation on the ENA via Webin-CLI
- 05 Feb 2020 » Plot fastq file metrics with Python
- 30 Jan 2020 » Compile windows exe files with MSYS2
- 25 Jan 2020 » A simple genome browser with Qt and dna_features_viewer
- 06 Jan 2020 » Interactive plots of World development indicators with Panel
- 03 Jan 2020 » Concurrent processes in PySide2/PyQt5 applications
- 14 Dec 2019 » Genome annotation with BLAST, Prodigal and Biopython
- 29 Nov 2019 » Embed Bokeh plots in Jekyll markdown
- 28 Nov 2019 » Categorical region plots with geopandas
- 15 Nov 2019 » Choropleth maps with geopandas, Bokeh and Panel
- 05 Nov 2019 » Analysis of MTBC regions of difference with NucDiff
- 20 Oct 2019 » Rapid Average Nucleotide Identity calculation with FastANI
- 13 Oct 2019 » NucDiff for bacterial whole genome comparisons
- 30 Sep 2019 » Plotting global sea ice extent data with four different Python packages
- 24 Sep 2019 » Interactively view datasets with HoloViews
- 15 Sep 2019 » Javascript callbacks for linking bokeh plots to panel widgets
- 31 Aug 2019 » Retrieving genome assemblies via Entrez with Python
- 12 Aug 2019 » Accessing data from the PDB with Python
- 22 Jul 2019 » Bioinformatics on the Raspberry Pi 4
- 11 Jul 2019 » A sequence alignment viewer with Bokeh and Panel
- 02 Jul 2019 » Dashboards with PyViz Panel for interactive web apps
- 17 May 2019 » Predicting neoantigens
- 04 Apr 2019 » Make protein models with Blender
- 20 Mar 2019 » Sequence, gene and protein databases: are you confused?
- 27 Feb 2019 » Unknown proteins in Mycobacterium tuberculosis
- 25 Feb 2019 » Reading and writing genbank/embl files with Python
- 25 Nov 2018 » Using epitopepredict for MHC binding prediction in Python
- 12 Nov 2018 » Create an MHC-Class I binding predictor in Python
- 09 Oct 2018 » Creating a local RefSeq protein blast database
- 14 Aug 2018 » Create a bacterial GFF from a genbank file for BCFtools/csq
- 05 Jul 2017 » DataExplore - grouped plots in version 0.8.0
- 11 Dec 2015 » Example: plotting miRNA abundance data (advanced)
- 15 Sep 2015 » Looking at the Titanic dataset
- 18 Jul 2015 » Zenodo and sharing your software
- 14 Jun 2015 » Educational software for data analysis
- 11 Jun 2015 » DataExplore Features
- 30 May 2015 » DataExplore Introduction