A brief introduction to my background is available here on my blog, which showcases several cheminformatics, machine learning and data science projects in drug discovery, built with various software toolkits. My latest project builds a simple deep learning model of adverse drug reactions, with a note on the manually collected data on cytochrome P450 3A4 substrates.
Other projects I've worked on over the past year or so include:
- Cytochrome P450 and approved drugs - CYP3A4 and 2D6 inhibitors
- Tree series in machine learning on ChEMBL-derived data (decision tree 1, decision tree 2, decision tree 3, random forest, random forest classifier, boosted trees)
- Working with scaffolds in small molecules - Manipulating SMILES strings
- Molecular visualisation (Molviz) web application - Using Shiny for Python web application framework (interactive data table part)
- Shinylive app in Python - Embedding app in Quarto document (app embedded in web page) & using pyodide.http to import csv files
- Small molecules in ChEMBL database: 1 - Parquet file in Polars dataframe library, 2 - Preprocessing data in Polars dataframe library, 3 - Building logistic regression model using scikit-learn and 4 - Evaluating logistic regression model in scikit-learn (other older posts: cross-validation & hyper-parameter tuning, and re-training & re-evaluation with scikit-learn, pending future updates)
Open-source contributions: practical_cheminformatics_tutorials, chembl_downloader