R tutorial: Just getting started with R? Here is a post on inspecting univariate data

If you are new to R, simple univariate data is a good place to start. In this RPubs post, I look at both categorical and numerical data. Calculating descriptive statistics for univariate data and visualizing it with plots is quite easy. Click the link and have a look.

By the way, the file is also available on GitHub.

World Bank data on maternal mortality using R

The World Bank provides open data for many indicators across most countries, spanning the last few decades.

The data is available online and can be searched by country code (iso2c and iso3c), by indicator name, and by date. The indicators can be viewed here. The data can also be accessed via an application programming interface (API). The WDI library in R provides access through this API, allowing for easy search and retrieval of data.

In this post, written as an R-markdown file, and available on RPubs and GitHub, I showcase the WDI library by looking at maternal mortality rates for the United States, Brazil, and South Africa.

Follow the links and have a look.

R tutorial: Testing assumptions for parametric tests

In this post, written as an R-markdown file and posted on RPubs, I discuss the assumptions for the use of parametric tests in R.

Parametric tests such as the various t tests, analysis of variance (ANOVA), and correlations are only valid if certain assumptions are met. When these assumptions are not met, the use of these tests in your research may lead to false claims.

In the post I show you the most important assumptions and how to test for them using the R programming language.

The post is available on RPubs and the markdown file is on GitHub.

Sharing your machine learning models with others

So, you’ve spent a lot of time and effort creating your Python machine learning model. The parameters have been tweaked and the metrics look great.

Now what? How do you share it with others to use? Well, one easy way is to pickle it. The pickle library in Python allows you to write your model to a file that others can open. They can then simply enter their own data for prediction.

In this YouTube tutorial I create a random forest regressor model, export it as a pickle file, and then import it for use.  Have a look at how easy it all is.
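
For readers who want to see the export and import steps in code, here is a minimal sketch using only the standard library. It is not the code from the video: MeanRegressor is a hypothetical stand-in model, and save_model, load_model, and the file name model.pkl are illustrative names chosen for this example. A fitted scikit-learn estimator such as a random forest regressor can be passed to pickle.dump in the same way.

```python
import pickle
import tempfile
from pathlib import Path


class MeanRegressor:
    """Hypothetical stand-in model: always predicts the mean of the training targets."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


def save_model(model, path):
    # Serialize the fitted model to a binary file.
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_model(path):
    # Read the model back. Only unpickle files from sources you trust.
    with open(path, "rb") as f:
        return pickle.load(f)


# Fit on toy data (assumed values, for illustration only).
model = MeanRegressor().fit([[1], [2], [3]], [10.0, 20.0, 30.0])

# Export to a pickle file in a temporary directory, then import it again.
model_path = Path(tempfile.mkdtemp()) / "model.pkl"
save_model(model, model_path)
restored = load_model(model_path)

# The restored model predicts on new data without being refitted.
predictions = restored.predict([[4], [5]])
print(predictions)
```

Two caveats apply when sharing such a file. Unpickling can execute arbitrary code, so recipients should only load pickle files from people they trust. The recipient also needs the model's class to be importable, which for a scikit-learn model means having a compatible version of the library installed.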