At a recent meeting of fellow surgeons in my department, an interesting difference of opinion arose. It relates to our trainees’ knowledge of statistics. Unfortunately, the meeting did not allow any time to properly discuss the topic.
Some background to illuminate your way. Registration as a medical specialist in South Africa is regulated by the Health Professions Council. In recent years, the Council has introduced the completion of a mandatory research project, culminating in a dissertation. This accompanies the usual prescribed formal examinations.
Universities in the country manage the research projects by way of a Master’s degree, for which all trainees must register.
The difference of opinion was simple. From the opposite corner of the ring, it was suggested that our trainees require no knowledge of statistical analysis and should hand in their data to a statistician and merely use the results in their reports.
I do not share this opinion and feel strongly that all medical professionals should have an understanding of the topic. While not all doctors and specialists are interested in research, I do believe that an understanding of statistics empowers the individual when evaluating published research. This in turn helps to inform and change their practice. As a surgeon, I know it does mine. With no formal program for statistical teaching in our department, I looked towards open education.
To this end, I was a leading proponent in getting the University of Cape Town to sign up with the Coursera and FutureLearn massive open online course platforms. The creation of twelve courses was funded by the Vice Chancellor and my course on Understanding Medical Research was the first to launch on Coursera. It has been a phenomenal experience and the feedback has been tremendous.
Unfortunately, austerity measures have curtailed these efforts. I funded my second course on Coursera through an external loan. It is on the use of Julia (mathematical biology using scientific computing) and was created in collaboration with the Applied Mathematics Department. The honors section of the course is on data management and statistical analysis.
To further my resolve in teaching medical statistics, I have taken to the Udemy platform with a course on medical statistics using Mathematica. In the last few days I have also launched a course on the use of SPSS in healthcare and life science statistics. Udemy is an interesting platform and I would encourage its use.
Link to the course: SPSS for healthcare and life science statistics
My opinion, though, is clear. Learning to analyze data is an empowering skill for everyone in healthcare.
So, you’ve spent a lot of time and effort in creating your Python machine learning model. The parameters have been tweaked and the metrics look great.
Now what? How do you share it with others to use? Well, one easy way is to pickle it. The pickle library in Python allows you to write your model as a file that others can open. They can then simply enter their own data for prediction.
In this YouTube tutorial I create a random forest regressor model, export it as a pickle file, and then import it for use. Have a look at how easy it all is.
The scikit-learn library for Python is a powerful machine learning tool.
K-means clustering, which is easily implemented in Python, uses geometric distance to create centroids around which our data can fit as clusters.
In the example attached to this article, I view 99 hypothetical patients who are prompted to sync their smart watch healthcare app data with a research team. The data is recorded continuously, but to comply with healthcare regulations, they have to actively synchronize the data. This example works equally well if we consider 99 hypothetical customers responding to a marketing campaign.
In order to prompt them, several reminder campaigns are run each year. In total there are 32 campaigns. Each campaign consists only of one of the following reminders: e-mail, short-message-service, online message, telephone call, pamphlet, or a letter. A record is kept of when they sync their data, as a marker of response to the campaign.
Our goal is to cluster the patients so that we can learn which campaign type they respond to. This can be used to tailor their reminders for the next year.
In the attached video, I show you just how easy this is to accomplish in Python. I use the Python kernel in a Jupyter notebook. There is also a mention of dimensionality reduction using principal component analysis, also done using scikit-learn. This is done so that we can view the data as a scatter plot using the plotly library.
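To make the idea of centroids and geometric distance concrete, here is a rough sketch of the algorithm in plain Python. It is not the notebook from the video, which uses scikit-learn. The `kmeans` function and the tiny response table (six patients, six campaigns) are assumptions made up for illustration, far smaller than the 99 patients and 32 campaigns described above.

```python
import math
import random


def kmeans(points, k, iterations=20, seed=0):
    """Basic k-means: assign points to the nearest centroid, then recompute."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid
        # (Euclidean distance).
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical data: rows are patients, columns are campaigns
# (three e-mail campaigns, then three SMS campaigns).
# A 1 means the patient synced their data after that campaign.
responses = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 1, 1],
]

labels, centroids = kmeans(responses, k=2)
print(labels)
print(centroids)
```

Because the table holds only zeros and ones, each centroid coordinate is the fraction of patients in that cluster who responded to that campaign, which is exactly the information needed to tailor next year's reminders.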
I note more and more published papers on machine learning. As a clinician, I find it a fascinating way of looking at patient data. In case you are not familiar with machine learning, the definition given over at Wikipedia is: Machine learning is the subfield of computer science that gives computers the ability to learn without being explicitly programmed. …machine learning explores the study and construction of algorithms that can learn from and make predictions on data.
That is exactly what machine learning is used for in medicine as well. In a particular branch of machine learning, called supervised learning, a dataset of predictor variables together with a known outcome variable can be passed to the machine, which in turn constructs a model from the data. A selection of the data is usually kept separately and is used to test the model. Given that the outcomes are known, it is trivial to calculate the accuracy of the model. Once a model is generated, data without a known outcome can be passed to the model, which will predict the outcome. This can indeed be very useful in medicine.
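The workflow described above (hold some data back, build a model on the rest, count how often the model is right on the held-back data) can be sketched in a few lines of plain Python. Everything here is an assumption for illustration: the data are simulated, and a single cut-off on one score stands in for a real learning algorithm.

```python
import random

rng = random.Random(42)

# Simulated data (hypothetical): a single predictor score from 0 to 10 and
# a binary outcome that becomes more likely as the score rises.
records = []
for _ in range(200):
    score = rng.randint(0, 10)
    outcome = 1 if rng.random() < score / 10 else 0
    records.append((score, outcome))

# Keep a quarter of the data aside for testing.
rng.shuffle(records)
test, train = records[:50], records[50:]


def accuracy(threshold, data):
    """Fraction of records where 'score >= threshold' matches the outcome."""
    hits = sum((score >= threshold) == bool(outcome) for score, outcome in data)
    return hits / len(data)


# "Training": pick the cut-off that is most accurate on the training set.
best_threshold = max(range(0, 12), key=lambda t: accuracy(t, train))

# Testing: the outcomes are known, so accuracy is a simple count.
test_accuracy = accuracy(best_threshold, test)
print(best_threshold, round(test_accuracy, 2))
```

The important design choice is that the test records play no part in choosing the cut-off, so the test accuracy is an honest estimate of how the model would do on new patients.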
There are many tools available to do machine learning. I use both Python and Mathematica. It is really easy to do. I have put together a short video on YouTube for those familiar with Mathematica, just to show how easy it is.
In the video I use random forest, logistic regression, and support vector machine models to predict the presence of appendicitis from the simulated modified Alvarado score predictor variables.
Understanding statistical analysis and interpreting the results of research papers are just as important as the ability to correctly diagnose the cause of acute abdominal pain.
Medical knowledge is expanding at a rapid pace. This is evident from the number of research papers being published every year. Although medical students and residents attend a formal education program, it is journal papers that serve as masters of education for the majority of a professional’s life.
The ability to understand the results section of a paper is crucial in deciding to change clinical practice. In order to do this effectively, knowledge of statistics is vital.
Yet, formal training in statistics takes a back seat when it comes to anatomy, physiology, and clinical teaching. When statistics is part of the curriculum, it is often positioned as less important. It gets even worse when taught with mathematical emphasis. Whilst it may be rigorous to teach using equations, a subset of medical students is lost in this effort.
No medical school can look the other way. Data analysis and computational thinking are part of the future of healthcare. I was reminded of this when I came across this article again, after reading it almost two years ago: NYU medical students learning to analyze big data.
Our efforts at the University of Cape Town are growing too. The massive open online course Understanding clinical research, on the Coursera platform, has now had more than 23,000 participants. In the division of General Surgery, I teach the use of data analysis and computational thinking to great effect, using IBM SPSS, Python, Julia, and Mathematica.
It’s time for data science and statistical analysis to take their rightful place in medical school curricula.
In an effort to complete the 2016 academic year, the University of Cape Town leadership have called upon the body of lecturers to make use of online and blended teaching material. The University, like others in the country, is reopening its doors under difficult circumstances. These relate to continued protest action and the absence of consensus amongst students and staff on the if-and-how of reopening the University. With classroom attendance expected to be poor or even unwarranted, the problem of providing didactic learning had to be addressed. The solution: online learning. A simple call to put recordings of lectures online and to incorporate already existing web-based material.
I am well familiar with this concept. With more than 1,000 lectures on YouTube, two courses on the massive open online course (MOOC) platform Coursera® (here & here), and an international award in open education from the Open Education Consortium, I am sold on the concept of freeing knowledge from its academic confines. Knowledge through education is power. The access to it is a fundamental right and it should not be a commodity. There can be no better tool to uplift a population than proper education.
So now, UCT wants to embrace online education as an instant solution to save the academic year. So why, after pouring so much energy into the creation of online educational resources, am I not elated, ecstatic, vindicated? To be honest, I do experience these feelings. They are, however, mixed with feelings of trepidation, anxiety, and even frustration.
Frustrated, because my plea for the large scale creation of online resources has fallen on deaf ears. We need only look at the efforts of leading Universities such as the Massachusetts Institute of Technology, Stanford, Harvard and many others that have embraced the online space in their educational efforts. Not only to the benefit of their local students, but the world at large. UCT should have been creating these resources at scale a long time ago.
We have to take cognizance of the fact that the efforts of leading Universities took years to develop. Built with the input of experienced staff and stakeholders. Experts who know that simply transforming face-to-face teaching or printed material into video and electronic format does not constitute education. The problem cannot be solved with a purely cognitivist approach and most certainly, not overnight.
There are many problems inherent in the call for the rapid production of online course material. One glaring example is the lack of formative and summative assessment. The face-to-face method of providing learning material (lectures), asking a few unstructured questions during lecturing and sitting back in judgement during tests and exams is already a suboptimal approach to education. When replacing this flawed concept with unstructured online teaching, the outcome must certainly be viewed with concern. To develop a proper educational resource takes time, effort, experience, research, and most importantly, engagement and consultation with students. Watch this video from smaccDUB on how students can choreograph their own education.
The call to make online resources available must be supported. We need to do so in a measured and structured manner, though. To the University’s credit the Dean of the Health Sciences Faculty has called for the creation of a technology in education committee. The Centre for Innovation in Learning and Teaching have published an excellent guide to the creation of online educational resources. Furthermore, they provide individual consultations and hold regular workshops. Hopefully we can use this opportunity to align our efforts with those of the leading Universities in the world.
October 2016 has seen the launch of my second course on the Coursera massive open online course (MOOC) platform. Whereas my first course dealt with the statistics used in healthcare research, this one teaches the new Julia language for scientific computing. You can find it here.
As with other Coursera offerings, you can pay a nominal fee to get a verified certificate from the University of Cape Town, else you can audit the course for free. Remember, though, that it is always possible to state that you do not have the financial resources to pay for the verified certificate and Coursera will waive the fee and you will still get your certificate.
You can also learn more about Julia from their home page. Let me know what you think.
Before I forget, the Jupyter notebooks for the course are available on GitHub.
Just to show off what Jupyter notebooks can do, this post will render part 1 of lesson 1 of my lecture series on complex variables. Have a look.
After many months of preparation, my massive open online course (MOOC) on healthcare statistics has gone live on Coursera today, December 01, 2015. To sign up follow this link: Coursera.
This course builds an intuitive understanding of statistics, without the use of complicated mathematical equations. Everything from descriptive statistics to hypothesis testing, confidence intervals, p-values, Student’s t-test, chi-square tests and much more is explained.
On completion of this course you should feel confident in properly evaluating the published literature or even embarking on your own research.
So, how can an academic surgical unit benefit from the computer code development skills of people such as Wes McKinney of pandas fame or the educational skills of an engineering professor such as Lorena Barba of Numerical MOOC (numerical massive open online course) fame? Answer: A lot. This post is about our efforts to transition from antiquated to more modern forms of surgical training and assessment, all with the help of one of the best software projects out there, Project Jupyter. This is Groote Schuur after all!
The teaching and assessment paradigm has stood for many, many decades. Do four years of surgical rotations, watch what your superiors do, present on ward rounds, go to the clinic, take calls, assist in theatre, do some cases, attend (most) academic meetings (read: watch yet another PowerPoint presentation), pass three exams. Presto. Specialist. That’s how it’s done now, that’s how it was done in the 00’s, the 1990’s, 80’s, 70’s, 60’s, 50’s, 40’s,… You get the point. Hey, depending on which source you read, it was in the 40’s that the overhead projector was first used by the military in World War II. If you think about it, an overhead transparency projector is just PowerPoint without a computer. If you slipped in one transparency while the other was still showing, it’s just like a slide transition!
Depending on your working environment, you might be surrounded by people in full support of this form of education. It has always worked that way. Why change now? Well, as the argument goes, by that logic bloodletting should still be all the rage. You will note that in contrast to medical education, actual medicine has come on in leaps and bounds. We buy into the new paradigm that is evidence-based medicine. So why is it so difficult to accept and, even more difficult, to practice evidence-based medical education?
Some of us are fortunate enough to work in countries where there are national efforts and frameworks in place to motivate for change. Have a look at the CanMEDS program in Canada. Two of the key concepts in their program are patient-centred care and competency-based assessment. Without going into the detail of their programs, I want to concentrate on these two aspects. Reason being, it gives us a practical starting point. For those unfortunate enough not to work in countries with national frameworks and support, small steps have to be taken.
So what solutions have we implemented in the Acute Care Surgery Unit at Groote Schuur Hospital? First and foremost, involve the patients. They are at the centre of what we do after all. Why should they have no say in the evaluation of their care? Fortunately, validated tools are available when you turn to the literature. At this time we use the Jefferson scale of patient’s perception of physician empathy. Moving on to competency assessment, there is the Ward Round Assessment tool amongst many others. Point being, we are moving away from the 20-second, mark either average or above average on the end-of-rotation subjective question scorecard. You know the one: (1) Knowledge, (2) Surgical skill, (3) Punctuality…
Now, the Acute Care Surgery Unit is brand new (you can learn more about us from my talk at this year’s Association of Surgeons of South Africa conference here). We certainly have no research assistants, money, or personnel to help us in our efforts towards patient-centred, competency-based education. This whole process has to be self-driven. Solutions to the problem? Well, that’s the easy bit. The world has changed over the last few years. No longer is knowledge locked away behind expensive paywalls. If you want to learn something, go online. For me, it all started with the Massachusetts Institute of Technology (MIT). Their open courseware opened a whole new world to me. MIT and the massive open online course platforms such as Coursera (to which I will shortly contribute), edX and FutureLearn (to name but a few) are handing the keys of knowledge to all humankind.
This brings me to Project Jupyter and computer languages such as IPython and Julia. If you have no access to software development teams and big budget research units, do yourself a favor, search for tutorials on these projects. You will find so many wonderful men and women, going out of their way to empower you with these tools. Even a lowly surgeon such as myself has online tutorials. Have a look at these:
The Klopper Lectures on Julia
Mini project: Medical research using Julia
Back to what this post is all about. Here, you will find a link to some of our results using Project Jupyter (GitHub). To protect patients and trainees, the data have been altered and are not a true reflection of anyone or any given period. What it does show, though, is how easy it is to use data to properly guide the training of our residents; and this is just our first small step.