Doing Journalism with Data

Astounding is an understatement for the wealth of information available on the internet. All it takes is making the time and keeping yourself from information overload. I derive immense satisfaction from discovering new things, and online courses on platforms like Coursera and Canvas help me learn just about anything I need, personally and professionally. I have made certain sacrifices in my social life, including cutting down on social media, to maximize my online learning. It has proven well worth it: these same resources, if I tried to enroll in them offline from where I am physically located, would cost quite a fortune. The traditional academic option of specializing through graduate studies is still there, and it’s a good thing to have. But if I want a crash course on a topic outside my industry of choice, for example, an online course is the best way to go.

These days, not having money for tuition fees is no longer an excuse; education has become far more accessible. Each course also comes with its own set of online tools, at least at the technical end of the spectrum. My most recent discovery, for example, is OpenRefine (formerly Google Refine), a tool for cleaning data. It was discussed in detail in the European Journalism Centre’s free, limited-time Data Journalism course on Canvas:

Google Refine open source tool for Data Journalism

Some certificate-issuing courses, like EJC’s data journalism offering, are available only for a limited time. Sadly, I only learned of the course last week, so I had less than a week to skim through material designed to be studied over 5-6 weeks.

Sample Data Journalism Module Video Lecture

Coming from an engineering background, I found the math side of the course a bit too basic. Still, it’s a useful refresher for old engineers interested in taking a data-related slant in their careers or personal projects. The course was really designed for journalists with a limited mathematical background, and it’s a perfect primer that teaches just enough math to produce evidence-intensive, data-driven journalistic pieces. It’s not as programming- or math-heavy as the online data science course on Coursera. The data searching side, on the other hand, is quite new to me.

Some of the topics discussed, like regular expressions in Google searches, web scraping, and data cleaning, are familiar to me from my online pursuits. Others were new to me, including some amazing tools that make it easy to create data-driven stories. It’s a shame that the course is free only for a limited time. As of this writing, they will close it within two days, and the material will only be available to me until April 2015.

There is one downside to enrolling in so many online courses: it becomes harder to tie everything together. One of my coding mentors told me that the best way to sift through the streams of information on the internet is to choose a pet project, one at a time, and research only what is relevant or necessary to accomplishing it. I guess that’s my current life hack: creating a series of pet projects that let me use all the tools and techniques I have picked up from the internet and turn them into something usable and reproducible. And whatever good comes out of these pet projects, I will certainly share my learnings here at Helena blog.